Tanl Linguistic Pipeline

Tanl::POS::TrieNode Struct Reference

Trie representing the suffixes plus the additional tag counting information.

#include <SuffixGuesser.h>

Classes

struct  counts_iterator
 Iterator of a TrieNode.

Public Member Functions

 TrieNode ()
 Default constructor.
void set_tag_info (Counts *tag)
 Tag info setter.
void serialize (std::ostream &out)
 Serializes a TrieNode object.
void serialize (std::istream &in)
 De-serializes a TrieNode object.
TrieNode * add_char (Counts *legacy_counts, bool after_branch, int ix, int stop, std::string &word, TagID tag, int count)
 Recursive method that adds a word to the trie, character by character.
bool empty_node ()
 This method verifies whether the node is empty.

Public Attributes

Counts * tag_info
 Tag information. Can be null since it is optional.
bool terminal
 Indicates whether the Node represents a terminal node.

Detailed Description

Trie representing the suffixes plus the additional tag counting information.

The trie is built around a struct TrieNode that inherits from map<char, TrieNode*>. Each node is therefore itself a map that holds, for each char, the corresponding descendant node. Each node also carries additional tag information, which may be null.

The trie defines an internal counts_iterator structure for iterating over the counts of the suffixes in the trie.
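The inheritance-from-map design described above can be illustrated with a minimal sketch. The names SketchCounts, SketchNode and child_for are hypothetical and are not part of SuffixGuesser.h. The layout of the counts (a global total plus a per-tag map) is only inferred from the diagrams in this page, and the destructor assumes that a node owns its children and its tag info, which this page does not state.

```cpp
#include <map>

// Hypothetical stand-in for Tanl::POS::Counts; the real layout is not shown in this page.
struct SketchCounts {
    int global = 0;              // total count over all tags
    std::map<int, int> per_tag;  // tag id -> count
};

// Each node is itself a map from a character to the child node for that character.
struct SketchNode : std::map<char, SketchNode*> {
    SketchCounts* tag_info = nullptr;  // optional, may stay null
    bool terminal = false;

    // Assumption: a node owns its tag info and its children.
    ~SketchNode() {
        delete tag_info;
        for (auto& entry : *this) delete entry.second;
    }
};

// Hypothetical helper: returns the child for c, creating it if it does not exist yet.
SketchNode* child_for(SketchNode& node, char c) {
    SketchNode*& slot = node[c];
    if (slot == nullptr) slot = new SketchNode();
    return slot;
}
```

Because the node is a map, the usual map operations (find, size, iteration) apply directly to a node when walking its descendants.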


Constructor & Destructor Documentation

Tanl::POS::TrieNode::TrieNode (  )  [inline]

Default constructor.

This constructor sets the tag information to null and the terminal flag to false.


Member Function Documentation

TrieNode* Tanl::POS::TrieNode::add_char ( Counts *  legacy_counts,
bool  after_branch,
int  ix,
int  stop,
std::string &  word,
TagID  tag,
int  count 
)

Recursive method that adds a word to the trie, character by character.

This method updates the trie of suffixes by adding new branches or by updating the existing tag counting info.

For example, if we start from the EMPTY trie and add the word dog with tag info tag = 1, value = 20, the following trie is obtained (each node shows its tag counting info in parentheses):

    Node (global = 20, map = {1:20})
    |g
    ---->  Node (null)
           |o
           ----> Node (null)
                 |d
                 ----> Terminal Node (null)

Now, if we add the word slumdog to the previous trie with tag info tag = 2, value = 10, the following is obtained:

    Node (global = 30, map = {1:20, 2:10})
    |g
    ---->  Node (null)
           |o
           ----> Node (null)
                 |d
                 ----> Node (null)
                       |m
                       ----> Node (global = 10, map = {2:10})
                             |u
                             ----> Node (null)
                                   |l
                                   ----> Node (null)
                                         |s
                                         ----> Terminal Node (null)

Parameters:
legacy_counts Tag info inherited from upper nodes.
after_branch Must be true when add_char is called from outside the trie or right after a branching operation.
ix Current position in the word. On the initial call it is the end of the word.
stop Position in the word that bounds the suffix. If it is 0, the method adds the whole word to the trie.
word String to be added to the trie.
tag Integer that represents the tag being used.
count Number of times word was found in the corpus tagged with tag.

Returns:
A pointer to the updated trie.

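The role of ix and stop can be illustrated with a simplified sketch that walks the word from its last character down to position stop. This is not the add_char algorithm: it is iterative, it ignores legacy_counts and after_branch, and it records counts on every visited node instead of only on selected ones. SuffixNode and add_suffix are hypothetical names, and the ownership of children is an assumption.

```cpp
#include <map>
#include <string>

// Hypothetical simplified node, not the library's TrieNode.
struct SuffixNode : std::map<char, SuffixNode*> {
    int global = 0;              // total count seen through this node
    std::map<int, int> per_tag;  // tag id -> count
    bool terminal = false;

    // Assumption: a node owns its children.
    ~SuffixNode() {
        for (auto& entry : *this) delete entry.second;
    }
};

// Adds the characters word[stop .. end] to the trie, last character first.
// Simplification: counts are recorded on the root and on every visited node.
void add_suffix(SuffixNode& root, const std::string& word, int stop, int tag, int count) {
    SuffixNode* node = &root;
    node->global += count;
    node->per_tag[tag] += count;
    for (int ix = static_cast<int>(word.size()) - 1; ix >= stop; --ix) {
        SuffixNode*& child = (*node)[word[ix]];
        if (child == nullptr) child = new SuffixNode();
        node = child;
        node->global += count;
        node->per_tag[tag] += count;
    }
    node->terminal = true;  // the last node reached closes the suffix
}
```

Adding dog (tag 1, count 20) and then slumdog (tag 2, count 10) with stop = 0 leaves the root of this sketch with global = 30 and per-tag counts {1:20, 2:10}, as in the root node of the second diagram. The inner nodes differ from the diagram because of the simplification.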
bool Tanl::POS::TrieNode::empty_node (  ) 

This method verifies whether the node is empty.

An empty node generally has 0 children, no tag_info, and is not a terminal.

Returns:
true if the node is empty, false otherwise.

References tag_info and terminal.
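The emptiness condition described above could be checked as in the following sketch. MiniCounts, MiniNode and is_empty_node are hypothetical names, and the check assumes that all three conditions must hold at the same time.

```cpp
#include <map>

// Hypothetical stand-in for Tanl::POS::Counts.
struct MiniCounts {
    int global = 0;
};

// Hypothetical node with the same shape as described for TrieNode.
struct MiniNode : std::map<char, MiniNode*> {
    MiniCounts* tag_info = nullptr;
    bool terminal = false;
};

// True when the node has no children, no tag info, and is not a terminal.
bool is_empty_node(const MiniNode& node) {
    return node.empty() && node.tag_info == nullptr && !node.terminal;
}
```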

void Tanl::POS::TrieNode::serialize ( std::istream &  in  ) 

De-serializes a TrieNode object.

Parameters:
in The stream from which the object will be read

References Tanl::POS::Counts::serialize(), serialize(), tag_info, and terminal.

void Tanl::POS::TrieNode::serialize ( std::ostream &  out  ) 

Serializes a TrieNode object.

Parameters:
out The stream wherein the object will be written

References Tanl::POS::Counts::serialize(), tag_info, and terminal.

Referenced by Tanl::POS::SuffixGuesser::serialize(), and serialize().

void Tanl::POS::TrieNode::set_tag_info ( Counts *  tag  ) 

Tag info setter.

When the new value is set, any existing old value is destroyed.

Parameters:
tag New tag info value

References tag_info.
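A setter that destroys the previous value could look like the sketch below. TagCounts and TagHolder are hypothetical names. The sketch assumes that the holder takes ownership of the pointer it receives, and the guard against re-setting the same pointer is an addition not mentioned in this page.

```cpp
// Hypothetical stand-in for Tanl::POS::Counts.
struct TagCounts {
    int global = 0;
};

// Hypothetical holder showing only the setter behaviour.
struct TagHolder {
    TagCounts* tag_info = nullptr;

    // Replaces the stored tag info, destroying the old value if there is one.
    void set_tag_info(TagCounts* tag) {
        if (tag_info != tag) delete tag_info;  // guard: do not delete the value being set
        tag_info = tag;
    }

    // Assumption: the holder owns the tag info it stores.
    ~TagHolder() { delete tag_info; }
};
```

Passing a null pointer in this sketch simply destroys the old value and leaves the node without tag info.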


Copyright © 2005-2011 G. Attardi. Generated on 4 Mar 2011 by doxygen 1.6.1.