Tanl Linguistic Pipeline
Reads a plain text file and splits it into tokens.
#include <PtbTokenizer.h>
Public Member Functions
PtbTokenizer (std::istream *is, char const *lang=0)
    Creates a new Tokenizer.
PtbScanner::Token const * Current ()
    Returns the current token.
bool MoveNext ()
    Advance to the next token and return true if there is one available.
void Reset ()
    Restart.
Provides an Enumerator interface (Current(), MoveNext(), Reset()) rather than an Iterator interface, because the tokenizer must look at the next token to determine whether there are more. An Iterator interface would be more cumbersome to implement.
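The enumerator pattern described above can be illustrated with a minimal sketch. The class `WhitespaceEnumerator` and the helper `collectTokens` are hypothetical stand-ins written for this example. They split on whitespace only and return `std::string` tokens, so they reproduce neither PtbTokenizer's tokenization rules nor the `PtbScanner::Token` type. Only the three method names and the `std::istream *` constructor argument are taken from the documented interface. Returning a null pointer from `Current()` when no token is available is an assumption of this sketch.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in, NOT the Tanl implementation: it splits on
// whitespace only and exposes the same three-method enumerator shape.
class WhitespaceEnumerator {
public:
    explicit WhitespaceEnumerator(std::istream* is)
        : is_(is), hasCurrent_(false) {}

    // Current token, or nullptr before the first MoveNext() and after
    // the input is exhausted (an assumption made for this sketch).
    std::string const* Current() const {
        return hasCurrent_ ? &current_ : nullptr;
    }

    // Attempting to read the next token is itself the "is there more?"
    // test, so no separate end-of-input query is needed.
    bool MoveNext() {
        hasCurrent_ = static_cast<bool>(*is_ >> current_);
        return hasCurrent_;
    }

    // Rewind to the start of the input; requires a seekable stream.
    void Reset() {
        is_->clear();   // drop eof/fail flags left by the last read
        is_->seekg(0);
        hasCurrent_ = false;
    }

private:
    std::istream* is_;
    std::string current_;
    bool hasCurrent_;
};

// Typical consumer loop: advance first, then read the current token.
std::vector<std::string> collectTokens(WhitespaceEnumerator& e) {
    std::vector<std::string> out;
    while (e.MoveNext()) {
        std::string const* tok = e.Current();
        assert(tok != nullptr);
        out.push_back(*tok);
    }
    return out;
}
```

With this shape the caller never has to ask whether another token exists before reading it. An iterator would need to compare against an end position, which forces the implementation to buffer one token ahead.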
Tanl::PtbTokenizer::PtbTokenizer (std::istream *is, char const *lang = 0) [inline]
Creates a new Tokenizer.
    is    the stream containing the sentence to read.