Tanl Linguistic Pipeline |
Base class for parsers.
#include <Parser.h>
Public Member Functions | |
Parser (WordIndex &predIndex) | |
virtual void | train (SentenceReader *sentenceReader, char const *modelFile) |
Train a statistical model using sentences obtained through sentenceReader, and save the generated model to modelFile. | |
virtual Sentence * | parse (Sentence *sentence) |
Parse the given Sentence sentence. | |
virtual void | parse (SentenceReader *sentenceReader, std::ostream &os=std::cout) |
Parse all sentences extracted by sentenceReader, sending output to os. | |
virtual void | revise (SentenceReader *sentenceReader, char const *actionFile=0) |
Produce a revision of a document's parses, using either a model or an action file. | |
std::deque< Sentence * > | collectSentences (Enumerator< Sentence * > *sentenceReader) |
Collect sentences and replace infrequent token attributes with UNKNOWN. | |
virtual void | showEval (int tokenCount, int las, int uas, int sentCount) |
Print accuracy estimates. | |
void | writeHeader (std::ostream &os) |
Write model header to stream. | |
Enumerator< Sentence * > * | pipe (Enumerator< std::vector< Token * > * > &tve) |
IPipe interface. | |
Enumerator< Sentence * > * | pipe (Enumerator< Sentence * > &tce) |
Alternative pipeline interface that allows connecting directly to a SentenceReader. | |
virtual void | preprocess (Sentence *sentence) |
Preprocess sentence, e.g. normalize tokens. | |
Static Public Member Functions | |
static Parser * | create (char const *modelFile=0) |
Create a Parser based on configuration and data in file modelFile. | |
static bool | readHeader (std::istream &is) |
Read model header from stream. | |
static std::string | procStat () |
Return a string of process statistics: time: user+sys elapsed, realtime elapsed, CPU usage, memory usage. | |
Public Attributes | |
WordIndex & | predIndex |
GlobalInfo | info |
Static Public Attributes | |
static IXE::conf< int > | featureCutoff |
Drop features that occur fewer than this number of times. | |
static IXE::conf< int > | lexCutoff |
Forms or lemmas occurring fewer than LexCutoff times are collapsed to Unknown. | |
static IXE::conf< bool > | verbose |
Control output. |
Base class for parsers.
Sentence * Parser::Parser::parse | ( | Sentence * | sentence | ) | [virtual] |
Parse the given Sentence sentence.
Reimplemented in Parser::ApParser, Parser::MeParser, Parser::MlpParser, Parser::MultiSvmParser, and Parser::SvmParser.
Referenced by Parser::ParserSentPipe::Current(), Parser::ParserPipe::Current(), and Parser::ParserPipePython::Current().
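Since parse() is virtual and reimplemented by each concrete parser, callers can work through a base-class pointer or reference. The following is only a minimal sketch of that pattern: `Sentence`, `BaseParser`, and `ChainParser` are hypothetical stand-ins, not the Tanl types, and returning the same pointer that was passed in is an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a sentence: word forms plus one head id per token.
struct Sentence {
  std::vector<std::string> tokens;
  std::vector<int> heads;  // 1-based id of each token's head, 0 = root
};

// Hypothetical stand-in for the base class: only the virtual parse() is modelled.
class BaseParser {
public:
  virtual ~BaseParser() = default;
  virtual Sentence* parse(Sentence* sentence) = 0;
};

// Toy subclass: attaches each token to the token before it,
// and the first token to the root.
class ChainParser : public BaseParser {
public:
  Sentence* parse(Sentence* sentence) override {
    sentence->heads.resize(sentence->tokens.size());
    for (std::size_t i = 0; i < sentence->tokens.size(); ++i)
      sentence->heads[i] = static_cast<int>(i);
    return sentence;  // assumption: the input sentence is annotated in place
  }
};
```

Code written against `BaseParser&` then picks up whichever reimplementation the object provides, which is the role the concrete parsers listed above play for Parser.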
Enumerator< Sentence * > * Parser::Parser::pipe | ( | Enumerator< std::vector< Token * > * > & | tve | ) |
IPipe interface.
Parameters: tve
void Parser::Parser::preprocess | ( | Sentence * | sentence | ) | [virtual] |
Preprocess sentence, e.g. normalize tokens.
References Tanl::Token::links.
Referenced by collectSentences(), Parser::SvmParser::parse(), Parser::MultiSvmParser::parse(), Parser::MlpParser::parse(), Parser::MeParser::parse(), and Parser::ApParser::parse().
static bool Parser::Parser::readHeader | ( | std::istream & | is | ) | [static] |
virtual void Parser::Parser::revise | ( | SentenceReader * | sentenceReader, | |
char const * | actionFile = 0 | |||
) | [inline, virtual] |
Produce a revision of a document's parses, using either a model or an action file.
If an actionFile is provided, it must contain a list of actions, one per line, to apply to the parse trees; otherwise the actions to perform revisions are determined using the model.
Reimplemented in Parser::ApParser, Parser::MeParser, and Parser::MlpParser.
void Parser::Parser::showEval | ( | int | tokenCount, | |
int | las, | |||
int | uas, | |||
int | sentCount | |||
) | [virtual] |
Print accuracy estimates.
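As an illustration of how such estimates are commonly derived, the sketch below assumes that las and uas are counts of tokens with a correct labeled and unlabeled attachment, respectively, and turns them into percentages of tokenCount. The helpers `percent` and `formatEval` and the output layout are hypothetical; the text actually printed by showEval() is not documented here.

```cpp
#include <cstdio>
#include <string>

// Percentage of `correct` out of `total`; 0 when there are no tokens.
inline double percent(int correct, int total)
{
  return total > 0 ? 100.0 * correct / total : 0.0;
}

// Format attachment scores as text (layout is an assumption of this sketch).
inline std::string formatEval(int tokenCount, int las, int uas, int sentCount)
{
  char buf[128];
  std::snprintf(buf, sizeof buf,
                "sentences: %d, tokens: %d, LAS: %.2f%%, UAS: %.2f%%",
                sentCount, tokenCount,
                percent(las, tokenCount), percent(uas, tokenCount));
  return buf;
}
```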
void Parser::Parser::writeHeader | ( | std::ostream & | os | ) |
Write model header to stream.
Parameters: os
Referenced by Parser::MultiSvmParser::train(), Parser::MlpParser::train(), Parser::MeParser::train(), and Parser::ApParser::train().
conf< int > Parser::Parser::featureCutoff [static] |
Drop features that occur fewer than this number of times.
Referenced by Parser::MlpModel::collectEvents(), Parser::MultiSvmParser::train(), Parser::MeParser::train(), and Parser::ApParser::train().
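The effect of a feature cutoff can be sketched as a frequency filter. `FeatureCounts` and `pruneFeatures` are hypothetical names, and how the trainers actually store and count features is not shown in this documentation.

```cpp
#include <map>
#include <string>

// Hypothetical feature table: feature name -> number of occurrences.
using FeatureCounts = std::map<std::string, int>;

// Keep only the features observed at least `cutoff` times.
inline FeatureCounts pruneFeatures(const FeatureCounts& counts, int cutoff)
{
  FeatureCounts kept;
  for (const auto& entry : counts)
    if (entry.second >= cutoff)
      kept.insert(entry);
  return kept;
}
```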
conf< int > Parser::Parser::lexCutoff [static] |
Forms or lemmas occurring fewer than LexCutoff times are collapsed to Unknown.
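A lexical cutoff of this kind can be sketched in two passes: count every form, then overwrite the rare ones. In the sketch below `Forms` and `collapseRareForms` are hypothetical, sentences are reduced to lists of strings, and the placeholder spelling "UNKNOWN" is an assumption.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a sentence: just its word forms.
using Forms = std::vector<std::string>;

// Replace every form seen fewer than `cutoff` times in `corpus`
// with the placeholder "UNKNOWN" (placeholder spelling is an assumption).
inline void collapseRareForms(std::vector<Forms>& corpus, int cutoff)
{
  // First pass: count occurrences of each form over the whole corpus.
  std::unordered_map<std::string, int> freq;
  for (const Forms& sentence : corpus)
    for (const std::string& form : sentence)
      ++freq[form];

  // Second pass: overwrite the forms that fall below the cutoff.
  for (Forms& sentence : corpus)
    for (std::string& form : sentence)
      if (freq.at(form) < cutoff)
        form = "UNKNOWN";
}
```

Counting before rewriting matters: the decision for each token is based on its frequency in the original corpus, not on a partially rewritten one.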