Tanl Linguistic Pipeline |
A Named Entity Recognizer.
#include <NER.h>
Public Member Functions | |
NER (char const *modelFile, char const *configFile=0, char const *POStag="POSTAG", char const *NEtag="NETAG") | |
Create a NER using model in modelFile and parameters from configFile. | |
void | train (SentenceReader *sentenceReader, char const *modelFile) |
Train the NER on tagged documents read from a corpus through sentenceReader, and save the model to the file modelFile. | |
std::vector< Token * > * | tag (std::vector< Token * > *sent, NerEventStream *eventStream=0) |
Tag the sentence sent, in the context of an eventStream that holds global document information. | |
Enumerator< std::vector< Token * > * > * | pipe (Enumerator< std::vector< Token * > * > &tve) |
Create a pipe pulling from an Enumerator tve. | |
Public Attributes | |
IXE::conf< std::string > | resourceDir |
Directory containing resource files used by the Tagger. | |
IXE::conf< std::string > | language |
The language. | |
IXE::conf< int > | cutoff |
Feature cutoff. | |
IXE::conf< int > | iter |
Number of iterations. | |
IXE::conf< float > | alpha |
Accuracy value for termination. | |
IXE::conf< bool > | verbose |
Verbose output. | |
char const * | POStag |
char const * | NEtag |
Friends | |
class | NerPipe |
class | NerPyPipe |
A Named Entity Recognizer.
Tanl::NER::NER::NER (char const *modelFile, char const *configFile=0, char const *POStag="POSTAG", char const *NEtag="NETAG")
Create a NER using model in modelFile and parameters from configFile.
Directory containing resource files used by the Tagger.
References language, Tanl::NER::Resources::load(), IXE::Configuration::load(), and resourceDir.
Enumerator< std::vector< Token * > * > * Tanl::NER::NER::pipe (Enumerator< std::vector< Token * > * > &tve) [virtual]
Create a pipe pulling from an Enumerator tve.
Implements Tanl::IPipe< std::vector< Token * > *, std::vector< Token * > * >.
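The pull-style pipe described above can be illustrated with a minimal, self-contained sketch. It does not use the Tanl headers: Token, Enumerator, VectorSource, ToyTagPipe and demoTags are hypothetical stand-ins, the MoveNext()/Current() interface is an assumption about how an Enumerator is consumed, and the capitalization rule is only a placeholder for the real classifier.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a token; the real Tanl::Token is richer.
struct Token {
    std::string form;   // surface form of the word
    std::string netag;  // entity tag, filled in by the tagging stage
};
typedef std::vector<Token*> Sentence;

// Assumed pull interface: MoveNext() advances, Current() returns the item.
template <class T>
struct Enumerator {
    virtual ~Enumerator() {}
    virtual bool MoveNext() = 0;
    virtual T Current() = 0;
};

// Source stage: enumerates sentences that are already in memory.
class VectorSource : public Enumerator<Sentence*> {
public:
    explicit VectorSource(std::vector<Sentence*>& sents)
        : sents_(sents), next_(0), current_(0) {}
    bool MoveNext() {
        if (next_ >= sents_.size()) return false;
        current_ = sents_[next_++];
        return true;
    }
    Sentence* Current() { return current_; }
private:
    std::vector<Sentence*>& sents_;
    std::size_t next_;
    Sentence* current_;
};

// Tagging stage: pulls a sentence from upstream only when asked for one.
class ToyTagPipe : public Enumerator<Sentence*> {
public:
    explicit ToyTagPipe(Enumerator<Sentence*>& upstream) : upstream_(upstream) {}
    bool MoveNext() { return upstream_.MoveNext(); }
    Sentence* Current() {
        Sentence* sent = upstream_.Current();
        assert(sent != 0);  // Current() is only valid after MoveNext() returned true
        for (std::size_t i = 0; i < sent->size(); ++i) {
            Token* tok = (*sent)[i];
            // Placeholder rule standing in for the classifier.
            bool cap = !tok->form.empty() &&
                       std::isupper(static_cast<unsigned char>(tok->form[0]));
            tok->netag = cap ? "ENT" : "O";
        }
        return sent;
    }
private:
    Enumerator<Sentence*>& upstream_;
};

// Builds one sentence, runs it through the pipe, and collects the tags.
std::vector<std::string> demoTags(const std::vector<std::string>& words) {
    std::vector<Token> storage(words.size());
    Sentence sent;
    for (std::size_t i = 0; i < words.size(); ++i) {
        storage[i].form = words[i];
        sent.push_back(&storage[i]);
    }
    std::vector<Sentence*> corpus(1, &sent);
    VectorSource source(corpus);
    ToyTagPipe pipe(source);
    std::vector<std::string> tags;
    while (pipe.MoveNext()) {
        Sentence* tagged = pipe.Current();
        for (std::size_t i = 0; i < tagged->size(); ++i)
            tags.push_back((*tagged)[i]->netag);
    }
    return tags;
}
```

Because each stage is itself an Enumerator, stages can be chained, and a sentence is processed only when the consumer at the end of the chain asks for it.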
std::vector< Token * > * Tanl::NER::NER::tag (std::vector< Token * > *sent, NerEventStream *eventStream=0)
Tag the sentence sent, in the context of an eventStream which holds global document information.
References Tanl::NER::NerEventStream::analyze(), Tanl::NER::Resources::entityTypes, Tanl::Classifier::MaxEnt::estimate(), Tanl::Token::get(), Tanl::NER::NerEventStream::hasNext(), Tanl::NER::NerEventStream::next(), Tanl::Classifier::Classifier::NumOutcomes(), Tanl::Classifier::Classifier::OutcomeID(), and Tanl::NER::NerEventStream::predicted().
Referenced by Tanl::NER::NerPyPipe::Current().
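The references listed for tag() (an event stream iterated with hasNext()/next(), a classifier queried through estimate(), and a predicted() callback) are consistent with a greedy tagging loop in which each decision feeds into the next one. The sketch below shows that general technique only; it is not Tanl's implementation. EstimateFn, greedyTag, toyEstimate and the three-outcome tag set are all hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical classifier interface: one probability per outcome,
// given the current word and the tag predicted for the previous word.
typedef std::vector<double> (*EstimateFn)(const std::string& word,
                                          const std::string& previousTag);

// Index of the largest probability (first one wins on ties).
std::size_t argmax(const std::vector<double>& probs) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < probs.size(); ++i)
        if (probs[i] > probs[best]) best = i;
    return best;
}

// Greedy left-to-right tagging: pick the most probable outcome for each
// word and feed that prediction back as context for the next word.
std::vector<std::string> greedyTag(const std::vector<std::string>& words,
                                   const std::vector<std::string>& outcomes,
                                   EstimateFn estimate) {
    std::vector<std::string> tags;
    std::string previous = "O";  // assumed start-of-sentence context
    for (std::size_t i = 0; i < words.size(); ++i) {
        std::vector<double> probs = estimate(words[i], previous);
        assert(!probs.empty() && probs.size() == outcomes.size());
        previous = outcomes[argmax(probs)];
        tags.push_back(previous);
    }
    return tags;
}

// Toy estimator over the outcomes {"O", "B", "I"}: capitalized words start
// an entity after "O" and continue it otherwise. Purely illustrative numbers.
std::vector<double> toyEstimate(const std::string& word,
                                const std::string& previousTag) {
    std::vector<double> p(3, 0.0);
    bool cap = !word.empty() &&
               std::isupper(static_cast<unsigned char>(word[0]));
    if (!cap)                    { p[0] = 0.90; p[1] = 0.05; p[2] = 0.05; }
    else if (previousTag == "O") { p[0] = 0.10; p[1] = 0.80; p[2] = 0.10; }
    else                         { p[0] = 0.10; p[1] = 0.20; p[2] = 0.70; }
    return p;
}
```

With the outcomes ordered as "O", "B", "I", calling greedyTag on the words "New", "York", "is" with toyEstimate labels the first two words as the beginning and continuation of an entity and the last as outside any entity.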
Verbose output.
Referenced by train().