Tanl Linguistic Pipeline
TermHit is used to represent a word occurrence in a document, a sentence delimiter or a tag.
#include <TermHit.h>
Public Types
enum Type { word, fullstop, parstop, tag, lex }
enum Case { lower, upper, capital }
typedef char Char
Public Member Functions
TermHit (Char *term=0)
void MarkTag ()
    Mark term as tag to distinguish it from normal words.
void UnmarkTag ()
Public Attributes
Char * term
    the normalized term, to be indexed
Char * form
    the original term form, stored in lexicon
TermColor color
    tag or name of attribute within which the word appears
Type type
    the type of term
int length
    the term's length
Case case_
    word case
TermWeight weight
    ranking weight
off32_t offset
    character position from the beginning of the file (not a byte offset)
HitPosition position
    term position in document
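The offset attribute counts characters rather than bytes, and the two differ as soon as the text contains multi-byte characters. The sketch below illustrates that difference under the assumption that the input is UTF-8, which this page does not state. The helper charOffsetFromByteOffset is hypothetical and is not part of TermHit.h.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of TermHit.h.
// ASSUMPTION: the buffer is UTF-8 encoded.
// Converts a byte offset into a character offset by counting the bytes that
// start a code point, i.e. every byte that is not a continuation byte of the
// form 10xxxxxx.
std::size_t charOffsetFromByteOffset(const std::string& utf8,
                                     std::size_t byteOffset)
{
    assert(byteOffset <= utf8.size());
    std::size_t chars = 0;
    for (std::size_t i = 0; i < byteOffset; ++i) {
        unsigned char b = static_cast<unsigned char>(utf8[i]);
        if ((b & 0xC0) != 0x80)  // lead byte or ASCII: a new character starts
            ++chars;
    }
    return chars;
}
```

For a buffer holding "café bar" in UTF-8, the word "bar" starts at byte 6 but at character 5, because "é" occupies two bytes.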
enum IXE::TermHit::Type
void IXE::TermHit::MarkTag () [inline]
Mark term as tag to distinguish it from normal words.
Used to store tags as well as words in the same index.
References term.
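This page does not show how MarkTag alters term. One way tags and words could share an index without colliding is to flip a reserved bit in the stored term. The sketch below assumes that scheme: it sets the high bit of the first byte, which works only for terms whose first byte is 7-bit ASCII. TermHitSketch, IsTag and the bit choice are all hypothetical stand-ins, not the actual IXE implementation.

```cpp
#include <cassert>

// Stand-in for the documented class; only the members needed here are mirrored.
struct TermHitSketch {
    typedef char Char;
    enum Type { word, fullstop, parstop, tag, lex };

    Char* term;  // the normalized term, to be indexed
    Type  type;  // the type of term

    explicit TermHitSketch(Char* t = 0) : term(t), type(word) {}

    // ASSUMPTION: a tag is marked by setting the high bit of the first byte,
    // so the tag "title" and the word "title" become distinct index keys.
    void MarkTag() {
        assert(term && term[0] != '\0');
        term[0] = static_cast<Char>(static_cast<unsigned char>(term[0]) | 0x80);
    }

    // Clears the marker bit, restoring the original first byte.
    void UnmarkTag() {
        assert(term && term[0] != '\0');
        term[0] = static_cast<Char>(static_cast<unsigned char>(term[0]) & 0x7F);
    }

    // Hypothetical query, not listed among the documented members.
    bool IsTag() const {
        return term && (static_cast<unsigned char>(term[0]) & 0x80) != 0;
    }
};
```

With this scheme, calling MarkTag on a hit whose term buffer holds "title" changes its first byte, and UnmarkTag restores the original string. The real class may use a different marker.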