Tanl Linguistic Pipeline

IXE::TermHit Struct Reference

TermHit is used to represent a word occurrence in a document, a sentence delimiter or a tag.

#include <TermHit.h>


Public Types

enum  Type {
  word, fullstop, parstop, tag,
  lex
}
enum  Case { lower, upper, capital }
typedef char Char

Public Member Functions

 TermHit (Char *term=0)
void MarkTag ()
 Mark term as tag to distinguish it from normal words.
void UnmarkTag ()

Public Attributes

Char * term
 the normalized term, to be indexed
Char * form
 the original term form, stored in lexicon
TermColor color
 tag or name of attribute within which word appears
Type type
 the type of term
int length
 the length of the term
Case case_
 word case
TermWeight weight
 ranking weight
off32_t offset
 char position from beginning of file (not byte offset)
HitPosition position
 term position in document
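The distinction drawn for offset (character position rather than byte offset) matters whenever the text uses a multi-byte encoding. The sketch below assumes UTF-8 input, which this reference does not state; the function name charOffsetFromByteOffset is hypothetical and not part of IXE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of IXE: converts a byte offset into a
// character offset, assuming `text` is valid UTF-8.
// UTF-8 continuation bytes have the bit pattern 10xxxxxx, so every byte
// that does NOT match that pattern starts a new character.
inline std::size_t charOffsetFromByteOffset(const std::string& text,
                                            std::size_t byteOffset)
{
    std::size_t chars = 0;
    for (std::size_t i = 0; i < byteOffset && i < text.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if ((c & 0xC0) != 0x80)
            ++chars;
    }
    return chars;
}

inline void offsetExample()
{
    // "caffè latte": 'è' takes two bytes, so "latte" starts at
    // byte 7 but at character 6.
    assert(charOffsetFromByteOffset("caff\xC3\xA8 latte", 7) == 6);
}
```

For pure ASCII text the two offsets coincide, so the difference only shows up once accented or non-Latin characters appear before the term.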

Detailed Description

TermHit is used to represent a word occurrence in a document, a sentence delimiter or a tag.
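
As a rough illustration of how the members listed above fit together, the following sketch mirrors the struct in a separate namespace and fills it in for one word. It is not the IXE header: the placeholder typedefs for TermColor, TermWeight, off32_t and HitPosition, the constructor defaults, and the helpers classifyCase and makeWordHit are all assumptions made for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <cstring>

namespace sketch {

typedef char Char;
// Placeholder types (assumptions): the real definitions live elsewhere
// in IXE and are not shown in this reference.
typedef int          TermColor;
typedef float        TermWeight;
typedef std::int32_t off32_t;
typedef unsigned     HitPosition;

struct TermHit {
    enum Type { word, fullstop, parstop, tag, lex };
    enum Case { lower, upper, capital };

    Char*       term;      // the normalized term, to be indexed
    Char*       form;      // the original term form
    TermColor   color;
    Type        type;
    int         length;
    Case        case_;
    TermWeight  weight;
    off32_t     offset;    // character position, not byte offset
    HitPosition position;

    // Default values are assumptions made for this sketch.
    TermHit(Char* term = 0)
        : term(term), form(term), color(0), type(word),
          length(term ? static_cast<int>(std::strlen(term)) : 0),
          case_(lower), weight(1.0f), offset(0), position(0) {}
};

// Hypothetical helper: derive the Case value from the original form.
// A form whose first letter is uppercase is "capital" if any lowercase
// letter follows, otherwise "upper"; everything else is "lower".
inline TermHit::Case classifyCase(const char* form)
{
    if (!form || !std::isupper(static_cast<unsigned char>(*form)))
        return TermHit::lower;
    for (const char* p = form + 1; *p; ++p)
        if (std::islower(static_cast<unsigned char>(*p)))
            return TermHit::capital;
    return TermHit::upper;
}

// Hypothetical helper: build a hit for a normal word.
inline TermHit makeWordHit(Char* normalized, Char* original,
                           off32_t charOffset, HitPosition pos)
{
    TermHit hit(normalized);
    hit.form     = original;
    hit.case_    = classifyCase(original);
    hit.offset   = charOffset;
    hit.position = pos;
    return hit;
}

} // namespace sketch
```

Keeping term and form as separate pointers lets the index store the normalized spelling while the original spelling and its case are still available to later stages.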


Member Enumeration Documentation

enum IXE::TermHit::Type

Enumerator:
word        normal word
fullstop    end of sentence
parstop     end of paragraph
tag         annotation tag
lex         enter in lexicon, do not index
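
A consumer of term hits would typically branch on the Type value. The sketch below shows one plausible dispatch, assuming that word and tag hits go to the index, that lex hits go to the lexicon only, and that the two delimiter types are counted separately. The standalone enum copies the enumerator names from TermHit; the Counts struct and the summarize function are hypothetical and not part of IXE.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Standalone copy of the enumerators of IXE::TermHit::Type.
enum Type { word, fullstop, parstop, tag, lex };

// Hypothetical tally of what a consumer would do with each hit.
struct Counts {
    int indexed;      // word and tag hits, which share one index
    int lexiconOnly;  // lex hits: entered in the lexicon, not indexed
    int sentences;    // fullstop hits
    int paragraphs;   // parstop hits
};

inline Counts summarize(const std::vector<Type>& hits)
{
    Counts c = {0, 0, 0, 0};
    for (std::size_t i = 0; i < hits.size(); ++i) {
        switch (hits[i]) {
        case word:
        case tag:      ++c.indexed;     break;
        case lex:      ++c.lexiconOnly; break;
        case fullstop: ++c.sentences;   break;
        case parstop:  ++c.paragraphs;  break;
        }
    }
    return c;
}
```

Whether a parstop also implies the end of a sentence is not stated in this reference, so the sketch keeps the two counters independent.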


Member Function Documentation

void IXE::TermHit::MarkTag (  )  [inline]

Mark term as tag to distinguish it from normal words.

Used to store tags as well as words in the same index.

References term.
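
This reference does not show how MarkTag changes term. One common way to keep tags and words apart in a shared index is to prefix tag terms with a reserved character that cannot occur in normal words. The sketch below illustrates that general technique only; the marker value and the functions markTag, unmarkTag and isTag are assumptions, not the IXE implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical marker: a control character assumed never to appear
// in a normal word.
const char kTagMark = '\x01';

// True if the term carries the tag marker.
inline bool isTag(const std::string& term)
{
    return !term.empty() && term[0] == kTagMark;
}

// Prefix the term with the marker; already marked terms are unchanged.
inline std::string markTag(const std::string& term)
{
    return isTag(term) ? term : kTagMark + term;
}

// Strip the marker if present.
inline std::string unmarkTag(const std::string& term)
{
    return isTag(term) ? term.substr(1) : term;
}

inline void tagExample()
{
    // The word "title" and the tag "title" get distinct index keys.
    assert(markTag("title") != "title");
    assert(unmarkTag(markTag("title")) == "title");
}
```

With this scheme a lookup for the word never matches the tag of the same name, and the original term can be recovered by removing the prefix.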


The documentation for this struct was generated from the following file:
 
Copyright © 2005-2011 G. Attardi. Generated on 4 Mar 2011 by doxygen 1.6.1.