Tanl Linguistic Pipeline

IXE::PostingList::const_iterator Class Reference

Inherited by IXE::PostingList::remap_iterator.


Public Member Functions

reference operator* ()
pointer operator-> ()
const_iterator & operator++ ()
 Advance a PostingList::const_iterator.
const_iterator operator++ (int)
const_iterator & next (DocID min)
 Advance the posting list to the document with DocID min.
bool atEnd ()
size_type size () const
void copyHits (std::fstream &o)
 Copy hits to file.
size_type index () const

Public Attributes

HitsCursor hitsCursor

Protected Member Functions

 const_iterator (size_type s, byte const *p)

Protected Attributes

size_type rest_
 Number of remaining elements, including the current one (already read into posting). It is never 0 while the iterator is valid, since 0 marks the end.
byte const * c_
 start of next Posting (may differ from end of current one)
size_type tablesz_
 The size of the skip list table.
size_type size_
 The size of the PostingList (same as size_ of parent PostingList).
PostingOffset table_
 For every Postings_Segment_Size-th posting, table_ holds the offset of that posting and the DocID of the corresponding document.
value_type posting
Size hitlen

Friends

class PostingList
bool operator== (const_iterator const &i, const_iterator const &j)
bool operator!= (const_iterator const &i, const_iterator const &j)

Member Function Documentation

size_type IXE::PostingList::const_iterator::index (  )  const [inline]
Returns:
the current position in the posting list.

References rest_ and size_.

Referenced by IXE::PostingList::remap_iterator::operator++().

PostingList::const_iterator & IXE::PostingList::const_iterator::next ( DocID  min  ) 

Advance the posting list to the document with DocID min.

Uses the PostingOffset table_ to binary-search for the segment containing the requested posting and jumps to the start of that segment.

Parameters:
min the DocID of the requested posting.

References c_, operator++(), IXE::parseEptacode(), rest_, size_, table_, and tablesz_.

Referenced by IXE::PostingList::remap_iterator::operator++().
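The lookup strategy described for next() (binary search over the skip table, then a linear scan inside one segment) can be illustrated with a minimal sketch. It works on an already decoded, sorted vector of DocIDs rather than on the byte stream that the real iterator reads. SkipEntry, buildSkipTable and seek are hypothetical names introduced only for this illustration, and the sketch assumes that the search stops at the first posting whose DocID is greater than or equal to min.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the IXE types; the real definitions are not shown here.
using DocID = std::uint32_t;

struct SkipEntry {        // plays the role of one PostingOffset entry
    std::size_t pos;      // index of the last posting of a segment
    DocID docid;          // DocID stored in that posting
};

// Record one entry per full segment: entry k describes posting (k+1)*segment.
std::vector<SkipEntry> buildSkipTable(const std::vector<DocID>& docs,
                                      std::size_t segment) {
    assert(segment > 0);
    std::vector<SkipEntry> table;
    for (std::size_t pos = segment - 1; pos < docs.size(); pos += segment)
        table.push_back({pos, docs[pos]});
    return table;
}

// Return the index of the first posting with DocID >= min (assumed semantics),
// or docs.size() when no such posting exists.
std::size_t seek(const std::vector<DocID>& docs,
                 const std::vector<SkipEntry>& table, DocID min) {
    // Binary search: first table entry whose DocID is not below min.
    auto it = std::lower_bound(table.begin(), table.end(), min,
        [](const SkipEntry& e, DocID d) { return e.docid < d; });
    // Everything up to the previous entry is below min, so start scanning there.
    std::size_t pos = (it == table.begin()) ? 0 : (it - 1)->pos;
    // Linear scan, bounded by roughly one segment when a match exists.
    while (pos < docs.size() && docs[pos] < min) ++pos;
    return pos;
}
```

With nine postings and a segment size of 3, the table holds three entries, and a lookup touches at most a few postings after the binary search instead of scanning the list from the beginning.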

PostingList::const_iterator & IXE::PostingList::const_iterator::operator++ (  ) 

Advance a PostingList::const_iterator.

Returns:
a reference to itself, as is standard practice for iterators.

A posting has the following format:

I[0x80{M}...0x80]OL{H}^O

that is: a DocID (I), followed by zero or more TermColors (M) surrounded by 0x80 bytes, followed by the number of occurrences in the document (O), followed by the byte length of the hit list less O (L), followed by O hits, i.e. the positions where the word occurs in document I. Each H is a document position, represented as a delta increment with respect to the previous one. The first word is at position 1.

See also:
Indexer::writeIndex() for full details of the index file format.

Reimplemented in IXE::PostingList::remap_iterator.

References c_, IXE::parseEptacode(), and rest_.

Referenced by next().
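The delta representation of hits described above can be illustrated with a short sketch. It covers only the arithmetic on already decoded integers, not the byte-level encoding handled by IXE::parseEptacode(). decodeHits and encodeHits are hypothetical helper names, and the sketch assumes that the first delta is taken relative to position 0, so that a first delta of 1 yields position 1.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Turn delta increments into absolute document positions (running sum).
// Assumption: the first delta is relative to position 0.
std::vector<std::uint32_t> decodeHits(const std::vector<std::uint32_t>& deltas) {
    std::vector<std::uint32_t> positions;
    positions.reserve(deltas.size());
    std::uint32_t pos = 0;
    for (std::uint32_t d : deltas) {
        pos += d;
        positions.push_back(pos);
    }
    return positions;
}

// Inverse operation: turn sorted absolute positions into delta increments.
std::vector<std::uint32_t> encodeHits(const std::vector<std::uint32_t>& positions) {
    std::vector<std::uint32_t> deltas;
    deltas.reserve(positions.size());
    std::uint32_t prev = 0;
    for (std::uint32_t p : positions) {
        assert(p >= prev);          // positions must be in ascending order
        deltas.push_back(p - prev);
        prev = p;
    }
    return deltas;
}
```

For example, the deltas 1, 4, 2, 10 correspond to occurrences at positions 1, 5, 7 and 17. Storing small deltas instead of absolute positions keeps the numbers small, which suits a variable-length integer encoding.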


Member Data Documentation

For every Postings_Segment_Size-th posting, table_ holds the offset of that posting and the DocID of the corresponding document.

This is used to binary-search for the segment containing the posting; the segment is then scanned linearly. Example (segments of 1024 postings):

0: off0, 1234 (the 1024th posting contains DocID 1234, at offset off0)
1: off1, 2345 (the 2*1024th posting contains DocID 2345, at offset off1)
...
n: offn, NNNN (the (n+1)*1024th posting contains DocID NNNN, at offset offn)

Referenced by next() and IXE::PostingList::remap_iterator::remap_iterator().


The documentation for this class was generated from the following files:
Copyright © 2005-2011 G. Attardi. Generated on 4 Mar 2011 by doxygen 1.6.1.