Tanl Linguistic Pipeline

Tanl::Text::Encoding Class Reference


Public Types

typedef unsigned char ID

Public Member Functions

 Encoding (char const name[], ID id, float averageBytesPerChar=1.0, float maxBytesPerChar=1.0)
 Encoding (char const *name, float averageBytesPerChar=1.0, float maxBytesPerChar=1.0)
std::string Name ()
 name of this encoding
size_t Encode (Encoding const *fromCode, char const *in, size_t inlen, char *&out, size_t outlen=0) const
 Converts a multibyte sequence starting at in, of length inlen, from character encoding fromCode to this encoding.

Static Public Member Functions

static Encoding const * get (char const *name)
 Get the encoding with the given name.
static Encoding const * get (ID id)
 Get the encoding with the given id.
static void Register (Encoding *encoding)
 Register a known encoding.
static void Register (char const *alias, char const *canonical)

Public Attributes

std::string name
 the official canonical name
ID id
 the internal id for the encoding
float averageBytesPerChar
 the average bytes used to encode one character
float maxBytesPerChar
 the maximum number of bytes used to encode one character

Member Function Documentation

size_t Tanl::Text::Encoding::Encode (Encoding const *fromCode, char const *in, size_t inlen, char *&out, size_t outlen=0) const

Converts a multibyte sequence starting at in, of length inlen, from character encoding fromCode to this encoding.

The converted sequence is stored in out, which can hold at most outlen bytes. If outlen is 0, a buffer is allocated with malloc() and returned in out.

Returns:
the size of the converted sequence, or 0 if conversion failed.

References averageBytesPerChar and name.
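
The two buffer modes described for Encode() can be sketched as follows. This is a minimal illustration, not Tanl code: the sketch namespace and its Encoding struct are a hypothetical stand-in that mirrors only the documented signature and performs a plain byte copy instead of a real conversion, and the helpers convertAllocating and convertInto are invented for the example. It also assumes that the caller releases a malloc()-allocated buffer with free() and that output which does not fit in outlen counts as a failure; the reference above states neither.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

namespace sketch {

// Hypothetical stand-in mirroring the documented Encode() signature.
// NOT Tanl's implementation: the "conversion" is a plain byte copy.
struct Encoding {
  std::string name;

  explicit Encoding(char const* n) : name(n) {}

  std::size_t Encode(Encoding const* fromCode, char const* in,
                     std::size_t inlen, char*& out,
                     std::size_t outlen = 0) const {
    if (!fromCode || !in || inlen == 0) return 0;
    if (outlen == 0) {
      // Documented behaviour: allocate with malloc() and return it in out.
      out = static_cast<char*>(std::malloc(inlen));
      if (!out) return 0;
    } else if (!out || inlen > outlen) {
      return 0;  // assumption: output that does not fit is a failure
    }
    std::memcpy(out, in, inlen);
    return inlen;
  }
};

// Mode 1: outlen left at 0, so Encode() allocates the buffer.
std::string convertAllocating(Encoding const& to, Encoding const& from,
                              std::string const& text) {
  char* out = nullptr;
  std::size_t n = to.Encode(&from, text.data(), text.size(), out);
  if (n == 0) return std::string();  // 0 signals a failed conversion
  std::string result(out, n);
  std::free(out);  // assumption: the caller owns the malloc()'d buffer
  return result;
}

// Mode 2: the caller supplies a buffer and its capacity in outlen.
std::size_t convertInto(Encoding const& to, Encoding const& from,
                        std::string const& text, char* buf,
                        std::size_t buflen) {
  char* out = buf;  // Encode() takes the pointer by reference
  return to.Encode(&from, text.data(), text.size(), out, buflen);
}

}  // namespace sketch
```

Because out is passed as a reference to a pointer, the caller-supplied case needs a named pointer variable rather than an array. With the real class, the Encoding pointers would presumably come from Encoding::get() rather than being constructed directly.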


Copyright © 2005-2011 G. Attardi. Generated on 4 Mar 2011 by doxygen 1.6.1.