Tanl Linguistic Pipeline |
Public Types | |
typedef unsigned char | ID |
Public Member Functions | |
Encoding (char const name[], ID id, float averageBytesPerChar=1.0, float maxBytesPerChar=1.0) | |
Encoding (char const *name, float averageBytesPerChar=1.0, float maxBytesPerChar=1.0) | |
std::string | Name () |
name of this encoding | |
size_t | Encode (Encoding const *fromCode, char const *in, size_t inlen, char *&out, size_t outlen=0) const |
Converts a multibyte sequence starting at in, of length inlen, from character encoding fromCode to this encoding. | |
Static Public Member Functions | |
static Encoding const * | get (char const *name) |
Get the encoding with the given name. | |
static Encoding const * | get (ID id) |
Get the encoding with the given id. | |
static void | Register (Encoding *encoding) |
register known Encodings | |
static void | Register (char const *alias, char const *canonical) |
Public Attributes | |
std::string | name |
the official canonical name | |
ID | id |
the internal id for the encoding | |
float | averageBytesPerChar |
the average bytes used to encode one character | |
float | maxBytesPerChar |
the maximum number of bytes used to encode one character |
size_t Tanl::Text::Encoding::Encode (Encoding const *fromCode, char const *in, size_t inlen, char *&out, size_t outlen=0) const
Converts a multibyte sequence starting at in, of length inlen, from character encoding fromCode to this encoding.
The converted sequence is stored in out, for a maximum size of outlen. If outlen is 0, a buffer is allocated with malloc() and returned in out.
References averageBytesPerChar and name.
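The buffer contract described above (a caller-supplied buffer when outlen is non-zero, a malloc() allocation when it is 0) can be illustrated with a small stand-alone sketch. It does not use the Tanl library: encodeSketch, convertAllocated and convertIntoBuffer are hypothetical names, the "conversion" is a plain byte copy, and both the return value (bytes written) and the truncation to outlen are assumptions, since the documentation does not specify them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for the documented buffer contract of Encode().
// It copies bytes unchanged; the real method converts between encodings.
// Assumption: the return value is the number of bytes written to out.
std::size_t encodeSketch(char const* in, std::size_t inlen,
                         char*& out, std::size_t outlen = 0)
{
    if (outlen == 0) {
        // No buffer supplied: allocate one with malloc(), as documented.
        // Request at least one byte so the result of malloc(0) is never relied on.
        out = static_cast<char*>(std::malloc(inlen ? inlen : 1));
        if (!out) return 0;
        outlen = inlen;
    }
    // Assumption: output is truncated when the buffer is too small.
    std::size_t n = inlen < outlen ? inlen : outlen;
    std::memcpy(out, in, n);
    return n;
}

// Case 1: outlen left at 0, so the callee allocates.
// Because the buffer comes from malloc(), it is released here with free().
std::string convertAllocated(std::string const& text)
{
    char* out = nullptr;
    std::size_t n = encodeSketch(text.data(), text.size(), out);
    std::string result(out ? out : "", out ? n : 0);
    std::free(out);
    return result;
}

// Case 2: the caller owns the buffer and passes its size as outlen.
// buflen must be non-zero, otherwise the callee would allocate instead.
std::size_t convertIntoBuffer(std::string const& text,
                              char* buf, std::size_t buflen)
{
    assert(buflen > 0);
    char* out = buf;  // out is taken by reference, so pass a named pointer
    return encodeSketch(text.data(), text.size(), out, buflen);
}
```

Because out is declared as char *&, the caller has to pass a named pointer variable rather than an array or a temporary, which is why convertIntoBuffer copies buf into a local pointer first.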