Torque Game Engine Documentation
TGE Version 1.5.2

engine/core/unicode.h File Reference

#include "platform/types.h"

Functions

UTF16 * convertUTF8toUTF16 (const UTF8 *unistring)
 Functions that convert buffers of Unicode code points, allocating a buffer.
UTF32 * convertUTF8toUTF32 (const UTF8 *unistring)
UTF8 * convertUTF16toUTF8 (const UTF16 *unistring)
UTF32 * convertUTF16toUTF32 (const UTF16 *unistring)
UTF8 * convertUTF32toUTF8 (const UTF32 *unistring)
UTF16 * convertUTF32toUTF16 (const UTF32 *unistring)
const U32 convertUTF8toUTF16 (const UTF8 *unistring, UTF16 *outbuffer, U32 len)
 Functions that convert buffers of unicode code points, into a provided buffer.
const U32 convertUTF8toUTF32 (const UTF8 *unistring, UTF32 *outbuffer, U32 len)
const U32 convertUTF16toUTF8 (const UTF16 *unistring, UTF8 *outbuffer, U32 len)
const U32 convertUTF16toUTF32 (const UTF16 *unistring, UTF32 *outbuffer, U32 len)
const U32 convertUTF32toUTF8 (const UTF32 *unistring, UTF8 *outbuffer, U32 len)
const U32 convertUTF32toUTF16 (const UTF32 *unistring, UTF16 *outbuffer, U32 len)
const UTF32 oneUTF8toUTF32 (const UTF8 *codepoint, U32 *unitsWalked=NULL)
 Functions that convert one Unicode code point at a time.
  • Since these functions are designed to be used in tight loops, they do not allocate buffers.

const UTF32 oneUTF16toUTF32 (const UTF16 *codepoint, U32 *unitsWalked=NULL)
const UTF16 oneUTF32toUTF16 (const UTF32 codepoint)
const U32 oneUTF32toUTF8 (const UTF32 codepoint, UTF8 *threeByteCodeunitBuf)
const U32 dStrlen (const UTF16 *unistring)
 Functions that calculate the length of unicode strings.
const U32 dStrlen (const UTF32 *unistring)
const UTF8 * getNthCodepoint (const UTF8 *unistring, const U32 n)
 Functions that scan for characters in a utf8 string.


Function Documentation

UTF16* convertUTF8toUTF16 ( const UTF8 * unistring )

Functions that convert buffers of unicode code points, allocating a buffer.

  • These functions allocate their own return buffers. You are responsible for calling delete[] on these buffers.
  • Because they allocate memory, do not use these functions in a tight loop.
  • These are useful when you need a new long-term copy of a string.
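The ownership rule above can be sketched as follows. This is a minimal, self-contained illustration, not the engine's implementation: `sketchConvertAsciiToUTF16` and `sketchFirstUnitOf` are hypothetical stand-ins, the typedefs are assumed equivalents of those in platform/types.h, and the stand-in handles ASCII input only.

```cpp
#include <cassert>
#include <cstddef>

// Assumed stand-ins for the typedefs that platform/types.h provides.
typedef char           UTF8;
typedef unsigned short UTF16;

// Hypothetical stand-in for convertUTF8toUTF16(). It handles ASCII only,
// but follows the documented ownership rule: the result is allocated
// with new[] and handed to the caller.
UTF16* sketchConvertAsciiToUTF16(const UTF8* unistring)
{
    assert(unistring != NULL);

    std::size_t len = 0;
    while (unistring[len] != '\0')
        ++len;

    UTF16* out = new UTF16[len + 1];
    for (std::size_t i = 0; i < len; ++i)
        out[i] = static_cast<UTF16>(static_cast<unsigned char>(unistring[i]));
    out[len] = 0; // null terminator
    return out;
}

// Caller side: convert once, use the copy, then release it with delete[].
unsigned sketchFirstUnitOf(const UTF8* text)
{
    UTF16* wide = sketchConvertAsciiToUTF16(text);
    const unsigned first = wide[0];
    delete[] wide; // the caller owns the returned buffer
    return first;
}
```

In real code the converted buffer would typically be stored and deleted when its owner is destroyed, rather than converted and freed inside one call.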

UTF32* convertUTF8toUTF32 ( const UTF8 * unistring )

UTF8* convertUTF16toUTF8 ( const UTF16 * unistring )

UTF32* convertUTF16toUTF32 ( const UTF16 * unistring )

UTF8* convertUTF32toUTF8 ( const UTF32 * unistring )

UTF16* convertUTF32toUTF16 ( const UTF32 * unistring )

const U32 convertUTF8toUTF16 ( const UTF8 * unistring,
UTF16 * outbuffer,
U32  len 
)

Functions that convert buffers of unicode code points, into a provided buffer.

  • These functions are useful for working on existing buffers.
  • These cannot convert a buffer in place. If unistring is the same memory as outbuffer, the behavior is undefined.
  • The converter clamps output to the BMP (Basic Multilingual Plane).
  • Conversion to UTF-8 requires a buffer of 3 bytes (U8's) per character, + 1.
  • Conversion to UTF-16 requires a buffer of 1 U16 (2 bytes) per character, + 1.
  • Conversion to UTF-32 requires a buffer of 1 U32 (4 bytes) per character, + 1.
  • UTF-8 requires 3 bytes per character only in the worst case; because output is clamped to the BMP, no character needs more.
  • Output is null-terminated. Be sure to provide 1 extra byte, U16, or U32 for the null terminator, or you will see truncated output.
  • If the provided buffer is too small, the output will be truncated.
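The sizing rules above can be illustrated with a small sketch. `sketchUTF16toUTF8` is a hypothetical stand-in for convertUTF16toUTF8(), not the engine's code: it treats every UTF-16 unit as a BMP code point, and it assumes that len is the capacity of outbuffer in code units and that the return value is the number of units written before the terminator. Neither assumption is confirmed by the text above.

```cpp
#include <cassert>
#include <vector>

// Assumed stand-ins for the typedefs that platform/types.h provides.
typedef char           UTF8;
typedef unsigned short UTF16;
typedef unsigned int   U32;

// Hypothetical stand-in for convertUTF16toUTF8(). Truncates when the buffer
// is full and always null-terminates. No surrogate handling (BMP only).
U32 sketchUTF16toUTF8(const UTF16* unistring, UTF8* outbuffer, U32 len)
{
    assert(unistring != 0 && outbuffer != 0 && len > 0);

    U32 written = 0;
    for (; *unistring != 0; ++unistring)
    {
        const U32 c = *unistring;
        const U32 need = (c < 0x80) ? 1 : (c < 0x800) ? 2 : 3;
        if (written + need + 1 > len)
            break; // keep room for the null terminator

        if (need == 1)
        {
            outbuffer[written++] = static_cast<UTF8>(c);
        }
        else if (need == 2)
        {
            outbuffer[written++] = static_cast<UTF8>(0xC0 | (c >> 6));
            outbuffer[written++] = static_cast<UTF8>(0x80 | (c & 0x3F));
        }
        else
        {
            outbuffer[written++] = static_cast<UTF8>(0xE0 | (c >> 12));
            outbuffer[written++] = static_cast<UTF8>(0x80 | ((c >> 6) & 0x3F));
            outbuffer[written++] = static_cast<UTF8>(0x80 | (c & 0x3F));
        }
    }
    outbuffer[written] = '\0';
    return written;
}

// Worst-case UTF-8 buffer size from the list above: 3 bytes per character, + 1.
U32 sketchUTF8BufferSize(U32 characters)
{
    return characters * 3 + 1;
}
```

Sizing the output with `sketchUTF8BufferSize` guarantees that nothing is truncated. A smaller buffer still yields a valid, null-terminated string, but it stops at the last character that fits.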

const U32 convertUTF8toUTF32 ( const UTF8 * unistring,
UTF32 * outbuffer,
U32  len 
)

const U32 convertUTF16toUTF8 ( const UTF16 * unistring,
UTF8 * outbuffer,
U32  len 
)

const U32 convertUTF16toUTF32 ( const UTF16 * unistring,
UTF32 * outbuffer,
U32  len 
)

const U32 convertUTF32toUTF8 ( const UTF32 * unistring,
UTF8 * outbuffer,
U32  len 
)

const U32 convertUTF32toUTF16 ( const UTF32 * unistring,
UTF16 * outbuffer,
U32  len 
)

const UTF32 oneUTF8toUTF32 ( const UTF8 * codepoint,
U32 * unitsWalked = NULL 
)

Functions that convert one Unicode code point at a time.

  • Since these functions are designed to be used in tight loops, they do not allocate buffers.

  • oneUTF8toUTF32() and oneUTF16toUTF32() return the Unicode code point decoded from the code units starting at codepoint, and set *unitsWalked to the number of code units that code point occupied. The next Unicode code point starts at codepoint + *unitsWalked.
  • oneUTF32toUTF8() requires a 3-byte buffer, and returns the number of bytes used.
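The walking pattern described above (decode one code point, then advance by unitsWalked) might look like the sketch below. `sketchOneUTF8toUTF32` is a hypothetical stand-in for oneUTF8toUTF32(): it decodes 1- to 3-byte sequences only and assumes well-formed input, which may differ from how the engine treats malformed data.

```cpp
#include <cassert>
#include <cstddef>

// Assumed stand-ins for the typedefs that platform/types.h provides.
typedef char         UTF8;
typedef unsigned int UTF32;
typedef unsigned int U32;

// Hypothetical stand-in for oneUTF8toUTF32(). Decodes one BMP code point
// from well-formed UTF-8 and reports how many code units it consumed.
UTF32 sketchOneUTF8toUTF32(const UTF8* codepoint, U32* unitsWalked = NULL)
{
    assert(codepoint != NULL);
    const unsigned char* p = reinterpret_cast<const unsigned char*>(codepoint);

    UTF32 c;
    U32 n;
    if (p[0] < 0x80)
    {
        c = p[0];                                     // 1-byte (ASCII)
        n = 1;
    }
    else if ((p[0] & 0xE0) == 0xC0)
    {
        c = ((p[0] & 0x1F) << 6) | (p[1] & 0x3F);     // 2-byte sequence
        n = 2;
    }
    else
    {
        c = ((p[0] & 0x0F) << 12) | ((p[1] & 0x3F) << 6) | (p[2] & 0x3F); // 3-byte
        n = 3;
    }

    if (unitsWalked != NULL)
        *unitsWalked = n;
    return c;
}

// Tight loop: no allocation, just decode and advance by unitsWalked.
U32 sketchCountCodepoints(const UTF8* str)
{
    U32 count = 0;
    while (*str != '\0')
    {
        U32 walked = 0;
        sketchOneUTF8toUTF32(str, &walked);
        str += walked; // the next code point starts here
        ++count;
    }
    return count;
}
```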

const UTF32 oneUTF16toUTF32 ( const UTF16 * codepoint,
U32 * unitsWalked = NULL 
)

const UTF16 oneUTF32toUTF16 ( const UTF32  codepoint  ) 

const U32 oneUTF32toUTF8 ( const UTF32  codepoint,
UTF8 * threeByteCodeunitBuf 
)

const U32 dStrlen ( const UTF16 * unistring )

Functions that calculate the length of unicode strings.

  • Since calculating the length of a UTF-8 string is nearly as expensive as converting it to another format, a dStrlen for UTF-8 is not provided here.
  • If unistring does not point to a null-terminated string of the correct type, the behavior is undefined.
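As a rough illustration of what a length function over UTF-16 involves, here is a minimal sketch. `sketchStrlenUTF16` is a hypothetical stand-in for the UTF16 overload of dStrlen(), and it assumes the length is counted in code units up to, but not including, the null terminator.

```cpp
#include <cassert>

// Assumed stand-ins for the typedefs that platform/types.h provides.
typedef unsigned short UTF16;
typedef unsigned int   U32;

// Hypothetical stand-in for dStrlen(const UTF16*). Counts code units until
// the null terminator; the input must be null-terminated, as noted above.
U32 sketchStrlenUTF16(const UTF16* unistring)
{
    assert(unistring != 0);

    U32 n = 0;
    while (unistring[n] != 0)
        ++n;
    return n;
}
```

A length obtained this way can feed the buffer-size rules for the provided-buffer converters, for example 3 bytes per character, + 1, when converting to UTF-8.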

const U32 dStrlen ( const UTF32 * unistring )

const UTF8* getNthCodepoint ( const UTF8 * unistring,
const U32  n 
)

Functions that scan for characters in a utf8 string.

  • This is useful for getting a character-wise offset into a UTF-8 string, as opposed to the byte-wise offset that foo[i] gives.
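The difference between a character-wise and a byte-wise offset can be sketched as follows. `sketchNthCodepoint` is a hypothetical stand-in for getNthCodepoint(), not the engine's implementation. It assumes well-formed UTF-8, and its behavior when n runs past the end of the string (returning a pointer to the terminator) is an assumption.

```cpp
#include <cassert>

// Assumed stand-ins for the typedefs that platform/types.h provides.
typedef char         UTF8;
typedef unsigned int U32;

// Hypothetical stand-in for getNthCodepoint(). Skips n code points by
// stepping over each lead byte and any continuation bytes (10xxxxxx).
const UTF8* sketchNthCodepoint(const UTF8* unistring, U32 n)
{
    assert(unistring != 0);

    while (n > 0 && *unistring != '\0')
    {
        ++unistring; // step past the lead byte
        while ((static_cast<unsigned char>(*unistring) & 0xC0) == 0x80)
            ++unistring; // skip continuation bytes
        --n;
    }
    return unistring;
}
```

For a string that starts with a 2-byte character followed by ASCII letters, byte index 1 lands in the middle of the first character, while `sketchNthCodepoint(str, 1)` points at the first ASCII letter.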




All Rights Reserved GarageGames.com, Inc. 1999-2005
Auto-magically Generated with Doxygen