The Battle for Wesnoth  1.13.4+dev
utf8 Namespace Reference

Functions for converting between Unicode wide-char strings and UTF-8 encoded strings.

Classes

class  invalid_utf8_exception
 Thrown by operations encountering invalid UTF-8 data.
 

Typedefs

typedef ucs4::iterator_base< std::string, ucs4_convert_impl::convert_impl< char_t >::type > iterator
 
typedef char char_t
 
typedef std::string string
 

Functions

static int byte_size_from_utf8_first (const unsigned char ch)
 
utf8::string lowercase (const utf8::string &s)
 Returns a lowercased version of the string.

size_t index (const utf8::string &str, const size_t index)
 Codepoint index corresponding to the nth character in a UTF-8 string.

size_t size (const utf8::string &str)
 Length in characters of a UTF-8 string.

utf8::string & insert (utf8::string &str, const size_t pos, const utf8::string &insert)
 Insert a UTF-8 string at the specified position.

utf8::string & erase (utf8::string &str, const size_t start, const size_t len=std::string::npos)
 Erases a portion of a UTF-8 string.

utf8::string & truncate (utf8::string &str, const size_t size)
 Truncates a UTF-8 string to the specified number of characters.

void truncate_as_ucs4 (utf8::string &str, const size_t size)
 Truncates a UTF-8 string to the specified number of characters.
 

Detailed Description

Functions for converting between Unicode wide-char strings and UTF-8 encoded strings.

Typedef Documentation

typedef char utf8::char_t

Definition at line 29 of file unicode_types.hpp.

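typedef ucs4::iterator_base< std::string, ucs4_convert_impl::convert_impl< char_t >::type > utf8::iterator

Definition at line 42 of file unicode.hpp.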

typedef std::string utf8::string

Definition at line 30 of file unicode_types.hpp.

Function Documentation

static int utf8::byte_size_from_utf8_first (const unsigned char ch)

Definition at line 38 of file unicode.cpp.

References count_leading_ones().

Referenced by index(), and size().
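
The page gives no description for this helper, but its name and its reference to count_leading_ones() suggest that it derives the length of a UTF-8 sequence from the run of 1 bits at the top of the first byte. A minimal sketch of that idea follows. The `sketch` namespace, the `invalid_utf8` type, and the 4-byte limit (taken from RFC 3629) are assumptions of this example, not the Wesnoth implementation.

```cpp
#include <cassert>
#include <stdexcept>

namespace sketch {

// Hypothetical stand-in for utf8::invalid_utf8_exception.
struct invalid_utf8 : std::runtime_error {
    invalid_utf8() : std::runtime_error("invalid UTF-8 lead byte") {}
};

// Number of consecutive 1 bits, starting at the most significant bit.
inline int count_leading_ones(unsigned char ch) {
    int ones = 0;
    for (unsigned mask = 0x80; mask != 0 && (ch & mask) != 0; mask >>= 1) {
        ++ones;
    }
    return ones;
}

// Length in bytes of the sequence introduced by lead byte `ch`.
inline int byte_size_from_first(unsigned char ch) {
    const int ones = count_leading_ones(ch);
    if (ones == 0) {
        return 1;              // 0xxxxxxx: single-byte (ASCII) character
    }
    if (ones == 1 || ones > 4) {
        // 10xxxxxx is a continuation byte, and this sketch follows
        // RFC 3629 in rejecting sequences longer than 4 bytes.
        throw invalid_utf8();
    }
    return ones;               // 110xxxxx -> 2, 1110xxxx -> 3, 11110xxx -> 4
}

// Usage: each lead byte maps to the length of its sequence.
inline void demo() {
    assert(byte_size_from_first(0x41) == 1);  // 'A'
    assert(byte_size_from_first(0xC3) == 2);  // lead byte of U+00E9
    assert(byte_size_from_first(0xE2) == 3);  // lead byte of U+20AC
    assert(byte_size_from_first(0xF0) == 4);  // lead byte of U+1F600
}

} // namespace sketch
```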

utf8::string & utf8::erase (utf8::string & str, const size_t start, const size_t len = std::string::npos)

Erases a portion of a UTF-8 string.

Parameters
    str    UTF-8 encoded string.
    start  Start position.
    len    Number of characters to erase.
Note
    This implementation does not check for valid UTF-8. Don't use it for user input.

Definition at line 106 of file unicode.cpp.

References index(), pos, and size().

Referenced by gui2::ttext_box::delete_selection(), main(), and truncate().
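
Erasing by character position means translating character counts into byte offsets before touching the underlying std::string. The sketch below shows one way to do that by skipping continuation bytes. The names `skip_chars` and `erase_chars` are hypothetical, and, like the documented function, the sketch does not validate its input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {

// Advance `count` characters from byte offset `from` and return the new
// byte offset. Stops at the end of the string if it is reached first.
inline std::size_t skip_chars(const std::string& s, std::size_t from, std::size_t count) {
    std::size_t i = from;
    while (i < s.size() && count > 0) {
        ++i;  // step over the lead byte
        while (i < s.size() && (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80) {
            ++i;  // step over continuation bytes (10xxxxxx)
        }
        --count;
    }
    return i;
}

// Erase `len` characters starting at character position `start`.
// With the default `len`, everything from `start` onward is removed.
inline std::string& erase_chars(std::string& s, std::size_t start,
                                std::size_t len = std::string::npos) {
    const std::size_t first = skip_chars(s, 0, start);
    const std::size_t last = skip_chars(s, first, len);
    return s.erase(first, last - first);
}

// Usage: positions and lengths count characters, not bytes.
inline void demo() {
    std::string word = "h\xC3\xA9llo";            // "h", U+00E9, "llo"
    assert(sketch::erase_chars(word, 1, 1) == "hllo");

    std::string price = "a\xE2\x82\xACz";         // "a", U+20AC, "z"
    assert(sketch::erase_chars(price, 1) == "a");
}

} // namespace sketch
```

A start position past the end leaves the string unchanged in this sketch, because both offsets clamp to the string length.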

size_t utf8::index (const utf8::string & str, const size_t index)

Codepoint index corresponding to the nth character in a UTF-8 string.

Returns
str.length() if there are fewer than index characters.

Definition at line 73 of file unicode.cpp.

References byte_size_from_utf8_first(), ERR_GENERAL, and i.

Referenced by BOOST_AUTO_TEST_CASE(), gui2::ttext_::copy_selection(), erase(), insert(), and font::ttext::insert_text().
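
The str.length() fallback suggests that the returned value is a byte offset into the underlying std::string. Under that assumption, the lookup can be sketched as a walk over lead bytes, adding the length of each sequence. `sequence_length` and `byte_index` are hypothetical names, and the handling of invalid lead bytes here is a simplification, not the behaviour of the real function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {

// Bytes in the sequence introduced by `lead`. This sketch trusts its input:
// any byte that is not a 1-, 2- or 3-byte lead is treated as a 4-byte lead.
inline std::size_t sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    return 4;
}

// Byte offset of the n-th character (0-based), or str.length() if the
// string holds fewer than n characters.
inline std::size_t byte_index(const std::string& str, std::size_t n) {
    std::size_t i = 0;
    for (std::size_t chr = 0; chr < n && i < str.size(); ++chr) {
        i += sequence_length(static_cast<unsigned char>(str[i]));
    }
    // A truncated final sequence could overshoot; clamp to the length.
    return i < str.size() ? i : str.size();
}

// Usage: U+00E9 occupies bytes 1 and 2, so character 2 starts at byte 3.
inline void demo() {
    const std::string word = "h\xC3\xA9llo";
    assert(byte_index(word, 0) == 0);
    assert(byte_index(word, 1) == 1);
    assert(byte_index(word, 2) == 3);
    assert(byte_index(word, 9) == word.length());
}

} // namespace sketch
```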

utf8::string & utf8::insert (utf8::string & str, const size_t pos, const utf8::string & insert)

Insert a UTF-8 string at the specified position.

utf8::string utf8::lowercase (const utf8::string & s)

Returns a lowercased version of the string.

size_t utf8::size (const utf8::string & str)

Length in characters of a UTF-8 string.

Definition at line 88 of file unicode.cpp.

References byte_size_from_utf8_first(), ERR_GENERAL, and i.

Referenced by BOOST_AUTO_TEST_CASE(), create(), destroy(), erase(), execute_command(), execute_window(), font::ttext::insert_text(), main(), modify(), and truncate_as_ucs4().
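
The references above indicate that the real function steps through the string by lead byte. For well-formed input, the same count can be obtained with a different technique: counting every byte that is not a continuation byte. The sketch below uses that alternative, with the hypothetical name `count_chars`, and performs no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {

// Number of characters in a well-formed UTF-8 string: every character
// contributes exactly one byte that is not of the form 10xxxxxx.
inline std::size_t count_chars(const std::string& str) {
    std::size_t n = 0;
    for (unsigned char byte : str) {
        if ((byte & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// Usage: the byte length and the character length differ as soon as a
// multi-byte sequence appears.
inline void demo() {
    const std::string word = "h\xC3\xA9llo";   // 6 bytes, 5 characters
    assert(word.length() == 6);
    assert(count_chars(word) == 5);
}

} // namespace sketch
```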

utf8::string & utf8::truncate (utf8::string & str, const size_t size)

Truncates a UTF-8 string to the specified number of characters.

Parameters
    str   UTF-8 encoded string.
    size  Size to truncate to.
Note
    This implementation does not check for valid UTF-8. Don't use it for user input.

Definition at line 119 of file unicode.cpp.

References erase().

Referenced by BOOST_AUTO_TEST_CASE(), utils::ellipsis_truncate(), and font::ttext::set_maximum_length().

void utf8::truncate_as_ucs4 (utf8::string & str, const size_t size)

Truncates a UTF-8 string to the specified number of characters.

If the string has more than size UTF-8 characters it will be truncated to this size.

The output is guaranteed to be valid UTF-8.

Parameters
    [in]   str   String encoded in UTF-8.
    [out]  str   String encoded in UTF-8 that contains at most size codepoints.
           size  The size to truncate to.

Definition at line 124 of file unicode.cpp.

References size(), and unicode_cast().

Referenced by wesnothd::chat_message::truncate_message().
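
The reference to unicode_cast() suggests a round trip through code points: decode, cut the code point sequence, and encode again. The sketch below illustrates that approach with hand-written `decode` and `append_utf8` helpers, which are hypothetical stand-ins for unicode_cast(). Replacing malformed input with U+FFFD is an assumption of this example, not documented behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {

// Decode UTF-8 into code points. Malformed bytes, surrogates and values
// above U+10FFFF become U+FFFD (a choice made for this sketch).
inline std::u32string decode(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        std::size_t len = 0;
        char32_t cp = 0;
        if (b < 0x80)                { len = 1; cp = b; }
        else if ((b & 0xE0) == 0xC0) { len = 2; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; }
        bool ok = len != 0 && i + len <= in.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            ok = (c & 0xC0) == 0x80;          // must be a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (ok && (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))) {
            ok = false;
        }
        if (ok) { out.push_back(cp); i += len; }
        else    { out.push_back(0xFFFD); ++i; }
    }
    return out;
}

// Append the UTF-8 encoding of one code point to `out`.
inline void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Truncate `str` to at most `size` code points. The string is re-encoded
// only when it is actually shortened; otherwise it is left untouched.
inline void truncate_as_ucs4(std::string& str, std::size_t size) {
    std::u32string cps = decode(str);
    if (cps.size() <= size) {
        return;
    }
    cps.resize(size);
    std::string out;
    for (char32_t cp : cps) {
        append_utf8(out, cp);
    }
    str = out;
}

// Usage: the cut never lands inside a multi-byte sequence.
inline void demo() {
    std::string word = "h\xC3\xA9llo";
    truncate_as_ucs4(word, 2);
    assert(word == "h\xC3\xA9");
}

} // namespace sketch
```

Because the cut happens on the decoded code point sequence, the result cannot end in a partial multi-byte sequence, which a byte-level truncation could produce.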