Functions for converting between Unicode wide-char strings and UTF-8 encoded strings.
Classes
class | invalid_utf8_exception
Thrown by operations encountering invalid UTF-8 data.
Typedefs
typedef ucs4::iterator_base< std::string, ucs4_convert_impl::convert_impl< char_t >::type > | iterator
typedef char | char_t
typedef std::string | string
Functions
static int | byte_size_from_utf8_first (const unsigned char ch)
utf8::string | lowercase (const utf8::string &s)
Returns a lowercased version of the string.
size_t | index (const utf8::string &str, const size_t index)
Codepoint index corresponding to the nth character in a UTF-8 string.
size_t | size (const utf8::string &str)
Length in characters of a UTF-8 string.
utf8::string & | insert (utf8::string &str, const size_t pos, const utf8::string &insert)
Insert a UTF-8 string at the specified position.
utf8::string & | erase (utf8::string &str, const size_t start, const size_t len=std::string::npos)
Erases a portion of a UTF-8 string.
utf8::string & | truncate (utf8::string &str, const size_t size)
Truncates a UTF-8 string to the specified number of characters.
void | truncate_as_ucs4 (utf8::string &str, const size_t size)
Truncates a UTF-8 string to the specified number of characters.
typedef char utf8::char_t
Definition at line 29 of file unicode_types.hpp.
typedef ucs4::iterator_base<std::string, ucs4_convert_impl::convert_impl<char_t>::type> utf8::iterator
Definition at line 42 of file unicode.hpp.
typedef std::string utf8::string
Definition at line 30 of file unicode_types.hpp.
static int utf8::byte_size_from_utf8_first (const unsigned char ch)
Definition at line 38 of file unicode.cpp.
References count_leading_ones().
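The reference to count_leading_ones() suggests that the sequence length is read off the high bits of the first byte. The following is a minimal sketch of that idea, not the code in unicode.cpp: the namespace `sketch`, both function bodies, the choice of exception type, and the four-byte upper limit (taken from RFC 3629) are assumptions made for illustration.

```cpp
#include <stdexcept>

namespace sketch {

// Number of consecutive 1 bits at the top of the byte. Hypothetical stand-in
// for the count_leading_ones() helper referenced above.
inline int count_leading_ones(unsigned char ch) {
    int n = 0;
    for (unsigned char mask = 0x80; mask != 0 && (ch & mask); mask >>= 1) {
        ++n;
    }
    return n;
}

// Length in bytes of the UTF-8 sequence that starts with `ch`.
inline int byte_size_from_first(unsigned char ch) {
    const int ones = count_leading_ones(ch);
    if (ones == 0) {
        return 1;  // 0xxxxxxx: single-byte (ASCII) character
    }
    if (ones == 1 || ones > 4) {
        // 10xxxxxx is a continuation byte and cannot start a character;
        // five or more leading ones are not valid under RFC 3629.
        throw std::invalid_argument("invalid UTF-8 lead byte");
    }
    return ones;  // 110xxxxx -> 2, 1110xxxx -> 3, 11110xxx -> 4
}

}  // namespace sketch
```

Under these assumptions, `sketch::byte_size_from_first(0xC3)` yields 2, since 0xC3 is the lead byte of a two-byte sequence.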
utf8::string & utf8::erase (utf8::string &str, const size_t start, const size_t len=std::string::npos)
Erases a portion of a UTF-8 string.
str | UTF-8 encoded string.
start | Start position, in characters.
len | Number of characters to erase.
Definition at line 106 of file unicode.cpp.
References index(), pos, and size().
Referenced by gui2::ttext_box::delete_selection(), main(), and truncate().
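Because start and len count characters rather than bytes, an erase has to translate both into byte offsets before touching the underlying std::string. A minimal sketch of that translation follows. The names `skip_chars` and `erase_chars` are hypothetical, the input is assumed to be well-formed UTF-8, and the continuation-byte test used here is a simplification rather than the approach taken in unicode.cpp.

```cpp
#include <cstddef>
#include <string>

namespace sketch {

// Byte offset reached after skipping `n` characters, starting at byte `from`.
// Stops at s.size() if the string runs out first. Assumes well-formed UTF-8,
// where every 10xxxxxx byte continues the preceding character.
inline std::size_t skip_chars(const std::string& s, std::size_t from, std::size_t n) {
    std::size_t i = from;
    while (i < s.size() && n > 0) {
        ++i;  // step over the lead byte
        while (i < s.size() && (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80) {
            ++i;  // step over continuation bytes
        }
        --n;
    }
    return i;
}

// Erase `len` characters starting at character position `start`.
inline std::string& erase_chars(std::string& s, std::size_t start,
                                std::size_t len = std::string::npos) {
    const std::size_t first = skip_chars(s, 0, start);
    // Measuring from `first` avoids computing start + len, which would
    // overflow when len is npos.
    const std::size_t last = skip_chars(s, first, len);
    s.erase(first, last - first);
    return s;
}

}  // namespace sketch
```

With this shape, truncating to n characters reduces to `sketch::erase_chars(s, n)`, which is consistent with truncate() listing erase() among its references.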
size_t utf8::index (const utf8::string &str, const size_t index)
Codepoint index corresponding to the nth character in a UTF-8 string.
Returns str.length() if there are fewer than index characters.
Definition at line 73 of file unicode.cpp.
References byte_size_from_utf8_first(), ERR_GENERAL, and i.
Referenced by BOOST_AUTO_TEST_CASE(), gui2::ttext_::copy_selection(), erase(), insert(), and font::ttext::insert_text().
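Since index() references byte_size_from_utf8_first(), it presumably walks the string one lead byte at a time and returns an offset into the underlying std::string. The sketch below illustrates that walk. The names `lead_length` and `index_of` are hypothetical, and two behaviours are assumptions: an out-of-range character number yields the string's byte length, and a malformed byte stops the walk instead of being logged.

```cpp
#include <cstddef>
#include <string>

namespace sketch {

// Sequence length implied by a lead byte; 0 for bytes that cannot start a character.
inline std::size_t lead_length(unsigned char b) {
    if (b < 0x80) return 1;            // 0xxxxxxx
    if ((b & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((b & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((b & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;                          // continuation byte or invalid lead
}

// Byte offset at which character number `n` (zero-based) begins.
// Returns s.size() when `s` holds fewer than `n` characters.
inline std::size_t index_of(const std::string& s, std::size_t n) {
    std::size_t i = 0;
    for (std::size_t chr = 0; chr < n && i < s.size(); ++chr) {
        const std::size_t step = lead_length(static_cast<unsigned char>(s[i]));
        if (step == 0) {
            break;  // malformed input: stop at the offending byte
        }
        i += step;
    }
    // Clamp in case the last sequence was cut short by the end of the string.
    return i < s.size() ? i : s.size();
}

}  // namespace sketch
```

For the six-byte string "h\xC3\xA9llo" (five characters), character 2 starts at byte offset 3 because the second character occupies two bytes.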
utf8::string & utf8::insert (utf8::string &str, const size_t pos, const utf8::string &insert)
Insert a UTF-8 string at the specified position.
Definition at line 101 of file unicode.cpp.
References index().
Referenced by create(), destroy(), execute_command(), execute_window(), pathfind::paths::dest_vect::insert(), utils::smart_list< Data >::insert(), font::ttext::insert_text(), font::ttext::insert_unicode(), main(), wesnothd::make_add_diff(), wesnothd::make_change_diff(), modify(), and utils::smart_list< Data >::resize().
utf8::string utf8::lowercase (const utf8::string &s)
Returns a lowercased version of the string.
Definition at line 53 of file unicode.cpp.
References ucs4::iterator_base< string_type, update_implementation >::end(), itor, ucs4::iterator_base< string_type, update_implementation >::substr(), uchar, and unicode_cast().
Referenced by wesnothd::server::bans_handler(), BOOST_AUTO_TEST_CASE(), gui2::contains(), wesnothd::server::dul_handler(), wesnothd::server::handle_login(), campaignd::server::handle_upload(), campaignd::blacklist::is_in_globlist(), filesystem::looks_like_pbl(), wesnothd::ban_manager::parse_time(), wesnothd::server::process_command(), and gui2::tchat_log::model::stream_log().
size_t utf8::size (const utf8::string &str)
Length in characters of a UTF-8 string.
Definition at line 88 of file unicode.cpp.
References byte_size_from_utf8_first(), ERR_GENERAL, and i.
Referenced by BOOST_AUTO_TEST_CASE(), create(), destroy(), erase(), execute_command(), execute_window(), font::ttext::insert_text(), main(), modify(), and truncate_as_ucs4().
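The distinction that matters here is between the byte length reported by std::string::size() and the character count reported by utf8::size(). The sketch below shows one way to obtain the latter for well-formed input: counting every byte that is not a continuation byte. This is a swapped-in simplification; the references above suggest that utf8::size() itself steps by lead byte. The name `char_count` is hypothetical.

```cpp
#include <cstddef>
#include <string>

namespace sketch {

// Number of characters in a well-formed UTF-8 string. Each character has
// exactly one byte that is not of the form 10xxxxxx, so counting those
// bytes counts the characters.
inline std::size_t char_count(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char b : s) {
        if ((b & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

}  // namespace sketch
```

For "h\xC3\xA9llo", std::string::size() reports 6 bytes while `sketch::char_count` reports 5 characters.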
utf8::string & utf8::truncate (utf8::string &str, const size_t size)
Truncates a UTF-8 string to the specified number of characters.
str | UTF-8 encoded string.
size | Size to truncate to.
Definition at line 119 of file unicode.cpp.
References erase().
Referenced by BOOST_AUTO_TEST_CASE(), utils::ellipsis_truncate(), and font::ttext::set_maximum_length().
void utf8::truncate_as_ucs4 (utf8::string &str, const size_t size)
Truncates a UTF-8 string to the specified number of characters.
If the string has more than size UTF-8 characters, it is truncated to this size.
The output is guaranteed to be valid UTF-8.
[in] | str | String encoded in UTF-8.
[out] | str | String encoded in UTF-8 that contains at most size codepoints.
size | The size to truncate to.
Definition at line 124 of file unicode.cpp.
References size(), and unicode_cast().
Referenced by wesnothd::chat_message::truncate_message().
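The reference to unicode_cast() suggests a round trip through code points: decode, cut the code point sequence, and encode again, so that the result can never end in the middle of a multi-byte sequence. The sketch below illustrates that round trip under stated assumptions. `decode`, `append_utf8`, and `truncate_code_points` are hypothetical names rather than the library's API, malformed input is reported with std::invalid_argument, and overlong forms and surrogates are not checked.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

namespace sketch {

// Decode UTF-8 into code points; throws on structurally malformed input.
inline std::u32string decode(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        std::size_t len;
        char32_t cp;
        if (b < 0x80)                { len = 1; cp = b; }
        else if ((b & 0xE0) == 0xC0) { len = 2; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > in.size()) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid continuation byte");
            }
            cp = (cp << 6) | (c & 0x3F);  // append six payload bits
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Append the UTF-8 encoding of one code point to `out`.
inline void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Keep at most `max` code points of `s`; shorter strings are left untouched.
inline void truncate_code_points(std::string& s, std::size_t max) {
    std::u32string cps = decode(s);
    if (cps.size() <= max) {
        return;
    }
    cps.resize(max);
    std::string out;
    for (char32_t cp : cps) {
        append_utf8(out, cp);
    }
    s = out;
}

}  // namespace sketch
```

Truncating "h\xC3\xA9llo" to two code points this way leaves the three bytes "h\xC3\xA9", whereas cutting the std::string at two bytes would split the second character.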