General
Unicode support in the Solaris Operating Environment shows what software
developers need to do to support UTF-8.
The Open Group's summary of ISO C Amendment 1 is a detailed explanation of
locale and wide-character technologies.
Markus Kuhn's UTF-8 and Unicode FAQ for Unix/Linux is a detailed explanation
of UTF-8 and Unicode.
What is MOJIBAKE shows what happens when characters are handled improperly.
Mojibake is a Japanese word that almost all computer users (not only on
Linux/BSD/Unix but also on Windows/Macintosh) know.
"MARUCHIRINGARU KANKYOU NO JITSUGEN - X Window/Wnn/Mule/WWW BURAUZA DENO
TAKOKUGO KANKYOU" or "Realization of Multilingual Environment -
Multilingual Environment in X Window/Wnn/Mule/WWW Browser" (in Japanese),
ISBN 4-88735-020-1, TOPPAN, 1996
"KOKUSAIKA PUROGURAMINGU - I18N HANDOBUKKU" or "Internationalization
Programming - I18N Handbook" (in Japanese), ISBN 4-320-02904-6, KYORITSU,
1998
"Linux/FreeBSD NIHONGO KANKYOU NO KOUCHIKU TO KATSUYOU" or "Construction
and Utilization of Linux/FreeBSD Japanese Environment" (in Japanese),
ISBN 4-7973-0480-4, SOFTBANK, 1997
"MOJI KOODO NO SEKAI" or "The World of Character Codes" (in Japanese),
ISBN 4-501-53060-X, Tokyo Denki University Press Center, 1999
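As a rough illustration of the mojibake described in the "What is MOJIBAKE"
entry above, the following Python sketch encodes a Japanese string as EUC-JP
and then decodes the same bytes as if they were in another encoding. The
helper name misread and the sample string are assumptions made for this
example; they are not taken from any of the references listed here.

```python
# Hypothetical sample text: the three kanji for "Japanese language",
# written as escapes so the source file itself stays pure ASCII.
SAMPLE = "\u65e5\u672c\u8a9e"


def misread(text, actual, assumed):
    """Encode text with `actual`, then decode the bytes as `assumed`.

    This imitates a program that receives bytes in one encoding but
    interprets them as another, which is how mojibake arises.
    """
    raw = text.encode(actual)
    # errors="replace" substitutes U+FFFD for byte sequences that are
    # invalid in the assumed encoding instead of raising an exception.
    return raw.decode(assumed, errors="replace")


if __name__ == "__main__":
    # Correct assumption: the text survives the round trip.
    print(misread(SAMPLE, "euc_jp", "euc_jp"))
    # Wrong assumptions: the same bytes come out as unrelated characters.
    print(misread(SAMPLE, "euc_jp", "latin_1"))
    print(misread(SAMPLE, "euc_jp", "shift_jis"))
```

The bytes are never altered in this sketch; only the assumption about their
encoding changes, which is why mojibake can usually be undone once the real
encoding is known.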
Characters (general)
Character Tables provides graphic images of various character sets from
around the world.
Ken Lunde's CJK info provides information on CJK (Chinese, Japanese, and
Korean) character set standards, written by the author of "CJKV Information
Processing", published by O'Reilly.
IANA character set registry. Note that both coded character sets (for
example, KS_C_5601-1987, MIBenum: 36) and encodings (for example,
ISO-2022-KR, MIBenum: 37) are registered. How confusing!
International Register of Coded Character Sets is a complete list of
registered CCS, with their ISO 2022 escape sequences. PDF files for these
CCS are also available.
Characters (ISO 8859)
Characters (ISO 2022)
Characters (ISO 10646 and Unicode)
Software
Arena-i18n is a multilingual web browser.
Mozilla is also a multilingual web browser.
Mule is a multilingual editor whose functionality is included in GNU Emacs
20 and XEmacs 20. To my knowledge, Mule is the most advanced m17n software.
JFBTERM (in Japanese) is a multilingual terminal for the Linux framebuffer
console. Supported encodings are ISO 2022, EUC-JP, CN-GB, and EUC-KR.
Supported CCS are ISO 8859-{1,2,3,4,5,6,7,8,9,10}, JISX 0201, JISX 0208,
GB 2312, and KSX 1001.
The UNICON Project aims to implement display and input of CJK
(Chinese/Japanese/Korean) characters on the Linux framebuffer.
CCE - Chinese Console Environment enables CN-GB Chinese to be displayed on
the Linux and FreeBSD consoles. It also provides input methods for Chinese.
Xterm is part of the XFree86 distribution. It can display UTF-8 encoded
text, including double-width characters and combining characters.
Rxvt can display multibyte
encodings such as EUC-JP, Shift-JIS, CN-GB, and Big-5.
libiconv provides an iconv() implementation for systems that do not have
one. It supports various encodings such as ASCII, ISO 8859-*, KOI8-*, EUC-*,
ISO 2022-*, Big5, Shift-JIS, TIS 620, UTF-*, UCS-*, CP*, Mac*, and so on.
The library also has locale_charset(), a replacement for
nl_langinfo(CODESET).
libutf8 - a Unicode/UTF-8 locale plugin provides UTF-8 locale support for
systems that do not have UTF-8 locales.
Pango is a project to develop
a portable high-quality text rendering engine.
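Several entries above mention double-width and combining characters. As a
rough sketch of what a terminal has to work out for such text, the following
Python code estimates how many columns a string occupies, using the standard
unicodedata module. The function name display_width is an assumption made
for this example, and the rule it applies (two columns for East Asian Wide
and Fullwidth characters, none for combining marks, one for everything else)
is a simplification, not the algorithm used by Xterm or by any other program
listed here.

```python
import unicodedata


def display_width(text):
    """Estimate the number of terminal columns needed to show text.

    Simplified rule (an assumption of this sketch):
      - combining marks take 0 columns, since they overlay the base character
      - East Asian Wide ("W") and Fullwidth ("F") characters take 2 columns
      - every other character takes 1 column
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Non-zero combining class: drawn on top of the previous character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


if __name__ == "__main__":
    print(display_width("abc"))                 # plain ASCII
    print(display_width("\u65e5\u672c\u8a9e"))  # three kanji
    print(display_width("e\u0301"))             # "e" plus combining acute accent
```

Control characters and characters of ambiguous width are simply counted as
one column here; a real terminal treats them with more care.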
Projects and Organizations
Linux Internationalization
Initiative, or Li18nux, focuses on the i18n of a core set of APIs
and components of Linux distributions. The results will be proposed to LSB.
LI18NUX 2000 Globalization Specification is the first fruit of Li18nux.
The Citrus Project aims to implement locale/iconv for the BSD family of
operating systems so that they conform to ISO C and SUSv2.
Introduction to i18n