General
Unicode support in the Solaris Operating Environment shows what software
developers need to do to support UTF-8.
The Open Group's summary of ISO C Amendment 1 is a detailed explanation of
locale and wide character technologies.
Markus Kuhn's UTF-8 and Unicode FAQ for Unix/Linux is a detailed explanation
of UTF-8 and Unicode.
What is MOJIBAKE shows what happens when characters are handled improperly.
Mojibake is the Japanese word for garbled text, and almost all computer users
(not only Linux/BSD/Unix users but also Windows/Macintosh users) know it.
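As a minimal illustration of how mojibake arises (a sketch written for this
list, not taken from the linked page), the Python snippet below encodes a
Japanese string as UTF-8 and then decodes the bytes as if they were ISO 8859-1.
The function name garble and the sample string are assumptions chosen only
for the example.

```python
def garble(text: str, actual: str = "utf-8", assumed: str = "latin-1") -> str:
    """Encode text with one encoding, then decode it with a different one."""
    # errors="replace" keeps the sketch from raising if `assumed` cannot
    # represent some byte sequence; latin-1 itself accepts every byte.
    return text.encode(actual).decode(assumed, errors="replace")

original = "日本語"
broken = garble(original)

# ascii() is used so the garbled result prints safely on any terminal.
print(ascii(broken))

# With latin-1 the damage is reversible, because every byte maps to exactly
# one character. With many other wrong guesses the original text is lost.
repaired = broken.encode("latin-1").decode("utf-8")
print(repaired == original)
```

The garbled string has one character per UTF-8 byte, which is why mojibake
usually looks longer than the text it replaces.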
"MARUCHIRINGARU KANKYOU NO JITSUGEN - X Window/Wnn/Mule/WWW BURAUZA DENO
TAKOKUGO KANKYOU" or "Realization of Multilingual Environment - Multilingual
Environment in X Window/Wnn/Mule/WWW Browser" (in Japanese),
ISBN 4-88735-020-1, TOPPAN, 1996
"KOKUSAIKA PUROGURAMINGU - I18N HANDOBUKKU" or "Internationalization
Programming - I18N Handbook" (in Japanese), ISBN 4-320-02904-6, KYORITSU,
1998
"Linux/FreeBSD NIHONGO KANKYOU NO KOUCHIKU TO KATSUYOU" or "Construction
and Utilization of Linux/FreeBSD Japanese Environment" (in Japanese),
ISBN 4-7973-0480-4, SOFTBANK, 1997
"MOJI KOODO NO SEKAI" or "The World of Character Codes" (in Japanese),
ISBN 4-501-53060-X, Tokyo Denki University Press Center, 1999
Characters (general)
Character Tables
Graphic images of various character sets from around the world.
Ken Lunde's CJK info
Information on CJK (Chinese, Japanese, and Korean) character set standards,
written by the author of "CJKV Information Processing", published by O'Reilly.
IANA character set registry
Note that both coded character sets (for example, KS_C_5601-1987, MIBenum 36)
and encodings (for example, ISO-2022-KR, MIBenum 37) are registered. How
confusing!
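To make the difference between a coded character set and an encoding concrete,
here is a rough sketch using Python's built-in codecs (an assumption of this
example; the IANA registry itself provides no code). It takes one Hangul
syllable from the KS C 5601 character set and serializes it with two different
encodings, EUC-KR and ISO-2022-KR.

```python
syllable = "한"  # U+D55C, a character found in KS C 5601 (KS X 1001)

euc = syllable.encode("euc_kr")      # 8-bit encoding of the character set
iso = syllable.encode("iso2022_kr")  # 7-bit encoding with escape/shift codes

# Clearing the high bit of each EUC-KR byte yields the 7-bit code of the
# character inside the coded character set itself.
seven_bit = bytes(b & 0x7F for b in euc)

print(euc.hex(), iso.hex(), seven_bit.hex())

# The same two-byte code appears inside the ISO-2022-KR byte stream,
# surrounded by the control sequences that select the character set.
print(seven_bit in iso)
```

Both byte strings decode to the same character; only the packaging of the
character set's code differs.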
International Register of Coded Character Sets
A complete list of registered CCS, with ISO 2022 escape sequences. PDF files
for these CCS are also available.
Characters (ISO 8859)
Characters (ISO 2022)
Characters (ISO 10646 and Unicode)
Software
Arena-i18n
Multilingual web browser.
Mozilla is also a multilingual web browser.
Mule
Multilingual editor whose functionality is included in GNU Emacs 20 and
XEmacs 20. To my knowledge, Mule is the most advanced m17n software.
JFBTERM (in Japanese) is a multilingual terminal for the Linux framebuffer
console. Supported encodings are ISO 2022, EUC-JP, CN-GB, and EUC-KR.
Supported CCS are ISO 8859-{1,2,3,4,5,6,7,8,9,10}, JIS X 0201, JIS X 0208,
GB 2312, and KS X 1001.
The UNICON Project intends to implement display and input of CJK
(Chinese/Japanese/Korean) characters on the Linux framebuffer.
CCE - Chinese Console Environment enables CN-GB Chinese to be displayed on
the Linux and FreeBSD consoles. It also supplies input methods for Chinese.
Xterm is part of the XFree86 distribution. It can display UTF-8 encoded text,
including double-width characters and combining characters.
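Double-width and combining characters matter to a terminal because they break
the assumption that one character occupies one column. The following Python
sketch approximates the column count of a string with the standard unicodedata
module. It is not xterm's actual algorithm, the function name display_width is
hypothetical, and control characters are not treated specially.

```python
import unicodedata

def display_width(text: str) -> int:
    """Approximate the number of terminal columns needed to show text."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining characters are drawn over the preceding character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and fullwidth characters occupy two columns.
            width += 2
        else:
            width += 1
    return width

print(display_width("abc"))
print(display_width("日本語"))
print(display_width("e\u0301"))  # "e" followed by COMBINING ACUTE ACCENT
```

A terminal that ignored these rules would misplace the cursor after any CJK
or accented text.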
Rxvt can display multibyte encodings such as EUC-JP, Shift-JIS, CN-GB, and
Big-5.
libiconv provides an iconv() implementation for systems which don't have one.
It supports various encodings such as ASCII, ISO 8859-*, KOI8-*, EUC-*,
ISO 2022-*, Big5, Shift-JIS, TIS 620, UTF-*, UCS-*, CP*, Mac*, and so on.
This library also has locale_charset(), a replacement for
nl_langinfo(CODESET).
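libiconv itself is a C library. As a rough sketch of the two tasks it covers,
the Python example below uses Python's built-in codecs in the role of iconv()
and locale.nl_langinfo(locale.CODESET), which wraps the C function of the same
name on Unix-like systems. The helper names convert and current_codeset are
hypothetical and are not part of libiconv.

```python
import locale

def convert(data: bytes, from_code: str, to_code: str) -> bytes:
    """Re-encode bytes from one encoding to another, as iconv() does in C."""
    return data.decode(from_code).encode(to_code)

def current_codeset() -> str:
    """Return the name of the character encoding of the current locale."""
    try:
        locale.setlocale(locale.LC_CTYPE, "")  # adopt the user's locale
    except locale.Error:
        pass  # keep the default locale if the environment names an unknown one
    if hasattr(locale, "nl_langinfo"):
        return locale.nl_langinfo(locale.CODESET)
    # Fallback for platforms without nl_langinfo.
    return locale.getpreferredencoding(False)

euc_bytes = "日本語".encode("euc_jp")
utf8_bytes = convert(euc_bytes, "euc_jp", "utf-8")
print(utf8_bytes.decode("utf-8") == "日本語")
print(current_codeset())
```

The codeset name printed at the end depends on the environment in which the
script runs.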
libutf8 - a Unicode/UTF-8 locale plugin provides UTF-8 locale support for
systems which don't have UTF-8 locales.
Pango is a project to develop a portable high-quality text rendering engine.
Projects and Organizations
Linux Internationalization Initiative, or Li18nux, focuses on the i18n of a
core set of APIs and components of Linux distributions. The results will be
proposed to LSB.
LI18NUX 2000 Globalization Specification is the first fruit of Li18nux.
Citrus Project is a project to implement locale/iconv for the BSD family of
operating systems so that they conform to ISO C and SUSv2.
Introduction to i18n