[Prev: BOTLINKS] | [Next: CHARSETCONVERTERS] |
Envariable: N/A.
<CHARSETALIASES>
charset-name; alias, ...
...
</CHARSETALIASES>
Command-line option: N/A.
CHARSETALIASES defines aliases for character set names. For example, the charset iso-8859-1 is also known as latin1. Hence, latin1 is an alias for iso-8859-1 and can be defined as follows:
<CharsetAliases>
iso-8859-1; latin1
</CharsetAliases>
Each line of the CHARSETALIASES element is one alias definition. The syntax of an alias definition is:
charset-name; alias, ...
That is, the character set name, followed by a semicolon, followed by a comma-separated list of aliases.
Specifying a character set multiple times is allowed. For example, the following are equivalent:
<CharsetAliases>
iso-8859-1; latin1, l1, iso_8859_1
</CharsetAliases>

<CharsetAliases>
iso-8859-1; latin1
iso-8859-1; l1
iso-8859-1; iso_8859_1
</CharsetAliases>
If the same alias is specified for two different charsets, the last one defined is used. For example, if the following is defined,
<CharsetAliases>
iso-8859-1; x-foo
koi8-u; x-foo
</CharsetAliases>
then x-foo will be an alias for koi8-u.
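The parsing rules above can be modeled in a few lines of Python. This is only an illustrative sketch, not MHonArc's own code: the function name `parse_charset_aliases` is hypothetical, and the input is assumed to be the text between the opening and closing tags.

```python
def parse_charset_aliases(body):
    """Parse the body of a CHARSETALIASES element into an alias -> charset map.

    Hypothetical helper for illustration; not part of MHonArc.
    """
    aliases = {}
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Everything before the first semicolon is the charset name,
        # everything after it is the comma-separated alias list.
        charset, _, alias_list = line.partition(";")
        charset = charset.strip()
        for alias in alias_list.split(","):
            alias = alias.strip()
            if alias:
                # A later definition of the same alias replaces an earlier one.
                aliases[alias] = charset
    return aliases


combined = parse_charset_aliases("iso-8859-1; latin1, l1, iso_8859_1")
separate = parse_charset_aliases("""
iso-8859-1; latin1
iso-8859-1; l1
iso-8859-1; iso_8859_1
""")
clash = parse_charset_aliases("""
iso-8859-1; x-foo
koi8-u; x-foo
""")

print(combined == separate)  # True
print(clash["x-foo"])        # koi8-u
```

Storing the result keyed by alias makes the "last one defined is used" rule fall out of ordinary dictionary assignment.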
Before invoking CHARSETCONVERTERS filters, MHonArc maps aliases to real names. Therefore, a filter does not need to know all possible names for a given character set.
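A rough Python model of that lookup order follows. The names `resolve_charset` and `convert`, and both tables, are hypothetical; actual MHonArc filters are Perl routines registered through CHARSETCONVERTERS. The point is only that the converter table is keyed by real names while callers may pass any alias.

```python
def resolve_charset(name, aliases):
    """Return the real charset name for an alias, or the name itself."""
    return aliases.get(name, name)


def convert(data, charset, aliases, converters):
    """Resolve the alias first, then pick the converter by real name."""
    real_name = resolve_charset(charset, aliases)
    converter = converters.get(real_name)
    if converter is None:
        raise LookupError(f"unknown charset: {charset}")
    return converter(data)


# Hypothetical tables: the converter is registered under the real name only.
aliases = {"latin1": "iso-8859-1", "l1": "iso-8859-1"}
converters = {"iso-8859-1": lambda data: data.decode("iso-8859-1")}

text = convert(b"caf\xe9", "latin1", aliases, converters)
print(text)
```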
If the override attribute is specified for CHARSETALIASES, any previous settings are cleared. Otherwise, each occurrence of CHARSETALIASES augments the existing settings.
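The difference between augmenting and overriding can be sketched as follows. The function `apply_charset_aliases` and the sample tables are assumptions made for illustration; they model the behavior described above rather than reproduce MHonArc's implementation.

```python
def apply_charset_aliases(current, definitions, override=False):
    """Merge one CHARSETALIASES element into the current alias table.

    With override=True the previous settings are discarded first;
    otherwise the new definitions are added to the existing ones.
    """
    result = {} if override else dict(current)
    result.update(definitions)
    return result


# Hypothetical starting table standing in for the built-in defaults.
defaults = {"latin1": "iso-8859-1", "utf8": "utf-8"}

augmented = apply_charset_aliases(defaults, {"x-foo": "koi8-u"})
replaced = apply_charset_aliases(defaults, {"x-foo": "koi8-u"}, override=True)

print(sorted(augmented))  # ['latin1', 'utf8', 'x-foo']
print(sorted(replaced))   # ['x-foo']
```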
<CharsetAliases>
us-ascii; ascii
us-ascii; ansi_x3.4-1968
us-ascii; iso646
us-ascii; iso646-us
us-ascii; iso646.irv:1991
us-ascii; cp367
us-ascii; ibm367
us-ascii; csascii
us-ascii; iso-ir-6
us-ascii; us
iso-8859-1; latin1
iso-8859-1; l1
iso-8859-1; iso_8859_1
iso-8859-1; iso_8859-1:1987
iso-8859-1; iso8859-1
iso-8859-1; iso8859_1
iso-8859-1; 8859-1
iso-8859-1; 8859_1
iso-8859-1; cp819
iso-8859-1; ibm819
iso-8859-1; x-mac-latin1
iso-8859-1; iso-ir-100
iso-8859-2; latin2
iso-8859-2; l2
iso-8859-2; iso_8859_2
iso-8859-2; iso_8859-2:1987
iso-8859-2; iso8859-2
iso-8859-2; iso8859_2
iso-8859-2; 8859-2
iso-8859-2; 8859_2
iso-8859-2; iso-ir-101
iso-8859-3; latin3
iso-8859-3; l3
iso-8859-3; iso_8859_3
iso-8859-3; iso_8859-3:1988
iso-8859-3; iso8859-3
iso-8859-3; iso8859_3
iso-8859-3; 8859-3
iso-8859-3; 8859_3
iso-8859-3; iso-ir-109
iso-8859-4; latin4
iso-8859-4; l4
iso-8859-4; iso_8859_4
iso-8859-4; iso_8859-4:1988
iso-8859-4; iso8859-4
iso-8859-4; iso8859_4
iso-8859-4; 8859-4
iso-8859-4; 8859_4
iso-8859-4; iso-ir-110
iso-8859-5; iso_8859-5:1988
iso-8859-5; cyrillic
iso-8859-5; iso-ir-144
iso-8859-6; iso_8859-6:1987
iso-8859-6; arabic
iso-8859-6; asmo-708
iso-8859-6; ecma-114
iso-8859-6; iso-ir-127
iso-8859-7; iso_8859-7:1987
iso-8859-7; greek
iso-8859-7; greek8
iso-8859-7; ecma-118
iso-8859-7; elot_928
iso-8859-7; iso-ir-126
iso-8859-8; iso-8859-8-i
iso-8859-8; iso_8859-8:1988
iso-8859-8; hebrew
iso-8859-8; iso-ir-138
iso-8859-9; latin5
iso-8859-9; l5
iso-8859-9; iso_8859_9
iso-8859-9; iso-8859_9:1989
iso-8859-9; iso8859-9
iso-8859-9; iso8859_9
iso-8859-9; 8859-9
iso-8859-9; 8859_9
iso-8859-9; iso-ir-148
iso-8859-10; latin6
iso-8859-10; l6
iso-8859-10; iso_8859_10
iso-8859-10; iso_8859-10:1993
iso-8859-10; iso8859-10
iso-8859-10; iso8859_10
iso-8859-10; 8859-10
iso-8859-10; 8859_10
iso-8859-10; iso-ir-157
iso-8859-13; latin7, l7
iso-8859-14; latin8, l8
iso-8859-15; latin9
iso-8859-15; latin0
iso-8859-15; l9
iso-8859-15; l0
iso-8859-15; iso_8859_15
iso-8859-15; iso8859-15
iso-8859-15; iso8859_15
iso-8859-15; 8859-15
iso-8859-15; 8859_15
iso-2022-jp; iso-2022-jp-1
utf-8; utf8
cp932; shiftjis
cp932; shift_jis
cp932; shift-jis
cp932; x-sjis
cp932; ms_kanji
cp932; csshiftjis
cp936; gbk
cp936; ms936
cp936; windows-936
cp949; euc-kr
cp949; ks_c_5601-1987
cp949; ks_c_5601-1989
cp949; ksc_5601
cp949; iso-ir-149
cp949; windows-949
cp949; ms949
cp949; korean
cp950; windows-950
cp1250; windows-1250
cp1251; windows-1251
cp1252; windows-1252
cp1253; windows-1253
cp1254; windows-1254
cp1255; windows-1255
cp1256; windows-1256
cp1257; windows-1257
cp1258; windows-1258
koi-0; gost-13052
koi8-e; iso-ir-111
koi8-e; ecma-113:1986
koi8-r; cp878
gost-19768-87; ecma-cyrillic
gost-19768-87; ecma-113
gost-19768-87; ecma-113:1988
big5-eten; big5
big5-eten; csbig5
big5-eten; tcs-big5
big5-eten; tcsbig5
big5-hkscs; big5hk
big5-hkscs; big5hkscs
big5-hkscs; hkscs-big5
big5-hkscs; hk-big5
gb2312; gb_2312-80
gb2312; csgb2312
gb2312; hz-gb-2312
gb2312; iso-ir-58
gb2312; euc-cn
gb2312; chinese
gb2312; csiso58gb231280
macarabic; apple-arabic
maccentraleurroman; apple-centeuro
maccroatian; apple-croatian
maccyrillic; apple-cyrillic
macgreek; apple-greek
machebrew; apple-hebrew
macicelandic; apple-iceland
macromanian; apple-romanian
macroman; apple-roman
macthai; apple-thai
macturkish; apple-turkish
macarabic; x-mac-arabic
maccentraleurroman; x-mac-centraleurroman
maccroatian; x-mac-croatian
maccyrillic; x-mac-cyrillic
macgreek; x-mac-greek
machebrew; x-mac-hebrew
macicelandic; x-mac-icelandic
macromanian; x-mac-romanian
macroman; x-mac-roman
macthai; x-mac-thai
macturkish; x-mac-turkish
</CharsetAliases>
Resource variables: N/A.
CHARSETALIASES is generally useful for resolving the "unknown charset" warnings that MHonArc generates, since some MUAs specify non-standard names for charsets.
Another use is to make MHonArc treat data labeled with one charset as data in another charset. For example, in some locales, MUAs improperly set the charset="..." parameter in text messages. CHARSETALIASES can tell MHonArc to treat the improperly labeled data as another charset during conversion. For example,
<CharsetAliases>
iso-8859-8; us-ascii
</CharsetAliases>
tells MHonArc to treat US-ASCII data as Hebrew.
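The effect of such a remapping can be illustrated in Python, with Python's built-in codecs standing in for MHonArc's converters (an assumption made purely for the demonstration). The sample bytes are a Hebrew word encoded in ISO-8859-8 but labeled, incorrectly, as us-ascii.

```python
# Alias table equivalent to the definition above: us-ascii -> iso-8859-8.
aliases = {"us-ascii": "iso-8859-8"}

labeled = "us-ascii"          # charset claimed by the message
data = b"\xf9\xec\xe5\xed"    # four Hebrew letters in ISO-8859-8

# Taking the label at face value fails: these bytes are not 7-bit ASCII.
try:
    data.decode(labeled)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Resolving the alias first selects the Hebrew charset instead.
effective = aliases.get(labeled, labeled)
text = data.decode(effective)

print(strict_failed)  # True
print(effective)      # iso-8859-8
```

Because the alias applies to every message labeled us-ascii, a mapping like this is only appropriate for archives where the mislabeling is systematic.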
Version: 2.6.0