Chinese Character Set Standard in GLIBC-2.2
-
BIG5:
(CVS)
- Encoding range: Byte1: 0xa1-0xf9, Byte2: 0x40-0x7e, 0xa1-0xfe.
- Follows the CP950 standard.
- Also merges the ETen extension characters (0xC6A1..0xC7FC), which are
mapped into the "Private Use" area of Unicode (U+F6B1..U+F848).
This mapping is adopted from Arphic Technology Co., Ltd.
- Big5 characters with an irreversible mapping include:
0xA2CC, 0xA2CE, 0xF9E9, 0xF9EA, 0xF9EB,
0xF9F9, 0xF9FA, 0xF9FB, 0xF9FC, 0xF9FD.
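The byte ranges and code lists above can be turned into a small range check. This is only a sketch in Python, not the glibc converter: the helper names are made up for illustration, and passing the check means a code lies inside the stated ranges, not that glibc assigns a character to it.

```python
# Byte ranges exactly as stated in the BIG5 notes above.
# All names here are illustrative; none of them come from glibc.
BIG5_LEAD = range(0xA1, 0xF9 + 1)
BIG5_TRAIL = (range(0x40, 0x7E + 1), range(0xA1, 0xFE + 1))

# Codes listed above as having an irreversible mapping.
BIG5_IRREVERSIBLE = frozenset({
    0xA2CC, 0xA2CE, 0xF9E9, 0xF9EA, 0xF9EB,
    0xF9F9, 0xF9FA, 0xF9FB, 0xF9FC, 0xF9FD,
})


def in_big5_range(code: int) -> bool:
    """Return True if a two-byte code lies in the stated BIG5 byte ranges."""
    lead, trail = code >> 8, code & 0xFF
    return lead in BIG5_LEAD and any(trail in r for r in BIG5_TRAIL)


def is_eten_extension(code: int) -> bool:
    """Return True for in-range codes inside the ETen span 0xC6A1..0xC7FC."""
    return 0xC6A1 <= code <= 0xC7FC and in_big5_range(code)


def is_listed_irreversible(code: int) -> bool:
    """Return True if the code appears in the irreversible list above."""
    return code in BIG5_IRREVERSIBLE
```

For example, in_big5_range(0xA440) holds, while 0xA47F fails because 0x7F falls in the gap between the two second-byte ranges.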
-
BIG5HKSCS:
(CVS)
- Encoding range: Byte1: 0x81-0xfe, Byte2: 0x40-0x7e, 0xa1-0xfe.
- Largely compatible with BIG5 (CP950), with some variations, plus
many Hong Kong extension characters outside the encoding range of
CP950.
- Official information:
http://www.info.gov.hk/digital21/eng/hkscs/.
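To illustrate how the wider first-byte range relates to the CP950 range, the sketch below sorts a two-byte code by where its first byte falls. The function name and the three labels are hypothetical, and the CP950 first-byte range 0xa1-0xf9 is taken from the BIG5 notes earlier in this list. A code in the shared range is not guaranteed to map to the same character in both charsets, since some variations exist.

```python
def classify_big5hkscs(code: int) -> str:
    """Classify a two-byte code against the stated BIG5HKSCS byte ranges.

    Illustrative helper only; it checks ranges, not character assignment.
    """
    lead, trail = code >> 8, code & 0xFF
    trail_ok = 0x40 <= trail <= 0x7E or 0xA1 <= trail <= 0xFE
    if not (0x81 <= lead <= 0xFE and trail_ok):
        return "invalid"
    # First bytes 0xa1-0xf9 overlap the BIG5 (CP950) range given earlier.
    if 0xA1 <= lead <= 0xF9:
        return "shared-range"
    # First bytes 0x81-0xa0 and 0xfa-0xfe lie outside the CP950 range.
    return "hkscs-only-range"
```

With this sketch, 0x8840 and 0xFEFE land in the Hong Kong-only range, while 0xA440 falls in the range shared with BIG5.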
-
EUC-TW (CNS11643):
(CVS)
-
GB2312:
(CVS)
-
GBK:
(CVS)
-
GB18030:
(CVS)
-
UTF-8:
(CVS)