Friday, April 30, 2004

Supplementary characters in the Java Platform in J2SE 1.5

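Java represents a char as a fixed-width 16-bit value, so a single char can distinguish at most 65,536 characters.
Unicode, however, now has room for 1,112,064 characters; those beyond U+FFFF are called supplementary characters and do not fit in one char.
Mainland China now requires support for GB18030, and Taiwan uses the CNS-11643 standard; both include such characters.
For how J2SE 1.5 supports these characters, see http://java.sun.com/developer/technicalArticles/Intl/Supplementary/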
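
To make the 16-bit limit concrete, here is a small sketch using the code point methods that J2SE 1.5 added to String and Character. The class name SupplementaryDemo and its helper methods are my own, made up for illustration; U+20000 is just one example of a supplementary character (a CJK ideograph outside the 16-bit range).

```java
public class SupplementaryDemo {

    // Number of 16-bit char units in the string (what length() has always returned).
    static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode code points, counting each surrogate pair once.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    // Builds a String from a single code point; supplementary ones become two chars.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Walks the string one code point at a time instead of one char at a time.
    static int countByIteration(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp); // advances by 1 or 2 chars
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int cp = 0x20000; // example supplementary character
        String s = fromCodePoint(cp);

        System.out.println(Character.isSupplementaryCodePoint(cp)); // true
        System.out.println(charCount(s));                           // 2
        System.out.println(codePointCount(s));                      // 1
        System.out.println(Character.isHighSurrogate(s.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(s.charAt(1)));  // true
        System.out.println(Integer.toHexString(s.codePointAt(0)));  // 20000
    }
}
```

The point is that length() and charAt() still count 16-bit units, so code that must treat each character as one unit should iterate with codePointAt and Character.charCount as in countByIteration above.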

Support for supplementary characters is likely to also become a common business requirement in East Asian markets. Government applications are going to require them in order to correctly represent names that include rare Chinese characters. Publishing applications may need them in order to represent the full set of historical and variant characters. The Chinese government requires support for GB18030, a character encoding that encodes the entire Unicode character set, and so includes supplementary characters if Unicode version 3.1 or later is assumed. The Taiwanese standard CNS-11643 includes numerous characters that have been included in Unicode 3.1 as supplementary characters. The Hong Kong government defined a collection of characters that are needed for Cantonese, and some of these characters are supplementary characters in Unicode. Finally, some vendors in Japan are planning to use the large private use area in the supplementary character space for more than 50,000 kanji character variants in order to migrate from their proprietary systems to solutions based on the Java platform.