[Contents] [Previous Chapter] [Next Section] [Next Chapter] [Index] [Help]


1     Character Sets

The DIGITAL UNIX operating system software supports the following Japanese Industrial Standard (JIS) character sets: JIS X 0201, JIS X 0208, and JIS X 0212.

It also supports the following international standard codesets: Unicode and ISO/IEC 10646.




1.1   JIS X 0201

JIS X 0201 is a national standard for Japanese information interchange. It was first published by the Japanese Industrial Standards Committee in 1976, and the current version is JIS X 0201-1997. It is a single-byte character set consisting of Roman letters and Katakana characters.

The Roman characters defined in JIS X 0201 are the same as those in ASCII, with two exceptions: the tilde (~) is replaced by an overline (a horizontal bar at the top of the character cell), and the backslash (\) is replaced by the Japanese currency sign, the yen (¥).

The Katakana characters are Japanese phonetic symbols.
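The differences from ASCII described above can be sketched as a small byte-to-character mapping. This is only an illustration: the byte positions (0x5C, 0x7E, and 0xA1 to 0xDF for Katakana) and the Unicode code points chosen as targets are not stated in this section and are assumed from the commonly published 8-bit layout of JIS X 0201. The function name is hypothetical.

```python
def jis_x0201_to_unicode(byte: int) -> str:
    """Map one JIS X 0201 byte to a Unicode character (illustrative sketch)."""
    if byte == 0x5C:
        return "\u00a5"  # yen sign takes the place of the ASCII backslash
    if byte == 0x7E:
        return "\u203e"  # overline takes the place of the ASCII tilde
    if 0x00 <= byte <= 0x7F:
        return chr(byte)  # every other 7-bit position matches ASCII
    if 0xA1 <= byte <= 0xDF:
        # Assumed Katakana range, mapped onto Unicode halfwidth Katakana
        return chr(0xFF61 + (byte - 0xA1))
    raise ValueError(f"byte 0x{byte:02X} is not assigned in JIS X 0201")
```

For example, `jis_x0201_to_unicode(0x5C)` returns the yen sign, while `jis_x0201_to_unicode(0x41)` returns the letter A unchanged.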

Figure 1-1: JIS X 0201 Encoding






1.2   JIS X 0208

JIS X 0208 is a national standard for a primary set of graphic characters for Japanese information interchange. It was first published by the Japanese Industrial Standards Committee in 1978. The current revision is JIS X 0208-1997.

JIS X 0208 defines a total of 6,879 characters, grouped by row as shown in Table 1-1.

Table 1-1: Characters Defined in JIS X 0208

Rows       Characters
-------    -----------------------------------------------
1 and 2    Special symbols
3          Numerals and Roman letters
4          Hiragana characters
5          Katakana characters
6          Greek letters
7          Russian letters
8          Symbols for drawing graphs, diagrams, and lines
16-47      First level Kanji
48-84      Second level Kanji

The JIS X 0208 code table is divided into 94 rows, numbered from 1 to 94. Each row has 94 columns, also numbered from 1 to 94.
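The row and column layout can be illustrated with a short sketch that classifies a row according to Table 1-1 and converts a row/column pair into a two-byte code. The classification follows the table directly. The byte conversion (adding 0x20 to the row and to the column) is not described in this section; it is assumed from the common 7-bit JIS encoding convention. All names are hypothetical.

```python
# Row ranges from Table 1-1; rows not listed there are treated as unassigned.
ROW_CATEGORIES = [
    (range(1, 3), "Special symbols"),
    (range(3, 4), "Numerals and Roman letters"),
    (range(4, 5), "Hiragana characters"),
    (range(5, 6), "Katakana characters"),
    (range(6, 7), "Greek letters"),
    (range(7, 8), "Russian letters"),
    (range(8, 9), "Symbols for drawing graphs, diagrams, and lines"),
    (range(16, 48), "First level Kanji"),
    (range(48, 85), "Second level Kanji"),
]


def check_cell(row: int, col: int) -> None:
    """Reject positions outside the 94 x 94 code table."""
    if not (1 <= row <= 94 and 1 <= col <= 94):
        raise ValueError(f"row/column must be in 1..94, got {row}/{col}")


def row_category(row: int) -> str:
    """Return the Table 1-1 category for a row, or 'Unassigned'."""
    check_cell(row, 1)
    for rows, name in ROW_CATEGORIES:
        if row in rows:
            return name
    return "Unassigned"


def cell_to_jis_bytes(row: int, col: int) -> bytes:
    """Convert a row/column pair to two 7-bit bytes (assumed 0x20 offset)."""
    check_cell(row, col)
    return bytes([0x20 + row, 0x20 + col])
```

Under these assumptions, row 16, column 1 (the first cell of the first level Kanji) becomes the byte pair 0x30 0x21.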

Figure 1-2: JIS X 0208 Character Set





1.3   JIS X 0212

JIS X 0212 is a national standard for a supplemental set of graphic characters for Japanese information interchange. It was published by the Japanese Industrial Standards Committee in 1990, and JIS X 0212-1990 is the current revision.

Support for JIS X 0212 characters in DIGITAL UNIX is limited. The operating system provides no fonts or input methods for JIS X 0212 characters. However, character data encoded in a codeset that contains the JIS X 0212 character set can be processed correctly by the internationalized library functions.

JIS X 0212 defines a total of 6,067 characters, grouped by row as shown in Table 1-2.

Table 1-2: Characters Defined in JIS X 0212

Rows       Characters
-------    -----------------------
1 and 2    Special symbols
6-11       Alphabetic characters
16-77      Supplemental Kanji

The JIS X 0212 code table is divided into 94 rows, numbered from 1 to 94. Each row has 94 columns, also numbered from 1 to 94.
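As an illustration of how a row/column position in this table might appear in encoded data, the sketch below builds a three-byte sequence for one JIS X 0212 character. It assumes the Japanese Extended UNIX Code (EUC-JP) as the codeset, which this section does not name, and assumes that codeset's usual convention of a 0x8F single-shift byte followed by the row and column each offset by 0xA0. The function name is hypothetical.

```python
SS3 = 0x8F  # single-shift byte assumed to introduce a JIS X 0212 character


def jis_x0212_to_euc_jp(row: int, col: int) -> bytes:
    """Encode a JIS X 0212 row/column pair as three bytes (assumed EUC-JP form)."""
    if not (1 <= row <= 94 and 1 <= col <= 94):
        raise ValueError(f"row/column must be in 1..94, got {row}/{col}")
    # Both position bytes have the high bit set, so they fall in 0xA1..0xFE.
    return bytes([SS3, 0xA0 + row, 0xA0 + col])
```

Under these assumptions, row 16, column 1 (the first cell of the supplemental Kanji) is encoded as 0x8F 0xB0 0xA1.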

Figure 1-3: JIS X 0212 Character Set





1.4   Unicode

The Unicode Standard, Version 2.0, specifies a universal character set (UCS) that defines 38,885 characters and also includes a Private Use Area for vendor-defined or user-defined characters. The main features of this character set are:




1.5   ISO/IEC 10646

The ISO/IEC 10646 standard, which is specified in Information Technology - Universal Multiple-Octet Coded Character Set, ISO/IEC 10646, allows characters to be specified either as 32-bit units or, like Unicode, as 16-bit units. In the 32-bit form, each 16-bit Unicode character value is zero-extended to 32 bits.
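Zero extension can be shown with a minimal sketch that widens a sequence of 16-bit character values to 32-bit values. Big-endian byte order is an assumption made for the example, since this section does not specify one, and the function name is hypothetical.

```python
import struct


def widen_16_to_32(data_16: bytes) -> bytes:
    """Zero-extend big-endian 16-bit character values to 32-bit values."""
    if len(data_16) % 2 != 0:
        raise ValueError("input must contain whole 16-bit units")
    count = len(data_16) // 2
    units = struct.unpack(f">{count}H", data_16)  # read 16-bit values
    return struct.pack(f">{count}I", *units)      # write them as 32-bit values
```

For example, the 16-bit value 0x3042 (two bytes, 0x30 0x42) becomes the four bytes 0x00 0x00 0x30 0x42: the value is unchanged and the upper 16 bits are zero.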

