Japanese language and computers

Using the Japanese language on computers raises several adaptation issues that are unique to the language. Many relate to transliteration and romanization, some to character encoding, and some to the input of Japanese text.

Table of contents

1 Romanization
2 Character encoding
3 Input method
4 Gaiji
5 See also

Romanization

Modern Japanese is usually entered into a computer via romanization. There are two main systems for the romanization of Japanese, Kunrei-shiki and Hepburn. The Kunrei system is widely used in Japan for input on a Roman-alphabet keyboard, since it is slightly briefer and more systematic than the Hepburn system. Foreigners, however, typically prefer the Hepburn system, because the Kunrei system does not correspond as well to the actual sounds of Japanese.
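The difference between the two systems can be illustrated with a small sketch. The table below covers only a handful of kana on which Kunrei-shiki and Hepburn disagree, and the names `ROMANIZATION` and `romanize` are hypothetical helpers introduced for this illustration, not part of any standard library.

```python
# Kana on which the two systems differ: (Kunrei-shiki, Hepburn).
# This table is deliberately tiny and only illustrative.
ROMANIZATION = {
    "し": ("si", "shi"),
    "ち": ("ti", "chi"),
    "つ": ("tu", "tsu"),
    "ふ": ("hu", "fu"),
    "じ": ("zi", "ji"),
}

SYSTEMS = {"kunrei": 0, "hepburn": 1}


def romanize(kana, system="hepburn"):
    """Romanize a string made only of the kana listed in ROMANIZATION."""
    index = SYSTEMS[system]
    return "".join(ROMANIZATION[ch][index] for ch in kana)


for word in ("ふじ", "つち"):
    print(word, romanize(word, "kunrei"), romanize(word, "hepburn"))
```

In this sample the Kunrei spellings are never longer than the Hepburn ones and always pair one consonant letter with one vowel, which is consistent with the description of Kunrei as briefer and more systematic.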

Character encoding

There are several standard methods of encoding Japanese characters for use on a computer, including JIS, Shift JIS (SJIS), EUC, and Unicode. While mapping the set of kana is a simple matter, kanji have proven more difficult. Because Japanese kanji differ, sometimes slightly and sometimes significantly, from the corresponding characters in Chinese, it has proven both challenging and controversial to construct an encoding system that encompasses both Chinese and Japanese characters equitably.
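As a rough illustration of how these encodings differ, the following sketch encodes the same three kanji with four of Python's built-in codecs. The codec names are Python's own; `iso2022_jp` is assumed here to stand in for the 7-bit encoding commonly called "JIS", and `utf-8` is only one of several Unicode encoding forms.

```python
text = "日本語"  # "Japanese language"

# Python codec names; "iso2022_jp" is the 7-bit encoding often just called JIS.
CODECS = ("iso2022_jp", "shift_jis", "euc_jp", "utf-8")

encoded = {codec: text.encode(codec) for codec in CODECS}

for codec, data in encoded.items():
    # Print the byte count and the raw bytes in hexadecimal.
    print(f"{codec:10} {len(data):2d} bytes  {data.hex(' ')}")
    # Each encoding round-trips as long as the same codec is used to decode.
    assert data.decode(codec) == text
```

The same text yields a different byte sequence under each encoding, so a program reading the bytes has to know which encoding was used to write them.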

Unicode has been criticized in Japan (as well as in China and Korea) because it assigns the same code to similar characters from various East Asian languages, even though the character may vary in form and pronunciation [1]. Unicode is also criticized for failing to allow for older and alternate forms of kanji. Though Japanese computer users have almost no trouble handling contemporary text, research on ancient Japanese has been considerably handicapped by this limitation.
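The shared-code behaviour can be seen directly from Python's `unicodedata` module. The character 直 is used here as an assumed example of a kanji whose conventional printed shape differs between Japanese and Chinese typography; the sketch only shows that Unicode gives it one code point, with no language attached.

```python
import unicodedata

ch = "\u76f4"  # 直, one code point shared by Chinese and Japanese text

# The character name identifies a unified ideograph, not a language.
print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Which shape appears on screen is therefore decided by the font or by language information supplied outside the character code itself.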

These problems have led to the continued wide use of several encoding standards in Japan, despite increasing Unicode use in other countries. For example, most Japanese e-mail and web pages are encoded in SJIS or JIS rather than Unicode. When text is decoded with a different encoding from the one used to write it, the result is mojibake (misconverted characters), which has left much Japanese text on computers unreadable.
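Mojibake is easy to reproduce. The sketch below writes a string as Shift JIS and then reads the bytes back as if they were EUC-JP; the choice of these two codecs is an assumption made for illustration, and any mismatched pair behaves similarly.

```python
text = "文字化け"  # "mojibake"

raw = text.encode("shift_jis")

# Decoding with the wrong codec; errors="replace" substitutes U+FFFD
# for byte sequences that are invalid in EUC-JP instead of raising.
garbled = raw.decode("euc_jp", errors="replace")

# Decoding with the codec that was used to write the bytes.
restored = raw.decode("shift_jis")

print(repr(garbled))
print(restored)
```

The bytes themselves are undamaged; only the interpretation is wrong, which is why the text can be recovered once the correct encoding is known.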

Input method

Japanese text input is complicated not only by the encoding problems discussed above, but also because it is practically impossible to type all of the characters used in the Japanese writing system with the limited number of keys on a keyboard. On modern computers, Japanese is input on a standard keyboard via romanization, combined with an Input Method Editor that lets the user choose the correct characters from a list. Another method, known as Oyayubi shift and developed by Fujitsu, allows direct kana input, but it is now obsolete.
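The two steps performed by an Input Method Editor, converting romanized keystrokes to kana and then offering candidate characters, can be sketched as follows. This is a toy model under stated assumptions: the tables `KANA` and `CANDIDATES` and the functions `romaji_to_kana` and `candidates` are hypothetical and cover only a few syllables and words, whereas a real editor relies on large dictionaries and context.

```python
# Tiny romaji-to-hiragana table accepting both Kunrei and Hepburn spellings.
KANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "ha": "は", "ma": "ま", "mi": "み", "n": "ん",
    "si": "し", "shi": "し", "zi": "じ", "ji": "じ",
}

# Illustrative dictionary: one kana reading maps to several kanji.
CANDIDATES = {
    "かみ": ["紙", "神", "髪"],   # paper, god, hair
    "はし": ["橋", "箸", "端"],   # bridge, chopsticks, edge
}


def romaji_to_kana(romaji):
    """Convert romaji to hiragana by greedy longest match on KANA."""
    out = []
    i = 0
    while i < len(romaji):
        for length in (3, 2, 1):
            chunk = romaji[i:i + length]
            if chunk in KANA:
                out.append(KANA[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"cannot convert {romaji[i:]!r}")
    return "".join(out)


def candidates(romaji):
    """Return the list a user would choose from; fall back to plain kana."""
    kana = romaji_to_kana(romaji)
    return CANDIDATES.get(kana, [kana])


print(romaji_to_kana("kanji"))
print(candidates("kami"))
print(candidates("hasi") == candidates("hashi"))
```

Because many words share a reading, the conversion cannot be fully automatic, which is why the user is asked to pick from a list.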

Gaiji

Because a number of often-used characters are omitted from standard character sets such as JIS or even Unicode, gaiji (外字, "external characters") are sometimes used to supplement the character set. However, with the spread of computer networking and the Internet, gaiji are no longer used as frequently, since a character defined on one system is not displayed correctly on another. As a result, omitted characters are often written with similar or simpler characters in their place.
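The substitution of a similar character can be sketched as below. The example assumes Python's `shift_jis` codec, which should reject the variant 髙 because it lies outside the JIS X 0208 set; the table `FALLBACKS` and the function `to_encodable` are hypothetical names introduced for this illustration.

```python
# Hypothetical substitution table: variant form -> common form.
FALLBACKS = {"髙": "高"}


def to_encodable(text, codec="shift_jis"):
    """Replace characters the codec cannot encode with a similar one."""
    out = []
    for ch in text:
        try:
            ch.encode(codec)
            out.append(ch)
        except UnicodeEncodeError:
            # Use the listed substitute, or "?" when none is known.
            out.append(FALLBACKS.get(ch, "?"))
    return "".join(out)


print(to_encodable("髙橋"))
```

The result can be exchanged between systems, at the cost of losing the distinction between the variant and the common form.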