Avoiding Mojibake
by John de Hoog (Wataru Tenga)
Even when using a Japanese-capable e-mail program,
it is possible to inadvertently send mail to a mailing list or other
recipients that shows up as unreadable gibberish, so-called mojibake.
Here are some basic considerations for avoiding this problem.
Choose the correct encoding
Ordinarily, Unicode would seem to be the safest choice. Unfortunately,
a few popular programs, notably Eudora, do not support this encoding
scheme. Use it only if you are sure your recipient can handle it and
you need to mix different scripts (e.g., Japanese and Korean) in the
same message.
Shift-JIS is to be avoided. For Japanese, using ISO-2022-JP will
ensure that anyone with a Japanese-enabled system and e-mail program
will be able to read your message. (In some mail programs this may
be called simply "JIS" or "JIS7".)
Trying to send Japanese characters in a message whose encoding
is set to ASCII (plain text) or to a Western encoding like ISO-8859-1
is an invitation to serious mojibake.
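The difference between these encodings can be checked outside any mail
program. What follows is a minimal Python sketch rather than part of
the original advice: the helper names `can_encode` and `is_seven_bit`
are made up for illustration, and `iso2022_jp` and `shift_jis` are
Python's codec names for ISO-2022-JP and Shift-JIS.

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def is_seven_bit(data: bytes) -> bool:
    """Return True if no byte has its high bit set."""
    return all(byte < 0x80 for byte in data)


if __name__ == "__main__":
    sample = "日本語"
    # Japanese text fits in ISO-2022-JP but not in ASCII or ISO-8859-1.
    for charset in ("ascii", "iso-8859-1", "iso2022_jp"):
        print(charset, can_encode(sample, charset))
    # ISO-2022-JP uses escape sequences and stays within 7 bits;
    # Shift-JIS and UTF-8 both need 8-bit bytes for the same text.
    for charset in ("iso2022_jp", "shift_jis", "utf-8"):
        print(charset, is_seven_bit(sample.encode(charset)))
```

The 7-bit property is what lets ISO-2022-JP pass through mail systems
that only handle ASCII-range bytes.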
Common pitfalls
The Honyaku mailing list, with its heavy traffic in multilingual
messages passed among translators, is a good place to observe some of
the common pitfalls that lead to mojibake.
The most common problem by far comes from sending not from a mail
program but through the Web interface. See "Posting via a Web
interface" below on how to avoid mojibake in this case.
Another cause of mojibake is posters who use a mail program that
does not support Japanese and try to get around this by pasting
in Japanese text from a different program. The pasted-in text is
likely to be Shift-JIS, whereas the message header will identify
the message as ASCII or as ISO-8859-1. In most cases this mismatch
produces mojibake on the receiving end.
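The mismatch between the bytes and the declared encoding can be
reproduced in a few lines. This is only an illustrative Python sketch;
the function names `mislabel` and `recover` are hypothetical, and it
assumes the text really is Shift-JIS labeled as ISO-8859-1.

```python
def mislabel(text: str, actual: str = "shift_jis",
             declared: str = "iso-8859-1") -> str:
    """Encode text as `actual`, then read the bytes as `declared`.

    This mimics a receiver that trusts a wrong charset header.
    """
    raw = text.encode(actual)
    # ISO-8859-1 maps every byte to a character, so this never raises;
    # it silently produces the wrong characters instead.
    return raw.decode(declared)


def recover(garbled: str, declared: str = "iso-8859-1",
            actual: str = "shift_jis") -> str:
    """Undo mislabel(): get the original bytes back and decode them properly."""
    return garbled.encode(declared).decode(actual)


if __name__ == "__main__":
    original = "文字化け"
    garbled = mislabel(original)
    print(repr(garbled))      # unreadable Western characters
    print(recover(garbled))   # the original text again
```

Recovery works here only because ISO-8859-1 preserves every byte. If a
program along the way drops or rewrites the 8-bit bytes, the original
text may not be recoverable.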
Even in a mailer with full Japanese support, something similar
can happen. The poster responds to a message that did not have any
Japanese text, but inserts Japanese in the response. Or, the poster
starts a new message in a default Western encoding, but adds Japanese
in the process of editing. Some mailers will not adjust the encoding
header accordingly. So it is important to check the encoding before
sending a message.
One more pitfall has to do with Unicode. Programs like Outlook and
Outlook Express will default to Unicode (UTF-8) encoding when they
detect characters that cannot be sent in a 7-bit scheme. Rather than
accepting this choice, it is safer to go back and find the offending
character, remove it, and set the encoding to ISO-2022-JP (JIS).
A good e-mail program like TuruKame Mail will warn you any time you
try to send a message containing characters that do not match the
encoding in your headers. In other programs you must check manually,
and the procedure differs from program to program. Take time to learn
how to check and adjust the encoding in your particular program.
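When a mailer gives no such warning, a draft can also be checked
outside the program. The sketch below is an assumption-laden
illustration, not a description of how any particular mailer does it:
`find_unencodable` is a hypothetical helper that lists the characters
a given encoding cannot represent, using Python's `iso2022_jp` codec
as the default.

```python
def find_unencodable(text: str, charset: str = "iso2022_jp"):
    """Return (position, character) pairs that charset cannot encode."""
    offenders = []
    for index, char in enumerate(text):
        try:
            char.encode(charset)
        except UnicodeEncodeError:
            offenders.append((index, char))
    return offenders


if __name__ == "__main__":
    draft = "日本語한"  # the last character is Korean hangul
    for position, char in find_unencodable(draft):
        print(f"position {position}: {char!r} cannot be sent as ISO-2022-JP")
```

An empty result means the whole draft fits the chosen encoding. Any
reported character is a candidate for the "offending character" to
remove or replace before sending.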
Posting via a Web interface
This last problem does not have to do with e-mail programs per
se, but relates to the mailing lists hosted on Yahoo Groups.
It is possible to post messages to Honyaku and other such lists
directly from the Yahoo Groups Web site. Unfortunately, the default
is set to English (ISO-8859-1), which is unacceptable for Japanese.
Right under the text entry box where you enter your message text
is a setting for designating the posting language.
A common mistake is to assume that since the message is in English,
this setting should also be "English". But in fact, what the setting
controls is those pesky encoding schemes. The choice of "English"
actually sets the encoding to ISO-8859-1, which is no good for
Japanese.
So when posting from the Web site, if your message (or signature)
has any Japanese at all in it, be sure to choose "Japanese" as the
posting language.
Additional complications can arise, however, depending on your
browser's default encoding settings. If they are set to Auto-detect,
Japanese (JIS), or Japanese (EUC), there should be no problem. But
if Unicode (UTF-8) is set as the default, even selecting Japanese
as the posting language may not have the desired result. The safest
approach is to avoid posting through the Web altogether, using a
Japanese e-mail program like those introduced here instead.
Other resources
If you've received a bake'd piece of mail, this page may be able to
help decode it. It offers a MIME header decoder, a broken JIS mail
recovery service, and a Unicode decoder.
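For readers who want to try MIME header decoding locally, Python's
standard `email.header` module can do it. This is a small sketch, not
a reimplementation of the page mentioned above; `decode_mime_header`
is a hypothetical wrapper name, and the sample header is an
illustrative ISO-2022-JP encoded word.

```python
from email.header import decode_header, make_header


def decode_mime_header(raw: str) -> str:
    """Decode a header containing =?charset?B?...?= or =?charset?Q?...?= words."""
    # decode_header splits the value into (bytes, charset) parts;
    # make_header joins them back into one readable string.
    return str(make_header(decode_header(raw)))


if __name__ == "__main__":
    sample = "=?ISO-2022-JP?B?GyRCRnxLXDhsGyhC?="
    print(decode_mime_header(sample))
```

This only helps when the header names the correct charset. A header
that was mislabeled when sent will decode without error and still come
out garbled.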