Book: Linux Network Administrator Guide, Second Edition

National Character Sets

A set of standards and RFCs has been developed that amends the RFC-822 standard to support various types of message content, such as plain text, binary data, and PostScript files. These standards are commonly referred to as MIME, or Multipurpose Internet Mail Extensions. Among other things, MIME lets the recipient know whether a character set other than standard ASCII was used when writing the message, for example, for French accents or German umlauts. elm supports these characters to some extent.

The character set used by Linux internally to represent characters is usually referred to as ISO-8859-1, which is the name of the standard it conforms to. It is also known as Latin-1. Any message using characters from this character set should have the following line in its header:

Content-Type: text/plain; charset=iso-8859-1

The receiving system should recognize this field and take appropriate measures when displaying the message. The default for text/plain messages is a charset value of us-ascii.
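
As a rough illustration of how a receiving program might read this field, the following Python sketch uses the standard email package to extract the declared character set and fall back to us-ascii when none is given. The helper name declared_charset and the sample addresses are assumptions made for this example; they are not part of elm.

```python
from email import message_from_string


def declared_charset(raw_message: str) -> str:
    """Return the charset named in Content-Type, or us-ascii if none is given."""
    msg = message_from_string(raw_message)
    # get_content_charset() returns the lowercased charset parameter,
    # or failobj when the header or the parameter is missing.
    return msg.get_content_charset(failobj="us-ascii")


# Two sample messages (hypothetical senders), one with and one without the field.
latin1_mail = (
    "From: alice@example.org\n"
    "Content-Type: text/plain; charset=iso-8859-1\n"
    "\n"
    "Gr\u00fc\u00dfe aus Z\u00fcrich\n"
)
plain_mail = "From: bob@example.org\n\nHello\n"

print(declared_charset(latin1_mail))  # iso-8859-1
print(declared_charset(plain_mail))   # us-ascii
```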

To be able to display messages with character sets other than ASCII, elm must know how to print these characters. By default, when elm receives a message with a charset field other than us-ascii (or a content type other than text/plain, for that matter), it tries to display the message using a command called metamail. Messages that require metamail to be displayed are shown with an M in the very first column in the overview screen.

Since Linux's native character set is ISO-8859-1, calling metamail is not necessary to display messages using this character set. If elm is told that the display understands ISO-8859-1, it will not use metamail, but will display the message directly instead. This can be enabled by setting the following option in the global elm.rc:

displaycharset = iso-8859-1

Note that you should set this option even when you are never going to send or receive any messages that actually contain characters other than ASCII. This is because people who do send such messages usually configure their mailer to put the proper Content-Type: field into the mail header by default, whether or not they are sending ASCII-only messages.
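
The decision described above can be modeled in a few lines of Python. This is only a simplified sketch of the behavior as the text describes it, not elm's actual code; the function name needs_metamail and its parameters are hypothetical.

```python
def needs_metamail(content_type: str, charset: str,
                   display_charset: str = "us-ascii") -> bool:
    """Hypothetical model of when a message is handed to metamail."""
    content_type = content_type.lower()
    charset = charset.lower()
    display_charset = display_charset.lower()

    # Anything other than plain text is passed to metamail.
    if content_type != "text/plain":
        return True
    # Plain text is shown directly if it is us-ascii or matches
    # the character set the display is declared to understand.
    return charset not in ("us-ascii", display_charset)


# Without displaycharset set, a Latin-1 message goes through metamail.
print(needs_metamail("text/plain", "iso-8859-1"))                # True
# With displaycharset = iso-8859-1, it is displayed directly.
print(needs_metamail("text/plain", "iso-8859-1", "iso-8859-1"))  # False
```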

However, setting this option in elm.rc is not enough. When displaying the message with its built-in pager, elm calls a library function for each character to determine whether it is printable. By default, this function recognizes only ASCII characters as printable, and elm displays all other characters as ^?. You can change this behavior by setting the environment variable LC_CTYPE to ISO-8859-1, which tells the library to accept Latin-1 characters as printable. Support for this and other features has been available since Version 4.5.8 of the Linux standard library.
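
To make the effect of the printable check concrete, here is a small Python model of a pager that substitutes ^? for every byte it considers unprintable. It does not call the C library; the byte ranges (0x20 to 0x7E for ASCII, plus 0xA0 to 0xFF for Latin-1) and the function names are assumptions chosen for illustration.

```python
def is_printable(byte: int, latin1: bool) -> bool:
    """Model of the per-character check: ASCII only, or ASCII plus Latin-1."""
    if 0x20 <= byte <= 0x7E:
        return True
    # The upper range counts as printable only when Latin-1 is enabled.
    return latin1 and 0xA0 <= byte <= 0xFF


def render(data: bytes, latin1: bool) -> str:
    """Replace every unprintable byte with ^?, as the pager is described to do."""
    # chr(b) is correct here because Latin-1 byte values equal Unicode code points.
    return "".join(chr(b) if is_printable(b, latin1) else "^?" for b in data)


text = "Z\u00fcrich".encode("iso-8859-1")
print(render(text, latin1=False))  # Z^?rich
print(render(text, latin1=True))   # Zürich
```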

When sending messages that contain special characters from ISO-8859-1, you should make sure to set two more variables in the elm.rc file:

charset = iso-8859-1
textencoding = 8bit

This makes elm report the character set as ISO-8859-1 in the mail header and send the message as 8-bit data (the default is to strip all characters to 7 bits).
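
The following Python sketch shows why the 8-bit setting matters. It assumes that "stripping to 7 bits" means clearing the high bit of each byte, which is the conventional meaning; the helper strip_to_7bit is hypothetical and only imitates that effect.

```python
def strip_to_7bit(data: bytes) -> bytes:
    """Clear the high bit of every byte (assumed meaning of 7-bit stripping)."""
    return bytes(b & 0x7F for b in data)


original = "Gr\u00fc\u00dfe".encode("iso-8859-1")  # "Grüße" as Latin-1 bytes
stripped = strip_to_7bit(original)

# 0xFC (ü) becomes 0x7C (|) and 0xDF (ß) becomes 0x5F (_).
print(stripped.decode("ascii"))  # Gr|_e
```

With the high bit cleared, the accented characters turn into unrelated ASCII symbols and cannot be recovered by the recipient.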

Of course, all the character set options discussed here may also be set in the private elmrc file instead of the global one, so individual users can have their own default settings if the global ones don't suit them.
