Localization is the act of making programs behave in a region-specific way. When a program formats numbers or dates in a way specific to your part of the world or prints messages (or accepts input) in your native language, the program is said to be localized. This section describes the steps Subversion has taken toward localization.
Most modern operating systems have a notion of the “current locale”—that is, the region or country whose localization conventions are honored. These conventions—typically chosen by some runtime configuration mechanism on the computer—affect the way in which programs present data to the user, as well as the way in which they accept user input.
On Unix-like systems, you can check the values of the locale-related runtime configuration options by running the locale command:
$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL="C"
$
The output is a list of locale-related environment variables and their current values. In this example, the variables are all set to the default C locale, but users can set these variables to specific country/language code combinations. For example, if one were to set the LC_TIME variable to fr_CA, programs would know to present time and date information formatted according to a French-speaking Canadian's expectations. And if one were to set the LC_MESSAGES variable to zh_TW, programs would know to present human-readable messages in Traditional Chinese. Setting the LC_ALL variable has the effect of changing every locale variable to the same value. The value of LANG is used as a default value for any locale variable that is unset. To see the list of available locales on a Unix system, run the command locale -a.
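The way these variables interact can be sketched as a small shell function. This is only an illustration, not something Subversion ships: the name effective_locale is hypothetical, and the sketch assumes the usual POSIX precedence, in which a nonempty LC_ALL wins, then the category's own variable, then LANG, and finally the default C locale.

```shell
# Hypothetical helper (not part of Subversion): print the locale name that
# applies to one category, assuming the POSIX precedence
#   LC_ALL  >  LC_<category>  >  LANG  >  "C"
# Empty values are treated the same as unset ones.
effective_locale() {
    _category=$1                          # e.g. LC_TIME or LC_MESSAGES
    # Indirect lookup of the category variable; pass only trusted names.
    eval "_specific=\${$_category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"           # LC_ALL overrides everything
    elif [ -n "$_specific" ]; then
        printf '%s\n' "$_specific"        # the category's own setting
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"             # LANG is the fallback default
    else
        printf '%s\n' "C"                 # nothing set: default C locale
    fi
}
```

Running effective_locale LC_TIME in a shell where only LANG is set prints the value of LANG; setting LC_TIME changes the answer for that category alone, and setting LC_ALL changes it for every category.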
On Windows, locale configuration is done via the “Regional and Language Options” control panel item. There you can view and select the values of individual settings from the available locales, and even customize (at a sickening level of detail) several of the display formatting conventions.
The Subversion client, svn, honors the current locale configuration in two ways. First, it notices the value of the LC_MESSAGES variable and attempts to print all messages in the specified language. For example:
$ export LC_MESSAGES=de_DE
$ svn help cat
cat: Gibt den Inhalt der angegebenen Dateien oder URLs aus.
Aufruf: cat ZIEL[@REV]...
…
This behavior works identically on both Unix and Windows systems. Note, though, that while your operating system might have support for a certain locale, the Subversion client still may not be able to speak the particular language. To produce localized messages, human volunteers must provide translations for each language. The translations are written using the GNU gettext package, which results in translation modules that end with the .mo filename extension. For example, the German translation file is named de.mo. These translation files are installed somewhere on your system. On Unix, they typically live in /usr/share/locale/, while on Windows they're often found in the share\locale\ folder in Subversion's installation area. Once installed, a module is named after the program for which it provides translations. For example, the de.mo file may ultimately end up installed as /usr/share/locale/de/LC_MESSAGES/subversion.mo. By browsing the installed .mo files, you can see which languages the Subversion client is able to speak.
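Browsing the installed translation files can be automated with a few lines of shell. The following is a minimal sketch under stated assumptions: the function name list_svn_languages is hypothetical, and it assumes the Unix layout described above, in which each language has its own directory containing LC_MESSAGES/subversion.mo. The locale root defaults to /usr/share/locale but may differ on your system.

```shell
# Hypothetical helper (not part of Subversion): print one language code per
# line for every Subversion message catalog found under a locale root.
# Assumed layout: <root>/<language>/LC_MESSAGES/subversion.mo
list_svn_languages() {
    _root=${1:-/usr/share/locale}         # default Unix location; may vary
    for _catalog in "$_root"/*/LC_MESSAGES/subversion.mo; do
        # If the glob matched nothing it stays literal; skip that case.
        [ -f "$_catalog" ] || continue
        # Strip the fixed suffix, then keep only the last path component.
        _lang_dir=${_catalog%/LC_MESSAGES/subversion.mo}
        printf '%s\n' "${_lang_dir##*/}"
    done
}
```

Calling list_svn_languages with no argument inspects /usr/share/locale; pass another directory as the first argument if your installation keeps its catalogs elsewhere.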
The second way in which the locale is honored involves how svn interprets your input. The repository stores all paths, filenames, and log messages in Unicode, encoded as UTF-8. In that sense, the repository is internationalized—that is, the repository is ready to accept input in any human language. This means, however, that the Subversion client is responsible for sending only UTF-8 filenames and log messages into the repository. To do this, it must convert the data from the native locale into UTF-8.
For example, suppose you create a file named caffè.txt, and then when committing the file, you write the log message as “Adesso il caffè è più forte.” Both the filename and the log message contain non-ASCII characters, but because your locale is set to it_IT, the Subversion client knows to interpret them as Italian. It uses an Italian character set to convert the data to UTF-8 before sending it off to the repository.
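The kind of conversion involved can be illustrated outside of Subversion with the standard iconv utility. This sketch does not show how the client performs the conversion internally; the function name to_utf8 is hypothetical, and ISO-8859-1 is assumed here purely as an example of a native character set that an it_IT locale might use.

```shell
# Hypothetical helper (not part of Subversion): re-encode text read from
# standard input from a given native character set into UTF-8.
to_utf8() {
    _native_charset=$1                    # e.g. ISO-8859-1 (assumed here)
    iconv -f "$_native_charset" -t UTF-8
}

# "caffè" encoded in ISO-8859-1: the è is the single byte 0xE8 (octal 350).
# After conversion, od shows that byte replaced by the two-byte UTF-8
# sequence c3 a8, while the ASCII letters are unchanged.
printf 'caff\350' | to_utf8 ISO-8859-1 | od -An -tx1
```

The ASCII portion of the name is identical in both encodings, which is why problems with a mismatched locale tend to show up only in names and messages that contain accented or other non-ASCII characters.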
Note that while the repository demands UTF-8 filenames and log messages, it does not pay attention to file contents. Subversion treats file contents as opaque strings of bytes, and neither client nor server makes an attempt to understand the character set or encoding of the contents.