Character Sets

This chapter describes the pitfalls of language-dependent character sets.

Maitreya has supported Unicode character sets since release 4.0. Older versions used ANSI character sets. This change of code page can cause several problems.

Charset Problems in Location Database, etc.

You may see broken characters in the location database if you copy your old file "locations.dat" to the directory of the Unicode release.

Older versions of the location database used ANSI format; the new database uses Unicode format. Solution: open the database file "locations.dat" in a Unicode-capable editor (such as Word on Windows or gedit on Linux) and save it in Unicode UTF-8 format. The problem should then disappear.

The same applies to astrological data files ("*.mtx") etc.
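
For users who prefer a script over an editor, the re-encoding step can be sketched in Python. This is only a minimal sketch: the function name convert_to_utf8 is hypothetical and not part of Maitreya, and it assumes the old file was written in the Windows-1252 code page, which is common on Western European Windows installations but may not match your system. Write to a new file and keep the original as a backup.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="cp1252"):
    """Re-encode a text file from a legacy code page to UTF-8.

    source_encoding is an assumption: "cp1252" is only a guess at what
    "ANSI" meant on the machine that wrote the old file.
    """
    # Read raw bytes so that line endings are preserved exactly.
    raw = Path(src).read_bytes()
    # Decode with the legacy code page; raises UnicodeDecodeError if the
    # bytes are not valid in that code page.
    text = raw.decode(source_encoding)
    # Write the same text back out as UTF-8.
    Path(dst).write_bytes(text.encode("utf-8"))
```

For example, convert_to_utf8("locations.dat", "locations_utf8.dat") would write a UTF-8 copy next to the original. If the old system used another code page, pass it explicitly, e.g. source_encoding="cp1251" for Cyrillic.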

Unicode Charsets on Windows Platforms

Unicode character sets are required in particular for non-European translations (e.g. Telugu) and for displaying Sanskrit characters in the Sarvatobhadra view.

Problems with the display of Unicode characters can be caused by the Windows configuration.

On Windows XP, additional support for Unicode languages can be installed as follows: Start → Settings → Control Panel → Regional and Language Options.

In the Languages tab, check the Supplemental language support option(s) you want. Checking both options installs all optional fonts. These options add fonts as well as system support for the corresponding languages.

Installation on Windows 2000 is similar.

Unicode on Windows 9x

On Windows 95 and 98, the compilation must be done with libunicows (a Unicode library for Windows).

Setting up the Correct Language on Linux/BSD Systems

Language configuration on Linux/BSD is done with the environment variable LANG. This variable holds the ISO code of the desired language and country plus extra information about the character set.

Example

  • en - means English language.
  • en_US - means US American English.
  • en_US.UTF-8 - means US American English with the UTF-8 Unicode character set.
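
The structure of these values can be illustrated with a short Python sketch that splits a LANG value into its parts. The function name parse_lang is hypothetical and not part of Maitreya; the sketch assumes the simple form language[_COUNTRY][.charset] and simply discards an optional modifier such as @euro.

```python
def parse_lang(value):
    """Split a LANG value such as 'en_US.UTF-8' into its parts.

    Returns a tuple (language, country, charset); country and charset
    are None when the value does not contain them.
    """
    # Drop an optional modifier, e.g. the '@euro' in 'de_DE@euro'.
    value = value.partition("@")[0]
    # Everything after the first '.' names the character set.
    locale_part, _, charset = value.partition(".")
    # The part before '_' is the language, the part after it the country.
    language, _, country = locale_part.partition("_")
    return language, country or None, charset or None


if __name__ == "__main__":
    for example in ("en", "en_US", "en_US.UTF-8"):
        print(example, "->", parse_lang(example))
```

Applied to the current setting, parse_lang(os.environ.get("LANG", "")) (after import os) shows at a glance whether a UTF-8 character set is configured.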

Most systems have the correct configuration by default because installation programs generally set up the correct language.

If not, try to set the language manually: export LANG=te for the Bourne Again Shell (bash) or setenv LANG te for csh. Alternatively, try export LANG=te_IN (may work on some systems like Fedora 5).

Russian should be configured with export LANG=ru_RU, German with export LANG=de_DE, etc.