Unix questions - specific UTF-8 locales


saroj.yadav@gmail.com

2005-03-22, 6:08 pm

As I understand it (correct me if I am wrong), Unicode came into the
picture so that a document containing characters from multiple languages
can be supported; for example, somebody could write a document in French
comparing Korean and Chinese.

Now, I am looking at the UNIX platforms, and it seems that all of them
(AIX, HP, Solaris) support Unicode through language/region-specific
UTF-8 locales such as fr_FR.UTF-8, ja_JP.UTF-8, ko_KR.UTF-8, etc.

So in order to use UTF-8 for Japanese, I have to set the locale to
ja_JP.UTF-8. To use UTF-8 for Korean, I have to set the locale to
ko_KR.UTF-8.

With this approach it does not seem possible to mix characters from
multiple languages. Doesn't this defeat the whole purpose of Unicode?
Am I missing something?

Thanks in advance for any insight you can provide.

Gianni Mariani

2005-03-23, 2:52 am

saroj.yadav@gmail.com wrote:
> As I understand it (correct me if I am wrong), Unicode came into the
> picture so that a document containing characters from multiple languages
> can be supported; for example, somebody could write a document in French
> comparing Korean and Chinese.
>
> Now, I am looking at the UNIX platforms, and it seems that all of them
> (AIX, HP, Solaris) support Unicode through language/region-specific
> UTF-8 locales such as fr_FR.UTF-8, ja_JP.UTF-8, ko_KR.UTF-8, etc.
>
> So in order to use UTF-8 for Japanese, I have to set the locale to
> ja_JP.UTF-8. To use UTF-8 for Korean, I have to set the locale to
> ko_KR.UTF-8.
>
> With this approach it does not seem possible to mix characters from
> multiple languages. Doesn't this defeat the whole purpose of Unicode?
> Am I missing something?
>
> Thanks in advance for any insight you can provide.
>


The "language" in the locale is used to find the message catalog as well
as the following attributes. In theory, you can have Japanese and
Korean characters in your string. It's just that if you format time,
collate, classify a character, format money, etc., you'll be getting
the locale-specific behaviour.
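
To illustrate the point that one string can hold characters from several languages, here is a minimal Python sketch. The sample sentence and the helper name utf8_lengths are made up for this example. It only shows that UTF-8 encoding works on Unicode code points and takes no locale or language argument; it says nothing about how a terminal or GUI will render the result.

```python
# Hypothetical sample text: a French sentence that quotes Korean and Chinese.
text = "En coréen on écrit 한국어, en chinois 中文."

# UTF-8 is defined over Unicode code points, not over languages, so the
# encoding and decoding steps take no locale argument at all.
encoded = text.encode("utf-8")
decoded = encoded.decode("utf-8")


def utf8_lengths(s):
    """Return (character, number of UTF-8 bytes) pairs for a string."""
    return [(ch, len(ch.encode("utf-8"))) for ch in s]


print(decoded == text)          # the mixed-language text round-trips
print(utf8_lengths("é한中"))     # byte counts differ per character, not per locale
```

The same bytes would be produced under fr_FR.UTF-8, ja_JP.UTF-8 or ko_KR.UTF-8, since the codeset in all three is UTF-8.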

As for rendering multiple languages in one display, that's tricky,
especially if you're displaying Chinese, Thai and Arabic all in the same
window and then trying to select a bit of Thai and Arabic with a mouse.

From the "setlocale" man page:

LC_COLLATE
    for regular expression matching (it determines the meaning of
    range expressions and equivalence classes) and string collation.

LC_CTYPE
    for regular expression matching, character classification,
    conversion, case-sensitive comparison, and wide character
    functions.

LC_MESSAGES
    for localizable natural-language messages.

LC_MONETARY
    for monetary formatting.

LC_NUMERIC
    for number formatting (such as the decimal point and the
    thousands separator).

LC_TIME
    for time and date formatting.
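
The categories listed above can be set independently of each other. The following Python sketch shows this using the standard locale module. The helper try_setlocale is a hypothetical name introduced here, and the locale names are only examples: whether ja_JP.UTF-8 or fr_FR.UTF-8 is installed depends on the system, so the helper returns None instead of failing when a locale is missing.

```python
import locale


def try_setlocale(category, name):
    """Set one locale category; return the name in effect, or None if unavailable."""
    try:
        return locale.setlocale(category, name)
    except locale.Error:
        return None


# The "C" locale is always available; the named locales below are examples
# and may not be installed on a given system.
try_setlocale(locale.LC_ALL, "C")
time_locale = try_setlocale(locale.LC_TIME, "ja_JP.UTF-8")
money_locale = try_setlocale(locale.LC_MONETARY, "fr_FR.UTF-8")

print("LC_TIME:", time_locale or "not installed")
print("LC_MONETARY:", money_locale or "not installed")
# Calling setlocale with no name only queries the current setting.
print("LC_NUMERIC:", locale.setlocale(locale.LC_NUMERIC))
```

Here date formatting would follow Japanese conventions and money formatting French ones (where those locales exist), while number formatting stays in the "C" locale. None of these settings restricts which characters a string may contain.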

saroj.yadav@gmail.com

2005-03-23, 6:09 pm

I thought the main purpose of Unicode was to provide multilingual
support.
Do you know what a UTF-8 locale for a particular language provides?
That is, for "ja_JP.UTF-8" (in addition to Japanese conventions for
numbers, money, etc.), does it contain character codes "only" for
Japanese characters, or does it contain the complete Unicode character
set?

Gianni Mariani wrote:
> The "language" in the locale is used to find the message catalog as well
> as the following attributes. In theory, you can have Japanese and
> Korean characters in your string. It's just that if you format time,
> collate, classify a character, format money, etc., you'll be getting
> the locale-specific behaviour.
>
> As for rendering multiple languages in one display, that's tricky,
> especially if you're displaying Chinese, Thai and Arabic all in the same
> window and then trying to select a bit of Thai and Arabic with a mouse.
>
> From the "setlocale" man page:
>
> LC_COLLATE
>     for regular expression matching (it determines the meaning of
>     range expressions and equivalence classes) and string collation.
>
> LC_CTYPE
>     for regular expression matching, character classification,
>     conversion, case-sensitive comparison, and wide character
>     functions.
>
> LC_MESSAGES
>     for localizable natural-language messages.
>
> LC_MONETARY
>     for monetary formatting.
>
> LC_NUMERIC
>     for number formatting (such as the decimal point and the
>     thousands separator).
>
> LC_TIME
>     for time and date formatting.


Gianni Mariani

2005-03-24, 2:54 am

saroj.yadav@gmail.com wrote:
> I thought the main purpose of Unicode was to provide multilingual
> support.
> Do you know what a UTF-8 locale for a particular language provides?
> That is, for "ja_JP.UTF-8" (in addition to Japanese conventions for
> numbers, money, etc.), does it contain character codes "only" for
> Japanese characters, or does it contain the complete Unicode character
> set?


I think my earlier response answers both of your questions.
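
As a small illustration of how the locale names in this thread are built, the Python sketch below splits a name of the conventional form language_TERRITORY.codeset into its parts. The helper split_locale is a hypothetical name written for this example, and it does not handle optional "@modifier" suffixes. It only demonstrates that the three locales named in the thread differ in language and territory while sharing the same codeset, UTF-8, which can encode every Unicode character.

```python
def split_locale(name):
    """Split 'language_TERRITORY.codeset' into (language, territory, codeset)."""
    base, _, codeset = name.partition(".")
    language, _, territory = base.partition("_")
    return language, territory, codeset


for name in ("fr_FR.UTF-8", "ja_JP.UTF-8", "ko_KR.UTF-8"):
    language, territory, codeset = split_locale(name)
    print(f"{name}: language={language} territory={territory} codeset={codeset}")
```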

Copyright 2003 - 2008 webservertalk.com