Question on Unicode

    Question on Unicode  
SAM




 
09-23-05 07:49 AM

Hi all,

I would like to know which data type to use in a C/C++ program for
16-bit (i.e. Unicode) characters.

Here is my code, which doesn't work.

unsigned short testdata[50]="\x400\x401\x402";
printf("Testdata %x\n",testdata[0]);

This code doesn't compile with the cc compiler on Unix. My flavor of Unix
is SCO UnixWare 7.1.1.

Note that the data type used is unsigned short, whose size is 2 bytes (16 bits).

How do I assign data to a 16-bit array? Is there a data type for this?

How should I go about it?

Thanks.









    Re: Question on Unicode  
Fletcher Glenn




 
09-23-05 07:49 AM


"SAM" <mshyamrao@gmail.com> wrote in message
news:1127451303.140695.235700@g47g2000cwa.googlegroups.com...
> Hi all,
>
>    I wanted to know what datatype to be used in a C/C++ program for a
> 16 bit i.e Unicode.
>
> Here is my code that does'nt work.
>
> unsigned short testdata[50]="\x400\x401\x402";
> printf("Testdata %x\n",testdata[0]);
>
> This code does'nt compile with cc compiler on unix. My flavor of UNIX
> is SCO Unixware 7.1.1
>
> Note data-type used is unsigned short and size of it 2 bytes(16 bits).
>
> How to assign data to a 16-bit array. Is there a datatype?
>
> How to go about it?
>
> Thanks.
>

Try looking up w_char.

--

Fletcher Glenn










    Re: Question on Unicode  
Maxim Yegorushkin




 
09-23-05 12:50 PM


SAM wrote:

>     I wanted to know what datatype to be used in a C/C++ program for a
> 16 bit i.e Unicode.

It depends on which library you use for Unicode manipulation. If it is
glibc, then the type is wchar_t; see <wchar.h>.
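
As a rough sketch of how the original array could be initialized (not tested on UnixWare): a narrow string literal cannot initialize an array of unsigned short, but a wide literal can initialize a wchar_t array, and a brace list works for plain integers. The names wide_data, short_data and wide_length are made up for illustration, and the sketch assumes wchar_t is at least 16 bits wide.

```c
#include <stddef.h>
#include <wchar.h>

/* Wide string literal: each \x escape produces one wchar_t element
   (assumes wchar_t can hold values up to 0x402). */
const wchar_t wide_data[] = L"\x400\x401\x402";

/* Plain integer array: use a brace initializer, not a string literal.
   Elements that are not listed are zero-initialized. */
const unsigned short short_data[50] = { 0x400, 0x401, 0x402 };

/* Number of wide characters before the terminating L'\0'. */
size_t wide_length(void)
{
    return wcslen(wide_data);
}
```

With either array, something like printf("Testdata %x\n", (unsigned)short_data[0]) should then compile and print the first element in hex.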









    Re: Question on Unicode  
Roger Leigh




 
09-23-05 12:50 PM

On 2005-09-23, Fletcher Glenn <fandxxxmgiiBLOCKED@pacbell.net> wrote:
>
> "SAM" <mshyamrao@gmail.com> wrote in message
> news:1127451303.140695.235700@g47g2000cwa.googlegroups.com... 
>
> Try looking up w_char.

ITYM wchar_t.

It might also be a good idea to use an OS other than SCO that actually
has UTF-8 locales and allows UTF-8 source code; that way you can
simply write

const wchar_t testdata[] = L"Grüße";

That said, wchar_t isn't necessarily 16 bits.  On GNU/Linux, it's 32.
Is there any particular reason you need 16 bits?  UCS is a 32-bit code.
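
A quick way to see what a given platform provides is to compute the width at compile time. This is only a sketch; wchar_bits and fits_in_wchar are illustrative names, and WCHAR_MAX is assumed to be available from <wchar.h> (it was added together with that header).

```c
#include <limits.h>
#include <wchar.h>

/* Width of wchar_t in bits on the platform this is compiled for. */
int wchar_bits(void)
{
    return (int)(sizeof(wchar_t) * CHAR_BIT);
}

/* Returns 1 if the value cp is within the positive range of wchar_t,
   0 otherwise. */
int fits_in_wchar(unsigned long cp)
{
    return cp <= (unsigned long)WCHAR_MAX;
}
```

If wchar_bits() reports 16, code points above 0xFFFF will not fit in a single wchar_t, so code that needs them has to use some other type or encoding.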









    Re: Question on Unicode  
Bjorn Reese




 
09-23-05 10:55 PM

Maxim Yegorushkin wrote:

> It depends on which library you use for unicode manipulation. If it is
> glibc then the type is wchar_t, see <wchar.h>

wchar_t has been a standard type since 1989 (C89 and XPG3), and
<wchar.h> a few years later (C94 and XPG4). So, it predates glibc
by a wide margin.

--
mail1dotstofanetdotdk








    Re: Question on Unicode  
Mikko Rauhala




 
09-23-05 10:55 PM

On Fri, 23 Sep 2005 10:42:20 +0100, Roger Leigh
<${roger}@whinlatter.uklinux.net.invalid> wrote:
> That said, wchar_t isn't necessarily 16 bits.  On GNU/Linux, it's 32.
> Is there any particular reason you need 16 bits?  UCS is a 32-bit code.

wchar_t doesn't even necessarily use Unicode code points internally,
though it probably does on most relevant systems. One is advised to
check for the presence of the __STDC_ISO_10646__ macro before making
assumptions about what's inside wchar_t. (If present, wchar_t contains
ISO-10646-1/Unicode code points.)
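
A minimal sketch of such a check follows. The function name wchar_is_iso10646 is made up here, and <wchar.h> is included only on the assumption that some implementations define the macro through their headers rather than in the compiler itself.

```c
#include <wchar.h>

/* Returns 1 if the implementation defines __STDC_ISO_10646__, i.e. it
   promises that wchar_t values are ISO 10646 code points; 0 otherwise. */
int wchar_is_iso10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}
```

A result of 0 does not prove that wchar_t uses some other encoding, only that the implementation makes no promise.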

--
Mikko Rauhala   - mjr@iki.fi     - <URL:http://www.iki.fi/mjr/>
Transhumanist   - WTA member     - <URL:http://www.transhumanism.org/>
Singularitarian - SIAI supporter - <URL:http://www.singinst.org/>









    Re: Question on Unicode  
Maxim Yegorushkin




 
09-23-05 10:55 PM


Bjorn Reese wrote:
> Maxim Yegorushkin wrote:
> 
>
> wchar_t has been a standard type since 1989 (C89 and XPG3), and
> <wchar.h> a few years later (C94 and XPG4). So, it predates glibc
> by a wide margin.

So what?

If you reread my posting carefully you might notice that I don't speak
about what predates what, nor do I care.

My point is that if you use some library for handling Unicode, you may
want to stick to whichever type the library accepts for Unicode code
points. glibc accepts wchar_t; ICU accepts UChar32.









    Re: Question on Unicode  
Bjorn Reese




 
09-26-05 11:02 PM

Maxim Yegorushkin wrote:

> So what?
>
> If you reread my posting carefully you might notice that I don't speak
> about what predates what, neither do I care.

Your posting was phrased in a glibc context, which could make it easy to
misread your reply as saying that wchar_t is a glibc-only feature.

I intended to elaborate on your reply by making the original poster, who
is using UnixWare, aware that wchar_t was introduced to the standards
quite some time ago, which means that wchar_t is widespread and
therefore a good option.

I am a bit surprised at your dismissive reaction.

> I'm talking that if you use some library for handling unicode you may
> like to stick to whichever type the library accepts as unicode code
> points. glibc accepts wchar_t, ICU - UChar32.

Fair point.

--
mail1dotstofanetdotdk








    Re: Question on Unicode  
Maxim Yegorushkin




 
09-27-05 07:52 AM


Bjorn Reese wrote:

[]

> I am a bit surprised at your dismissive reaction.

Sorry, didn't mean to offend you.













 





All times are GMT.