Web Server forum
    Unicode and hkscs  
Fai


06-29-04 08:15 AM

Does SQL 2000 support Unicode UTF-16, Unicode big endian, Unicode 3.0 and
HKSCS format in a single column?






    RE: Unicode and hkscs  
Bart Duncan [MSFT]


06-30-04 11:00 PM

SQL only supports little endian Unicode (x86 processor architecture is
little endian).  Technically SQL understands little endian UCS-2, but
UCS-2 is equivalent to UTF-16 with the exception of surrogate pairs.  You
can store and retrieve little endian UTF-16 data including surrogates in
a Unicode (nvarchar) column of SQL 2000 with a few caveats.  SQL 2k is
what we term "surrogate safe", meaning:
- Surrogate characters can be entered and retrieved without data loss.
- Surrogate characters are considered two separate Unicode characters,
i.e. an nvarchar(1) cannot fit a surrogate character.
- String operations are not "surrogate aware". E.g.
  - SUBSTRING on an nvarchar(2) value with start 1 and length 1 will
  result in half a surrogate character if the input is a 4-byte
  surrogate character.
  - In sorting & searching, all surrogate characters compare equal to all
  other surrogate characters.
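
To make the "two separate characters" point concrete, here is a rough sketch in Python (not T-SQL, and not anything SQL Server runs internally). It only shows how one supplementary character turns into two little endian UTF-16 code units, and what is left if you keep just the first one. The helper names utf16le_units and is_complete_utf16 are made up for this illustration.

```python
def utf16le_units(text):
    """Return the UTF-16 code units of text, read from little-endian bytes."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def is_complete_utf16(units):
    """True if the code units decode as well-formed UTF-16 (no lone surrogates)."""
    data = b"".join(u.to_bytes(2, "little") for u in units)
    try:
        data.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False


# U+20000 is a CJK Extension B ideograph, i.e. outside the Basic Multilingual Plane.
supplementary = "\U00020000"
units = utf16le_units(supplementary)

print([hex(u) for u in units])       # ['0xd840', '0xdc00'] (high surrogate, low surrogate)
print(len(units))                    # 2, so one such character needs room for two code units
print(is_complete_utf16(units[:1]))  # False: the first unit alone is half a character
```

Slicing the list to its first element is only an analogy for taking one character with SUBSTRING; the real behavior should be checked against your own server.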

SQL 2000 was written before Unicode 3.0 existed (it was written more in
the Unicode 2.0 timeframe).  This means that the meaning of characters
that were defined in the standard fairly recently will not be recognized
by SQL.  They can still be stored and retrieved, but SQL considers them
to be "undefined" UTF-16 Unicode code points.  Undefined Unicode
characters are handled like surrogates, by which I mean that they are
considered to be equal to all other undefined code points.
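
As a toy model only of what "compare equal" means here, the Python sketch below builds a comparison key in which every surrogate code unit and every unassigned code point collapses to the same weight. This is an assumption-laden illustration, not SQL Server's collation algorithm: toy_sort_key and UNDEFINED_WEIGHT are hypothetical names, and Python's unicodedata tables reflect a much newer Unicode version than SQL 2000 knew about.

```python
import unicodedata

# Hypothetical single weight shared by everything the toy model cannot interpret.
UNDEFINED_WEIGHT = 0


def toy_sort_key(text):
    """Toy comparison key: surrogate or unassigned code units all get one weight."""
    data = text.encode("utf-16-le")
    key = []
    for i in range(0, len(data), 2):
        unit = int.from_bytes(data[i:i + 2], "little")
        category = unicodedata.category(chr(unit))
        if category in ("Cs", "Cn"):   # Cs = surrogate, Cn = unassigned
            key.append(UNDEFINED_WEIGHT)
        else:
            key.append(unit + 1)       # +1 keeps real characters above the shared weight
    return key


# Two different supplementary characters produce the same key in this model,
# while ordinary characters still compare as different.
print(toy_sort_key("\U00020000") == toy_sort_key("\U0002A6D6"))  # True
print(toy_sort_key("A") == toy_sort_key("B"))                    # False
```

The practical consequence in this model is that an equality search for one supplementary character would also match any other one, which is the behavior described above.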

I apologize that I am not very familiar with HKSCS, but what I can find
on this standard seems to indicate that it is a standard set of
characters that can be mapped into the user-defined regions of Big-5 or
Unicode/ISO10646.  If by "hkscs" you mean characters encoded in the
user-defined range of code points in the Unicode standard, then these
could be stored in a Unicode column and would be handled just like
surrogates or undefined Unicode characters as described above.  However,
if by "hkscs" you are referring to Hong Kong data encoded with the Big-5
character set, that cannot be stored in the same column as Unicode data
because the encoding schemes are completely different.
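
If the source data arrives as Big-5/HKSCS bytes or as big endian UTF-16, one option is to transcode it on the client to little endian UTF-16 before it goes into the Unicode column. A minimal sketch in Python follows, assuming the client knows the source encoding of each value. The to_utf16le helper is a made-up name; "big5hkscs" is the codec name in Python's standard library, and its mapping tables may not match every HKSCS edition.

```python
def to_utf16le(raw, source_encoding):
    """Decode bytes from a known source encoding and re-encode as UTF-16 little endian."""
    return raw.decode(source_encoding).encode("utf-16-le")


text = "中"                                # U+4E2D, present in Big-5 and in Unicode

big5_bytes = text.encode("big5hkscs")      # Hong Kong Big-5 byte sequence
be_bytes = text.encode("utf-16-be")        # big endian UTF-16
le_bytes = text.encode("utf-16-le")        # little endian UTF-16

print(be_bytes.hex(), le_bytes.hex())      # 4e2d 2d4e (same code unit, bytes swapped)
print(big5_bytes != le_bytes)              # True: the encodings are not interchangeable
print(to_utf16le(big5_bytes, "big5hkscs") == le_bytes)  # True after transcoding
print(to_utf16le(be_bytes, "utf-16-be") == le_bytes)    # True after transcoding
```

Once everything is converted this way, values from all the sources can share one nvarchar column, subject to the surrogate caveats described earlier.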

HTH,
Bart
------------
Bart Duncan
Microsoft SQL Server Support

Please reply to the newsgroup only - thanks.
This posting is provided "AS IS" with no warranties, and confers no
rights.


--------------------
From: examnotes <Fai@discussions.microsoft.com>
Subject: Unicode and hkscs
Date: Tue, 29 Jun 2004 02:08:01 -0700
Newsgroups: microsoft.public.sqlserver.server

Does sql2000 support unicode utf-16 unicode big endian, unicode 3.0 and
hkscs format in a single column







All times are GMT.