IIS Index Server - Indexing/Searching Chinese


Author Indexing/Searching Chinese
Kirk Potter

2005-06-03, 6:01 pm

Hi,

I am having some trouble with indexing & searching HTML files which contain
a UTF-8 representation of Chinese.

I have done a load of reading of previous articles on this and most suggest
a variety of things to get this working, namely:

1. Adding <meta http-equiv="Content-Type" content="text/html;
charset=UTF-8"> to the head of the pages.
2. Adding <meta name="ms.locale" content="cn-zh">
3. Specifying the locale identifier when connecting via MSIDXS
4. Specifying a code page (we are using ASP)

Unfortunately none of these things worked for me.

What I have done with some success is the following:

1. Installed the Chinese language packs to the Windows 2000 Server
concerned.
This has stopped our initial error of "The query contained only ignored
words".

2. Made sure that files to be indexed are saved to disk in UTF-8 format.

With these two items I can get very simple Chinese indexing and searching to
work (e.g. a search for the Chinese word for "tree" succeeds).
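
As a quick way to confirm the second step, the following is a minimal sketch (not part of the original post) that checks whether a file's bytes are valid UTF-8 and whether they begin with a byte order mark. The function name `inspect_utf8` and the sample text are illustrative assumptions; in practice the bytes would come from a file on disk opened in binary mode.

```python
import codecs


def inspect_utf8(data: bytes) -> dict:
    """Report whether the bytes decode as UTF-8 and whether they start with a UTF-8 BOM."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        data.decode("utf-8")
        valid = True
    except UnicodeDecodeError:
        valid = False
    return {"valid_utf8": valid, "has_bom": has_bom}


# Hypothetical sample page; "\u4e2d\u6587" is arbitrary Chinese sample text.
sample = "<html><body>\u4e2d\u6587</body></html>".encode("utf-8")

print(inspect_utf8(sample))                           # plain UTF-8, no BOM
print(inspect_utf8(codecs.BOM_UTF8 + sample))         # UTF-8 with a BOM
print(inspect_utf8("\u4e2d\u6587".encode("gb2312")))  # GB2312 bytes are not valid UTF-8
```

To check a real file, pass `open(path, "rb").read()` to the function, where `path` is whichever page is being indexed.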

The problem I have is that if my page to be indexed and searched contains
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> then I
get no results returned. I need this to be there as web browsers will
require this to display the pages correctly (I know in IIS I can add this
header but I don't want to if I don't need to).
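
One thing that can be ruled out mechanically is a mismatch between the charset declared in the meta tag and the bytes actually on disk. The sketch below is an illustration only, not a diagnosis of Indexing Service's behaviour: the helper names `declared_charset` and `decodes_as_declared` are hypothetical, and the regular expression is a simplification that assumes the charset appears inside a `<meta ...>` tag as in the example above.

```python
import re

# Simplified pattern: finds charset=... inside a <meta ...> tag (assumption: no '>' before it).
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def declared_charset(data: bytes):
    """Return the charset named in the first matching meta tag, lowercased, or None."""
    match = META_CHARSET.search(data)
    return match.group(1).decode("ascii").lower() if match else None


def decodes_as_declared(data: bytes) -> bool:
    """True if the page declares a charset and its bytes decode cleanly under it."""
    name = declared_charset(data)
    if name is None:
        return False
    try:
        data.decode(name)
        return True
    except (UnicodeDecodeError, LookupError):
        return False


# Hypothetical page using the meta tag from the post; the body text is arbitrary.
html = (
    '<html><head><meta http-equiv="Content-Type" '
    'content="text/html; charset=UTF-8"></head>'
    "<body>\u4e2d\u6587</body></html>"
)

print(declared_charset(html.encode("utf-8")))
print(decodes_as_declared(html.encode("utf-8")))   # declared and actual encoding agree
print(decodes_as_declared(html.encode("gb2312")))  # declared UTF-8, saved as GB2312
```

If the declared and actual encodings agree and searches still return nothing, the cause lies elsewhere and this check does not explain it.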

This seems counter-intuitive to me and due to the nature of how we are
creating these pages I don't really want to have to remove the meta tag. Can
anyone explain why Indexing Service will not return any results if this tag
is present and if there is any way around this?

Many thanks in advance,

Kirk


Hilary Cotter

2005-06-04, 7:47 am

This should be working. Please post sample docs here or send them to me
offline.

--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com




Copyright 2003 - 2008 webservertalk.com