IIS Index Server - Chinese Wordbreaker


Martin

2006-04-27, 7:24 pm

I am looking for a commercially available Chinese word breaker for Index
Server. I want one that performs actual Chinese word segmentation instead
of treating each Chinese character as a separate word, as the current
Index Server Chinese word breaker does.
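
To make the distinction above concrete, here is a minimal Python sketch contrasting per-character tokenization with dictionary-based segmentation. It uses greedy forward maximum matching, a common textbook technique, and is not the algorithm used by Index Server or by any commercial word breaker. The lexicon, function names, and sample string are all made up for illustration.

```python
# Toy lexicon (hypothetical data); a real word breaker would rely on a
# large dictionary and usually on statistical or linguistic rules as well.
TOY_LEXICON = {"中文", "分词", "中文分词", "搜索", "引擎", "搜索引擎"}


def per_character_tokens(text):
    """Treat every non-whitespace character as its own token."""
    return [ch for ch in text if not ch.isspace()]


def forward_max_match(text, lexicon, max_len=4):
    """Greedy longest-match segmentation against a lexicon.

    At each position, try the longest candidate first and fall back to a
    single character when no dictionary word matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


sample = "中文分词搜索引擎"
print(per_character_tokens(sample))            # one token per character
print(forward_max_match(sample, TOY_LEXICON))  # dictionary words
```

With the per-character approach, a query for a two-character word matches any document containing those two characters as adjacent tokens, whether or not they form that word in context. Segmentation indexes whole words, at the cost of depending on the quality of the lexicon.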

Does anyone know where I could find it?

Hilary Cotter

2006-05-01, 7:20 am

The Chinese word breaker appears to look at each character, detect radicals
and subcharacters, and then parse the token looking for compound characters.

There is a patent filed for the actual process that they use.
Unfortunately I can't find it right now, but I did find it through Google
some time ago.

Once upon a time Oracle, Sybase, Microsoft, and IBM all used the same
company's word breaker - infosoft. I am not sure who uses what now.

--
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.

This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.

Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com





