IIS Index Server - Searching for decimal numbers


Author Searching for decimal numbers
AC

2006-11-19, 1:28 am

Hello,

I have removed the numbers from my noise.enu file (I am using the American
English locale). I can get results for whole numbers and alphanumeric words,
but a decimal number does not show up when I search for it. Some of the
decimal numbers are currency amounts, so I tried with the dollar sign as
well. I would appreciate suggestions from anyone who has solved this problem.
I have found very little when searching for it.

I am using the Indexing Service Query Form in both Standard and Advanced
query modes. I enter only the numbers, with no operators or other special
syntax.

Will *not* find:
$30.20
847.8
(732.5)

Will find:
76750035
(H72.5)
5534.22C

--
AC


Gang_Warily

2006-11-20, 1:17 pm

Hi

I presume you've emptied and rebuilt the catalog?

Is it possible that the word breaker is splitting the numbers at the decimal
point? Splitting dates at '/' or '-' could be helpful, for example to find
2006 in 20-11-2006.
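
To illustrate that guess, here is a minimal Python sketch of a deliberately
naive word breaker that splits on every non-alphanumeric character. The
function name toy_word_break is made up for this example, and this is not how
the Indexing Service word breaker is implemented; it only shows what
splitting at punctuation would do to a date and to a decimal number.

```python
import re


def toy_word_break(text):
    """Split text into runs of letters and digits, dropping all punctuation.

    Hypothetical stand-in for illustration only; it is NOT the Indexing
    Service word breaker.
    """
    return re.findall(r"[A-Za-z0-9]+", text)


print(toy_word_break("20-11-2006"))  # ['20', '11', '2006']
print(toy_word_break("847.8"))       # ['847', '8']
print(toy_word_break("$30.20"))      # ['30', '20']
```

If something like this were happening, '847.8' would be indexed as the
separate tokens '847' and '8' rather than as a single number. Whether that is
what actually happens here is still a guess.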

There is a utility called LRtest that might help:
http://support.microsoft.com/default.aspx/kb/890613

I don't know what it all means, but it seems that using an 'NN' prefix
might help find numbers.

For example, 'NN10' seems to find instances of '10',
so 'nn123 NEAR nn456' might find 123.456.

Of course, it may also find 456.123.


Also, 'NN1' to 'NN9' seem to be hard-coded as noise words, even though they
aren't in the noise-word file.

I'm doing a lot of guesswork here, so let us know what you find!
Any response from the experts, or from anyone who has time to experiment, would
also be most welcome.

AC

2006-11-20, 1:17 pm

I created a brand new catalog with a subset of data.

I will try NN tonight (I don't have the data with me today).

Thanks


AC

2006-11-28, 1:17 pm

NN didn't work.

I tried NN30 AND NN80, as well as NN30.80.

Thanks for the suggestions.
--AC


Gang_Warily

2006-12-13, 7:23 am

http://msdn2.microsoft.com/en-gb/library/ms693168.aspx

When you create a word breaker, it is recommended that the word breaker
normalize numbers to a canonical representation by using the pattern
"NNddDcc", where "NN" is the literal sequence "NN", dd is the integer portion
of the number, "D" is the literal "D", and cc is the fractional portion of
the number. Word breakers do not restrict the number of digits for either the
integer or the fraction portion of the number. It is recommended that word
breakers recognize numerical patterns that are delimited by both periods (.)
and commas (,). For example, Indexing Service represents both "1,000.2" and
"1.000,2" as "NN1000D2".

Hi

I'm not sure how this can be used, but I'm sure it's relevant somehow!
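
As a rough sketch of the canonical form described in the quoted MSDN text,
the following Python function rewrites a plain number string into the
'NNddDcc' pattern. The function name to_canonical and the rule that the last
'.' or ',' is the decimal mark are assumptions made for this example, not
something taken from the documentation, and the real word breaker may treat
details such as trailing zeros or an ambiguous input like '1,000'
differently.

```python
import re


def to_canonical(number_text):
    """Return the 'NNddDcc' form of a plain number string.

    Assumption (not from the MSDN text): the last '.' or ',' is the decimal
    mark, and any earlier '.' or ',' is a grouping mark that is dropped.
    """
    # Accept only digits separated by single periods or commas.
    if not re.fullmatch(r"\d+([.,]\d+)*", number_text):
        raise ValueError(f"not a plain number: {number_text!r}")

    parts = re.split(r"[.,]", number_text)
    if len(parts) == 1:
        # No separator at all: a whole number, so there is no 'D' part.
        return "NN" + parts[0]

    integer = "".join(parts[:-1])  # join the groups before the decimal mark
    fraction = parts[-1]           # digits after the decimal mark
    return f"NN{integer}D{fraction}"


for sample in ("1,000.2", "1.000,2", "30.20", "847.8", "10"):
    print(sample, "->", to_canonical(sample))
```

For the two examples in the quoted text, '1,000.2' and '1.000,2', this sketch
produces 'NN1000D2'. Whether typing a form such as 'NN30D20' into the query
form matches anything is untested.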

Are you using the SQL query language, or one of the two dialects of the
Indexing Service query language?
http://msdn2.microsoft.com/en-gb/library/ms690580.aspx

AC

2007-02-17, 1:22 am

Thanks! I think this might help. I need to test it and see.

Regards



Copyright 2003 - 2008 webservertalk.com