looking for efficient way to handle noise words in ASP

    looking for efficient way to handle noise words in ASP  
Kevin Blount
04-07-05 11:05 PM

I'm using the following (example) query in my search script, and as
you'll probably spot, there's a noise/stop word in there, "to", which
basically causes an error.

--
SELECT DocTitle, vpath, path, filename, size, write, characterization,
rank, Authored, Product FROM SCOPE(' DEEP TRAVERSAL OF "/us" ') WHERE
(CONTAINS ('"software to go"') OR CONTAINS ('"software" NEAR "to" NEAR
"go"') > 0) ORDER BY rank DESC
--

What I'd like to do is to work through the entered keywords (i.e. what
the user wants to search for) and ignore any stop words when creating
my query. I thought about reading the noise.enu file into a variable,
then checking each word in the search string against the variable, and
skipping any found in the variable.

The problem with that idea is the word "go". While it doesn't exist in
the noise.enu file as a whole word, "got" does, so when the whole
noise.enu file is read in as one variable, a substring check finds "go"
in the variable's value and wrongly treats it as a noise word.
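
The false match described above can be avoided by splitting the noise list into whole words and testing each search term for exact membership. Below is a minimal sketch of that idea, written in Python purely as a neutral illustration (in classic ASP, a Scripting.Dictionary keyed on each noise word could play the role of the set). The function names and the short sample list are assumptions, not the contents of the real noise.enu file.

```python
def load_noise_words(noise_text):
    """Split noise-file text into a set of whole, lower-cased words."""
    return {word.lower() for word in noise_text.split()}


def is_noise(word, noise_words):
    """Exact, case-insensitive match against the whole-word set."""
    return word.lower() in noise_words


# Hypothetical excerpt; the real noise.enu is much longer.
sample_noise = "a\nabout\ngot\nthe\nto\n"
noise_words = load_noise_words(sample_noise)

print("go" in sample_noise)         # True: the substring check wrongly flags "go"
print(is_noise("go", noise_words))  # False: the whole-word check does not
print(is_noise("to", noise_words))  # True
```

Because the lookup is by whole word, "go" is kept even though "got" is in the list.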

Does anyone have any better suggestions for handling noise words with
ASP (not .NET)? Ideally the end result would change the above query of:

CONTAINS ('"software" NEAR "to" NEAR "go"')

to be simply:

CONTAINS ('"software" NEAR "go"')

i.e. with the noise word removed.

    Re: looking for efficient way to handle noise words in ASP  
Kevin Blount
04-08-05 10:51 PM

I went for the slightly slower method of reading each line of the
noise.enu file and checking it against each word in the search string.
The performance of the script isn't hit as much as I expected, so I'm
happy with this solution.
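
The actual script was not posted, so the following is only a sketch of how such a line-by-line filter might look, again in Python as a neutral illustration rather than ASP. It is a variation on the approach described, not the same thing: it reads the noise lines once into a set instead of re-scanning them for every search word. The function names and the sample noise lines are assumptions.

```python
def strip_noise(search_text, noise_lines):
    """Drop search words that exactly match a line of the noise file."""
    noise_words = {line.strip().lower() for line in noise_lines if line.strip()}
    return [w for w in search_text.split() if w.lower() not in noise_words]


def near_clause(words):
    """Join the remaining words into a CONTAINS ... NEAR expression."""
    if not words:
        raise ValueError("no searchable words left after noise removal")
    # Crude safeguard: drop quote characters so a word cannot end the literal.
    quoted = ['"{}"'.format(w.replace('"', "").replace("'", "")) for w in words]
    return "CONTAINS ('{}')".format(" NEAR ".join(quoted))


# Hypothetical noise lines; in practice these would come from noise.enu.
noise_lines = ["about\n", "got\n", "the\n", "to\n"]
kept = strip_noise("software to go", noise_lines)
print(near_clause(kept))  # CONTAINS ('"software" NEAR "go"')
```

A search made up entirely of noise words leaves nothing to query, so the caller needs to handle that case (here it raises an error) rather than send an empty CONTAINS to Index Server.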

    Re: looking for efficient way to handle noise words in ASP  
Hilary Cotter
04-09-05 01:48 AM

Check this out:
http://www.indexserverfaq.com/searchpage1.zip

Kevin Blount wrote:
> I went for the slightly slower method of reading each line of the
> noise.enu file and checking it against each word in the search string.
> The performance of the script isn't hit as much as I expected, so I'm
> happy with this solution.
>