IIS Index Server - Searching html files


MattG

2005-03-15, 5:58 pm

I have my index server successfully searching various Office documents,
with the new v2 of DSOFile successfully working with my custom
properties.

However, .htm, .html and .pdf files don't seem to have any properties
cached by Indexing Service.
They are returned in my query when I search for $all htm, but Indexing
Service does not seem to have their properties available (except filename
and size).

Custom properties are cached and "unknown extensions" is checked. It works
perfectly on Office documents. What am I doing wrong?


Hilary Cotter

2005-03-15, 5:58 pm

You will have to cache these properties yourself. Do they show up in the
properties folder? If not, there is something malformed about your HTML doc.

--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com




MattG

2005-03-15, 5:58 pm

Thanks for the reply.
I modified the html files to have the same custom properties as the
various Office files.
These show up in the properties folder and are cached with their default
values.
(I will set up a test catalogue with only html files and see if the
properties are listed, somehow I doubt they will be.)

When I set RS to my IXSSO.Query, I get back for example:
RS("dl_checkoutby")
This displays fine (and is searchable) for Office documents, but not for
HTML files.

However, if I use DSOFile to look at a sample html file, using:
objCustomProperties("dl_checkoutby").Value
It will return the correct value that is supposed to be stored in there.

MattG

2005-03-16, 5:58 pm

I created a test catalogue with only one html file in it (that contained my
custom properties).

I cannot get the properties to show up in the catalogue's property folder.
Stop/restarts/rescans, nothing seems to get them.

I am able to query and successfully return the file by searching
filename.

This is really troubling; Indexing Service had no problem with html files
back when I started the application a few months ago.

Hilary Cotter

2005-03-16, 5:58 pm

Download filtdump from the SDK. Run your doc through it to see if it picks
up the properties. If it does, you should see these properties in the
properties folder for your catalog.

If not, it is possible, but unlikely, that you have catalog corruption. You
might also run into problems if your property names are not unique.





MattG

2005-03-16, 5:58 pm

I've tried running htm and doc files through filtdump.

The htm file returns a whole bunch of properties, most with blank values.
The only properties with a value are the ones that are in the htm
itself, like <title> and <th>. Custom properties do not show up.
For the doc file, filtdump says:
**Additional Properties available via IPropertyStorage.
And then just outputs the contents of the word doc, nothing else.

Using the demo VB6 exe included with DSOFile v2, I added some custom
properties to a new htm file, stuck it in its own directory, created a new
catalog, and added that dir.
Even after rescanning, the custom properties do not show up.
If I stick a new doc file in the dir, adding custom properties in the same
way, they correctly show up in the properties folder after a rescan.

I am doing this on my local machine this time, not the server, and I have
the same problem.
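
The filtdump output described above suggests that, for an htm file, only values
present in the markup itself are reported. As a rough cross-check, here is a
minimal sketch (in Python, not the VBScript used elsewhere in this thread) that
lists the <title> text and any <meta name=... content=...> pairs a file
declares, so they can be compared with the custom property names expected in
the catalog. The class name, function name and sample markup are hypothetical,
and whether Indexing Service caches a given meta tag as a property is an
assumption that this thread does not confirm.

```python
from html.parser import HTMLParser


class MarkupPropertyLister(HTMLParser):
    """Collects <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            # Attribute values can be None when written without a value.
            self.properties[attrs["name"].lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            current = self.properties.get("title", "")
            self.properties["title"] = current + data.strip()


def list_markup_properties(html_text):
    """Return a dict of property-like values found in the markup itself."""
    parser = MarkupPropertyLister()
    parser.feed(html_text)
    parser.close()
    return parser.properties


# Hypothetical sample: the meta tag reuses the thread's property name,
# with a made-up value.
SAMPLE = """<html><head>
<title>Test page</title>
<meta name="dl_checkoutby" content="someuser">
</head><body><table><tr><th>Heading</th></tr></table></body></html>"""

if __name__ == "__main__":
    for name, value in list_markup_properties(SAMPLE).items():
        print(f"{name} = {value}")
```

If a property name such as dl_checkoutby is missing from this listing for a
real file, the value is not stored in the markup, which would be consistent
with filtdump not reporting it.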


Hilary Cotter

2005-03-16, 5:58 pm

Can you post one of these docs here or send it zipped to me offline?



