Posted by Tthe guy who br on September 1, 2006, 1:53 pm
Cheers - hope you got some good sleep in :)
That's a good 10,000 ft overview - just what I needed.
Gonna go away and think about it.......
"Jeff R." wrote:
> Okie dokie,
>
> Here's the skinny, it's kind of a 10,000 foot overview but I think it will
> answer most of your questions.
>
> The protocol handler is called by the indexer and is responsible for letting
> the indexer crawl your database. You can set crawl scope rules in the
> indexer to specify what you do and don't want indexed in your database.
>
> Now, the Protocol Handler (PH) will in essence get you from place to place in
> your database and enumerate what's there. However, to break open and parse
> the items and get info into the indexer you need IFilters. An IFilter will
> pull chunks of data out of an item; for each chunk you get the type of data
> and its value, then pass them to the indexer in pairs and let it put them
> where they need to go. When you are done with an item, the PH looks at the
> next item, decides what it is, and applies the proper IFilter if one is
> needed. For example, say you had a Word doc, an Excel file and a PDF. The two
> Office docs would probably use an IFilter that comes with Office, no worries.
> For the PDF, however, you would need to download one from the Add-ins site or
> write your own.
>
> Also, this technology is almost exactly the same as the one used for
> SharePoint Server. If you look on MSDN you will find a wealth of information
> on it there, as well as a prewritten IFilter (with source) to pick apart and
> hack to your heart's content! I believe there is also a doc and code for a PH,
> but don't quote me on that.
>
> Try this for the IFilters:
> How To:
>
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odc_SP2003_ta/html/ODC_HowToWriteaFilter.asp?frame=true
>
> Premade:
> http://addins.msn.com/addins_category_desktop.aspx
>
> More Stuff:
> http://channel9.msdn.com/wiki/default.aspx/Channel9.DesktopSearchIFilters
>
> For PH's you may find this interesting :)
>
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/spssdk/html/_introduction_to_a_protocol_handler.asp
>
> Sorry it took so long to get back to you but I decided to sleep a little
> this week. :)
>
> Good Luck!
> JR
>
> >
> > Guess it kind of helps.
> >
> > So I need to write a Protocol handler. What protocol will the handler be
> > handling?
> >
> > As I say my database is in a '.db' file. So would I be handling something
> > like:
> > myhandler://path/mydbfile.db
> >
> > If you'll excuse the questionable pseudo code, I want to do something like
> > this:
> > ---------
> > filenames = SELECT filename FROM filestable
> > foreach( nextfile in filenames )
> > {
> >     filemetadata = SELECT key, value FROM metadatatable WHERE filename=nextfile
> >     foreach( keyvaluepair in filemetadata )
> >     {
> >         IndexThisPieceOfMetaData( nextfile, keyvaluepair )
> >     }
> > }
> > ---------
> >
> > What does what & when in the process?
> > - Is this the right process flow?:
> > 1) WDS is running and decides to index stuff
> > 2) WDS finds my .db file in an indexable location
> > 3) Someway, somehow WDS finds out my protocol handler is registered for
> > this type, so it calls myhandler://path/mydbfile.db
> > 4) My handler would receive the path/filename, open the database and start
> > doing "stuff" with my current API to get the metadata.
> > 5) Metadata is sent to the index
> >
> >
> > Questions:
> > 1) Would I be right in assuming this happens when the file is changed?
> > Since this is a db file, it may change very often. Should I just 'live with
> > it'\'make sure it is thread safe'\'have some flag in my db to say what
> > changed'?
> > 3) & 5) How\Why\What\When\Where?
> >
> > In the bit of pseudo code, what does
> > IndexThisPieceOfMetaData( nextfile, keyvaluepair )
> > actually do? How is this metadata sent to the index? How do I tell the
> > index that I'm talking about a different file? Is it simply a case of one
> > of the key\value pairs sent to the index being
> > 'filename'\'<absolute path of the file>'?
> >
> >
> > Is an IFilter used? If so, when?
> >
> >
> >
> > I've got a gazillion other questions, but I need to get clear how the
> > whole thing hangs together, what takes responsibility for doing what, and
> > when.
> >
> >
> >
>
>
>
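For anyone finding this thread later, here is a rough mock-up of the division of labour Jeff describes: something enumerates the items, something else cracks each item open into key/value pairs, and the indexer files those pairs away. It is only a toy sketch in Python and NOT the WDS API (real protocol handlers and IFilters are COM components). The function names (enumerate_items, filter_item, crawl), the demo data, and the idea of giving each item its own URL under myhandler:// are all assumptions made for illustration; the table names come from the pseudo code quoted above.

```python
import sqlite3
from collections import defaultdict


def enumerate_items(conn):
    """Protocol-handler role (hypothetical): list what is in the store, no parsing."""
    for (filename,) in conn.execute(
        "SELECT filename FROM filestable ORDER BY filename"
    ):
        yield filename


def filter_item(conn, filename):
    """IFilter role (hypothetical): open one item and emit (key, value) pairs."""
    yield from conn.execute(
        "SELECT key, value FROM metadatatable WHERE filename = ? ORDER BY key",
        (filename,),
    )


def crawl(conn, scheme="myhandler", db_path="path/mydbfile.db"):
    """Indexer role (hypothetical): drive the crawl and store pairs per item URL."""
    index = defaultdict(dict)
    for filename in enumerate_items(conn):
        # Assumption: each item is identified by its own URL under the scheme,
        # which is how the toy index tells one file's metadata from another's.
        url = f"{scheme}://{db_path}/{filename}"
        for key, value in filter_item(conn, filename):
            index[url][key] = value
    return dict(index)


def make_demo_db():
    """Build an in-memory database shaped like the tables in the pseudo code."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE filestable (filename TEXT PRIMARY KEY);
        CREATE TABLE metadatatable (filename TEXT, key TEXT, value TEXT);
        INSERT INTO filestable VALUES ('a.doc'), ('b.pdf');
        INSERT INTO metadatatable VALUES
            ('a.doc', 'author', 'JR'),
            ('a.doc', 'title', 'Overview'),
            ('b.pdf', 'title', 'Manual');
        """
    )
    return conn


if __name__ == "__main__":
    for url, props in crawl(make_demo_db()).items():
        print(url, props)
```

The point of the split is that enumerate_items never looks inside an item and filter_item never decides what to visit next, which mirrors the "PH gets you from place to place, IFilter breaks the item open" description in Jeff's reply.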