My IIS log files are loaded with hits from the user-agent
'gsa-crawler'. It appears this is the user-agent of a Google Search
Appliance. An @google email address is listed as well.

I would like to disallow this crawler from my website, but do not want
to restrict the regular Google crawler. Can anyone confirm that
gsa-crawler is definitely NOT the crawler for Google's search engine?

Re: gsa-crawler wrote in news:1131647529.858022.265700

Re: gsa-crawler

This is not THE Google crawler...
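
If that is right, blocking it by user-agent should leave the regular Google crawler alone. Here is a rough sketch of how one might check such a rule before deploying it, assuming the appliance honors robots.txt (not confirmed in this thread). The robots.txt text, the example.com URL and the helper name can_fetch are only illustrative; the check uses Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: shut out gsa-crawler, leave every other
# crawler (including Googlebot) unrestricted.
ROBOTS_TXT = """\
User-agent: gsa-crawler
Disallow: /

User-agent: *
Disallow:
"""


def can_fetch(user_agent, url="http://www.example.com/index.html"):
    """Return True if the rules above let user_agent fetch url.

    The URL is a placeholder; any path on the site behaves the same
    because the rules cover the whole site.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("gsa-crawler", "Googlebot"):
        print(agent, "allowed" if can_fetch(agent) else "blocked")
```

This only tells you what a well-behaved crawler would do with those rules. If the hits keep coming, the crawler is ignoring robots.txt and would have to be blocked on the server side instead.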

It looks like they are attempting to outsource the actual crawling
of sites.

You have the machine that works like a local site-based search engine, after
which Google comes in and takes 1 file with all the info... kind of like the
sitemap.xml they are promoting.

Google's update frequency is slow, and the GSA for corporate and business
sites and the sitemap.xml for the common man might be a good way to speed
things up dramatically if their use increases.  It sure goes faster to index 1
file than to crawl 500,000!

You might see an evolution from the traditional crawling Google to an
indexing Google...
