January 18, 2007, 2:59 pm
It appears that certain Yahoo! Slurp bots are ignoring the robots.txt
file and crawling pages at rates that could cause server issues.
The IP addresses involved appear to be in the 74.6.x block. Their
reverse DNS resolves to Inktomi, which is correct for Yahoo!'s crawler.
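
The reverse DNS check described above can be sketched in Python as a forward-confirmed lookup: resolve the IP to a hostname, check the hostname's suffix, then resolve that hostname back and confirm it maps to the original IP. This is only a sketch. The function name, the injectable `reverse` and `forward` parameters, and the `inktomisearch.com` suffix are assumptions for illustration and are not stated in the post; substitute whatever hostname suffix the crawler operator documents.

```python
import socket


def is_verified_crawler(ip, allowed_suffixes=(".inktomisearch.com",),
                        reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a crawler IP.

    allowed_suffixes is an assumed value, used here only as an example.
    """
    try:
        # gethostbyaddr returns (hostname, aliases, addresses)
        hostname = reverse(ip)[0].lower().rstrip(".")
    except OSError:
        return False  # no PTR record, or the lookup failed
    if not hostname.endswith(tuple(allowed_suffixes)):
        return False  # hostname is outside the expected domain
    try:
        # gethostbyname_ex returns (hostname, aliases, addresses)
        addresses = forward(hostname)[2]
    except OSError:
        return False
    # A PTR record alone can be forged; the forward lookup must agree.
    return ip in addresses
```

Calling `is_verified_crawler(ip)` with an address taken from the access log performs live DNS lookups. The two resolver parameters exist only so the logic can be exercised without network access.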
Forum discussion at WebmasterWorld and Search Engine Watch Forums.
Posted by rustybrick in Yahoo! Search Optimization at January 17, 2007
7:57 AM | Comments (5) | TrackBacks (0)
Technorati Tags: crawlers, slurp, yahoo, yahoo slurp
My blog is the Maytag man of the blogosphere. Yet Yahoo! (Inktomi)
crawls it every day faithfully.
Just me and Inktomi but I am thankful they notice, LOL.
Posted by: Mike at January 17, 2007 9:44 AM Permalink
This was discussed on the LED Digest last week - the original post is
in #2321: http://www.led-digest.com/content/view/1701/55/ with
responses in the next 3-4 issues. As far as I know the OP never
resolved this, but he did offer a piece of advice:
"Feature request for SE spiders: Provide a referrer. Please. It would
make me and I expect other site owners feel grateful when odd URL
requests are noticed. If more than one referrer, then just any one --
the last one, the first one, doesn't matter which. Referrer information
could save people a lot of time, and let them keep their
hair a while longer."
Hope this info helps...
Posted by: Adam Audette at January 17, 2007 11:31 AM Permalink
Thanks Adam, sorry for missing it.
Posted by: Barry Schwartz at January 17, 2007 11:39 AM Permalink
I answered the specific question on WebmasterWorld. There does not seem
to be an issue with the crawler in this instance, but rather an
incorrect interpretation of the robots.txt syntax by the publisher.
Posted by: Tim at January 17, 2007 2:01 PM Permalink
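
Tim does not say which rule was misread. One way to check how a standard parser reads a robots.txt file is Python's `urllib.robotparser`. The sketch below uses a hypothetical rule set, not the publisher's actual file, and the helper name `allowed` is made up for illustration. Python's parser is not Yahoo!'s parser, so its answers are only a sanity check and may differ from how Slurp treats the same file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, not the publisher's actual file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Slurp
Disallow: /tmp/
"""


def allowed(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if Python's parser would let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

With these rules, `allowed("Slurp", "/private/page.html")` is true in Python's parser: once a group names the bot, that group is used instead of the `*` group, so the `/private/` rule no longer applies to it. Repeating the rule inside the bot's own group closes that gap.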
Is Yahoo's bot based on Wget? I ask because the following line showed
up in my log file:
2007-01-17 17:44:24 W3SVC105 NT-110 XX.XX.XX.XX GET / - 80 -
220.127.116.11 HTTP/1.0 Wget/1.8.2 - - www.mydomain.com 200 0 0 11155
The IP's reverse DNS resolves to i18ndev23.yst.corp.yahoo.com.
Posted by: Ram at January 18, 2007 12:59 AM Permalink
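
To read a log line like the one Ram quotes, it helps to split it into named fields. The sketch below assumes the line follows a common IIS W3C extended layout; the log's `#Fields:` header is not shown in the comment, so the field order in `ASSUMED_FIELDS` is a guess that happens to match the number of values, and the last field in particular could be something other than `sc-bytes`. The function and constant names are illustrative.

```python
# Assumed field order; the real "#Fields:" header is not shown in the comment.
ASSUMED_FIELDS = (
    "date", "time", "s-sitename", "s-computername", "s-ip", "cs-method",
    "cs-uri-stem", "cs-uri-query", "s-port", "cs-username", "c-ip",
    "cs-version", "cs(User-Agent)", "cs(Cookie)", "cs(Referer)", "cs-host",
    "sc-status", "sc-substatus", "sc-win32-status", "sc-bytes",
)

# The log line from the comment, joined back onto one line.
LOG_LINE = (
    "2007-01-17 17:44:24 W3SVC105 NT-110 XX.XX.XX.XX GET / - 80 - "
    "220.127.116.11 HTTP/1.0 Wget/1.8.2 - - www.mydomain.com 200 0 0 11155"
)


def parse_log_line(line, fields=ASSUMED_FIELDS):
    """Map whitespace-separated log values onto the assumed field names."""
    values = line.split()
    if len(values) != len(fields):
        raise ValueError(
            f"expected {len(fields)} fields, got {len(values)}"
        )
    return dict(zip(fields, values))


record = parse_log_line(LOG_LINE)
print(record["cs(User-Agent)"], record["c-ip"], record["cs(Referer)"])
```

Under that assumed layout, the user agent is `Wget/1.8.2` and the referrer field is `-`, meaning the request sent no referrer.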