- Luigi Donatello Asero
August 17, 2005, 12:48 am
Google robots have often spidered parts of my websites where I do not want them to go.
I wonder whether I could stop robots from doing so by using this:
RewriteCond %{HTTP_USER_AGENT}  ^NameOfBadRobot.*
RewriteCond %{REMOTE_ADDR}      ^123\.45\.67\.[8-9]$
RewriteRule ^/~quux/foo/arc/.+  -  [F]
I found it at
But I do not understand what the remote address refers to.
Is it the IP number of the robot?
Luigi Donatello (an Italian living in Sweden)
(I am an Italian citizen, but I live in Sweden)
Re: Access restriction
Yes, the remote address would be the IP address the robot is making its requests from.
If they're always trying to spider the same place then it would be
easier to exclude that area in your robots.txt file. There must be a
reason they're trying to spider it, so if it is always the same
place, use the link:<url> command in the various search engines to try
to find who is linking to that non-existent place and causing the
spiders to try to read it.
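For the robots.txt approach suggested above, a minimal sketch might look like this. The path /~quux/foo/arc/ is taken from the rewrite rule in the question; well-behaved robots such as Googlebot honour these directives, though badly behaved ones may not (which is when the mod_rewrite block is useful):

```
# robots.txt -- must be placed in the site's document root
# Ask all robots to stay out of the area being spidered
# (substitute the path you actually want to exclude)
User-agent: *
Disallow: /~quux/foo/arc/
```

Note that robots.txt is advisory only: it keeps compliant crawlers out, but it does not restrict access the way the [F] (forbidden) rewrite rule does.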
Check the spiderability of your pages: http://www.spidertest.com
Paul Silver - freelance web developer