Posted by Don on February 21, 2008, 10:13 pm
>
>
> [..]
>
>>> That being said: there are two ways that might do what you want:
>>>
>>> 1 IP address based: you have to find out the IP address ranges of
>>> each bot you want to allow.
>>> 2 UserAgent string based: you have to find out each UA string for
>>> each bot you want to allow.
>>>
>>> In .htaccess you can redirect internally using either 1 or 2 to the
>>> right robots.txt.
>>
>> Thank you very much for a useful answer.
>>
>> Sorry if I've come off like an ass.
>
> Thanks, no problem.
>
> Like I said, a lot of people on Usenet think they have an X problem,
> while the real one is Y, so people often assume this is the case.
>
> I also still can't see why you want to do this, but like you wrote,
> it's your server:
>
> method 1: if you miss out spiders, you might lose traffic.
>           hard to test (it can be done, with 2 computers + router)
> method 2: if you miss out spiders, you might lose traffic.
>           easy to test: you can either write a Perl program
>           that changes the UA for each request, or check
>           manually with Firefox + UA switcher add-on
>
>
> Untested:
>
> RewriteCond %{HTTP_USER_AGENT} =UA1 [OR]
> RewriteCond %{HTTP_USER_AGENT} =UA2 [OR]
> RewriteCond %{HTTP_USER_AGENT} =UA3 [OR]
> RewriteRule ^robots.txt$ real-robots.txt [L]
>
> with UA1..UAn the *exact* UA plain string, e.g.
> Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
>
> See: http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html
>
John,
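Just a heads-up (not a critique).
The "[OR]" on the last RewriteCond is invalid: the flag joins a condition
to the one that follows it, and nothing follows the last one.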
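
For reference, here is a sketch of the block with the trailing flag removed. It is untested, like the original, and assumes the conditions are meant to test %{HTTP_USER_AGENT}. "Examplebot" is a made-up second UA string used only for illustration. The patterns are quoted because they contain spaces and mod_rewrite would otherwise split them into several arguments.

```apache
RewriteEngine On

# Serve real-robots.txt only to the listed user agents.
# The leading "=" asks for an exact string comparison instead of a regex match.
RewriteCond %{HTTP_USER_AGENT} "=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" [OR]
# Hypothetical second bot; replace with a real UA string.
RewriteCond %{HTTP_USER_AGENT} "=Examplebot/1.0 (+http://www.example.com/bot.html)"
# No [OR] on the last condition: the flag only links a condition to the next one.
RewriteRule ^robots\.txt$ real-robots.txt [L]
```

Any client whose UA string matches none of the conditions still gets the ordinary robots.txt.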