- Nick Wedd
January 19, 2011, 8:07 pm
perl script which notes the $ENV value (this is the
visitor's IP address) for all visitors.
I have recently noticed an IP address 220.127.116.11, belonging to
someone/something which visits my pages exactly once a day. I find that
this is Yandex, the biggest Russian-language search engine. Fair
enough, Yandex is checking out my pages. But why don't I see similar
visits from Google, or any other search engine? I know that Google is
aware of these pages.
Nick Wedd firstname.lastname@example.org
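
To make the "exactly once a day" observation concrete, here is a minimal Python sketch (not the poster's Perl script) that takes logged (date, IP) pairs and picks out the addresses that show up exactly once on every day they appear. The function name, the `min_days` threshold, and the sample addresses are all made up for illustration; the real log format is not shown in the thread.

```python
from collections import defaultdict
from datetime import date

def once_a_day_visitors(visits, min_days=3):
    """Return IPs seen on at least min_days distinct days,
    with exactly one hit on each of those days.

    visits: iterable of (date, ip) pairs, e.g. parsed from a visitor log.
    Note: this does not check that the days are consecutive.
    """
    hits = defaultdict(lambda: defaultdict(int))  # ip -> day -> hit count
    for day, ip in visits:
        hits[ip][day] += 1
    return sorted(
        ip for ip, days in hits.items()
        if len(days) >= min_days and all(n == 1 for n in days.values())
    )

# Hypothetical sample data using documentation-only address ranges.
sample = [
    (date(2011, 1, 17), "192.0.2.1"),
    (date(2011, 1, 18), "192.0.2.1"),
    (date(2011, 1, 19), "192.0.2.1"),
    (date(2011, 1, 17), "198.51.100.7"),
    (date(2011, 1, 17), "198.51.100.7"),  # two hits on the same day
    (date(2011, 1, 18), "198.51.100.7"),
    (date(2011, 1, 19), "198.51.100.7"),
]
regulars = once_a_day_visitors(sample)
```

With the sample above, only the first address qualifies, since the second one hit twice on one day.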
Re: idle curiosity
Nick Wedd wrote:
Google does not have such a visit pattern (checking one page once every
day). In fact, I would guess that Yandex does not have such a visit pattern
either. Most likely it is someone (a competitor perhaps) using Yandex
translation tools as a free Web proxy to check on your page. And this is
probably done as a cron job via wget or a Perl or PHP script. Since they
don't want to be seen by their own IP, I think it's safe to assume it's a
To find a remedy you'll have to consider what sort of page it is. If it's
a dynamically built page (such as an account registration page, the usual
vector of attack), you may want to check for your site's cookie before showing
the page. If it's a static page, I don't believe you have a remedy
sections of the page you don't want shown to bots.
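
As a rough illustration of the cookie check suggested for dynamic pages, the following Python sketch decides whether a request carries the site's cookie by looking at the raw `Cookie` header. The cookie name `site_session` and the function name are hypothetical, and this is only one possible way to do it. Keep in mind that it only filters out clients that do not store cookies; a script that sends the cookie back will pass.

```python
from http.cookies import SimpleCookie, CookieError

def has_site_cookie(cookie_header, name="site_session"):
    """Return True if the raw Cookie header carries a non-empty
    cookie called `name` (the name is an assumption for this sketch)."""
    if not cookie_header:
        return False
    jar = SimpleCookie()
    try:
        jar.load(cookie_header)
    except CookieError:
        # Malformed header: treat it as "no cookie".
        return False
    return name in jar and jar[name].value != ""

# A browser that has visited before sends the cookie back;
# a bare wget or script run typically sends nothing.
returning_visitor = has_site_cookie("site_session=abc123; theme=dark")
bare_client = has_site_cookie("")
```

A page handler could then show the full page only when the check passes, and otherwise set the cookie and show a reduced page.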
Although it won't help with this particular attack (if it is in fact an
attack), you may want to use the "noarchive" meta tag so that people
cannot get all the info they need by browsing search engines' caches, without
ever hitting your site (and leaving a trace). I believe Yandex, as well as
Google, honors that meta tag ( <meta name="robots" content="noarchive"> ).
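
If you want to confirm that a page really carries the tag, a small Python sketch along these lines could check the HTML for a robots meta tag containing "noarchive". The class and function names are made up for this example, and it only looks at `<meta name="robots">`, not at HTTP headers or crawler-specific meta names.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None for bare attributes.
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)

def has_noarchive(html):
    """Return True if the page asks robots not to keep a cached copy."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives

page = '<html><head><meta name="robots" content="noarchive"></head></html>'
tagged = has_noarchive(page)
```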
A side note: kudos for picking the most appropriate message subject! :)
Chasing *one* single IP hitting you *once* a day exactly qualifies for
"idle curiosity" :)