Web crawlers and dynamic content

This must be a very old and well-studied question...
I created a site using JSP that permits the viewing of articles that
are fully stored in a database. The url for the articles is something
My question is, and this is the same for all dynamic content
technologies (server-side scripting technologies): is there a way to
let search engines and crawlers (Google, Yahoo...) find and
categorize these article pages?
Thank you very much,
Araxes Tharsis

Re: Web crawlers and dynamic content

FWIW, Google doesn't seem to have a problem accessing my pages with query
strings (I have scripts set up to email me when it crawls them). However, so
far it has not (if memory serves) reached any page with more than one
variable in the query string.

What you can do is use mod_rewrite to make your URLs search-engine friendly.
Google it, there's lots of stuff on it. This will help:
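
A minimal sketch of what such a rewrite rule can look like, assuming Apache httpd with mod_rewrite enabled sits in front of the application and the rule lives in an .htaccess file in the document root. The path /articles/ and the target /article.jsp?id= are hypothetical names, since the real article URL is not given in this thread:

```apache
# Requires mod_rewrite to be enabled on the server
RewriteEngine On

# Serve the friendly URL /articles/123 from the real dynamic URL
# /article.jsp?id=123 (both names are assumptions for illustration)
RewriteRule ^articles/([0-9]+)/?$ /article.jsp?id=$1 [L,QSA]
```

With a rule like this, the links on the site can point at /articles/123, which contains no query string for a crawler to stumble over.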

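Here is the script I'm using for the Google bot:

// Host plus request URI, i.e. the contents of the address bar
$currentPage = $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];
// Only send mail when the visitor identifies itself as Googlebot
if (stripos($_SERVER['HTTP_USER_AGENT'] ?? '', 'Googlebot') !== false) {
    $mail = "googledetector@mydomain.com";
    $subject = "GoogleBot - $currentPage";
    $message = "The GoogleBot has indexed this page! $currentPage";
    mail($mail, $subject, $message);
}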

Here's a list of bots you could add to that script, to generate an email
for each kind of bot or to write to a database or flat file:
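
A rough sketch of the flat-file variant, in case it helps. The helper name detectBot, the log file name, and the three user-agent tokens are assumptions chosen for illustration; they are not the list referred to above, and a real list would be longer:

```php
<?php
// Hypothetical helper: returns the name of the first known crawler whose
// token appears in the User-Agent string, or null if none match.
function detectBot(string $userAgent, array $botTokens): ?string
{
    foreach ($botTokens as $name => $token) {
        // Case-insensitive substring match on the User-Agent header
        if (stripos($userAgent, $token) !== false) {
            return $name;
        }
    }
    return null;
}

// Illustrative tokens only (assumption, not an exhaustive list).
$botTokens = [
    'Google' => 'Googlebot',
    'Yahoo'  => 'Slurp',
    'Bing'   => 'bingbot',
];

$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
$bot = detectBot($userAgent, $botTokens);

if ($bot !== null) {
    $page = ($_SERVER['HTTP_HOST'] ?? '') . ($_SERVER['REQUEST_URI'] ?? '');
    // Append one tab-separated line per crawler visit (file path is an assumption).
    $line = date('c') . "\t$bot\t$page\n";
    file_put_contents(__DIR__ . '/bot-visits.log', $line, FILE_APPEND | LOCK_EX);
}
```

Keep in mind that a User-Agent header can be faked, so a log like this shows who claims to be a crawler, not who verifiably is one.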


Re: Web crawlers and dynamic content

Don't multipost please.

Re: Web crawlers and dynamic content

So an answer should be easy to find (search engines and Usenet archives).

Create a page with the links you want to have indexed; search engines will
find it (if this page is linked from somewhere, of course) and index it.
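
A minimal sketch of such a link page, written in PHP to match the script earlier in the thread (the same idea carries over to JSP). The function name buildLinkIndex, the example articles, and the URL /article.jsp?id= are assumptions; on a real site the id and title pairs would come from the database:

```php
<?php
// Hypothetical helper: builds an HTML list with one plain link per article.
// $articles maps article id => title.
function buildLinkIndex(array $articles, string $baseUrl): string
{
    $items = '';
    foreach ($articles as $id => $title) {
        $href = $baseUrl . '?id=' . urlencode((string) $id);
        // Escape both the URL and the title before putting them into HTML
        $items .= '<li><a href="' . htmlspecialchars($href) . '">'
                . htmlspecialchars($title) . "</a></li>\n";
    }
    return "<ul>\n" . $items . "</ul>\n";
}

// Example data and URL are assumptions for illustration.
echo buildLinkIndex(
    [1 => 'First article', 2 => 'Crawlers & dynamic content'],
    '/article.jsp'
);
```

Because every article is reachable through an ordinary link on this page, a crawler that finds the page can follow the links without having to guess any query strings.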
