publicly accessible indexed web index?

Dear readers: I would like to run a basic web search and download a
list of all web pages matching the results. I don't care much about
ordering, because my own Perl programs will then wget the resulting
pages and check whether they meet other needs of mine. (I do need to
sift through thousands of result pages, though.)
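Something like this is what I have in mind for the filtering step. A
rough sketch only: urls.txt and the match pattern are placeholders,
and LWP::UserAgent stands in for wget.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new( timeout => 15, agent => 'my-filter/0.1' );

    # one URL per line, as saved from the search results
    open my $fh, '<', 'urls.txt' or die "urls.txt: $!";
    while ( my $url = <$fh> ) {
        chomp $url;
        my $res = $ua->get($url);
        next unless $res->is_success;
        # placeholder criterion; swap in whatever the real test is
        print "$url\n" if $res->decoded_content =~ /some pattern/i;
    }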

Of course, I could use one of the many publicly accessible spider
programs and crawl the web myself, but this seems like a waste of
bandwidth. Are there public repositories that would spare me the
crawl? Google used to have an API, but apparently just dropped it.
Moreover, I don't need much Google or PageRank sophistication; I need
the old AltaVista-like comprehensiveness more than cleverness.
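If I do end up crawling after all, I would at least do it politely.
Here is a rough sketch with LWP::RobotUA, which honours robots.txt and
rate-limits requests (the agent name and contact address are
placeholders):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::RobotUA;

    my $ua = LWP::RobotUA->new( 'my-crawler/0.1', 'me@example.com' );
    $ua->delay( 1 / 60 );    # delay() is in minutes; this is ~1s per host

    my $res = $ua->get('http://example.com/');
    print $res->is_success ? $res->content : $res->status_line, "\n";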

Any pointers would be appreciated.



Re: publicly accessible indexed web index?

> I would like to run a basic web search and download a list of all
> web pages matching the results.

From a search engine I guess.

> Google used to have an API, but apparently just dropped it.

Didn't know that, and I doubt it. One can always spider Google directly,
but it's against Google's policy. The API is limited to 1000 queries a day
if I recall correctly.
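
From memory, the SOAP API is driven from Perl roughly like this. Treat
the WSDL location, parameter order and field names as my recollection
rather than gospel, and you need your own license key:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use SOAP::Lite;

    my $key   = 'YOUR-LICENSE-KEY';    # issued when you sign up
    my $query = 'publicly accessible web index';

    my $google = SOAP::Lite->service('http://api.google.com/GoogleSearch.wsdl');
    my $result = $google->doGoogleSearch(
        $key, $query,
        0,         # start index
        10,        # results per query (10 was the ceiling)
        'false',   # filter near-duplicates
        '',        # site restrict
        'false',   # safe search
        '',        # language restrict
        'latin1',  # input encoding (ignored by later versions)
        'latin1',  # output encoding (ignored by later versions)
    );

    print "$_->{URL}\n" for @{ $result->{resultElements} || [] };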

> my own Perl programs will then wget the resulting pages and check
> whether they meet other needs of mine

Based on what criteria do you want to fetch pages? I doubt you want to
spider away :-)

> Any pointers would be appreciated.

I do this stuff for a living (12+ years of Perl experience); see my
site for pricing info, etc.

John

Need help with SEO? Get started with an SEO report of your site.

