- Posted on
November 30, 2005, 10:04 am
Re: Any reason for large reduction in number of pages indexed?
It is difficult for the search engines to work out how many pages are worth
indexing if the content is much the same from one page to the next. I have had a
look at the site and the number of pages would appear to be well in excess
of 5000, possibly very many more. You need to envisage and count up all
possible permutations of parameters passed after the root URL.
Take an example. If the same content can be shown in three different sort
orders (SKU, product and price), you triple the number of pages
potentially to be indexed, but the content of the three pages in each triplet
is the same - just the paragraphs in a different order. So ideally you
would put noindex, nofollow on two thirds of these pages and save the search
engine a lot of wasted effort. Better still, you might use wildcard rules in
robots.txt to bar Googlebot from crawling two thirds of those listings in
the first place. Easier again, just turn off the least used sort orders.
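As a rough sketch of the robots.txt idea - assuming the sort order on your site is selected with a query parameter called "sort" (the real parameter name may differ) - the rules might look like:

```
# robots.txt - hypothetical sketch; assumes sort order is chosen
# with a "sort=" query parameter. Googlebot honours * wildcards
# as an extension to the basic robots.txt rules.
User-agent: Googlebot
Disallow: /*sort=product
Disallow: /*sort=price
# The default SKU order is left crawlable, so each listing
# is still reachable in exactly one sort order.
```

The page-by-page alternative is a `<meta name="robots" content="noindex,nofollow">` tag in the head of the two duplicate versions, but the robots.txt route stops the crawler fetching them at all.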
Here is something else: Consider all the individual book detail pages.
They are all substantially the same. Every page has the same long reviews
of County Fair and Jamie Oliver. It is a bit optimistic to think that
the search engines will carefully identify and automatically ignore all
these identical reviews.
You want just the unique meaningful content indexed once only, if possible.
If you have written all those book reviews yourself, a printout
of just those book review texts is the core information value of your site,
and ideally you want the search engine to store that only, plus a brief
index and cross references as appropriate.
Consider using Google Sitemaps so that you can tell Google to index each
detailed book review page once, plus an index tree leading up and down from
the home page. Drastically reduce duplicate content on your pages so that
instead of 85% of the text on all pages being the same as on every other
page the amount the same from page to page is under 10%. Content Management
Systems, which I think is what this kind of thing is called, have a strong
tendency to dilute the unique content - like distributing a pinch of gold
dust in a pile of sand. It is the exact opposite of what search engines are
trying to do, which is to find and then index unique quality content.
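As a sketch of the Google Sitemaps suggestion - the URLs here are hypothetical placeholders, and the schema shown is the Google Sitemaps 0.84 one - a file listing each unique book detail page once might look like:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sitemap sketch: the home page plus each unique
     book review page, each listed once in the default sort order.
     All URLs below are placeholders, not your real paths. -->
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <loc>http://www.example.com/</loc>
  </url>
  <url>
    <loc>http://www.example.com/book/12345</loc>
  </url>
  <url>
    <loc>http://www.example.com/book/12346</loc>
  </url>
</urlset>
```

Submit the file through your Google Sitemaps account, and keep the duplicate sort-order URLs out of it so the crawler is pointed only at the pages you actually want indexed.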
Best regards, Eric.