Hiding "sub pages" from web indexing?


I've got a page that lists people's names as links.

Click on a person's name, and you see detail info for that person.

Is there any way to make it so that only that parent page can get to the detail pages?

The brass ring would be just plain no access at all - unless the request comes
from the parent page...but hiding those detail pages from whatever goes through
the web and indexes things would be good enough.

I'm trying to head off a perceived privacy issue.   From my perspective it seems
moot - but I think it will come up eventually.

Re: Hiding "sub pages" from web indexing?

(Pete Cresswell) wrote:



If there is a privacy issue, then the only answer is to not put the
pages up in the first place, or to have them behind a .htaccess password
protected directory.
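As a rough illustration of the password-protected-directory approach Dylan mentions, here is a minimal .htaccess sketch -- the file paths, realm name, and the assumption that the server is Apache with Basic auth enabled are all mine:

```
# .htaccess placed in the directory holding the detail pages (paths are examples)
AuthType Basic
AuthName "Classmates Only"
# The password file is created separately, e.g.: htpasswd -c /home/site/.htpasswd someuser
AuthUserFile /home/site/.htpasswd
Require valid-user
```

Anyone requesting a page in that directory then gets a browser login prompt before the server will serve it.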

If some of the information on the pages is private, then why not allow
the people to opt out of whatever information they don't want the world
knowing? If a person is happy with the page being on the Internet, then
they are happy with the world being able to see it, otherwise it
shouldn't be there.

Dylan Parry
http://webpageworkshop.co.uk -- FREE Web tutorials and references

Re: Hiding "sub pages" from web indexing?


Thanks.   I arrived at some of that sleeping on it last night.   I was going to
withdraw the question, but found you had answered already.

This is a high school class web site that grew out of the last
reunion we had.

My agenda:

1) Prevent harvesting of email addresses.

2) Give people a feeling that not too much is exposed.

3) Still allow people to see what/how their classmates are doing.

4) Facilitate communication between classmates if it is desired.

In the end, I would give everybody the opportunity to completely edit their
personal page - basically by composing a word processor document and sending it
to me to be converted to HTML.   But realistically, I think most people won't
do that.  With that in mind, I'll try to generate something that's informative,
but not too informative to someone/something that is harvesting information.

My current straw man:

1) Remove all contact-specific info from each person's page.  i.e. no phone
numbers, no email addresses; no last names - first names only.  Show state or
country of residence as a matter of general interest, but do not list town.

2) Instead of naming the person's page 'SmithJohn.htm', name it '1347.htm' -
'1347' being the person's PK in a DB that I'll use to generate skeleton pages.

3) Leave the person's full name only in the link text, so a web crawler or
whatever would have to make the logical connection between link title and
target file name on its own.

4) Provide some sort of vehicle for people to request contact information.
Probably an email to me which I relay to the person that is to be contacted.
Something like "Yo Sam: Joe wants to get in touch with you.  Here's how to get
back to Joe....".
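The page-generation idea in step 2 can be sketched in a few lines. Everything here is invented for illustration -- the record layout, the function name, and the bare-bones HTML are my assumptions, not Pete's actual setup:

```python
import html
import pathlib

# Hypothetical classmate records keyed by database primary key (PK).
classmates = {
    1347: {"first_name": "John", "state": "PA"},
    1348: {"first_name": "Mary", "state": "CA"},
}

def build_pages(out_dir="."):
    """Write one skeleton detail page per person, named by PK only.

    The file name (e.g. 1347.htm) reveals nothing about the person,
    and the page body carries first name and state only -- no last
    name, town, phone number, or email address.
    """
    written = []
    for pk, person in classmates.items():
        path = pathlib.Path(out_dir) / f"{pk}.htm"
        body = (
            "<html><body>"
            f"<h1>{html.escape(person['first_name'])}</h1>"
            f"<p>Lives in: {html.escape(person['state'])}</p>"
            "</body></html>"
        )
        path.write_text(body)
        written.append(path.name)
    return sorted(written)
```

The linking page would then carry the full name in the link text while pointing at the opaque file name, per step 3.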


Re: Hiding "sub pages" from web indexing?

(Pete Cresswell) wrote:

There are some ways to make email addresses unharvestable by using special
characters... I don't recall the specifics. You can also render the address as
GIF text. The best bet is to use a script-driven email form, so the message is
sent to the recipient without the sender ever seeing the address, while still
allowing a reply address to be included.
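One common "special characters" trick is encoding the address as HTML character entities. A sketch (the function name is my own):

```python
def obfuscate_email(address):
    """Encode every character as a decimal HTML entity.

    Browsers render the entities back to the original text, but naive
    harvesters scanning pages for plain 'name@host' patterns may miss
    it.  Determined scrapers can decode entities, so this is only a
    deterrent, not real protection.
    """
    return "".join(f"&#{ord(ch)};" for ch in address)
```

For example, `obfuscate_email("a@b.c")` produces `&#97;&#64;&#98;&#46;&#99;`, which a browser displays as `a@b.c`.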

Re: Hiding "sub pages" from web indexing?

That can be done using .htaccess or by modifying the Apache configuration.
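One way to do the "only from the parent page" restriction in Apache is a Referer check with mod_rewrite -- a sketch, with the host name, parent page, and detail-page naming pattern all made up; note the Referer header is trivially spoofed or omitted, so this is a nuisance for crawlers, not real security:

```
# Example only: refuse requests for the numeric detail pages unless the
# Referer header points at the parent listing page.
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^https?://www\.example\.com/classmates\.htm$ [NC]
RewriteRule ^[0-9]+\.htm$ - [F]
```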

Google for robots.txt - it's a special file that tells bots which pages  
not to look at. The bots have to honor this file for the thing to work, of  
course, but they do. [or so I hear - never tried it myself]
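For the record, such a file is just plain text at the site root -- the directory name below is an example:

```
# robots.txt at the site root
User-agent: *
Disallow: /classmates/
```

Every well-behaved crawler then skips everything under /classmates/.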


Re: Hiding "sub pages" from web indexing?


The well-behaved ones (e.g. Google) do. The naughty ones (the ones Pete
is trying to avoid) if anything will be _more_ likely to look at a
directory "blocked" via robots.txt than one that isn't.

Mark Parnell
