Links, PageRank, Referrer Spam and a Future Reality of Web Traffic

I have had a zombie attack directed at my Web server for the past week or
two. Yesterday it peaked with over 30,000 attacks by Windows machines
world-wide. The motive was referrer spam, and probably vandalism too. When I
contacted my host for assistance (which required installing software), the
sysadmin served me the following page:

I immediately recognised it as davidof's site and I quite like the following
bit, which I will quote below for the group to read:

(context is link spam)

Isn't Spam really Google's fault?

It seems that Heisenberg was right: by observing something, you affect the
observations. Google's PhDs, for all their brains, are a somewhat innocent
bunch, unable to see the consequences of their actions or understand how
the real world operates. By basing their search engine rankings on
inbound links and anchor text, they encourage unscrupulous people to exploit
weaknesses in the system to boost their websites to the top of Google's
results.

Google is a great resource for finding information. However, the Webosphere
didn't ask Google to set up shop. Google is a business. Like the spammers,
they are in it for the money. Now, I'm not saying Google is evil, but they
need to mature as a business. Instead of focussing on propeller-heads
who've probably never had a girlfriend, they need to employ some guys with
street smarts who can think through the latest whizzy idea before it gets
beta'd on the rest of us. Other large businesses have to take some
responsibility for their actions (well, OK, not Microsoft; they have the
EULA), so why not Google?

Still, there are differences between setting up an environment that
encourages spam and actually generating the damn stuff. But if we don't
act, search engine spam will harm the Web just as surely as UCE has harmed
email. We don't want to reach the stage where 85% of all requests to a
website are spam, do we?

There are other actors. Microsoft, for selling a completely insecure
operating system in the form of Windows, must shoulder a lot of the blame,
as must the ISPs and Web hosting companies that support the spammers.

The last two paragraphs are similar to points I made earlier today, before
reading David George's take. I think the analysis above is insightful. In a
nutshell:
* Blame ISPs for harbouring spammy traffic

* Blame Microsoft for unleashing a faulty O/S out of the box

* Blame Google for unintentionally giving incentive for Web spam
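
As an aside, the link-based ranking being blamed here is essentially
PageRank, which can be sketched in a few lines. This is only a toy
illustration, not Google's actual algorithm; the damping factor of 0.85 and
the example link graph are my own assumptions:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict of page -> rank, summing to 1.
    """
    pages = set(links) | {p for targets in links.values() for p in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a base share; the rest flows along links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical graph: three throwaway pages all point at "target",
# funnelling rank into it even though nothing links to them.
graph = {"spam1": ["target"], "spam2": ["target"], "spam3": ["target"],
         "honest": ["other"], "other": ["honest"], "target": []}
ranks = pagerank(graph)
```

The toy graph shows why link spam pays: pages that nobody links to can
still funnel rank into a target, which is exactly the weakness being
exploited.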

E-mail traffic worldwide is about 50% spam. If all goes as predicted,
expect 50% of content to be mirrors, 50% of links to be synthetic, and 50%
of Web traffic to be utter garbage. Great future ahead! Enjoy the Net
today... before it's destroyed. I have been filtering manually (a human
filter) for my site for the past 24 hours. Had I not done that, my shared
host would not have coped and I would have been 'separated' from the Web. I
have done no work whatsoever today or yesterday.
Luckily, my supervisor understands.
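
That kind of manual filtering can be partly automated. Below is a minimal
sketch, assuming Apache-style combined-format logs and an arbitrary
frequency threshold (both are assumptions of mine, not a description of my
actual setup), that flags referrers hitting a site implausibly often:

```python
import re
from collections import Counter

# Matches the request, status, size, and referrer fields of an
# Apache combined-format log line. Real log formats may vary.
LOG_RE = re.compile(r'"[A-Z]+ [^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"')

def suspicious_referrers(log_lines, threshold=100):
    """Return referrers appearing more than `threshold` times."""
    counts = Counter()
    for line in log_lines:
        m = LOG_RE.search(line)
        if m and m.group("referrer") not in ("-", ""):
            counts[m.group("referrer")] += 1
    return {ref: n for ref, n in counts.items() if n > threshold}
```

A real filter would also look at request rates per IP and user-agent
strings, but even a crude frequency cut catches the most blatant
referrer-spam floods.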


Re: Links, PageRank, Referrer Spam and a Future Reality of Web Traffic

Roy Schestowitz wrote:


What's with the pessimism, Roy?  

Site's under attack? Big deal. I am sitting here looking at a list of
1,200+ unauthorized root access attempts on just one of my hosts in the
last 24 hours. Business as usual. Sometimes 3,000, sometimes 500. Never
fewer than 500 in a day.

The point is: how is the Websphere (not a trademark, but an analogue to the
Noosphere or Biosphere ;-) ) different in this sense from the rest of your
life? Unless you were raised under a glass dome, you might have noticed
that life is not fair. There's crime, and there are accidents, and
everything else that makes life less than perfect. I assume you wouldn't
leave your car unlocked in a bad part of the city you live in, would you?
If you did, would you be surprised to find it vandalized upon your return?

90% of my mail is junk. That's my US Post-delivered mail I'm talking
about. If I did not dump all the junk from my PO box into the adjacent
trash bin at least once a week, it would have exploded. Does that mean the
US Post has to be shut down? God, no! I wouldn't be able to get that new PC
mouse I ordered recently.

I do have a feeling that's the exact opposite of yours: the Web was a
saintly place compared to the rest of the world, and now it is just
reverting to the norm.

Hey, look at the bright side: there are good people out there! Say hello
to your supervisor ;-)

Cheers, and I mean CHE-E-E-RS!

Re: Links, PageRank, Referrer Spam and a Future Reality of Web Traffic

On Thu, 13 Oct 2005 18:15:45 +0100, Roy Schestowitz wrote:


SE spammers had incentive to spam long before Google, and they did
plenty of it. Anyone else remember the TV commercial - I forget for
which SE - where they had a bunch of old people ("same old links") call
out their site name when a searcher calls out their search terms?
There was one old guy in a leather harness calling out "Hot Leather
Action!" or something like that, to which another replied "Oh, you
come up for everything!"

At the time it was mostly content and keyword tag spamming. Google's
system of analyzing links was just the ticket to sift out the real
stuff (which legit sites generally linked to) from the scum.

The problem is that they let it come out how they did it. (It was, of
course, just a matter of time before the spammers figured it out, even
if Google hadn't leaked it via their patents.) Any criteria by which
sites can be evaluated for relevancy & authority can be targeted if
it's known. Google has, of course, refined their system, mostly
plugging the holes in ways that seem to be aimed at forcing spam to be
more obvious to the user. The holy grail is, of course, to get the
criteria to the point that a page absolutely *has to be* relevant
and/or authoritative to meet the criteria and where any relevant
and/or authoritative page will meet it. That point may be approached,
but short of some degree of AI, it will never actually be met, and
probably not even then.

Re: How Search Engines SHOULD Be Managed

__/ [John A.] on Friday 14 October 2005 02:33 \__


Interesting take. Some months ago I argued that in order to avoid bias and
avoid corruption, the following steps should at least be considered:

- Make a search engine a public service[1], much like the W3C's validation
services and ICANN. The Web belongs to everyone in this world, and search
-- the means by which data gets organised -- should be a service.
Likewise, an operating system should be nobody's property. Hardware should
be, but not the platform upon which people communicate. Conflicting
interests lead to protocol breakage... (I am going endlessly off topic, so
I will stop)

- Have sites register in one form or another to state their aims and scope.
DMOZ goes some way towards that, but the whole (corporation) love affair
is disturbing in my eyes.

- Use more proper methods for exploiting knowledge and information. Don't
tell me (Schmidt) how long it will take to index all human knowledge
(300 years, he said - reference available on demand). Do the task
_properly_! See the URL at the bottom of my sig, as I truly believe search
engines are lagging behind what science (AI in particular) has to offer.

[1] Funding of crawling resources can be managed in the same way Google
does it, e.g. paid listings in SERPs (not sponsored links in the actual
results), much like the Yellow Pages, where yellow/white tells ham apart
from spam.

I think there needs to be a strategic movement like GNU in order to release
ourselves from commercial search engines (and all-round public information
domination). The financial entry barrier is high, though. See:




All they (Brin, Page) needed was a little cash to move out of the dorm -
and to pay off the credit cards they had maxed out buying a terabyte of
memory. So they wrote up a business plan, put their Ph.D. plans on hold,
and went looking for an angel investor. Their first visit was with a
friend of a faculty member.


Best Regards (happy to have heard your thoughts),


Roy S. Schestowitz      | Previous signature has been conceded  |    SuSE Linux    |     PGP-Key: 74572E8E
  4:55am  up 49 days 17:09,  3 users,  load average: 0.79, 0.51, 0.55 - next generation of search paradigms

Re: How Search Engines SHOULD Be Managed


- Who's going to pay for that?
- Who's going to decide how it is going to work? The public?
  Will fail.


Who defines this service?


Why not?


Open Source doesn't mean that protocols will become clear and well
defined. Also, protocols are not limited to software; they exist in
hardware as well. Don't you just hate it when your one-year-old hardware
doesn't all work in your new motherboard?


The same would happen if it became independent: there is an editor, and
there is someone who wants in -> conflicts, corruption.

TANSTAAFL, that's the problem.

Yup, that's the whole point. GNU.... have a look at HURD...

John                       Perl SEO tools:
                                             or have them custom made
                 Experienced (web) developer:

Re: Releasing Ourselves from Google and Microsoft Tyranny?

__/ [John Bokma] on Friday 14 October 2005 07:08 \__

Okay, you pulled my finger, so I'll have to answer your questions. *smile*


You must have hit reply before reading it the first time. *grin*


A panel of people who are deemed suitable and are knowledgeable in the
field in question.


By owning an O/S, you partly own a person's computer. You definitely have
/control/ over it. If a commercial body controls your computer, it can
steer you towards elements that serve its financial agenda, to name just
one aspect of the problem.


Software can be duplicated. Hardware cannot.

Quality control and competition encourage development. You could use the
same argument when referring to software, but let us assume that there are
many experts out there (there already are) who contribute to Open Source
and will continue to do so for reputation, not direct profit.


That is true. Need I raise the fact, however, that some hardware is
designed to work only with Windows? (references on demand)


Yes, definitely. As we seek a way of verifying that a site is worthwhile,
how about specifying clear protocols for acceptance, classification, and
rating? You could say the same thing about taxation, but that system still
appears to work (let us pretend).


I wonder what Torr has to say on the subject...

I also wonder if the next step for Google would be to steer users towards
Java Desktop and JRE stuff. It now seems a more realistic and defensible
prospect than a Google operating system (a fantasy to some).

Roy S. Schestowitz      | "Black holes are where God is divided by zero"  |    SuSE Linux    |     PGP-Key: 74572E8E
  7:50am  up 49 days 20:04,  4 users,  load average: 1.67, 1.03, 0.73 - next generation of search paradigms
