Google stops sending referral data - the end of keyword research?

Back in mid-October 2011, Google announced that it would stop sending referral data for searches originating from its secure SSL domain, https://www.google.com. Here is their original announcement @ googleblog.blogspot.com . Since then, most webmasters who care to look at their referral data have noticed that a great many Google referrals now look as if they are coming from Google’s homepage (or from strange, meaningless URLs on Google’s domain, depending on your statistics software settings).

Google referral data (missing) in the Web server log file

This is what the log file record for a visitor coming from Google now looks like (IP scrambled for privacy). Note the &q= query string parameter, which used to contain the URL-encoded search keywords but is now empty (highlighted in yellow).
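
To check your own logs for this, here is a minimal Python sketch. It assumes the server writes the common "combined" log format (with the referrer as the second-to-last quoted field) and that any host with a "google" label counts as Google. The function name `google_keywords` and both assumptions are mine, so adjust them to your setup.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed layout: ip ident user [time] "request" status size "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def google_keywords(log_line):
    """Return the search phrase from a Google referral.

    Returns "" when the q parameter is present but empty (the new behaviour),
    and None when the line is not a Google referral carrying a q parameter.
    """
    match = LOG_PATTERN.match(log_line)
    if match is None:
        return None
    referer = urlsplit(match.group("referer"))
    host = referer.hostname or ""
    # Assumption: any host with a "google" label (www.google.com, www.google.co.uk, ...)
    if "google" not in host.split("."):
        return None
    # keep_blank_values=True so that an empty "q=" is reported instead of dropped
    params = parse_qs(referer.query, keep_blank_values=True)
    if "q" not in params:
        return None
    return params["q"][0]
```

A return value of "" marks a visit whose keywords were stripped, so counting those against non-empty results gives a rough idea of how much keyword data you are losing.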

Personally, I think that removing the referral data was actually a positive move from the privacy perspective. Imagine that someone on an open (or semi-open, as in a hotel) WiFi network could see what you’re searching for by looking at the referrer field and easily parsing the Google URL into the search phrase. Even if your connection to Google were secured via SSL, Google would not be able to completely conceal what you were looking for without getting rid of the referral data.

There were a lot of doomsayers on the Net right after the announcement who predicted the end of keyword research as we know it. However, it would have been naive to think that such important and valuable data would stay out in the open forever. Eventually, all other search engines will follow suit. Their entire business is based on selling these keywords to their advertisers, right after receiving these keywords from their customers as a “payment” for the services rendered (information found). Sort of like a bank that sells credit services after receiving deposits from customers. Can you imagine banks not taking all necessary steps to protect money and other assets from the outside world? So the search engines will create ways to control the flow of keyword information and to profit from restricting access to it.

Despite the relatively bleak outlook for the proverbial “little guy” on the Internet – owners and operators of smaller, niche sites – there are still important sources of data for keyword research out there. There are some that have been overlooked historically, and now is the perfect time to make better use of all those “secondary” sources.

  1. First and foremost, there is your own historical data. Make sure you have backups of your raw Web server logs: chances are you will be coming back to them more often now that access to fresh sources of keyword data is restricted.
  2. Google will only strip referral data from SSL users, which includes all logged-in users (mandatory) and everybody else who chose to open https://www.google.com instead of the usual http://www.google.com (optional). This particular source of keywords, however, may be gone relatively soon, too: since most people just type “google.com” into the address bar of their browser, it is the browser that has to come up with the proper URL, and right now the default is http://www.google.com . I would predict that all major browsers will convert this to https://www.google.com/ automatically within a year or even less. So, getting back to item #1, make sure you carefully back up this information because it may soon be gone.
  3. Google’s search partners, such as AOL, CompuServe, Verizon, Netscape and a whole bunch of others, serve Google results but do not provide SSL functionality and don’t conceal the referral data. You had better train your statistics software to extract keyword data from those secondary Google referrals, too.
  4. Don’t forget to collect data from your very own site search! It is incredible how often this obvious source of keyword data is overlooked.
  5. When the restrictions on referral data affect more than a single-digit percentage of all Internet searches (Google’s recent estimate), I think a secondary keyword market may be created to facilitate buying and selling of this valuable data between webmasters directly. It may be the case that your old Web server logs, even if stripped of IP information for privacy, become an unexpected product of your site that you are able to sell, so, again, getting back to point #1, make sure you keep the logs safely backed up.
  6. Last but not least: Google promises to make the 1000 keywords most often used to find your site available to you via Webmaster Tools. I think it would make total sense to back this information up regularly as well.
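
Items 3 and 4 in the list above come down to the same task: pulling a search phrase out of a referrer URL using a per-site rule. A rough Python sketch of that idea follows. The rule table is hypothetical: the hosts and parameter names (`q`, `query`, `s`) are assumptions that you should verify against the referrers actually present in your own logs, and `www.example.com` merely stands in for your own site search.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Hypothetical rules: host (or parent domain) -> query parameters that may hold
# the search phrase. Every entry is an assumption to check against real logs.
KEYWORD_RULES = {
    "google.com": ["q"],
    "search.aol.com": ["q", "query"],
    "www.example.com": ["s"],  # placeholder for your own site search
}

def extract_keyword(referer, rules=KEYWORD_RULES):
    """Return the normalised search phrase from a referrer URL, or None."""
    parts = urlsplit(referer)
    host = parts.hostname or ""
    params = parse_qs(parts.query)  # parameters with empty values are dropped
    for domain, names in rules.items():
        if host == domain or host.endswith("." + domain):
            for name in names:
                if name in params:
                    return params[name][0].strip().lower()
    return None

def count_keywords(referers, rules=KEYWORD_RULES):
    """Tally search phrases across an iterable of referrer URLs."""
    counts = Counter()
    for referer in referers:
        keyword = extract_keyword(referer, rules)
        if keyword:
            counts[keyword] += 1
    return counts
```

Keeping the rules in a plain dictionary means a newly discovered partner site only needs one more entry, and `count_keywords(...).most_common(20)` then gives a quick top-keywords report.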

So, if you are into keyword research, not all is lost. However, the sources of data are going to become scarcer as we go along, so it becomes even more important to keep the data you already have safe so that you can come back to it and analyze it in the future.
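
As one possible way to keep an archive copy that is stripped of IP information, here is a small Python sketch. It assumes the client address is the first space-separated field on each line, as in the common and combined log formats. The function names and the choice of gzip are mine, not a prescribed method.

```python
import gzip

def strip_ip(line):
    """Replace the first field (assumed to be the client address) with '-'."""
    first, sep, rest = line.partition(" ")
    # Lines without any space are passed through unchanged
    return "-" + sep + rest if sep else line

def archive_log(src_path, dst_path):
    """Write a gzip-compressed copy of src_path with client addresses removed."""
    with open(src_path, "r", encoding="utf-8", errors="replace") as src, \
         gzip.open(dst_path, "wt", encoding="utf-8") as dst:
        for line in src:
            dst.write(strip_ip(line))
```

For example, `archive_log("access.log", "access-2011-10.log.gz")` (both file names are placeholders) leaves the original log untouched and writes the anonymised copy next to it.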