Most of us have by now realised that the data Google gives us in its various webmaster and AdWords tools is, at best, horrendously inaccurate. But just how inaccurate is it? I decided to gather some web analytics data on one of my client sites and compare it to the relevant data from Google Webmaster Tools (GWT) and the AdWords Keyword Tool (GAWKT), to see exactly where the pain points are.
Now I’m not the greatest of Excel wizards, so I just stuck with a basic table and a couple of graphs that I feel get the point across quite well.
First of all I went to Google Webmaster Tools and looked at the Search Engine Optimization reports for that website, specifically the Queries report. There I found 10 first-page ranked non-branded keywords, and exported all the relevant metrics to an Excel sheet.
Then I used the Traffic Sources report in the site’s web analytics package (Google Analytics) to see how many visits Google organic search sent to the site for those 10 keywords.
I also ran the keywords through the Google AdWords Keyword Tool to get their global & local search volumes, and I checked each keyword in the SERPs to get its current Google rank (in a private browsing session using unpersonalised search). I’ve been tracking the rankings of nearly all of these keywords for months and I know they haven’t fluctuated much – if at all – in the last 30 days.
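If you want to repeat this exercise on your own site, the four data sources have to be lined up per keyword before anything can be compared. Here is a minimal sketch of that step in Python. The field names (`impressions`, `local_volume` and so on), the keyword and all the numbers are made up for illustration; they are not the actual export columns of any Google tool, nor the data from this experiment.

```python
# Hypothetical sketch: merge per-keyword rows from several sources into one table.
# Field names and numbers are illustrative assumptions, not real export formats.

def merge_by_keyword(*sources):
    """Combine lists of dicts that share a 'keyword' field into one dict per keyword."""
    merged = {}
    for rows in sources:
        for row in rows:
            # Normalise the keyword so "Blue Widgets" and "blue widgets " match up.
            key = row["keyword"].strip().lower()
            record = merged.setdefault(key, {"keyword": key})
            record.update({k: v for k, v in row.items() if k != "keyword"})
    return merged


# One made-up keyword as it might appear in each source.
gwt = [{"keyword": "Blue Widgets", "impressions": 1200, "clicks": 90, "avg_position": 3.2}]
analytics = [{"keyword": "blue widgets", "visits": 95}]
keyword_tool = [{"keyword": "blue widgets ", "local_volume": 880, "global_volume": 2400}]
observed = [{"keyword": "blue widgets", "rank": 2}]

table = merge_by_keyword(gwt, analytics, keyword_tool, observed)
print(table["blue widgets"])
```

Normalising case and whitespace matters more than it looks: each tool formats keywords slightly differently, and an unmatched row silently drops a keyword from the comparison.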
Data Accuracy in Google’s Toolset
Now that I had all this data, I started comparing it. First I compared the AdWords Keyword Tool search volume data to the actual visits that the site received, and mapped the most recent rank of the keywords as well. With this comparison I hoped to see how well search volume and rank correlate with site visits.
This graph doesn’t tell us very much, aside from the fact that the number of visits the site received on a specific keyword had little correlation with the search volumes reported by Google’s keyword tool. Rankings combined with local search volumes do seem to show some correlation, but there’s an apparent disconnect between a keyword’s rank and the percentage of visits it sends to the site (traffic share) – something I explored further in a second graph.
Because the first graph was so inconclusive, I decided to map the data from the Webmaster Tools queries report to the traffic share and ranking data that I’d gathered:
Some points emerged from this graph:
The rank as I registered it myself in the SERPs has a good correlation with the average rank reported by GWT, though aside from a single instance they always differed, with the actual rank usually higher than the average rank GWT reports.
Traffic share (site visits as a percentage of local search volume) and click-through rate as reported by GWT are vastly different, with the traffic share % nearly consistently higher – and usually by quite a margin. Only once were they acceptably close.
When trying to determine which data source is polluting the figures, we can easily rule out some of the more obvious candidates. As it turns out the GWT data is relatively accurate: visits on the website and click figures in GWT roughly match up, though the site’s own data is again usually a wee bit higher:
If we then map the GWT Impressions data against the Local Search Volume data as reported by the AdWords Keyword Tool, a different picture emerges:
Here we see that in every instance the impressions number differs from both the global and local search volumes, and often this difference is quite significant. Usually the impressions number from GWT is much higher, but in a few instances it’s a bit lower than the AdWords Keyword Tool’s search volumes.
This graph, combined with the problem we had reconciling the traffic share data (which is derived from local search volume), can only lead to one conclusion: the search volume data from Google’s AdWords Keyword Tool is the primary polluter of these metrics. While the other data sources show an acceptable level of cohesion, the data from the Keyword Tool invariably fails to match up in any meaningful way.
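The reasoning here comes down to two ratios with different denominators: traffic share divides visits by the Keyword Tool’s local search volume, while GWT’s click-through rate divides clicks by impressions. The sketch below shows that arithmetic in Python. The function names and all the numbers are hypothetical and only there to illustrate the calculation.

```python
# Hypothetical sketch of the two ratios compared in this post.
# All numbers are invented for illustration; they are not the site's real data.

def percentage(part, whole):
    """Return part as a percentage of whole, refusing a zero or negative denominator."""
    if whole <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * part / whole


def traffic_share(visits, local_volume):
    """Analytics visits as a percentage of the Keyword Tool's local search volume."""
    return percentage(visits, local_volume)


def click_through_rate(clicks, impressions):
    """GWT clicks as a percentage of GWT impressions."""
    return percentage(clicks, impressions)


visits, clicks = 95, 90                  # numerators: roughly equal
local_volume, impressions = 880, 1200    # denominators: far apart

share = traffic_share(visits, local_volume)
ctr = click_through_rate(clicks, impressions)
print(f"traffic share: {share:.1f}%  CTR: {ctr:.1f}%")
```

When visits and clicks roughly agree, as they did here, any large gap between traffic share and CTR has to come from the denominators, which is why the comparison of impressions against search volume is the telling one.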
One last thing that I found highly intriguing was that the actual rank of a keyword – as well as the average rank reported by GWT – and that keyword’s click-through rate don’t seem to correlate very well. We often work on the assumption that the top-ranked keyword gets the most clicks, but this data leads me to suspect that might not be an entirely accurate assumption:
The highest CTRs seem to be found with rankings slightly below the top organic slot.
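One way to put a number on “don’t seem to correlate very well” is a Spearman rank correlation between position and CTR, which only asks whether one goes down as the other goes up. What follows is a minimal sketch using plain Python. The positions and CTR values are invented for illustration and are not the figures from the graph.

```python
# Hypothetical sketch: Spearman rank correlation between position and CTR.
# The sample values are invented; they are not the data behind this post.

def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


positions = [1, 2, 3, 4, 5, 6]             # 1 = top organic slot
ctrs = [6.0, 11.0, 9.0, 7.0, 4.0, 3.0]     # made-up CTR percentages

rho = spearman(positions, ctrs)
print(f"Spearman rho: {rho:.2f}")
```

If the top slot always won, rho would be -1. These made-up numbers, where the second and third positions out-click the first, give roughly -0.66. With only ten keywords any such coefficient is noisy, so treat it as a rough indication at most.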
Now this is of course a totally unscientific experiment based on a single website, a limited data set, and no control data whatsoever. So realistically we can’t – and shouldn’t – draw any reliable conclusions from this.
Nonetheless I wanted to share this with you, as what SEO blogging generally lacks is data. I thought I’d add a wee bit of actual data to the ongoing debates about SEO best practices, and specifically about data accuracy. Do with it what you will – I know I shall.