Creating Your Own Analytic Benchmark

February 19, 2009

Google Analytics, as we all know, has a feature you can opt into to benchmark your performance against similar-sized websites in similar industries. But what do you do when that doesn’t quite meet your needs – for example, when you have a fiduciary responsibility to benchmark specific websites against each other? And how do you do it if the sites are competitors (and can’t see each other’s data)? There are both technical issues and analytic issues.

Here at LunaMetrics, we’ve succeeded with this by implementing a single set of GA tracking code across all sites, plus a second set that points to a unique account for each site. This creates both a roll-up and an easy comparison, and it ensures that each site in the benchmark has its own private account that the other sites cannot access. There are situations in which the technical work gets complex; John Henson from Luna is going to blog about that topic next week.
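
To make the idea concrete, here is a minimal sketch of what dual tracking looks like with the classic ga.js API of this era, assuming ga.js is already loaded on the page. The account IDs are placeholders, not real ones, and the real-world complications are what John will cover.

```typescript
// A minimal sketch of dual tracking with the classic ga.js API.
// "UA-ROLLUP-1" and "UA-SITE-2" are placeholder account IDs.
// The _gat object is supplied by ga.js itself; it is declared here
// only so the sketch compiles on its own.
declare const _gat: {
  _getTracker(account: string): { _trackPageview(): void };
};

// One tracker reports to the shared roll-up account (the same ID on every site)...
const rollupTracker = _gat._getTracker("UA-ROLLUP-1");
rollupTracker._trackPageview();

// ...and a second tracker reports to this site's own, private account.
const siteTracker = _gat._getTracker("UA-SITE-2");
siteTracker._trackPageview();
```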

The hardest part of creating a benchmark is getting the organization to agree on what the metrics should be, especially if the organization is benchmarking to show “how they are doing” (instead of benchmarking to answer specific questions). In the former case, the organization’s web managers usually need to see a trial benchmark to decide how well it meets their needs – so it tends to be an iterative process.

At this point – technical issues dealt with, benchmarked indicators agreed upon – the organization will often find that the indicators don’t quite make sense. For example, it’s not helpful to compare visits or unique visitors to (say) The New York Times vs. the Pittsburgh Post-Gazette, because one is so much larger than the other. In situations like these, trending is your friend – it still makes sense to see if visits to The New York Times increase or decrease over time at the same rate as those to the Pittsburgh Post-Gazette, by segment.
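
As a rough sketch of what that trending comparison can look like, the snippet below indexes each site’s monthly visits to its own first month (the visit counts are invented for illustration), so what gets compared is the rate of change rather than the raw volume; the same indexing can be repeated for each segment.

```typescript
// Index each site's monthly visits to its own first month, so sites of very
// different sizes can be compared on their rate of change rather than volume.
// The visit counts are invented for illustration.
const monthlyVisits: Record<string, number[]> = {
  "Large site": [5_000_000, 5_250_000, 5_100_000],
  "Small site": [40_000, 44_000, 46_000],
};

for (const [site, visits] of Object.entries(monthlyVisits)) {
  const baseline = visits[0];
  // Each month expressed as a percentage of the first month.
  const indexed = visits.map((v) => Math.round((100 * v) / baseline));
  console.log(site, indexed); // Large site: [100, 105, 102]; Small site: [100, 110, 115]
}
```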

The skew that large sites bring to the benchmark can be remarkable, even when you are looking at rates rather than absolute metrics. For example, let’s say that you are comparing pageviews/visit among three sites. The first has a million pageviews and 700K visits (1.43 pages/visit). The second has fifty pageviews and twenty-five visits (2 pages/visit). The third has twenty-five pageviews and three visits (8.33 pages/visit). But when we benchmark by adding all the pageviews and dividing by the sum of all the visits, the average will always be very close to 1.43 – i.e., very close to the one site that overwhelms the other sites in size. (And so the data are not very meaningful.) On the other hand, if we benchmark by computing the ratio of pageviews/visit for each site first, and then averaging the ratios across the three sites, the tiny sites with their higher ratios have a disproportionate pull on the average (roughly 3.92 pages/visit here). A good answer might be to provide both benchmarks.
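
A quick calculation with the figures above shows the gap between the two approaches; the 3.92 figure follows directly from averaging the three per-site ratios.

```typescript
// Worked example with the figures from the paragraph above: an aggregate
// pages-per-visit benchmark vs. an average of the per-site ratios.
const sites = [
  { name: "Site 1", pageviews: 1_000_000, visits: 700_000 }, // 1.43 pages/visit
  { name: "Site 2", pageviews: 50, visits: 25 },             // 2.00 pages/visit
  { name: "Site 3", pageviews: 25, visits: 3 },              // 8.33 pages/visit
];

// Aggregate benchmark: total pageviews divided by total visits.
// The largest site dominates, so the result sits right at ~1.43.
const totalPageviews = sites.reduce((sum, s) => sum + s.pageviews, 0);
const totalVisits = sites.reduce((sum, s) => sum + s.visits, 0);
console.log((totalPageviews / totalVisits).toFixed(2)); // "1.43"

// Per-site benchmark: compute each site's ratio first, then average the ratios.
// Now the two tiny sites pull the result up to ~3.92.
const ratios = sites.map((s) => s.pageviews / s.visits);
const averageOfRatios = ratios.reduce((sum, r) => sum + r, 0) / ratios.length;
console.log(averageOfRatios.toFixed(2)); // "3.92"
```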

Finally, you have to be careful about using benchmarks with metrics that add up to 100%, unless you have a very clear understanding of what “good” means to the overall organization. Trends in metrics that add up to 100% are a zero-sum game – one metric can only go up if another one goes down. For example, let’s say that you consider benchmarking visits by medium. One site might bring in lots of visits organically and another, via direct. Unless the benchmarking organization has a clear mandate (e.g., “increase organic visits from search engines”), the acquisition patterns of the two sites mentioned here (organic vs. direct) are not illuminating. You will have learned that they are different, but not how one is doing vs. the other. (New visits vs. repeat visits face the same issue.)
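
For illustration only (the visit counts are made up), here is what that zero-sum property looks like when visits by medium are expressed as shares:

```typescript
// Visits by medium expressed as shares of 100%. Because each site's shares sum
// to 100, one medium's share can only rise if another's falls, so a difference
// between sites tells you they acquire traffic differently, not that one is
// "doing better". The numbers are invented for illustration.
const visitsByMedium: Record<string, Record<string, number>> = {
  "Site A": { organic: 60_000, direct: 25_000, referral: 15_000 },
  "Site B": { organic: 20_000, direct: 65_000, referral: 15_000 },
};

for (const [site, media] of Object.entries(visitsByMedium)) {
  const total = Object.values(media).reduce((sum, v) => sum + v, 0);
  for (const [medium, visits] of Object.entries(media)) {
    const share = (100 * visits) / total;
    console.log(`${site} ${medium}: ${share.toFixed(1)}%`); // shares per site always total 100%
  }
}
```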