Easy Cohort Analysis For Blogs And Articles

August 04, 2014
By Jon Meck

It’s now easier than ever to track and compare performance between articles and blogs. While Google Analytics shows you pageviews and other key metrics, frequent content comparisons are made difficult by the shifting time frames.

How can I compare a blog post that was published this month vs. a blog post that was posted last month? Sure, we can run two different reports, pull them into Excel and start crunching the numbers, but there’s gotta be a better way!

Update: Dorcas Alexander created a nifty Data Studio report to help visualize these results:

[Image: blog-cohort-apples]

Enter Cohort Analysis. You may have heard this term thrown around before, usually in relation to users on your site and when they first became users. The idea here is to group users or sessions by something they have in common, like everyone who first visited in January, or everyone in their first month as a visitor. Avinash and Justin Cutroni both love cohorts, so obviously we should, too!

In this case, we’re going to use Google Tag Manager to put content into cohorts so we can analyze how they performed in similar time frames. We’ll pass these into Google Analytics as Custom Dimensions so they’re available for analysis. It’s actually much easier than it sounds!

Step One: Find the Published/Posted Date

Like the title says, this post is really geared towards things that get published on a certain date. I originally started with blog posts or articles in mind, but this could apply to anything else that gets published on a schedule. For instance, if you’re a deal site, you have new deals going up each day.

We need to find a way to get that publish date into Google Tag Manager. Here are three options, in order of my preference.

1. Data Layer
2. URL Structure
3. Element on Page

Data Layer

If you’re using Google Tag Manager, you likely love the data layer. Put all of your important information into one great place so GTM can snag that data and use it. Add this to your site if it’s not currently there, or ask your developer to help out. For us, it was as easy as adding the following PHP code to our Header template in WordPress.

'postedDate' : ''

The ultimate goal is to get the full date with time information on the data layer. If you have more questions about this step, check out this article that explains more about how the data layer works.
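As a rough sketch of what the rendered page might end up with (this is not our actual template output), the data layer could be declared above the GTM container snippet like this. The timestamp is a made-up example, and the ISO 8601 format with a time zone offset is my assumption for a format that JavaScript’s Date parses consistently.

```javascript
// Hypothetical rendered output; declare this above the GTM container snippet.
// The value is an invented example: full date, time, and time zone offset.
var dataLayer = [{
    'postedDate': '2014-08-04T09:30:00-04:00'
}];

// Sanity check: the string should parse to a valid date.
var parsed = new Date(dataLayer[0].postedDate);
console.log(!isNaN(parsed.getTime()));
```

Whatever format your CMS gives you, it’s worth running it through new Date() in the console once to make sure it parses.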

If you get this to the data layer, setting up the Data Layer Variable macro is pretty easy.

[Image: cohort-dlv-posteddate]

URL Structure

If the data layer step is going to take too long or just isn’t technically possible, it’s time to start getting creative. Where else can you find the publish date? Check out our blog URL structure up in the address bar. For us, we actually have the date available! There’s no time available, but it’s better than nothing!

Our URLs look like this:

/blog/YEAR/MONTH/DAY/blogtitle/

We can use a Custom JavaScript Macro to extract the date from the URL Path, like the example below.

[Image: cohort-customjs]

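function () {
    // {{url path}} is the GTM macro that holds the page path
    var url = {{url path}};
    // Capture YEAR/MONTH/DAY from paths like /blog/2014/08/04/blogtitle/
    var arr = url.match(/^\/blog\/(\d{4}\/\d{2}\/\d{2})\/.+/);
    if (arr) {
        return arr[1];
    } else {
        return null;
    }
}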
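To see what that regular expression does outside of GTM, here is the same pattern wrapped in a plain function you can paste into a console. The function name and the sample paths are made up for illustration.

```javascript
// Same pattern as the macro, but taking the path as an argument
// so it can be tried without GTM. extractPostedDate is a hypothetical name.
function extractPostedDate(path) {
    var match = path.match(/^\/blog\/(\d{4}\/\d{2}\/\d{2})\/.+/);
    return match ? match[1] : null;
}

console.log(extractPostedDate('/blog/2014/08/04/easy-cohort-analysis/')); // "2014/08/04"
console.log(extractPostedDate('/blog/2014/08/04/'));                      // null (no title after the date)
console.log(extractPostedDate('/about/'));                                // null
```

Note that date archive pages without a title segment don’t match, so they won’t be lumped in with actual posts.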

 

Element on the Page

Lastly, I’ll mention using an element on the page. Look to see if the date appears somewhere that you can steal it from the page itself. Right-click on the date, inspect it, and check whether it’s wrapped in a span or other HTML tag with a unique ID.

[Image: cohort-inspect-span-smaller]

If you’re so fortunate to have this available to you, this DOM Element macro is pretty easy to set up as well!

[Image: cohort-html-publishdate]

Alternately, try viewing your source code and doing a CTRL+F for variations of your publish date. It may appear in a hidden field or somewhere else on the page in a uniquely identified tag that you can use.

Note of Caution: There’s a reason this is my least preferred method. If you use a DOM Element, there’s a good chance it might not be available when the Pageview Tag fires. Use the highest DOM element on the page that you can, but if that’s not working reliably, you may have to alter the Rule for your Pageview tag to wait until gtm.dom. Any time you delay your Pageview Tag, you may lose a few Pageviews, so keep this in mind!
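If you’d rather read the element yourself in a Custom JavaScript Macro, a defensive version might look something like the sketch below. The function name and the publish-date ID are hypothetical, and the document is passed in as an argument only so the logic can be tried outside a browser. The point is the guard: when the element isn’t there yet, return null instead of throwing.

```javascript
// Hypothetical helper: read the text of an element by ID, or return null
// when the element is missing (for example, not yet parsed when the tag fires).
function readPublishDate(doc, elementId) {
    var el = doc.getElementById(elementId);
    if (!el) {
        return null;
    }
    // Trim surrounding whitespace; treat an empty element as missing.
    var text = (el.textContent || '').replace(/^\s+|\s+$/g, '');
    return text || null;
}

// Stand-in for the real document, for illustration only.
var fakeDocument = {
    getElementById: function (id) {
        return id === 'publish-date' ? { textContent: '  August 04, 2014 ' } : null;
    }
};

console.log(readPublishDate(fakeDocument, 'publish-date')); // "August 04, 2014"
console.log(readPublishDate(fakeDocument, 'not-on-page'));  // null
```

In a real page you would pass document as the first argument.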

Step Two: Calculating Days/Weeks/Months Since Posted

Now that we have the date that the blog/article was posted, we can quickly calculate numbers of days/weeks and… months? Perhaps.

Again, I’ll reemphasize that it’s best to have the time that the content was published AND the time zone! If your visitor is coming from a different time zone, we want to accurately count how long it has been since the content has actually been on the site.
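Here’s a quick illustration of why the time zone matters, using made-up timestamps. When the offset is part of the string, the same moment parses to the same instant no matter how it’s written, so the elapsed time doesn’t depend on where the visitor is.

```javascript
// Two ways of writing the same moment (example values only):
// 9:30 AM at UTC-4 is 1:30 PM UTC.
var postedEastern = new Date('2014-08-04T09:30:00-04:00');
var postedUtc = new Date('2014-08-04T13:30:00Z');

// Both parse to the same number of milliseconds since the epoch.
var sameInstant = postedEastern.getTime() === postedUtc.getTime();
console.log(sameInstant); // true
```

A date string with no offset leaves the browser to decide which time zone you meant, and that’s where the counts can drift.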

Set Up Your Custom JavaScript Macros

Now that we have the publish date, let’s grab today’s date and take the difference.

[Image: cohort-daysSince2]

daysSincePosted – I’m going to round up here, so the first 24 hours will count as Day 1, and so on.

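function() {
    var postDate = new Date({{postedDate}});
    var currDate = new Date();
    // Milliseconds elapsed, converted to days and rounded up
    var daysSincePost = Math.ceil((currDate.getTime() - postDate.getTime()) / 1000 / 60 / 60 / 24);
    // NaN (missing or unparseable date) and 0 both fall through to null
    if (daysSincePost) {
        return daysSincePost;
    } else {
        return null;
    }
}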
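To check the rounding without waiting a day, here is a sketch of the same calculation as a plain function where you can pass in the “current” time. The function name, the optional second argument, and the timestamps are all assumptions for illustration.

```javascript
var MS_PER_DAY = 1000 * 60 * 60 * 24;

// Hypothetical testable version of the macro: `now` is optional and
// defaults to the current time, as in the GTM version.
function daysSincePosted(postedDate, now) {
    var postTime = new Date(postedDate).getTime();
    var currTime = (now || new Date()).getTime();
    var days = Math.ceil((currTime - postTime) / MS_PER_DAY);
    // NaN (bad date) and 0 are both falsy, so they return null.
    return days ? days : null;
}

var posted = '2014-08-04T09:30:00-04:00'; // example value only
console.log(daysSincePosted(posted, new Date('2014-08-04T10:30:00-04:00'))); // 1 (one hour in)
console.log(daysSincePosted(posted, new Date('2014-08-05T10:30:00-04:00'))); // 2 (25 hours in)
console.log(daysSincePosted('not a date', new Date()));                      // null
```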

weeksSincePosted – We’ll just take days and divide by 7. Again, we’ll start in Week 1.

function() {
    var postDate = new Date({{postedDate}});
    var currDate = new Date();
    var daysSincePost = Math.ceil((currDate.getTime() - postDate.getTime()) / 1000 / 60 / 60 / 24);
    // Days 1-7 are Week 1, days 8-14 are Week 2, and so on
    var weeksSincePost = Math.ceil(daysSincePost / 7);
    if (weeksSincePost) {
        return weeksSincePost;
    } else {
        return null;
    }
}

monthsSincePosted – Months are really the toughest thing to do. Some months have 31 days, some have 30, and February just hates consistency. If we’re talking about time passed, then months just don’t work well. My advice here is to just go with buckets of 30 days. bucketsOf30DaysSincePost doesn’t have quite the same ring though, so call it months and add an asterisk to your reports.

function() {
    var postDate = new Date({{postedDate}});
    var currDate = new Date();
    var daysSincePost = Math.ceil((currDate.getTime() - postDate.getTime()) / 1000 / 60 / 60 / 24);
    // "Months" here are buckets of 30 days: days 1-30 are Month 1
    var monthsSincePost = Math.ceil(daysSincePost / 30);
    if (monthsSincePost) {
        return monthsSincePost;
    } else {
        return null;
    }
}
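Since the week and month macros are the same math with a different bucket size, here is a small sketch that shows where the boundaries land. The helper name is made up; it isn’t one of the macros above.

```javascript
// Hypothetical helper: turn a day count into a 1-based bucket number.
function bucketSincePosted(daysSincePost, bucketSize) {
    if (!daysSincePost || daysSincePost < 1) {
        return null;
    }
    return Math.ceil(daysSincePost / bucketSize);
}

console.log(bucketSincePosted(7, 7));   // 1 (last day of Week 1)
console.log(bucketSincePosted(8, 7));   // 2 (first day of Week 2)
console.log(bucketSincePosted(30, 30)); // 1 (last day of "Month" 1)
console.log(bucketSincePosted(31, 30)); // 2 (first day of "Month" 2)
```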

Step Three: Passing this Information In

Now that you have your Macros up and running, it’s time to pass these in as Custom Dimensions (Universal Analytics only). I created my Custom Dimensions in Property Settings and then added them onto the Pageview Tag under Custom Dimensions.

[Image: cohort-customdim]

[Image: cohort-customdim-gtm]

You’ll notice I also passed in the posted date. This is mostly for flexibility, just in case we need it for something else down the line!
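GTM handles this for you once the tag is configured, but if it helps to picture what gets sent, here is a rough sketch of the equivalent fields for an analytics.js pageview. The dimension index numbers (1 through 4) and the helper name are assumptions; use whatever indexes Google Analytics assigned when you created your Custom Dimensions.

```javascript
// Hypothetical mapping of values to Custom Dimension indexes.
// Your index numbers will likely differ.
var DIMENSION_INDEXES = {
    postedDate: 1,
    daysSincePosted: 2,
    weeksSincePosted: 3,
    monthsSincePosted: 4
};

// Build the fields object, skipping anything that came back null.
function buildCohortFields(values) {
    var fields = {};
    for (var name in DIMENSION_INDEXES) {
        if (DIMENSION_INDEXES.hasOwnProperty(name) && values[name] != null) {
            fields['dimension' + DIMENSION_INDEXES[name]] = String(values[name]);
        }
    }
    return fields;
}

var fields = buildCohortFields({
    postedDate: '2014-08-04T09:30:00-04:00', // example value only
    daysSincePosted: 3,
    weeksSincePosted: 1,
    monthsSincePosted: null
});
console.log(fields);

// On a page with analytics.js loaded, this would go out as:
// ga('send', 'pageview', fields);
```

Skipping null values keeps pages without a publish date from showing up under a literal “null” cohort.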

Step Four: Custom Reports

Now that you have the info in Google Analytics, you can create all kinds of custom reports. Two simple custom reports can be set up like the ones below. They use a longer date range, but filter the data down to only each article’s first week or first month.

[Image: cohort-customreport]

Or, after enough time has passed, it will be easy to export the full list and pivot it into a triangle chart with blog title down the left side and week or month across the top.

[Image: cohort-triangledate]

[Image: cohort-triangle]
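If you’d rather script the pivot than do it in a spreadsheet, the idea can be sketched in a few lines. The row shape, the function name, and the numbers here are all invented for illustration; a real export will have whatever columns your custom report includes.

```javascript
// Hypothetical pivot: rows of (title, week, pageviews) become
// one entry per title with pageviews keyed by week.
function pivotByWeek(rows) {
    var table = {};
    rows.forEach(function (row) {
        if (!table[row.title]) {
            table[row.title] = {};
        }
        var week = row.weeksSincePosted;
        table[row.title][week] = (table[row.title][week] || 0) + row.pageviews;
    });
    return table;
}

// Made-up export rows. The newer post has fewer weeks of data,
// which is what gives the chart its triangle shape.
var exported = [
    { title: 'Older Post', weeksSincePosted: 1, pageviews: 500 },
    { title: 'Older Post', weeksSincePosted: 2, pageviews: 120 },
    { title: 'Older Post', weeksSincePosted: 3, pageviews: 40 },
    { title: 'Newer Post', weeksSincePosted: 1, pageviews: 650 }
];

var triangle = pivotByWeek(exported);
console.log(triangle['Older Post']); // weeks 1, 2 and 3
console.log(triangle['Newer Post']); // week 1 only
```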

Note of caution: Because these are Dimensions and not Metrics, we won’t be able to do anything inside the Google Analytics interface that resembles Greater than or Less than selections. If you wanted to get everything before 60 days, you could use a regular expression like so: ^[1-5]?[0-9]$. Otherwise, you can always spit the data out into another program to crunch.
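It’s worth testing a pattern like that before you rely on it in a report filter. A quick console check, with sample values I picked to hit the edges:

```javascript
// The pattern from above: an optional tens digit of 1-5, then any single digit.
var underSixtyDays = /^[1-5]?[0-9]$/;

var samples = ['1', '9', '10', '59', '60', '100'];
var results = samples.map(function (value) {
    return underSixtyDays.test(value);
});
console.log(results); // [true, true, true, true, false, false]
```

The pattern also matches 0, which is harmless here because the day counts start at 1.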