Additional Tracking with Tag Management Systems: Data Attributes or Data Layer?

December 3, 2019

A big part of successfully building out tracking on a website is strategizing the best method of implementation. The more thought out your strategy, the smoother the implementation and launch of that configuration.

Thinking through this strategy is particularly important when dealing with:

Tracking that needs to work consistently across many sites
Tracking that needs to work hand-in-hand with existing tracking
Tracking that needs to include data that doesn't already exist on the page
Single-page applications (SPAs)

Most tag management systems (TMS) offer flexible tools within the interface to track on-page interactions, either through triggers, listeners, or custom JavaScript. However, additional pieces of information are often needed in order to track a specific item, or to report on it in a specific way. In those cases, we look to developers to make that information available.

There are two main approaches to custom tracking when dealing with these situations:

Data attributes
Data layer pushes

Data Attributes

Data attributes are custom attributes, prefixed with "data-", that exist on the HTML elements that make up your site and store contextual information about them. These attributes can be utilized for various reasons, but when it comes to tracking, we primarily use them to (1) differentiate between elements and (2) store information.

(1) They can be referenced by CSS selectors in trigger conditions, allowing us to zero in on specific elements on a page, like a single button or a group of links.

(2) They can also store information that we scrape from the DOM and then use in trigger conditions, as logic in custom JS variables, HTML tags, etc., or send straight to Google Analytics (GA) as custom dimensions or in an event field.

If we're looking to add particular context to the page around a specific button or page element, data attributes are a passive way to make that context available to our TMS, without being visible to the website visitor.
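
As a rough illustration of both uses, here is a minimal sketch in plain JavaScript. The attribute names data-cta-type and data-cta-title and the two helper functions are hypothetical, chosen only for this example; they are not part of any TMS.

```javascript
// Assumed markup (hypothetical attribute names):
//   <a href="/pricing" data-cta-type="button" data-cta-title="See pricing">See pricing</a>

// (1) Differentiate: does this element carry our custom attribute?
function isCta(element) {
  return element.matches("[data-cta-type]");
}

// (2) Store information: read the context back out of the element.
// The browser exposes data-cta-type as element.dataset.ctaType.
function readCtaContext(element) {
  return {
    type: element.dataset.ctaType,
    title: element.dataset.ctaTitle
  };
}
```

In a TMS, the same selector would typically sit in a trigger condition, and the same dataset lookups would typically sit in a custom JavaScript variable.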

Data Layer Pushes

The data layer exists as part of most tag management systems. It is a stable JavaScript object (JSO) that contains the information we want to pass to MarTech solutions like Google Analytics or Adobe Analytics. Tag management solutions like Google Tag Manager (GTM), Adobe Launch, or Tealium can retrieve information from the data layer, and data layer variables can then be referenced in tags, triggers, and variables (GTM) or in rules, conditions, and data elements (Launch).

The site's developer controls what data is stored and when it is stored. We can then hook into that information using our tag management system, sending whatever data wherever we want to send it. Often, the data layer is declared above the TMS snippet and includes a wealth of information about the page, the user, or the content. This static information can then be sent to different platforms.

In addition, developers can send, or "push," information to the data layer when a specific action occurs, like a click on a button. The developer is in full control here, deciding when this extra information should be sent to the data layer. The TMS can then react to that information.
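
A minimal sketch of such a page-load data layer, assuming GTM's default global name dataLayer (other systems use different objects). Every key and value below is a hypothetical placeholder, not a required schema.

```javascript
// Runs before the TMS container snippet, so the values already exist
// when the container loads. In a browser, globalThis is window.
globalThis.dataLayer = globalThis.dataLayer || [];

// Hypothetical page-level context; the keys and values are illustrative only.
globalThis.dataLayer.push({
  page: { type: "article", category: "analytics" },
  user: { loggedIn: false }
});
```

The "|| []" guard keeps the script from overwriting a data layer that another script has already created.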

Similar to data attributes, data stored in the data layer can be used in different ways. The data can be used as conditions in triggers, as logic in custom JavaScript variables, HTML tags, etc., or sent straight to an analytics platform to populate custom fields or in an event field.

Data Attributes === Data Layer Push?

So far, tracking via data attributes and tracking via data layer sound nearly identical, so what are the differences between tracking with data attributes and tracking with the data layer?

The Obvious Differences

Data attributes exist in the website's DOM, so it's easy to see them implemented on a webpage. They are just key-value pairs added to elements on the website, so developers can place the attributes where we need them and then they're done, with no additional logic needed. This solution puts the emphasis on the TMS setup: the context is available on the page, and the team setting up the logic in the TMS is responsible for deciding how to use it.

Data layer pushes shift the emphasis to the developers. They can use a data layer push to "send" extra info to the TMS, but they're responsible for deciding what information is sent and when. The logic that populates the data layer is usually implemented via a script, hidden from view. For example, if information needs to be pushed to the data layer on page load, the logic to handle that situation will exist in the script. If information needs to be pushed to the data layer when a user interacts with something, that logic will also exist in the script, etc.
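
To make the interaction case concrete, here is a hedged sketch of what such a site script might look like. The function name wireCtaTracking, the ".cta" selector, and the pushed field names are all assumptions made for this example.

```javascript
// Hypothetical site script: push context to the data layer when a CTA is clicked.
// `root` is anything with querySelectorAll (normally `document`).
function wireCtaTracking(root, dataLayer) {
  root.querySelectorAll(".cta").forEach(function (element) {
    element.addEventListener("click", function () {
      // The developer decides what is sent and when it is sent.
      dataLayer.push({
        event: "ctaClick",
        cta: {
          type: element.tagName.toLowerCase(),
          title: element.textContent.trim()
        }
      });
    });
  });
}

// In the browser this would be called once the DOM is ready, e.g.:
//   window.dataLayer = window.dataLayer || [];
//   wireCtaTracking(document, window.dataLayer);
```

Nothing about this logic is visible in the page markup, which is the trade-off described above: the TMS only sees the resulting event.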

Which Solution is More Robust?

For information needed at the time of page load, the initial data layer is available immediately and is the recommended solution. For information about interactions that happen after page load, the data layer is usually considered more robust than scraping information from the DOM. This is because the developers, who know the site best, control what information is populated and when it is pushed into the data layer. However, this may put extra strain on your developers for ongoing maintenance.

Data attributes would be the next best option, especially compared with relying on existing DOM attributes. Unlike other DOM attributes like “class” or “id,” data attributes added for tracking are custom and therefore less likely to change. Like the data layer, they require some strategy up front, but they can hopefully be built into templates and become standardized.

There is another aspect to consider here: which solution will be more likely to be "forgotten" by developers, causing tracking to be broken by accident. Some say the data layer is more robust from this standpoint and developers will be less likely to remove an entire script from a website, even if the proper documentation doesn't indicate why it's there. Some argue that data attributes are less likely to be accidentally removed — as long as they're named descriptively — because it may be more obvious why they exist.

Developers can forget about either implementation, so unless the developers you'll be working with have a strong preference one way or the other (likely due to past experiences implementing similar tracking or as a result of documentation practices), I don't think this aspect of robustness is a good argument for or against either option.

Which Solution Optimizes Time, Money, and Developer Resources?

The data layer is generally more intensive for developers to implement and maintain. However, depending on the site, going the data layer route could significantly decrease the time needed to configure tracking in GTM.

Your website and the teams implementing tracking will determine which route is the most pragmatic.

Can We Use Data Attributes for Some Tracking and Data Layer for Others?

Yes, you can! You do not have to go exclusively one route or the other. However, in my experience building this type of tracking, developers tend to push back on tracking requests that include both data attributes and data layer, simply because they are such similar options; there's something to be said for consistency.

But, Which Should I Choose?

Still not sure why you would choose one over the other?

I've given you a few different aspects to weigh before you decide which route to take with your tracking. They are such similar solutions that either route could be valid for you. However, sometimes there is a right and a wrong solution.

Here is an example situation to show you why thinking this decision through can be important: a client you work with has nine different websites and a new, 10th site is launching. In this example, we will be using GTM and GA.

Let's take a look at the tracking that is already implemented for the nine existing websites. This tracking will also apply to the new site:

Site navigation
Downloads
Outbound links
Contact links

Your client decides they want to implement two new pieces of tracking for the 10th site:

1. CTAs

On this imaginary new website, the CTAs are different types of buttons and links, so there is no way to put together a single trigger that tracks all of the CTAs across the site. Because of this, we need to write tracking specifications so the developers can implement tracking on the buttons and links that are CTAs. Should we use data attributes or the data layer for this tracking? Let's pick data layer:

dataLayer.push({
 'event': 'ctaClick',
 'cta': {
   'type': '<< CTA TYPE >>',
   'title': '<< CTA TITLE >>'
 }
});

2. All link interactions across the new site that are not already being tracked as an event

Side note: it's usually not a good idea to implement tracking that tracks "all links" for a multitude of reasons. I'm using this as an example because it's a more common request than you'd think!

We can use GTM's link click trigger to capture all link clicks and put blocking triggers on the tag for interactions we're already tracking:

Site navigation
Downloads
Outbound links
Contact links
CTAs

Now that we've made the above decisions, let's configure this tracking in GTM.

1. CTAs

2. All link interactions across the new site (that are not already being tracked as events)

Pretty straightforward, right? NOPE. Placing our data layer-based trigger "custom - ctaClick" as a blocking trigger on our "All Link Clicks" tag will not stop it from firing when a user clicks on a CTA. Why not?

When the CTA is clicked — the link "More information..." in the screenshot below — three events fire:

Click: GTM's click listener fired this event when it detected a click on the page
Link Click: GTM's link click listener fired this event when it detected a link click on the page
ctaClick: Site developers implemented our data layer-based CTA tracking specs to fire on any CTA click

If we take a look at the tags that fired, we can see that both the "All Link Clicks" and the "CTA" tags fired:

This means all CTAs that are also links will be double-tracked as both "cta" and "link click" events in GA. Not ideal.

This is happening because each tag is evaluated on each event, independent of one another. The Link Click event is a different, separate event from the ctaClick event, so we can't use our ctaClick trigger to block tags firing on the Link Click event:
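
The behavior can be sketched with a deliberately simplified model of per-event evaluation. This is not GTM's actual implementation; the tag objects and the firedTags helper are hypothetical, and the event order is illustrative. Only the event names gtm.click and gtm.linkClick are the ones GTM's listeners push.

```javascript
// A trigger is modelled as a predicate over a single event. A tag fires on an
// event when a firing trigger matches and no blocking trigger matches that
// same event.
const isLinkClick = (e) => e.event === "gtm.linkClick";
const isCtaClick = (e) => e.event === "ctaClick";

const tags = [
  { name: "All Link Clicks", firing: [isLinkClick], blocking: [isCtaClick] },
  { name: "CTA", firing: [isCtaClick], blocking: [] }
];

function firedTags(event) {
  return tags
    .filter((tag) => tag.firing.some((t) => t(event)) && !tag.blocking.some((t) => t(event)))
    .map((tag) => tag.name);
}

// One click on a CTA link produces three separate events.
const events = [
  { event: "gtm.click" },
  { event: "gtm.linkClick" },
  { event: "ctaClick" }
];

// Each event is evaluated on its own, so the ctaClick blocking trigger is
// never true while the gtm.linkClick event is being evaluated.
const fired = events.flatMap((event) => firedTags(event));
console.log(fired); // both "All Link Clicks" and "CTA" fire
```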

We need a link click-type trigger to block clicks on CTAs, not a data layer-type trigger. If we had chosen the data attribute route instead of the data layer route for our CTA specifications, we could have used a link click-type trigger to track clicks on CTAs and could then also use that link click-type trigger as a blocking trigger on the "All Link Clicks" tag:

Now, we can add our link click-type trigger as a blocking trigger on the "All Link Clicks" tag, allowing us to successfully block "All Link Clicks" tag when a CTA is clicked.
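
Here is a simplified model of why this works, again not GTM's actual implementation. The attribute name data-cta-type is a hypothetical choice, and the predicate only approximates a link click trigger with a "Click Element matches CSS selector" condition. The gtm.element key is where GTM's click listeners place the clicked element.

```javascript
const isLinkClick = (e) => e.event === "gtm.linkClick";

// Link click-type trigger narrowed by a data attribute on the clicked element.
const isCtaLinkClick = (e) =>
  isLinkClick(e) && e["gtm.element"].matches("[data-cta-type]");

const tags = [
  { name: "All Link Clicks", firing: [isLinkClick], blocking: [isCtaLinkClick] },
  { name: "CTA", firing: [isCtaLinkClick], blocking: [] }
];

// A tag fires on an event when a firing trigger matches and no blocking
// trigger matches that same event.
function firedTags(event) {
  return tags
    .filter((tag) => tag.firing.some((t) => t(event)) && !tag.blocking.some((t) => t(event)))
    .map((tag) => tag.name);
}
```

Because the firing and blocking triggers now evaluate the same Link Click event, a link click on an element carrying the attribute fires only "CTA" in this model, while a click on any other link fires only "All Link Clicks".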

In this example, creating tracking based on data attributes makes more sense than using data layer. If the "All Link Clicks" tracking had not been requested, then data layer would have been a valid option. One seemingly simple request can completely change the approach to tracking.

Here is a flow chart to help you decide which custom tracking route is right for you:

Custom Tracking Data Attributes or Data Layer flow chart

Additional Resources

If you’re interested in learning more about using data attributes in tracking, we suggest giving Track More Click Detail With Data Attributes And Google Tag Manager a read.

If you want to go the data layer route, Unlock The Data Layer: A Non-Developer's Guide To Google Tag Manager and Track More Click Detail With Data Attributes And Google Tag Manager are great places to start — though both focus specifically on GTM’s data layer.