How Google Analytics Uses Cookies To Identify Users

December 23, 2019
By Abigail Matchett,
Lead, Enterprise Analytics

Let’s talk about web cookies and Google Analytics. Warning: this blog post does not talk about edible cookies, but we will dive deep into the morsels that make up our Google Analytics data (pun intended!). 

The Basics of Cookies

A cookie is a small bit of information that gets stored on your computer. Cookies are browser-specific, which means Chrome and Firefox will not be able to access each other’s cookies. Cookies are also site-specific, which means that Amazon.com will not be able to view the cookies that you have saved on Facebook.com.

Common Types of Cookies

Every cookie has a domain associated with it, and many cookies are specific to your website’s domain (in our case, this would be Bounteous.com). These types of cookies are considered “First-Party” and are generated to access same-site information relative to the website domain in your address bar.

Your website developers use First-Party cookies to store information about a user’s activity from page to page, even when the user is logged in. Common use cases for cookies include multi-session logins (keeping a user logged in even when they close the browser tab) or remembering a user’s preferences.

However, the domain associated with a cookie often does not match your website’s domain (the domain in the address bar). When the domains differ, these cookies are considered cross-site or “Third-Party” cookies.

This distinction between First-Party and Third-Party will become more important in a moment, but for now, let’s focus on First-Party cookies and how Google Analytics uses them. 

The Basics of Google Analytics Cookies

All versions of Google Analytics tracking that you can embed on your website use cookies to store and remember valuable pieces of information. Today, we’ll focus on the Universal Analytics implementation from Google Analytics, which relies on the persistent _ga cookie.

The _ga cookie stores one valuable piece of information: your Client ID.

It looks something like this:

GA1.2.12349876.1500644855

This Client ID represents you, the user, who is visiting the website where the tracking code is implemented. More specifically, Google Analytics uses the _ga cookie to recognize your unique combination of browser and device.

Note: This post focuses on cookies used to represent Users. There is a great way to identify users on your site who have logged in: a User ID that gets passed to Google Analytics. For more details, check out the section at the bottom!

How Does This Cookie Get Its Value?

For most default implementations, when a user arrives on your website the Google Analytics code executes and looks to see if there is a _ga cookie already present. If there is one, great – you are a returning user! If a _ga cookie is not present, it will randomly generate a new Client ID for the new user (also known as New Users in Google Analytics).

This Client ID is in the form of four sets of numbers that are generated and then stored in a cookie on that user’s browser and computer.
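The check-then-create logic described above can be sketched in a few lines of Python. This is only an illustration of the flow, not the actual Google Analytics code: the `cookies` dictionary stands in for the browser's cookie jar, the function name is hypothetical, and the `GA1.2` prefix is hard-coded to mirror the example value shown earlier.

```python
import random
import time


def get_or_create_client_id(cookies, now=None):
    """Return (cookie_value, is_new_user) for a dict acting as a cookie jar."""
    existing = cookies.get("_ga")
    if existing is not None:
        # A _ga cookie is already present: this is a returning user.
        return existing, False

    # No cookie yet: build a new value from a random number and the
    # time of this first visit, in whole seconds.
    first_visit = int(now if now is not None else time.time())
    random_id = random.randint(1, 2_147_483_647)
    value = f"GA1.2.{random_id}.{first_visit}"  # "GA1.2" prefix is assumed
    cookies["_ga"] = value
    return value, True
```

Calling the function twice with the same `cookies` dictionary returns the same value both times, with `is_new_user` set to `True` only on the first call.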

What Does This Number Mean?

The cookie value is a combination of 4 distinct values: version, domain, identity, and first visit time. Strictly speaking, the last two values together form the Client ID itself.

The first number is fixed at 1, which represents the version of the cookie format that’s being used.

The second number, which is the number 2 in the above example, is dependent on the domain where the cookie is set. An easy way to think of this is to count the dot-separated parts of the domain the cookie is written to (e.g. bounteous.com = 2, www.bounteous.com = 3).

The third set of numbers is randomly generated to identify different users. (Technically, it is a randomly generated integer that fits in a signed 32-bit value, or anything between 1 and 2,147,483,647.)

The last set of numbers is a timestamp of when a user first visited a site. This timestamp is rounded to the nearest second (not millisecond) of the user’s first visit.
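To make the four parts concrete, here is a small Python sketch that splits a _ga value into the fields described above. The class and function names are made up for this example, and the `client_id` property assumes that the third and fourth fields are the portion that identifies the user.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class GaCookie:
    version: str          # e.g. "GA1"
    domain_depth: int     # the second number
    random_id: int        # the randomly generated third number
    first_visit: datetime  # the fourth number, as a UTC datetime

    @property
    def client_id(self) -> str:
        # Third and fourth fields joined back together.
        return f"{self.random_id}.{int(self.first_visit.timestamp())}"


def parse_ga_cookie(value: str) -> GaCookie:
    """Split a _ga cookie value into its four dot-separated fields."""
    parts = value.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 dot-separated fields, got {len(parts)}")
    version, depth, random_id, first_visit = parts
    return GaCookie(
        version=version,
        domain_depth=int(depth),
        random_id=int(random_id),
        first_visit=datetime.fromtimestamp(int(first_visit), tz=timezone.utc),
    )
```

For the example value GA1.2.12349876.1500644855, the timestamp field decodes to July 21, 2017 at 13:47:35 UTC.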

How Does Google Analytics Use Your Cookie?

The _ga cookie is used to uniquely identify users, specifically with the third and fourth set of numbers explained above. Each action that you make on a website or app is called a Hit, and sends data along with your unique Client ID to Google Analytics. 

Google Analytics then looks for hits that have the same distinct Client ID, and connects hits that occur during the same window of time into what we call Sessions. A User, meaning a unique Client ID, will have anywhere from one to many sessions associated with it. Within our Google Analytics reporting, the Client ID is also responsible for data collection behind both the New and Returning Users dimensions.

We won’t spend too much time on Scope today, but check out this blog post for more insights: Understanding Scope in Google Analytics Reporting.

In the image below, we can see a visualization of the Client ID (“cid”), and how the cookie can be used to collect hits and create sessions for a specific user.

[Image: visualization of the Client ID]
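The hit-to-session grouping can also be sketched in code. The version below is a simplification under stated assumptions: hits are plain `(client_id, unix_seconds)` pairs, and a new session starts after 30 minutes of inactivity. Real Google Analytics processing applies additional rules that this sketch ignores, and the function name is hypothetical.

```python
from collections import defaultdict

SESSION_TIMEOUT_SECONDS = 30 * 60  # assumed inactivity window


def build_sessions(hits, timeout=SESSION_TIMEOUT_SECONDS):
    """Group (client_id, unix_seconds) hits into per-user sessions."""
    by_client = defaultdict(list)
    for client_id, timestamp in hits:
        by_client[client_id].append(timestamp)

    sessions = {}
    for client_id, timestamps in by_client.items():
        timestamps.sort()
        user_sessions = [[timestamps[0]]]
        for ts in timestamps[1:]:
            if ts - user_sessions[-1][-1] > timeout:
                # Too long since the previous hit: start a new session.
                user_sessions.append([ts])
            else:
                user_sessions[-1].append(ts)
        sessions[client_id] = user_sessions
    return sessions
```

Each key in the result is one User (one Client ID), and each inner list is one Session made up of that user's hits.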

All About Cookie Persistence 

Cookie persistence is an ever-changing topic, but we’ll highlight some of the most important concepts for you to know and be aware of, including ITP (Intelligent Tracking Prevention), subdomain and cross-domain tracking. 

Device Persistence

Users’ cookies are not shared across devices. Different browsers or devices will result in different cookies and therefore different users. How many browsers do you use to access the internet? Do you ever visit the same sites on different devices? You can spot the problem here.

Browser Persistence

Each Client ID is browser-specific, so it is not passed to different browsers on the same device, such as two different browsers on an individual’s computer.

For most browsers (such as Google Chrome), the _ga cookie, by default, lasts for two years of inactivity. Every time a returning user visits your site, the expiration is extended to two years from that latest visit. You can adjust this if necessary via Google Tag Manager or by modifying the on-page Google Analytics script.
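The sliding two-year window is easy to see with a little date arithmetic. This sketch assumes a lifetime of 63,072,000 seconds (730 days) and uses hypothetical function names; it only models the expiration date, not how the browser stores the cookie.

```python
from datetime import datetime, timedelta

COOKIE_LIFETIME = timedelta(seconds=63_072_000)  # two years, as 730 days


def next_expiry(visit_time):
    """Each visit pushes the expiration to two years after that visit."""
    return visit_time + COOKIE_LIFETIME


def still_recognized(previous_visit, current_visit):
    """True if the cookie set on the previous visit has not expired yet."""
    return current_visit < next_expiry(previous_visit)
```

A user who comes back 18 months after their last visit is still recognized, and that visit resets the clock for another two years. A user who stays away for more than two years starts over with a new Client ID.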

However, recent privacy changes in browsers such as Safari and Firefox have changed the default persistence of the Google Analytics _ga cookie for many users. In Safari, this change is a result of Intelligent Tracking Prevention, or ITP.

What is Intelligent Tracking Prevention (ITP)?

ITP is Apple’s way of limiting the ability to track Safari users across websites using third-party cookies; Firefox ships a comparable feature called Enhanced Tracking Protection (ETP). ITP (and its reach) continues to evolve, and the window of cookie persistence continues to shrink.

Apple first introduced its plan for Intelligent Tracking Prevention in 2017, with ITP 2.1 going live on March 25, 2019. With this release, all client-side cookies (including trusted First-Party cookies such as those set by Google Analytics) were capped to seven days of storage.

This may seem like a brief window, as many users do not visit a website each week. However, with ITP 2.2 and ITP 2.3 (the latter announced in September 2019), client-side cookies can be capped to just 24 hours of storage for Safari users on macOS Mojave 10.14.5, for example when the visitor arrives through a decorated link from a domain that Safari has classified as a tracker. In that situation, if a user visits your site on Monday and returns on Wednesday, they will be granted a new _ga cookie by default.
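To get a feel for how a shorter cookie lifetime inflates user counts, the sketch below counts how many Client IDs a single browser would receive for the same set of visits under different lifetimes. It assumes every visit rewrites the cookie and restarts its lifetime, and the function name is hypothetical; treat it as a back-of-the-envelope model rather than a description of how any browser enforces its cap.

```python
from datetime import datetime, timedelta


def count_client_ids(visits, lifetime):
    """Count how many Client IDs one browser would be assigned."""
    ids = 0
    expires_at = None
    for visit in sorted(visits):
        if expires_at is None or visit >= expires_at:
            # No cookie yet, or it expired before this visit: new Client ID.
            ids += 1
        # Assumption: each visit rewrites the cookie and restarts its lifetime.
        expires_at = visit + lifetime
    return ids


monday = datetime(2019, 9, 23, 9, 0)
visits = [monday, monday + timedelta(days=2), monday + timedelta(days=10)]
```

With these three visits (Monday, Wednesday, and Thursday of the following week), a two-year lifetime yields one Client ID, a seven-day cap yields two, and a 24-hour cap yields three, even though the same person made every visit.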

What Does This Mean for Our Google Analytics Data?

Ultimately, your organization will need to determine the level of impact from ITP for your website and proceed accordingly. For example, websites with a high number of Safari and Firefox users will experience a surge of unique visitors, disrupting the new vs. returning users metric. 

Apple has provided a few guidelines that can be used to set the _ga cookie to two years as previously expected. Read more on these topics here.

What About Clearing Cookies?

You’ve touched on a weakness of using cookies. A user can clear their cookies at any time. If a user visits your site and sends traffic info to Google Analytics with one Client ID, then clears their cookies prior to returning to your site, they will be given a brand new cookie and Client ID and treated as a New User within our Analytics reporting.

It’s important to remember that the metric Users in the default Google Analytics reports does not refer to specific individuals, but rather specific Client IDs, which can change for many reasons.

But I’m Logged Into Chrome…

It is possible to create a profile on Chrome with your login. You can have a personalized homepage, sync bookmarks, and have multiple users on the same computer. Unfortunately, the cookie is not passed between these logged-in sessions, and therefore a single user logged into Chrome on two different devices will be seen as two users.

What About Subdomains and Cross-Domains?

Remember that cookies are site-specific. If you have either Subdomains or Cross-Domains that you are tracking together in Google Analytics, then you need to verify the following two parts.

The default Google Analytics implementation is designed to work across subdomains automatically. (Remember, two subdomains would look like “blog.example.com” and “www.example.com.”) If someone travels between those two sites, they will maintain the same cookies. Read more about subdomain tracking if you are concerned about your existing implementation.

While subdomains are tracked by default, cross-domain tracking is a completely different animal. (Remember that cross-domains would look like “exampleblog.com” and “coolbusiness.com.”) In this case, your cookies will absolutely not be shared between the sites, unless you set up Cross-Domain tracking with Google Analytics. Read more: Do I Need Cross-Domain Tracking?
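The difference between the two cases comes down to which hosts a cookie's domain covers. Here is a minimal sketch of that matching rule, assuming the cookie is written to the root domain (for example, example.com). The function name is made up, and real browsers apply further checks such as path, security flags, and public suffix handling that are left out here.

```python
def cookie_is_sent(cookie_domain, host):
    """True if a cookie set for cookie_domain is visible on host."""
    cookie_domain = cookie_domain.lstrip(".").lower()
    host = host.lower()
    # Visible on the domain itself and on any subdomain beneath it.
    return host == cookie_domain or host.endswith("." + cookie_domain)
```

A cookie set for example.com is visible on both blog.example.com and www.example.com, so the Client ID carries over. A cookie set for exampleblog.com is never visible on coolbusiness.com, which is why cross-domain tracking needs extra setup.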

Capturing Real Users Instead of Devices & Browsers

If you are shaking your head at the limitations of cookie persistence outlined above, then you will want to take advantage of the User ID feature. 

While cookies are generated randomly and are used to represent anonymous visitors, the User ID is handy for sites where you actually know who the person is. If your site has a logged-in experience, the User ID is for you! In this case, you can pass a hashed, non-personally-identifiable identifier to Google Analytics, and that can be used to stitch sessions and users across browsers and devices. Read more on the User ID feature in Google Analytics.
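As a rough illustration of what a hashed identifier could look like, the sketch below derives a stable, opaque value from an internal account ID using a keyed SHA-256 hash. The choice of HMAC-SHA-256, the function name, and the secret key are all assumptions made for this example; Google Analytics does not prescribe a particular hashing method, and whether the result is acceptable under your privacy policy is for your own team to decide.

```python
import hashlib
import hmac


def hashed_user_id(internal_id: str, secret: bytes) -> str:
    """Derive an opaque, repeatable identifier from an internal account ID."""
    # The secret key (hypothetical) stays on your server, so the hash
    # cannot be reproduced by someone who only knows the account ID.
    digest = hmac.new(secret, internal_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()
```

Because the same account ID always produces the same output, the value can be sent from any browser or device the user logs in on, which is what allows their sessions to be stitched together.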

Using the Client ID for Troubleshooting

Lastly, understanding the Client ID and its importance can be very helpful in troubleshooting common issues like subdomain tracking errors, cross-domain tracking, iframes… you name it.

If you want to see the Client ID in action, check out a few posts for troubleshooting and debugging your data: 

 

Cookies: The Backbone of Web Analytics

Cookies have been the backbone of most web analytics tracking and are useful for a number of reasons. Understanding how they work and potential downsides of the basic Google Analytics tracking can be helpful in identifying tracking errors and better describing your data from Google Analytics.

As we’ve touched on above, there are a number of forces changing the way that Google Analytics and Google Marketing Platform products can interact with first-party and third-party cookies for some browsers. Before implementing any cookie solution, it will be important to understand your business needs and concerns and pick the solution that best aligns with your marketing and website goals.