Introducing the CASE Worker - The CASE WHEN Grouping Tool

June 16, 2021
Senior Analytics Consultant

Note: This article highlights a free tool created by Bounteous team members during a recent Bounteous hackathon. Take it, tweak it, make it yours—but take time to confirm that the output is what you're hoping for!

If you ever wanted to replicate (or modify) your Google Analytics (GA) Content Groupings or Channel Groupings for reporting in Data Studio or BigQuery (BQ), chances are, you've spent far too much time fighting with CASE WHEN statements, trying to figure out just the right mix of matching cases using various logic (regular expressions, contains, starts/ends with, etc.).

Whether you are new to or experienced writing CASE WHEN logic can be time-consuming and prone to errors like a missing quotation mark or comma. If you try to create the same logic in two different tools, you may find yourself rewriting the same logic multiple times just to get the syntax correct.

Writing a new query in BQ or creating a new custom field in Google Data Studio can be daunting, a giant white box with very little to help you craft the perfect statement.

image of bigquery's custome field

 

If that all sounds like you and all you want to do is just analyze data and take action without hours or data cleaning and transforming, we've got a tool to make your life a lot easier.

Common Problems This Tool Solves For

  • Not familiar or advanced in using regular expression (regex) and/or CASE WHEN statements.
     
  • Spent too much writing and troubleshooting tedious regex and/or CASE WHEN statements.
     
  • Content/Channel groupings in GA are old or don't reflect how you want to analyze data now.
     
  • Content groupings were recently created but can't be used to analyze historical data (content groupings and channel groupings are not retroactive).
     
  • Don't want to make any permanent changes to data, only using the groupings for analysis outside of GA.
     
  • Content groups are new to GA4. If you have them set up in Universal Analytics (UA), you still need to set them up in GA4. But oh no! They aren't retroactive. How will you analyze historical data?

What Is CASE Worker?

The Bounteous CASE Worker, uses Google Sheets magic that we've created for you to easily input tabular data and get a formatted CASE WHEN statement that's ready to copy, paste, and use right away.

The CASE Worker makes grouping data via CASE WHEN statements easier, faster, less prone to error, and accessible to all by automating as much as possible and provides outputs for multiple platforms (Data Studio, BigQuery, GA API), across multiple data sources (both UA and GA4)!

This tool is the first of its kind to incorporate GA4 since we know that certain features are new/or not quite developed to do exactly what you have set up in UA. This will be especially helpful when:

  • GDS reports need to switch from the UA schema to the GA4 schema when your company fully migrates to GA4
  • Once your company fully migrates to GA4, and BigQuery becomes a more integral tool for data analysis

Anyway, let's cut to the chase since this tool is meant to save you time and energy and so is this article. We don't want to share our entire life story before introducing the tool as this isn't a recipe on an influencer's blog (All that information is available to you in this section).

How to Use the CASE Worker Tool

The CASE Worker Tool is super easy to use, just follow the below steps to get started!

1. Make a copy of the CASE Worker Google Sheet.
Make a copy of the CASE Worker Google Sheet

 

2. Read the instructions tab carefully.

The Instructions tab is the place to start and will guide you through the steps to generate your CASE WHEN statement. Here you have two things to complete:

  1. Choose which output you'd like.
  2. Determine the grouping names you'd like to use.

Here's what the Instructions tab looks like without settings chosen:

image of what the Instructions tab looks like without settings chosen

 

Here's what the Instructions tab looks like with several settings chosen:

image of what the instructions tab looks like with several settings chosen
3. Input the values you want to match and select the grouping name they belong to.

With Content Grouping, for example, you might put /services/ as the value to match and Services and Capabilities for the grouping name. This would put all pages that have "/services/" in the page path into the "Services and Capabilities grouping."

image of CASE worker sheet using Content Grouping

 

Additional configuration can be selected in this step for the CASE WHEN statement, although the default options are the most common. If the default options do not fit your needs, here are your choices:

image showing the CASE Worker sheeting using additional configuration
4. Lastly, jump into the Output tab for the appropriate platform (Data Studio, BigQuery, or GA API).

All you have to do now is literally, copy the CASE WHEN statement and paste it into your endpoint. Then you'll be on your way to make all the groupings you want. It's that easy.

GA4 CASE WHEN Statement:

image of GA4 CASE WHEN Statement

 

GA4 BiqQuery Query:

image of GA4 BiqQuery Query

 

Something that could literally take hours to write, troubleshoot, and text before being about to analyze your data and act on it… now takes a few minutes. Where was this tool when I needed it last week?

You can stop reading now if that tool was all you needed to create content groupings. Happy analyzing!

Or you could continue reading to learn more about why we created the tool and how it can be used beyond Content or Channel Groupings (We highly recommend this option!).

More Details About Using the CASE Worker

As promised, let's get into more detail about using the CASE Worker tool including factoring different dimension names, common considerations, and cleaning up your CASE Worker queries.

How to Factor for Different Dimension Names Across Tools

It can be challenging when trying to group data across different tools or accessing data from different platforms. For instance, if you're trying to combine the different pages on your website into logical groups. The URLs or Titles for these pages will stay the same, but the way we access that information will be different on every platform.

Your CASE Worker has built this in, and allows you to specify the right dimension based on where you're expecting to pull your report. Select where you'd like to pull the report (Google BigQuery, Data Studio, or Sheets) and select where the data is being stored (Google Analytics or Google Analytics 4).

You can find dimensions in several different places:

To get you started, here are a few common dimensions and how you might access them (but of course you can always add your own dimensions/metrics, which may need further configuration of the tool to accommodate).

Page Path

Let's figure out how to find the page paths from our site inside of Google Analytics and visualize them in different tools.

Where Are You Looking? Dimension Name
Data Studio, Google Analytics Page
Data Studio, Google Analytics 4 Page Path
BigQuery, Google Analytics hits.page.pagePath
BigQuery, Google Analytics (SELECT value.string_value FROM UNNEST(event_params) WHERE key = "page_location")

 

Page Title

Similarly, here is a breakdown for a dimension like Page Title.

Where Are You Looking? Dimension Name
Data Studio, Google Analytics Page Title
Data Studio, Google Analytics Page Title
BigQuery, Google Analytics hits.page.pageTitle
BigQuery, Google Analytics (SELECT value.string_value FROM UNNEST(event_params) WHERE key = "page_title")
Common Considerations When Using the CASE Work Tool

Your CASE Worker can only help you if you give it the right information. When you're using a tool like Google Analytics as your data source, there are several setup items to consider. If your data isn't 100 percent perfect, that's ok! Using flexible queries and calculated fields allows you to continue to adjust and accommodate deficiencies in your data collection.

Use the Error Checking tab if your groupings don't look like you'd expect in your final output. This will help you pinpoint any errors and give you the chance to fix them.

image showing CASE Worker Error Checking tab

 

Additionally, keep the following considerations in mind as you build your CASE WHEN statement:

Trailing Slashes on Page Paths

Sometimes a page can end with or without a trailing slash. For example:

  • example.com/services/
  • example.com/services

If you set your matching rules to say pages that End With (or Exactly Match) "/services" belong to the "Services and Capabilities" group. With that rule in place, you may be missing occurrences of the page with the trailing slash

Pages that will be grouped:

  • example.com/services

Pages that won't be grouped:

  • example.com/services/
  • example.com/services/blog-writing

Generally, you should be careful when using Exact, Starts With, or Ends With for your type of match. But if you think this is the right match type for you, you can always double-check the results. Just log into GA and go to the Behavior > Site Content > All Pages report. Click on advanced next to the table filter:

image showing the advanced google analytics view

 

Then, choose your match type, enter the value you want to match, and Apply.

Next, note the number of returned rows and the Pages that you see in the top 10. Now, go back into the advanced filter and change the match type to Containing, then apply.

Lastly, check the results. Are there more rows than before? Are there Pages that you want to include in your group missing from the previous filter?

Query Parameters on Page Paths

Just like with trailing slashes, query parameters in your URL can often cause the same page to show up in many different ways. Query parameters can be added for a number of reasons, often from marketing tools or as part of the site structure. Sometimes these query parameters are important and change what's presented on the page. Often though, query parameters are used for reporting or some other function and don't actually change what you see on the page.

For example:

  • example.com/services
  • example.com/services?id=abc123
  • example.com/services?id=abc123&category=platforms

For reporting purposes, we often want to combine all of these pageviews together. The user is looking at the Services page, even if there are query parameters. While there are ways to clean this up going forward, you need to make sure you're including all of these into your groupings.

Again, if you use Exact or Ends With match types, you might miss any page that has a query parameter on the end. You may be better off using the Regex or Contains match types.

Groupings are Exclusive and Follow the Order of the Statement

When making your groupings keep in mind that groupings are exclusive which means a Page Path can not fall into both Services and Blog. There is where you need to be selective with your match type and consider how it might affect the groupings.

If the groupings are highly specific, Exact or Ends With match types. If the groupings are broader, use the Regex or Contains match types.

This is why the "Other" group always comes at the end and serves as a catch-all. If "Other" was first it would take away from the other groups.

Groupings Limitations in the Google Analytics Interface

If this regex is being used to create content groups for the GA (UA) interface, keep in mind that GA only allows for five content groups. This means you'll need to be strategic in how you set up and categorize these content groups. However, within those content groups, there can be unlimited (as far as I've seen) sub-groups.

For GA4, we haven't found a limit for content groups or sub-groups (yet).

For Google Data Studio and/or BigQuery purposes, you can make unlimited content groups to analyze your data using this tool. Just note the interface requirements of Google Analytics itself.

Cleaning Up Your CASE Worker Query

Your CASE Worker has good intentions and does 99.99 percent of the work, but sometimes is a bit too verbose. There may be some instances where you can simplify things by combining multiple WHEN lines into one. For example, instead of:

CASE
    WHEN REGEXP_MATCH(Page, '((?i).*/services/).*') THEN 'Services and Capabilities'
    WHEN REGEXP_MATCH(Page, '((?i).*/solutions/).*') THEN 'Services and Capabilities'
    WHEN REGEXP_MATCH(Page, '((?i).*/capabilities/).*') THEN 'Services and Capabilities'
    ELSE 'Other'
END

Try this:

CASE
    WHEN REGEXP_MATCH(Page, '((?i).*/services|solutions|capabilities/).*') THEN 'Services and Capabilities'
    ELSE 'Other'
END

Get to Grouping!

We created this to help make groupings easier, both for ourselves and for others who might have similar struggles. We found a lot of great resources out there but none that did exactly what we needed. We worked together during our recent hackathon and tried to include a lot of great functionality, but keep in mind you will want to review the outputs and make sure they work for you and your data!