Methods To Strip Queries From URLs In Google Analytics

April 17, 2015 | Samantha Barnes
Google Analytics has a user-friendly interface that makes editing your set-up relatively simple. However, in large accounts, duplicating a view-level setting across tens or even hundreds of views within the same account isn’t as easy.

One example is when you want to exclude one or more URL parameters from your reports on every view. As mentioned in previous posts, this is important since Google Analytics will split a page into multiple rows when its URL appears with different parameters. Most of the time you want to aggregate your data by page content, so this gets in the way when you analyze your content metrics.

[Image: SplitRows]

In the above example, I want to see how ‘post-23’ is performing, but it is split into two rows. Now I can’t get an accurate picture of the bounce rate, hits, entrances or any other metrics since they aren’t combined. If you are not sure which, if any, URL queries you should keep, the general ‘rule’ is to keep only queries that involve different content. Also, it is best practice to have a view that strips out all queries along with a completely unaltered view.
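To make the problem concrete, here is a minimal sketch of what combining the split rows means. The rows and pageview counts are made up for illustration, and aggregateByPath is a hypothetical helper, not anything Google Analytics provides. It only sums an additive metric (pageviews); a ratio such as bounce rate cannot be combined by simple addition.

```javascript
// Hypothetical report rows: the same post recorded under two different URIs.
const rows = [
  { page: '/blog/post-23?lang=en', pageviews: 40 },
  { page: '/blog/post-23?lang=en&x1=key4929', pageviews: 25 },
  { page: '/blog/post-7', pageviews: 10 }
];

// Sum pageviews per page path, ignoring everything from the '?' onward.
function aggregateByPath(reportRows) {
  const sums = {};
  for (const row of reportRows) {
    const path = row.page.split('?')[0];
    sums[path] = (sums[path] || 0) + row.pageviews;
  }
  return sums;
}

const totals = aggregateByPath(rows);
console.log(totals); // { '/blog/post-23': 65, '/blog/post-7': 10 }
```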

The option to exclude URL query parameters is under Admin > View Settings. The issue here is that you would have to copy and paste your parameters into every view where you want them excluded. When accounts have 100 or more views across multiple properties, this is a tedious task. Plus, maybe you missed a parameter and want to add it… time to go back and copy-and-paste through all the views again.

Another limitation is that this field has a 256-character limit. If your site has a lot of URL queries, it would take a very long time to first identify what they all are and then list them one by one in this text box.
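As a rough illustration of how quickly that limit is reached, the sketch below joins a list of parameter names and checks the result against the 256-character figure quoted above. It assumes the field takes a comma-separated list; buildExcludeList and the parameter names are hypothetical.

```javascript
// Character limit for the field, as quoted in this article.
const FIELD_LIMIT = 256;

// Join parameter names into one comma-separated value and report whether it fits.
function buildExcludeList(paramNames) {
  const value = paramNames.join(',');
  return { value: value, length: value.length, fits: value.length <= FIELD_LIMIT };
}

const shortList = buildExcludeList(['name', 'email', 'x1']);
console.log(shortList.value, shortList.length, shortList.fits);

// Sixty made-up parameter names, to show a list that no longer fits.
const manyNames = Array.from({ length: 60 }, function (_, i) { return 'parameter' + i; });
const longList = buildExcludeList(manyNames);
console.log(longList.length, longList.fits);
```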

[Image: strip-queries-small]

A more efficient method is to use a filter instead. Since most filters are account-level, they can be added to every view in your account at once, even across all of your different properties. You can try it now by going to Admin > Account > All Filters. After selecting or creating a filter, there is an option toward the bottom to shift-click-select and add it to as many views as you like. This works for most filters except those that are custom dimension-based – they cannot be batch applied with this method (…yet. I hope this will change!) since dimensions are property-level.

[Image: BatchApplyFilters]

Remove One URL Query with Filters

Note: we will be using regular expressions for all of our filters, so if you need a little refresher there are posts here and here.

First, select the option to create a new filter. In the options, choose ‘Custom,’ ‘Search and Replace’ and choose the Request URI as the filter field. We are searching for the query in the URL, so this is where we place our regular expression that will match the query that we want to remove. In my example, I want to remove a ‘name’ query that is passed since it is PII, or Personally Identifiable Information. An example of the regex is below.

(name=[^&]*&?)

We’ll leave the replace field blank, as we want to take out everything in parentheses and replace it with nothing.

It might look a little confusing, so let’s break it down:

The first part is the query’s name and the ‘=’. Keep in mind that the query may be one of many and may sit between or before other queries that we want to keep. The next part of the expression, ‘[^&]’, matches any character that is not an ampersand, so the match stops before the next query. The asterisk means 0 or more of the preceding character class, so the whole value is matched even when this is the last query and no ampersand follows. Finally, ‘&?’ matches an ampersand after the query parameter if there is one; we want to remove it too, to prevent something like “/blog/post-23?lang=en&&x1=key4929” from showing up.
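If you want to try the expression before saving the filter, a quick sketch like the one below can help. It uses JavaScript’s regex engine purely as a stand-in for Google Analytics’ own matching, and the sample URIs and the stripName helper are made up for illustration.

```javascript
// The search pattern from the filter above.
const nameQuery = /(name=[^&]*&?)/;

// Mimic a Search and Replace filter with an empty Replace field.
function stripName(uri) {
  return uri.replace(nameQuery, '');
}

const samples = [
  '/blog/post-23?name=sam&lang=en',            // first of several queries
  '/blog/post-23?lang=en&name=sam&x1=key4929', // in the middle
  '/blog/post-23?lang=en&name=sam',            // last query
  '/blog/post-23?name=sam'                     // only query
];

samples.forEach(function (uri) {
  console.log(uri + ' -> ' + stripName(uri));
});
// /blog/post-23?name=sam&lang=en -> /blog/post-23?lang=en
// /blog/post-23?lang=en&name=sam&x1=key4929 -> /blog/post-23?lang=en&x1=key4929
// /blog/post-23?lang=en&name=sam -> /blog/post-23?lang=en&
// /blog/post-23?name=sam -> /blog/post-23?
```

One thing this kind of test surfaces: the pattern is not anchored to the start of a parameter, so in this sketch a URI such as /signup?firstname=bob comes back as /signup?first. If your site has parameters whose names end in ‘name’, check them against the expression before relying on it.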

Once this filter is applied, a second filter should also be created to strip out trailing ‘?’s or ‘&’s on page paths. Our first filter grabs the query word and value but not the character before it (‘?’ if it is the first query in the URL, ‘&’ if it is not.) We do this because we’re not sure if there will be one or more query parameters.

The second filter will also be a Search and Replace custom filter. In the first field, the RegEx to grab the “?” or “&” only if it is the last character is:

([?&]$)

Again, we’ll leave the Replace field blank.
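The two filters chained together can be sketched as follows. Again, this is only JavaScript standing in for the filter engine, with made-up URIs and a hypothetical applyFilters helper.

```javascript
// Filter 1: remove the 'name' query and an ampersand that follows it.
const removeName = /(name=[^&]*&?)/;
// Filter 2: remove a '?' or '&' left dangling at the end of the URI.
const trailingSeparator = /([?&]$)/;

// Apply both filters in order, each replacing its match with nothing.
function applyFilters(uri) {
  return uri.replace(removeName, '').replace(trailingSeparator, '');
}

console.log(applyFilters('/blog/post-23?lang=en&name=sam')); // /blog/post-23?lang=en
console.log(applyFilters('/blog/post-23?name=sam'));         // /blog/post-23
console.log(applyFilters('/blog/post-23?name=sam&lang=en')); // /blog/post-23?lang=en
console.log(applyFilters('/blog/post-23'));                  // /blog/post-23
```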

Remove Multiple URL Queries with Filters

If you want to exclude more than one query, you will have to create a separate filter for each query that you would like to exclude. Remember that the order of filters is important, so make sure the query-removal filters all go before the final filter for trailing “?”s and “&”s.
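Here is a small sketch of why the order matters. It builds one pattern per parameter, puts the trailing-character cleanup last, and then compares that with the cleanup running first. The parameter list, URIs and helper names are hypothetical, and the sketch assumes parameter names are plain words with no regex special characters.

```javascript
// Hypothetical parameters to strip; each becomes one search-and-replace step.
const excluded = ['name', 'email'];

// One removal pattern per parameter, followed by the trailing '?'/'&' cleanup.
function buildFilters(paramNames) {
  const filters = paramNames.map(function (p) {
    return new RegExp('(' + p + '=[^&]*&?)');
  });
  filters.push(/([?&]$)/);
  return filters;
}

// Run the filters in the given order, each replacing its match with nothing.
function runFilters(uri, filters) {
  return filters.reduce(function (current, pattern) {
    return current.replace(pattern, '');
  }, uri);
}

const ordered = buildFilters(excluded);
console.log(runFilters('/signup/thanks?name=sam&email=s%40example.com', ordered)); // /signup/thanks

// Cleanup first, removal second: the dangling '&' is never cleaned up.
const wrongOrder = [/([?&]$)/, /(name=[^&]*&?)/];
console.log(runFilters('/blog/post-23?lang=en&name=sam', wrongOrder)); // /blog/post-23?lang=en&
```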

Remove All URL Queries

If you are sure none of the queries are relevant to your content, you can also remove all of them at once to have clean page paths. In some cases, it may make sense to have a separate view that strips off all query parameters, no matter what. This makes content reporting on, say, blog posts or news stories much easier!

Only one search and replace filter is needed in this case. The regular expression for the search field is the following:

\?.*

This will capture anything following and including the question mark, and then we’ll replace it with nothing.
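A quick way to sanity-check this expression, using JavaScript as a stand-in for the filter engine (the URIs and the stripAllQueries helper are made up for illustration):

```javascript
// Matches the first '?' and everything after it.
const allQueries = /\?.*/;

// Mimic the Search and Replace filter with an empty Replace field.
function stripAllQueries(uri) {
  return uri.replace(allQueries, '');
}

console.log(stripAllQueries('/blog/post-23?lang=en&x1=key4929')); // /blog/post-23
console.log(stripAllQueries('/blog/post-23'));                    // /blog/post-23
```

Because the match starts at the question mark itself, no second filter for trailing characters is needed here.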

Remove Queries with Google Tag Manager

This method does more than exclude URL parameters from your reports: it prevents them from ever reaching Google Analytics. This is ideal when Personally Identifiable Information (PII) appears in the URL. Google Analytics’ Terms of Service state that this information can’t be stored on their servers, so it’s best to prevent it from ever reaching Google Analytics instead of filtering it out afterward.

An example is if someone’s name, phone number or email address becomes a URL query parameter briefly after filling out a form.

You’ll need to make sure you have your Page URL variable enabled in the list of Built-In Variables in Google Tag Manager V2. (If you’re still in V1, make sure you have a url path macro defined!)

Next, create a Custom JavaScript variable and name it Updated Document Location. Paste in the following JavaScript:

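function() {

  // Query parameters to strip before the URL is sent to Google Analytics.
  var params = ['name', 'email'];
  // An anchor element lets the browser split the URL into its parts.
  var a = document.createElement('a');
  var param,
      qps,
      iop,
      ioe,
      i;

  a.href = {{Page URL}};

  if (a.search) {

    // Wrap the query string in '&' so every parameter looks like '&key=value&'.
    qps = '&' + a.search.replace('?', '') + '&';

    for (i = 0; i < params.length; i++) {

      param = params[i];
      iop = qps.indexOf('&' + param + '=');

      // Loop so a parameter that appears more than once is removed every time.
      while (iop > -1) {

        // Cut from the '&' before the parameter up to the '&' after its value.
        ioe = qps.indexOf('&', iop + 1);
        qps = qps.slice(0, iop) + qps.slice(ioe, qps.length);
        iop = qps.indexOf('&' + param + '=');

      }

    }

    // Drop the leading and trailing '&' added above.
    a.search = qps.slice(1, qps.length - 1);

  }

  return a.href;

}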

This script holds an array of query parameters, params, that will be removed from the page URL. You can add or remove query parameters by editing that array near the top of the function.
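The variable above only runs inside Google Tag Manager, since it relies on {{Page URL}} and the browser’s document. To experiment with the string handling on its own, here is a standalone sketch of the same wrap-in-ampersands technique. The stripParams function is hypothetical: it takes the query string and the parameter list as arguments, and it loops so that a parameter repeated in the URL is removed each time.

```javascript
// Remove the listed parameters from a query string such as '?a=1&b=2'.
function stripParams(search, params) {
  if (!search) {
    return '';
  }
  // Wrap in '&' so every parameter looks like '&key=value&'.
  var qps = '&' + search.replace('?', '') + '&';
  for (var i = 0; i < params.length; i++) {
    var needle = '&' + params[i] + '=';
    var iop = qps.indexOf(needle);
    while (iop > -1) {
      // Cut from the '&' before the parameter up to the '&' after its value.
      var ioe = qps.indexOf('&', iop + 1);
      qps = qps.slice(0, iop) + qps.slice(ioe);
      iop = qps.indexOf(needle);
    }
  }
  // Drop the wrapping '&' characters; return '' when nothing is left.
  var remaining = qps.slice(1, qps.length - 1);
  return remaining ? '?' + remaining : '';
}

console.log(stripParams('?name=sam&lang=en&email=s%40example.com', ['name', 'email'])); // ?lang=en
console.log(stripParams('?name=sam', ['name', 'email']));                               // (empty string)
console.log(stripParams('?firstname=bob', ['name']));                                   // ?firstname=bob
```

Because the search looks for ‘&name=’ rather than just ‘name=’, a parameter such as firstname is left alone.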

Next, in your pageview tag, go to More Settings > Fields to Set. Add a new field named “location” and set its value to your new variable, {{Updated Document Location}}.

All that’s left is to Preview and Debug and then Publish!

Wrapping Up

There you have it – multiple ways to remove query parameters from your URLs! Remember, you can always use the built-in Exclude URL Query Parameters option at the View level for the most basic query parameter issues. Using a Filter or a script in Google Tag Manager will help with larger websites that have more complicated setups!