Custom Filters For GA, Part 4: Custom Advanced Filters

May 4, 2007

Custom Advanced filters are so cool, and there is so little information available about them. It’s too bad that they have such an intimidating name.

I find myself using them in three ways:

  1. To rewrite stuff in GA. Usually, this is to rewrite a URI. For example, we work with a site where the CMS insists on giving the same page three different URIs, and we are using custom advanced filters (among other things) to rewrite them, so that we always know what page we are looking at.
  2. To associate data that aren’t already associated. For example, Benjamin in SF wrote me and asked how to associate a referring source with a transaction ID. This is a job for Superman... er, Custom Advanced Filters.
  3. To change all sorts of other things (though these are mostly variations on #1 and #2).

So let’s look at a really easy Custom Advanced filter — rewriting all your URIs to be Title Tags. (True, you can already see them in the Content Performance > Content by Title report, but this will rewrite them for every report.)

I want to teach two things before I start.

Thing #1: When I wrote about Regular Expressions, I explained how a dot means “match any one character.” And I wrote about how a star means “match zero or more of the previous item.” So when you put them together, they mean “match everything.”

Thing #2: When you use parentheses, they create a variable in Regular Expressions. Most of the time, I don’t care. But it really matters in Custom Advanced filters.

Putting Thing #1 with Thing #2: When you write this: (.*) it means, get everything and put it in a variable.
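If you want to see Thing #1 and Thing #2 working together outside of GA, here is a minimal sketch in Python. Python's `re` module is only a stand-in here (GA has its own RegEx engine), and the page title is a made-up example, not one from a real site.

```python
import re

# Hypothetical page title, used only for illustration.
title = "Widgets for Sale | Example Store"

# The dot matches any one character, the star repeats it zero or more times,
# and the parentheses store whatever was matched in a variable (a "group").
match = re.match(r"(.*)", title)

# group(1) is the first (and here the only) set of parentheses.
captured = match.group(1)
print(captured)
```

The variable ends up holding the entire title, which is exactly the "get everything and put it in a variable" behavior described above.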

OK, now we are ready to start. Check out the screenshot below.


First, I gave it a friendly name (“Rewrite URI, etc.”). Then I chose Custom filters, and within Custom, I chose Advanced. As soon as I chose Advanced, I got all the other options below it.

Today, we are ignoring the middle set of boxes, the ones that say “Field B”, and are just dealing with Field A and the output. So every time I talk about the A stuff, I am referring to the boxes that say Field A –> Extract A.

Now, let’s sit back for a moment and think about what we’re doing before we do it. Our goal is to take the page title and use it to rewrite (to reconstruct) the Request URI, so that the title shows up everywhere the Request URI would. So instead of seeing URIs (URLs; you can all fight about the correct way to say that), we’ll see page titles.

To do this, we first choose Page Title as Field A (just like we chose filter fields in this post that I wrote last weekend; you have to decide what you are working on). Then we extract it: we create a Regular Expression (RegEx) that describes it. In this case, our RegEx is (.*), i.e. get everything and put it in a variable (like I described earlier in this post).

Next, we decide where we are going to put it. We want to output it to the URI.

Now, here is the magic (or at least, that’s the way it felt to me as I went through life, trying to understand what $A1 or $B3 was). The first variable (the first set of parentheses) in the –> Extract A field is called $A1. We only have one variable in this screenshot, but if we had a second one, it would be $A2, then $A3 for the third one (if we had one), and so on. So when we use $A1 as our constructor, it means: use the first variable, (.*), in the Extract A field to reconstruct our URI.
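To make the numbering concrete, here is a rough Python sketch of how the sets of parentheses line up with $A1, $A2, and so on. The helper `extract_variables`, the second pattern, and the sample values are all hypothetical, made up for illustration; GA does not expose anything like this, and its RegEx engine may differ from Python's in the details.

```python
import re

def extract_variables(pattern, value, field="A"):
    """Name each captured group the way the filter does: $A1, $A2, ..."""
    match = re.search(pattern, value)
    if match is None:
        return {}
    # Groups are numbered left to right, starting at 1.
    return {f"${field}{i}": g for i, g in enumerate(match.groups(), start=1)}

# One set of parentheses gives one variable.
one = extract_variables(r"(.*)", "About Us")

# Two sets of parentheses give two variables (hypothetical URI pattern).
two = extract_variables(r"^/([^/]+)/(.*)$", "/products/widgets.html")

print(one)
print(two)
```

With one set of parentheses you only ever get $A1; the second pattern shows where $A2 would come from if you had a second set.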

I know that was confusing, so let me say it another way. Here’s what we did. We picked the page title as Field A and described it with a Regular Expression in the Extract A box. The expression we used was (.*), i.e. get everything and put it in a variable. (So that means, we put the whole title tag in a variable.) Then we told the constructor to rewrite the Request URI as the first A variable, $A1, which now holds the whole title tag. Consequently, all URIs get rewritten as their page’s title tag.
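If code helps more than prose, here is a toy model of the whole filter in Python. It is only a sketch of the idea, not how GA is actually implemented: the function `apply_advanced_filter`, the dictionary keys `page_title` and `request_uri`, and the sample hit are all assumptions invented for this example, and Field B is ignored, just like in the walkthrough.

```python
import re

def apply_advanced_filter(hit, field_a, extract_a, output_to, constructor):
    """Toy model of a Custom Advanced filter that uses Field A only."""
    match = re.search(extract_a, hit.get(field_a, ""))
    if match is None:
        # Assumption of this sketch: a hit that does not match is left alone.
        return dict(hit)

    def fill(placeholder):
        # Turn "$A1" into the text captured by the first set of parentheses.
        index = int(placeholder.group(1))
        if 1 <= index <= len(match.groups()):
            return match.group(index) or ""
        return placeholder.group(0)  # unknown variable: keep it as written

    rewritten = dict(hit)
    rewritten[output_to] = re.sub(r"\$A(\d+)", fill, constructor)
    return rewritten

# Hypothetical pageview: the title is readable, the URI is not.
hit = {"page_title": "Contact Us", "request_uri": "/index.php?id=42"}

result = apply_advanced_filter(
    hit,
    field_a="page_title",     # Field A
    extract_a=r"(.*)",        # Extract A
    output_to="request_uri",  # Output To
    constructor="$A1",        # Constructor
)
print(result["request_uri"])
```

In this sketch the rewritten hit reports "Contact Us" where the Request URI used to be, which mirrors what the filter in the screenshot is meant to do for every report.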

Please comment if you didn’t understand anything. (I’m serious. I got on someone else’s blog today and said, I just don’t understand.) Or send me email to my last name at my company name dot com.

Robbin