Regular Expressions Part VII: (Parentheses)

October 12, 2006

As promised, here is installment VII of my Regular Expressions (RegEx) tutorial: parentheses. I am learning and sharing at the same time, and I am only learning RegEx for use in Google Analytics.

I wanted to get this one out soon after my last RegEx post, because the last one was on the use of pipes, which stand for OR in Regular Expressions. Pipes (OR symbols) and parentheses often go together.


My tutor, Steve in Australia, does a really good job of explaining parentheses. In the same way that this mathematical statement

6*(2+3)

is equivalent to 6*2 plus 6*3, parentheses in Regular Expressions make sure that whatever is outside the parentheses gets applied equally to everything inside them.

For example — and remembering that the pipe symbol | stands for OR — we can have a regular expression like this:

grand(mother|father)

That will match either grandmother or grandfather.
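To see what the grouping buys us, here is a small sketch in Python. It assumes Python's `re` module treats a pattern this simple the same way Google Analytics does, and the sample words are made up for illustration.

```python
import re

# With parentheses, "grand" applies to both alternatives.
grouped = re.compile(r"grand(mother|father)")

# Without parentheses, the pipe splits the whole pattern in two:
# "grandmother" OR "father".
ungrouped = re.compile(r"grandmother|father")

words = ["grandmother", "grandfather", "father"]

# fullmatch() requires the whole word to fit the pattern.
grouped_hits = [w for w in words if grouped.fullmatch(w)]
ungrouped_hits = [w for w in words if ungrouped.fullmatch(w)]

print(grouped_hits)    # ['grandmother', 'grandfather']
print(ungrouped_hits)  # ['grandmother', 'father']
```

Dropping the parentheses changes the meaning: the ungrouped pattern no longer accepts grandfather as a whole word, and it now accepts plain father.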

Or, here is another, similar but not identical example:

Ste(ph|v)en

That will match either Stephen or Steven.
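Here the parentheses sit in the middle of the word, so the text on both sides ("Ste" and "en") is shared by the two spellings. A quick Python check, again assuming Python's `re` module as a stand-in for Google Analytics and using invented misspellings as test data:

```python
import re

# "Ste" and "en" are required; the middle is either "ph" or "v".
pattern = re.compile(r"Ste(ph|v)en")

names = ["Stephen", "Steven", "Stepen", "Stevhen"]

# Keep only the names that fit the pattern from start to finish.
matches = [n for n in names if pattern.fullmatch(n)]

print(matches)  # ['Stephen', 'Steven']
```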

What if the two terms are really different and there isn’t much in the way of grouping to do? For example, what if we want to filter out Robbin or Luna (which I do all the time in my GA)? Then we can go back to the last lesson on OR and just use a simple pipe:

Robbin|Luna

(Often, even people who know me well misspell my name, so I could use what I learned in lesson V, question marks, to make the second “b” optional, like this: Robb?in|Luna)
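The pipe-plus-question-mark pattern can be tried out the same way. This Python sketch uses `search()`, which looks for the pattern anywhere in the text; that is an assumption about how the filter would be applied, and the sample strings are made up.

```python
import re

# "Robbin" with an optional second "b", OR "Luna". No parentheses needed.
pattern = re.compile(r"Robb?in|Luna")

samples = ["Robbin", "Robin", "Luna", "Robbbin", "Steve"]

# search() finds the pattern anywhere in the string.
hits = [s for s in samples if pattern.search(s)]

print(hits)  # ['Robbin', 'Robin', 'Luna']
```

The question mark only forgives one missing "b"; a third "b" ("Robbbin") still falls outside the pattern.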

In Google Analytics (I won’t speak to other languages) we don’t need any parentheses if there isn’t any grouping; the pipe can stand on its own. Or as Justin always tells me, keep it simple.

[Incredibly techie addition: My last comment about never needing parentheses when there is nothing outside them is not always true. At the eMetrics Summit, Nick from Google and Justin from Epikone taught me a lot about creating custom filters and, during that process, explained how parentheses define a variable. I will revisit this topic later.]
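The "variable" idea can be previewed in Python, where whatever the parentheses matched is remembered as a numbered group. This is only a sketch of the general concept; it does not show Google Analytics custom filter syntax, and the sample sentences are invented.

```python
import re

# The parentheses remember which alternative actually matched.
pattern = re.compile(r"grand(mother|father)")
m = pattern.search("my grandfather's clock")
captured = m.group(1) if m else None
print(captured)  # father

# Even with nothing outside the parentheses, wrapping the pattern
# stores the matched text so it can be reused later.
m2 = re.search(r"(Robbin|Luna)", "Hello Luna")
stored = m2.group(1) if m2 else None
print(stored)  # Luna
```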

Backslashes \ 
Dots .
Carets ^
Dollar signs $
Question marks ?
Pipes |
Parentheses ()
Square brackets [] and dashes -
Plus signs +
Stars *
Regular Expressions for Google Analytics: Now let’s Practice
Bad Greed
RegEx and Good Greed
{Braces}
Minimal Matching
Lookahead

Robbin