Regular Expressions Part VIII: [Square Brackets] And Dashes ‑

October 22, 2006

Come learn Regular Expressions for Google Analytics with me. I am learning Regular Expressions for Google Analytics and teaching with each lesson. This is why I roll them out slowly – each expression requires a lot of research. I have been awed at this process because the explanations are so opaque before I understand them, and once I learn them, they make perfect sense. Tonight, let’s talk about square brackets, and I hope you’ll see what I mean.


Google Analytics defines square brackets like this:

[] Match one item in this list

This is exactly what they mean; it just sounds hard because they don’t tell you how to create the list or how to define an item. Simple explanation: when you use square brackets, each character within the brackets is an item. Look at this sample list with five items in it, each of which happens to be a vowel: [aeiou]. The hard part is understanding that you don’t need anything to separate the characters, and that each item in the list is only one character.
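If you want to see this for yourself outside of Google Analytics, here is a minimal sketch in Python. Python's `re` module is not the engine Google Analytics uses, so treat it as an illustration of the idea only; the helper name `first_vowel` is something I made up for the example.

```python
import re

# A list of five items. Each item is exactly one character,
# and nothing separates them.
vowel = re.compile(r"[aeiou]")

def first_vowel(text):
    """Return the first vowel found in text, or None if there is none."""
    m = vowel.search(text)
    return m.group(0) if m else None

print(first_vowel("regex"))   # e
print(first_vowel("rhythm"))  # None
```

Notice that the brackets only ever match one character at a time, no matter how many items are in the list.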

Here’s how someone might use square brackets with Google Analytics. Let’s say you were selling items with part numbers formatted like this: PART1, PART2, etc. You want to know how often someone lands on your site by typing the actual part number into a search engine, but you only care about PART3, PART5 and PART7. So, you could enter PART[357] into the filter box at the top of your Overall Keyword Conversion report (for example). That will match each of those part numbers. (Technically, it matches one of these three and more, but I will hold that problem/opportunity for a different post.)
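Here is a rough way to try the part-number filter in Python. The keyword list is invented for the example, and I am assuming the filter box behaves like an unanchored search (which is what `re.search` does); Google Analytics may differ in details such as case sensitivity.

```python
import re

# PART followed by exactly one of the characters 3, 5 or 7.
part_filter = re.compile(r"PART[357]")

# Hypothetical search keywords, made up for this sketch.
keywords = ["PART3", "PART4", "PART5", "buy PART7 online", "PART35"]

# search() looks for the pattern anywhere in the keyword.
matched = [k for k in keywords if part_filter.search(k)]
print(matched)  # ['PART3', 'PART5', 'buy PART7 online', 'PART35']
```

PART4 is left out, as you would hope. PART35 sneaks in because it contains PART3, which is the "and more" I mentioned above.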

It’s helpful to understand dashes so that you can use square brackets easily. Google Analytics defines dashes like this:

- Create a range in a list

That means, instead of creating a list like this [abcdefghijkl], you can create it like this: [a-l], and it means the same thing — only one letter out of the list gets matched. You can also combine the range method and the brute force, type-them-all-in method and create a list like this: [a-lqtz], which matches any one letter between a and l, or q, or t, or z.
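A quick Python sketch can confirm that the range and the typed-out list behave the same way. Again, this is Python's `re` module standing in for Google Analytics, and the variable names are mine.

```python
import re
import string

long_form = re.compile(r"[abcdefghijkl]")  # every letter typed out
range_form = re.compile(r"[a-l]")          # the same list as a range
mixed = re.compile(r"[a-lqtz]")            # a range plus three extra letters

# Check every lowercase letter against both spellings of the list.
same = all(
    bool(long_form.fullmatch(c)) == bool(range_form.fullmatch(c))
    for c in string.ascii_lowercase
)
print(same)  # True

# Collect the single letters that the mixed list accepts.
mixed_letters = "".join(c for c in string.ascii_lowercase if mixed.fullmatch(c))
print(mixed_letters)  # abcdefghijklqtz
```

The range includes both of its endpoints, so a and l are part of the list.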

Special case: Sometimes — perhaps often — we really want the dash to be one of the characters we are searching for. Maybe we want to see searches of luna-metrics and lunarmetrics and lunammetrics. In that case, we put the dash at the beginning or end of the list, like this [-rm]. That means that the full RegEx which would match the three lunametrics keywords above would be luna[-rm]metrics. This is because the phrase will start with luna, end with metrics, and in between will have a dash, an r, or an m. Those are the only choices in the little list I created, the one that looked like this: [-rm].
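To check the special case, here is a small sketch in Python (standing in for Google Analytics, so take it as an illustration). The last two keywords are ones I added to show what does not match.

```python
import re

# The dash comes first in the list, so it is a plain character, not a range.
pattern = re.compile(r"luna[-rm]metrics")

candidates = [
    "luna-metrics",
    "lunarmetrics",
    "lunammetrics",
    "lunametrics",   # nothing between "luna" and "metrics"
    "lunaxmetrics",  # x is not in the list
]

results = {kw: bool(pattern.search(kw)) for kw in candidates}
for kw, hit in results.items():
    print(kw, hit)
```

The first three keywords match and the last two do not. Plain lunametrics fails because the brackets insist on exactly one character from the list sitting between luna and metrics.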

There are other interesting things that you can do with square brackets, but I am leaving them out for now, either because they don’t all work with Google Analytics, or because I think this is enough for today. (Correct me if I’m wrong!)

Backslashes \
Dots .
Carets ^
Dollar signs $
Question marks ?
Pipes |
Parentheses ()
Square brackets [] and dashes -
Plus signs +
Stars *
Regular Expressions for Google Analytics: Now let’s Practice
Bad Greed
RegEx and Good Greed
{Braces}
Minimal Matching
Lookahead

Robbin