Regular Expressions Part X: Stars *

November 15, 2006
By Robbin Steif

This is Part X of the long long series I have been doing on Regular Expressions (RegEx) for Google Analytics. It is the last one I will do that explains what Google says vs. what they mean.


When it comes to stars (or call them asterisks if you like), Google Analytics says this:

* Match zero or more of the previous items

Perfectly reasonable, if you know how to create a list of previous items. If you already read Post IX, use of the plus sign in RegEx, this will be easy, and if not, I’ll try to make it easy.

If the only special character you are using is the star *, then the previous item is simply the previous character. For example, let’s say that my company has five-digit part numbers, and I want to know how many people are searching for part number 34. The problem is all those leading zeros: technically, the part number is PN00034. So I could use the little Google Analytics filter box in my search report with a RegEx like this: PN0*34. That will bring back all the searches for PN034 and PN0034 and PN00034 and PN00000034 and, for that matter, PN34, since the star means that the previous item doesn’t need to be in the search at all (zero or more of the previous items, it says).
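To make the part-number example concrete, here is a minimal sketch in Python. Python's re module is not the engine Google Analytics uses, but a pattern this simple should behave the same way in both. The search terms in the list are made up for illustration, and re.search is used on the assumption that the filter box matches the pattern anywhere in the term.

```python
import re

# Pattern from the post: "PN", then zero or more "0" characters, then "34".
pattern = re.compile(r"PN0*34")

# Hypothetical search terms, invented for this example.
searches = ["PN34", "PN034", "PN00034", "PN00000034", "PN0035", "PN1034"]

# re.search looks for the pattern anywhere in the string (assumed to
# mirror how the filter box treats a search term).
matches = [term for term in searches if pattern.search(term)]

print(matches)  # ['PN34', 'PN034', 'PN00034', 'PN00000034']
```

PN0035 and PN1034 are left out because the zeros (if any) are not followed directly by 34.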

Alternatively, we could build a list of previous items using square brackets. As in my post on plus signs, I had a hard time finding a reason someone would want to do this, so again I used the example that Steve gave me. His example was square brackets with a space. So, I could do a search for my company name in the same filter box on the keywords report, like so:
Luna[ ]*metrics. That will come back with LunaMetrics (no use of the space) or Luna Metrics (one space), or Luna  Metrics (two spaces), etc.
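The same idea can be tried out in Python as a rough sketch. The keyword list is invented for illustration, and I pass re.IGNORECASE on the assumption that the filter ignores case, since the pattern is written with a lowercase "m" but is expected to find LunaMetrics.

```python
import re

# "Luna", then zero or more spaces, then "metrics".
# IGNORECASE is an assumption about how the filter box treats case.
pattern = re.compile(r"Luna[ ]*metrics", re.IGNORECASE)

# Hypothetical keywords, invented for this example.
keywords = ["LunaMetrics", "Luna Metrics", "Luna  Metrics", "Luna-Metrics"]

matched = [kw for kw in keywords if pattern.search(kw)]

print(matched)  # ['LunaMetrics', 'Luna Metrics', 'Luna  Metrics']
```

Luna-Metrics is not matched, because a hyphen is not in the bracketed list, which contains only a space.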

For the sake of completeness, I should point out that you can put real characters in the square brackets, like this: b[aeiou]*d, and it matches bad and bed and bid and bod and bud. But for that matter, it also matches baaaad and boud and bd, so I don’t think it is particularly useful. If I really just wanted to see those five examples (bad, bed, bid, bod and bud), I would be smarter to use the OR pipe | and do it like this: b(a|e|i|o|u)d.
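A small Python sketch shows the difference between the two patterns side by side. It uses re.fullmatch so that each word is tested as a whole; that is a choice made for this illustration, not a claim about how the filter box behaves, since the filter is assumed to match anywhere in a term.

```python
import re

words = ["bad", "bed", "bid", "bod", "bud", "baaaad", "boud", "bd"]

# Star version: "b", then zero or more vowels, then "d".
star_hits = [w for w in words if re.fullmatch(r"b[aeiou]*d", w)]

# Pipe version: "b", then exactly one vowel, then "d".
pipe_hits = [w for w in words if re.fullmatch(r"b(a|e|i|o|u)d", w)]

print(star_hits)  # all eight words, including 'baaaad', 'boud' and 'bd'
print(pipe_hits)  # ['bad', 'bed', 'bid', 'bod', 'bud']
```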

Anyone who has a great example of using a star with square brackets is strongly encouraged to comment.

Backslashes
Dots .
Carets ^
Dollar signs $
Question marks ?
Pipes |
Parentheses ()
Square brackets [] and dashes -
Plus signs +
Stars *
Regular Expressions for Google Analytics: Now let’s Practice
Bad Greed
RegEx and Good Greed
{Braces}
Minimal Matching
Lookahead

Robbin