Earlier in April, Google Search Console added the much-anticipated regex filter to the performance report. Regex has long been available in Google Analytics for advanced filtering of web analytics data. Since its creation, Google Search Console (formerly Google Webmaster Tools) has had only simple filters such as “contains” or “does not contain”. These filters severely limited the ability to distill data in a more complex way within the platform. With the addition of the regex filter, that has finally changed.
What is the regex filter in Google Search Console?
A regular expression, or regex, is a string of characters that defines a search pattern. What makes it different from a regular filter is that a single pattern can test for more than one condition at once. We’ll get into examples further in this post.
Why is having regex in Search Console such a big deal?
In my opinion, the lack of a regex filter was the biggest drawback to Google Search Console. Prior to this update, you could only have one filter on at a time for a dimension. So if you wanted to see queries containing either “sale” or “clearance”, for example, you would have to export the data into Google Sheets or Excel, as it wasn’t possible in the platform. With the new regex feature you can save A LOT of time!
What are some things I can do with regex in Search Console?
The possibilities are endless with the new regex filter. Here are some common problems that regex filters can solve in Google Search Console.
Branded Search Filters
Measuring organic branded impressions and clicks is an important exercise to see how much of your organic search is influenced by brand demand. Prior to regex, you could filter to see queries containing your brand name to see this data. However, this never gave you the full picture because you couldn’t filter for all of the different variations in how users search for your brand.
For example, let’s pretend there’s a holding company called “The Knight Group”. There are so many ways that people may search for this brand, such as:
Knight group
Night group
The night group
Knightgroup
Knight company
People misspell things all the time or include/exclude words in branded search. You can’t just filter for “knight” because that would exclude people who type “night” by accident. You could filter for “night”, but maybe you have a product, service, or blog post that uses the word “night” in a non-branded context, which would throw off your branded numbers. Before regex, you just had to accept that you couldn’t capture the full picture of what was going on without exporting the list and filtering it there.
With regex, you could create a rule for the search that matches all the different instances. A regex filter that would match all the variations could look something like this:
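knight group|night group|nightgroup|knight company
The regex above would match all of the different variations of the branded search listed previously. The vertical pipe “|” acts as an “or” statement, so the filter matches any query that contains “knight group”, “night group”, “nightgroup”, or “knight company”. Because the match is a partial one, “night group” also covers “the night group”, and “nightgroup” also covers “knightgroup”.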
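Before pasting a brand pattern into Search Console, it can help to check it against a few sample queries locally. The sketch below does that in Python. It assumes Python's `re` module behaves the same as Search Console's regex engine for a simple pattern like this one, and the sample queries (including the non-branded “night light for kids”) are made up for illustration.

```python
import re

# Brand pattern. The bare "night group" alternative catches that query
# with or without a leading "the".
BRAND_PATTERN = re.compile(r"knight group|night group|nightgroup|knight company")

# Hypothetical sample queries, written in lowercase for simplicity.
queries = [
    "knight group",
    "night group",
    "the night group",
    "knightgroup",
    "knight company",
    "night light for kids",
]


def is_branded(query: str) -> bool:
    # re.search looks for the pattern anywhere in the string,
    # which mirrors a partial ("contains") regex match.
    return BRAND_PATTERN.search(query) is not None


branded = [q for q in queries if is_branded(q)]
non_branded = [q for q in queries if not is_branded(q)]

print("Branded:", branded)
print("Non-branded:", non_branded)
```

Running a check like this on a handful of real queries from an export is a quick way to spot a brand variation the pattern misses.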
Filter Product Model Names
Another use case for regex is to filter organic searches for product model names. This could be useful to see the search demand for specific models. For example, let’s say you own an online electronics store and you’re trying to see how well your site captures searches for Samsung TV models. Here are some of the real model numbers of Samsung TVs:
Q80A
Q70A
Q60A
Q60T
Q70T
Q80T
This is a challenge: you can’t just filter for “Q”, because that would pull in anything with the letter Q in it. Also, the numbers in the middle of the model number are all different. Prior to regex there was no way to filter for these model numbers in Search Console. But with regex, you can create a rule like this:
(?i)Q[0-9]0(A|T)
This short regex would capture all of the model numbers. Let’s break down how this works:
- (?i): This makes the match case insensitive, meaning the regex would also match “q90T”, “Q90t”, or “q90t”.
- Q: This simply matches the “Q” at the start of the model number.
- [0-9]0: The [0-9] matches any single digit from 0 to 9. The “0” outside the brackets is there because the numeric part of every model number ends in zero. So this matches anything from “00” to “90” in the model number.
- (A|T): This matches the “A” or “T” at the end of the model number.
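To see the pieces of the model-number pattern working together, here is a small Python sketch. It assumes Python's `re` module treats this pattern the same way Search Console does, and the sample queries are invented for illustration.

```python
import re

# Same pattern as above: case-insensitive, "Q", one digit, "0", then "A" or "T".
MODEL_PATTERN = re.compile(r"(?i)Q[0-9]0(A|T)")

# Hypothetical sample queries.
samples = [
    "samsung q80a review",
    "Q60T price",
    "q70t vs q80t",
    "q90t 65 inch",      # not in the list above, but [0-9] still matches the 9
    "samsung qled tv",   # has a "q" but no digit after it
    "faq page",
]

# Keep only the queries where the pattern appears somewhere in the text.
matches = [s for s in samples if MODEL_PATTERN.search(s)]
misses = [s for s in samples if not MODEL_PATTERN.search(s)]

print("Matches:", matches)
print("Misses:", misses)
```

Note that the pattern is broader than the six models listed: it also picks up “q90t”, or any other model that fits the same shape.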
Filter URLs by File Extension
You can also filter URLs with regex. This could be helpful in many ways. For example, you may want to see organic impressions, clicks, and queries for any non-HTML assets you have ranking in Google Search such as: .docx, .pdf, .rtf, .xls. This is especially useful since non-HTML assets cannot have the Universal Analytics code loaded and would typically not show up in your Google Analytics report.
Filtering for this is very easy to do with the following regex:
\.docx|\.pdf|\.rtf|\.xls
Note that the backslash “\” escapes special characters. In regex, the dot “.” is a special character that matches any single character, so putting a backslash in front of it makes it match a literal period instead.
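The difference the backslash makes is easier to see side by side. The Python sketch below compares the escaped pattern with an unescaped “.pdf”. It assumes Python's `re` module matches these patterns the same way Search Console does, and the example.com URLs are hypothetical.

```python
import re

# Escaped pattern from above: each "\." matches a literal period.
EXT_PATTERN = re.compile(r"\.docx|\.pdf|\.rtf|\.xls")

# Unescaped version of one alternative: "." matches ANY single character.
UNESCAPED_PDF = re.compile(r".pdf")

# Hypothetical URLs for illustration.
urls = [
    "https://example.com/files/report.pdf",
    "https://example.com/files/budget.xlsx",
    "https://example.com/blog/what-is-a-pdf/",
    "https://example.com/about.html",
]

escaped_hits = [u for u in urls if EXT_PATTERN.search(u)]
unescaped_hits = [u for u in urls if UNESCAPED_PDF.search(u)]

print("Escaped pattern:", escaped_hits)
print("Unescaped .pdf:", unescaped_hits)
```

Without the backslash, the blog post URL slips in because “-pdf” satisfies “.pdf”. Also note that “\.xls” matches “.xlsx” files too, since the pattern only needs to appear somewhere in the URL.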
Conclusion
Regex takes some time to get used to if you’re not familiar with it, but trust us that it’s worth investing the time to learn it. You can speed up your analysis and extract greater insight into your site’s organic performance. However, insights can only take your brand so far; you need proven SEO experts to translate analytics insights into actionable strategies to grow your organic performance. If you’re stuck in “analysis paralysis”, contact Atypical Digital today to help craft and execute a bespoke SEO solution for your brand.