Analyzing web traffic is a lot easier if you apply intuitive filters to it. Google Analytics filters help you gather much greater intelligence from your data. In this tutorial, learn how to use them and why you should apply certain important filters to your data.
Let’s talk about Google Analytics filters. Some of you may already know what they are and how to use them. For some, it’s new. Filters are a pretty cool feature of GA that allows you to edit the views by including, excluding, or modifying the data. They also let you create unique views with only the most critical information you need.
Combining different options available on GA, you can create a whole bunch of filters. There are some important and popular ones, such as IP exclusion filter, referral spam filter, etc. There are also obscure ones whose usefulness may not be immediately apparent.
There are all kinds of websites—e-commerce, small business, medium business, personal blogs, social media, professional consulting, etc. Some filters, such as IP exclusion, are applicable to most of them. But some other custom filters are very useful for specific types of websites only. In this article, let’s learn what kind of filters you may really need.
Here is the structure of this article so that you know what to expect. We will start with a brief introduction to filters and an example of how to set one up. Then we will look at a number of filters and the parameters to configure them. We will also see why you may need these filters.
Say you manage the corporate website for your company, almost the same way I do. Your website gets visited by your company’s employees, who may be spread across the globe. But your boss is not selling to them, so he wants to know only about the visits from potential clients. You can use a filter to exclude your ISP traffic in this case.
Similarly, if you are focusing your business on one or more particular regions, you may want to track how people from those regions interact with your site; you are less concerned about other places. Use a geography include filter in this case.
As you can see, filters are useful for removing unnecessary data from your reports.
There is another use these filters serve. Assume that you have a website and its URL structure is somewhat messed up. For instance, say “yourdomain.com/?pageid=898” is a URL on your site—obviously, you have no idea what it is about, thanks to your web developer. If you don’t have much control over the link structure of your site and it is managed by an agency, you may want GA to permanently rename these URLs in the reports to something that you can understand, such as “yourdomain.com/?pageid=mens-shoes” or “yourdomain.com/mens-shoes”.
How do you do it? With a search and replace filter.
So, as you can see, there are two key uses for filters:
- To include or exclude traffic from a particular ISP or geography, traffic to a particular subdirectory, etc.
- To modify the captured data to make it more intuitive
Based on the uses, there are different types of filters in GA. Let’s look at some of them. You can create them in your account section of your admin screen.
Broadly, filters are classified into “predefined” and “custom” filters.
Predefined filters are very easy to set up: all the necessary options are already there, and you don’t even need to tame the daunting regex devil. The two types of predefined filters are “exclude” and “include only” filters. Using these, you can exclude or include traffic from specific ISPs or IP addresses, or traffic to specific subdirectories or hostnames.
Also, you don’t have to know the full name of your ISP or the full IP address to exclude the traffic as the filter gives you options to match for values that begin with, end with, or contain your search string. Pretty cool!
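To make the match options concrete, here is a minimal Python sketch of how “equal to”, “begin with”, “end with”, and “contain” behave on a single field value. The function name, the short option keys, and the sample IP addresses are hypothetical, and the comparison here is case-sensitive, which may differ from how GA treats case.

```python
def matches(value: str, match_type: str, pattern: str) -> bool:
    """Mimic the four predefined match options on one field value."""
    if match_type == "equal":
        return value == pattern
    if match_type == "begin":
        return value.startswith(pattern)
    if match_type == "end":
        return value.endswith(pattern)
    if match_type == "contain":
        return pattern in value
    raise ValueError(f"unknown match type: {match_type}")


# An "exclude" filter would drop every hit for which this returns True.
office_hit = matches("192.168.1.25", "begin", "192.168.1.")  # partial IP is enough
other_hit = matches("10.0.0.1", "begin", "192.168.1.")
```

With a “begin with” match, a partial value such as the first three octets of an office IP range covers every address in that range.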
Custom filters are where all the action is. You can implement custom filters to have Analytics dance to your tune. Some uses are including only traffic to certain subdirectories, excluding all traffic from a region, search and replace all of those unappealing query strings (/?ref=blah-blah&more) from your URLs, replace the unintuitive “(not set)” and “(not provided)” data with something that you can use, and many others.
On the flip side, custom filters require you to have a pretty decent understanding of regular expressions.
For all their goodness, filters come with a few cautions that you need to know.
- Filters destroy data, so set up one unfiltered view. You need to be aware that once you apply the filter, your view will no longer have its original configuration. Filters permanently modify the view. This is the reason why you must set up an unfiltered view that always holds your raw data.
- How many filters can you really have? It’s kind of unlimited, really. But you can have only 25 views per property in your standard GA account. Since filters are applied to views, effectively you can have only 25 filter setups.
- Filters don’t help you modify the past data. Once you create a filter, it will be applied only to the future data captured in the view. Filters are, in short, non-retroactive.
- In several cases you can’t really verify the filter. You can get the filtered data only after about 24 hours. But we’ll see a trick to verify the filter soon after setting it up.
Let’s see how you can set up a normal predefined filter to exclude traffic from your ISP. Please note that you can create filters at the account level as well as the view level, depending on your access level in Google Analytics. Remember that it’s always better to create your unique filters at the account level and then assign them as necessary to each view.
Head over to the predefined filters section and click “Add Filter”; enter a name for your filter; select the filter type as “Exclude”; select “traffic from the ISP domain” and then “that are equal to”. In the “ISP Domain” textbox, enter the domain of your ISP as shown below.
So, let’s dive into the key idea of this article. It would be counterproductive to have all kinds of filters set up on your account. For one, you have a limit of 25 Analytics views for your property. So you don’t want to waste them by creating filters you don’t need.
A basic filter that everyone could make use of is the IP exclusion filter for the internal traffic. Other than that, depending on the type of website you have, you need to use specific filters.
The filters also have to be ordered in the most logical way possible.
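Order matters because each filter sees the output of the one before it. The following Python sketch is only a simulation of that idea, not GA code: the two filter functions, the “/blog/” subdirectory, and the hit dictionary are all hypothetical.

```python
import re


def prepend_hostname(hit):
    """Rewrite the request URI so it starts with the hostname."""
    return {**hit, "request_uri": hit["hostname"] + hit["request_uri"]}


def include_blog_dir(hit):
    """Keep the hit only if the request URI starts with /blog/."""
    return hit if re.search(r"^/blog/", hit["request_uri"]) else None


def run(filters, hit):
    """Apply filters in order; a filter returning None drops the hit."""
    for apply_filter in filters:
        hit = apply_filter(hit)
        if hit is None:
            return None
    return hit


hit = {"hostname": "www.yourdomain.com", "request_uri": "/blog/post"}
kept = run([include_blog_dir, prepend_hostname], hit)     # include first, then rewrite
dropped = run([prepend_hostname, include_blog_dir], hit)  # rewrite first, include fails
```

In the second ordering the URI no longer starts with “/blog/” by the time the include filter runs, so the hit is lost.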
Let’s check out several types of filters and understand why you should have them.
Filters to exclude your IP address or your ISP are very commonly used by websites. This is to ensure that your own visits to your website are not counted by GA. But for whom is this filter really?
Imagine the scenario of a personal blog. Most of the time, personal blogs are written by hobbyists and enthusiasts who may blog even from public Wi-Fi. Their own visits wouldn’t skew their GA dashboard much.
Let’s look at another scenario where the IP exclusion filter may not be ideal. Imagine a restaurant offering Wi-Fi service to the customers. It is possible that the customers visit the restaurant website through the same Wi-Fi network. If you exclude this Wi-Fi IP from your reports, these customer visits won’t be recorded.
On the other hand, consider an IT company with thousands of employees across the globe. In this case, an ISP/IP filter is extremely important and logical, as the employees of this company may visit the website all the time using the company internet. All this traffic would create much confusion in your dashboard.
In the case of any medium to large corporation, this filter is extremely important and should be one of the first you implement.
Let’s understand a little bit about referral traffic. Referrals are the websites that send traffic to your site, usually through a link. For instance, if another website has a link to a page in your site and somebody clicks that link to visit your site, it’s counted as an instance of referral traffic and GA captures the URL of the referring page.
However, spammers take advantage of this to inject malicious referral data into your traffic reports. They accomplish this in two ways:
- By visiting your website (crawlers) and not obeying your robots.txt rules
- By not visiting your website (ghosts), but sending data to random GA tracking IDs
In either case, referral spam can seriously screw up your Analytics reports. You will see the effects of this kind of spam as your website grows in popularity and attracts global traffic, especially from countries like Russia.
You need to use a spam filter in Google Analytics to filter out the referral spam. An updated list of referral spammers can be found here. You may have to create a regular expression to add as many domains as possible as your GA filter textbox has a character limit.
In the filter configuration boxes that follow, the bolded items are what you actually configure. If something is not bolded, it’s intended as an example or explanation.
Filter Type: Custom->Exclude
Filter Field: Referral
Filter Pattern: Ideally a regex matching spam referral domains
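Because of the character limit, a long spammer list usually has to be split across several filters. Here is a rough Python sketch that packs domains into as few pipe-separated patterns as will fit. The helper name and the sample domains are hypothetical, and the default limit of 255 characters is an assumption; check the limit in your own account.

```python
import re


def build_spam_patterns(domains, max_len=255):
    """Pack escaped domains into pipe-separated patterns of at most max_len chars.

    max_len=255 is an assumed field limit, not a value taken from this article.
    """
    patterns, current = [], ""
    for domain in domains:
        escaped = domain.replace(".", r"\.")  # a bare dot would match any character
        if len(escaped) > max_len:
            raise ValueError(f"domain too long for one pattern: {domain}")
        candidate = escaped if not current else current + "|" + escaped
        if len(candidate) > max_len:
            patterns.append(current)  # start a new filter pattern
            current = escaped
        else:
            current = candidate
    if current:
        patterns.append(current)
    return patterns


# Hypothetical spam domains; a small max_len is used to force a split.
patterns = build_spam_patterns(
    ["spam-one.example", "spam-two.example", "bad.example"], max_len=40
)
is_spam = any(re.search(p, "spam-two.example") for p in patterns)
```

Each returned pattern would go into its own exclude filter.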
The GitHub page also tells you how you can throw a “403 Forbidden” error to the spammy crawlers.
Another use of the referral filter is to filter out traffic from the domains that you own or work with, such as a shopping cart operator in the case of an ecommerce site.
Imagine you are using PayPal as your shopping cart operator. When somebody orders from your site and clicks the PayPal “Buy Now” button, they are taken to the PayPal domain to complete the checkout process. At the final stage, they are redirected back to your site. This confuses Google, which may report a new session from PayPal. You can avoid this kind of erroneous reporting by adding the PayPal domain to your referral filter.
More about filtering out referral spam can be found here.
Let’s look at the example of a restaurant chain that operates in select cities. This kind of a business may be very interested in setting up a custom view only for those locations. You can set it up with a geography filter.
Filter Type: Custom->Include
Filter Field: City/country
Filter Pattern: City names separated by the pipe (|)
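As a small illustration, the pipe-separated pattern can be built and tested in Python before you paste it into the filter. The city names are hypothetical, and the anchors (^ and $) are an optional addition that stops partial matches; the configuration above does not require them. Python’s regex engine may also differ in small ways from the one GA uses.

```python
import re


def include_pattern(names):
    """Join names with pipes and anchor the result so only whole values match."""
    return "^(" + "|".join(names) + ")$"


# Hypothetical city list for a restaurant chain.
CITY_PATTERN = include_pattern(["Austin", "Denver", "Portland"])


def keep_hit(city: str) -> bool:
    """An include filter keeps a hit only when the field matches the pattern."""
    return re.search(CITY_PATTERN, city) is not None
```

The same helper works for any pipe-separated include pattern, such as a list of countries.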
Geography filters can be created also with countries. So, a medium-sized business operating in certain countries can create a filter like this to get a custom view on the prospective website visits from those countries.
This kind of filter wouldn’t add much value to a large corporation like Oracle that operates across the globe.
Imagine that you are tracking multiple domains and subdomains. In such a case, the request URI, which is the part that follows the domain name in your GA reports, may be duplicated.
For instance, consider “blog.yourdomain.com/index.php” and “www.yourdomain.com/index.php”. Here, in both cases, the URI is “/index.php”. In your reports, only the URI is shown, so you can’t figure out which one it is. In such a case, the URI rewrite filter will be useful.
We can use an advanced filter in this case. Please note that advanced filters are a little trickier, and as an added burden, they can’t be verified in real time.
Filter Type: Custom->Advanced
Field A->Extract A: Hostname: (.*)
Field B->Extract B: Request URI: (.*)
Output To->Constructor: Request URI: $A1$B1 (the parentheses capture the values that $A1 and $B1 refer to)
Check: Field A Required, Field B Required, Override Output Field
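The way the constructor combines the two fields can be simulated in Python. This is only a sketch of the idea, assuming each field is captured with a group such as (.*); the function names are hypothetical and the $A1/$B1 expansion is re-implemented here by hand.

```python
import re


def expand(constructor, a_match, b_match):
    """Replace $A1, $B1, ... with the matching captured groups."""
    def substitute(token):
        source = a_match if token.group(1) == "A" else b_match
        return source.group(int(token.group(2)))
    return re.sub(r"\$([AB])(\d)", substitute, constructor)


def rewrite_uri(hostname, request_uri):
    """Simulate the advanced filter: prepend the hostname to the request URI."""
    a_match = re.search(r"(.*)", hostname)     # Field A: Hostname
    b_match = re.search(r"(.*)", request_uri)  # Field B: Request URI
    return expand("$A1$B1", a_match, b_match)


blog_uri = rewrite_uri("blog.yourdomain.com", "/index.php")
www_uri = rewrite_uri("www.yourdomain.com", "/index.php")
```

After the rewrite, the two pages that shared “/index.php” show up as distinct rows.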
Obviously this filter is not necessary if you have one domain and you don’t have any subdomain in it. In such cases, you will have only unique request URIs.
In the corporate website of my company, there are many URLs with query strings appended to them, as in “mydomain.com/pagename/?query=value”. Google treats the query-string version of the URL as different from the normal URL (even though they both take you to the same page). So, in final reports, we can never get the exact number of visits for a URL unless we go through the entire report and sum up the visits to all of its duplicate query-string versions.
But a smarter way would be to use a query string removal filter. Be very careful in implementing this. It may not work for your site if it uses a different kind of query string structure.
Filter Type: Custom->Advanced
Field A->Extract A: Request URI: (^/[^\?]+)(\?.*)
Output To->Constructor: Request URI: $A1 (replaces the request URI with its first part that precedes the ‘?’)
Check: Field A Required, Override Output Field
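Before relying on this pattern, it helps to try it against a few of your own URLs. Here is a quick Python check of the same expression; the helper name is hypothetical, and Python’s regex engine may differ slightly from GA’s. When Field A does not match, the sketch leaves the URI unchanged, mirroring “Field A Required”.

```python
import re

# Same expression as in the Extract A field above.
QUERY_PATTERN = re.compile(r"(^/[^\?]+)(\?.*)")


def strip_query(request_uri: str) -> str:
    """Return the part before '?' when the pattern matches, else the URI as-is."""
    match = QUERY_PATTERN.search(request_uri)
    return match.group(1) if match else request_uri


cleaned = strip_query("/internet-of-things/?ref=facebook")
untouched = strip_query("/?pageid=898")  # nothing between '/' and '?', so no match
```

The second call shows one query string structure the pattern does not handle: a query attached directly to the root path.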
An OS filter is used to include or exclude visits from a specific operating system. Although you can get all OS and OS version reports directly from the Audience->Technology section of Google Analytics, the OS filter may be useful for some app developers and mobile-targeted websites.
Filter Type: Custom->Include
Filter Field: Operating System Platform
Filter Pattern: Names of OSes separated by the pipe (|)
Imagine an ecommerce giant that uses “shop.mydomain.com” as the primary shop and “www.mydomain.com” for generic content. If you don’t want to frustrate your ecommerce analytics admin, create a subdomain-only view that includes only visits to the “shop” subdomain, and give the admin access to that view instead of the entire analytics dashboard.
The subdirectory include filter works in the same way.
This filter is very important for ecommerce sites, blog networks, corporate websites, diverse content blogs such as DotDash, etc.
For subdomain, use the filter field “hostname” and for subdirectory, use “request URI”.
Filter Type: Custom->Include
Filter Field: Hostname/request URI
Filter Pattern: Subdomain/subdirectory
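As a rough illustration, here is how such include patterns could look and behave, sketched in Python. The “shop” hostname comes from the example above; the “/blog/” subdirectory, the hit dictionaries, and the function name are hypothetical.

```python
import re

HOSTNAME_PATTERN = r"^shop\.mydomain\.com$"  # subdomain-only view
SUBDIR_PATTERN = r"^/blog/"                  # hypothetical subdirectory-only view


def in_view(hit, field, pattern):
    """An include filter keeps the hit only if the chosen field matches."""
    return re.search(pattern, hit[field]) is not None


shop_hit = {"hostname": "shop.mydomain.com", "request_uri": "/cart"}
www_hit = {"hostname": "www.mydomain.com", "request_uri": "/blog/welcome"}

shop_in_shop_view = in_view(shop_hit, "hostname", HOSTNAME_PATTERN)
www_in_shop_view = in_view(www_hit, "hostname", HOSTNAME_PATTERN)
www_in_blog_view = in_view(www_hit, "request_uri", SUBDIR_PATTERN)
```

Escaping the dots and anchoring the hostname keeps lookalike hosts out of the view.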
After Google moved to secure search, webmasters no longer get any real data on organic search queries. Google only provides (not provided) as the keyword for all secure, logged-in searches; obviously, it’s of no use to us. But it may be more intuitive to replace all those (not provided) and (not set) keywords with something useful, such as the title of the page that the search led to.
In essence, you can split up a row with 300 visits marked as (not provided) to individual rows with the page title as the value of each search term.
Filter Type: Custom->Advanced
Field A->Extract A: Campaign Term: (not set)|(not provided)
Field B->Extract B: Page Title: (.*)
Output To->Constructor: Campaign Term: NP: $B1
Check: Field A Required, Field B Required, Override Output Field
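The effect of this filter can be simulated in Python. This is a sketch only: the function name and sample values are hypothetical, and the patterns are copied from the configuration above. Because the parentheses in the Field A pattern act as regex groups, it matches the words inside the brackets.

```python
import re

TERM_PATTERN = re.compile(r"(not set)|(not provided)")  # Field A: Campaign Term
TITLE_PATTERN = re.compile(r"(.*)")                     # Field B: Page Title


def rewrite_term(campaign_term: str, page_title: str) -> str:
    """Replace an empty keyword with 'NP: <page title>'; leave real keywords alone."""
    a_match = TERM_PATTERN.search(campaign_term)
    b_match = TITLE_PATTERN.search(page_title)
    if a_match is None or b_match is None:  # both fields are required
        return campaign_term
    return "NP: " + b_match.group(1)


rewritten = rewrite_term("(not provided)", "Men's Shoes")
kept = rewrite_term("running shoes", "Men's Shoes")
```

The “NP:” prefix makes the rewritten rows easy to tell apart from genuine keywords in the report.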
After you set up your filters, you may want to test whether they work correctly. From Admin->View->Filters screen, you can verify if the filter is working properly. But this is possible only if you have enough data in the previous 7 days.
How can you test it if you don’t have enough data?
You can do it through the Google real-time view, for certain filters only. Let’s check out the query string removal filter.
Open your real-time overview page on two tabs. On one tab, open the unfiltered view and on the second, open the view with the query string removal filter applied. Now open the website with the query string from your phone or any other device which is not filtered out by your setup.
The following image is the unfiltered view with the query string in place for the URI “/internet-of-things/?ref=facebook”
Now, check out the filtered view in the next tab, which has properly removed the query string.
Obviously this cannot be done for the filters where visit parameters are not in your control, such as a geography filter.
Hope you enjoyed the article. Set up the filters you really need and ensure that you order them logically. Make sure that filters are created at the account level and assigned to views, rather than creating them at the view level. That way, management of many filters will become much easier.