Removing Referral Spam From Google Analytics

    Do you want to know how to block referral traffic in Google Analytics? Almost everyone who uses Google Analytics will come across spam traffic in their statistics in one form or another, and the most common form is referral spam. In this article, I will show you the process of removing referral spam from Google Analytics.

    Last Updated: 14th May 2020

    What Is Referral Spam In Google Analytics?

    Referral spam is fake traffic that can show up in Google Analytics as a referral visit, a search term or a direct visit. In the majority of cases, these are not actually visits and can, therefore, mess up your website analytics data by making it appear that you have had a lot more visits than you have actually had. This is extremely frustrating to anyone who is trying to measure the success of their online marketing efforts, as it is important to get a true picture of how many visits you are getting and where those visits are coming from.

    The below screenshot shows just how much of an impact referral spam in Google Analytics can have. The top table shows a list of external websites (source) that have sent referral visits to our website and the bottom table shows a list of our landing pages where the visits occurred. I have highlighted all the spam visits in red and if you look closely, it is easy to see why.

    What Is Referral Spam In Google Analytics

    In the top table, I know that our website does not have backlinks on websites such as lifehacker or get-free-social-traffic, so it is not possible for us to have traffic from these sites. The names of these websites also sound spam-like. In the bottom table, I know that we don’t have landing pages on our site with URLs such as /traffic2cash or /free-share-buttons, so it is not possible to have traffic to pages that don’t exist. You can also see that for most of these visits the average time on page is 00:00:00, which is another warning sign.

    Why Is Referral Spam Traffic There?

    It is important to remember that the visits reported by referral spam in Google Analytics rarely ever actually occur, so if you look at your server logs, for example, those visits would probably not show up. They are simply fake numbers that spammers have injected into the analytics software to make it look like the traffic occurred. But why would someone bother to go to all this effort? Well, the reasons are actually quite clever.

    Spammers will typically want to promote a website for one malicious reason or another and they know that Google Analytics is the most popular form of web analytics software being used. By looking at the source code of any site, it is easy to see if a website has Google Analytics installed and typically anyone who uses it will be checking it on a regular basis and closely studying the data it tracks. By changing the data within Google Analytics profiles to make it look like visits have occurred from the websites they wish to promote, they know there is a good chance that people will see those sites and hopefully pay a visit when scrutinising the data.

    The types of spammers that do this can vary, along with their reasons for doing it. Some may simply wish to promote their own sites. It has also been known to be a tactic used by some shady SEO companies that use black-hat methods to acquire traffic for their clients. These companies make promises to their clients by guaranteeing them a certain number of visits – they don’t care where these visits have come from or the fact that they are low-quality visits that are highly unlikely to convert. For them, as long as they get the visits, they have done their job.

    How To Block Referral Traffic In Google Analytics

    For those of us that invest both time and money into marketing our websites, referral spam in Google Analytics is a real pain: the legitimate traffic ends up being masked, so we never get a true representation of how our sites are performing. So why is Google not doing something about it, I hear you ask? Well, the truth is, they are. But, as we know, spammers are always one step ahead, so it is a continuous battle that seems never-ending.

    When it comes to removing referral spam from Google Analytics, there are methods to block the offending URLs through your website .htaccess file, but this does not always work, because most of the referral spam in Google Analytics does not actually visit your site and therefore does not go through your server. There are typically two types of referral spam. The first is Crawler Spam, which crawls your website as a search engine does. The second is Ghost Spam, which bypasses your site completely and just manipulates your website analytics data. Your .htaccess file can help to block Crawler Spam but not Ghost Spam, and Ghost Spam is a lot more common.

    You can’t stop Ghost Spam at the server, as you can’t block something that never really happened in the first place. Instead, you need to filter the ghost traffic out, which you can do using the built-in filters within Google Analytics. Below I have listed the various filters that we use when removing referral spam from Google Analytics, both on our own site and on our clients’ sites, along with how you can apply these successfully to your own Google Analytics account:

    Step 1: Make A Filtered View

    The first step is to create a new view within your Google Analytics profile to apply the filters to. Although you can apply the filters to the default view, I would not advise doing this, as it is important to leave this view for raw, untouched data. This way, you can always switch back if you need to.

    Log in to your Google Analytics account and go to the admin screen. Under ‘View’ click the dropdown and select ‘Create new view’. Give the view a name such as ‘Filtered View’, complete the rest of the information on-screen and click ‘Create View’.

    Step 2: Apply Bot Filtering

    Google Analytics comes with a built-in Bot Filter, which will exclude all hits from known bots and spiders. Although I find this built-in feature pretty basic and not overly effective when it comes to removing the majority of spam traffic, it is still the best place to start.

    From the admin screen, under ‘View’ click ‘View Settings’. On this screen, make sure the checkbox is selected next to ‘Bot Filtering’.

    Step 3: Filter Hostnames

    Hostnames are basically the opposite of source. The source is the website where the visitor has come from (i.e. Facebook, Twitter, Google, etc), the hostname is the website where the visitor arrives (i.e. your own website). Your main hostname will be your website domain but there could be others, depending upon how your site is set up. By filtering out all invalid hostnames, we can remove all ghost spam visits from your data.

    To find all of your valid hostnames, you need to look at the Network report that you can find by going to ‘Audience – Technology – Network’. On this screen, click the little tab titled ‘Hostname’ which is just above the table. From the list of hostnames in the table, you should be able to recognise those that are your own and those that are spam (make sure you select the widest date range possible to get as much data as you can). As well as your domain name, you may have hostnames for things like your email marketing (MailChimp, for example, will have a hostname that looks something like yourname.us13.list-manage). Go through the list and write down all the hostnames you recognise – anything else, including (not set), should be treated as spam, no matter how much it looks like a genuine website.

    Filtering Hostnames In Google Analytics

    Now, go to the admin screen and click ‘Filters’ under the ‘View’ menu. Click the ‘Add Filter’ button to add a new filter to the current view and follow these steps:

    1. Give the filter a name in the ‘Filter Name’ box
    2. Under ‘Filter Type’, select ‘Custom’ and select the ‘Include’ radio button
    3. Under ‘Filter Field’ select ‘Hostname’ from the dropdown list
    4. In the ‘Filter Pattern’ text box, type in your valid hostnames, making sure that you separate each with a vertical line (|) without any spaces and not including any www prefix. Any dots or dashes must also have a backslash in front of them. So, the hostnames for our website may look like this: improveposition\.co\.uk|improveposition\.us13\.list\-manage\.com
    5. Once done, click the ‘Save’ button to apply the filter
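
    If you have more than a couple of hostnames, escaping them by hand is easy to get wrong. The following is a minimal Python sketch of the rules in step 4, not part of Google Analytics itself: the function names are my own, and the partial, case-insensitive match in is_valid_hostname is an assumption about how the filter compares values.

```python
import re

def build_hostname_pattern(hostnames):
    """Join valid hostnames into one pattern for the Include filter."""
    parts = []
    for host in hostnames:
        host = host.strip().lower()
        # Drop any leading "www." so one entry covers both forms.
        if host.startswith("www."):
            host = host[len("www."):]
        # re.escape puts a backslash in front of dots and dashes.
        parts.append(re.escape(host))
    # Separate each hostname with a vertical line and no spaces.
    return "|".join(parts)

pattern = build_hostname_pattern(
    ["www.improveposition.co.uk", "improveposition.us13.list-manage.com"]
)
print(pattern)

def is_valid_hostname(hostname, pattern=pattern):
    # Assumption: the filter does a partial, case-insensitive match.
    return re.search(pattern, hostname, re.IGNORECASE) is not None
```

    Running a few hostnames from your Network report through is_valid_hostname before saving the filter is a cheap way to confirm that your own domains are kept and that values such as (not set) are not.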

    Step 4: Filter Crawler Spam

    We can remove crawler spam by applying a filter to hide traffic from known spam websites (source). This is a little more complex, as spammers will always find a way to make the traffic appear to come from a new and different website each time. However, from experience, we know the most common websites that spammers typically use, and by filtering these out we can remove almost all of the spam traffic from our website analytics reports.

    At the time of writing, I have tested the filters below, which have removed 100% of crawler spam for both my own and my clients’ websites.

    Go to the admin screen and click ‘Filters’ under the ‘View’ menu. Click the ‘Add Filter’ button to add a new filter to the current view and follow these steps:

    1. Give the filter a name in the ‘Filter Name’ box
    2. Under ‘Filter Type’, select ‘Custom’ and select the ‘Exclude’ radio button
    3. Under ‘Filter Field’ select ‘Campaign Source’ from the dropdown list
    4. In the ‘Filter Pattern’ text box, copy and paste the following:
      semalt|ranksonic|timer4web|anticrawler|uptime(robot|bot|check|\-|\.com)|foxweber|:8888|xtraffic\.plus|(christopherblog|tammyblog|billyblog)\.online|traffic4free|bottraffic|easy-website\-traffic|bot4free|trafficbot
    5. Once done, click the ‘Save’ button to apply the filter

    Now, follow the same steps again to set a second filter, but add the following as the ‘Filter Pattern’:

    (axcus|dotmass|artstart|dorothea|artpress|matpre|ameblo|freeseo|jimto|seo-tips|hazblog|overblog|squarespace|ronaldblog|c\.g456|zz\.glgoo|harriett|webedu|barbarahome|verabauer|deirdre|ninacecillia|reginanahum|deniseconnie|firstblog|maxinesamson)\.top
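
    Before relying on an exclude filter, it helps to check that it catches the spam sources without touching genuine ones. Here is a rough Python sketch of that check. The two patterns are shortened versions of the ones above (swap in the full patterns for a real test), the sample sources are made up, and the partial, case-insensitive matching is an assumption about how the filter behaves.

```python
import re

# Shortened versions of the two filter patterns above, for readability.
CRAWLER_PATTERNS = [
    r"semalt|ranksonic|timer4web|anticrawler|uptime(robot|bot|check|\-|\.com)|:8888|trafficbot",
    r"(axcus|dotmass|artstart|squarespace|firstblog)\.top",
]

def is_crawler_spam(source):
    """Return True if either exclude pattern matches the source."""
    return any(
        re.search(pattern, source, re.IGNORECASE) for pattern in CRAWLER_PATTERNS
    )

# Hypothetical sample values for the 'Campaign Source' field.
for source in ["semalt.com", "axcus.top", "squarespace.com", "google"]:
    print(source, is_crawler_spam(source))
```

    Note that the second pattern only matches names ending in .top, so a genuine referral from squarespace.com is left alone.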

    Step 5: Filter Fake Languages

    Language spam is one of the newer forms of referral spam in Google Analytics, where spammers inject messages into the language HTTP header, as you can see from the below example:

    How To Filter Fake Languages In Google Analytics

    Go to the admin screen and click ‘Filters’ under the ‘View’ menu. Click the ‘Add Filter’ button to add a new filter to the current view and follow these steps:

    1. Give the filter a name in the ‘Filter Name’ box
    2. Under ‘Filter Type’, select ‘Custom’ and select the ‘Exclude’ radio button
    3. Under ‘Filter Field’ select ‘Language Settings’ from the dropdown list
    4. In the ‘Filter Pattern’ text box, copy and paste the following:
      \s[^\s]*\s|.{15,}|\.|,|^c$
    5. Once done, click the ‘Save’ button to apply the filter
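
    The language pattern is quite cryptic, so here is a small Python sketch that splits it into its five alternatives and reports which ones a value triggers. The rule descriptions are my own reading of the expression and the sample values are hypothetical.

```python
import re

# The five alternatives of the language filter pattern, one per rule.
LANGUAGE_RULES = {
    "two or more whitespace characters": r"\s[^\s]*\s",
    "15 or more characters": r".{15,}",
    "contains a dot": r"\.",
    "contains a comma": r",",
    "is exactly 'c'": r"^c$",
}

def matched_rules(language):
    """List the rules that would cause this language value to be excluded."""
    return [
        name for name, rule in LANGUAGE_RULES.items() if re.search(rule, language)
    ]

print(matched_rules("en-gb"))                         # genuine value
print(matched_rules("visit spam-example.com today"))  # injected message
```

    Genuine language codes such as en-gb are short and contain no spaces, dots or commas, so they trigger none of the rules, whereas an injected message usually trips several at once.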

    Step 6: Filter Spam Networks / ISP Domains

    You may have noticed that some of your organic search reports show amazon keywords; these visits come from Bing with the keyword ‘amazon’ and a network domain that includes ‘paloaltonetworks’. To filter out these visits, add a new filter following the same steps as your crawler spam filters, but change ‘Campaign Source’ to ‘ISP Domain’ and use the following expression:

    paloaltonetworks|scaleway|kcura|^google(\.com$|usercontent\.com|bot\.com)$
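
    The ^ and $ anchors in the last part of this expression matter: they limit the match to the exact Google-owned domains rather than anything that merely contains the word google. A quick Python sketch shows the effect, assuming partial, case-insensitive matching; the sample domains are made up.

```python
import re

ISP_DOMAIN_PATTERN = (
    r"paloaltonetworks|scaleway|kcura|^google(\.com$|usercontent\.com|bot\.com)$"
)

def is_spam_network(domain):
    """Return True if the ISP domain would be excluded by the filter."""
    return re.search(ISP_DOMAIN_PATTERN, domain, re.IGNORECASE) is not None

# Hypothetical sample values for the 'ISP Domain' field.
for domain in ["paloaltonetworks.com", "google.com", "google.co.uk", "mygoogle.com"]:
    print(domain, is_spam_network(domain))
```

    The first three alternatives match anywhere in the domain, while the google alternative has to match the whole value.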

    Step 7: Filter Irrelevant ISP Organisations

    Not all irrelevant website traffic will come from spammers; some organisations use their own bots and spiders to crawl websites for general information, such as performance statistics and analytics. Although this traffic is not harmful, it still upsets your website traffic data and conversion rates. To filter this out, add a new filter following the same steps as your crawler spam filters, but change ‘Campaign Source’ to ‘ISP Organisation’ and use the following expression:

    hubspot|^google\sllc$|^google\sinc\.$|alibaba\.com\sllc|ovh\shosting\sinc\.|microsoft\scorp|facebook\sireland\sltd|online\ssas|evercompliant|early\sregistration\saddresses|inktomi\scorporation|google\scorporate|google\sswitzerland\sgmbh|kazooisyee|cloud69

    Now, follow the same steps again to set a second filter, but add the following as the ‘Filter Pattern’:

    vultr\sholdings|hos\-329450

    Now, follow the same steps again to set a third filter, but add the following as the ‘Filter Pattern’:

    chinanet\sfujian|putian\scity\sfujian|linode\sllc|amazon\.com\sinc\.|amazon\stechnologies\sinc\.|digitalocean\sllc|linode$|amazon\sdata\sservices

    Step 8: Segment The Current Data

    The steps above will ensure that any new traffic data that Google Analytics records will now have the spam visits removed. However, the new filters will not apply to any previous traffic data that had already been recorded before the filters were created.

    There is a way to apply the filter rules to previous traffic data by adding a new segment to the default, unfiltered view. To do this, go to the admin screen and click ‘Segments’ under the ‘View’ menu. Click the ‘+ New Segment’ button to add a new segment to the unfiltered view and follow these steps:

    1. Give the segment a name in the box at the top of the page
    2. Click ‘Conditions’ in the left column and make sure the first row reads: ‘Filter Sessions Include’ by selecting the correct items from the dropdowns
    3. In the second row, select ‘Hostname’ and ‘matches regex’ from the first two dropdowns
    4. In the text box, copy and paste in your hostnames from Step 3
    5. Click the ‘+ Add Filter’ button to add a second filter
    6. In the second filter box that appears, make sure the first row reads: ‘Filter Sessions Exclude’ by selecting the correct items from the dropdowns
    7. Select ‘Source’ and ‘matches regex’ from the first two dropdowns in the second row
    8. Copy and paste the following into the text box:
      (brateg|budilneg|buketeg|bezlimitko|biteg|boltalko|begalka|alfabot|arendovalka|bank\-rot|abcdefh|aptechko|bukleteg|abc)\.xyz|(magnet\-to\-torrent|torrent\-to\-magnet)\.com|(baixar|descargar)\-musica|wordpress(\-start|\-crew)|uptime(robot|bot|check|\-alpha|\.com)|vitaly|sharebutton|semalt|ranksonic|share\-button|anticrawler|timer4web|free\-video\-tool|responsive\-test|dogsrun|fix\-website\-er|dailyrank|sitevaluation|seo\-2\-0\.|99seo|top10\-way|(videos|buttons)\-for\-your|best\-seo\-(solution|offer)|buttons\-for\-website|profit\.xyz|dbutton|keywords\-monitoring|platezhka|7makemoney|forum69|kings\-analytics|checkpagerank|pr\-cy\.ru|\-\-(production|website|sale)\.com|(audit|dollars|success|top1|amazon|commerce)\-seo|free\-video\-tool|datract|hacĸer|ɢoogl|slifty\.github|\-liar.ru|3\-letter\-|rencer\.ru|foxweber|free\-fbook|goodwriterssales|tourcroatia|spinnerco|justkillingti|suralink|worldtraveler|oldfaithfultaxi|christopherlane|hollywoodweeklymagazine|losangeles\-ads|anniemation|timdreby|pcimforum|yellowstonesafaritours|autoseo|blogarama|for\-placing|brainwizard|casinos4|ḷ\.com|davidsbag|bestonwardticket|presleycollectiblesm|\-backlinks\.com|phoenicx\.co\.uk|be\-escorts|vidyoze|brasseriebread|helvetiiconsulting|johntrapane|cloudsendchef|theautoprofit|:8888|blog1989|incomekey|amazon\-ads\.ovh|krumble\.net|10bestseo|seo\-watch|blog100|seoservices2018|resell\-seo|auto\-?seo|mycheaptraffic|bestbaby\.life|lyfeijiu|yycbtb|tqwh\.net|xtraffic\.plus|xtrafficplus|(christopherblog|tammyblog|billyblog|georgeblog|samanthablog)\.online|(penzu|blogping|blogseo|broderickblog|monicablog)\.xyz|(artblog|howblog|kimberlyblog|seobook|merryblog|axcus|dotmass|artstart|dorothea|artpress|matpre|ameblo|freeseo|jimto|seo-tips|hazblog|overblog|squarespace|ronaldblog|c\.g456|zz\.glgoo|harriett|webedu|barbarahome|annaeydlish|blog2019|compliance-john|compliance-julianna|constanceonline|galblog|greatblog|josephineblog|onlineblog|marketingblog|rosemarie|johnthompson|annierainey|mosesyamtal|candy
myers|wikidot|bravenet|daisye|donaldblog|kevblog|livejournal|nancyblog|raymondblog|samlaurabrown|space2019|stylecaster|teresablog|veronicablog|wallinside|verabauer|deirdre|ninacecillia|reginanahum|deniseconnie|firstblog|maxinesamson)\.top|easy-website\-traffic|free\-website\-traffic|traffic4free|bottraffic|bot4free|trafficbot
    9. Click the ‘Or’ button to add a second rule
    10. Select ‘Language’ and ‘matches regex’ from the first two dropdowns
    11. In the text box, copy and paste the following:
      \s[^\s]*\s|.{15,}|\.|,|^c$
    12. Click the ‘Or’ button to add a third rule
    13. Select ‘Network Domain’ and ‘matches regex’ from the first two dropdowns
    14. In the text box, copy and paste the following:
      paloaltonetworks|scaleway|kcura|^google(\.com$|usercontent\.com|bot\.com)$
    15. Click the ‘Or’ button to add a fourth rule
    16. Select ‘Service Provider’ and ‘matches regex’ from the first two dropdowns
    17. In the text box, copy and paste the following:
      hubspot|^google\sllc$|^google\sinc\.$|alibaba\.com\sllc|ovh\shosting\sinc\.|microsoft\scorp|facebook\sireland\sltd|online\ssas|evercompliant|early\sregistration\saddresses|inktomi\scorporation|google\scorporate|google\sswitzerland\sgmbh|kazooisyee|cloud69|vultr\sholdings|hos\-329450
    18. Click the ‘Or’ button to add a fifth rule
    19. Select ‘Service Provider’ and ‘matches regex’ from the first two dropdowns
    20. In the text box, copy and paste the following:
      internet\ssecurity\s\-|secure\sinternet\sllc|versia\sltd|altushost\ssweden\snetwork|web4africa\s\-ng|altushost\sluxembourg\snetwork|gz\ssystems\slimited\s\-|hostroyale\sportugal|gz\ssystems\slimited\s\-|north\sstar\sinformation\shi\.tech|putian\scity\sfujian
    21. Click the ‘Or’ button to add a sixth rule
    22. Select ‘Service Provider’ and ‘matches regex’ from the first two dropdowns
    23. In the text box, copy and paste the following:
      chinanet\sfujian|putian\scity\sfujian|linode\sllc|amazon\.com\sinc\.|amazon\stechnologies\sinc\.|digitalocean\sllc|linode$|amazon\sdata\sservices
    24. Click ‘Save’ to save the segment.

    The end result should look something like this:

    Adding Segments In Google Analytics
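
    To make the logic of the segment easier to follow, here is a rough Python sketch of how I read it: a session is kept only if its hostname matches the include pattern and none of the exclude rules match. The patterns are shortened stand-ins for the full ones above, and the field names and sample sessions are hypothetical.

```python
import re

# Shortened stand-ins for the full patterns used in the segment.
HOSTNAME_INCLUDE = r"improveposition\.co\.uk|improveposition\.us13\.list\-manage\.com"
EXCLUDE_RULES = {
    "source": r"semalt|ranksonic|trafficbot",
    "language": r"\s[^\s]*\s|.{15,}|\.|,|^c$",
    "network_domain": r"paloaltonetworks|scaleway|kcura",
    "service_provider": r"hubspot|digitalocean\sllc|linode\sllc",
}

def keep_session(session):
    """Include on hostname, then exclude if ANY of the 'Or' rules match."""
    hostname = session.get("hostname", "")
    if not re.search(HOSTNAME_INCLUDE, hostname, re.IGNORECASE):
        return False
    for field, pattern in EXCLUDE_RULES.items():
        if re.search(pattern, session.get(field, ""), re.IGNORECASE):
            return False
    return True

# Hypothetical sessions: one genuine, one ghost spam, one crawler spam.
genuine = {"hostname": "www.improveposition.co.uk", "source": "google",
           "language": "en-gb", "network_domain": "virginmedia.com",
           "service_provider": "virgin media limited"}
ghost = dict(genuine, hostname="(not set)")
crawler = dict(genuine, source="semalt.com")
print(keep_session(genuine), keep_session(ghost), keep_session(crawler))
```

    The hostname check removes ghost spam, while the ‘Or’ rules remove crawler spam and bot traffic that did reach a valid hostname.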

    Now, whenever you are viewing a report in the unfiltered view, you can add this segment to filter the spam visits out of past traffic data. For data recorded from now on, the filtered view will do this for you, so you won’t need the segment there.

    Give Your Feedback On Removing Referral Spam From Google Analytics

    I hope you found this article on removing referral spam from Google Analytics helpful. The steps above are the exact process that we follow for all our own clients, so if you are already investing in one of our SEO Packages, you can rest assured that this work has already been carried out on your site. If you are not already a client of ours and would like us to implement all of the above for you, then get in touch with us so we can help.

    As new spam sites are found, I will update this article with new filter patterns, so do check back regularly. Also, do leave us your comments below if you find that you are still getting referral spam in Google Analytics after implementing all of the above, so I can investigate and make any needed tweaks to these filters.

    About the author

    Michael is the founder and managing director of Improve Position with a strong background in both web development and technical SEO. His enthusiasm shines through with his passion to help others understand and succeed in the world of online business marketing.

    2 Comments

    1. Stacy Jackson on 27th May 2017 at 2:44 am

      Hi, Michael — thanks for visiting my blog and leaving a comment to check out this post. You have a very thorough post here on the topic of referral spam! Great tips. I’ll share on my social. Cheers!

      • Michael Hutton on 29th May 2017 at 9:46 am

        Hi Stacy. Thanks for taking the time to pop by, have a read and leave a comment…much appreciated. Really hope people are finding this useful. If you need any assistance in setting this up, either for yourself or clients, just let me know. I have some updates to make to the filters this month, so will add them in shortly. You can signup to my newsletter to get notifications of when these filters are improved further as well. Thanks again, Michael.
