Prevent bot traffic from ruining Google Analytics
Distil Bot Discovery for Google Analytics is a free offering that will give website owners the ability to understand the impact of bots on their business. The service is provided by Distil Networks, a company specializing in bot detection and mitigation services.
How do you differentiate a human website visitor from a bot?
Search engine crawlers are easy to identify, but picking out scrapers, infected PCs, mass scanners, fake traffic generators, and so on is an arduous process.
A year ago I had a pet project where I tried to manually build Splunk functionality for identifying bots by analyzing raw httpd logs. Things change by the minute, so this ended up being just a fun exercise that couldn’t produce any usable results.
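As a rough illustration of what this kind of manual log analysis involves, here is a minimal Python sketch. It is not the Splunk setup described above and not how Distil detects bots. It assumes logs in the Apache "combined" format and uses two naive heuristics: a suspicious User-Agent string and a high request count per IP address. The hint list, the threshold, and the sample log lines are all hypothetical.

```python
import re
from collections import Counter

# Apache "combined" log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Hypothetical substrings; a real list would be much longer and go stale quickly.
BOT_AGENT_HINTS = ("bot", "crawler", "spider", "curl", "python-requests")

# Hypothetical threshold: maximum requests per IP in the analyzed sample.
MAX_REQUESTS_PER_IP = 3


def flag_suspects(lines):
    """Return a dict mapping each suspicious IP to the set of reasons it was flagged."""
    hits = Counter()
    reasons = {}
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not in combined format
        ip = match["ip"]
        hits[ip] += 1
        agent = match["agent"].lower()
        # An empty or self-declared automated User-Agent is the easy case.
        if agent in ("", "-") or any(hint in agent for hint in BOT_AGENT_HINTS):
            reasons.setdefault(ip, set()).add("user-agent")
    # A browser-like User-Agent with an unusually high request count is also suspect.
    for ip, count in hits.items():
        if count > MAX_REQUESTS_PER_IP:
            reasons.setdefault(ip, set()).add("request-rate")
    return reasons


# Made-up sample lines using documentation-only IP ranges.
BROWSER = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
SAMPLE_LINES = [
    f'192.0.2.10 - - [10/Oct/2018:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "{BROWSER}"',
    '198.51.100.7 - - [10/Oct/2018:13:55:37 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "ExampleBot/1.0"',
] + [
    f'203.0.113.5 - - [10/Oct/2018:13:55:38 +0000] "GET /admin HTTP/1.1" 404 153 "-" "{BROWSER}"'
] * 4

for ip, why in sorted(flag_suspects(SAMPLE_LINES).items()):
    print(ip, sorted(why))
```

Heuristics like these catch only the bots that announce themselves or behave clumsily. Anything that spoofs a browser User-Agent and paces its requests slips through, which is why this approach goes out of date so quickly.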
To detect bots, Distil Bot Discovery for Google Analytics uses Are You a Human technology (acquired by the company in 2017), which checks all visitors against hundreds of different characteristics, with a special focus on their behavior.
Bad bots
So-called “bad bots” can be used against websites in different ways: scanning for vulnerabilities and other potentially exploitable weaknesses, price scraping, crawling and copying content, targeted DDoS attacks, and so on.
Aside from these security issues, there are also performance issues – it is not uncommon for bots to deplete server resources and consequently lessen the quality of the visitor experience. Another important issue is false data.
Whether you are running a news site backed by advertising or an e-commerce web application, visitor usage data is precious because it is often the basis for important business decisions. Failing to identify bad bot traffic is a mistake that can cost you money. Here are just some of the bad bot issues I encountered recently: fake Google AdSense traffic, botched conversion metrics, site usage data made irrelevant, and a ruined A/B test.
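To make the conversion metric problem concrete, here is a small arithmetic sketch in Python. All numbers are hypothetical, and it assumes that every conversion comes from a human session. Bot sessions then inflate the denominator and drag the reported conversion rate down.

```python
def conversion_rate(conversions, sessions):
    """Conversions divided by sessions, as a fraction; 0.0 when there are no sessions."""
    return conversions / sessions if sessions else 0.0


# Hypothetical figures for one reporting period.
total_sessions = 10_000   # everything the analytics tool recorded
bot_sessions = 3_000      # sessions that were actually bots
conversions = 210         # assumption: all conversions came from humans

reported = conversion_rate(conversions, total_sessions)
human_only = conversion_rate(conversions, total_sessions - bot_sessions)

print(f"Reported conversion rate:   {reported:.1%}")    # 2.1%
print(f"Human-only conversion rate: {human_only:.1%}")  # 3.0%
```

In this made-up case the site converts at 3.0% among real visitors but reports 2.1%. A gap of that size is enough to change the outcome of an A/B test or a campaign budget decision.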
Back in the late 90s I used Webalizer, an open source solution that was the de facto standard for web analytics. It had its pros, but it was tough to differentiate between what the system identified as hits and actual page views. Determining the real number of human visitors was even tougher.
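The gap between hits and page views is easy to show with a short Python sketch. This is an illustration of the general idea, not Webalizer's actual logic: every request counts as a hit, while only requests for page-like URLs count as page views. The extension list and the sample paths are assumptions made for the example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumption: URLs with no extension or one of these extensions are "pages".
PAGE_EXTENSIONS = {"", ".html", ".htm", ".php"}


def count_hits_and_pages(request_paths):
    """Return (hits, page_views) for an iterable of requested URL paths."""
    hits = 0
    pages = 0
    for raw in request_paths:
        hits += 1  # every request is a hit, including images, CSS and scripts
        path = urlsplit(raw).path  # drop any query string
        if PurePosixPath(path).suffix.lower() in PAGE_EXTENSIONS:
            pages += 1
    return hits, pages


# Hypothetical requests generated by a few page loads.
SAMPLE_PATHS = [
    "/",
    "/index.html",
    "/css/site.css",
    "/img/logo.png",
    "/js/app.js",
    "/about?ref=home",
]

hits, pages = count_hits_and_pages(SAMPLE_PATHS)
print(f"hits={hits}, page views={pages}")
```

Neither number says anything about who made the requests, so a log-based count of this kind cannot separate human visitors from bots on its own.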
Google Analytics drastically changed the way analytics is done. It shifted the standard from local log file analysis to basing the process on an asynchronous tracking snippet within the site’s source code. The scope of the data you can get from Google Analytics is immense, but the functionality provided by Distil Bot Discovery for Google Analytics is a must-have.
Humans vs. bots
After creating a free account, follow the provided instructions to integrate it with Google Analytics. A simple JavaScript snippet needs to be added to the website’s code, and you will be ready to start using the service within minutes.
A segment is a subset of your Analytics data that lets you isolate and analyze specific parts of the data. As a result of the setup process, you’ll get two new segments to use: Humans – Distil Bot Discovery and Bots – Distil Bot Discovery. These segments can be run across practically all the data in your Google Analytics dashboard, and provide you with a quick snapshot of the bot traffic situation.
You can do all the data analysis within the Google Analytics interface, but you can also use your account on the Distil website for simpler, better-looking graphs. If you are reporting to management, some of this data can come in handy for showing the impact of bots on your website audience.
Distil Bot Discovery for Google Analytics is a valuable Google Analytics add-on that provides instant visibility into the bot problem on your website. I like Distil Networks’ approach with this solution: provide Google Analytics administrators with a free, working tool, and some of those who detect serious problems will likely become new clients.