How do I detect and exclude robots / abnormal traffic?

We take measures to prevent robot traffic from polluting your data. Our traffic exclusions are based on the IAB/ABC list, which is supplied by an organisation that records known robots, so that bot traffic is not included alongside your other data in the interface.

Robot traffic can still be monitored using an Analyzer NX analysis, which you will find under Traffic > Monitored traffic. There are three types: off-line bots, web crawlers and organisations.

Some organisations monitor your site's availability with bots: Microsoft System Center Operations Manager, Gomez Agent, Observer, Nagios, etc. These are excluded by default, but you can choose to include them by going to Configuration > Parameters > Monitoring/Excluding traffic > Robots.

Off-line bots are not excluded by default, but you can choose to exclude them from the same menu as above. Here are some of the more notable ones: Download Ninja, Heritrix, Webcopier, PageNest Pro, WebZip, etc.

When traffic is excluded, the exclusion is based on the IP addresses or User Agents of the robots on the IAB/ABC list mentioned above.
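As an illustration only, the kind of IP address / User Agent matching described above can be sketched in Python. This is not our actual exclusion code: the IP range, the User Agent tokens and the function name `is_excluded` are all assumptions made for the example, and the real IAB/ABC list is not reproduced here.

```python
import ipaddress

# Hypothetical exclusion data, for illustration only.
# 192.0.2.0/24 is a range reserved for documentation, not a real robot range.
EXCLUDED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
EXCLUDED_UA_TOKENS = ["heritrix", "webcopier", "webzip"]


def is_excluded(ip: str, user_agent: str) -> bool:
    """Return True if a hit matches an excluded IP range or User Agent token."""
    address = ipaddress.ip_address(ip)
    if any(address in network for network in EXCLUDED_NETWORKS):
        return True
    # Case-insensitive substring match on the User Agent string.
    ua = user_agent.lower()
    return any(token in ua for token in EXCLUDED_UA_TOKENS)
```

A hit is treated as robot traffic as soon as either criterion matches, which is why a robot that changes its IP address can still be caught by its User Agent, and vice versa.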

The IAB cannot, however, record every robot in existence. Some, especially recent ones that have not yet been listed, may bypass our exclusions and cause unusual spikes in your traffic. When this is the case, you can often recognise robots by their abnormal behaviour. The following six indicators can help you check for this behaviour amongst your visitors:

  • Time spent / pages: a short time per page may indicate a crawler
  • Page views / visits: a high number of page views per visit may also indicate a crawler
  • URLs (Referrer sites): a spike in visits from an unknown domain is suspicious
  • Countries (Geolocation): a spike in visits from a country that doesn't usually feature in your stats is suspicious
  • Towns (Geolocation): a spike in visits from a town that doesn't usually feature in your stats is suspicious
  • Models (OS): a spike in visits from the same device model is suspicious
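
If you export these indicators from the interface, the checks above can be roughly automated. The Python sketch below is one possible approach, not a feature of the product: the function names, the input format (plain dictionaries of daily visit counts) and every threshold are assumptions that you would need to adapt to your own traffic.

```python
from statistics import mean


def flag_spikes(history, latest, factor=3.0, min_visits=50):
    """Flag dimension values (country, town, referrer, device model) whose
    latest visit count is well above their usual level.

    history: dict mapping a value to a list of past daily visit counts
    latest:  dict mapping a value to the most recent daily visit count
    factor and min_visits are arbitrary thresholds chosen for this example.
    """
    flagged = []
    for value, visits in latest.items():
        past = history.get(value, [])
        # A value never seen before has a baseline of zero.
        baseline = mean(past) if past else 0.0
        if visits >= min_visits and visits > factor * baseline:
            flagged.append(value)
    return sorted(flagged)


def looks_like_crawler(page_views, seconds_spent,
                       max_seconds_per_page=2.0, min_page_views=30):
    """Apply the first two indicators to a single visit:
    many page views combined with very little time per page."""
    if page_views == 0:
        return False
    seconds_per_page = seconds_spent / page_views
    return page_views >= min_page_views and seconds_per_page <= max_seconds_per_page
```

For example, with a history of `{"FR": [1000, 1100, 900]}` and latest counts of `{"FR": 1050, "XX": 400}`, only `"XX"` is flagged, because it has no history and exceeds the minimum visit count. A flag is only a hint that the traffic deserves a closer look; it does not prove that a robot is responsible.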

If you suspect abnormal traffic, please contact the Support Centre and provide context for your suspicions. If we are able to conclusively determine that robots are to blame, we can exclude them on our end.

You can also exclude traffic based on IP addresses yourself via Configuration > Parameters > Monitoring/Excluding traffic > Monitored/Excluded IP addresses. In certain exceptional cases, we may be able to provide you with the IP addresses linked to traffic that you consider suspect but that we were unable to conclusively attribute to robots.

To obtain these IP addresses, the administrator of the contract will need to send us a written request via the Support Centre containing the following elements:

  • Client ID and Site ID
  • Domain name of the site where there is suspicious traffic
  • Explicit request for the IP addresses to be shared in order to exclude them from your traffic