The Anomaly Detection and Prediction features are AT Internet’s two new timeseries functionalities, integrated directly into Explorer. Both are found under the “Anomaly detection & Prediction” option, behind the lightbulb button in the graph type selection.
Anomalies are detected when our statistical model identifies suspicious variations for a given metric within the context of the analysis.
Example: A large decrease in visits on a Saturday may seem suspicious within the context of a single week, yet prove to be a perfectly ordinary variation at the scale of a year (decreases in traffic can be observed at the end of each week).
Anomalies are illustrated on the graph with round symbols overlaid on the timeseries.
Positive anomalies are shown in green.
Negative anomalies are shown in red.
The prediction functionality will provide you with the expected future values for a given metric based on your historical data:
- In hourly granularity, we forecast the 24 hours following the last hour of the analysis
- In daily granularity, we forecast the 30 days following the last day of the analysis
- In weekly granularity, we forecast the 12 weeks following the last week of the analysis
- In monthly granularity, we forecast the 12 months following the last month of the analysis
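As an illustration only, the horizons listed above can be written as a small lookup that returns the start of each forecast period. This is a sketch, not Explorer's implementation: the names `FORECAST_HORIZON`, `add_months` and `forecast_period_starts` are hypothetical, and we assume each period is identified by its start timestamp.

```python
from datetime import datetime, timedelta

# Forecast horizon (number of periods) per granularity, as listed above.
FORECAST_HORIZON = {"hour": 24, "day": 30, "week": 12, "month": 12}


def add_months(period_start, months):
    """Return the first day of the month `months` months after period_start."""
    index = period_start.year * 12 + (period_start.month - 1) + months
    return period_start.replace(year=index // 12, month=index % 12 + 1, day=1)


def forecast_period_starts(last_period_start, granularity):
    """List the start of each forecast period after the last analysed period."""
    steps = FORECAST_HORIZON[granularity]
    if granularity == "month":
        # Months have variable length, so they are stepped by calendar month.
        return [add_months(last_period_start, i) for i in range(1, steps + 1)]
    step = {
        "hour": timedelta(hours=1),
        "day": timedelta(days=1),
        "week": timedelta(weeks=1),
    }[granularity]
    return [last_period_start + i * step for i in range(1, steps + 1)]


days = forecast_period_starts(datetime(2024, 1, 31), "day")
print(len(days), days[0].date(), days[-1].date())  # 30 2024-02-01 2024-03-01
```

An analysis ending on 31 January 2024 in daily granularity would therefore be followed by 30 forecast days, from 1 February to 1 March 2024.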
Anomaly Detection & Prediction functionalities are available on all graph granularities: hour, day, week and month. When analysing the current day, the actual curve is overlaid with the day's forecast, giving you hourly targets for the day.
A minimum of 2 weeks of historical data (with less than 30% of the historical data being zeros) is necessary to identify anomalies and to provide forecasts on hourly and daily granularities. A minimum of 2 years' worth of historical data (again with less than 30% of the historical data being zeros) is necessary for weekly and monthly anomaly detection and prediction. If your site is new, you will have to wait until enough historical data is collected for our model to satisfy goodness-of-fit assumptions. Until then, an explicit error message is shown. For optimal accuracy in our model, we recommend at least 8 weeks of data for hourly and daily granularities, and at least 3 years for weekly and monthly granularities.
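The thresholds above can be summarised in a short check. This is a hedged sketch rather than the actual validation performed by Explorer: the function name `check_history` and the three status labels are hypothetical, and we assume that 2 and 3 years can be approximated as 730 and 1095 days.

```python
# Minimum and recommended history length, in days, per granularity.
# Assumption: 2 years is approximated as 730 days, 3 years as 1095 days.
MIN_HISTORY_DAYS = {"hour": 14, "day": 14, "week": 730, "month": 730}
RECOMMENDED_HISTORY_DAYS = {"hour": 56, "day": 56, "week": 1095, "month": 1095}
MAX_ZERO_SHARE = 0.30  # the share of zero values must stay below 30%


def check_history(values, history_days, granularity):
    """Return 'insufficient', 'sufficient' or 'recommended' for a metric history."""
    if not values:
        return "insufficient"
    zero_share = sum(1 for v in values if v == 0) / len(values)
    if history_days < MIN_HISTORY_DAYS[granularity] or zero_share >= MAX_ZERO_SHARE:
        return "insufficient"
    if history_days < RECOMMENDED_HISTORY_DAYS[granularity]:
        return "sufficient"
    return "recommended"


print(check_history([120] * 14, 14, "day"))             # sufficient
print(check_history([0] * 5 + [120] * 9, 14, "day"))    # insufficient
```

In the second call, 5 of the 14 daily values are zeros (about 36%), so the history is rejected even though it spans two weeks.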
The Anomaly detection & Prediction features are available for all metrics.
When the anomaly option is selected, an orange “tolerance” area is shown on the graph. We expect the curve to remain within the boundaries of this area. Any data point of the curve that falls outside this area is considered an anomaly.
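The tolerance rule can be pictured with a minimal sketch. It assumes that a positive anomaly is a point above the upper boundary of the tolerance area and a negative anomaly a point below the lower boundary; the function `classify_point` and the boundary values are hypothetical and only serve the example.

```python
def classify_point(value, lower, upper):
    """Classify a data point against the tolerance area [lower, upper]."""
    if value > upper:
        return "positive"  # assumed to match the green symbol
    if value < lower:
        return "negative"  # assumed to match the red symbol
    return None  # inside the tolerance area: not an anomaly


# Each tuple is (observed value, lower boundary, upper boundary).
points = [(950, 800, 1200), (1350, 800, 1200), (610, 800, 1200)]
print([classify_point(v, lo, hi) for v, lo, hi in points])
# [None, 'positive', 'negative']
```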
The trend illustrates macro fluctuations in your data over time. The trend is illustrated using a dashed curve.
The baseline illustrates how the metric would have evolved without the intervention of external factors. The baseline is illustrated using a dotted curve.
The forecast is a prediction based on your historical data. The forecast values are shown on the graph using a green dotted line. The forecast has its own level of accuracy, which we illustrate using the light green area around the predicted curve.
Technical Information and FAQ
The Anomaly Detection & Prediction features are in-house, custom-built data science models founded on a hybridisation of timeseries decomposition models and clustering methods. The basic models have been tweaked, improved using millions of data points, and adapted to web analytics data.
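For readers unfamiliar with timeseries decomposition, the sketch below shows a textbook classical additive decomposition into trend, seasonal and residual components. It is not the AT Internet model, which is custom-built and also relies on clustering; the function name `decompose_additive`, the 7-day period and the sample data are assumptions chosen for illustration.

```python
def decompose_additive(series, period=7):
    """Split a series into trend, seasonal and residual parts (additive model)."""
    if period % 2 == 0 or len(series) < 2 * period:
        raise ValueError("need an odd period and at least two full cycles")
    half = period // 2
    n = len(series)

    # Trend: centred moving average over one full cycle (undefined at the edges).
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(series[i - half:i + half + 1]) / period

    # Seasonal: average detrended value for each position in the cycle.
    buckets = [[] for _ in range(period)]
    for i, t in enumerate(trend):
        if t is not None:
            buckets[i % period].append(series[i] - t)
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period  # centre the seasonal component on zero
    seasonal = [means[i % period] - offset for i in range(n)]

    # Residual: what neither the trend nor the seasonal pattern explains.
    residual = [
        None if trend[i] is None else series[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, residual


# Four identical weeks of daily values with a repeating weekly shape.
visits = [100, 101, 102, 103, 104, 105, 106] * 4
trend, seasonal, residual = decompose_additive(visits, period=7)
print(trend[3], seasonal[0], residual[10])  # 103.0 -3.0 0.0
```

Because the weekly shape repeats exactly, the seasonal component absorbs it and the residuals are zero. A recurring weekend dip would be handled the same way, which is why it need not be flagged as an anomaly.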
Does the model compensate for known events such as Christmas or Black Friday?
The Anomaly Detection & Prediction models do not adjust for known recurrent events such as Christmas or Black Friday. Fluctuations in your data on those dates are therefore treated like fluctuations on any other day.
The trend doesn’t seem to fit my data properly. Is this normal?
The trend reflects macro fluctuations in your data. If your analysis timeframe is too short to show long-term fluctuations in your dataset, the trend can appear distant from the rest of your data. Increasing the analysis timeframe will show how the trend follows these macro fluctuations.