This article about negative uplifts belongs to a special series of blog posts written by our own data wizards. It offers you a glimpse into the engine room of Mediasynced. In these informative blog posts, we shed light on the complexity of measuring TV performance in real time and on our robust statistical solutions.
One of the primary ways to evaluate the effect of your commercial is by comparing the performance before and after airing your commercial. A baseline is calculated based upon the performance before the commercial and is then subtracted from the performance after airing. The difference is then called the uplift. This process can be done for all kinds of metrics: we can calculate the uplift in the number of sessions on your website, the uplift in conversions and many more.
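As a minimal sketch of this calculation, assume the baseline is simply the average number of sessions per minute before the commercial. That is an assumption made for illustration only; the baseline method Mediasynced actually uses is not described here, and the function name and numbers below are hypothetical.

```python
def uplift(before, after):
    """Return the uplift: average performance after airing minus the baseline."""
    # Assumption: the baseline is a plain average of the pre-airing values.
    baseline = sum(before) / len(before)
    return sum(after) / len(after) - baseline

# Hypothetical sessions per minute around two airings.
sessions_before = [120, 118, 122, 120]

print(uplift(sessions_before, [150, 146, 139, 141]))  # 24.0  (positive uplift)
print(uplift(sessions_before, [110, 112, 108, 110]))  # -10.0 (negative uplift)
```

The same function works for any metric that can be measured before and after airing, such as conversions instead of sessions.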
While this method can give you a good estimation of how well your commercial performed, it is not perfect. One of the most obvious flaws is that it can result in negative uplifts: cases where the performance after airing a commercial is worse than before. It is quite unlikely that users will start boycotting your products after seeing your commercial, so what is going on here? Is it even possible for a commercial to have a negative effect?
What is actually happening is that the commercial did not have a noticeable effect, and the results you are seeing are due to noise. There is no reason to be alarmed: you did not lose any potential customers because of your commercial.
In the example shown above, the goal is to calculate the uplift in the number of sessions on your website after airing a commercial. The number of sessions is not static; it changes throughout the day in an unpredictable manner. In the graph we can see that right before the commercial aired, the number of sessions started dropping. The effect of the commercial was not enough to compensate for this drop, so your actual performance ended up lower than the calculated baseline.
So now that we know that these negative numbers are not representative of what is actually happening, what can we do to improve them? As discussed above, the problem is that your commercial did not have a significant effect. One method that some of our competitors use is to set all negative uplifts to 0. While this sounds like a good solution, it is statistically unsound, because noise works in both directions: there are also cases where the exact opposite happens, such as the example shown below.
The effect of your commercial is also insignificant in this example. The number of sessions started increasing right after the commercial aired. This is not an effect of the commercial; it is caused by random fluctuation in visitors. Nevertheless, this situation will result in a large uplift. If you set all negative uplifts to 0, then you must do the same for the positive case (which is much harder to identify), where the effect of the commercial was insignificant yet the uplift was still large. If this is not done, your results will be positively skewed: your overall results will be much more positive than what is actually the case.
Often this results in surprisingly large uplifts for small channels, as these channels are much more likely to have small effects, which may result in incorrect positive and negative results. If the negative results are set to 0, you are left with an average uplift for the channel that is much higher than what is actually the case. When you leave the negative numbers as they are, incorrect positive and negative results will cancel each other out over a campaign period. This results in a much more accurate estimate for a channel.
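The skew can be illustrated with a small simulation. The sketch below assumes a channel whose commercials have no true effect at all, so every measured uplift is pure noise, modelled here as Gaussian with mean 0. The function name, the noise model and the numbers are illustrative assumptions, not Mediasynced's actual method.

```python
import random

def simulate_channel(n_airings, noise_sd, seed):
    """Average uplift over many airings with no true effect,
    once with negatives kept and once with negatives set to 0."""
    rng = random.Random(seed)
    # Assumption: with no real effect, each uplift is Gaussian noise around 0.
    uplifts = [rng.gauss(0.0, noise_sd) for _ in range(n_airings)]
    raw_mean = sum(uplifts) / n_airings
    zeroed_mean = sum(max(0.0, u) for u in uplifts) / n_airings
    return raw_mean, zeroed_mean

raw, zeroed = simulate_channel(n_airings=10_000, noise_sd=10.0, seed=42)
print(f"average uplift, negatives kept:   {raw:.2f}")
print(f"average uplift, negatives zeroed: {zeroed:.2f}")
```

Under this noise model, the average with negatives kept stays close to 0, which is the true effect, while the average with negatives zeroed settles around 0.4 times the noise standard deviation. The noisier the channel, the larger the phantom uplift.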
So if we can’t ignore these numbers, then what can we do? The statistical reality is that we simply can’t know for certain whether a session or conversion is caused by your commercial or has a different cause. This problem is fundamental to all of statistics and is not limited to our industry.
What can be done is to limit the effect of this problem by employing methods that separate the real signal from the noise, together with predictive models that produce strong estimates of the real data.
Mediasynced is constantly improving its methods and models, using the statistical and Artificial Intelligence tools that we have developed, to provide accurate predictions. One such example is the method we use to calculate the baseline. You can read about this method here.
Hopefully, you’ve enjoyed reading our second blog post of this special series. Every month we will release a new article, so stay tuned!