Identifying Click Spam Deterministically
Among the techniques fraudsters use to commit ad fraud, Click Spam is the most common SIVT (Sophisticated Invalid Traffic) method used to spoof performance. Being the most widespread technique, it accounts for an estimated 40-50% of the marketing dollars lost to ad fraud.
So how do we tackle Click Spam deterministically?
Two main tests are carried out on any campaign to identify Click Spam and its impact.
i) Click-Install Time Series
ii) Outlier Publishers
i) Click-Install Time Series analysis: In this first essential step, the time gap between a click and the resulting install is analyzed to understand its pattern over a period. In genuine traffic, this gap cannot be arbitrarily large: a user clicks on an ad and then installs the app shortly afterwards. It is unusual for a user to view a campaign and install the app only after a considerable delay.
Bogus traffic sources, on the contrary, show an abnormal distribution of installs, which reads as users installing apps long after they clicked a campaign or advertisement.
Logically, this should rarely happen. One may argue that a user saw the campaign on the go and decided to install the app later in their spare time, or discovered an app while browsing for something else and installed it that evening. These scenarios are real and can produce an abnormal distribution in a time-series analysis, but they cannot occur in large volumes: they are isolated behaviors that cannot be generalized to the masses.
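The time-series test above can be sketched as a simple check on the click-to-install-time (CTIT) distribution of each source: if too large a share of installs arrives long after the click, the source is flagged. The threshold values and field names below are illustrative assumptions, not mFilterIt's actual parameters.

```python
from statistics import median

def ctit_flags(ctit_seconds, long_gap=3600, max_long_share=0.20):
    """Flag a traffic source whose click-to-install-time (CTIT)
    distribution looks abnormal.

    ctit_seconds   -- list of click-to-install gaps (seconds) for one source
    long_gap       -- gap beyond which an install counts as 'delayed'
    max_long_share -- tolerated share of delayed installs

    Both thresholds are illustrative; a real system would tune them
    per campaign from historical data.
    """
    if not ctit_seconds:
        return {"flagged": False, "reason": "no data"}
    long_share = sum(1 for t in ctit_seconds if t > long_gap) / len(ctit_seconds)
    return {
        "flagged": long_share > max_long_share,
        "median_ctit": median(ctit_seconds),
        "long_share": round(long_share, 2),
    }

# A genuine source: most installs follow the click within minutes,
# with the occasional legitimate straggler.
genuine = [30, 45, 60, 90, 120, 180, 240, 400, 900, 5000]
# A spammy source: installs attributed hours after the 'click'.
spammy = [30, 7200, 10800, 14400, 20000, 26000, 30000, 40000, 50000, 60000]

print(ctit_flags(genuine))  # low long_share -> not flagged
print(ctit_flags(spammy))   # high long_share -> flagged
```

The key point the sketch captures is the article's argument: isolated delayed installs (the one long gap in the genuine list) stay under the tolerance, while a source where delayed installs dominate gets flagged.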
ii) Outlier Publishers: Data can tell almost everything, but the Click-Install Time analysis alone cannot separate genuine installs from fake ones. Other factors must be considered before establishing which sources are Click Spamming. For this, it is essential to identify the outlier publishers.
A baseline analysis is done by studying the click rates of the different publishers running a campaign. Logically, the app targets similar users who show more or less the same behavior, so the publishers should also see similar behavior on their campaigns. The baseline analysis establishes the clicks/installs a genuine campaign is expected to deliver; historical data is also helpful in setting this baseline.
Once the baseline is established, the click rates achieved by the various publishers are plotted against it. Publishers will not fall exactly on the baseline, so a range of tolerance is defined using a proprietary algorithm that factors in several parameters. A publisher that falls within this range is considered to deliver valid traffic.
However, a publisher whose performance falls far outside this range is flagged as an outlier resorting to click spam to spoof performance. No publisher has a magic wand that yields substantially different results from its peers.
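The baseline-and-tolerance idea can be sketched as follows. Here the baseline is simply the median install rate across publishers and the tolerance is a fixed relative band; the real proprietary algorithm the article mentions factors in more parameters, so treat both choices as stand-in assumptions.

```python
from statistics import median

def outlier_publishers(install_rates, tolerance=0.5):
    """Flag publishers whose install rate deviates from the baseline
    by more than a tolerance fraction.

    install_rates -- mapping of publisher -> installs / clicks
    tolerance     -- allowed relative deviation from the baseline
                     (0.5 means +/-50%); an illustrative stand-in for
                     the proprietary tolerance range described above.
    """
    baseline = median(install_rates.values())
    lo, hi = baseline * (1 - tolerance), baseline * (1 + tolerance)
    return {
        pub: ("outlier" if not (lo <= rate <= hi) else "within range")
        for pub, rate in install_rates.items()
    }

# Hypothetical publishers on the same campaign.
rates = {
    "pub_a": 0.020,
    "pub_b": 0.025,
    "pub_c": 0.018,
    "pub_d": 0.120,  # far outside the band -> flagged as an outlier
}
print(outlier_publishers(rates))
```

Note that the band flags deviation in either direction: click-spamming publishers can look abnormally strong on one metric while their conversion quality collapses on another, so both tails are suspect.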
Conclusion: Together, these two tests unambiguously determine the click spam rate and its impact on a campaign, identifying the sources fetching invalid traffic, which is a direct dollar loss for the advertiser. By blending Click-Install Time analysis with Outlier Publisher identification, mFilterIt deterministically pinpoints the fake sources that resort to Click Spam to fake performance and get paid for non-performance, tricking advertisers.
Let's engage in a detailed conversation on the Click Spam ad-fraud technique and how it bleeds brands' marketing dollars. Connect with me by writing to firstname.lastname@example.org.