Is it Fraud? Illustrating some marginal to bad practices in ad-tech

A couple of weeks ago I was looking over geographic location data to inform the return on investment of an advertising campaign. A significant portion of the metropolitan data showed a large percentage of users sitting in the exact same location. It was as if they had all checked in on their apps while standing at the center of the city. That seemed unlikely. Was there something wrong with the data? Well, yes and no. More importantly: was it fraud?

Was it fraud? Welcome to the gray area of ad-tech. eMarketer shows that display advertising fraud loss is hovering around $6 billion worldwide. They likely derived this figure from very black-and-white definitions of what is and what isn’t fraud. In this post I’ll be lining up not only explicitly fraudulent behavior, but also some marginal tactics of media owners, supply-side companies, advertisers and demand-side companies to illuminate the fringes of acceptable behavior.

Pacing Algorithm for Advertising Campaigns and Inventory Allocations

I was trying to figure out what to do with my Sunday. My options were: build a little header bidding ad server plugin for WordPress; run, sleep and eat; or write up some blog post on a pacing algorithm, because people still seem to be producing crappy ones. Since you’re reading this, you can probably guess which choice I made. I mean, it’s not the first post I’ve written on the subject.

It showed up again last week. I didn’t expect it, but I guess I never do. A saw-tooth pattern on a chart, indicative of a capping of sorts. A chart that says, “I want a thing to happen, but only so much.” In this case it was a traffic allocation. This was a surprise.

A little (bad pacing algorithm) history

Most of the time when I run into a bad pacing algorithm it’s in the form of a campaign trying to limit itself. It only needs to acquire a few thousand impressions every five minutes, for example. So the hastily written algorithm might divvy up the impression allocation into five-minute buckets. Effectively that’s 12 buckets every hour. So it takes an hour’s worth of impression needs and divides it by twelve. One twelfth of the impressions are purchased every five minutes. Unfortunately at that point it switches to a simple counter that says, “for the next five minutes buy impressions until the number purchased reaches 1/12th of what I need in this hour.”
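To make the failure mode concrete, here is a minimal sketch of that kind of counter-based pacing. The function name, its parameters and the traffic numbers are all hypothetical, and supply is assumed to be constant from minute to minute, which real traffic never is. It only illustrates the behavior described above: buy everything available until the bucket's cap is hit, then go dark until the next bucket starts.

```python
def naive_bucket_pacing(hourly_goal, available_per_minute, bucket_minutes=5):
    """Simulate one hour of naive, counter-based pacing (illustrative only).

    hourly_goal          -- impressions the campaign wants this hour
    available_per_minute -- impressions on offer each minute (assumed constant)
    bucket_minutes       -- length of each pacing bucket in minutes
    Returns a list with the impressions purchased in each of the 60 minutes.
    """
    buckets_per_hour = 60 // bucket_minutes
    bucket_cap = hourly_goal // buckets_per_hour  # 1/12th of the hour's need

    purchased_per_minute = []
    bought_in_bucket = 0
    for minute in range(60):
        if minute % bucket_minutes == 0:
            bought_in_bucket = 0  # a new bucket resets the counter
        # Buy as fast as possible until the bucket's cap is reached.
        buy = min(available_per_minute, bucket_cap - bought_in_bucket)
        bought_in_bucket += buy
        purchased_per_minute.append(buy)
    return purchased_per_minute


# Hypothetical numbers: 12,000 impressions wanted, 500 available per minute.
print(naive_bucket_pacing(12000, 500)[:10])
# -> [500, 500, 0, 0, 0, 500, 500, 0, 0, 0]
```

With these made-up numbers the campaign hits its hourly goal, but it does so by spending two minutes buying everything and three minutes buying nothing, over and over. Plot that series and you get the repeating burst-and-stop shape.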

Sawtooth Pattern exposes a bad pacing algorithm

Audience Forecasting and Campaign Pacing

“In online advertising, how can I predict/forecast the traffic (number of requests) for a day?
For a given day, I would like to get the estimated number of eligible impressions a campaign will have, in order to allocate my budget and implement a traffic based pacing algorithm.”
This question was asked on Quora. Below is my answer.

The estimated number of eligible impressions, or audience forecasting or “avails” as they say in the industry, can be derived in several ways. I will illustrate two of the methods.

The long, but easy method

The easiest way to estimate your avails would be to just take a whole day’s worth of data and determine how many of your target users are in there. The problem with this method is that it can take a whole day. If you have a day to spare, this is a good way to go.

The short, but difficult method

For this to work you'll need the total traffic available for some previous day, or week. You'll want that data broken down by hour or maybe by 15-minute interval. With more traffic, your breakdown can be finer. For the sake of this example let's look at an hourly breakdown and a single day's worth of data.

Disrupting the Bid in the RTB Auction

RTB Bid Keys

Your eyeballs are on the block, but they don’t always go to the highest bidder.

“In RTB, will the bid with the highest CPM always win? If not, what are the other factors?”

This question was asked on Quora. Below is my answer.

In a pure auction, the highest bid should always win. In many cases an RTB auction ends with this result, but not always. There are two or three things that will adjust the auction mechanics to give a lower bidder the impression. Most of the time a modified auction is at the behest of the publisher.

Alternative to the CPM

Online advertising transactions are all CPM based. You might think my wild assertion is out of line. You might think you’re buying ads on a CPC or a CPA basis. But when a publisher is looking to sell ad inventory, they’re thinking about the CPM. “How many dollars can I get for every thousand ad views?” And when that CPA deal or that CPC deal comes in the publisher’s doing the math to convert that number into a CPM.

Blowing up the CPM

For a CPA deal they’re estimating how many acquisitions they can send to the buyer for every thousand ad views. For CPC, how many clicks per thousand ad views. They’re boiling it down to a CPM because that’s how they can compare the deals. It works like this all the way up and down the funnel.
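The publisher's math can be sketched in a few lines. This is only an illustration of the conversion described above: the function name and the deal numbers are hypothetical, and the events-per-thousand figures are estimates a publisher would have to come up with from their own history.

```python
def effective_cpm(price_per_event, events_per_thousand_views):
    """Convert a CPC or CPA deal into the CPM a publisher can compare.

    price_per_event           -- what the buyer pays per click or acquisition
    events_per_thousand_views -- the publisher's estimate of how many clicks
                                 or acquisitions a thousand ad views produce
    """
    return price_per_event * events_per_thousand_views


# Hypothetical deals on the same inventory:
flat_cpm_deal = 1.50                    # $1.50 per thousand views
cpc_deal = effective_cpm(0.50, 2)       # $0.50 per click, ~2 clicks per thousand
cpa_deal = effective_cpm(12.00, 0.25)   # $12 per acquisition, ~1 per 4,000 views

deals = {"CPM": flat_cpm_deal, "CPC": cpc_deal, "CPA": cpa_deal}
best = max(deals, key=deals.get)
print(deals, best)
```

Under these made-up estimates the CPC deal is worth $1.00 per thousand views and the CPA deal $3.00, so the CPA deal wins the comparison. Change the estimated rates and the ranking changes with them, which is exactly why the estimate matters so much to the publisher.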

The CPM has been around for a long time. With the advent of the RTB auction model, the CPM is very dynamic. Each impression up for auction is individually valued based on countless bits of information about the user, the page, the size, the date, the historical performance and a variety of other variables. Even though a separate auction is run for each impression, the bid prices are still in the form of a CPM. It’s in our blood. It is the end result of normalizing the value of an ad impression so that it can be compared to its peers.

Some Problems

I want to point out a couple of problems with the CPM. First and foremost, it's a single number. This aspect causes a couple of secondary problems that put the buyers at risk. One of the big ones is that there's no guarantee that the ad will actually show on the page. Steps have been taken to address this by several companies. The result of this problem is a topic near and dear to my heart: discrepancy.