My post was originally published on the Rubicon Project blog. It was written with contributions from Dr. Neal Richter and Jonathan Zhuang.
Pacing algorithms come in a few basic forms at the Rubicon Project. The most basic is one called “as fast as possible” which can hardly be shown to do anything that resembles pacing. The Pacing controller is supposed to spread out impressions served for a campaign throughout the day. In general it takes the goal of impressions to serve in a given day as input and calculates a serving schedule for the campaign.
A naive pacing algorithm will break the day up into 24 segments, let’s call them hours. It will allocate equal amounts of campaign impression for each hour and then recommend the campaign get impressions until the hour’s allocation is exhausted. This algorithm has a couple of pretty big flaws. Primarily it tends to serve the campaigns at the beginning of each hour and then once the allocated impressions are used up it stops serving until the next hour. The second flaw is that it has no understanding of the traffic distribution throughout the day. The 1AM hour doesn’t have the same amount of traffic coming in as the 10AM hour. If the inventory is relatively scarce the campaign will under-serve during the early hours, catch up during the peak hours, and then under-serve again in the later hours. Ultimately campaigns using this algorithm may not achieve their goals at the end of the day and it won’t serve evenly in a given hour.
Our team looked for an analog of the pacing algorithm that they wanted to introduce to the REVV platform. As it turned out, a toilet ball exhibited the most desirable behavior for campaign pacing. See, a toilet ball floats on top of the water, that’s where it wants to be. It’s attached to a lever that controls the amount of water that gets into the toilet tank. As the ball drops (perhaps after a flush) the valve opens up wide and lets in a lot of water. The tank fills and the ball floats up. As it gets close to the top it starts to close the valve and the water trickles in at a lower rate. If, for some reason, too much water gets into the tank there’s an overflow tube that lets some of that water out.
As it goes with most of the things we think we invented, someone had thought of this before. It’s called proportional control. We use this and other adaptive controller every day. The cruise-control feature in modern cars is an example. The central concept of an adaptive controller is to define a goal function, a feedback mechanism and a method to change outputs to match the goal function.
To apply this to ad serving we treat the tank as the impression volume goal for the campaign and the water input as the inout from the total available impressions. The current water level is equivalent to the percentage of the goal impressions for the campaign. The valve is implemented as a sampling percentage allowing available impressions to ‘flow’ into the campaign. Interestingly this scheme can be applied recursively by setting goals for each division of time during the campaing’s flight dates.
Given an ad with a total cap, the objective is to build a pacing controller to serve the impressions smoothly over the length of the campaign. The daily impressions distribution should be mostly even, with a bias to the beginning of the campaign. The hourly impression distribution should follow the traffic distribution over the site(s) that the campaign is running on. This can be represented mathematically over any period of time, see examples below:
Daily goals (cap)
Given these inputs:
- T – The total cap
- D – Number of days in the campaign
- I(n) – Number of impressions served on day n
The impression goal/cap for day d should be:
where O is the rate we want to over-burn to start with (e.g. 1:2)
Hourly goals (cap)
Given these inputs:
- G – The daily goal/cap that we want to reach for the day
- t(k) – Proportion of traffic for hour k
- i(k) – Number of impressions served on hour k
The hourly goal for hour h is:
The key output of the system is the sampling percentage g(h). This is the rate at which the ad should be active in the impression flow. Note the feedback loop baked into the equation above, i(k) is number of impressions served so far in the time period. If this is behind relative to the goal, the g(h) will rise in response.
Campaign Pacing
As results of our toilet ball pacing algorithm our campaigns pace very smoothly, taking a percentage of the volume of traffic and following the traffic curve throughout the day. The line in red is a publisher’s traffic distribution.
This blue line graph shows a campaign paced throughout the day.
By adjusting the over-burn we were able to hit the goals of the campaign to with an over-serve rate less than a 100th of a percent.
This algorithm can be applied to any range of time so long as the historical volume data is relatively accurate. Additionally the controller can adapt and recover from changes in available.
Other algorithms have attempted to address this problem. While some of yielded satisfactory results, most have at least one flaw.
- An algorithm for campaign pacing evenly throughout the day generally sets a fixed amount of impressions per hour, essentially dividing the day into 24 even parts with the exact same target for each hour. This algorithm has two problems: 1. It does not increase the number of impression allocated to the campaign when traffic volume increases. 2. It tends to scramble for any available impressions at the end of the day when traffic volume decreases.
- An algorithm for pacing as fast as possible for a given day will fill any candidate impression as quickly as possible until it fills the quota. There are a multitude of problems with this algorithm but the most apparent is that it tends to finish campaigns off before people are out of bed.
- An algorithm that uses an hourly allocation, even one that considers the traffic volume distribution throughout the day, will serve an amount of allocated impressions each hour until the requirement is satisfied. Without using a traffic percentage based algorithm this solution will fill most of the required impressions that the beginning of each hour, perhaps the first fifteen minutes, and then set the campaign essentially dormant for the rest of the hour.
It’s our hope that information on the algorithm we’ve described can be applied to improve pacing across the online advertising space. Full details will be described in an upcoming technical paper.
Update: Our hopes were realized. We saw better pacing of campaigns. Specifically, the jump in bidding activity just after midnight has been greatly reduced.