Apologies for the rather slower pace of posts of late, but life has been a little busy here, what with moving house in Seattle, taking a trip to London to attend E-metrics, and succumbing to a nasty cold this week. Hopefully this post, the latest in my ‘mini-series’ on online marketing measurement techniques, will make up for things.
In my first and second posts on this topic, I discussed the techniques that ad servers (and other kinds of marketing delivery systems) and web analytics apps use to decide which marketing has delivered a click to an advertiser’s website. If you know what marketing has delivered a specific visit, you can associate any ‘conversion behavior’ (e.g. the customer buying something) with that marketing, and use this information to generate ROI information about that marketing. Here’s a simple example:
- Keyword “bananas” delivers 1,000 visits to www.bananas-r-us.com
- CPC (cost per click) for this keyword is $1; so campaign costs $1,000 to run
- Of 1,000 visits, 30 include a purchase
- Average purchase value is $50 (that’s a lot of bananas)
- Total revenue generated – 30 x 50 = $1,500; so ROAS (return on ad spend) is 50% ((1,500 – 1,000)/1,000)
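For anyone who prefers to see the arithmetic spelled out, here is a minimal Python sketch of the calculation above. The function name `roas` and its parameter names are my own invention for illustration, and it uses the same definition as the example, (revenue – cost) / cost, rather than any particular tool's formula.

```python
def roas(visits: int, cpc: float, conversions: int, avg_order_value: float) -> float:
    """Return on ad spend as used in the example: (revenue - cost) / cost."""
    cost = visits * cpc                      # 1,000 clicks at $1 each
    revenue = conversions * avg_order_value  # 30 purchases at $50 each
    return (revenue - cost) / cost


banana_roas = roas(visits=1000, cpc=1.0, conversions=30, avg_order_value=50.0)
print(f"ROAS: {banana_roas:.0%}")  # ROAS: 50%
```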
So far, so much egg-sucking tutorial. The wrinkle is that not all visitors who click through the ad will go on to buy something there and then (that is, within the visit that started with the ad click). A proportion will visit the site, undertake some serious banana research, and then go home in the evening to consult with their spouses about whether their family really needs a 60lb bargain bucket of over-ripe bananas, and only then place the order. In the industry, this is known as a deferred conversion.
In certain retail sectors, such as technology, travel, and insurance, deferred conversions are the norm. So you might be spending lots of money on search engine marketing, but your web analytics tool is telling you that no one is buying anything on the back of your keywords, whilst what is in fact happening is people are coming back later of their own accord and buying stuff.
Individual marketing delivery systems (like Google Adwords) solve this problem by giving the user a cookie when they first click through an ad, and then tying subsequent conversions (typically within 30 days) back to this click if that same user (identified by the cookie they still have) comes back and buys something. However, as I mentioned in my first post on this topic, this doesn’t work well when you are using multiple marketing channels (e.g. search + e-mail), as they will ‘compete’ to claim credit for the same conversions, and over-report the ROI of the marketing you’re doing.
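To make the over-reporting problem concrete, here is a rough sketch of two channels that each apply their own 30-day cookie window to the same purchase. The channel names, dates, purchase value and the `claims` helper are all made up for illustration; real ad systems have their own rules.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=30)  # assumed look-back window, per the "typically 30 days" above


def claims(click_date: date, conversion_date: date) -> bool:
    """A channel claims the conversion if its own click falls inside its window."""
    return timedelta(0) <= conversion_date - click_date <= WINDOW


# Hypothetical data: one user, one purchase, two channels with a click each.
conversion_date, conversion_value = date(2007, 4, 13), 1000.0
channel_clicks = {"paid_search": date(2007, 4, 1), "email": date(2007, 4, 10)}

# Each channel reports independently, unaware of the other.
reported = {
    channel: conversion_value if claims(clicked, conversion_date) else 0.0
    for channel, clicked in channel_clicks.items()
}
total_reported = sum(reported.values())

print(reported)
print(f"Reported across channels: ${total_reported:,.0f} vs actual: ${conversion_value:,.0f}")
```

Both channels see their own cookie and both claim the full purchase, so the channel reports add up to twice what was actually sold.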
Under the influence
The only way round this is to get your web analytics system (which will not double-count conversions, because each conversion only occurs once in a web analytics system’s database) to make some intelligent decisions about what marketing actually drove (or at least ‘influenced’) the conversion. Consider the following example sequence of visits to a site from the same user (who, for the sake of this example, we can assume has a persistent cookie for the duration of the set of visits):
Date | Marketing Source | Purchase value |
April 1 2007 | Paid Search | $0 |
April 10 2007 | E-mail | $0 |
April 13 2007 | none [direct] | $1,000 |
In this scenario, how do you decide what (if any) marketing contributed to the $1,000 purchase? There are a number of methods you could use:
1. ‘In visit’ allocation
This method allocates the conversion to the visit that contained it, and no other. This method would allocate the $1,000 to a “no marketing” or “direct URL” bucket; i.e. the paid search and e-mail campaigns would get no credit.
2. Last marketing source
Here, the last marketing that drove this user to the website gets exclusive credit for the conversion. So in this example, the e-mail campaign would be credited with the $1,000. Paid search gets nothing, despite the fact that it may have been this marketing that alerted the user to the site in the first place.
3. First marketing source
The first marketing that drove the user to the site gets the credit – in this example, the paid search campaign gets the $1,000 in its ROAS calculations. E-mail gets nothing, despite the fact that it drove the customer’s most recent interaction with the site. Another problem with this approach is that when users churn their cookies, their “true” first marketing source is lost, and a new (semi-random) one is allocated based upon the first marketing they respond to since cookie-churning.
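The three single-source rules described so far (in-visit, last source, first source) can be sketched in a few lines of Python. This is only an illustration using the example visits from the table: the `Visit` class, the `attribute` function and the rule names are hypothetical, and it assumes the cookie has survived all three visits.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Visit:
    day: date
    source: Optional[str]  # None means a direct visit with no marketing
    purchase: float


visits = [
    Visit(date(2007, 4, 1), "paid_search", 0.0),
    Visit(date(2007, 4, 10), "email", 0.0),
    Visit(date(2007, 4, 13), None, 1000.0),
]


def attribute(visits: List[Visit], rule: str) -> Dict[str, float]:
    """Give the whole purchase to one bucket according to the chosen rule."""
    ordered = sorted(visits, key=lambda v: v.day)
    conv_index = next(i for i, v in enumerate(ordered) if v.purchase > 0)
    conversion = ordered[conv_index]
    # Marketing sources seen up to and including the converting visit.
    sources = [v.source for v in ordered[: conv_index + 1] if v.source]

    if rule == "in_visit":
        winner = conversion.source or "direct"
    elif rule == "last":
        winner = sources[-1] if sources else "direct"
    elif rule == "first":
        winner = sources[0] if sources else "direct"
    else:
        raise ValueError(f"unknown rule: {rule}")
    return {winner: conversion.purchase}


for rule in ("in_visit", "last", "first"):
    print(rule, attribute(visits, rule))
```

Same data, three different answers: the direct bucket, e-mail, or paid search takes the full $1,000 depending purely on which rule you pick.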
4. Simple shared allocation
All historical marketing gets a share of the credit for the conversion. So paid search gets $500 and e-mail gets $500. This is probably closer to the truth (both had some hand in creating this eventual conversion), but is a pretty crude model, since different kinds of marketing have radically different ‘engagement profiles’ associated with them. For example you could argue that paid search clicks have a higher engagement profile than banner ad clicks, since when someone clicks a paid search ad they’ve already entered a relevant search term, so are clearly in the market to some extent.
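A minimal sketch of the equal-split idea, assuming you already have the list of marketing sources that preceded the purchase (the function name `equal_share` is my own):

```python
from typing import Dict, List


def equal_share(sources: List[str], value: float) -> Dict[str, float]:
    """Split the conversion value evenly across every marketing click."""
    if not sources:
        return {"direct": value}
    share = value / len(sources)
    credit: Dict[str, float] = {}
    for source in sources:
        # A channel that was clicked twice collects two shares.
        credit[source] = credit.get(source, 0.0) + share
    return credit


print(equal_share(["paid_search", "email"], 1000.0))
# {'paid_search': 500.0, 'email': 500.0}
```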
5. Age-based shared allocation
Another over-simplification in the last method is that the age of the click (i.e. how long ago it happened) is not taken into account. The e-mail click happened only 3 days ago, whereas the paid search click was 13 days ago. Taking this into account, you could allocate, say, $250 of the $1,000 to paid search, and $750 to e-mail. The maths for doing this systematically are non-trivial – you have to model in influence ‘curves’ that tail off to zero at some point (say, 30 days back) and then allocate the conversion value on a pro-rata basis based upon each click’s position along the curve.
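As a rough illustration, here is one possible influence curve: a straight line that falls from 1 at the moment of the click to 0 at 30 days. The linear shape is purely an assumption on my part (so the split it produces differs from the illustrative $250/$750 above), and the click ages are the ones quoted in the text.

```python
from typing import Callable, Dict, List, Tuple


def linear_influence(age_days: float, horizon_days: float = 30.0) -> float:
    """Assumed curve: full influence at day 0, tailing off to zero at the horizon."""
    return max(0.0, 1.0 - age_days / horizon_days)


def age_weighted(
    clicks: List[Tuple[str, float]],
    value: float,
    curve: Callable[[float], float] = linear_influence,
) -> Dict[str, float]:
    """Allocate the value pro rata by each click's position on the curve."""
    weights = [(source, curve(age)) for source, age in clicks]
    total = sum(weight for _, weight in weights)
    if total == 0:
        return {"direct": value}  # every click is too old to count
    credit: Dict[str, float] = {}
    for source, weight in weights:
        credit[source] = credit.get(source, 0.0) + value * weight / total
    return credit


# (source, age of click in days) as described in the text
clicks = [("paid_search", 13), ("email", 3)]
for source, amount in age_weighted(clicks, 1000.0).items():
    print(f"{source}: ${amount:,.2f}")
```

With this particular curve e-mail ends up with roughly 61% of the purchase and paid search with roughly 39%; a steeper curve would push more of the credit to the recent click.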
6. Age and channel-based shared allocation
This method combines the idea of age-based and marketing channel-based allocation together to create what is probably the truest (but certainly the most complex) picture of conversion influence. In this method you create influence curves for each marketing channel that you’re using, which reflect the different rate at which the ‘influence’ of each channel wanes over time (the influence of an ad might wane very quickly, for example, whilst an e-mail’s influence might linger longer).
For each historical click that was driven by marketing, you then locate the position on the influence curve for that kind of marketing and read off the number; and you then add up the values from each curve and pro-rata allocate the conversion value on this basis.
In our example, you might find that the paid search curve yields a value of 4 at the “-13 days” position, whilst the e-mail curve yields a value of 6 at the “-3 days” position, reflecting the fact (or, more accurately, someone’s opinion) that paid search click influence lasts longer than e-mail click influence. Allocating the $1,000 on this basis would mean that paid search gets $400, whilst e-mail gets $600.
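Here is a sketch of that read-off-and-pro-rate step. The two curves are entirely invented: I have picked straight lines whose peak and horizon happen to give the readings used in the example (4 for paid search at 13 days, 6 for e-mail at 3 days). The names `CURVES`, `influence` and `allocate` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical per-channel curves: (peak influence, days until influence hits zero).
CURVES: Dict[str, Tuple[float, float]] = {
    "paid_search": (8.0, 26.0),  # lingers longer
    "email": (8.0, 12.0),        # fades faster
}


def influence(channel: str, age_days: float) -> float:
    """Read the channel's curve at the click's age (linear decay, floored at zero)."""
    peak, horizon = CURVES[channel]
    return max(0.0, peak * (1.0 - age_days / horizon))


def allocate(clicks: List[Tuple[str, float]], value: float) -> Dict[str, float]:
    """Pro-rata allocation of the conversion value by curve reading."""
    readings = [(channel, influence(channel, age)) for channel, age in clicks]
    total = sum(reading for _, reading in readings)
    if total == 0:
        return {"direct": value}
    credit: Dict[str, float] = {}
    for channel, reading in readings:
        credit[channel] = credit.get(channel, 0.0) + value * reading / total
    return credit


print(allocate([("paid_search", 13), ("email", 3)], 1000.0))
# {'paid_search': 400.0, 'email': 600.0}
```

Change either curve's peak or horizon and the split moves with it, which is exactly why the choice of curves matters so much.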
Confused?
You may be asking, “but where do these influence curves come from?” The answer is, your head. Or mine. Or the head of (the head of) your marketing agency. But they don’t exist at the moment, and, with all the other uncertainties and lack of standards in the online marketing world, I can’t see a commonly agreed-upon set of marketing channel influence curves coming out any time soon.
The tricksy thing about this field is that the more you look at the way marketing is allocated to conversions at the moment, the more you realize how broken and simplistic those methods are. I’ve never met a client or agency who’s implemented anything like nos. 5 or 6 above, though I have seen no. 4 used.
But in a world where everyone increasingly understands that you need to use multiple touch points to reach consumers, intelligently allocating conversions to multi-channel marketing efforts will become increasingly important, even to a little guy running some paid search campaigns with a bit of e-mail thrown in. So if you can come up with some plausible influence curves, turn them into a fancily-named methodology and set yourself up as a consultant, you can make a bunch of money.
You write about “simplistic […] methods” and you are probably right. I am always surprised by the arguments behind PPC or conversion rates as they presume deterministic relationships at the micro level. I have a PhD in marketing and one of the things we learn early on is that it is extremely difficult to model the impact of decision variables.
Nano-level data could turn out to give the illusion of precision.
I would personally put more emphasis on aggregate data and systematic experiments (i.e. you vary the intensity of paid search, the intensity of banners, the intensity of email campaigns and track aggregate results).
This is not to say that nano-level data is worthless. Simply that these data must be used with care.
Quite an interesting series. The Web Analytics apps tend to focus more on the “visit” bucket, whereas we know that everything is in fact happening in the “visitor” bucket. And that darn visitor is an undisciplined chap: why don’t they just do everything on the first visit? And those cookies; can’t they stop erasing them, or changing computers??
So, what you are addressing in this series is very important, because we will definitely need to soon better grasp, no, *measure* the multi-visit “fact of life”. I have in mind here such efforts as Eric Peterson’s Engagement Index and its critique by Gary Angel.
Whatever model we can all come up with, I strongly believe that it will bring web analytics to a higher, more solid ground. And this is necessary if we want the coming integration with customer data systems to go smoothly.
fantastic post, Ian. I’m enjoying the series & look forward to the next part.
daniel
Stephane –
Yes, you raise an interesting point. One way out of the mess of tracking individual responses is to look at the data in aggregate and express conversion rates as percentages rather than absolute numbers. Some parts of Google Analytics work this way, though I believe they could do a better job of making their calculation methodologies clear, as many of the numbers can be maddeningly vague.
Jacques –
Yes, it’s all about the visitor, though I can see both sides of the argument in Eric’s disagreement with Matt Belkin at Omniture about the use of visits vs visitors. A key investment that any site can and should make is in finding ways to persistently identify users on the longest term basis, and ideally across multiple machines. I think I feel a blog post coming on on this topic…