Taking fashion tips from geeks - how to make your (data) model work it, baby!


Hello, I want to talk to you for a bit about modelling. Not the glamorous kind, where over-muscled meatheads/under-nourished waifs (delete as appropriate) prance up and down, swishing fancy clothes at you that would very definitely not look as good on you as they do on them. No, I mean the data kind of modelling, specifically, in this case, attribution modelling (and obviously, I don’t actually want to talk to you about it; I’d rather be in the pub).

See (and maybe the fashion metaphor will serve a purpose after all), attribution modelling is like the poor cousin of the fancier, swankier statistical modelling methods that the industry is all abuzz with. You’ve got your regression models, pattern recognition models and predictive models strutting their stuff under the bright lights of the data catwalks, flashing their tightly honed Gaussian distributions and making eyes at big data budgets, while attribution modelling is plugging away in the background. Doing a bit of catalogue work here and there. Maybe the odd advert for a high street store, that kind of thing.

But think about it – do you buy your work clothes from super-high-end designer brands all the time? Or do you, y’know, sometimes get your trousers from John Lewis/Primark/Topshop/Oxfam/Your sibling’s hand-me-downs/a bin (again, delete as appropriate)? I’m willing to bet you do, and I’ll bet that that paisley print shirt, beige corduroy trouser and neon bobble hat combo (what? I’m an analyst, if you’re taking styling advice from me, something’s gone wrong somewhere down the line) worked just fine in terms of value for money. Maybe even turned a few heads here and there. That’s attribution modelling for you. No, really. You just need to do it right.

Like its fashion counterpart, data modelling has a core of sensible practicality wrapped in layers and layers of unnecessary complication, smoke and mirrors. You need some clothes, and it might be that you could use some guidance on making them look nicer by putting the right ones together, but you don’t need to spend hours and thousands of pounds to get a look that works for you. Equally, you have some data, and mixing it together with a bit of maths might make it work a bit harder for you, but again, there’s no need to go spending vast sums of money getting to that point (unless you’re spending it on Station10 consulting, because we’re worth it).

Contrary to what you might have heard in swanky presentations about machine learning, data models don’t “learn” in any human sense. They can take in more data, and the maths you’ve provided them with can adapt to that data, but a model can’t improve its own mathematical structure without human input. With attribution models, and indeed any other models (even the algorithmic versions available now), you need to be very careful when setting your constant weightings, because you are (by necessity) biasing your inputs at the initial construction point.

Wait, I realise I haven’t actually said what attribution modelling is. Basically, you take your incoming traffic and split it by channel, be it paid marketing or organic, online or offline, whatever your business decides. Then you assign each channel a weight, fixed or variable, based on whatever factors you like (say, the proximity to a final conversion or to an important point in your customer journey). Finally, you split up the revenue you make according to those weightings and how many times your customers have hit each channel. Voilà, you have an unequivocal, unarguable, perfectly comparable model that says which of your channels is contributing the most to your bottom line, and which might as well have their entire budgets repurposed into buying all the staff some tastier biscuits. Or you would do if it weren’t for all these pesky subjective factors – which things are channels anyway? How should they overlap? Is it more important that a channel acquired the customer in the first place, or that it was the last one before conversion? Is it more important that a customer touches a channel lots of times? Is the purpose of every channel even the same?
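For the avoidance of hand-waving, here is a minimal sketch of that recipe in Python. Everything in it is an assumption made for illustration: the function names (attribute_revenue, linear, first_touch, last_touch), the channel names and the revenue figures are all invented, and a real model would have to answer the awkward questions above before it got anywhere near this tidy. Each customer journey is an ordered list of channel touches plus the revenue it produced, and a weighting rule decides how much credit each touch receives.

```python
from collections import defaultdict


def attribute_revenue(journeys, weight_fn):
    """Split each journey's revenue across the channels it touched.

    journeys:  list of (channels_in_order, revenue) pairs (made-up shape).
    weight_fn: maps (position, journey_length) to a non-negative weight.
    """
    totals = defaultdict(float)
    for channels, revenue in journeys:
        weights = [weight_fn(i, len(channels)) for i in range(len(channels))]
        total_weight = sum(weights)
        if total_weight == 0:
            continue  # empty journey or all-zero weights: nothing to credit
        # Each touch gets its share of the revenue; repeat touches on the
        # same channel simply add up.
        for channel, weight in zip(channels, weights):
            totals[channel] += revenue * weight / total_weight
    return dict(totals)


def linear(position, length):
    # Every touch counts equally.
    return 1.0


def first_touch(position, length):
    # All credit to the channel that acquired the customer.
    return 1.0 if position == 0 else 0.0


def last_touch(position, length):
    # All credit to the final channel before conversion.
    return 1.0 if position == length - 1 else 0.0


# Hypothetical data: two journeys, three channels.
journeys = [
    (["email", "ppc", "organic"], 90.0),
    (["ppc", "ppc", "email"], 60.0),
]

for name, rule in [("linear", linear), ("first", first_touch), ("last", last_touch)]:
    print(name, attribute_revenue(journeys, rule))
```

Same data, three weighting rules, three different "winning" channels: email under first touch, organic under last touch, PPC under linear. That is the subjectivity problem in about forty lines, and it is why the choice of weights needs agreeing with your stakeholders rather than burying in the code.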

Add to that the political nature of the output (I’m sure you don’t work in a company where every marketing department head is happy to be told they’re investing their budgets in the wrong place without any kind of a fight, but if you do, can I have a job? With free ice-cream and a company unicorn?) and you can see that while there’s a definite benefit in getting it right, getting it right is a big headache when you know you’re going to be under the microscope.  So here are some top tips for skipping through the minefield with minimum debilitating injuries:

  • Get buy-in when defining your channels, and keep your stakeholders informed.
  • If you have channels measured by other means (e.g. PPC budgets are going through AdWords or somesuch), make sure you know how they work and how they compare to both your model and your tracking methods. You will get asked.
  • Know what you want it for – are you going to feed media mix models? Are you just looking to boost conversion and hang the secondary metrics? Make it clear, then make a model that does that and only that, and communicate your goals. Loudly.
  • Obfuscating the model with complex algorithmic variables isn’t going to make it better – if you can’t explain how it works to someone without a statistics degree, don’t expect anyone to believe your outputs.

I’ve seen clients quietly build simple, explainable attribution models that have enabled marketing budgets to be tweaked to the tune of millions in uplifted revenue, and I’ve seen clients waste huge sums constructing monolithic hyper-complex models with no buy-in that died a death weeks later. Of course, if you want to make absolutely sure you avoid the latter, you could always get some experts in to help you out… I promise we know a lot more about data than we do about fashion.