The Ghost in the Machine (Learning)


We recently had the electronic code lock on our office external door reprogrammed. Every now and again, the buildings manager changes the code, mostly, as you would expect, to maintain security but partly, I’m sure, to keep us on our toes by not telling us for a few hours. But this latest episode explained something that had always puzzled me about these door code systems.

I have used many different door code entry systems in many different buildings, at many different companies during my career. The keypad has always been based on a 3×3 numerical grid (numbers 1-9), with 0 on the next row, along with other commands. And I have always noticed that there has been a pattern to the codes that you have to enter. It’s never the same pattern, of course, but the codes never seem entirely randomly selected. Sometimes they will be based on a diagonal across the grid, sometimes a pattern down one side, or along the bottom, or creating a triangle shape on the grid. This doesn’t mean that they are hackable. In practice, you wouldn’t have enough time to test all the combinations (unless you had the resources of somewhere like the original Station10, Bletchley Park), and besides, you would still need to get past the security guard.

I always thought it was so that the codes would be memorable for the end users. Fairly rapidly, punching in the codes becomes a form of muscle memory, and I’m sure this is helped by the fact that the codes are in patterns that you don’t have to think about too hard.  However, my recent experience changed my view of this.

I happened to arrive at the office a bit earlier than usual, and it happened that the door code engineer was in the middle of the reprogramming process. At that precise moment, he had the keypad console hanging off the wall, with all the wires from each of the numbers connecting it to the wall. And he was using his screwdriver to change the settings for the code.

At this point, it became apparent that it was much easier for him to access certain areas of the console with his screwdriver than others. The console was sort-of hinged to the wall on one side, limiting access to something long and thin and metallic.  He also seemed to notice that people were starting to come into the building, so I think he might have felt some pressure to finish before the main flood of people arrived.

Suddenly, it made sense why the codes on the doors followed patterns. It was because it was easier for the engineer to create a pattern based on his access to the (literal) back end of the system. It was too awkward to reach the settings in the middle or at the far end of the keypad, he couldn’t use the same number twice, and he was probably feeling under pressure to get the job done. That means there are only a certain number of configurations available (straight lines, diagonals, corners and so forth) and the fact that these turn out to be mnemonics for the end user is just a happy by-product. All that then happens is the engineer needs to tell the buildings manager what code he has programmed in (as it obviously couldn’t be pre-programmed), which explains why we are always told about it a few hours later.
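To get a feel for how much those physical constraints shrink the set of possible codes, here is a rough Python sketch. It rests on assumptions that are not in the story above: a four-digit code (I never counted the digits), the 0 key ignored, and "pattern" simplified to mean that each key pressed is a neighbour of the previous one on the 3×3 grid. The function names are purely illustrative.

```python
from itertools import permutations

# Keypad layout as described in the post: a 3x3 grid holding 1-9.
# The 0 key on the next row is ignored here (assumption).
GRID = {d: divmod(d - 1, 3) for d in range(1, 10)}  # digit -> (row, col)

CODE_LENGTH = 4  # assumption: the post never states the code length


def is_adjacent(a, b):
    """True if keys a and b touch horizontally, vertically or diagonally."""
    (r1, c1), (r2, c2) = GRID[a], GRID[b]
    return max(abs(r1 - r2), abs(c1 - c2)) == 1


def all_codes(length=CODE_LENGTH):
    """Every code of the given length that never reuses a digit."""
    return list(permutations(GRID, length))


def patterned_codes(length=CODE_LENGTH):
    """Codes whose consecutive keys are neighbours on the grid.

    This is a stand-in for "straight lines, diagonals, corners",
    not a model of what the engineer could actually reach.
    """
    return [
        code for code in all_codes(length)
        if all(is_adjacent(a, b) for a, b in zip(code, code[1:]))
    ]


if __name__ == "__main__":
    total = len(all_codes())
    patterned = len(patterned_codes())
    print(f"non-repeating codes:  {total}")
    print(f"neighbour-path codes: {patterned}")
    print(f"share of code space:  {patterned / total:.1%}")
```

Under these assumptions there are 9 × 8 × 7 × 6 = 3,024 non-repeating codes in total, and the neighbour-path codes are a strict subset of them. The point is not the exact figure but that the person configuring the system quietly narrows what the system can produce.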

This made me think about the way machine learning for insight and attribution modelling is set up. Too many people don’t consider the engineer or modeller who created the attribution model or algorithm in the first place, or what criteria or logic they were following at the time.

I recently spoke with a client who said that they needed help with their attribution model.  They had set it up four years ago, using additional behavioural metrics which were thought at the time to be good proxies for identifying favourable downstream or offline outcomes.  However, digital measurement has moved on, and you can now ingest actual offline data, or indeed use more detailed digital metrics to provide a more accurate picture of long-term customer behaviour in the run-up to purchase, and so feed these into your online attribution model.

The original model for this client had been built in-house, but no-one who had built it originally was still working at the company.  This meant that the model was effectively not maintained, but worse, no-one could remember how to update or amend the model that had been built, or precisely why it had been done that way.  And yet, this attribution model was still at the heart of the evaluation of the organisation’s digital media spend.  There was therefore a corporate “ghost in the attribution machine” that was still working and allotting budget and measuring results, despite the fact that the model was out-of-date and no-one understood why it was doing what it was doing.

  • It’s vital to keep tabs on your model;
  • ensure you have several people in your team who understand it;
  • maintain both the algorithm and the documentation around it.
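
One lightweight way to act on those three points is to keep a small record alongside each model and check it periodically. The sketch below is only an illustration in Python: the `ModelRecord` fields, the `health_warnings` function and the thresholds (two owners, a review every 365 days) are all hypothetical choices, not a standard or anything our client used.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ModelRecord:
    """Hypothetical bookkeeping entry for one attribution model."""
    name: str
    owners: list[str]      # people who currently understand the model
    documentation: str     # where the write-up lives; empty if none exists
    last_reviewed: date    # when someone last checked its inputs and logic


def health_warnings(record, today, max_age_days=365, min_owners=2):
    """Return a list of plain-English warnings; empty means no concerns."""
    warnings = []
    if len(record.owners) < min_owners:
        warnings.append(f"{record.name}: fewer than {min_owners} people understand it")
    if not record.documentation.strip():
        warnings.append(f"{record.name}: no documentation recorded")
    if today - record.last_reviewed > timedelta(days=max_age_days):
        warnings.append(f"{record.name}: not reviewed in over {max_age_days} days")
    return warnings


if __name__ == "__main__":
    ghost = ModelRecord(
        name="media attribution",
        owners=[],
        documentation="",
        last_reviewed=date(2020, 1, 1),
    )
    for warning in health_warnings(ghost, today=date(2024, 1, 1)):
        print(warning)
```

A record like the one in the example, with no owners, no documentation and no review in four years, trips all three checks, which is roughly the position our client found themselves in.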


To be fair, it’s not just in-house or organic models that can cause problems. Automated attribution modelling systems have similar challenges. It can be useful to take the onus away from individuals within the business to create and manage an attribution model, so you can remove any corporate or departmental bias from this.

If, for example, the model is maintained by the Display and Paid Search team within a marketing department, there is a strong risk that the model will, unconsciously or not, factor in elements that might favour the results coming out of that team.  So, it can be very valuable to remove that bias (or, more commonly, the potential accusation of bias), by moving to an automated algorithmic model for attribution.

This means that the algorithm will build and maintain the rules itself, based on the trends and factors it identifies within the data itself; it is not dependent on one of your team having to spot patterns themselves.  Using our door code analogy, this is equivalent to digitising your door code, and getting a computer to produce the codes, which are more likely to be genuinely random, rather than patterned.
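
For the door code side of the analogy, a computer-generated code might look something like the following Python sketch. It uses the standard library's `secrets` module for randomness; the four-digit length and the no-repeated-digits rule are assumptions carried over from the keypad story rather than features of any real door system, and `random_door_code` is an illustrative name.

```python
import secrets

DIGITS = list("0123456789")


def random_door_code(length=4):
    """Pick `length` distinct digits at random, in random order.

    The length and the no-repeats rule are assumptions taken from
    the keypad anecdote, not from any real door controller.
    """
    rng = secrets.SystemRandom()  # draws from the operating system's entropy source
    return "".join(rng.sample(DIGITS, length))


if __name__ == "__main__":
    print(random_door_code())
```

Nothing here favours the keys that happen to be easy to reach with a screwdriver, which is the point. You would still want to know that this is how the codes are being produced.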

But that is not the same as outsourcing your attribution model. You can’t just abdicate responsibility for this vital part of your operations to a machine. And, as the analogy hopefully helps to show, just because it is automated does not mean that you shouldn’t understand what it is doing. You haven’t just bought the equivalent of a 24-hour manned security agency that will guard the premises but also instantly recognise your staff and open the door for them. It’s still your door security that you need to manage. It’s just that the automated algorithm will help take some of the burden off your team.

To avoid the scenario of our client above, it is in fact vital to understand how your attribution or insight model works, so that you are able to evaluate its impact. This will allow you to review it on an occasional basis to make sure it is being fed the right type of information, or to update it more holistically in the future. The worst thing to do would be to recruit, in effect, a pre-packaged “ghost in the machine” that continues to work but that no-one understands.

If you would like to update your insight modelling, or if you can’t remember what the model is actually doing any longer, give Station10 a call, and we can help exorcise any ghosts that might be lurking in dark corners of your marketing.