What Taylor Swift can tell us about Machine Learning
There was a fascinating programme about Machine Learning on the BBC recently, called “The Secret Science of Pop”. What made it fascinating was that it unintentionally demonstrated the pitfalls of Artificial Intelligence programmes, whilst also highlighting their potential. And it all comes down to what you are trying to achieve.
The programme was presented and led by Professor Armand Leroi, an amiable and talented scientist, whose background was in studying the lifecycle of worms and who is now head of the Social and Cultural Analytics Lab of the Data Science Institute at Imperial College, London. That research led him to collect vast amounts of data, and so to use data processing and algorithms in his studies; now, after 20 years of academic research, he is a leading light in AI, with a team of data scientists behind him. In other words, this was not a random presenter with an idea for a TV show; this was a credible expert in the Data Science space with a clear track record.
The challenge he had set himself was to use machine learning to identify the constituents of a chart-topping pop song, and then use this to create one. The premise was that there are about 50 years of weekly data for the UK pop charts, which means that the trends can be analysed to identify the traits of the perfect pop song. Alongside this, he was collaborating with Trevor Horn, one of the great record producers of the 70s and 80s. They already had their song, written and performed by an unsigned artist, and so the real question was which arrangement would turn it into a hit.
This is an admirably ambitious target, but the record producer clearly had some reservations. The Professor was convinced that, because he could analyse the data in such minute detail, his lack of a background in music was irrelevant; this made for an awkward moment for Trevor, whose 50 years in the music business had built up huge knowledge of what works and what doesn’t. And this is what a lot of data scientists also claim: “we are experts in the data, so we don’t need to know the business”. This is a bold claim, because you need to be able to identify the opportunities, and crucially, what is not an opportunity, within a particular business. As it turned out, Trevor had a point.
Even so, the Professor and his team started analysing the data from all the top 40 hits in the UK charts from the early 1960s onwards. This involved “translating” the music into machine-readable data, including its themes, tempo, lyrics, and so forth. This part was left relatively undefined, and was made to look complicated by having the Professor stand in front of a wall of fast-moving code to represent how much data processing was going on. In any event, this was a comprehensive method, involving some very clever people and some serious computer power. The part that the programme left unsaid was the cost. Clearly, this experiment was being made for television, so the business case was very different; but if this were a music “start-up”, it would involve major initial investment, since it clearly took months of work, and this was to produce just one song. From an ROI point of view, the team had all their eggs in one basket, with only one potential revenue stream at the end.
What happened next was in fact very interesting, and the most successful part of the project. With all the data points translated and modelled, the team identified some key metrics by which they could analyse historical trends in the UK charts. This produced an “innovation” tracker, which measured how much popular music was changing over time. It flagged the musical revolutions of the last 50-60 years, but in a different way to what you might expect. For instance, the late 70s showed up as a period of major change, but not because of punk, which was too small to be statistically significant in the overall analysis; it was because of the growth of disco and the change in tempo and musicality that brought. It also showed a further revolution in the mid to late 1990s, with the growth of electronic dance, trance and techno music.
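The programme did not explain how the “innovation” tracker was calculated, so what follows is only a minimal sketch of one way such a measure could work, not the team’s actual method. It assumes each year has already been reduced to a handful of average metrics (here a hypothetical tempo and “danceability” score, with invented numbers), standardises them, and scores each year by how far its averages moved from the year before. The function name `innovation_scores` and the data are my own illustrations.

```python
import math
import statistics

# Hypothetical yearly averages: (tempo in BPM, "danceability" from 0 to 1).
# These figures are invented for illustration, not taken from the programme.
yearly_features = {
    1975: (110.0, 0.50),
    1976: (111.0, 0.51),
    1977: (118.0, 0.62),
    1978: (124.0, 0.70),
    1979: (125.0, 0.71),
}


def innovation_scores(features_by_year):
    """Score each year by how far its average metrics moved from the previous year."""
    years = sorted(features_by_year)
    n_features = len(features_by_year[years[0]])

    # Standardise each metric so that tempo (in BPM) does not swamp 0-1 scores.
    means = [
        statistics.mean(features_by_year[y][i] for y in years)
        for i in range(n_features)
    ]
    stdevs = [
        statistics.pstdev(features_by_year[y][i] for y in years) or 1.0
        for i in range(n_features)
    ]
    scaled = {
        y: [(features_by_year[y][i] - means[i]) / stdevs[i] for i in range(n_features)]
        for y in years
    }

    # Euclidean distance between consecutive years' standardised averages.
    return {
        curr: math.dist(scaled[prev], scaled[curr])
        for prev, curr in zip(years, years[1:])
    }


scores = innovation_scores(yearly_features)
biggest_change = max(scores, key=scores.get)
```

Because the score is built from yearly averages, a genre that accounts for only a handful of chart entries barely moves it, which is consistent with the Professor’s observation about punk versus disco.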
This was interesting, and provided a long-term view of the data. It was genuinely different to many other analyses of pop music, which are often biased by the interests or prejudices of the writer or presenter. For example, many British aficionados would flag the significance of punk and the impact this had on music and the generation of teenagers who grew up with it. But, as the Professor pointed out, there were in fact very few songs that could be categorised as punk, so their actual effect on the wider music trends was low. This is exactly what Machine Learning is good at, and good for: removing human-led bias from the analysis, so as to identify insights that might otherwise be missed, or at least underplayed. However, what you do with that insight requires clear knowledge of the sector; as it stands, this was an interesting thought piece, but the applications were small, unless you are in a period of musical innovation (which, it turns out, we are not).
When it came to the key insight of what drove a top 10 hit right now, which was the cornerstone of the whole project, it demonstrated one of the pitfalls of Machine Learning. After all the weeks of number crunching, the computer concluded that the best formula for a hit was to be as close to the “average” of the key metrics as possible, pointing out that a Taylor Swift song from a couple of years ago was the paradigm of pop. So, after massive investment, the algorithm said, “Just aim for the middle, mate”. Which of course is no insight at all, because it doesn’t pass the “well, I could have thought of that” test. This led to a tricky meeting with Trevor, the music producer with no Deep Learning training, who made one of the most statistically savvy points of the programme: surely, the idea of the average only exists because you have been looking at all the data. In other words, if you had done the same thing in 1964, or even 1974, there wouldn’t have been enough data points to have a meaningful average, so you would have just got on with making music. By looking at a large data set, you have in fact created a kind of “data bias”: a tendency towards the norm, so that the analysis emphasises the average. We have recently seen something similar with electoral polls in the UK, where there have been accusations of “herding” towards the consensus of the different polls, because to stand out with an outlier result would be too “risky”.
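To make the “aim for the middle” result concrete, here is a small sketch, assuming (as the programme implied but did not spell out) that each song is described by a few numeric metrics. It standardises the metrics, works out the average song, and finds the real song closest to it. The song names, the two metrics and the function `closest_to_average` are all hypothetical, and this is not the Professor’s model.

```python
import math
import statistics

# Hypothetical songs described by (tempo in BPM, "danceability" from 0 to 1).
# Names and numbers are invented for illustration only.
songs = {
    "Song A": (90.0, 0.40),
    "Song B": (120.0, 0.60),
    "Song C": (150.0, 0.80),
    "Song D": (118.0, 0.62),
}


def closest_to_average(catalogue):
    """Return the name of the song nearest to the average of all songs."""
    names = list(catalogue)
    n_features = len(catalogue[names[0]])

    # Mean and spread of each metric across the whole catalogue.
    means = [
        statistics.mean(catalogue[name][i] for name in names)
        for i in range(n_features)
    ]
    stdevs = [
        statistics.pstdev(catalogue[name][i] for name in names) or 1.0
        for i in range(n_features)
    ]

    def distance_from_average(name):
        # Standardised Euclidean distance from the "average song".
        return math.sqrt(
            sum(
                ((catalogue[name][i] - means[i]) / stdevs[i]) ** 2
                for i in range(n_features)
            )
        )

    return min(names, key=distance_from_average)


paradigm = closest_to_average(songs)
```

Some song will always come out closest to the average, whatever the data, and which one it is depends entirely on which songs went into the catalogue. That is essentially Trevor’s objection: the “average” is a product of the data set you chose to look at.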
Ironically, whilst the Professor had to accept that “this isn’t the best advert for Machine Learning ever”, he managed to misinterpret Trevor’s point, taking it as a point about music rather than about an inherent bias within his own methodology; I suspect that, by this point, he was too invested, emotionally and organisationally, to pivot or to withdraw. The team, now including Trevor’s record producers, started to look for different ways to innovate with arrangements of their song, to try to jazz it up. They were always aiming for the average, and for this increasingly tantalising Taylor Swift paragon-song, but all their attempts actually took the song further away from the average, according to their own criteria.
At this point, Trevor made the telling point about industry knowledge. As the Professor increasingly desperately asked to put rap riffs and dance beats underneath the lovely female vocals of the original ballad, Trevor pointed out that the song is a ballad, and that doesn’t work with dance beats. In other words, people want their ballads as ballads, with powerful vocals left relatively unadulterated by other genres; trying to put dance tunes underneath “just makes it sound like a remix”. This strikes me as a relatively straightforward point about music that anyone with knowledge of the sector, and who wasn’t staking his data science reputation on the outcome, would have known. It debunked the Professor’s previous argument that you don’t need to know about music to produce great insights from the data alone.
At this point, the Professor effectively accepted defeat. The song was not released, because they couldn’t find a version that either made sense from a musical point of view or fitted the Data Science team’s model. He realised, at length, that producing a musical hit was what he scientifically called a “non-linear event”, which means it’s not predictable, or requires elements, like talent and creativity, that don’t allow themselves to be defined by an algorithm. This also strikes me as something that could have been identified before the massive expenditure began. It’s an industry that is the ultimate preserve of one-hit wonders (normal speak, I think, for “non-linear events”), and so there had to be more than a reasonable risk that the investment in a single revenue stream would be a flop, or not even see the light of day. And that strikes me as straightforward business analysis.
Even so, I doff my cap to the Professor. We should always think big, and, if something isn’t a bit scary or daunting, it’s probably not worth doing; sometimes, you just don’t know whether you will succeed until you try. And, in a sense, the project did succeed, but not in the way it intended. The trend analysis the team delivered in the programme highlighted what Machine Learning is good at, even if the Professor somewhat hid that particular light under a bushel by not explaining this point. And if they had focused more on that, the experiment could have been deemed a success.
But it wasn’t, because it tried to analyse and influence events that were not sufficiently predictable or actionable to make it work. And in the end, that wasn’t down to the quality of the analysis, which was clearly excellent. It was down to a lack of understanding of the business sector the organisation was operating in.
So, if anyone ever comes to you and says “I don’t need to know about your business problems, just give me the data and I’ll work them out”, don’t believe them. You have to start with an understanding of the business problems to make insight programmes successful; that’s why we at Station10 always start any insight programme by working with the client to identify the real business questions that are important to you and your business. If you would like to discuss more, please get in touch.