Tuesday, 23 October 2012

Determinism, Chaos Theory and Klout

Firstly, let me clarify: I'm not a mathematician or a physicist, I'm a marketer who has a side interest in science (closet geek). The following post derives from my belief (yes, belief, I can't actually quantify this) that a certain element of hubris has taken hold in the tech community, driven by the genuinely incredible capabilities of Big Data.
A couple of centuries or so ago, following the development of Newtonian physics, humans got it into their heads that it should be possible, given the right grasp of the variables, to predict the future using mathematics. The Enlightenment brought with it the entrenchment of scientific Determinism, a belief that took a couple more centuries to shake (thank you Chaos Theory).
Looking at it in hindsight, we know it was an act of hubris, but at the time it seemed so possible: science was learning ever more about the world around us, and the world seemed so predictable... if you could work out the orbits of the planets and predict eclipses, why not the weather, given the right computing horsepower? Unfortunately, it never quite worked out that way - there were just too many variables for some things, and the Uncertainty Principle, derived from the other great area of physics - Quantum Mechanics - confirmed that for us.
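The sensitivity that undid deterministic prediction can be seen in a few lines of code. The sketch below is my own illustration (not from the post itself), using the logistic map, a standard toy example of a chaotic system:

```python
# Two runs of the logistic map x -> r*x*(1-x), a textbook chaotic system,
# started from almost identical initial conditions.
r = 4.0                  # parameter value in the chaotic regime
x, y = 0.2, 0.2000001    # initial conditions differing by one part in ten million

max_gap = 0.0
for step in range(50):
    x = r * x * (1 - x)
    y = r * y * (1 - y)
    max_gap = max(max_gap, abs(x - y))

# The tiny initial difference grows until the two trajectories bear
# no resemblance to each other.
print(max_gap)
```

Two trajectories that begin indistinguishably close end up completely different within a few dozen steps - a small taste of why long-range weather prediction resists raw computing horsepower.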
Today, with the growth of Big Data, we seem to have developed a modern version of this hubris - the belief that if we have the right algorithm we can measure things like social influence. This seems to be driven primarily by marketing folks such as myself, who want to figure out ever more precisely how our marketing dollars can be used with maximum efficiency - if we can identify key influencers, they can do our job via word of mouth. And on the surface it all seems so... possible! We'd like to think that from all this data we can somehow develop a complete picture, to know everything about who is who and how they influence others, purely through the analysis of data. But it is my belief that this is just not possible; there are too many variables. Sure, we can get a general picture of things (as we can with the weather and other complex systems), but as we add more data, things become ever more variable. Witness the problems of Klout... the more detailed their algorithm becomes, the more errors it throws up (I tweeted once about the EU and became an influencer on "Europe" - seriously).
At its heart, the basic problem behind these efforts is the premise itself - you cannot measure social media influence solely through data collected from online interaction, as you miss the far more important side of things, which is real world influence. If I look through my (rather thin) collection of Twitter followers and those I follow, most of those people are ones I have met or heard about through offline contact... in fact, I mentally up-weight their influence based on our real world interactions. I'm fairly certain I'm not the only person who does this; it's instinctive. So many important yet difficult-to-measure 'human' aspects come into play - not least human instinct. How on earth can you measure that?
Naturally this will upset marketers who are seeking to identify social media influencers. As a marketer myself I can see why you do it, but I feel compelled to inform those from my industry who try that in many ways they're kidding themselves, relying on a number in place of a degree of instinct (a very important aspect of good marketing) as well as courage (in my book even more important - taking calculated risks is what leads to the best campaigns). It will also upset those whose Klout score or number of Twitter followers is a source of pride. I'm sorry to say it, but there are just some things that can't be defined by a single number.

When it comes to the interactions between humans, fortunately, the old aphorism holds true; not all that can be measured matters, and not all that matters can be measured.

Personally I think things are better that way.


  1. You make some very good points about physics, and how we don't have a holistic model of physics, a 'theory of everything', yet. I agree with you. But I would ask whether that matters at all.

    Does the fact that Newtonian theory cannot explain everything invalidate Newtonian theory? In my view it doesn't.

    Yes, we needed new physics to build quantum computers, but Newtonian theory alone got us quite far. We built bridges and trains, and even completed moon landings, using mainly the predictions of our 'incomplete' model of the physical world.

    I love the fact you mention weather forecasting; it's one of my favourite metaphors http://bit.ly/zcaPhZ . It's not perfect, we know this. Practically, does it matter? Does the fact that weather is not 100% predictable invalidate what the pioneers of weather forecasting did? We can't predict things with 100% certainty, but we can forecast hurricanes, predict approximately where they're going, and act on these predictions, which saves lives. Climate Corp. uses weather models to improve the efficiency of agriculture. And so on - there are just too many successful applications of our 'incomplete' model of the weather to list them all.

    So I agree with you: We won't have a fully satisfactory model of influence anytime soon, and probably never. But that's not the point. The point is that we are constantly improving our understanding, and in the meantime providing new insights into social behaviour at scale. Even using 'incomplete' theory, we can build services that users, partners and clients see value in and can benefit from.

    As for the real world influence thing: you wrote something very insightful, perhaps without noticing the power of the sentence: "If I look through my (rather thin) collection of Twitter followers and those I follow, most of those people are ones I have met or heard about through offline contact." When I read this sentence, I see it as a validation of what we're doing. We can't measure offline influence as such, but your offline networks and behaviour will - to some degree - be reflected in things we can measure: your followers and your interactions online.

    This goes back to the distinction between measurement and inference. Some things we can measure; some are hidden from us. But based on observables we can infer a whole lot of hidden things that are not directly observable. So my answer to 'How on earth can you measure that?' is: we can't, and we don't want to. But we may be able to infer it from measurable data.
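    The measurement-versus-inference distinction can be made concrete with a toy sketch. This is a purely hypothetical example of my own (not Klout's method, nor anyone else's): we never observe a user's 'true' propensity to be amplified, but we can infer a hidden rate from the interactions we do observe.

```python
# Hypothetical: the hidden quantity is a retweet probability p, never
# observed directly; the observables are 100 interactions, 30 of which
# led to a retweet.
observed_interactions = 100
observed_retweets = 30

# With a uniform Beta(1, 1) prior, the posterior over p is
# Beta(1 + successes, 1 + failures); its mean is our point estimate
# of the hidden rate.
alpha = 1 + observed_retweets
beta = 1 + (observed_interactions - observed_retweets)
posterior_mean = alpha / (alpha + beta)
print(round(posterior_mean, 3))
```

    The point of the sketch is only the shape of the reasoning: a hidden quantity, a handful of observables, and an inference connecting them, with the estimate getting sharper as more interactions are observed.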

  2. Ferenc, thank you for your very comprehensive response... certainly more of a pleasure to read than the "you're full of crap mate" responses that come along sometimes!

    I think you've raised the essential and fundamental difference between the scientist and the marketer in this situation. As you say, we don't need to fully understand all physics to be able to make very good use of it. At their best, scientists are highly practical types who recognise this and this sort of utilitarian approach is enormously helpful - inference is fine.

    My concern is that marketing types like myself (and the occasional scientist) can get carried away in the belief that the numbers tell all. It happened all too recently in the finance industry. All I hope to do, then, is warn against hubris.

    1. Ah yes, there is definitely a hole in the public's understanding of risk and uncertainty. I'm often thinking about how to communicate uncertainty on our websites, but it's not trivial. If we reported a confidence interval, would that help?
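      As a hypothetical sketch of what 'reporting a confidence interval' might look like in practice (the score and the numbers here are invented, and the normal approximation is only one of several interval choices):

```python
import math

# Invented example: a score estimated from 30 'successes' out of 100
# observed interactions, reported with a 95% normal-approximation
# confidence interval rather than as a single bare number.
successes, trials = 30, 100
p_hat = successes / trials
se = math.sqrt(p_hat * (1 - p_hat) / trials)   # standard error of a proportion
low, high = p_hat - 1.96 * se, p_hat + 1.96 * se

print(f"estimated score {p_hat:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

      Whether a range like [0.21, 0.39] communicates better than a bare 0.30 is exactly the open question here; the format is easy, the perception is not.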

      I quite like Prof David Spiegelhalter's work, who is Professor of Public Understanding of Risk at Cambridge University: http://www.statslab.cam.ac.uk/Dept/People/Spiegelhalter/davids.html He studies how to best communicate probabilities and uncertainty. He finds that even if you represent the same quantitative information in different formats, people will perceive risks differently.

      So yes, uncertainty is hard to communicate, but I hope by staying honest, people will eventually realise we're not claiming we can predict everything, and they will start seeing the real value and understand the extent of risk associated with our profiles.

    2. Thank you for the tip; I think I've seen him on the BBC once or twice, he's very informative. Re confidence intervals, it would be good, though you run into the problem that the common use of a phrase like "I'm 99 percent confident" is subtly but crucially different from the mathematical one.

      The tough job you'll have (and as a communicator I know this) will not be how you communicate what you do, but how others communicate on your behalf... devilishly hard to control, but vital to manage as much as possible.