Martha Bennett, originally published on Computer Weekly
One of the perils inherent in Big Data projects
About 20 years ago, we were running our first data mining pilot at the large insurer I was working for at the time. I’m reminded of this project whenever I listen to the advice given in connection with big data projects. “Know what questions you actually want answered before starting out” is cited by many as a best practice you ignore at your peril. And that’s sound advice, even in the context of projects whose main purpose is ‘discovery’. But it should be accompanied by a health warning: there will be answers that give you a lot more than you bargained for.
Let me go back to that data mining pilot. We discovered a number of things about our data. For example, we needed to put in place better foundations for data mining – our data was just too ‘dirty’, the systems weren’t designed to capture much of the information the business people actually wanted, and so on. That was all worthwhile insight, and we knew what to do with it.
But the most valuable insight also turned out to be a showstopper. A key question the business had been asking was: is it possible to figure out which policyholders are likely to lapse their contract in the first year? We were able to identify the factors that made it probable (very probable, in fact) that a customer would stop paying their premiums or pension contributions within 12 months of taking out the product. But that was the easy part, as it turned out.
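The kind of analysis described here – working out which factors make first-year lapse probable – can be sketched in miniature. The records, field names, and values below are entirely invented for illustration (the real project would have used far richer data and proper statistical modelling); the sketch simply computes the first-year lapse rate for each value of a candidate factor:

```python
from collections import defaultdict

# Hypothetical policy records: the fields and values are invented
# for illustration, not taken from the project in the article.
policies = [
    {"channel": "door-to-door", "payment": "cash", "lapsed": True},
    {"channel": "door-to-door", "payment": "cash", "lapsed": True},
    {"channel": "door-to-door", "payment": "direct-debit", "lapsed": False},
    {"channel": "broker", "payment": "direct-debit", "lapsed": False},
    {"channel": "broker", "payment": "direct-debit", "lapsed": False},
    {"channel": "broker", "payment": "cash", "lapsed": True},
]

def lapse_rates(records, factor):
    """Return the first-year lapse rate for each value of one factor."""
    counts = defaultdict(lambda: [0, 0])  # value -> [lapsed, total]
    for r in records:
        counts[r[factor]][0] += r["lapsed"]
        counts[r[factor]][1] += 1
    return {value: lapsed / total
            for value, (lapsed, total) in counts.items()}

for factor in ("channel", "payment"):
    for value, rate in sorted(lapse_rates(policies, factor).items()):
        print(f"{factor}={value}: {rate:.0%} lapsed within 12 months")
```

On this toy data, every cash-paying customer lapses and no direct-debit customer does – exactly the sort of stark, actionable pattern that, as the rest of the article explains, is the easy part.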
To cut a long story short: we ended up doing nothing. Nobody could agree what was an acceptable level of business to reject in order to reduce churn rates; how to deal with salespeople who were likely to complain about loss of commission; who would have the final decision whether to accept or reject; how to deal with any potential reputational issues (and that’s in the days before social media); the list goes on and on. All told, it was much easier just to leave things as they were.
What we experienced is exactly the kind of situation many organisations embarking upon ‘big data’ projects face today. Knowing the answers to particular questions can have far-reaching implications for a company’s processes and decision-making structures, as well as its product portfolio, merchandising strategies, sales incentives, commission structures, and so on. But it’s likely that very few will have put in place any plans for how to deal with the consequences arising from the insights gained. Aside from the waste of resources and potential missed opportunities, depending on the question that was asked, this lack of preparedness can also have a destabilising effect on the entire organisation.
It’s not realistic to expect every scenario to be catered for, and there’s no point in making detailed plans for eventualities which may never arise. Nevertheless, every time somebody says “if only we knew…”, the question “what will we do if we get the answer?” should follow immediately. Or indeed “what can we do?”, as it may not actually be legal or ethical to exploit certain data.
The range of issues thrown up by the ensuing discussion is wide and varied, and it’s essential not to get bogged down in too much detail at this stage. What matters is agreement in principle on the degree to which the organisation is prepared to make changes or take risks. A framework that allows decision makers to assess the ‘what if’ scenarios arising from the answers to particular questions lays the foundation for focusing resources on the most important questions. It also means that executives, as well as the specialists tasked with data work, are in a position to take appropriate action when those answers do come in.