[vc_row][vc_column width=”1/1″][vc_column_text]What is the future of big data? It will be all about predictions! Predictions based on data have come into our world and we often do not even know it. In many cities in the US, for example, it is no longer a coincidence when you meet a police officer: they are getting dispatched based on the models created by George Mohler, a seismologist who has found a way to help predict where the next crime is about to happen (read more here). When you get a flyer in the mail, it might be because your neighborhood retailer has tried to predict what you will soon need. Sometimes they do this too well: Target once made it into the news (read more here) because they knew that an underage girl was pregnant long before her own father knew it.
But let’s not get carried away by the big data world. Predictions are nothing new. Magicians like the famous Alexander Seer promised at the beginning of the last century to “know, see and tell” it all. Although they are not a new idea, predictions based on data are the most difficult data products we have to work with. Technically, the difference between predictions and recommendation engines (read here about them) is small. Most recommendations could be rephrased as predictions. The difference lies in our own free will.
Data products are differentiated by the amount of support that is needed before ‘actionable insights’ can be drawn from the data. Benchmarking, the most basic data product, requires the interpretation of an “analyst” to make sense of the data. Recommendations only need the “user” to decide what to do next. And predictions? Predictions need no one. Read that sentence again! Predictions know the answer already: there is no further need to investigate or to make choices.
The idea is that the more data we have, the better our recommendation engines will be – so that they can become predictions. This view is best summarized as the “end of theory.” Chris Anderson (@chrisanders) argued that, in the future, we will have sufficient data to predict anything, and thus there will be no need for theoretical models anymore.
But often, it is not the amount of data that matters when creating a valid prediction. For example, the Incas predicted the best time of year to plant crops. Their dataset might have been as small as 3650 data points (= 10 years of daily observations) – which is virtually nothing in our big data world. 500 years later, we have companies like Google that measure a lot about our online behavior, but despite all of this data, predictions are not necessarily easy to make. For example, New York Times bestselling business author Carol Roth once complained in her blog that Google infers that she is a male over age 65, when in fact she is a woman, and decades younger.
Why is this? Because not all of the data Google has aggregated is really helpful to the specific prediction they are trying to make. The fact that not all data is useful was best seen at the onset of social media. Suddenly, there were massive amounts of data, and many of us thought that this could predict amazing things. For example, we saw many companies that claimed they could predict stock price movements through social media content. Most of them (if not all) have vanished by now, since it turns out that social media chatter is just too “noisy” and thus cannot really help to make predictions.
Allow me to make a prediction about data products as such: predictive algorithms will become a bigger part of our lives, and will probably change our society more than the Internet has. The Internet enabled us to do things faster and more conveniently. However, predictions based on our data trails aim even further because they enable us to forecast human behavior in a way we have never been able to before.
The biggest danger to the success of predictions is us – the “users” – who do not yet understand that a prediction is just a trained algorithm that could go wrong. Even if the right kind of data was used, for example the wisdom of the crowd, it might not be the right crowd for us. Think about the student who was “off track” for too long: the algorithm assumes a low likelihood of success, and the student is required to change majors (read more about this here). Such strict rules might mean the end of “out-of-the-box” thinking.
Our world is full of wrong predictions – even if they are based on data – and a wrong prediction might easily destroy our future. But when we learn as consumers to take predictions as what they are – as likelihoods that advise (but do not dictate) our lives – predictions based on data will benefit all of us.
(The article was originally published by FORBES)[/vc_column_text][/vc_column][/vc_row]