
The Reason Why We Love Big Data: Recommendations

[vc_row][vc_column width=”1/1″][vc_column_text]Recommendation Engines have gained the most attention in the big data world. Why is that? Big data has created three distinct types of data-driven products:

  • Data used to benchmark
  • Data used for recommendation and filter systems
  • Data used for predictions

Benchmarking is often the first quick win when embarking on big data. We have, however, been benchmarking for centuries, and benchmarking is not the reason for the big data hype. A benchmark usually needs an educated decision maker on the other side of the screen to interpret it: Why is A performing better than B? Why did the curve drop? Thus, benchmarking products often do not scale – the more dashboards we build from big data, the more educated decision makers (also called “the analysts”) are needed.

The fame of data products is driven by something else: recommendation engines. Recommendations narrow down what could have been a complex decision into just a few options. Big data allowed us to make recommendations on a scale we had not seen before. The most well-known example is how the Google search algorithm trumped AltaVista by recommending the best websites to view. Another well-known example is Amazon’s recommendation engine, based on the past reading behaviors of other readers. Both of those systems are based on algorithms that “learn” from past data.

A recommendation system outdoes benchmarking because it does not need an analyst at the end. It reduces big data to small data (see my opinion on why small data is important). A recommendation system suggests a few data points out of a large pool of data. Take LinkedIn as an example: The data product “people you may know” recommends only a few members out of a database of 300,000,000 members.
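To make the idea of reducing a large pool to a few suggestions concrete, here is a minimal sketch of one common approach: counting which items appear together in users' histories and suggesting the items that most often accompany what a user has already seen. The function names and the tiny dataset are invented for illustration; this is not how Amazon or LinkedIn actually build their systems.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(histories):
    """Count how often each pair of items appears in the same user's history."""
    co = {}
    for items in histories.values():
        for a, b in combinations(sorted(set(items)), 2):
            co.setdefault(a, Counter())[b] += 1
            co.setdefault(b, Counter())[a] += 1
    return co


def recommend(user, histories, co, k=3):
    """Return up to k unseen items, ranked by co-occurrence with the user's items."""
    seen = set(histories.get(user, []))
    scores = Counter()
    for item in seen:
        for other, count in co.get(item, {}).items():
            if other not in seen:
                scores[other] += count
    # Sort by score (descending), then by name so that ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical reading histories, purely for illustration.
histories = {
    "ann": ["book_a", "book_b", "book_c"],
    "bob": ["book_a", "book_b", "book_d"],
    "cat": ["book_b", "book_c", "book_d"],
    "dan": ["book_a"],
}
co = build_cooccurrence(histories)
print(recommend("dan", histories, co, k=2))  # ['book_b', 'book_c']
```

The parameter `k` is what turns big data into small data here: however large the catalogue, the user only ever sees the top few candidates.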

Thus, recommendation engines are becoming more and more important. Logically, the world of startups is filled with companies creating recommendation products in one way or another. alone lists hundreds of startups claiming to “recommend.” From the right restaurants (recommenu by Jake Bailey) to films (foundd by Lasse Clausen) to products (Linkcious by Weichang Lai) … All of those companies try to find a smarter way of making sense of data.

But what exactly is a recommendation engine? I asked Anmol Bhasin, who is one of the leading experts in the field of recommendation. Watch this two-minute video to learn about the difference between content-based recommendation engines and collaborative filtering.

But before you rush off and invest your money in recommendation engines, beware: life might not be that easy. There are major technology challenges in recommendation engines:

  • Cold Start Problem: The heart of a recommendation system is that a computer learns from data, i.e., who has read this book before, who connected to this person before, etc. One of the biggest challenges is that there may not be enough historical data at the start. Take FOUNDD, a young Berlin-based startup for movie recommendations. It did not have a long purchase history such as Netflix has, so its algorithm could not recommend anything useful in the beginning. Fully aware of that issue, the founder Lasse Clausen created a “hot or flop” page: each customer has to rate 10 movies before the system begins to recommend anything.
  • No Surprises: Even with a sufficient amount of historical data, the second problem with recommendation engines – if executed badly – is that there might be no surprises. Advising someone to read Harry Potter 3 after they looked at Harry Potter 6 is not all too insightful; it just states the obvious. Recommendation engines therefore work best in the long tail of the data, because that is where the unexpected results are.
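Both challenges can be sketched in a few lines. The first function below mirrors the “rate 10 movies first” gate; the second shows one possible way to favor the long tail by discounting each score by the item's popularity. All names, the sample numbers, and the logarithmic discount are assumptions chosen for illustration, not FOUNDD's actual implementation.

```python
import math

MIN_RATINGS = 10  # threshold taken from the "hot or flop" example


def recommend_or_ask(user_ratings, personalised, min_ratings=MIN_RATINGS):
    """Ask for more ratings until the user has rated enough items."""
    missing = min_ratings - len(user_ratings)
    if missing > 0:
        return {"action": "rate_more", "remaining": missing}
    return {"action": "recommend", "items": personalised(user_ratings)}


def rerank_for_surprise(scored_items, popularity):
    """Divide each relevance score by log-popularity so long-tail items rise."""
    adjusted = {
        item: score / math.log(2 + popularity.get(item, 0))
        for item, score in scored_items.items()
    }
    return sorted(adjusted, key=lambda item: (-adjusted[item], item))


# Hypothetical relevance scores and view counts, purely for illustration.
scores = {"harry_potter_3": 0.9, "indie_novel": 0.6}
views = {"harry_potter_3": 1_000_000, "indie_novel": 50}
print(rerank_for_surprise(scores, views))  # ['indie_novel', 'harry_potter_3']
```

The obvious sequel still scores highest on raw relevance, but after the discount the lesser-known title is shown first. How strongly to penalize popularity is a design choice that would need tuning against real user behavior.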

The two main industries that at this moment benefit strongly from recommendation engines are the retail industry and the media industry, because both have a lot of data in the long tail, and both have a lot of data to overcome the cold-start problem.


(Adapted from Oğuzhan Abdik under the Creative Commons licence)

But other industries are beginning to use recommendation engines more and more. In the transportation industry, we see increasingly intelligent navigation systems, either for personal use (Waze) or as traffic control systems (IBM). Or take the airline industry: GE started a Kaggle competition to find the best routes to save energy.

The recommendation engine is the shining star of big data, and we will see many more applications in the future. Read the next post (Sep 16th) to learn more about the third and last element of the data products: predictions. Can’t wait? Subscribe to my newsletter to get some free resources about data products – such as my latest talk at the Harvard Business Review Conference.

(The article was originally published by FORBES)[/vc_column_text][/vc_column][/vc_row]
