Author Archives: Lutz Finger

About Lutz Finger

Big Data Guy, Entrepreneur, Quantum Physicist, Author ... Competing on Data - How to Use Social Media Analytics and Big Data to Transform Your Business.

The New Facebook Likes

Social media is becoming more and more like a survey tool. You can now express different types of “likes”. Thumbs up or not? It’s your choice! “Not every moment is a good moment,” said “Zuck”. The new stickers open up a lot of new ways to survey or analyze people’s posts. But as always, more options mean less data, and less data makes analytics more complex. I am curious to hear how the analytics team will handle it.


The full overview was reported by the Next Web:

facebook Like Stickers

How Cornell Trains Future Data Managers

How do you create amazing new data-driven products? By studying deep, deeper and the deepest machine learning algorithms? Nope – by enabling business folks to talk and think data.

Recently I had the opportunity to test this theory. I taught a course for MBA students at Cornell University’s Johnson Business School and Cornell Tech campus. In the course, we covered the basic components of data science, including modeling, visualization, scraping technologies and databases. At the end, each team built a stunning data app: from predicting future Starbucks locations to setting prices for old vinyl records.

data to manager

It is the business mindset that drives many of our product ideas. Thus, while data scientists are tough to hire (I am hiring at LinkedIn – join our great team), it is the business focus that is missing. McKinsey recently pointed out that we face a shortage of about 150k data scientists as well as 1.5 million managers. Really? Are those data scientists so hard to manage that we need 10 managers to take care of each of them? No. The truth is that we are missing data-minded managers, because we need to include a business-related component in any data discussion. If you hear “actionable analytics,” it only means something like “business thinking inside”. Analytics should be focused on an action to improve or change our business.

My course at Cornell had two main objectives: to take away the fear of big data and to create a common language for MBAs to use. But how do we “take away the fear”? The answer is by having students build their own data applications. Many data science syllabi teach students tools such as R (a programming language for statisticians), Python, and the like. Don’t get me wrong: these are great tools, but they are useless for teaching MBAs. A few months down the road, no one will remember how to even load a dataset.

Thus I focus on simplicity. For data scraping we used a great tool founded by Andrew Fogg, for visualization a very simple tool by Matt Sundquist, and for the predictive layer either BigML or Excel. Yes, Excel. It can – with a bit of hand-holding – recreate many data science models. It is learning by doing. If you want to dig into this, I highly recommend the book Data Smart by John W. Foreman.


Scraping, plotting, and crunching are important, but it does not take long for any smart MBA to ask the “so what?” question. So we start with a framework of actionability and applicability discussed in my recent book Ask Measure Learn (find it here). Students learn that data and algorithms are nothing if you cannot create an action or build a product.

The course finishes with a term project where teams can use any data they can get as well as any complexity of model, as long as they define their own product use case for the app. This complete freedom created amazing results. One person, an avid collector of vinyl records, built an engine to better determine their price point. Another team built an app to find the best location for party-and-bar-loving MBAs at a given price. Yet another team analyzed the feedback from various classes to determine how professors need to change to improve their teaching.

More important, some of these projects went far beyond being cool and made a very strong business case. Here are just two examples from our course that showcase what MBAs can do with just a little data science background:

  • Jacob Jordan predicted with an 80% likelihood where Starbucks will open its next store. With his model, he went even further and analyzed the claim that Starbucks drives gentrification, but could not find a high correlation with typical gentrification factors.

At the end of the course there were many high-quality business apps powered by data. A complete list of all project videos can be found here. For me, this Cornell course is proof that in order to unleash amazing capabilities for innovation, companies need to teach business managers basic data science techniques.

This fall I will teach this course again at Harvard Business School (together with Prof. Datar) as well as at Cornell’s Johnson School. Let’s see what kind of innovation we will get!


(Lutz Finger – talks about his book “Ask Measure Learn” at Cornell University (photo: Bryan Russett))

This article was first published in Forbes.

Spring 2015 – Best of Cornell

How do you create amazing new data-driven products? By studying deep, deeper and the deepest machine learning algorithms? Nope – by enabling business folks to talk and think data. I recently had the opportunity to test this theory. I taught a course about data science for MBA students at Cornell University’s Johnson School and Cornell Tech campus. The student projects are listed here: from predicting future Starbucks locations to setting prices for old vinyl records.


Fair Rental Finder
This tool identifies whether an apartment in Boston is fairly priced.

Analyzing Yelp – Small Business Tool
A tool to help restaurant owners analyze reviews efficiently and use them effectively to improve their restaurants.

Predictive Pricing Guidance on Amazon for Vinyl Records
This tool analyzes Amazon’s pricing strategy and informs potential steps for action (go/no-go) in a music executive’s decision to sell vinyl records on Amazon or even to produce them.

Car Value by Location Estimator
This data product answers two questions: how much should I pay for a particular car in my region, and which regions have the cheapest prices for it?

Automatically populating product lines onto e-commerce sites
This tool helps to simplify the listing process at sites like eBay by analyzing product descriptions and inferring attributes from them.

Zillow Predictions
This tool predicts which Zillow properties will help investors quickly recoup their investment.

Starbucks Predictions
This tool makes it possible to quickly identify potential locations for new Starbucks shops. It also helps you determine whether Starbucks’ store location strategy is changing.

Professor2Rockstar
This tool analyzes students’ comments about professors and matches them with the respective ratings. The Professor2Rockstar data product can help professors determine which aspects of their course they should prioritize in order to improve overall course satisfaction.

Analytics of Cornell’s Animal Health Diagnostic Center
This analytics tool helps to analyze the customer feedback survey of Cornell’s Animal Health Diagnostic Center.

Nightlife and Apartment Finder
This tool helps young people to find an affordable apartment within their budget that is also close to a desired number of bars and clubs. The app analyzes apartment rentals and combines them with bar listings.

Predicting StartUp Funding Rounds
This tool aims to predict the likelihood of a successful funding round. It would thus allow investors to gauge the likelihood of a safe and lucrative investment.

Why LinkedIn Is As Important As The Capital Market

Recently, my favorite INSEAD professor, Kevin Kaiser, came over to LinkedIn with a group of executives from around the world. We chatted about value creation in Silicon Valley. Like no other part of the world, Silicon Valley has seen stunning success, with a positive reinforcing cycle of startup success and funding opportunities (see the Startup Genome blog here for more information). Why? Because Silicon Valley creates value that changes our world. Let’s look at professional social networks. The following is solely the opinion of Kevin Kaiser and myself and not necessarily the opinion of my employer LinkedIn, one of the biggest professional social networks.

Professional Social Networks = Economic Enablers

Professional networks are as important to our economy as banks and capital markets. Clearly they are not the same, but both provide transparency as well as a transactional layer that supports our economy. Let’s first look at banks. Recent years have been tough for them, but in their heyday, banks provided capital, created and facilitated business relationships, and fostered trust amongst borrowers. They created transparency in an otherwise obscure and highly fragmented market. They were eventually complemented by capital markets, which provide similar functions, but to larger numbers of human beings and with more transparency and with additional diversification of risk.

Let’s look at the comparison as such. Is a professional network connecting business opportunities to people comparable at all to a bank or a financial market? Yes, because there is no distinction between human capital and financial capital as such. Any given object (for example, take your mobile phone, where you are reading this article) is only possible because of human capital. Those who followed the demise of the various anti-market experiments, most notably the Soviet Union or more recently Venezuela, witnessed what happens when a large percentage of human capital becomes misallocated.

Transparency Through Professional Social Networks

A professional network reduces risks in our economy by offering transparency into the job market. In a perfect world, every one of us would find the perfect job in a domain where we excel, are most effective, and create the most overall value for our customers, ourselves, our company and thus society as a whole. There is room for improvement – 70% of US workers are disengaged (read more about this here).

Consider the potential of LinkedIn, as an example. Each of its 347 million members has documented their skills and career online. This data enables LinkedIn to understand which position requires what kind of skills. “Who should one hire?” LinkedIn matches candidates to jobs, and predicts the best matches for each opportunity from its member profiles. Based on its data, LinkedIn can go even further and suggest potential career paths as far as 5 years down the road (Click here to read about how LinkedIn predicted the future of a Mashable reporter). By using data, LinkedIn offers transparency, and as anyone with a finance background knows, transparency improves the allocation process, resulting in less perceived risk with better returns.

LinkedIn's power to Economy

Transactional Layer

There is another parallel between today’s professional networks and financial markets: both provide a transactional layer. Before the onset of banks, it was difficult to do business. The cost of a transaction was high and the process was complex. Banks helped to standardize the process, similar to what we now see happening with professional networks. They make the process of searching and applying for a new job easy (Career-switcher? Read about the data driven approach – which industry should you choose?).

We are living in a world of creative destruction and constant change, resulting in an onslaught of new and innovative products and methods. This requires a workforce capable of continuous adaptation, performing tasks and jobs that have never existed before. Who knows which skills will be valuable to society in the next 20 years? In a few hundred years, we have progressed from approximately 90% of our population working in farming, while food was scarce and the quality poor, to only 2%. This has enabled the other 98% to create entirely new industries, many of which resulted in the technological innovations that made food more plentiful, affordable and accessible to more people than ever before, not to mention the advancements in areas from medical care to entertainment that enrich all of our lives in immeasurable ways. But as we “fire the farmers,” it is imperative that they find these new jobs and develop the skills to perform in them. There is a need for a “transactional layer”. By making it easier to search, find and apply to any job, we can react better to change as a society.

Data is the new LIFEBLOOD

The key to these improvements is data. This was well described by Ann Winblad, co-founder and managing director of Hummer Winblad Venture Partners, in a recent interview: “I went from our alumni site on LinkedIn and to the LinkedIn created site and looked at the density of data they had on every one of the alum. Then I started digging and digging and digging in LinkedIn and realized, it’s not a human resource company, it’s a data company.” This data is what Jeff Weiner (@jeffweiner) describes as the Economic Graph. This dataset shows how members, companies, jobs, skills, schools and content are connected.

The concern for PRIVACY

While the potential benefit to society of analyzing data is immense, it always brings up questions about privacy. These are valid concerns: just as I do not want my bank data to be public, I would like to keep the conclusions drawn from my professional network data private. It is not yet clear how to achieve such a level of privacy. Should we just stop economic opportunity by interdiction? An example I described in my book, “Ask Measure Learn” (get your free sample chapter here), is Schufa, the German credit-rating company, and the Hasso-Plattner-Institut (HPI) at the University of Potsdam. Together, they started to investigate whether data from social networks could help to reduce credit risk. They planned a test to analyze whether information on social networks such as Facebook and Xing (the German LinkedIn) could help to predict creditworthiness.

Imagine the impact. If a financial provider can better assess your creditworthiness through data provided by social networks, the overall risk in providing loans would go down. Money would become cheaper and more accessible to more people. This is the core recipe for a booming economy. The Nobel Peace Prize-winning Grameen Bank showed us this exact scenario. They did not use Facebook information, but they leveraged social networks formed by villages.

The project in Germany was stopped due to public outrage about privacy concerns. History, however, has taught us that prohibition has never really worked well. And there are companies like Wonga or Prosper, both founded by the “who is who” of Silicon Valley, that aim to disrupt the banking space with social network information.

The Importance of Privacy

THE ROAD AHEAD: Data is the future

Professional networks are only the beginning. More areas of our life await the transparency that can be created with data. President Obama announced DJ Patil as the first Chief Data Scientist of the White House at this year’s Strata. The aim is to encourage more business models that can foster opportunities and reduce risk in our economy with open data, even though the questions about privacy are not yet solved. The opportunities are amazing or, as Jeff Weiner (CEO of LinkedIn) notably put it: “The creation of economic opportunity might be the defining issue of our time.”

This Column was co-authored by Lutz Finger, Director of Data Science at LinkedIn and author of the book “Ask Measure Learn” by O’Reilly Media and Kevin Kaiser, Professor of Management Practice, INSEAD, and co-author of “The Blue Line Imperative” and “Becoming a Top Manager”.


(republished from Forbes)

Social Media Bots

How To Spot Social Media Bots – They Are Often Lonely

Social media bots have become an increasingly challenging issue. They can trick you into buying stuff or even influence your opinion (read more about the trouble they can cause here). But one way to spot – and stop – bots is by looking at their friends. Who wants to have friends who are just there to ‘sell you stuff’? Correct! No one! Being a bot means being lonely – or hanging out with other bots.

To demonstrate this I teamed up with Affinio, a company co-founded by Tim Burke (@t1mburke) that looks at social communities for brands. Brands often measure social media activity as a whole, where every engagement counts equally. That is not entirely correct, however, because a brand normally wants to address only its specific target group. Does your Twitter audience equal your brand’s target group? Not necessarily. Affinio showcases this. As a nice side effect, one can easily spot bots, since they unintentionally form communities of their own.

Let’s take some Top Social Media “Influencers” (Link to List). We analyzed the communities from @jeffbullas, @briansolis and @AmyJoMartin.

Amy Jo Martin’s Followers

At first glance, Amy Jo Martin’s Twitter followers look rather inactive. Many of them have not uploaded an image and show only Twitter’s pre-set “egg” user image.

Random Sample from @amyjomartin

However, that does not necessarily mean that those accounts are bots. It might just be that those tweeps are less active and did not even bother to upload an image. The lower level of activity, however, is easy to see. If one clusters the Twitter followers of both Jeff and Amy, one finds that Jeff’s least active followers post about 15 tweets a month, while Amy has five groups of followers that post less than one tweet a month.

Jeff Bullas followers amyjomartin followers

Affinio’s strength is analyzing communities and identifying who is “influencing” each member of the community – meaning whom most tweeps in this community follow. If, however, one purchases a thousand bots like I did in my experiment with @spotthebot (see this movie about it), these bots are often sold to others as well. The bot owner has built them and now resells them over and over again. The consequence: all bots follow the same people. Said differently, the influencers of one group of bots can be seen as the “customer list” of these bots.

@AmyJoMartin has one community whose members follow 92% the same tweeps. This is a uniformity we have not seen since the breakup of the Soviet Union. Meaning – this is not human. By comparison, the audiences of Jeff Bullas and Brian Solis have a maximum follower similarity of only 45% and 40% respectively.
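To make the idea of “following the same tweeps” concrete, here is a minimal sketch of one way to score how uniform a community is. It is not Affinio’s method, which is not described in this post; it assumes a mean pairwise Jaccard similarity over the sets of accounts each member follows, and all account names and the function names `jaccard` and `community_uniformity` are hypothetical.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap of two followee sets: shared accounts / all accounts either follows."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def community_uniformity(following: dict[str, set[str]]) -> float:
    """Mean pairwise Jaccard similarity of followee sets within one community."""
    pairs = list(combinations(following.values(), 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: resold bots follow nearly the same "customer list" ...
bots = {
    "bot1": {"c1", "c2", "c3", "c4"},
    "bot2": {"c1", "c2", "c3", "c4"},
    "bot3": {"c1", "c2", "c3", "c5"},
}
# ... while human followers have far more varied interests.
humans = {
    "ann": {"nasa", "bbc", "c1"},
    "bob": {"espn", "bbc"},
    "cy": {"nasa", "nyt", "wired"},
}

bot_score = community_uniformity(bots)
human_score = community_uniformity(humans)
print(f"bots: {bot_score:.2f}, humans: {human_score:.2f}")
```

A score close to 1 means nearly everyone in the group follows the same accounts. Where to draw the line between “suspicious” and “normal” is a judgment call that depends on the network and the audience.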

unique influencer

Thus Amy has bots who follow her. Looking at the community, it becomes easy to see that not every “influencer” is as influential as we might have thought. If you want to learn more about ‘influencers’ and the way to measure them, read this free chapter on marketing from my book “Ask Measure Learn” by O’Reilly Media.

Please note that Amy has not necessarily purchased those bots. As pointed out in this post, it might just be that others bought them to harm her or maybe a bot programmer used Amy to make their bots look more natural.

Another way of spotting bots is by looking at their behavior, particularly when this behavior is too regular over time. If someone tweets and engages constantly, it is most likely not a human being. (At least I personally value my sleep!)

Spotting Bots
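The “too regular over time” signal can be sketched with two simple numbers per account. This is only an illustration under stated assumptions, not a method used by any particular network: the coefficient of variation of the gaps between posts (0 means clockwork regularity) and the number of distinct hours of the day in which the account is active. The timestamps and the function names `interval_cv` and `active_hours` are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def interval_cv(timestamps: list[datetime]) -> float:
    """Coefficient of variation of gaps between consecutive posts (0 = perfectly regular)."""
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        raise ValueError("need at least three posts spread over time")
    return pstdev(gaps) / mean(gaps)


def active_hours(timestamps: list[datetime]) -> int:
    """Number of distinct hours of the day (0-23) in which the account posted."""
    return len({t.hour for t in timestamps})


# Hypothetical bot: one post every 30 minutes, around the clock.
start = datetime(2015, 1, 1, 0, 0)
bot_posts = [start + timedelta(minutes=30 * i) for i in range(48)]

# Hypothetical human: a few posts a day at irregular times, none at night.
human_posts = [
    datetime(2015, 1, 1, 8, 5),
    datetime(2015, 1, 1, 12, 30),
    datetime(2015, 1, 1, 19, 45),
    datetime(2015, 1, 2, 9, 10),
    datetime(2015, 1, 2, 13, 0),
    datetime(2015, 1, 2, 22, 15),
]

for name, posts in (("bot", bot_posts), ("human", human_posts)):
    print(name, round(interval_cv(posts), 2), active_hours(posts))
```

An account with almost no variation between posts, or one that is active in every hour of the day, never sleeps. Both signals are easy for a bot programmer to randomize, so they only catch the careless ones.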

An Arms Race

If you are a software engineer, you might already think that all those issues with bots can easily be circumvented. Bots could upload images automatically. Bots could be more active – just tweet from an RSS feed. And yes, bots could be more careful about whom they follow and how regularly they tweet. You are right! Every time a network spots and removes a bot, the programmer understands that she will need to change the algorithm.

Because of this unintended feedback loop, a well-known dating service took action and no longer removed identified bots. Instead, it moved them into a virtual chatroom where spam bots meet up with bots from the dating site. This way the programmer will not know for a while that her bot was detected.

In the end, we need to trust that it is in the best interests of social networks to do all they can to remove bots. Most of them now offer a way to get certified or to flag a bot. By doing so they can train computer programs to spot bot activity.

Bots are a reality and they will try their best to influence us. And they might be even more successful in the area of Big Data. Bots generate data as well, and this data might skew algorithms such as those that pick trending content. We will need to master them as much as we have mastered spam, and learn to fend them off in the same way we have learned by now not to send money to a Nigerian prince. Maybe someday soon bots will outsmart us, but only if computers become more intelligent, as Nick Bostrom discussed in his excellent book “Superintelligence”. Until then, we should just be cautious about which prediction systems we believe.

Republished from Forbes