All posts by Lutz Finger

About Lutz Finger

Big Data Guy, Entrepreneur, Quantum Physicist, Author ... Competing on Data - How to Use Social Media Analytics and Big Data to Transform Your Business.

Why LinkedIn Is As Important As The Capital Market

Recently, my favorite INSEAD professor, Kevin Kaiser, came over to LinkedIn with a group of executives from around the world. We chatted about value creation in Silicon Valley. Like no other part of the world, Silicon Valley has seen stunning success, with a self-reinforcing cycle of startup success and funding opportunities (see the Startup Genome blog here for more information). Why? Because Silicon Valley creates value that changes our world. Let’s look at professional social networks. The following is solely the opinion of Kevin Kaiser and myself and not necessarily the opinion of my employer LinkedIn, one of the biggest professional social networks.

Professional Social Networks = Economic Enablers

Professional networks are as important to our economy as banks and capital markets. Clearly they are not the same, but both provide transparency as well as a transactional layer that supports our economy. Let’s first look at banks. Recent years have been tough for them, but in their heyday, banks provided capital, created and facilitated business relationships, and fostered trust amongst borrowers. They created transparency in an otherwise obscure and highly fragmented market. They were eventually complemented by capital markets, which provide similar functions, but to larger numbers of people, with more transparency and with additional diversification of risk.

Is the comparison fair? Is a professional network that connects people to business opportunities comparable at all to a bank or a financial market? Yes, because there is no real distinction between human capital and financial capital. Any given object (take the mobile phone on which you may be reading this article) is only possible because of human capital. Those who followed the demise of the various anti-market experiments, most notably the Soviet Union or more recently Venezuela, witnessed what happens when a large percentage of human capital becomes misallocated.

Transparency Through Professional Social Networks

A professional network reduces risks in our economy by offering transparency into the job market. In a perfect world, every one of us would find the perfect job in a domain where we excel, are most effective and create the most overall value for our customers, ourselves, our company and thus society as a whole. There is room for improvement: 70% of US workers are disengaged (read more about this here).

Consider the potential of LinkedIn, as an example. Each of its 347 million members has documented their skills and career online. This data enables LinkedIn to understand which position requires what kind of skills. “Who should one hire?” LinkedIn matches candidates to jobs and predicts the best matches for each opportunity from its member profiles. Based on this data, LinkedIn can go even further and suggest potential career paths as far as 5 years down the road (click here to read about how LinkedIn predicted the future of a Mashable reporter). By using data, LinkedIn offers transparency, and as anyone with a finance background knows, transparency improves the allocation process, resulting in less perceived risk with better returns.

LinkedIn's power to Economy

Transactional Layer

There is another parallel between today’s professional networks and financial markets: both provide a transactional layer. Before the advent of banks, it was difficult to do business. The cost of a transaction was high and the process was complex. Banks helped to standardize the process, similar to what we now see happening with professional networks. They make the process of searching and applying for a new job easy. (Career-switcher? Read about the data-driven approach: which industry should you choose?)

We are living in a world of creative destruction and constant change, resulting in an onslaught of new and innovative products and methods. This requires a workforce capable of continuous adaptation, performing tasks and jobs that have never existed before. Who knows which skills will be valuable to society in the next 20 years? In a few hundred years, we have progressed from approximately 90% of our population working in farming, while food was scarce and its quality poor, to only 2%. This has enabled the other 98% to create entirely new industries, many of which resulted in the technological innovations that made food more plentiful, more affordable and accessible to more people than ever before, not to mention the advancements in areas from medical care to entertainment that enrich all of our lives in immeasurable ways. But as we “fire the farmers,” it is imperative that they find these new jobs and develop the skills to perform in them. There is a need for a “transactional layer”. By making it easier to search for, find and apply to any job, we can react better to change as a society.

Data is the new LIFEBLOOD

The key to these improvements is data. This was well described by Ann Winblad, co-founder and managing director of Hummer Winblad Venture Partners, in a recent interview: “I went from our alumni site on LinkedIn and to the LinkedIn created site and looked at the density of data they had on every one of the alum. Then I started digging and digging and digging in LinkedIn and realized, it’s not a human resource company, it’s a data company.” This data is what Jeff Weiner (@jeffweiner) describes as the Economic Graph: a dataset that shows how members, companies, jobs, skills, schools and content are connected.

The concern for PRIVACY

While the potential benefit to society of analyzing data is immense, it always brings up questions about privacy. These are valid concerns: just as I do not want my bank data to be public, I would like to keep the conclusions drawn from my professional network data private. It is not yet clear how to achieve such a level of privacy. Should we just stop economic opportunity by interdiction? An example I described in my book, “Ask Measure Learn” (get your free sample chapter here), is Schufa, the German credit-rating company, and the Hasso-Plattner-Institut from the University of Potsdam (HPI). Together, they started to investigate whether data from social networks could help to reduce credit risk. They planned a test to analyze whether information on social networks such as Facebook and Xing (the German LinkedIn) could help to predict creditworthiness.

Imagine the impact. If a financial provider can better assess your creditworthiness through data provided by social networks, the overall risk in providing loans would go down. Money would become cheaper and more accessible to more people. This is the core recipe for a booming economy. The Nobel Peace Prize-winning Grameen Bank showed us this exact scenario. It did not use Facebook information, but it leveraged the social networks formed by villages.

The project in Germany was stopped due to public outrage about privacy concerns. History, however, has taught us that prohibition has never really worked well. And there are companies like Wonga or Prosper, both founded by the “who is who” in Silicon Valley, that aim to disrupt the banking space with social network information.

The Importance of Privacy

THE ROAD AHEAD: Data is the future

Professional networks are only the beginning. More areas of our life await the transparency that can be created with data. President Obama announced DJ Patil as the first Chief Data Scientist of the White House at this year’s Strata. The aim is to encourage more business models that create opportunities and reduce risk in our economy with open data, even though the questions about privacy are not yet solved. The opportunities are amazing, or as Jeff Weiner (CEO of LinkedIn) notably put it: “The creation of economic opportunity might be the defining issue of our time.”

This Column was co-authored by Lutz Finger, Director of Data Science at LinkedIn and author of the book “Ask Measure Learn” by O’Reilly Media and Kevin Kaiser, Professor of Management Practice, INSEAD, and co-author of “The Blue Line Imperative” and “Becoming a Top Manager”.


(republished from Forbes)

Social Media Bots

How To Spot Social Media Bots – They Are Often Lonely

Social media bots have become an increasingly challenging issue. They can trick you into buying stuff or even influence your opinion (read more about the trouble they can cause here). But one way to spot, and stop, bots is by looking at their own friends. Who wants to have friends who are just there to ‘sell you stuff’? Correct! No one! Being a bot means being lonely, or hanging out with other bots.

To demonstrate this, I teamed up with Affinio, a company co-founded by Tim Burke (@t1mburke) that looks at social communities for brands. Brands often measure social media activity as a whole, where every engagement counts equally. That is not entirely correct, however, because normally a brand only wants to address its specific target group. Does your Twitter account equal your brand’s target group? Not necessarily. Affinio showcases this. As a nice side effect, one can easily spot bots, since they unintentionally form communities of their own.

Let’s take some Top Social Media “Influencers” (Link to List). We analyzed the communities of @jeffbullas, @briansolis and @AmyJoMartin.

Amy Jo Martin’s Followers

At first glance, Amy Jo Martin’s Twitter followers look rather inactive. Many of them have not uploaded an image and show only Twitter’s preset “egg” user image.

Random Sample from @amyjomartin

However, that does not necessarily mean that those accounts are bots. It might just be that those tweeps are less active and did not even bother to upload an image. The lower level of activity, however, is easy to see. If one clusters the Twitter followers of both Jeff and Amy, one finds that Jeff’s least active followers tweet about 15 posts a month, while Amy has five groups that tweet less than one post a month.

Jeff Bullas followers amyjomartin followers

Affinio’s strength is to analyze communities and identify who is “influencing” each member of the community, meaning whom most tweeps in this community follow. If, however, one purchases a thousand bots, like I did in my experiment with @spotthebot (see this movie about it), these bots are often sold to others as well. The bot owner has built them and now resells them over and over again. The consequence: all bots follow the same people. Said differently, the influencers of one group of bots can be seen as the “customer list” of these bots.

@AmyJoMartin has one community whose members follow 92% the same tweeps. This is a uniformity we have not seen since the breakup of the Soviet Union. In other words, this is not human. By comparison, Jeff Bullas and Brian Solis have a maximum follower similarity across their different audiences of only 45% and 40% respectively.

unique influencer
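To make the idea of follower similarity concrete, here is a minimal sketch. It is not Affinio's method, which the analysis above does not spell out. It assumes that similarity can be approximated by the mean pairwise Jaccard overlap of the accounts each community member follows. The function names and the sample accounts are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Accounts followed by both users, divided by accounts followed by either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def community_similarity(follows):
    """Mean pairwise Jaccard overlap of the 'following' sets in one community.

    `follows` maps a member's handle to the set of accounts that member follows.
    This metric is an assumption made for illustration only.
    """
    pairs = list(combinations(follows.values(), 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical communities: resold bots share one "customer list",
# while humans overlap only here and there.
bot_like = {
    "bot_1": {"cust_a", "cust_b", "cust_c", "cust_d"},
    "bot_2": {"cust_a", "cust_b", "cust_c", "cust_d"},
    "bot_3": {"cust_a", "cust_b", "cust_c"},
}
human_like = {
    "ann": {"news", "chef", "band"},
    "bob": {"news", "club", "poet"},
    "cyd": {"chef", "club", "lab"},
}

bot_score = community_similarity(bot_like)
human_score = community_similarity(human_like)
print(f"bot-like community:   {bot_score:.0%}")
print(f"human-like community: {human_score:.0%}")
```

A community that scores far above all others on a measure like this is worth a closer look, although the threshold for "not human" remains a judgment call.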

Thus Amy has bots who follow her. Looking at the community, it becomes easy to see that not every “influencer” is as influential as we might have thought. If you want to learn more about influencers and how to measure them, read this free chapter on marketing from my book “Ask Measure Learn” by O’Reilly Media.

Please note that Amy has not necessarily purchased those bots. As pointed out in this post, it might just be that others bought them to harm her or maybe a bot programmer used Amy to make their bots look more natural.

Another way of spotting bots is by looking at their behavior, particularly when this behavior is too regular over time. If someone tweets and engages constantly, it is most likely not a human being. (At least I personally value my sleep!)

Spotting Bots
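The behavioral check can be sketched in a few lines as well. This is only an illustration under stated assumptions: post times are given as seconds since midnight UTC on the first day, regularity is measured as the coefficient of variation of the gaps between posts, and the thresholds `max_variation` and `min_hours` are hypothetical values, not tuned ones.

```python
from statistics import mean, pstdev


def gap_variation(timestamps):
    """Coefficient of variation of the gaps between posts (0 = perfectly regular)."""
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return None  # not enough data to judge
    return pstdev(gaps) / mean(gaps)


def active_hours(timestamps):
    """Number of distinct hours of the day (0-23) in which the account posted."""
    return len({(t // 3600) % 24 for t in timestamps})


def looks_automated(timestamps, max_variation=0.1, min_hours=20):
    """Flag accounts that post like clockwork or never seem to sleep."""
    variation = gap_variation(timestamps)
    too_regular = variation is not None and variation < max_variation
    never_sleeps = active_hours(timestamps) >= min_hours
    return too_regular or never_sleeps


# Hypothetical accounts: one posts every 30 minutes around the clock,
# the other posts a handful of times at irregular daytime hours.
bot_times = [i * 1800 for i in range(48)]
human_clock = [(9, 0), (9, 12), (12, 30), (13, 6), (18, 0), (21, 42)]
human_times = [hour * 3600 + minute * 60 for hour, minute in human_clock]

print(looks_automated(bot_times))
print(looks_automated(human_times))
```

A real detector would look at much longer histories and at time zones. The point here is only that "too regular" and "never sleeps" can both be measured.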

An Arms Race

If you are a software engineer, you might already think that all those issues with bots can be easily circumvented. Bots could upload images automatically. Bots could be more active: just tweet from an RSS feed. And yes, bots could be more careful about whom they follow and how regularly they tweet. You are right! Every time a network spots and removes a bot, the programmer understands that she will need to change the algorithm.

Because of this unintended feedback loop, a well-known dating service took action and no longer removed identified bots. Instead, it moved them into a virtual chatroom where spam bots meet up with bots from the dating site. This way the programmer will not know for a while that her bot was detected.

In the end, we need to trust that it is in the best interests of social networks to do all they can to remove bots. Most of them now offer a way to get certified or to report a bot. By doing so, they can train computer programs to spot bot activity.

Bots are a reality and they will try their best to influence us. And they might be even more successful in the area of Big Data. Bots generate data as well, and this data might skew our algorithms, such as those for trending content. We will need to master them as much as we have mastered spam, and learn to fend them off in the same way we have learned by now not to send money to a Nigerian prince. Maybe someday soon bots will outsmart us, but only if computers become more intelligent, as Nick Bostrom discussed in his excellent book “Superintelligence”. Until then, we should just be cautious about which prediction systems we believe.

Republished from Forbes

Why President Obama Needs A Chief Data Scientist

During this year’s STRATA conference, President Obama introduced Dr. DJ Patil as his new Chief Data Scientist in a video message. DJ is a very well-known data scientist and is even credited by some with coining the term “data science”. During his introduction of DJ, Obama said that he wanted to tell a joke about data science but noted “half of the stuff my staff came up with was below average.”

Let’s decode this sentence into “stats speak” for a moment. What Obama meant was that the median quality of those jokes was less than their mean. Thus the quality of the suggested jokes was skewed towards the bad end, and therefore he decided to drop the joke. That could be wrong, however, because all that was needed was one joke. Even if all but one joke were terrible, that one joke is the one he could have used to start off his intro of DJ.
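A small worked example shows how these statements fit together. The scores below are invented for illustration (hypothetical quality ratings from 1 to 10). They are chosen so that exactly half of the jokes fall below the average, the median sits below the mean, and one joke is still excellent.

```python
from statistics import mean, median

# Hypothetical quality ratings (1 = terrible, 10 = brilliant) for six jokes.
joke_scores = [1, 2, 3, 5, 6, 10]

average = mean(joke_scores)    # 27 / 6 = 4.5
middle = median(joke_scores)   # (3 + 5) / 2 = 4.0
below_average = sum(score < average for score in joke_scores)
best = max(joke_scores)

print(f"mean: {average}, median: {middle}")
print(f"{below_average} of {len(joke_scores)} jokes are below average")
print(f"best joke: {best}")
```

The summary statistics describe the pile of jokes, but the decision only needs the maximum.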

This omission is precisely why we need data scientists like DJ. We do not need all of the big data, we need the right data, and sometimes it is only ONE dataset that we need. Even when most of the data within our Big Data cloud is ‘bad’ (a.k.a. useless), we might be able to pull off a great prediction if we get the right dataset.

The same point is valid for models. We do not need many models (with perhaps the exception of a method called “random forest”), but rather we need only one model that is sufficient in its balance between accuracy and speed.

DJ, your knowledge and insights are needed. Let’s look for the right data within those 135 000 datasets that were made available to the public. We are looking forward to great changes based on data…

DJ - the first CDS of the US