Data, and by extension data analytics, are becoming increasingly important for business. At the same time, the data deluge makes it harder every day to make sense of it all.
Here are three trends to keep in mind for 2016.
1. Don’t shoot from the hip
Numbers are becoming more popular with most people, but the more numbers we get, the more useless most of them seem to be. If they are pulled out of a hat, why would you take them into consideration in your decision-making process?
Polling your audiences is fine.
But that is not a statistic that adds up exactly to something like 97%, is it?
Or are you keeping tallies of your straw polls and then doing the statistics?
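If you do keep tallies, a quick margin-of-error check shows how little a small straw poll can support a precise-looking percentage. The sketch below is only an illustration: the function name `poll_margin_of_error` is my own, it uses the simple normal approximation for a proportion, and it assumes a random sample, which a show of hands at an event rarely is.

```python
import math


def poll_margin_of_error(yes_votes, total_votes, z=1.96):
    """Return (share, margin) for a yes/no tally.

    Uses the normal approximation for a proportion; z=1.96 corresponds
    to roughly 95% confidence. Assumes a simple random sample, which
    most straw polls are not, so treat the margin as a lower bound
    on the real uncertainty.
    """
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")
    share = yes_votes / total_votes
    margin = z * math.sqrt(share * (1 - share) / total_votes)
    return share, margin


# Hypothetical tally: 29 of 30 people in the room raised their hand.
share, margin = poll_margin_of_error(29, 30)
print(f"share = {share:.1%}, margin = +/- {margin:.1%}")
```

With so few respondents the margin is several percentage points wide, so quoting the result to the exact percent suggests a precision the data do not have.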
Google Flu Trends is an example that illustrates this problem further. For instance, it:
– looks at historical data – descriptive analytics and research, and
– tries to predict what might happen – predictive analytics – with the help of a model that was developed.
The results are supposed to help us better understand how the flu will spread next winter. Unfortunately, in the Google Flu Trends versus National Institutes of Health (NIH) challenge, the winner is? NIH! Google's estimates are simply far off from the actual data the NIH produces for policy makers and health professionals.
2. Bad data result in bad decisions
Publishing rankings or product tests is popular. Since some readers devour such rankings, publishers can sell more copies, which keeps advertisers happy.
A real win-win situation, right? Not so. Wrong decisions can result in outcomes that are not desirable. For instance, attending the wrong college or polluting more than the test results indicate (think Volkswagen and #dieselgate) is not something we want.
…I would be exceedingly displeased to learn that the bankers to whom I was handing over a king’s ransom were being taught that errors were perfectly acceptable.
This mistake-loving nonsense is an export from Silicon Valley, where “fail fast and fail often” is what passes for wisdom. Errors have been elevated to such a level that to get something wrong is spoken of as more admirable than getting it right.
By collecting data and using flawed methods, we produce rankings or test results that can seriously hurt people. For instance, when drug certification tests are done improperly and the regulator has no idea, unknown side effects can kill people.
Using the wrong test results to approve or certify a car can have dismal effects as well. Volkswagen is accused of manipulating tests, and the public got more pollution than it bargained for. VW is working on fixing the 11 million vehicles affected by the diesel cheat, but this will not undo the damage to the firm’s reputation and our health.
3. Check before you trust the method used
It is always wise to take 5 minutes to do an acid test with any study report we see, such as:
– what does the methodology tell us (e.g., we asked university deans to rank their competitors); and
– does the measure or measures used make sense (e.g., one question about how a university developed / improved study programs – result = ASU is more innovative than Stanford or MIT… who are you kidding?).
ArtReview publishes an annual ranking of the contemporary art world’s most influential figures. In short, it helps if you live in London or New York so the ArtReview editors or journalists are aware of who you are.
I asked for an explanation of how these rankings are developed:
Dear Sir or Madam
I would like to know more about the methodology you used for the ArtReview’s Power 100 List.
Can you help… this would be great to use with my students in a class.
I could not find anything on the website that I could show my students.
Professor Urs E. Gattiker, Ph.D.
14 days later I got an answer from the makers of the ranking:
Subject: Re: Message from user at ar.com
We are not following a grid of criteria per se, and the list emerges from a discussion between a panel of international contributors and editors of the magazine, who each advocate for the people they feel are most influential in their region. The influence of the selected people on the list is based on their accomplishments in the past 12 months. I have attached here the introduction to the Power 100, which might help you in defining our approach.
I hope that helps,
A grid of criteria, what is that? Of course, the office clerk answering me has no clue about the research methodology used, as the answer indicates. One could start believing that this Top Art list came out of a discussion or a straw poll. A totally chaotic approach.
You can view the attachment that explains this sloppy method below.
A friend of mine smiled, and said:
For me this is a great list, Urs. Those on the list rarely if ever represent value for money for serious art collectors. Instead you get buzz and have to pay for their image. The list tells me who we do not need to work with. We use other experts. These give us more value for money. They help us to complement our award-winning collection.
We all know that data quality is important and frequently discussed. In fact, the trustworthiness of data directly relates to the value it can add to an organisation.
As the image above suggests, doing quality research takes a decent method that results in data that permits careful analysis. Sloppy data are cheap to get, but dangerous if used in decision-making. Such findings are neither replicable nor likely valid.
However, we are increasingly required to present findings in order to attract more readers. Some, like Inc., master this very well. Another example of theirs I came across was:
Though truly quantifying “best” is impossible, the approach Appelo’s team used makes sense, especially when you read the books that made the list.
And here’s the methodology:
The purpose of our work was to find out which people are globally the most popular management and leadership writers, in the English language.
Step 1: Top lists
With Google, we performed a lot of searches for “most popular management gurus”, “best leadership books”, “top management blogs”, “top leadership experts”, etc. This resulted in a collection of 36 different lists, containing gurus, books, and blogs. We aggregated the authors’ names into one big list of almost 800 people.
Step 2: Author profiles
Owing to time constraints, we limited ourselves to all authors who were mentioned more than once on the 36 lists (about 270 people), though we added a few dozen additional people that we really wanted to include in our exploration. For all 330 authors, we tried to find their personal websites, blogs, Twitter accounts, Wikipedia pages, Goodreads profiles, and Amazon author pages.
So you defer to 36 people and their lists and include those authors who are mentioned more than once. And if that does not include the ones you believe should be on the list because you read their books and liked them, no worries: you add a few dozen people (about 60) and voilà, you have 330 authors (how they were ranked is totally unclear, but interesting: blog reputation, Twitter followers, etc.).
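To see how much the hand-picked additions shape the result, the selection step quoted above can be sketched in a few lines. This is my own reading of Steps 1 and 2, not the team's actual procedure: the function name `select_authors`, the `min_mentions` parameter, and the toy data are all hypothetical.

```python
from collections import Counter


def select_authors(top_lists, hand_picked=(), min_mentions=2):
    """Return (selected, mentions) for a collection of top lists.

    An author is selected when named on at least `min_mentions`
    different lists, or when hand-picked by the researchers.
    """
    # Count each author once per list, even if a list repeats a name.
    mentions = Counter(name for names in top_lists for name in set(names))
    by_count = {name for name, count in mentions.items() if count >= min_mentions}
    # Hand-picked names bypass the mention threshold entirely.
    return by_count | set(hand_picked), mentions


# Hypothetical toy data standing in for the 36 lists.
top_lists = [
    ["Author A", "Author B", "Author C"],
    ["Author B", "Author D"],
    ["Author A", "Author B", "Author E"],
]
selected, mentions = select_authors(top_lists, hand_picked=["Author E"])
print(sorted(selected))
```

In this toy run, "Author E" appears on only one list and makes the cut solely because someone wanted it there, which is exactly the part of the method a reader cannot verify.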
What is your take?
– what will you change in your data #analytics and #analysis work in 2016?
– what is your favourite example from 2015, illustrating GREAT analytics work and research?
– how do you deal with this data deluge?
– what would you recommend to a novice (ropes to skip)?