Guest Blog by Eric Siegel, PhD, founder of Predictive Analytics World and headlining keynote speaker at the Sixth Annual Great Lakes Business Intelligence & Big Data Summit on March 15, 2018 in Troy, MI
Data is the world's most potent, flourishing unnatural resource. Accumulated in large part as the byproduct of routine tasks, it is the unsalted, flavorless residue deposited en masse as organizations churn away. Surprise! This heap of refuse is inherently predictive. Thus begins a gold rush to dig up insightful gems.
Does crime increase after a sporting event? Do online daters more consistently rated as attractive receive less interest? Do vegetarians miss fewer flights? Does your e-mail address reveal your intentions?
Yes, yes, yes, and yes!
We’ve entered the golden age of predictive discoveries. A frenzy of number crunching churns out a bonanza of colorful, valuable, and sometimes surprising insights.
Predictive analytics' aim isn’t limited to assessing human hunches by testing relationships that seem to make sense. It goes further, exploring a boundless playing field of possible truths beyond the realms of intuition. And so it drops onto your desk connections that seem to defy logic. As strange, mystifying, or unexpected as they may seem, these discoveries help predict.
Welcome to the Ripley’s Believe It or Not! of data science—the Freakonomics of big data.
Below are nine colorful discoveries, each pertaining to a single predictor variable, from the likes of Walmart, Uber, Harvard, Shell, Microsoft, and Wikipedia. These examples are new in this year's Revised and Updated edition of my book, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, bringing the book's more extensive "Bizarre Insights" table up to 46 total. (For more information about the examples below, access the book's Notes PDF, provided at no charge at www.PredictiveNotes.com, and search by organization name.)
And now a word of warning! In the table of examples above, do not give much credence to the “Suggested Explanation” column’s attempt to answer “why” for each insight. For each one, there are also other plausible explanations, and, in most cases, only intuition rather than scientific evidence behind the particular answer provided. The reasons behind each discovery in the left column are generally unknown. Every explanation put forth, each entry in the rightmost column, is pure conjecture with absolutely no hard facts to back it up.
The dilemma is, as it is often said, correlation does not imply causation. The discovery of a predictive relationship between A and B does not mean one causes the other, not even indirectly. No way, no how. My Quartz article on this topic explores it in detail.
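For readers who like to see this in numbers, here is a minimal sketch of how two variables can predict each other without either one causing the other. It is a toy simulation, not an example from the book: the variables A and B, the hidden factor C that drives both, and the function names are all hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=10_000, seed=0):
    """Generate A and B that share a hidden cause C (an assumed toy model).

    Neither A nor B appears in the other's formula, so neither causes
    the other; any correlation comes entirely from C.
    """
    rng = random.Random(seed)
    a, b = [], []
    for _ in range(n):
        c = rng.gauss(0, 1)             # hidden common factor
        a.append(c + rng.gauss(0, 1))   # A depends on C plus its own noise
        b.append(c + rng.gauss(0, 1))   # B depends on C plus its own noise
    return a, b


a, b = simulate()
r_observed = pearson(a, b)

# "Intervene" on A: overwrite it with values chosen independently of C.
# B is untouched, and the predictive relationship disappears.
rng = random.Random(1)
a_forced = [rng.gauss(0, 1) for _ in a]
r_after_intervention = pearson(a_forced, b)

print(f"observed correlation:       {r_observed:.2f}")
print(f"after intervening on A:     {r_after_intervention:.2f}")
```

In this toy model the expected correlation between A and B is 0.5, so A is a useful predictor of B. Yet forcing A to new values does nothing to B, which is exactly why a predictive discovery on its own says nothing about cause.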
But do not fret. When applying predictive analytics, even though we generally don’t have firm knowledge about causation, we often don’t necessarily care. For many projects, the value comes from prediction, with only an avocational interest in understanding the world and figuring out what makes it tick. The freak show of surprising discoveries delivers predictive value even when it does little to explain itself.
- Eric Siegel
About the Author
Eric Siegel, PhD, is the author of Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, Revised and Updated Edition (Wiley, January 2016), founder of the Predictive Analytics World conference series, executive editor of The Predictive Analytics Times, and a former computer science professor at Columbia University.
About the Great Lakes Business Intelligence & Big Data Summit
This one-day event includes keynotes from industry experts, case study sessions, vendor software demonstrations, hands-on workshops, and plenty of networking opportunities. Attendees will learn about the latest BI and big data software, best practices, and success stories to help them capitalize on big data, business intelligence, machine learning, analytics, and data visualization opportunities.