Predictive Analytics: Separating the Hope from the Hype
Data mining and predictive modeling tools can enhance BPM projects, but they must be used with care.
Good buzzwords are difficult
for marketers to resist, and this seems especially true when software companies
want to mark their territory by using new acronyms and feature descriptors — or
revamping old ones — in the overlapping areas of business intelligence (BI),
business performance management (BPM), and the product space recently branded
as predictive analytics and data mining (PA/DM). If, like me, you’ve been
observing the evolution of these software categories from a strictly BPM
perspective, you’ll have noticed how the meanings of certain terms shift and
slide, and how, as a result, the dividing lines between these sectors have
become a bit blurred.
Take “predictive,” for example.
Five years ago, some BPM product literature used the word rather loosely to
describe any software capability that enabled the visualization of meaningful
metrics, or any functionality which helped the decision-maker to manage by
exception. There wasn’t much actual prediction involved.
Currently, though, when a
BPM product touts predictive ability, it’s the real deal. Traditional data
mining and predictive modeling methods, often now together referred to as
“predictive analytics,” have become much more readily available in BPM vendors’
product lines. Marketing messages and related articles hammer away at their
utility, their necessity, and even their unassailable superiority over more
traditional approaches.
There’s no question that
predictive analytics can powerfully enhance a BPM project. However, as with any
tool, deciding how and when to use it means taking a step back to think about
what you need, what you’d like to accomplish, and how you intend to proceed.
Here are some key points to keep in mind:
It’s an Evolution, Not a Revolution
Let’s take that step back.
What has really changed in predictive technologies? Traditional data mining
methods, which focus on the discovery of new relationships and patterns in large
data sets, aren’t new, although they have benefited from increasing
computational capabilities and new approaches. It can also be argued that there
haven’t been any recent, major breakthroughs in the areas of information
visualization or predictive modeling.
On the other hand, it’s true
that extract, transform and load (ETL) products, middleware integration, and
related deployment tools have steadily improved and that integrated data
preparation tools are now more widely available in data warehousing solutions.
In addition, storing very large data sets has become inexpensive, which comes
in handy now that extensive data (for example, from detailed, real-time customer
activity measurement and social media sources) is available.
BPM decision-makers might
find this interesting, but many will still regard predictive tools as much more
useful for focused tasks such as detecting fraud, analyzing marketing
campaigns, or generating customer recommendations than for, say, plotting a
company’s overall direction. However, as these computing trends continue,
statistical analysis and related practices for analyzing large data sets are
becoming more mainstream. This is good for BPM; one can expect users to have
more arrows in their quiver to predict, measure, and hit their targets.
The question then should be:
How can these tools help improve the efficiency of processes and the accuracy
of BPM-related decisions? Real competitive advantage will result from learning
how to use each tool, knowing when to use it, and understanding how much
reliance to place on its results.
You Can’t Predict Everything
Pick up any book or article
on data mining or predictive analytics, and you’ll be bombarded with success
stories. They’re typically very selective, and they posit — using seemingly
unassailable statistical measures — that given a large enough data set, nearly
everything and everyone is wholly predictable.
Just in case you were
predicting that this article would take a similar direction, don’t worry — it
won’t! It’s no fun being so predictable.
Current product literature
in the PA/DM space often launches immediately into regression, clustering, rule
induction, and other subjects that can be quite daunting for those of us with
only an average familiarity with statistical methods. At its core, a product’s
predictive ability rests on these traditional statistical methods. They can
yield impressive results; however, they’re so often misused, and their results
so commonly misunderstood, that business decision-makers are constantly
challenged to determine how much reliance to place on their recommendations.
And as if that weren’t problematic
enough, the methods themselves also have potential weaknesses. It can be
dangerous to base important decisions on a poorly designed statistical study or
to rely too heavily on the results of an analysis that aren’t fully understood.
It’s no wonder that BPM practitioners tend to be slow to adopt these new tools.
To a degree, their hesitation is justified.
Nothing Trumps Human Judgment and Common Sense
Take heart, and don’t let a
statistician tell you that your decision-making days are over! If you feel that
you’re still smarter than your computer, you’re not crazy. Human beings,
despite their lack of raw computational power, have amazing pattern recognition
abilities that no computer can come close to matching.
On the downside, our brains
are certainly limited in the amount of data they can process. In addition, our
decision-making can be impaired by certain experiential patterns, emotions,
social pressures, and similar factors.
But a well-tuned human critical reasoning process can successfully deal with
these and leverage the information that’s needed to make good decisions,
including the results of well-designed statistical studies.
Getting to that point isn’t
an easy task; there’s a lot to consider when using these computational tools. The
dangers include over-reliance on the results of a poorly constructed study and
misunderstanding of the results of a well-crafted one. Even regarding the
statistical methods themselves as flawless can get you in trouble. For example,
it can be a bad decision to hang your hat on results from a single study just
because they’re “statistically significant”; factors the study didn’t measure,
however improbable they seem, can be the real cause of the effect under
investigation.
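To make the point about unmeasured factors concrete, here is a minimal Python sketch built on simulated data, not on any real study. It assumes a hypothetical hidden factor (think of seasonal demand) that drives two measured metrics, labeled x and y purely for illustration. The two metrics correlate strongly even though neither influences the other, and the relationship all but disappears once the hidden factor is accounted for through a partial correlation.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Simulated data (an assumption for illustration): a hidden factor drives
# both metrics, and neither metric has any effect on the other.
rng = random.Random(42)
n = 2000
hidden = [rng.gauss(0, 1) for _ in range(n)]
x = [h + rng.gauss(0, 0.5) for h in hidden]
y = [h + rng.gauss(0, 0.5) for h in hidden]

raw = pearson(x, y)                     # what a study measuring only x and y sees
adjusted = partial_corr(x, y, hidden)   # what remains once the hidden factor is included

print(f"raw correlation between x and y:     {raw:.2f}")
print(f"after controlling for hidden factor: {adjusted:.2f}")
```

With data like this, a study that measured only x and y would report a strong, statistically significant relationship. Only someone who thought to ask what else might be going on would find the real driver.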
Biases in human
decision-making often turn up in certain predictive studies — for example, when
sampling is less than random, or when Bayesian approaches are incorporated into
a computational design. (A Bayesian approach or inference may include prior
studies’ results or other informed guesses about certain hypotheses, which may
introduce biases into the study.)
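The influence of a prior can be shown with a small worked example. The sketch below assumes a standard beta-binomial model, chosen here for simplicity rather than taken from any particular PA/DM product, and uses made-up numbers: 12 conversions observed in 40 trials. The same data are combined with two different priors, one neutral and one optimistic.

```python
def posterior_mean(prior_successes, prior_failures, successes, trials):
    """Posterior mean of a success rate under a Beta prior and binomial data.

    A Beta(a, b) prior updated with k successes in n trials gives a
    Beta(a + k, b + n - k) posterior, whose mean is (a + k) / (a + b + n).
    """
    a = prior_successes + successes
    b = prior_failures + (trials - successes)
    return a / (a + b)


# Hypothetical observations: 12 conversions in 40 trials (a 30 percent rate).
successes, trials = 12, 40

# Neutral prior, Beta(1, 1): no strong opinion before seeing the data.
flat = posterior_mean(1, 1, successes, trials)

# Optimistic prior, Beta(30, 10): the analyst already believes the rate is
# around 75 percent, perhaps based on an earlier study.
optimistic = posterior_mean(30, 10, successes, trials)

print(f"estimate with neutral prior:    {flat:.3f}")
print(f"estimate with optimistic prior: {optimistic:.3f}")
```

The observed rate is 30 percent, yet the optimistic prior pulls the estimate above 50 percent. Neither result is wrong; the point is that the analyst’s starting assumptions are part of the answer and deserve the same scrutiny as the data.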
Not to worry — the risks can
be managed, and the results of such studies can be measured against our
informed common sense. We regularly decide whether or not to incorporate the
recommendations of our computing tools into our final decisions. Anyone who intuitively knows when to
ignore their car’s GPS or when they’ve wasted enough time on the phone talking
to a voice recognition program understands the difference between computational
intelligence that complements their own know-how and a system that seems to
insult their IQ.
The key to leveraging the
benefits of any PA/DM tool is to understand how it arrived at its
recommendation; without this, any predictive calculation becomes a mysterious
black box. It would be irresponsible to surrender your decision-making ability
to such a gadget. This is a primary challenge to successfully leveraging
predictive BPM and predictive analytics as a whole.
Understanding predictive
analytics shouldn’t require you to forgo common sense; just the opposite, in
fact. The next time you’re told that the economy is doing well because jobless
claims increased by X percent less than expected, ask yourself: What was
expected? What economic measurements were these growth figures based on? What
were these numbers five, ten, or twenty years ago? What has changed with those
factors since then?
It’s tempting to accept
statistical measurements at face value without questioning them. Within the
world of PA/DM, it’s equally easy to avoid questioning, let’s say, the value of
a certain standard deviation measurement when it’s one of dozens of factors to
be considered in your next important business decision. If you can picture
yourself tossing such an interpretation task to your fresh-out-of-school
MBA employee rather than tackling it yourself, you’re probably placing too much
reliance on the results your PA/DM tool set gives you.
The key to utilizing predictive BPM tools for competitive success is to improve your understanding of these methods and their application to your industry. The competitive advantage that can be gained by using these tools is, of course, heavily reliant on the quality of your organization’s systems and data governance, its ability to adapt, and other foundational factors. The challenges are real, but they’re not insurmountable — and there’s no good reason to ignore these new tools when you’re measuring, managing, and designing business performance efficiencies.
Chris Iervolino holds a
doctorate in computing studies and is a senior managing director at ITEC
Consulting Inc., a BPM consulting organization specializing in all aspects of
corporate performance application design and implementation.

