It Takes More than Excel to Make Data-Driven Business Decisions

UNLIMITED DATA | BY JAMES KULICH | 5 MIN READ

Imagine that you are getting ready to leave on your big vacation. What to pack? What will the weather be?

Suppose the only information you could get is that the average temperature in paradise this month is 65 degrees and the average precipitation is 4.68 inches. That’s great, but you want to know what the weather will be next week, not some average collected from the past 30 years. You have decisions to make. Will you be able to swim or should you plan for more museum visits? Should you rent the convertible?

How different is this from the way we often use data to inform business decisions?

We gather the routine business data we need. We consider some basic statistics like averages. We produce some nice visuals displaying summary information by, say, region. We explore some special cases.

But we don’t do the kind of deep dive into the data that could pinpoint what is really happening now—what we need to know in order to make good business decisions.

Sample Case: When Excel Isn’t Enough for Data-Driven Business Decisions

Compare the travel scenario above with a case developed by one of my colleagues, John Aaron, based on his extensive consulting experience.

You get a spreadsheet filled with good details about on-time delivery by branch for the past year. You compare averages and notice that the Northeast is not doing so well. You may even do some basic statistics to determine that on-time delivery in the Northeast for the past year really is lagging other regions. You start to ask questions of the experts.

The vice president of supply chain tells you about his plan for business transformation. He is building a bench of Six Sigma black belts who will take the disciplined approach necessary to bring internal processes behind the delivery operation under good control. He shows you his collection of hand-drawn process flows.

You dig a little deeper. You learn from the quality manager that this company offers a wide range of product lines and that she believes the product mix maintained in inventory is the problem behind late deliveries. She also complains about poor workmanship in some of the manufacturing centers in the Northeast. A look at some graphs showing assigned codes for late delivery reveals that poor workmanship is indeed the most cited reason.

The operations manager tells you a different story. He complains about staff cuts and mentions that current IT systems are not up to the job. The IT manager confirms some of this, noting that various systems have been cobbled together over the years and that there is no good system for data governance. He believes that the company must invest in a new enterprise resource planning (ERP) system.

The finance manager tells a similar story. She points out that she only gets personally involved if inventory runs low but wonders why inventory for one particular product—let’s call it “Component D”—is always high. She muses that it has always been this way.

All useful information, but you are no closer to making a recommendation for useful action. The expert opinions are valuable, but the data behind them is simply too flat. You have averages and counts of the sort a tool like Excel provides, but little insight into root causes and little ability to predict what might result from possible changes to business processes.

Where Machine Learning Trumps Excel for Data-Driven Business Decisions

Don’t get me wrong. I love Excel. I use Excel all the time for the many things it does well. But you need more than the basic tables and graphs Excel can provide if you are going to make headway in solving a complex business problem like this.

You need to be able to sort through the data in ways that will reveal root causes that suggest good courses for action. You need the capabilities of modern machine learning.

In this sample case, a machine learning approach leads to some interesting conclusions. A simple decision tree derived from a machine learning algorithm pinpoints excessive time on task at one manufacturing center (let’s call it Manufacturing Center C) as a primary factor behind late deliveries and shows that high inventory levels for Component D matter, but less. Why might Manufacturing Center C take so long to finish its work? Are its processes in need of revision?

Perhaps, but your conversation with the branch manager reveals that they are constantly interrupting their runs to meet the shifting demands of a high-profile customer. And, in order to do this, they need to keep a lot of Component D on hand.

Aha! You now understand a root cause. Greasing the squeaky wheel may or may not be the right thing for the company to do, but now it can make a decision informed by real information gain.
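For readers curious about what "information gain" means in practice, here is a minimal sketch in Python of how a decision tree might choose its first split. Everything in it is an assumption made for illustration: the 16 delivery records are invented, the three yes/no factor names are hypothetical, and the sample case does not say which algorithm or data were actually used. The idea is simply to score each candidate factor by how much it reduces uncertainty about whether a delivery is late.

```python
from collections import Counter
from math import log2

# Hypothetical yes/no factors; the names are illustrative, not from the case data.
FEATURES = (
    "long_time_at_center_c",
    "high_component_d_inventory",
    "northeast_branch",
)

# Invented records: (long_time_at_center_c, high_component_d_inventory,
# northeast_branch, late). The last field is the outcome we want to explain.
ROWS = (
    [(True, True, True, True)] * 4
    + [(True, True, False, True), (True, False, True, True),
       (True, False, False, True), (False, True, False, True)]
    + [(True, True, True, False), (False, True, False, False)]
    + [(False, False, True, False)] * 2
    + [(False, False, False, False)] * 4
)


def entropy(labels):
    """Shannon entropy (in bits) of a list of outcome labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(rows, feature_index):
    """Drop in outcome entropy after splitting rows on one yes/no factor."""
    base = entropy([row[-1] for row in rows])
    remainder = 0.0
    for value in (True, False):
        subset = [row[-1] for row in rows if row[feature_index] == value]
        if subset:  # skip empty branches
            remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Score every factor and rank them from most to least informative.
gains = {name: information_gain(ROWS, i) for i, name in enumerate(FEATURES)}
ranked = sorted(gains, key=gains.get, reverse=True)

for name in ranked:
    print(f"{name}: {gains[name]:.3f} bits")
```

In this made-up data, time on task at Manufacturing Center C ranks first, Component D inventory second, and region last, which mirrors the story above. A real decision tree library would go further, repeating the split within each branch and handling numeric measurements, but the ranking step is the part that points you toward a root cause worth asking people about.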

Taking the Right Approach to Data Science

My colleague’s sample case does a great job of demonstrating why you must always keep the business question of interest at the center of your data science efforts. The combination of good data science, deep human expertise and a focus on solving real problems is incredibly powerful.

It is the approach we teach throughout our Master’s in Data Science program at Elmhurst University.

About the Author

Jim Kulich is a professor in the Department of Computer Science and Information Systems at Elmhurst University. Jim directs Elmhurst’s master’s program in data science and teaches courses to graduate students who come to the program from a wide range of professional backgrounds.