Forecasting plays a pivotal role in most business decisions, and few would disagree that high-quality forecasting is a strong competitive advantage. Companies rightly invest significant time and money in projects aimed at improving the quality of their forecasts. The approaches vary (processes, tools, data, or algorithms), but the goal is singular: to produce the most reliable forecast possible.
The practice of forecasting is built upon a simple foundational principle: the quality of a forecast is measured by its accuracy or, inversely, by its error rate. That businesses need a reliable forecast is not up for debate; a forecast with a low error rate is undoubtedly of better quality than one with a significant error rate. But let’s explore a more subtle question: Do our businesses truly need a reliable forecast in terms of ‘accuracy’?
But is forecast accuracy as important to corporate success as many people think? It’s quite a provocative question, one that challenges a truth considered absolute in our field. Contemplating this question requires taking a step back, reevaluating our preconceived notions of what really drives value, and reassessing our daily practices.
In this article and a follow-up, we’ll share the rather surprising, and perhaps concerning, results of a supply chain study in the retail industry. The study analyzes a large dataset (more than 32,000 time series) to explore forecasting from the perspective of its added value and its economic contribution to the enterprise. It asks two key questions:
- Are ‘accuracy’ and ‘value-added’ as strongly correlated as we think?
- When should performance be deemed sufficient? When do further improvements to the forecast become irrelevant?
These two key questions are ones that no enterprise can afford to ignore. Let’s start with the first: Is better accuracy a guarantee of added value for the company? The answer is a simple no. On the contrary, as we will demonstrate, increased accuracy can even degrade decision-making and lead to financial losses.
Setting Up the Experiment
The M5 competition, organized in 2020 by the Makridakis Open Forecasting Center (MOFC), is a global forecasting competition. It focused on forecasting demand for a subset of products and stores at Walmart, that is, in a retail context. At the close of the competition, the organizers made public around 130 distinct sets of forecasts: the top 50 deterministic forecasts, the top 50 probabilistic forecasts, and approximately 30 benchmark forecasts based on classical approaches.
This abundance of forecasts enables a deep analysis of the link between accuracy and added value. However, the M5 competition suffers from a significant limitation for our study: it was designed as a pure forecasting competition, completely ignoring decision-making and impact evaluation. To conduct our study and explore business value, we had to address this gap by defining a decision-making process and enriching the associated data (packaging, supplier constraints, order frequency, cost structure, etc.). Our goal was a setup that closely resembles real-world business use cases, so we relied on third-party data sources to define the most credible context possible (margin rates by product family, realistic packaging, target service rates, etc.).
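To give a feel for what such enrichment might look like, here is a minimal sketch of a per-item context record in Python. The field names and types are our own illustration; the study’s actual schema is not public.

```python
from dataclasses import dataclass

@dataclass
class ItemContext:
    """Hypothetical per-item enrichment record (field names are ours;
    the study's actual schema is not public)."""
    series_id: str
    pack_size: int               # supplier packaging, in units
    margin_rate: float           # assumed per product family
    target_service_level: float  # e.g. 0.95
    order_frequency_days: int    # weekly review -> 7
    lead_time_days: int          # 3 in the study's setup
```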
Regarding the inventory policy, we established a weekly replenishment process with a 3-day lead time, following a classic “periodic review and dynamic order-up-to-levels” policy. Does it reflect any one company’s exact replenishment policy? Clearly not, but that is not a problem: what matters is that it reflects a credible and coherent policy.
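As a rough illustration of how such a policy works, here is a minimal Python sketch. The function names and the way safety stock enters the target are our simplifications, not the study’s exact rules.

```python
import math

def order_up_to_level(daily_forecast, review_days=7, lead_days=3,
                      safety_stock=0.0):
    """Target inventory position: expected demand over the review
    period plus the lead time, plus an optional safety stock."""
    return daily_forecast * (review_days + lead_days) + safety_stock

def order_quantity(inventory_position, target, pack_size=1):
    """Order enough whole supplier packs to reach the target level."""
    shortfall = max(0.0, target - inventory_position)
    return math.ceil(shortfall / pack_size) * pack_size

# Example: 4.4 units/day forecast, 20 units already on hand or on
# order, supplied in packs of 12 -> order 24 units (2 packs).
print(order_quantity(20, order_up_to_level(4.4), pack_size=12))
```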
The results of our study do not claim to be universal. However, they demonstrate that there is a gap between the accuracy of a forecast and its economic value. Everyone is encouraged to replicate this analysis in their own context to see whether the results confirm or conflict with their preconceived notions of forecast accuracy.
Once the procurement process was defined and the data enriched, we developed a simulation tool and applied it to each time series of each forecast set, following a 4-step process (a minimal sketch of this loop follows the list):
- Forecast ingestion
- Evaluation of ‘accuracy’ (using various metrics)
- Simulation of the procurement decision
- Evaluation of economic performance (especially in terms of gains/costs).
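To make the loop concrete, here is a compact, self-contained sketch of steps 2 through 4 for a single series, under strong simplifications (immediate delivery, lost sales, a flat unit margin and holding cost); all parameter values are illustrative, not the study’s actual cost structure.

```python
import math
import numpy as np

def mape(actual, forecast):
    """Step 2: mean absolute percentage error over non-zero actuals."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    nz = a != 0
    return float(np.mean(np.abs((a[nz] - f[nz]) / a[nz])))

def simulate_one_series(actual, forecast, pack_size=12, review_days=7,
                        lead_days=3, unit_margin=2.0, holding_cost=0.05):
    """Steps 3-4: replay a weekly order-up-to policy driven by the
    forecast against actual demand, then tally a simple profit (margin
    on sales minus a daily holding cost). Deliveries are assumed
    immediate and unmet demand is lost, to keep the sketch short."""
    on_hand, profit = 0.0, 0.0
    for day, demand in enumerate(actual):
        if day % review_days == 0:  # weekly review point
            coverage = forecast[day:day + review_days + lead_days]
            shortfall = max(0.0, float(np.sum(coverage)) - on_hand)
            on_hand += math.ceil(shortfall / pack_size) * pack_size
        sold = min(on_hand, demand)
        on_hand -= sold
        profit += sold * unit_margin - on_hand * holding_cost
    return profit

# Step 1 (ingestion) would wrap this in a loop over every series of
# every forecast set; here, one synthetic 4-week series:
rng = np.random.default_rng(0)
actual = rng.poisson(lam=4.4, size=28)
forecast = np.full(28, 4.4)
print(mape(actual, forecast), simulate_one_series(actual, forecast))
```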
Applied to the M5 competition dataset, this simulation provided us with a large and varied set of results, totaling more than 9.4 million distinct cases.
From Universal Assumptions to Surprising Findings
Before detailing the main results, let’s recall the nearly universally accepted axiom: “If the ‘accuracy’ of forecast A is better than that of forecast B, then forecast A will enable better decision-making and present an economic advantage.” To this, we can add a generally accepted limitation: “In some cases, a forecast, although more accurate than another, may not provide any additional added value.” We accept that the improvement might be too minor to have a real influence on replenishment decision-making. For example, reducing an error from 4.4 to 4.3 units might have little impact on the replenishment of an item supplied in packages of 12 units, as the short example below shows.
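Here is a back-of-the-envelope illustration of that limitation, assuming a 10-day coverage window and the pack rounding sketched earlier (both values are illustrative): a 0.1-unit-per-day shift in the forecast changes the required quantity by one unit but leaves the order unchanged.

```python
import math

def packs(units_needed, pack_size=12):
    """Round the required quantity up to whole supplier packs."""
    return math.ceil(units_needed / pack_size)

# Two forecasts differing by 0.1 units/day over a 10-day window
# trigger exactly the same order once pack rounding is applied:
print(packs(4.4 * 10))  # 44 units -> 4 packs
print(packs(4.3 * 10))  # 43 units -> 4 packs
```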
Comparing the forecasts on their accuracy (expressed here by MAPE) and their economic performance yielded three key findings:
- Finding #1: In 80% of cases, improving the forecast had no impact on the decision and thus none on economic performance. This proportion is far higher than expected, implying a negative return on investment (ROI) in 4 out of 5 cases.
- Finding #2: In 12.6% of cases, improving the forecast resulted in superior economic performance. This is our expected case. However, this proportion remains low, rewarding efforts to enhance the forecast in only 1 out of 8 cases.
- Finding #3: In 7.3% of cases, improving the forecast degraded economic performance. This outcome, initially considered impossible, occurred in 1 out of 3 cases in which the forecast improvement influenced the decision.
Figure 1 | Breakdown of economic performance when forecast accuracy improves
These results, evaluated using the MAPE metric, are similar for the other metrics studied (MAE, MSE, MSLE, RMSE, wMAPE). The observation is therefore not specific to MAPE but is associated with the notion of accuracy itself.
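For reference, these are the standard textbook formulations of those metrics (the study may have used variants); inputs are assumed to be NumPy arrays of actuals and forecasts.

```python
import numpy as np

def mae(a, f):
    """Mean absolute error, in demand units."""
    return np.mean(np.abs(a - f))

def mse(a, f):
    """Mean squared error; penalizes large misses more heavily."""
    return np.mean((a - f) ** 2)

def rmse(a, f):
    """Root mean squared error, back in demand units."""
    return np.sqrt(mse(a, f))

def msle(a, f):
    """Mean squared log error; dampens the weight of high-volume items."""
    return np.mean((np.log1p(a) - np.log1p(f)) ** 2)

def wmape(a, f):
    """Weighted MAPE: total absolute error divided by total demand."""
    return np.sum(np.abs(a - f)) / np.sum(a)
```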
Improving the accuracy of a forecast does not guarantee better economic performance. Yet this doesn’t mean we should stop improving our forecasts. Shifting the focus from the frequency of cases to the economic performance of the forecast shows that the value created by a better forecast (here $8,376) significantly surpasses the value lost (here $-3,251). The balance remains strongly positive, which is reassuring: improving the forecast does hold an economic advantage. However, the gain is significantly reduced (~28%) by the recorded underperformance, which leaves room for further improvements in forecasting practice.
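One arithmetic reading that reproduces the ~28% figure (our inference; the base of the percentage is not spelled out above) is that the $3,251 lost represents about 28% of the total value at stake: 3,251 / (8,376 + 3,251) ≈ 28%.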
Advocating for an Economic Approach
These conclusions do not claim to be universal. Transferring them from one context to another would be inappropriate. Indeed, a simple change in decision-making, cost structure, or constraints could produce radically different results.
However, the general conclusion is worrisome. The importance given to the accuracy of a forecast might not be as fundamental as assumed. The belief that improving the accuracy of a forecast is necessarily advantageous is a myth. In the business realm, we are therefore wrong to be so obsessed with accuracy. Perhaps we’ve become so focused on improving our forecasts that we’ve lost sight of the fact that we’re not in an accuracy competition. Our sole goal should be to generate value. And in business, value, although it can take various forms, is primarily economic.
The future of forecasting lies in better integration of decision-making and its impacts into our assessments. The challenge is commensurate with the opportunity it represents! In an upcoming article, we will explore how to improve both efficiency and performance in forecast generation, detailing an approach to target areas where effort expended has a tangible impact on economic performance while identifying those where investment would be economically irrational.
This article first appeared in the spring 2024 issue of the Journal of Business Forecasting. To get the Journal delivered to your door every quarter, become an IBF member. Member benefits include discounted entry to all IBF training events and conferences, access to the entire IBF knowledge library, and exclusive members-only workshops.