Forecasting is part of everyday life. We watch the weather channel before we make weekend plans and text our family our ETA when we leave work. Forecasting in a business context, however, does not happen naturally. It is the responsibility of the demand planning team and, as far as the rest of the company is concerned, how we arrive at these forecasts is a mystery. It doesn’t have to be this way.

For companies just starting out in demand forecasting, I would like to offer some very practical advice.

Forecasting anything is about determining mathematical dependencies between variables. The dependencies can be linear or nonlinear. Typically, having more data means you have a better chance at figuring out those dependencies and, as a result, developing a decent forecasting model.

This doesn’t mean that you need a neural network or any other advanced algorithm. In fact, a simpler model is often better.

Simple Models Mean Buy-in From Stakeholders

First of all, if you have a choice between a linear and a nonlinear model, choose the linear one even if it means losing a few percentage points of accuracy. The main reason is that a linear model (e.g. a regression) is easy to translate into a formula and can be easily understood by stakeholders.

If your team can’t explain the predictions, they have no value. There will be no user buy-in, no follow-up questions, no discussions. The wise consumer of a forecast is not a trusting bystander but a participant and, above all, a critic. It is almost impossible to offer any meaningful criticism of something that is difficult to understand.

As your team develops a forecasting model, its ultimate goal is to come up with something that stakeholders will digest and retain. The takeaways should be simple statements like “if X goes up 10%, our sales are likely to go up 5%”.
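To show how a linear model turns into a formula and a one-sentence takeaway, here is a minimal sketch. The driver and sales numbers are made up for illustration, and the helper names `fit_line` and `pct_change_in_sales` are hypothetical; nothing here comes from a real dataset.

```python
# Hypothetical, made-up observations: a driver X and the sales seen at each level.
driver = [10, 20, 30, 40, 50]
sales = [121, 139, 161, 179, 200]


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pct_change_in_sales(slope, intercept, x_now, x_pct_change):
    """Predicted % change in sales when X moves by x_pct_change % from x_now."""
    before = intercept + slope * x_now
    after = intercept + slope * x_now * (1 + x_pct_change / 100)
    return (after - before) / before * 100


slope, intercept = fit_line(driver, sales)

# The whole model fits on one line that a stakeholder can read.
print(f"sales = {intercept:.1f} + {slope:.2f} * X")

# Translate the formula into a plain-language statement at today's level of X.
effect = pct_change_in_sales(slope, intercept, x_now=50, x_pct_change=10)
print(f"If X goes up 10% from 50, sales are likely to go up about {effect:.1f}%")
```

With a straight-line model the percentage effect depends on the starting level of X, so it helps to quote it at the current level, as the last two lines do.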

Secondly, while having external drivers as components is a potential game-changer in the forecasting world, it is absolutely not required to get started. A solid foundational model can be built from drivers (features) that are rooted in the time series data itself. Examples of such features are the month number, the quarter number, and a rolling average over a number of previous periods.

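| Month | Price | Month Number | Quarter | 2-month rolling average |
|-------|-------|--------------|---------|-------------------------|
| Jan   | 100   | 1            | 1       |                         |
| Feb   | 90    | 2            | 1       |                         |
| Mar   | 150   | 3            | 1       | 95                      |
| Apr   | 120   | 4            | 2       | 120                     |
| May   | 110   | 5            | 2       | 135                     |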

Source data (price by month) and the three features engineered for the simplest forecasting model
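The three features in the table can be derived from the price series alone. The sketch below reproduces them in plain Python. It assumes the series starts in January with no missing months, and it reads the rolling average as the mean of the two preceding months, which is what the table values imply (for example, March: (100 + 90) / 2 = 95). The function name `build_features` is hypothetical.

```python
# Price by month, taken from the table above.
prices = [("Jan", 100), ("Feb", 90), ("Mar", 150), ("Apr", 120), ("May", 110)]


def build_features(rows, window=2):
    """Derive month number, quarter and a rolling average of prior prices."""
    features = []
    for i, (month, price) in enumerate(rows):
        month_number = i + 1  # assumes the series starts in January, no gaps
        quarter = (month_number - 1) // 3 + 1
        if i >= window:
            # Average of the `window` months before the current one.
            previous = [p for _, p in rows[i - window:i]]
            rolling_avg = sum(previous) / window
        else:
            rolling_avg = None  # not enough history yet
        features.append(
            {
                "month": month,
                "price": price,
                "month_number": month_number,
                "quarter": quarter,
                "rolling_avg": rolling_avg,
            }
        )
    return features


features = build_features(prices)
for row in features:
    print(row)
```

Using only prior months for the rolling average keeps the feature available at forecast time, since the current month's price is the thing being predicted.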

Having 100 external drivers (oil prices, labor market statistics, search word frequency, etc.) might look good on paper and result in higher accuracy, but the business stakeholders will likely be bewildered and close the deck, never to open it again. The optimal number of causal relationships is between three and five; this way stakeholders can actually remember what they are.

Aim to Be Directionally Correct, Not Perfectly Accurate

The third and final point is that being directionally correct is the most important forecast characteristic. It is a very intuitive one, but it is often overlooked in the data science world. To illustrate this, let’s evaluate three different forecasts, and we’ll use Root Mean Square Error (RMSE) – one of the most common forecast accuracy metrics – as a way to compare them.

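| Month | Actual Sales | Forecast 1 | Forecast 2 | Forecast 3 |
|-------|--------------|------------|------------|------------|
| Jan   | 100          | 80         | 80         | 80         |
| Feb   | 90           | 70         | 110        | 100        |
| RMSE  |              | 20         | 20         | 15.8       |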

Comparing three forecasts

In the world of data science, the lowest RMSE wins. So would we pick Forecast 3 as the best one in this case? Not so fast. Of the three forecasts, only Forecast 1 correctly indicates a downward trend from January to February. There are lots of use cases where being directionally correct is far more valuable than landing on the value that’s closest to the actual one. For example, in a Demand Review, knowledge of a downturn in the market is a powerful weapon to wield. For that reason, in this case we would choose Forecast 1 as the best one.
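The comparison can be checked with a few lines of Python. This sketch computes RMSE for each forecast from the table and adds a simple direction check. Treating "direction" as the sign of the January-to-February change within each series is an assumption made here for illustration, and the helper names `rmse` and `direction` are hypothetical.

```python
import math

# Actual sales and the three forecasts for Jan and Feb, from the table above.
actual = [100, 90]
forecasts = {
    "Forecast 1": [80, 70],
    "Forecast 2": [80, 110],
    "Forecast 3": [80, 100],
}


def rmse(actual_values, forecast_values):
    """Root Mean Square Error between two equally long series."""
    squared_errors = [(a - f) ** 2 for a, f in zip(actual_values, forecast_values)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def direction(series):
    """+1 if the series rises from first to last point, -1 if it falls, 0 if flat."""
    change = series[-1] - series[0]
    return (change > 0) - (change < 0)


results = {
    name: (rmse(actual, values), direction(values) == direction(actual))
    for name, values in forecasts.items()
}

for name, (error, same_direction) in results.items():
    print(f"{name}: RMSE = {error:.1f}, directionally correct = {same_direction}")
```

Forecast 3 has the lowest RMSE, but only Forecast 1 passes the direction check, which is the trade-off described above.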

To sum up: an actionable, practical forecasting model is not the one that uses the most external drivers, the most advanced mathematical algorithm, or even the best score on an accuracy metric. It is the one that has stakeholder buy-in, is explainable, and is directionally correct. That way, your team can be sure that it will drive meaningful discussions and result in actions that bring value to the business.