Is Regression/ Causal Modeling for Forecasting Underutilized?

Kevin Gray

As readers know, we basically have two ways of doing forecasting:

1. Extrapolating from historical trends – univariate forecasting (i.e., time series forecasting)

2. Including independent variables such as price that we believe influence movements in sales – causal modeling or regression modeling

Comparing the two approaches, the chief advantage of univariate forecasting is that it is simpler. When historical patterns have been very regular for a long time, the univariate approach may be good enough. But we will probably only be able to speculate about what is causing these historical patterns, and the data usually aren’t that easy on us: sudden jumps, declines or other breaks with the past aren’t unusual. Causal modeling can help us understand the key sales drivers, and a good causal model will do better at forecasting future periods.
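To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the prices, the noise, the rule that sales fall by 10 units per unit of price, and the planned price cut are all invented, and `fit_line` is a hypothetical helper, not part of any forecasting package. It fits a trend-only line and a price-based line to the same toy history, then forecasts a period in which the price is about to change.

```python
# Toy comparison of a trend-only forecast and a price-driven forecast.
# All numbers are invented for illustration.

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

periods = list(range(1, 13))
prices = [10, 10, 9, 10, 11, 10, 9, 10, 10, 11, 10, 9]
noise = [1, -2, 0, 2, -1, 1, -1, 0, 2, -2, 1, -1]
# Assumed data-generating process: sales fall by 10 units per unit of price.
sales = [200 - 10 * p + e for p, e in zip(prices, noise)]

# Univariate approach: extrapolate the historical trend to period 13.
trend_a, trend_b = fit_line(periods, sales)
trend_forecast = trend_a + trend_b * 13

# Causal approach: regress sales on price, then plug in the planned price.
planned_price = 8  # hypothetical price cut planned for period 13
price_a, price_b = fit_line(prices, sales)
causal_forecast = price_a + price_b * planned_price

expected_sales = 200 - 10 * planned_price  # what the assumed process implies
print(f"trend-only forecast:  {trend_forecast:.1f}")
print(f"price-based forecast: {causal_forecast:.1f}")
print(f"process implies:      {expected_sales}")
```

In this toy setup the trend-only forecast stays near the historical average because nothing in the sales history signals the price cut, while the price-based forecast moves with it. The comparison only favors the causal model because the future price is assumed to be known.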

Yet, according to the Institute of Business Forecasting & Planning (IBF) benchmarking studies, fewer than 20% of organizations use causal modeling for forecasting. Why is this? The surveys did not ask why causal modeling is or is not used, but my own experience suggests several reasons, including:

  • Causal data are not available or are spotty
  • The data are available but expensive or difficult to obtain in a regular or timely fashion
  • Determining future values of the causal variables to use is problematic.  The forecaster may not have finalized marketing plans, for instance.  Exogenous variables such as economic conditions are another example.
  • Lack of internal specialist modeling resources
  • A perception that causal modeling really doesn’t work any better than univariate or time series forecasting

We’d be interested in hearing your views on and experiences with causal modeling.  For example:

  • Does your organization use causal modeling for demand forecasts?
  • If so, for all SKUs or just a selected group of SKUs?
  • Are there any special challenges obtaining the data you need?
  • Are there modeling issues that have been problematic?

Any other thoughts or comments on this topic would also be welcome.

Kevin Gray
Cannon Gray LLC

16 Responses to Is Regression/ Causal Modeling for Forecasting Underutilized?

  1. The one time I actually used a regression model in business it was quite successful.

    The difficulty is getting a good consistent set of data.

    I was trying to model steel consumption in the UK economy and was able to get data for 48 consecutive quarters from the DTI (Department of Trade and Industry) for the 4 major consumers of steel in the UK (motors, engineering, construction and (this was some time ago) shipbuilding).

    Whatever assumptions you put into it the answer was a Black Hole which was duly provided by the redoubtable Mrs Thatcher. It was a technical triumph.

    Unfortunately it was ignored by the management because I had succeeded in thinking the unthinkable.

    However it told me that the problems of a business that totally depended on the UK steel industry (we made specialist cranes) were insoluble so I jumped ship to the IT industry in 1981. 2 years later they shut the gates.

    So I’m a big fan of regression – I’ve just never managed to get enough good data to do it again.


  2. It is completely underutilized.

    Using causal variables is certainly worth the effort. Even if you don’t have these at your disposal, using something as simple as a holiday dummy variable can make a big impact on the model and then the forecast.
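As a rough sketch of what a holiday dummy looks like in a regression, the Python below fits sales on a constant, a linear trend and a 0/1 December indicator. The data are synthetic and noise-free, the choice of December as the holiday month is an assumption, and `ols` is a small hypothetical helper written for this example rather than a library function.

```python
# Regression with a trend and a holiday dummy, on invented monthly data.

def ols(rows, y):
    """Least squares via the normal equations (adequate for tiny examples)."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for j in range(col, k):
                a[r][j] -= factor * a[col][j]
            c[r] -= factor * c[col]
    # Back substitution.
    coefs = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * coefs[j] for j in range(i + 1, k))
        coefs[i] = (c[i] - tail) / a[i][i]
    return coefs

months = list(range(24))                              # two years, month 0 = January
holiday = [1 if m % 12 == 11 else 0 for m in months]  # 1 in December (assumed)
# Assumed process: base 50, +2 per month, +30 in the holiday month.
sales = [50 + 2 * m + 30 * h for m, h in zip(months, holiday)]

rows = [[1.0, float(m), float(h)] for m, h in zip(months, holiday)]
intercept, trend, holiday_lift = ols(rows, sales)

# Forecast the following December (month index 35), where the dummy is 1.
forecast = intercept + trend * 35 + holiday_lift * 1
print(f"estimated holiday lift: {holiday_lift:.1f}")
print(f"forecast for month 35:  {forecast:.1f}")
```

Without the dummy, the December spikes would be absorbed into the trend and the residuals, so the model would tend to under-forecast the holiday month and over-forecast the others.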

    At the IBF Meeting this week in Orlando, a practitioner shared that they were using POS data as a leading indicator to forecast orders. They were using graphs as their method to gauge this. They could take the next step and use the POS data in a regression to help forecast orders.
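The "next step" described here might look something like the sketch below, which regresses this week's orders on last week's POS sales. The one-week lag, the weekly figures and the `fit_line` helper are all assumptions for illustration; in practice the lag would have to be estimated from the data.

```python
# Orders regressed on lagged POS sales (toy weekly data, lag of one week assumed).

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

pos = [100, 120, 110, 130, 150, 140, 160, 155]    # weekly POS units (invented)
orders = [95, 100, 116, 108, 124, 140, 132, 148]  # weekly orders (invented)

# Pair each week's orders with the previous week's POS.
lagged_pos = pos[:-1]
later_orders = orders[1:]
intercept, slope = fit_line(lagged_pos, later_orders)

# The latest POS figure is already known, so it can drive next week's forecast.
next_week_orders = intercept + slope * pos[-1]
print(f"orders = {intercept:.1f} + {slope:.2f} * last week's POS")
print(f"forecast for next week: {next_week_orders:.1f}")
```

The appeal of a leading indicator is that its value for the forecast period is already observed, which sidesteps the problem of having to forecast the causal variable itself.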

    Here’s an analogy. Using the historical data is like using the rear view mirror to drive. Using causal variables is like using the rear view mirror and front windshield to drive.

    By Tom Reilly, Vice President of Sales at Automatic Forecasting Systems

  3. Indeed completely underutilized. I have already applied the concept twice. Once with a gas supplier, where we set up a fully automatic scheme using 2 linear regression models. One model was based on seasonality, the other on the link between temperature and gas consumption, followed by an automatic selection of the better of the two. Both models work on Customer-SKU level.
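A scheme of this kind could be sketched as follows. This is not the commenter's actual implementation: the temperatures, the consumption rule, the holdout year and the use of mean absolute error as the selection criterion are all assumptions. The seasonal model is written as month-of-year averages, which is what a regression on twelve month dummies reduces to.

```python
# Two candidate models for gas consumption, with automatic selection on a holdout.

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Three years of invented monthly temperatures; the third year is 3 degrees colder.
base = [2, 3, 7, 11, 15, 18, 20, 19, 16, 11, 6, 3]
offsets = [1, -1, 0, 1, -1, 0, 1, -1, 0, 1, -1, 0]
temps = base + [t + o for t, o in zip(base, offsets)] + [t - 3 for t in base]
gas = [500 - 15 * t for t in temps]  # assumed consumption rule

train_temps, train_gas = temps[:24], gas[:24]
test_temps, test_gas = temps[24:], gas[24:]

# Model 1: seasonality only (average of the same month in the training years).
seasonal_pred = [(train_gas[m] + train_gas[m + 12]) / 2 for m in range(12)]

# Model 2: temperature regression (needs temperatures for the forecast months).
a, b = fit_line(train_temps, train_gas)
temperature_pred = [a + b * t for t in test_temps]

errors = {
    "seasonal": mae(test_gas, seasonal_pred),
    "temperature": mae(test_gas, temperature_pred),
}
best = min(errors, key=errors.get)
print(errors, "->", best)
```

The same selection step could be repeated for each customer-SKU series. In a real setting the temperature model would have to run on forecast temperatures rather than actuals, which narrows its advantage.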

    The second implementation was for decision support on stock control and purchase requirements, also on SKU level.

    The only special issue is that the data set and/or results need to be screened in order to avoid nonsense forecasts. Other than that, in my experience, the technique is easy for users to understand and highly communicative as well as robust.

  4. I agree that causal modeling is underutilized.

    It is a very powerful tool that, if used properly, can add a lot of insight into market drivers as well as increase forecast accuracy. In my experience one of the biggest barriers to more widespread use is the lack of readily available data that can be easily collected each month. It is easy to do a one-off analysis with a static data snapshot, but the real power is in using the model continuously each month (or week).

    POS data is a great candidate for causal modeling! However converting to orders (at least in the CPG world) is difficult given the fun world of trade inventory and POS coverage factors.

    Leo MacDonald

  5. I agree re causal modeling being underutilized for a large number of applications including demand forecasting. And one of the most overlooked sources of data is perceptual surveys.

    If we believe that price fluctuation in the marketplace is a causal factor, which is the more powerful predictor: actual price fluctuation or customers’ “perception of price fluctuation”? If customers (e.g., purchasing agents) perceive that the price is x or is going to x then it doesn’t matter what the actual price is, does it? The customer may affect demand by delaying a purchase decision waiting for the price to reach his/her expectation.

    The same goes for shortage of supply: demand may reflect a perception of shortage rather than the facts and circumstances.

    Unfortunately, most companies don’t have the resources for sophisticated modelling and/or once they do establish a model, it is not “updated” to fit changing circumstances.

    A dated causal model may be no more accurate than the extrapolation from historical records.

  6. It is hard to say if regression is underutilized. To the casual observer, it may appear to be not in use.

    Unless you are an insider within these companies, you will never really know for sure if these companies are utilizing good econometric modeling techniques.

    That said, if these companies are not utilizing strong analytical talent to help derive their pricing/utilization curves, they should be shopping for economists/econometricians capable of helping them maximize utilization, revenue and profits.

  7. Thanks to all for your thoughtful comments. To follow up on Tom’s post, I’ve been in situations where, say, only 1-2 causal variables were readily available and the decision was made to abandon any kind of statistical forecasting.

    Humans are prone to binary thinking and there is sometimes the notion that if statistical forecasts aren’t “perfect science” they are no better than gut feel.


  8. I am not sure if it would be right to say that causal models are ‘better’ than simple time series. Given the difficulty in getting the data and the inherent assumptions in regression, a causal model may be equally weak in forecasting.

    The key is to use the right method for the right purpose. Causal models work best for aggregate planning and long term business mapping. Time series methods are great for day to day short term forecasting of individual product mix.

  9. Our recent discussions with revenue managers who need sales forecasts to manage pricing and capacity release of airline tickets, hotel rooms and other ‘perishable’ goods suggest that:
    – patterns of consumer buying behaviour have changed post-Lehman in response to the credit crunch/recession, increased use of the internet to seek out the best deals and the plethora of continuously changing deals and discounts on offer;
    – forecasts based on historical patterns in time series have, not surprisingly, become less accurate and hence less useful to the business.

    There are several possible responses to this, one being to switch to more sophisticated forecasting methods such as causal/regression modelling as is suggested; so I would expect interest in these techniques to increase. There are other possible responses, for example making the business less dependent on accurate sales forecasts; better understanding the drivers of consumer behaviour and how suppliers’ own actions may be exacerbating uncertainty in the market; increased use of auctions to probe the market. I’m sure there are many other possible responses to forecast uncertainty and would be interested to hear of any practical experiences.

    By Ian Rowley

  10. In the more micro world of healthcare where one wants to predict who will be high cost patients, trend analysis doesn’t work because of regression to the mean. High cost this period tells you nothing about next period. So, things such as Markov models are more likely to be helpful as are data mining algorithms that can detect non-monotonic patterns. So, in addition to cases such as Ian mentions where time series (trends) may have worked at one time but no longer do, we have to add cases where trends never provided an adequate predictive analysis.
    By Sam Kaplan

  11. In my 30+ years of experience in forecasting, I would have to agree that regression/causal models are way underutilized. My other observation is that such models are not just a little better, they are dramatically better. I have seen reductions of forecast error as high as 50% when moving to causal models. Later in my career, my standard approach was to start with a causal model and add time series components later. Unavailability of causal data is not an acceptable excuse for not using causal models. There is a plethora of causal data on demographics, weather, and the economy that are available for free or for a very reasonable price. Buy it. It’s worth it. Often internal causal variables like price are not available in a form that is amenable to forecasting, especially automatic forecasting. This is not an insurmountable problem. Demonstrate to your management how much better your forecast will be if you have this information in the correct form and then get together with your IT people to make it happen. If we sit back and wait until the world comes to us, it will never happen. We need to state what we want and fight to get it. In so doing, we not only improve our forecast, but become an integral part of managing the business.

  12. Further to Fred’s comment about data availability, when companies begin data mining the data are seldom all in one place and ready to go. They usually are scattered in various databases in different parts of the company or exist externally. So there is an initial investment necessary, but that’s business.


  13. I’ve been in many great organisations that rely on causal modelling to help them figure out strategies for the future. Missing data is rarely a good excuse and in my experience if you dig, you’ll generally find what you (think) you need.

    Causal forecasting is especially useful for forecasting consumer demand for products and services. You can automate the collection of data quite quickly these days and it’s always interesting to see the way people modify their decision process once they have better data.

    My company is about to release a forecasting package to complement our modelling package and this will feature a simple comparison between time series and econometric methods so that the expert can compare the results. Many people have asked us for this feature and it will be interesting to see the results!

  14. I’m lucky in that I’m in a business that does have good access to causal data that is measured relatively consistently across companies.

    My firm belief is that both causal and univariate models need to be compared and contrasted against each other when developing a statistical forecast. As per Nassim Nicholas Taleb, there is the risk when using causal analysis that you don’t really have data on all of the causal factors (was it weather, competitor availability, promotions, price, the economy, footfall, distribution, market share, etc.?).

    The risk with univariate is you don’t always strip out underlying causes (e.g. distribution gains / losses etc.)

    The next step in our one number forecasting process will be to incorporate both in a statistical model as well as having the option for account managers to over-write these calcs with their latest ROS/ Distribution and promotional plans.

  15. Since univariate modeling is most often supplemented by event adjustments for short-term forecasting and planning, regression modeling represents an excellent opportunity to get more reliable empirical estimates of events and factors used for these adjustments. This would include promotional lift, price elasticities, new product introduction effects, competitive response, market conditions, economic conditions, etc.
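As one hedged illustration of how such estimates can be obtained, the sketch below fits a log-log regression of units sold on price plus a promotion flag. In that specification the price coefficient reads as an elasticity and the exponentiated promotion coefficient as a multiplicative lift. The prices, promotion pattern, the "true" elasticity of -1.5 and the `ols` helper are all invented for the example.

```python
# Price elasticity and promotional lift from a log-log regression (toy data).
import math

def ols(rows, y):
    """Least squares via the normal equations (adequate for tiny examples)."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for j in range(col, k):
                a[r][j] -= factor * a[col][j]
            c[r] -= factor * c[col]
    # Back substitution.
    coefs = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * coefs[j] for j in range(i + 1, k))
        coefs[i] = (c[i] - tail) / a[i][i]
    return coefs

prices = [4.0, 4.5, 5.0, 4.0, 3.5, 5.0, 4.5, 4.0, 3.5, 5.0]
promo = [0, 0, 0, 1, 1, 0, 0, 1, 0, 1]  # 1 = promotion running that period
# Assumed process: elasticity of -1.5 and a promotion coefficient of 0.4.
units = [math.exp(6.0 - 1.5 * math.log(p) + 0.4 * d) for p, d in zip(prices, promo)]

# Model: log(units) = const + elasticity * log(price) + promo_coef * promo
rows = [[1.0, math.log(p), float(d)] for p, d in zip(prices, promo)]
log_units = [math.log(u) for u in units]
const, elasticity, promo_coef = ols(rows, log_units)

promo_lift = math.exp(promo_coef) - 1  # proportional uplift when promo = 1
print(f"price elasticity:  {elasticity:.2f}")
print(f"promotional lift:  {promo_lift:.1%}")
```

Estimates like these can then replace judgmental event adjustments layered on top of a univariate forecast. Real data would add noise, so the coefficients would come with standard errors that this sketch does not compute.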

    Of course, regression models are much more important in long-term forecasting and planning where the “ceteris paribus” assumptions implicit in time series models do not hold. This would encompass annual budgeting, strategic planning, long-term business planning, and market development planning.

    Many (perhaps most) companies would have benefited substantially in the last couple of years from regression modeling given the effects of the recession on their businesses and the dramatic departure from patterns of behavior that were “baked into” the demand data in non-recessionary conditions. The relatively higher operating costs and relatively higher working capital investments incurred due to the assumptions of unadjusted time series models have been difficult and costly for many companies.

    Cause and effect modeling is certainly under-utilized in forecasting and planning, and can benefit both short-term and long-term forecasting and planning activities. It is more complex; and it requires more business knowledge, care and thought than time series modeling. But it can materially improve forecasts, plans, and business decisions when judiciously developed and applied.

    Mark Lawless
