CPG Data Insights covers this question well in one of their articles.

The two most important factors in defining your category are how consumers define it and how retailers define it.

Consumers view categories on a needs basis, i.e., what need does the consumption of this product satisfy? While some products have very few substitutes, others, like popcorn, could easily be substituted with other salty snacks or other relatively healthy snack foods.

From a retailer's perspective, in-store location matters. Products that consumers and manufacturers may view as substitutes for each other can be physically separated in the store. CPG Data Insights uses the example of fresh vs. frozen orange juice, noting that retailers usually manage these two categories separately.

Two other points:

How does your management define your category? Are there certain internal reporting or data structure requirements for certain categories?

How do IRI/Nielsen define your category? For the sake of this blog series, this is an absolutely necessary question to answer. Nielsen's classification hierarchy, for example, is Department / Super Category / Category / Sub Category / Segment. Knowing this breakdown for your products is the foundation of everything to follow.
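To make that hierarchy concrete, here is a minimal sketch of how a product's Nielsen classification might be represented in code. The five level names come from the article; the class name and the example values are hypothetical, not real Nielsen labels.

```python
# Sketch: representing the Nielsen classification hierarchy for a product.
# Level names follow the article; the example values are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class NielsenClassification:
    department: str
    super_category: str
    category: str
    sub_category: str
    segment: str

    def path(self) -> str:
        """Render the full hierarchy as a single breadcrumb string."""
        return " / ".join(
            [self.department, self.super_category, self.category,
             self.sub_category, self.segment]
        )


popcorn = NielsenClassification(
    department="Grocery",
    super_category="Snacks",
    category="Popcorn",
    sub_category="Ready-to-Eat Popcorn",
    segment="Kettle Corn",
)
print(popcorn.path())
```

Knowing where each of your UPCs sits in a structure like this is what lets you pull the right competitive set later on.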

There are two primary options for getting syndicated data: IRI and Nielsen. Getting data from either of these two companies ensures that it is accurate and as comprehensive as possible. We prefer Nielsen because of their Connected Partner Program, which allows partners like Fiddlehead to access and analyze the data directly.

Specifically, we look at both the historical observed values, and Nielsen's baseline for the following data points:

- Historical sales dollars, equalized sales volume, and average price per equalized sales volume

- Historical promotional activity related to feature ads, in-store displays, and temporary price reductions

Just with the data that meet these criteria, you'll be looking at about 30 historical data series per UPC. Generating forecasts at the lowest aggregation possible with the syndicated data that Nielsen publishes means that the forecasts are specific to a given UPC, the Nielsen-defined Syndicated Major Market (which roughly translates to "cities approximately the size of Boston"), and the Nielsen Category.
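One hypothetical accounting that lands at roughly 30 series per UPC is sketched below. The measure and condition names are illustrative only, not Nielsen's actual syndicated labels; the point is how quickly measures, promo conditions, and observed-vs.-baseline variants multiply.

```python
# Sketch: enumerating the historical data series tracked per UPC.
# All names here are illustrative, not Nielsen's exact labels.
# 3 measures x 5 promo conditions x 2 variants (observed vs. baseline)
# = 30 series, matching the rough count in the text.
measures = ["sales_dollars", "eq_sales_volume", "avg_price_per_eq_volume"]
conditions = ["total", "any_promo", "feature_ad", "in_store_display", "tpr"]
variants = ["observed", "baseline"]

series_names = [
    f"{m}__{c}__{v}"
    for m in measures
    for c in conditions
    for v in variants
]
print(len(series_names))  # 30
```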

The data used relates both to the product being forecast and to all of the competing products within the respective segment. Depending on the category and market being modelled, anywhere from 10,000 to 20,000 input variables can be expected for each point forecast, representing the historical observations and additional relevant information. This amount of input data creates quite a few challenges and requires rethinking the modelling strategy.

The problem is inherently multi-dimensional. The amount of product sold at time t+1 is not only a function of the volume sold up to time t; it's also dependent on the promotions that the manufacturer runs, promotions that the retailer runs, how many competing products are present in a given market, what their promotional strategies are, etc. These factors all contribute to the overall demand for a product, and the relationships between each of these factors may vary over time or with respect to each other.
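The dependence structure described above can be written down as a function signature. This is purely illustrative, a placeholder rather than a fitted model; every argument name is ours, not from any real API.

```python
# Sketch: the inputs that jointly determine demand at time t+1.
# This is a signature, not a model -- all names are illustrative.
from typing import Sequence


def demand_forecast(
    sales_history: Sequence[float],       # volume sold up to time t
    manufacturer_promos: Sequence[bool],  # e.g. feature ads you funded
    retailer_promos: Sequence[bool],      # e.g. TPRs the retailer ran
    competitor_count: int,                # competing UPCs in the market
    competitor_promos: Sequence[bool],    # competitors' promo calendar
) -> float:
    """Placeholder: a real model must learn how these interact over time."""
    raise NotImplementedError
```

A univariate model sees only the first argument; everything else is information it has to live without.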

Do you see the disconnect? In-store promotional tactics need to be taken into account when generating these forecasts. Price reductions, in-store displays, bonus offers, and flyer ads are all specifically designed to increase the demand for a product over what it would have been.

Univariate models don't allow modelling this directly. As a result, the data needs to be adjusted in order to compensate and find the "true" or "baseline" demand. This usually relies on a lot of assumptions about the products being sold, promotional strategies, or competition, and in our experience is prone to bias and model misspecification.
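Here is a minimal sketch of the kind of adjustment the univariate route forces on you. It assumes, naively, that the median of non-promoted weeks is a fair stand-in for baseline demand; the sales figures are made up. That baked-in assumption is exactly the sort of thing that invites the bias mentioned above.

```python
# Sketch: "de-promoting" a sales series before fitting a univariate model.
# Assumption baked in: non-promo weeks represent true demand, and promo
# weeks can simply be overwritten with their median. Real promotional
# response is messier, which is why this approach is fragile.
import statistics

weekly_units = [100, 104, 98, 260, 101, 99, 240, 103]   # illustrative data
on_promo =     [False, False, False, True, False, False, True, False]

non_promo = [u for u, p in zip(weekly_units, on_promo) if not p]
baseline_guess = statistics.median(non_promo)

adjusted = [baseline_guess if p else u
            for u, p in zip(weekly_units, on_promo)]
print(adjusted)
```

Every choice here (median vs. mean, which weeks count as clean, ignoring competitor activity) is an untested assumption that propagates straight into the forecast.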

A better way to deal with this is to directly include the promotional information in your forecasting model, alongside the raw demand – promotions and all. This will require a causal modelling approach, rather than the autoregressive or smoothing approaches that are so common. The classical statistical model for this kind of problem would be a linear regression.
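To make that concrete, here is a minimal sketch of the causal framing with promo indicators included directly as regressors. All of the data is simulated and the lift values are made up; a real model would include many more lags plus competitor features.

```python
# Sketch: a linear regression that includes promo flags as regressors,
# instead of pre-adjusting the sales series. All data is simulated.
import numpy as np

rng = np.random.default_rng(0)
n_weeks = 104
display = rng.integers(0, 2, n_weeks).astype(float)  # in-store display?
tpr = rng.integers(0, 2, n_weeks).astype(float)      # temporary price cut?
# True process: baseline 100, display adds ~40 units, TPR adds ~60.
units = 100 + 40 * display + 60 * tpr + rng.normal(0, 5, n_weeks)

# Ordinary least squares: units ~ intercept + display + tpr
X = np.column_stack([np.ones(n_weeks), display, tpr])
intercept, display_lift, tpr_lift = np.linalg.lstsq(X, units, rcond=None)[0]
```

The coefficient on each promo flag is an estimate of its lift, which is exactly the information the baseline-adjustment route throws away.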

With a linear regression model, you can specify one term for each input variable, and determine how each impacts the baseline sales. This approach can be effective in cases where the inputs are independent, but it has many drawbacks when working with time series data. This is especially true for multivariate time series data.

1. One full set of inputs needs to be specified for each historical lag included. If there are 30 input features and only 52 weeks of history, then that's already 1560 inputs! It only gets worse with longer history horizons and more input variables (not to mention information about your competition within a given market).

2. The assumed independent, linear relationship between the inputs and outputs is rarely true in practice. A display might produce lift of 100 cases, but that same display might be twice as effective when paired with a 10% discount. It might be 10x as effective if the discount is increased to 20%, or more. This kind of relationship can't be modelled accurately with a classic linear model.

3. Capturing non-linear interactions requires defining special interaction terms, and the number of terms grows exponentially with the number of features and the number of lags. In our example of 30 input features and 52 weeks, if every subset of the 1,560 lagged inputs can interact, we would end up with a model that has about 2^1560, or roughly 10^470, terms. For comparison, the number of atoms in the observable universe is estimated to be about 10^80, and remember that increasing that exponent by 1 means a 10-fold increase in the overall value.
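The arithmetic behind that comparison can be checked in a couple of lines, assuming, as a worst case, that every subset of the lagged inputs is a candidate interaction term:

```python
# Sketch: the combinatorial explosion of interaction terms, assuming
# every subset of the lagged inputs can interact (the worst case).
import math

n_features = 30
n_lags = 52
n_inputs = n_features * n_lags            # 1,560 lagged inputs
n_terms_log10 = n_inputs * math.log10(2)  # log10 of 2**n_inputs subsets

print(n_inputs)              # 1560
print(round(n_terms_log10))  # ~470 -- vs. ~80 for atoms in the universe
```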

The solution to these problems is to be smarter about model specification. That's where deep learning comes in. In part 3 of this blog series, we will get into how deep learning can be applied to find non-linear relationships and make highly accurate predictions from syndicated data sets.

All of these questions and more need to be addressed in order to generate usable predictions, and that takes both time and technical know-how. Management may not want to invest the resources into solving that problem right now, if what you have at the moment is good enough.

If the data prep and modelling costs are going to be too great, then you'll need to think about what you need from your forecasts. Does the business prefer to under-forecast a promotional spike and lose out on potential sales, or over-forecast and end up with high waste and storage costs? How much forecast error can your supply chain tolerate? Does your sales team tend to react to competitive promotions, or can you get by with just doing your own thing? Each of these questions represents a trade-off that will help your business make a call on whether investing in better modelling is worth the associated costs.

Alternatively, cost-effective solutions do exist from companies like Fiddlehead, whose entire focus and expertise is providing elegant predictive solutions to FMCG companies.

In part 2 of this series, we'll get into how different functional areas, such as Insights Groups, Trade Promotion/Marketing, Revenue Management, Sales, and Brand/Category Management, benefit from better competitive forecasts, and how they can use them to their advantage.
