Time Series Seasonal Decomposition

Using Seasonal Decomposition to Inform the SARIMA Model Selection of Soybean Prices in Python.

Andrea Yoss
The Startup

There are a variety of approaches you can use when working with time series data, such as linear models, ARIMA models, exponential smoothing methods, and recurrent neural networks (RNNs). In this post, I will focus on an extension of the ARIMA model that also accounts for seasonality in the data: the SARIMA model.

Brief Overview of SARIMA Model

A SARIMA model consists of four pieces:

  • Seasonal (S): Accounts for seasonality that occurs over a fixed time period.
  • Autoregressive (AR): Accounts for persistence in the data, such as long-term trends, by regressing future values on past values.
  • Integrated (I): Differences the data to make it stationary, which the rest of the model assumes.
  • Moving Average (MA): Accounts for any sudden shocks in the data by regressing future values on past errors.

In order to discern which pieces to include, it is useful to look at the seasonal decomposition of the data.

Seasonal decomposition allows you to break (or “decompose”) time series data into its seasonal, trend, and residual components. By analyzing these components, we can identify which pieces to include in our SARIMA model.
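
To show roughly what an additive decomposition does, here is a hand-rolled sketch using only NumPy and pandas. It assumes an additive model (observed = trend + seasonal + residual), a made-up daily series with a 7-day cycle, and a simple centered moving average for the trend. It is a simplified illustration, not the exact routine any library uses.

```python
import numpy as np
import pandas as pd

# Made-up daily series: linear trend + 7-day cycle + noise.
period = 7
n = 20 * period
t = np.arange(n)
rng = np.random.default_rng(1)
seasonal_true = np.sin(2 * np.pi * t / period)
y = pd.Series(0.1 * t + seasonal_true + rng.normal(scale=0.2, size=n))

# Trend: centered moving average over one full cycle.
# The first and last (period - 1) / 2 values are NaN.
trend = y.rolling(window=period, center=True).mean()

# Seasonal: average the detrended values at each position in the cycle,
# then center the pattern so it sums to zero over one cycle.
detrended = y - trend
seasonal_means = detrended.groupby(t % period).mean()
seasonal_means = seasonal_means - seasonal_means.mean()
seasonal = pd.Series(seasonal_means.to_numpy()[t % period], index=y.index)

# Residual: whatever the trend and seasonal pattern do not explain.
resid = y - trend - seasonal

print(pd.DataFrame({"trend": trend, "seasonal": seasonal, "resid": resid}).dropna().head())
```

An odd cycle length is used here so that the moving-average window is exactly centered; an even cycle length needs a slightly different weighting at the two ends of the window.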

Illustrative Example: Daily Soybean Prices, 1990–2019

Soybean prices are notoriously volatile, as they are based not only on prior prices and crop seasonality, but also on external macroeconomic and supply/demand factors. For example, some external factors that may contribute to the volatility shown in Figure 1 include, but are not limited to:

  • global trade policies
  • currency strength
  • weather
  • regulation changes
  • increased demand for meat/livestock

Figure 1: Average Daily Price of Soybeans, 1990–2019

To understand how much of these price fluctuations can be explained by a long-term trend (like inflation, for example) or crop seasonality, and how much is due to these external, “random” factors, we should look at the seasonal decomposition of the price series. Using the seasonal_decompose() function from the Python statsmodels library, we can break the time series data into its components.

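# Import
from statsmodels.tsa.seasonal import seasonal_decompose
# `data` is assumed to be a pandas DataFrame of daily prices with a
# 'Settlement Price' column (settlement price = average daily price).
# Decompose the series into trend, seasonal, and residual components,
# treating 360 observations as one seasonal (yearly) cycle.
decomp = seasonal_decompose(data['Settlement Price'], period=360)
# Plot the decomposed time series to interpret.
# (The trailing semicolon suppresses the duplicate figure output in a notebook.)
decomp.plot();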

Figure 2 shows the seasonal decomposition of the average daily price of soybeans, or the “settlement price.”

Figure 2: Seasonal Decomposition of Soybean Prices

Let’s interpret each component of the decomposition:

Trend Component

Figure 2a: Seasonal Decomposition: Trend

Based on the plot in Figure 2a, there appears to be an overall upward trend in the data. Even without knowing the specific long-term factors driving the price, we can still capture this in our model by including an autoregressive (AR) piece.

Seasonal Component

Figure 2b: Seasonal Decomposition: Seasonality

From Figure 2b, we can see that there is definite seasonality in the data, with the seasonal component swinging by roughly 0.5 (in the units of the settlement price) over the course of a year. This makes sense intuitively, as the supply of soybeans changes depending on where the crop is in its seasonal production cycle of planting, growing, and harvesting. Because of this, we would want to include a seasonal (S) piece in our model.

Residual Component

Figure 2c: Seasonal Decomposition: Residual

From Figure 2c, we can see that the residuals appear to be random. Because prices are susceptible to random “shocks,” our model should benefit from the inclusion of a moving average (MA) piece to account for these shocks.

This illustrative example is based on a project I recently completed to predict daily soybean prices using time series analysis. To view the project in its entirety, please follow this link.
