Constructing the Forecasting Model
Baseline model
First, you will create a naive baseline model to use as a reference. Given a seasonal periodicity, this model simply predicts the value observed one season earlier.
For instance, if seasonal_periodicity = 24 hours, it will return the value from "present – 24 hours".
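As a rough sketch, here is one way such a baseline could be built with Sktime's NaiveForecaster (the seasonal periodicity and variable names are illustrative assumptions, not the exact project code):

```python
from sktime.forecasting.naive import NaiveForecaster

# Seasonal naive baseline: for each step in the horizon, predict the value
# observed exactly one season (here, 24 hours) earlier.
seasonal_periodicity = 24  # assumed hourly data with daily seasonality
baseline = NaiveForecaster(strategy="last", sp=seasonal_periodicity)

# baseline.fit(y_train)
# y_pred = baseline.predict(fh=list(range(1, 25)))  # forecast the next 24 hours
```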
Using a baseline is a healthy practice because it gives you something simple to compare your fancy ML model against. If your fancy model cannot beat the baseline, it is useless.
Fancy ML model
We will build the model using Sktime and LightGBM.
Check out the Sktime documentation [3] and the LightGBM documentation [4].
If you are into time series, check out this Forecasting with Sktime tutorial [6]. If you only want to understand the system's big picture, you can move on.
LightGBM will be your regressor: it learns patterns within the data and forecasts future values.
Using the WindowSummarizer class from Sktime, you can quickly compute lags and the mean & standard deviation over various windows.
For instance, for the lag, we provide a default value of list(range(1, 72 + 1)), which translates to "compute the lag for the last 72 hours".
Also, as an example for the mean, we have the default value of [[1, 24], [1, 48], [1, 72]]. Here, [1, 24] translates to a lag of 1 and a window size of 24, meaning it will compute the mean over the last 24 hours. Thus, in the end, for [[1, 24], [1, 48], [1, 72]], you will have the mean over the last 24, 48, and 72 hours.
The same principle applies to the standard deviation values. Check out this doc to learn more [2].
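Here is a minimal sketch of how these window features could be configured (the target column name is a hypothetical placeholder, and the exact values come from the configuration, not from this snippet):

```python
from sktime.transformations.series.summarize import WindowSummarizer

# Assumed configuration: lags for the last 72 hours, plus rolling mean and
# standard deviation over 24-, 48-, and 72-hour windows.
lag_feature = {
    "lag": list(range(1, 72 + 1)),
    "mean": [[1, 24], [1, 48], [1, 72]],
    "std": [[1, 24], [1, 48], [1, 72]],
}
window_summarizer = WindowSummarizer(
    lag_feature=lag_feature,
    target_cols=["energy_consumption"],  # hypothetical target column name
)
```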
You wrap the LightGBM model using the make_reduction() function from Sktime. By doing so, you can easily attach the WindowSummarizer you initialized earlier. Also, by specifying strategy = "recursive", you can easily forecast multiple values into the future using a recursive paradigm. For instance, if you want to predict 3 hours into the future, the model will first forecast the value for T + 1. Afterward, it will use the value it forecasted at T + 1 as input to forecast the value at T + 2, and so on…
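A sketch of how that wrapping could look, assuming the window_summarizer from above and illustrative LightGBM hyperparameters:

```python
from lightgbm import LGBMRegressor
from sktime.forecasting.compose import make_reduction

# Example hyperparameters; in the project they come from the given configuration.
regressor = LGBMRegressor(n_estimators=1000, learning_rate=0.15)

forecaster = make_reduction(
    regressor,
    transformers=[window_summarizer],  # feature engineering defined above
    strategy="recursive",              # feed predictions back in to forecast multiple steps
    pooling="global",                  # assumed setting so the transformers see the whole series
    window_length=None,                # the WindowSummarizer handles the windowing
)
```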
Finally, we will build the ForecastingPipeline, to which we will attach two transformers:
- transformers.AttachAreaConsumerType(): a custom transformer that takes the area and consumer type from the index and adds them as exogenous variables. We will show you how we defined it.
- DateTimeFeatures(): a transformer from Sktime that computes different datetime-related exogenous features. In our case, we used only the day of the week and the hour of the day as additional features.
Note that these transformers are similar to those from Sklearn, as Sktime kept the same interface and design. Using transformers is a critical step in designing modular models. To learn more about Sklearn transformers and pipelines, check out my article on How to Quickly Design Advanced Sklearn Pipelines.
Finally, we initialized the hyperparameters of the pipeline and model with the given configuration.
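A sketch of how the pipeline could be assembled, assuming the forecaster built above and the two transformers described in the list (the step names, DateTimeFeatures selection, and the config-based set_params call are assumptions based on the description, not the exact project code):

```python
from sktime.forecasting.compose import ForecastingPipeline
from sktime.transformations.series.date import DateTimeFeatures

# `transformers` is the project's custom module; AttachAreaConsumerType is sketched below.
pipe = ForecastingPipeline(
    steps=[
        ("attach_area_and_consumer_type", transformers.AttachAreaConsumerType()),
        ("datetime_features", DateTimeFeatures(manual_selection=["day_of_week", "hour_of_day"])),
        ("forecaster", forecaster),  # the reduced LightGBM model from above
    ]
)

# pipe = pipe.set_params(**config)  # hypothetical: override hyperparameters from the given configuration
```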
The AttachAreaConsumerType transformer is quite easy to understand. We implemented it as an example to show what is possible.
Long story short, it just copies the values from the index into their own columns.
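As a minimal sketch of what such a custom Sktime transformer might look like (the index level names, column names, and tag selection are assumptions for illustration):

```python
import pandas as pd
from sktime.transformations.base import BaseTransformer


class AttachAreaConsumerType(BaseTransformer):
    """Copies the 'area' and 'consumer_type' index levels into regular columns."""

    _tags = {
        "scitype:transform-input": "Series",
        "scitype:transform-output": "Series",
        "X_inner_mtype": "pd.DataFrame",
        "fit_is_empty": True,  # nothing to learn; the transform is stateless
    }

    def _transform(self, X: pd.DataFrame, y=None) -> pd.DataFrame:
        X = X.copy()
        # Assumed multi-index levels: (area, consumer_type, datetime_utc).
        X["area_exog"] = X.index.get_level_values("area")
        X["consumer_type_exog"] = X.index.get_level_values("consumer_type")
        return X
```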
IMPORTANT OBSERVATION — DESIGN DECISION
As you can see, all the feature engineering steps are integrated into the forecasting pipeline object.
You might ask: "But why? By doing so, don't we keep the feature engineering logic in the training pipeline?"
Well, yes… and no…
We indeed defined the forecasting pipeline in the training script, but the key idea is that we will save the whole forecasting pipeline to the model registry.
Thus, when we load the model, we will also load all the preprocessing and postprocessing steps included in the forecasting pipeline.
This means all the feature engineering is encapsulated in the forecasting pipeline, and we can safely treat it as a black box.
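For instance, the whole pipeline could be serialized and later restored in a single step (sketched here with joblib; the actual upload to the model registry is project-specific):

```python
import joblib

# Persist the entire forecasting pipeline: model + all feature engineering steps.
joblib.dump(pipe, "forecasting_pipeline.pkl")

# Later, e.g. in the batch prediction pipeline, loading it restores everything at once.
loaded_pipe = joblib.load("forecasting_pipeline.pkl")
# y_pred = loaded_pipe.predict(fh=fh, X=X_future)
```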
This is one way to store the transformation + the raw data in the feature store, as discussed in Lesson 1.
We could also have stored the transformation functions independently in the feature store, but composing a single pipeline object is cleaner.