AI Horizon Forecast

Retail Forecasting with IBM's Tiny Time Mixers (TTM): Step-by-Step Tutorial

Putting the popular foundation model to the test!

Nikos Kafritsas
May 14, 2025
Created with DALL·E 3

Foundation models are already reshaping time-series forecasting.

Take the VN1[1] retail forecasting competition for example:

Nixtla’s TimeGPT ranked 2nd using zero-shot forecasting[2] — no training, ensembling, or postprocessing (e.g., no manual forecasting of zero-sales products).

Similarly, a fine-tuned MOIRAI-base achieved 1st place in the same competition[3]!

This shows strong potential for foundation models — when used correctly. But not all are mature enough for every case. Retail forecasting often needs exogenous variables, which some zero-shot models don’t support yet.

In this article, we’ll walk through a hands-on tutorial using IBM’s Tiny-Time-Mixers on a retail forecasting task. We’ll use a Kaggle dataset with deeper hierarchies than VN1.

Let’s dive in!

✅ Find the Tiny-Time-Mixers notebook for this article in the AI Projects folder (Project 17)

Enter Tiny-Time-Mixers

I’ve covered TTM extensively on my blog, including a hands-on tutorial for Electricity Demand Forecasting:

  • Tiny Time Mixers (TTMs): Powerful Zero/Few-Shot Forecasting Models by IBM (June 4, 2024)

  • Tiny-Time-Mixers R2 (TTM): More Accurate Predictions with Exogenous Feature Mixing - Complete Tutorial (March 24)

To recap, these are the advantages of TTM:

  • Multi-level modeling: TTM first trains on univariate sequences, then integrates cross-channel mixing during finetuning to learn multivariate dependencies.

  • Dynamic Patching: TTM adjusts patch lengths across layers, letting each time series use its optimal resolution for better generalization.

  • Frequency-Aware Encoding: TTM embeds time-series frequency (e.g., monthly, minutely) to improve prediction accuracy across different temporal resolutions.

  • Open-Source: Apache License!
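To make the patching idea above concrete, here's a minimal sketch of splitting a series into non-overlapping patches. This is only an illustration of the general technique, not TTM's internal implementation, which adapts patch lengths per layer:

```python
import numpy as np

def make_patches(series: np.ndarray, patch_len: int) -> np.ndarray:
    """Split a 1-D series into non-overlapping patches of length patch_len.

    Trailing points that don't fill a full patch are dropped here for
    simplicity; real patching layers typically pad instead.
    """
    n_patches = len(series) // patch_len
    return series[: n_patches * patch_len].reshape(n_patches, patch_len)

series = np.arange(12, dtype=float)

# A coarser patch length means fewer, longer tokens for the mixer layers.
print(make_patches(series, 4).shape)  # (3, 4)
print(make_patches(series, 6).shape)  # (2, 6)
```

Varying `patch_len` changes the effective resolution each layer operates at, which is what lets each series be processed at a granularity that suits it.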

Moreover, TTM is actively developed. Two months ago, version 2.1 was released — better suited for daily and weekly seasonalities. That’s the variant we’ll use here.


Prepare the Dataset

In this project, we'll use a dataset from the Kaggle Tabular Competition [4].

This dataset tracks daily sales of 4 books across 2 stores in 6 different countries, spanning from 2017 to 2021. Our goal is to predict sales for each book, in each store, across every country.
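Data like this is usually handled in long format, with one series per country-store-product combination. As a rough sketch (the column names are assumptions, not the exact Kaggle schema), that gives 6 × 2 × 4 = 48 series:

```python
import itertools
import pandas as pd

countries = [f"country_{i}" for i in range(6)]
stores = ["store_A", "store_B"]
books = [f"book_{i}" for i in range(4)]

# One time series per (country, store, book) combination.
combos = list(itertools.product(countries, stores, books))
print(len(combos))  # 48

df = pd.DataFrame(combos, columns=["country", "store", "product"])

# A unique_id column identifying each series, the shape most
# forecasting toolkits expect for grouped/hierarchical data.
df["unique_id"] = df["country"] + "/" + df["store"] + "/" + df["product"]
print(df["unique_id"].nunique())  # 48
```

The `country`, `store`, and `product` columns then double as the static categorical features TTM can exploit.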

This is a perfect dataset for TTM, because:

  1. It includes additional static features and hierarchies of countries, stores, and products.

  2. It includes the COVID-19 pandemic, which introduced a major regime change in sales patterns.

Let’s start with some visualizations of the dataset:

Visualizing the COVID-19 period lets us evaluate how well the foundation model handles sudden, real-world regime changes.
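One simple way to stress-test that regime change is to place the evaluation window entirely after the pandemic's onset. A sketch on a synthetic daily index matching the dataset's 2017-2021 span (the cutoff date is illustrative, not the tutorial's actual split):

```python
import pandas as pd

# Synthetic daily index covering the dataset's 2017-2021 span.
dates = pd.date_range("2017-01-01", "2021-12-31", freq="D")
df = pd.DataFrame({"date": dates, "sales": range(len(dates))})

# Illustrative cutoff: train on pre-2020 data, so the COVID-19
# regime shift falls entirely inside the evaluation period.
cutoff = pd.Timestamp("2020-01-01")
train = df[df["date"] < cutoff]
test = df[df["date"] >= cutoff]

print(train["date"].max().date(), test["date"].min().date())
```

A model that only memorizes pre-2020 seasonality will look good on the training window and fall apart on the test window, which is exactly the failure mode we want to surface.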

© 2025 Nikos Kafritsas