Retail Forecasting with IBM’s Tiny Time Mixers (TTM): Step-by-Step Tutorial
Putting the popular foundation model to the test!
Foundation models are already reshaping time-series forecasting.
Take the VN1[1] retail forecasting competition for example:
Nixtla’s TimeGPT ranked 2nd using zero-shot forecasting[2] — no training, ensembling, or postprocessing (e.g., no manual forecasting of zero-sales products).
Similarly, a fine-tuned MOIRAI-base achieved 1st place in the same competition[3]!
This shows the strong potential of foundation models when used correctly. But not all of them are mature enough for every use case: retail forecasting often needs exogenous variables, which some zero-shot models don't yet support.
In this article, we’ll walk through a hands-on tutorial using IBM’s Tiny-Time-Mixers on a retail forecasting task. We’ll use a Kaggle dataset with deeper hierarchies than VN1.
Let’s dive in!
✅ Find the Tiny-Time-Mixers notebook for this article in the AI Projects folder (Project 17)
Enter Tiny-Time-Mixers
I’ve covered TTM extensively on my blog, including a hands-on tutorial on electricity demand forecasting.
To recap, here are the key advantages of TTM:
Multi-level modeling: TTM first trains on univariate sequences, then integrates cross-channel mixing during finetuning to learn multivariate dependencies.
Dynamic Patching: TTM adjusts patch lengths across layers, letting each time series use its optimal resolution for better generalization.
Frequency-Aware Encoding: TTM embeds time-series frequency (e.g., monthly, minutely) to improve prediction accuracy across different temporal resolutions.
Open-Source: Apache License!
Moreover, TTM is actively developed. Two months ago, version 2.1 was released — better suited for daily and weekly seasonalities. That’s the variant we’ll use here.
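To give a feel for how little code is involved, here is a minimal sketch of loading a pretrained TTM checkpoint with IBM's open-source tsfm_public toolkit. The repository id, context length, and prediction length are illustrative assumptions, not the exact settings of this project's notebook; pick the revision that matches your data's frequency and horizon.

```python
# Minimal sketch: load a pretrained TTM checkpoint via tsfm_public.
# The repo id and lengths below are illustrative assumptions.
from tsfm_public.toolkit.get_model import get_model

CONTEXT_LENGTH = 512      # number of past time steps fed to the model
PREDICTION_LENGTH = 96    # forecast horizon in time steps

model = get_model(
    "ibm-granite/granite-timeseries-ttm-r2",  # TTM collection on the Hugging Face Hub
    context_length=CONTEXT_LENGTH,
    prediction_length=PREDICTION_LENGTH,
)

print(sum(p.numel() for p in model.parameters()), "parameters")
```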
Prepare the Dataset
In this project, we'll use a dataset from the Kaggle Tabular Competition [4].
This dataset tracks daily sales of 4 books across 2 stores in 6 different countries, spanning from 2017 to 2021. Our goal is to predict sales for each book, in each store, across every country.
This is a perfect dataset for TTM, because:
It includes additional static features and hierarchies of countries, stores, and products.
It includes the COVID-19 pandemic, which introduced a major regime change in sales patterns.
This lets us evaluate how well the foundation model handles sudden, real-world changes.
Let’s start with some visualizations of the dataset:
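As a first look, here is a minimal sketch that loads the data and plots aggregate daily sales. It assumes the competition's train.csv with columns date, country, store, product, and num_sold; adjust the names if your copy of the files differs.

```python
# Minimal sketch: load the Kaggle data and plot total daily sales.
# Assumes train.csv with columns: date, country, store, product, num_sold.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("train.csv", parse_dates=["date"])

# Each (country, store, product) combination is one time series in the hierarchy.
series_ids = ["country", "store", "product"]
print(df.groupby(series_ids).ngroups, "individual time series")

# Aggregate across all series to make the COVID-era regime change visible.
total_daily = df.groupby("date")["num_sold"].sum()
total_daily.plot(figsize=(12, 4), title="Total daily book sales (all series)")
plt.tight_layout()
plt.show()
```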