Using MOIRAI-2 to Outperform Statistical Models on Sparse Data
A hands-on tutorial with MOIRAI-2 on the BOOM dataset
Last year, the time-series community got excited about the first wave of foundation models.
TimeGPT, TimesFM, MOIRAI, Chronos, and MOMENT showed something remarkable: zero-shot forecasting was possible — and the results were strong.
But there was a catch. The gains showed up mostly in benchmarks, not always in real-world use. A well-tuned specialized model, even a simple statistical one, could still win on practical datasets. Finetuning a foundation model was an option, but it defeats the purpose of zero-shot forecasting and adds latency.
This year, foundation models are being upgraded on multiple fronts: training methods, data quality, and richer input capabilities. Models like MOIRAI-2, TimesFM-2.5, and Toto-2 now surpass specialized baselines, showing significant improvements on challenging datasets.
This article focuses on the latest SOTA, MOIRAI-2.
Specifically, the article:
Tests MOIRAI-2 on BOOM, an industrial dataset with sparse time series.
Compares it with strong statistical baselines like AutoETS and AutoARIMA.
Builds a benchmark template to evaluate MOIRAI against any Nixtla model.
Let’s get started!
✅ Find the notebook for this article here: AI Projects Folder (Project 23)
BOOM Dataset
Along with Toto, Datadog released the BOOM dataset.
BOOM is unique in that it contains sparse operational data, a common feature of real-world datasets.
The dataset can be downloaded from Hugging Face. Since it is large, we'll first explore its contents with a short script and then download only the parts we need:
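As a sketch of that exploration step, the snippet below lists the files in the dataset repository using huggingface_hub, without downloading any data. The repo id `Datadog/BOOM` is an assumption here; check the dataset's Hugging Face page for the exact id:

```python
from huggingface_hub import hf_hub_download, list_repo_files

# Assumed repo id -- verify on the BOOM dataset page on Hugging Face.
REPO_ID = "Datadog/BOOM"

# List the files in the dataset repo without downloading anything.
files = list_repo_files(REPO_ID, repo_type="dataset")
print(f"{len(files)} files in {REPO_ID}")
for name in files[:10]:
    print(name)

# Later, fetch only a single file of interest instead of the whole dataset, e.g.:
# path = hf_hub_download(REPO_ID, filename=files[0], repo_type="dataset")
```

This lets us pick out the specific splits we need, keeping disk usage and download time manageable.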