Toto Part 2: A Hands-On Guide to Zero-Shot Forecasting

Jul 14, 2025


Part 1 explored Toto and highlighted its unique features.

To recap:

Toto is a 151M-parameter model pretrained on 2.36 trillion tokens, with ~70% coming from Datadog’s private telemetry dataset.
Alongside Toto, Datadog released the BOOM dataset, a new dataset with 350M observations across 2,807 distinct multivariate time series, twice the size of the GIFT-Eval benchmark.

In this second part, we’ll walk through two tutorials that use Toto for:

Long-context forecasting on the Electricity dataset — with rolling forecasts across the full test series for more rigorous evaluation.
Zero-shot forecasting on sparse time series — using an example from the BOOM dataset.

Let’s get started!

AI Horizon Forecast