TIME-MOE: A Tutorial on Zero-Shot Forecasting with Mixture-of-Experts
Forecasting on new data with a billion-scale model
Our previous article discussed Time-MOE, the largest time-series foundation model released to date.
Despite its size, Time-MOE is fast at inference. It uses a Mixture-of-Experts (MOE) architecture, a sparse design in which a router activates only a small subset of parameters (the relevant experts) for each input, keeping compute low without sacrificing accuracy.
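To make the sparsity idea concrete, here is a minimal top-k routing sketch in PyTorch. This is a toy illustration, not Time-MOE's actual layer; the class name, expert width, and the choice of 8 experts with top_k=2 are assumptions for demonstration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Toy top-k mixture-of-experts layer: a router scores all experts,
    but only the k highest-scoring experts run for each time step."""

    def __init__(self, d_model: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        scores = self.router(x)                           # (B, T, num_experts)
        top_w, top_idx = scores.topk(self.top_k, dim=-1)  # keep k experts per step
        top_w = F.softmax(top_w, dim=-1)                  # renormalize their weights

        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            idx = top_idx[..., slot]                      # (B, T) chosen expert id
            w = top_w[..., slot].unsqueeze(-1)            # (B, T, 1) mixing weight
            for e, expert in enumerate(self.experts):
                mask = idx == e
                if mask.any():
                    # run the expert only on the steps routed to it
                    out[mask] += w[mask] * expert(x[mask])
        return out
```

For an input of shape `(2, 32, 64)`, `SparseMoELayer(64)(x)` returns the same shape while running only 2 of the 8 experts per time step, which is the property that keeps large MoE models cheap at inference.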
Interestingly, just 2 weeks after Time-MOE's release, another foundation model, MOIRAI, integrated MOE. The resulting model, MOIRAI-MOE, demonstrated performance improvements over its predecessor.
This article provides a mini-tutorial on Time-MOE: we apply it to a multi-series dataset and compare its zero-shot forecasts against statistical baseline models.
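As a quick preview, the zero-shot workflow looks roughly like the sketch below. It is based on the public Time-MoE release on Hugging Face; the checkpoint name `Maple728/TimeMoE-50M`, the per-series scaling, and the toy input are assumptions and may differ from the exact code in the full project.

```python
import torch
from transformers import AutoModelForCausalLM

# Load a pretrained Time-MoE checkpoint (name assumed from the public release)
model = AutoModelForCausalLM.from_pretrained(
    "Maple728/TimeMoE-50M",
    trust_remote_code=True,  # Time-MoE ships custom modeling code
)

# Toy context: one series with 512 past observations
context = torch.randn(1, 512)

# Standardize each series before feeding it to the model (assumed preprocessing)
mean = context.mean(dim=-1, keepdim=True)
std = context.std(dim=-1, keepdim=True)
normed_context = (context - mean) / std

# Autoregressively generate the next 96 points, then undo the scaling
horizon = 96
output = model.generate(normed_context, max_new_tokens=horizon)
forecast = output[:, -horizon:] * std + mean
print(forecast.shape)  # torch.Size([1, 96])
```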
Let’s dive in.
Time-MOE Tutorial
You can also find the full project in the AI Projects Folder (Project 11).
The challenge with Time-MOE is that it was pretrained on most available public datasets.