---
title: VisionTSpp
emoji: 📚
colorFrom: purple
colorTo: purple
python_version: 3.10.14
sdk: gradio
# sdk_version: 5.44.1
sdk_version: 5.34.0
app_file: app.py
pinned: false
license: mit
short_description: space for VisionTSpp
pipeline_tag: time-series-forecasting
---

# VisionTS++: Cross-Modal Time Series Foundation Model with Continual Pre-trained Visual Backbones

This repository hosts the **VisionTS++** model, a state-of-the-art time series foundation model based on continual pre-training of a visual Masked AutoEncoder (MAE) on large-scale time series data. It excels in multivariate and probabilistic time series forecasting by bridging modality gaps between vision and time series data.

The model was introduced in the paper:
[**VisionTS++: Cross-Modal Time Series Foundation Model with Continual Pre-trained Vision Backbones**](https://arxiv.org/abs/2508.04379)

Official GitHub repository: [https://github.com/HALF111/VisionTSpp](https://github.com/HALF111/VisionTSpp)

Try **VisionTS++** directly in your browser on this [Hugging Face Space](https://huggingface.co/spaces/Lefei/VisionTSpp). You can upload your own time series CSV file for zero-shot forecasting.

## About
VisionTS++ is built upon continual pre-training of a vision model on large-scale time series, addressing key discrepancies in cross-modal transfer from vision to time series. It introduces three key innovations:

1.  **Vision-model-based filtering**: Identifies high-quality sequences to stabilize pre-training and mitigate the data-modality gap.
2.  **Colorized multivariate conversion**: Encodes multivariate series as multi-subfigure RGB images to enhance cross-variate modeling.
3.  **Multi-quantile forecasting**: Uses parallel reconstruction heads to generate quantile forecasts for probabilistic predictions without parametric assumptions.

These innovations allow VisionTS++ to achieve state-of-the-art performance in both in-distribution and out-of-distribution forecasting, demonstrating that vision models can generalize effectively to time series forecasting when appropriately adapted.
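
Quantile forecasts such as those in item 3 are commonly trained and evaluated with the pinball (quantile) loss. The snippet below is a minimal, dependency-free sketch of that standard loss, included only to illustrate what "multi-quantile forecasting without parametric assumptions" means in practice. It is not taken from the VisionTS++ code base, and the function names `pinball_loss` and `multi_quantile_loss` are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Average pinball (quantile) loss of y_pred at quantile level q.

    Under-prediction is weighted by q and over-prediction by (1 - q),
    so minimizing this loss pushes y_pred toward the q-th quantile.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def multi_quantile_loss(y_true, preds_by_quantile):
    """Mean pinball loss over several quantile outputs.

    preds_by_quantile maps a quantile level (e.g. 0.1, 0.5, 0.9) to the
    forecast produced for that level, mimicking one output per head.
    """
    losses = [pinball_loss(y_true, pred, q) for q, pred in preds_by_quantile.items()]
    return sum(losses) / len(losses)
```

Because each quantile level has its own output, the set of forecasts describes the predictive distribution directly, with no need to assume a Gaussian or other parametric form.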


## Installation

The VisionTS++ model is available through the `visionts` package on PyPI.

Install it with pip:

```shell
pip install visionts
```

If you want to modify the inference code, you can instead install from source in editable mode:

```shell
git clone https://github.com/HALF111/VisionTSpp.git
cd VisionTSpp
pip install -e .
```

For detailed inference examples and usage with clear visualizations of image reconstruction, please refer to the `demo.ipynb` notebook in the [official GitHub repository](https://github.com/HALF111/VisionTSpp/blob/main/demo.ipynb).

## Citation
If you use VisionTS++ or VisionTS in your research or applications, please cite the corresponding papers using the following BibTeX entries:

```bibtex
@misc{chen2024visionts,
      title={VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters}, 
      author={Mouxiang Chen and Lefei Shen and Zhuo Li and Xiaoyun Joy Wang and Jianling Sun and Chenghao Liu},
      year={2024},
      eprint={2408.17253},
      archivePrefix={arXiv},
      url={https://arxiv.org/abs/2408.17253}, 
}
@misc{shen2025visiontspp,
      title={VisionTS++: Cross-Modal Time Series Foundation Model with Continual Pre-trained Visual Backbones}, 
      author={Lefei Shen and Mouxiang Chen and Xu Liu and Han Fu and Xiaoxue Ren and Jianling Sun and Zhuo Li and Chenghao Liu},
      year={2025},
      eprint={2508.04379},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2508.04379}, 
}
```