This repo provides official code and checkpoints for iVideoGPT, a generic and efficient world model architecture that has been pre-trained on millions of human and robotic manipulation trajectories.
- 🚩 2024.05.31: Project website with video samples is released.
- 🚩 2024.05.30: Model pre-trained on Open X-Embodiment and inference code are released.
- 🚩 2024.05.28: The pre-trained model, inference code, project website, and a demo are coming soon in about one week!
- 🚩 2024.05.27: Our paper is released on arXiv.
```bash
conda create -n ivideogpt python=3.9
conda activate ivideogpt
pip install -r requirements.txt
```
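To quickly verify the environment, you can run a minimal sanity check (a sketch; it assumes `requirements.txt` installs PyTorch, which this repo depends on):

```python
# Sanity check after installation: confirm PyTorch is importable
# and whether a CUDA device is visible (assumes requirements.txt installs torch).
import torch

print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```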
At the moment, we provide the following models:
| Model | Resolution | Action-conditioned | Tokenizer Size | Transformer Size |
|---|---|---|---|---|
| ivideogpt-oxe-64-act-free | 64x64 | No | 114M | 138M |
If you cannot connect to Hugging Face, you can manually download the checkpoints from Tsinghua Cloud.
```bash
python predict.py --pretrained_model_name_or_path "thuml/ivideogpt-oxe-64-act-free" --input_path samples/fractal_sample.npz --dataset_name fractal20220817_data
```
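If you prefer to prefetch the checkpoint instead of relying on the automatic download at first run, a minimal sketch using the `huggingface_hub` library (install it via pip if it is not already pulled in by the requirements):

```python
# Minimal sketch: prefetch the pre-trained checkpoint from the Hugging Face Hub.
# Pass the returned local directory to --pretrained_model_name_or_path instead of the repo id.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="thuml/ivideogpt-oxe-64-act-free")
print("Checkpoint downloaded to:", local_dir)
```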
To try more samples, download the dataset from the Open X-Embodiment Dataset and extract single episodes as follows:
```bash
python oxe_data_converter.py --dataset_name {dataset_name, e.g. bridge} --input_path {path to OXE} --output_path samples --max_num_episodes 10
```
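To check what an extracted episode contains before feeding it to `predict.py`, you can inspect the `.npz` file (a minimal sketch; the exact array names are dataset-specific, so it simply lists whatever keys are present):

```python
# Minimal sketch: list the arrays stored in an extracted episode.
# The key names depend on the source dataset, so we just enumerate them.
import numpy as np

episode = np.load("samples/fractal_sample.npz")
for key in episode.files:
    print(key, episode[key].shape, episode[key].dtype)
```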
If you find this project useful, please cite our paper as:
```bibtex
@article{wu2024ivideogpt,
    title={iVideoGPT: Interactive VideoGPTs are Scalable World Models},
    author={Jialong Wu and Shaofeng Yin and Ningya Feng and Xu He and Dong Li and Jianye Hao and Mingsheng Long},
    journal={arXiv preprint arXiv:2405.15223},
    year={2024}
}
```
If you have any questions, please contact [email protected].