ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

⭐️ Works in our series: [MMStar] [ShareGPT4V] [ShareGPT4Omni]


🚀🚀🚀 Official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions.

Here is a short video introducing ShareGPT4Video:

demo_clip_v2.mp4

💡 Highlights

  • 🔥 A large-scale, highly descriptive video-text dataset: 40K GPT4-Vision-generated video captions plus around 400K implicit video split captions.
  • 🔥 A general video captioner for various video durations, resolutions, and aspect ratios, approaching GPT4-Vision's captioning capability, featuring two inference modes targeted at quality and efficiency, respectively.
  • 🔥 A superior large video-language model, ShareGPT4Video-8B, trained in just 5 hours on 8xA100 GPUs.
  • 🔥 Improving Text-to-Video performance with high-quality video captions generated by our ShareCaptioner-Video. Thanks to Open-Sora-Plan.

📜 News

[2024/6/11] The web demo and local demo of ShareCaptioner-Video are available now!

[2024/6/11] The web demo and local demo of ShareGPT4Video-8B are available now!

[2024/6/7] Our paper was featured in HuggingFace Daily Papers and ranked 1st on June 7.

[2024/5/27] The ShareGPT4Video-8B model is released!

[2024/5/26] The ShareGPT4Video dataset and project page are released!

👨‍💻 Todo

  • Training and evaluation code for ShareGPT4Video-8B
  • Batch inference code for ShareCaptioner-Video
  • Web demo and local demo of ShareCaptioner-Video
  • Web demo and local demo of ShareGPT4Video-8B
  • Checkpoints of ShareGPT4Video-8B

Quick Usage

You can directly chat with our ShareGPT4Video model about your own video with the following command:

python run.py --model-path Lin-Chen/sharegpt4video-8b --video examples/yoga.mp4 --query "Describe this video in detail."
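To caption several clips with the same prompt, the documented CLI can be wrapped in a small script. The sketch below only assembles the command line from the flags shown above; the `build_command` helper and the `examples/` glob are our own illustrative additions, not part of the repo, and the actual `subprocess.run` call is left commented out.

```python
import subprocess
from pathlib import Path

def build_command(video: str, query: str,
                  model_path: str = "Lin-Chen/sharegpt4video-8b") -> list[str]:
    """Build the run.py invocation documented in the README (hypothetical wrapper)."""
    return [
        "python", "run.py",
        "--model-path", model_path,
        "--video", video,
        "--query", query,
    ]

if __name__ == "__main__":
    # Caption every mp4 under examples/ with the same prompt.
    for clip in sorted(Path("examples").glob("*.mp4")):
        cmd = build_command(str(clip), "Describe this video in detail.")
        print(" ".join(cmd))
        # subprocess.run(cmd, check=True)  # uncomment to actually run each job
```

Passing the query as a single list element avoids the shell-quoting issues that arise when the prompt contains spaces.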

Or you can launch a local demo of ShareGPT4Video-8B with the following command:

python app.py

You can launch a local demo of ShareCaptioner-Video with the following commands:

cd captioner

python app.py

Install

git clone https://github.com/ShareGPT4Omni/ShareGPT4Video
conda create -n share4video python=3.10 -y
conda activate share4video

cd ShareGPT4Video
pip install --upgrade pip
pip install -e .
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
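After installing, a quick sanity check can confirm that the key packages are importable; flash-attn in particular often fails to build on unsupported GPU/toolchain combinations. This generic check is our own addition, not part of the repo:

```python
import importlib.util

def check(pkg: str) -> bool:
    """Return True if the package is importable in the current environment."""
    return importlib.util.find_spec(pkg) is not None

if __name__ == "__main__":
    # Packages the install steps above should provide; flash_attn is the one
    # most likely to be missing if the optional build step failed.
    for pkg in ("torch", "transformers", "flash_attn"):
        status = "ok" if check(pkg) else "MISSING"
        print(f"{pkg}: {status}")
```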

✒️ Citation

If you find our work helpful for your research, please consider giving a star ⭐ and a citation 📝.

@article{chen2024sharegpt4video,
  title={ShareGPT4Video: Improving Video Understanding and Generation with Better Captions},
  author={Chen, Lin and Wei, Xilin and Li, Jinsong and Dong, Xiaoyi and Zhang, Pan and Zang, Yuhang and Chen, Zehui and Duan, Haodong and Lin, Bin and Tang, Zhenyu and others},
  journal={arXiv preprint arXiv:2406.04325},
  year={2024}
}

❤️ Acknowledgments

  • LLaVA: the codebase we built upon. Thanks for their wonderful work.
  • Open-Sora-Plan: an excellent open-source codebase for Sora-like text-to-video implementation. Thanks for their wonderful work.
  • Open-LLaVA-NeXT: an open-source codebase for reproducing the training procedure of the LLaVA-NeXT series.

Star History

Star History Chart