Vision-Language Web Demo

A chatbot demo with image input.

Supported Models

  • internlm/internlm-xcomposer-7b
  • Qwen-VL-Chat

Quick Start

internlm/internlm-xcomposer-7b

  • Extract the LLM from the Hugging Face model:
    python extract_xcomposer_llm.py
    # The LLM part will be saved to the internlm_model folder.
  • Launch the demo:
    python app.py --model-name internlm-xcomposer-7b --llm-ckpt internlm_model

Qwen-VL-Chat

  • Launch the demo:
    python app.py --model-name qwen-vl-chat --hf-ckpt Qwen/Qwen-VL-Chat
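The two launch commands above differ only in the checkpoint flag each model takes. The sketch below collects them in one table so a wrapper script can start either model by name. The helper build_command and the LAUNCH_ARGS table are hypothetical and not part of this repo; only app.py and the flags shown above come from this README.

```python
import shlex
import sys

# Per-model checkpoint arguments, copied from the Quick Start commands.
# LAUNCH_ARGS and build_command are illustrative names, not part of the demo.
LAUNCH_ARGS = {
    "internlm-xcomposer-7b": ["--llm-ckpt", "internlm_model"],
    "qwen-vl-chat": ["--hf-ckpt", "Qwen/Qwen-VL-Chat"],
}


def build_command(model_name, python=sys.executable):
    """Return the argv list that launches app.py for the given model."""
    if model_name not in LAUNCH_ARGS:
        raise ValueError(f"unsupported model: {model_name!r}")
    return [python, "app.py", "--model-name", model_name, *LAUNCH_ARGS[model_name]]


if __name__ == "__main__":
    # Print both commands; pass the list to subprocess.run to actually launch.
    for name in LAUNCH_ARGS:
        print(shlex.join(build_command(name, python="python")))
```

For internlm-xcomposer-7b, the extraction step (python extract_xcomposer_llm.py) still has to be run once before launching.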

Limitations

  • This demo uses the code from each model's own repo to extract image features, which may not be very efficient.
  • This demo only implements the chat function. To use the localization ability of Qwen-VL-Chat or the article generation function of InternLM-XComposer, you need to implement the corresponding pre- and post-processing yourself. Compared with chat, the differences are in how the prompts are built and how the model output is used.
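As a rough illustration of the pre/post-processing mentioned in the second limitation, the sketch below builds a localization prompt and parses bounding boxes from a model reply. It is not code from this demo. It assumes that Qwen-VL-Chat marks boxes as <ref>label</ref><box>(x1,y1),(x2,y2)</box> with coordinates normalized to a 0-1000 grid; check both against the Qwen-VL documentation before relying on them. The function names and the prompt wording are hypothetical.

```python
import re

# Assumed markup: <ref>label</ref><box>(x1,y1),(x2,y2)</box>, where the
# <ref> part is optional and coordinates lie on a 0..scale grid.
BOX_PATTERN = re.compile(
    r"(?:<ref>(?P<label>[^<]*)</ref>)?"
    r"<box>\((?P<x1>\d+),(?P<y1>\d+)\),\((?P<x2>\d+),(?P<y2>\d+)\)</box>"
)


def build_grounding_prompt(image_path, phrase):
    """Pre-processing: hypothetical prompt asking the model to locate a phrase."""
    return f"Picture 1: <img>{image_path}</img>\nLocate {phrase} in the picture."


def parse_boxes(reply, width, height, scale=1000):
    """Post-processing: extract (label, (x1, y1, x2, y2)) tuples in pixels."""
    boxes = []
    for match in BOX_PATTERN.finditer(reply):
        x1, y1, x2, y2 = (int(match.group(k)) for k in ("x1", "y1", "x2", "y2"))
        pixel_box = (
            round(x1 * width / scale),
            round(y1 * height / scale),
            round(x2 * width / scale),
            round(y2 * height / scale),
        )
        boxes.append((match.group("label"), pixel_box))
    return boxes


if __name__ == "__main__":
    reply = "<ref>the dog</ref><box>(100,200),(500,800)</box>"
    print(build_grounding_prompt("demo.jpg", "the dog"))
    print(parse_boxes(reply, width=640, height=480))
```

Article generation in InternLM-XComposer would follow the same pattern with a different prompt builder and output parser.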