ms-swift: Use PEFT or Full-parameter to finetune 250+ LLMs or 35+ MLLMs. (Qwen2, GLM4, Internlm2, Yi, Llama3, Llava, MiniCPM-V, Deepseek, Baichuan2, Phi3-Vision, ...)
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4V, Gemini, QwenVLPlus, 50+ HF models, and 20+ benchmarks
A Framework of Small-scale Large Multimodal Models
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷
Paddle Multimodal Integration and eXploration, supporting mainstream multimodal tasks, including end-to-end large-scale multimodal pretraining models and a diffusion model toolbox, with high performance and flexibility.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Visual Instruction Tuning for Qwen2 Base Model
A web application for running relatively small, locally hosted large language models (LLMs).
MLX-VLM is a package for running Vision LLMs locally on your Mac using MLX.
A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo
Tag manager and captioner for image datasets
Upload images to Slack with automatic alt text generation using Llava on Ollama
ArchNetAI is a Python library that leverages the Ollama API for generating AI-powered content.
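Several of the projects above drive LLaVA through a local Ollama server (for example, for alt-text generation). As a minimal sketch of that pattern, the snippet below sends a base64-encoded image to Ollama's `/api/generate` endpoint with the `llava` model; the endpoint URL assumes Ollama's default port (11434), and the prompt wording is only illustrative.

```python
import base64
import json
import urllib.request

# Default Ollama endpoint; adjust if your server runs elsewhere (assumption).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(image_bytes: bytes,
                  prompt: str = "Describe this image as concise alt text.") -> dict:
    """Build the JSON body for Ollama's /api/generate with an attached image.

    Ollama expects images as base64-encoded strings in the "images" list.
    """
    return {
        "model": "llava",
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # return one complete response instead of a stream
    }

def generate_alt_text(image_path: str) -> str:
    """Send the image to a locally running Ollama server and return the reply."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read())
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The same payload shape works for any multimodal model served by Ollama; posting the returned text on to Slack (as in the alt-text project above) is then an ordinary Slack API call.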