
QUESTION: How to load the MiniCPM-V 2.5 GGUF model, which has two model files, with Xinference? #1681

Open
HakaishinShwet opened this issue Jun 20, 2024 · 1 comment
Labels
question Further information is requested
Milestone

Comments

@HakaishinShwet

MiniCPM-V 2.5 and other similar chat+vision GGUF models ship as two files: a large chat-model file and a smaller (generally ~1 GB) file that adds the vision capabilities. Only by loading both together do you get the complete model, which I have seen work in Ollama. How can we do the same in Xinference? Please give step-by-step guidance in detail, so it helps not just MiniCPM-V-style multimodal models but other similar models too.
Thank you.

@HakaishinShwet HakaishinShwet added the question Further information is requested label Jun 20, 2024
@XprobeBot XprobeBot added this to the v0.12.2 milestone Jun 20, 2024
@ChengjieLi28
Contributor

@HakaishinShwet You need to register the GGUF model with Xinference as a custom model and then use it. You can do this through the web UI; refer to the documentation: https://inference.readthedocs.io/en/latest/models/custom.html
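For reference, registration can also be done from the command line with a custom-model JSON file. The sketch below follows the custom-model format described in the linked documentation; the model name, paths, context length, and quantization here are illustrative assumptions, not tested values, and you should check the current docs for how the separate vision-projector file is handled for your Xinference version.

```json
{
  "version": 1,
  "context_length": 8192,
  "model_name": "custom-minicpm-v-2.5",
  "model_lang": ["en", "zh"],
  "model_ability": ["chat", "vision"],
  "model_family": "other",
  "model_specs": [
    {
      "model_format": "ggufv2",
      "model_size_in_billions": 8,
      "quantizations": ["Q4_K_M"],
      "model_file_name_template": "ggml-model-{quantization}.gguf",
      "model_uri": "/path/to/local/minicpm-v-2.5"
    }
  ]
}
```

Assuming the file is saved as `custom-minicpm-v.json`, the docs describe registering it with `xinference register --model-type LLM --file custom-minicpm-v.json --persist`, after which the model appears alongside the built-in ones for launching.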

@XprobeBot XprobeBot modified the milestones: v0.12.2, v0.12.4 Jun 28, 2024