
QUESTION: How to load the MiniCPM-V 2.5 GGUF model, which has two model files, with Xinference? #1681

Open
HakaishinShwet opened this issue Jun 20, 2024 · 1 comment
Labels
question Further information is requested
Milestone

Comments

@HakaishinShwet

MiniCPM-V 2.5 and other similar chat+vision GGUF models ship as two files: a large chat-model file and a smaller (generally ~1 GB) file that adds the vision capabilities. Only by loading both together do you get the complete model, which I have seen work in Ollama. How can we do the same in Xinference? Please give step-by-step guidance in detail, so it helps not just MiniCPM-V-style multimodal models but other similar models too.
Thank you.

@HakaishinShwet HakaishinShwet added the question Further information is requested label Jun 20, 2024
@XprobeBot XprobeBot added this to the v0.12.2 milestone Jun 20, 2024
@ChengjieLi28
Contributor

@HakaishinShwet You need to register the GGUF model with Xinference as a custom model and then use it. You can do this through the web UI; refer to the documentation: https://inference.readthedocs.io/en/latest/models/custom.html
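For reference, registration can also be done from the command line with a custom-model JSON file. The sketch below follows the custom-model format described in the linked documentation; the model name, paths, context length, and quantization here are illustrative assumptions, not tested values, and you should check the current docs for how the separate vision-projector file is handled for your Xinference version.

```json
{
  "version": 1,
  "context_length": 8192,
  "model_name": "custom-minicpm-v-2.5",
  "model_lang": ["en", "zh"],
  "model_ability": ["chat", "vision"],
  "model_family": "other",
  "model_specs": [
    {
      "model_format": "ggufv2",
      "model_size_in_billions": 8,
      "quantizations": ["Q4_K_M"],
      "model_file_name_template": "ggml-model-{quantization}.gguf",
      "model_uri": "/path/to/local/minicpm-v-2.5"
    }
  ]
}
```

Assuming the file is saved as `custom-minicpm-v.json`, the docs describe registering it with `xinference register --model-type LLM --file custom-minicpm-v.json --persist`, after which the model appears alongside the built-in ones for launching.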

@XprobeBot XprobeBot modified the milestones: v0.12.2, v0.12.4 Jun 28, 2024