Support for config_sentence_transformers.json #244
Good idea. This could be added to the payload of
I'll have to backtrack to what I was working on at the time I filed this, but yes, this was derived from a real need, so I just have to go find the model. I'm still happy to offer a PR; I just wanted to make sure it was welcome before putting the work in.
So you want to be able to select the prompt with a payload like:

```json
{
  "inputs": "text",
  "prompt": "query",
  "truncate": false
}
```

And the prompt is just added at the beginning of `inputs`, right?
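A minimal sketch, in Python with hypothetical names (`apply_prompt` is not TEI's API), of what this behaviour could look like server-side: the request's `prompt` field names an entry in the model's `prompts` map, and that entry's value is prepended to `inputs` before tokenization.

```python
# Hypothetical sketch: resolve the request's "prompt" name against the model's
# config_sentence_transformers.json "prompts" map and prepend it to the input.
def apply_prompt(inputs: str, prompt_name: str, config: dict) -> str:
    prompts = config.get("prompts", {})
    if prompt_name not in prompts:
        raise ValueError(f"unknown prompt name: {prompt_name!r}")
    return prompts[prompt_name] + inputs

# Example with an illustrative config in the sentence-transformers format.
config = {"prompts": {"query": "Represent this sentence for searching relevant passages: "}}
text = apply_prompt("what is TEI?", "query", config)
# text == "Represent this sentence for searching relevant passages: what is TEI?"
```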
My preference order would be
Basically, we find it a bit limiting to have to change code across multiple codebases whenever we change models.
Ok, so what you would like is to have a default format set when starting the service? Otherwise I don't see how you can make it truly model-agnostic. So:
Yes. The main goal would be to reduce friction between model changes. If y'all are comfortable with this, I'd be happy to offer a PR. It will be a few weeks, with the holiday and all.
I don't know how much it reduces friction, because the models don't need to agree on the prompt names. Snowflake/snowflake-arctic-embed-l:

```json
"prompts": {
  "query": "Represent this sentence for searching relevant passages: "
}
```

vs intfloat/e5-mistral-7b-instruct:

```json
"prompts": {
  "web_search_query": "Instruct: Given a web search query, retrieve relevant passages that answer the query\nQuery: ",
  "sts_query": "Instruct: Retrieve semantically similar text.\nQuery: ",
  "summarization_query": "Instruct: Given a news summary, retrieve other semantically similar summaries\nQuery: ",
  "bitext_query": "Instruct: Retrieve parallel sentences.\nQuery: "
}
```

In this case you still need to be aware of the different names and what they do. However, it's still easier to pick the correct enum value, add it to the body, and leave it to TEI to add the pre-prompt than what you currently have to do, so I'm in favour of adding it. I will do that quickly.
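To illustrate the default-plus-override resolution being discussed, a small Python sketch (`resolve_prompt` is a hypothetical helper, not TEI code): a default prompt name chosen at service startup, which a per-request `prompt` field can override. The client code stays the same across models, but the names themselves still come from each model's config.

```python
# Hypothetical sketch: pick the prompt name from the request if given,
# otherwise fall back to a server-wide default set at startup.
def resolve_prompt(config_prompts: dict, request_prompt=None, default_prompt=None) -> str:
    name = request_prompt or default_prompt
    if name is None:
        return ""  # no prompt configured; pass inputs through unchanged
    if name not in config_prompts:
        raise ValueError(f"unknown prompt name: {name!r}")
    return config_prompts[name]

# The same call shape works against both models, but the names differ:
snowflake = {"query": "Represent this sentence for searching relevant passages: "}
e5 = {"sts_query": "Instruct: Retrieve semantically similar text.\nQuery: "}

prefix_a = resolve_prompt(snowflake, default_prompt="query")
prefix_b = resolve_prompt(e5, request_prompt="sts_query")
```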
A first draft: #312
Feature request
Add a CLI option to auto-format input text with the prompt settings from `config_sentence_transformers.json` (if provided) before tokenizing.
Motivation
A lot of models now expect a prompt prefix, so handling this server-side allows clients to become model-agnostic. We have trouble switching between models since we must support each model's custom prompt on the client side. Handling it server-side via config would remove this entirely.
Your contribution
Happy to do the PR myself; I just want to make sure this would be a welcome contribution.