(question) could quarkus.langchain4j.ollama.embedding-model.model-id also give vector dimension ? #682

Open
laurentperez opened this issue Jun 15, 2024 · 2 comments

Comments

@laurentperez
Contributor

laurentperez commented Jun 15, 2024

hello

The root use case behind this issue: Ollama's https://ollama.com/library/nomic-embed-text produces a 768-dimensional vector, which does not fit the 1536 dimensions inherited from OpenAI or the 384 dimensions of the in-memory embedders.

It throws an exception the first time an embedding is stored (using nomic), because the create table embeddings statement issued by langchain4j for pgvector defaults to a dimension of 1536.

Of course, the documentation mentions that the pgvector dimension has to be set, see https://docs.quarkiverse.io/quarkus-langchain4j/dev/pgvector-store.html. However, the extension does not yet discover the correct vector size, which depends on the embedder used.

To make it work, I simply ran an alter table on embeddings and changed the vector size from 1536 to 768, because I was using nomic.

My suggestion: let the extension discover, at configuration/build time, the default vector size implied by the embedder used.

To do that, however, the extension needs to know which embedder is used: quarkus.langchain4j.ollama.embedding-model.model-id could give the extension the dimension information.
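To illustrate the idea, here is a rough sketch of a lookup from model id to vector dimension. The class `KnownEmbeddingDimensions` and its method are hypothetical and not part of quarkus-langchain4j; the only entry is the nomic-embed-text value mentioned above, and a real table would have to be maintained per model.

```java
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical helper, NOT an existing quarkus-langchain4j class.
final class KnownEmbeddingDimensions {

    // Illustrative table: only the value reported in this issue is listed.
    private static final Map<String, Integer> DIMENSIONS = Map.of(
            "nomic-embed-text", 768);

    private KnownEmbeddingDimensions() {
    }

    // Returns the known dimension for a model id, or empty if the model is unknown.
    static OptionalInt forModelId(String modelId) {
        if (modelId == null) {
            return OptionalInt.empty();
        }
        // Drop an optional tag suffix such as ":latest" before the lookup.
        int colon = modelId.indexOf(':');
        String name = colon >= 0 ? modelId.substring(0, colon) : modelId;
        Integer dimension = DIMENSIONS.get(name);
        return dimension == null ? OptionalInt.empty() : OptionalInt.of(dimension);
    }

    public static void main(String[] args) {
        System.out.println(forModelId("nomic-embed-text:latest"));
        System.out.println(forModelId("some-unknown-model"));
    }
}
```

When the model is unknown, the caller would fall back to whatever dimension is configured explicitly, so a static table like this could only ever supply a default.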

The dimension could also be discovered at runtime: it seems Ollama is adding more model info, see ollama/ollama#3570 (comment).
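A different way to get the dimension at runtime, without relying on Ollama's model info, would be to embed a short probe text once and look at the length of the returned vector. The sketch below assumes the embedder can be modelled as a plain `Function<String, float[]>`; `DimensionProbe` and its methods are made-up names, and the fake embedder in `main` stands in for a real model call.

```java
import java.util.function.Function;

// Hypothetical helper, NOT an existing quarkus-langchain4j or langchain4j class.
final class DimensionProbe {

    private DimensionProbe() {
    }

    // Embeds a short probe text and returns the length of the resulting vector.
    static int probeDimension(Function<String, float[]> embedder) {
        float[] vector = embedder.apply("dimension probe");
        if (vector == null || vector.length == 0) {
            throw new IllegalStateException("Embedder returned no vector");
        }
        return vector.length;
    }

    // Fails fast with a readable message if the store and the embedder disagree.
    static void checkMatches(int configuredDimension, Function<String, float[]> embedder) {
        int actual = probeDimension(embedder);
        if (actual != configuredDimension) {
            throw new IllegalStateException("Embedding store is configured for dimension "
                    + configuredDimension + " but the embedding model produces " + actual);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a real embedder: always returns a 768-element vector.
        Function<String, float[]> fakeNomic = text -> new float[768];
        System.out.println(probeDimension(fakeNomic));
        try {
            checkMatches(1536, fakeNomic);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check like this costs one extra embedding call at startup, but it would turn the exception on first insert into an explicit mismatch error.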

I believe this is also related to ollama/ollama#651 and spring-projects/spring-ai#840

WDYT?

@laurentperez laurentperez changed the title Add a configuration property for default ollama embedder (nomic-embed-text) => relatable to vector dimension (question) could quarkus.langchain4j.ollama.embedding-model.model-id also give vector dimension ? Jun 16, 2024
@geoand
Collaborator

geoand commented Jun 17, 2024

cc @langchain4j @jmartisk

@jmartisk
Collaborator

There is langchain4j/langchain4j#1250
