
Question: the response I got by using terminal is way better than using ollama.generate #117

Open
wangyeye66 opened this issue Apr 13, 2024 · 5 comments

Comments


wangyeye66 commented Apr 13, 2024

I use llama2 7b for text generation. The prompt I attempted:
"""Task: Turn the input into (subject, predicate, object).
Input: Sam Johnson is eating breakfast.
Output: (Dolores Murphy, eat, breakfast)
Input: Joon Park is brewing coffee.
Output: (Joon Park, brew, coffee)
Input: Jane Cook is sleeping.
Output: (Jane Cook, is, sleep)
Input: Michael Bernstein is writing email on a computer.
Output: (Michael Bernstein, write, email)
Input: Percy Liang is teaching students in a classroom.
Output: (Percy Liang, teach, students)
Input: Merrie Morris is running on a treadmill.
Output: (Merrie Morris, run, treadmill)
Input: John Doe is drinking coffee.
Output: (John Doe,"""

Using ollama.generate produces a chat-style reply rather than continuing the text.
In the terminal, the model seems to understand what I'm trying to do. Did I call the wrong function in Python? How can I tell the model I don't want a chat-like response?
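For reference, the call is roughly the following (a sketch assuming the official `ollama` Python package; prompt abbreviated):

```python
import ollama

# Sketch of the presumed call: llama2 is a chat-tuned model, so the
# server wraps this prompt in its chat template before generating.
response = ollama.generate(
    model='llama2',
    prompt=(
        "Task: Turn the input into (subject, predicate, object).\n"
        "Input: Sam Johnson is eating breakfast.\n"
        "Output: (Dolores Murphy, eat, breakfast)\n"
        # ... remaining few-shot examples from the prompt above ...
        "Input: John Doe is drinking coffee.\n"
        "Output: (John Doe,"
    ),
)
print(response['response'])
```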

@93andresen

I've never tried this library, but maybe ollama.chat works like the terminal and ollama.generate is like autocomplete?
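For comparison, here's roughly how the two calls look per the library's README (untested on my end; model and prompt are placeholders):

```python
import ollama

# Chat API: a list of role/content messages, rendered through the
# model's chat template as user/assistant turns.
chat = ollama.chat(
    model='llama2',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
)
print(chat['message']['content'])

# Generate API: a single prompt string; note that chat-tuned models
# still apply their template to it, so it isn't raw autocomplete by default.
gen = ollama.generate(model='llama2', prompt='Why is the sky blue?')
print(gen['response'])
```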


ioo0s commented Jun 3, 2024

I have the same problem: the results I get from running ollama run xxmodel in the terminal are much better than the results I get from the Python SDK's client.chat. Why?

@BowenKwan

Same problem here. Using ollama run custom_model in the terminal gives a much better result than ollama.chat(model='custom_model').

It seems to me that the few-shot examples provided in the Modelfile used to build custom_model are not passed to the model when using ollama.chat. The result looks just like using the base model that the custom model is built on.

Collaborator

mxyng commented Jun 13, 2024

@wangyeye66 can you paste the output you get from the CLI and the output from ollama.chat?

From what I can tell, this behavior is expected. llama2:7b uses a chat template that wraps these messages to simulate a user/assistant exchange, regardless of which method is used to interact with the LLM: the CLI, ollama.generate, or ollama.chat. Here's (roughly) what your prompt produces as input to the LLM:

[INST] <<SYS>><</SYS>> Task: Turn the input into (subject, predicate, object).
Input: Sam Johnson is eating breakfast.
Output: (Dolores Murphy, eat, breakfast)
Input: Joon Park is brewing coffee.
Output: (Joon Park, brew, coffee)
Input: Jane Cook is sleeping.
Output: (Jane Cook, is, sleep)
Input: Michael Bernstein is writing email on a computer.
Output: (Michael Bernstein, write, email)
Input: Percy Liang is teaching students in a classroom.
Output: (Percy Liang, teach, students)
Input: Merrie Morris is running on a treadmill.
Output: (Merrie Morris, run, treadmill)
Input: John Doe is drinking coffee.
Output: (John Doe, [/INST]

Based on your prompt, you're probably more interested in the text completion model, llama2:7b-text, which does not template the input.
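A minimal sketch of that suggestion (assuming the official `ollama` Python package and that llama2:7b-text has been pulled; the stop option is just one way to end generation after the tuple):

```python
import ollama

# llama2:7b-text is a base/text-completion model, so the prompt is fed
# to the model as-is instead of being wrapped in the [INST] chat template.
response = ollama.generate(
    model='llama2:7b-text',
    prompt=(
        "Task: Turn the input into (subject, predicate, object).\n"
        "Input: Sam Johnson is eating breakfast.\n"
        "Output: (Dolores Murphy, eat, breakfast)\n"
        # ... remaining few-shot examples from the prompt above ...
        "Input: John Doe is drinking coffee.\n"
        "Output: (John Doe,"
    ),
    options={'stop': ['\n']},  # assumption: stop at the end of the line
)
print(response['response'])  # should continue the tuple rather than reply as an assistant
```

If I remember right, ollama.generate also accepts raw=True to bypass the template on a chat model, but the text variant is the more direct fit here.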

Collaborator

mxyng commented Jun 13, 2024

@BowenKwan your issue appears different, so I'll respond in #188
