DPO training of a supervised finetuned model #3997
Replies: 3 comments
- the warning message can be safely ignored
- Thank you very much. Thanks again
- up to you
-
Hello,
I did SFT and then wanted to run DPO training on top of the SFT model, so I chose my base model and adapter (the SFT model) and selected DPO training. The problem is that none of the inputs have requires_grad=True, so the gradient will be None. I don't know where I made a mistake or how to do this properly.
Thanks in advance
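A hedged sketch of what is likely behind the warning: when the base model's weights are frozen (as in a LoRA/adapter setup, often combined with gradient checkpointing), the embedding output has requires_grad=False, so autograd reports that no input requires gradients. The toy example below reproduces the symptom with a plain frozen embedding and shows the usual workaround of forcing the output to require grad (in transformers this is what `model.enable_input_require_grads()` does); it is an illustration of the mechanism, not the exact code path of any particular trainer.

```python
import torch

# Simulate a frozen base model: all parameters have requires_grad=False,
# as when only a LoRA adapter is trainable.
emb = torch.nn.Embedding(10, 4)
for p in emb.parameters():
    p.requires_grad = False

tokens = torch.tensor([1, 2, 3])

# With everything upstream frozen, the output tracks no gradients --
# this is the condition the "none of the inputs have requires_grad=True"
# warning refers to.
out = emb(tokens)
print(out.requires_grad)  # False

# Workaround used with gradient checkpointing + adapters: mark the
# embedding output as requiring grad so the backward pass can reach
# the trainable adapter weights further down the stack.
out2 = emb(tokens)
out2.requires_grad_(True)
print(out2.requires_grad)  # True
```

As the first reply notes, when the trainable adapter weights do receive gradients, the warning itself is harmless and can be ignored.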