Hi @pengzhiliang. I want to fine-tune Kosmos-2 on a VQA task where the answer is a single word (similar to a multi-class classification task); I call this single word the label. I only have question-answer pairs, not bounding boxes. Should I use <grounding> or not? That is, should the input be
<grounding> Question: Are there any <phrase>cats</phrase> in the image? Answer: label
or
Question: Are there any <phrase>cats</phrase> in the image? Answer: label
I am using Kosmos2ForConditionalGeneration.
Another question: is it reasonable to use Kosmos2ForConditionalGeneration for fine-tuning?
Thank you for your patience. @FarzanRahmani
If your downstream task does not involve bounding boxes, there's no need to use <grounding>.
You can use it like this:
Question: {question} Answer: {answer}
or
Question: {question} Answer the question using a single word or phrase. Answer: {answer}
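As a minimal sketch of the two templates above, the helper below formats a question-answer pair without the <grounding> tag. The function name build_vqa_prompt and its parameters are hypothetical, chosen only for illustration; the resulting string is assumed to be what you pass as the text input alongside the image when preparing fine-tuning examples.

```python
def build_vqa_prompt(question, answer=None, short_answer_hint=False):
    """Format a VQA example without the <grounding> tag.

    Hypothetical helper (not part of Kosmos-2 or transformers).
    Pass answer=None to get the inference-time prompt ending in "Answer:".
    """
    prompt = f"Question: {question.strip()}"
    if short_answer_hint:
        # Second template from the reply: nudge the model toward short answers.
        prompt += " Answer the question using a single word or phrase."
    prompt += " Answer:"
    if answer is not None:
        # Training-time target: append the single-word label.
        prompt += f" {answer.strip()}"
    return prompt


train_text = build_vqa_prompt("Are there any cats in the image?", "yes")
infer_text = build_vqa_prompt("Are there any cats in the image?", short_answer_hint=True)
print(train_text)
print(infer_text)
```

Leaving the answer out gives the prompt to use at inference time, so that training and generation share the same prefix.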