I followed the steps in the tutorial to evaluate the model on MSVD-QA, but I found that no matter how I set the system prompt, even when the model was explicitly asked to answer 'Yes', its answers stayed almost the same. Is this expected?
Thank you for your interest in our work. It looks like the model is not respecting the system/user prompt in this particular case, which can be considered one of the model's limitations.
Using some language-only instruction data (e.g. ShareGPT data) along with VideoInstruct-100K during training may improve the performance.
Please do share if you manage to fix this behavior. Thank you and good luck :)
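As a rough illustration of that idea, here is a minimal sketch of mixing language-only samples into a video instruction set before fine-tuning. The `mix_datasets` helper, the `text_ratio` value, and the sample dictionaries are hypothetical and not part of the repository; the real data would have to be loaded from the VideoInstruct-100K and ShareGPT files in whatever format the training script expects.

```python
import random


def mix_datasets(video_samples, text_samples, text_ratio=0.2, seed=0):
    """Return a shuffled list in which roughly `text_ratio` of items are text-only.

    `text_ratio` is an illustrative knob, not a value taken from the repository.
    """
    if not 0.0 <= text_ratio < 1.0:
        raise ValueError("text_ratio must be in [0, 1)")
    rng = random.Random(seed)
    # Number of text-only samples needed so they make up `text_ratio` of the mix.
    n_text = round(len(video_samples) * text_ratio / (1.0 - text_ratio))
    # Never request more text-only samples than are available.
    n_text = min(n_text, len(text_samples))
    mixed = list(video_samples) + rng.sample(list(text_samples), n_text)
    rng.shuffle(mixed)
    return mixed


# Toy stand-ins for the two sources (placeholders, not real data).
video = [{"id": f"v{i}", "video": f"v{i}.mp4"} for i in range(80)]
text = [{"id": f"t{i}"} for i in range(100)]
mixed = mix_datasets(video, text, text_ratio=0.2)
```

Whether such a mix actually restores prompt following is untested here; the ratio would need to be tuned against the MSVD-QA evaluation.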