Update WMDP Dataset #186

Open
justinphan3110cais opened this issue May 1, 2024 · 0 comments

Comments

@justinphan3110cais

Hi,
We improved the quality of the WMDP dataset last week. The updated dataset is available on the Hugging Face page and in our GitHub repo. Here are the details of the latest update:

Update 2024-04-23: the WMDP multiple choice questions were modified due to issues with data formatting and unicode encoding. Some questions in WMDP-Cyber were also removed for being excessively long, which makes evaluation with a fixed batch size challenging. Some questions in WMDP-Bio were also removed for insufficient dual-use potential (h/t folks from Google DeepMind and OpenAI). The modified version is now uploaded on all mirrors; please re-download the dataset. Thanks!
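To illustrate why very long questions are awkward with a fixed batch size, here is a minimal sketch of a length-based split. It is not the filter we actually used: the `split_by_length` helper, the `max_chars` threshold of 2000, and the sample records are all hypothetical, and the `question` / `choices` / `answer` field names are assumed here for illustration.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def split_by_length(records: List[Record], max_chars: int) -> Tuple[List[Record], List[Record]]:
    """Split records into (kept, dropped) by total prompt length in characters.

    Hypothetical helper: the real cutoff used for WMDP-Cyber is not stated
    in this issue, so max_chars is an illustrative assumption.
    """
    kept: List[Record] = []
    dropped: List[Record] = []
    for rec in records:
        # Prompt length = question text plus all answer choices.
        text = str(rec["question"]) + " " + " ".join(rec["choices"])
        if len(text) <= max_chars:
            kept.append(rec)
        else:
            dropped.append(rec)
    return kept, dropped


# Made-up records, assuming a question/choices/answer layout.
sample = [
    {"question": "Short question?", "choices": ["a", "b", "c", "d"], "answer": 0},
    {"question": "x" * 5000, "choices": ["a", "b", "c", "d"], "answer": 1},
]

kept, dropped = split_by_length(sample, max_chars=2000)
print(len(kept), len(dropped))
```

A single oversized item forces every other item in its batch to be padded to the same length, so dropping or separately handling such outliers keeps memory use predictable.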

Could you also update datasets/orchestrators/benchmark/question_answer_dataset to match?
