
Answers A, B, C, D are not all equally likely - is it really accurate to use random baseline as comparison? #11

Open
bmosaicml opened this issue Apr 19, 2023 · 0 comments


bmosaicml commented Apr 19, 2023

I pulled the test data linked in the README and noticed that, within each category, there is almost never an even 25% split between A, B, C, and D.

The most imbalanced category is high school statistics, for which 47% of the answers are D.

I have two questions. First, is my analysis correct? I used the test data downloadable from the main repo. Second, if it is correct, isn't a random baseline an unfair comparison, since always guessing the most common answer in a category (a majority-class baseline) would do much better?

I used the data here: https://people.eecs.berkeley.edu/~hendrycks/data.tar
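
For anyone who wants to reproduce the check, here is a minimal sketch of the kind of count described above. It assumes the archive extracts to a `data/test/` directory of headerless `*_test.csv` files whose last column holds the answer letter; that layout and the helper names (`answer_distribution`, `majority_baseline`, `read_answers`, `report`) are assumptions for illustration, not something taken from the repo.

```python
import csv
from collections import Counter
from pathlib import Path


def answer_distribution(answers):
    """Return the share of each answer letter (A-D) in a list of labels."""
    counts = Counter(answers)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no answers to count")
    return {letter: counts.get(letter, 0) / total for letter in "ABCD"}


def majority_baseline(answers):
    """Accuracy of always guessing the most frequent answer letter."""
    return max(answer_distribution(answers).values())


def read_answers(csv_path):
    """Read the last column of a headerless CSV (assumed to be the answer)."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return [row[-1].strip() for row in csv.reader(f) if row]


def report(test_dir):
    """Print per-category answer shares and the majority-class accuracy."""
    for path in sorted(Path(test_dir).glob("*_test.csv")):
        answers = read_answers(path)
        dist = answer_distribution(answers)
        shares = " ".join(f"{k}={v:.0%}" for k, v in dist.items())
        print(f"{path.stem}: {shares} majority={majority_baseline(answers):.0%}")
```

Calling `report("data/test")` after extracting the archive would print one line per category. Any category where the majority figure is well above 25% is one where uniform random guessing understates what a trivial constant guesser can score.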

@bmosaicml changed the title from "Answers A, B, C, D are not all equally likely - why would a random baseline get 25%?" to "Answers A, B, C, D are not all equally likely - is it really accurate to use random baseline as comparison?" on Apr 19, 2023