ValueError: not enough values to unpack (expected 2, got 1) #11

Open
lazir0lufias opened this issue Jun 22, 2022 · 5 comments

Comments

@lazir0lufias

Hi, sorry, I'm new to this field.

!python tune_gpt.py --khan-dataroot /content/amps/khan/ --save-dir /content/drive/MyDrive/model/

When I run the above command on Google Colab, I get this error:

Traceback (most recent call last):
  File "tune_gpt.py", line 333, in <module>
    main()
  File "tune_gpt.py", line 318, in main
    train_data = get_dataset(args)
  File "tune_gpt.py", line 239, in get_dataset
    len_multiplier, dirname = args.khan_dataroot.split("@")
ValueError: not enough values to unpack (expected 2, got 1)

How can I fix this?
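
The traceback shows that line 239 of tune_gpt.py splits the `--khan-dataroot` value on "@" and unpacks exactly two parts, so a value that contains no "@" produces a one-element list and the unpacking fails. The sketch below reproduces that failure and shows one tolerant way to parse such a value. The helper `parse_dataroot`, the default multiplier of 1.0, and the conversion of the multiplier to a float are assumptions made for illustration; they are not taken from tune_gpt.py.

```python
def parse_dataroot(value, default_multiplier=1.0):
    """Split 'MULTIPLIER@DIR' into (float, str). Hypothetical helper."""
    multiplier, sep, dirname = value.partition("@")
    if not sep:
        # No "@" in the value: treat the whole string as the directory
        # and fall back to the assumed default multiplier.
        return default_multiplier, value
    return float(multiplier), dirname


# Reproduce the failure from the traceback: no "@" means split() returns
# a single element, which cannot be unpacked into two names.
try:
    len_multiplier, dirname = "/content/amps/khan/".split("@")
except ValueError as exc:
    error_message = str(exc)

print(error_message)
print(parse_dataroot("/content/amps/khan/"))
print(parse_dataroot("1@/content/amps/khan/"))
```

If the script is left unchanged, the same reasoning suggests passing the value in the form MULTIPLIER@DIR, although the thread does not confirm which multiplier values the script expects.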

@hendrycks
Owner

hendrycks commented Jun 23, 2022 via email

@lazir0lufias
Author

The free Google Drive tier is only 15 GB, but AMPS is 23 GB. How can I edit the scripts so that training works on Colab?

@gcalabria

gcalabria commented Jun 29, 2023

I have the same problem here. I am running it on my own machine, not on Colab. Any ideas?

Update:
I figured out what was causing this problem. You were probably passing the argument as a quoted string, which stops the shell from expanding the wildcards. For example:

python t5_tune.py \
  --mathematica-dataroot "/home/gui/dev/t5math/data/amps/mathematica/*/*/*.txt"

However, it should be passed as a path (i.e., without the quotes), so that the shell expands the pattern:

python t5_tune.py \
  --mathematica-dataroot /home/gui/dev/t5math/data/amps/mathematica/*/*/*.txt

The problem now is that I am getting the error "zsh: argument list too long: python", which I believe happens because there are simply too many files in the corpus.
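
The shell expands an unquoted wildcard pattern into one argument per matching file, and the operating system caps the total size of the argument list, which is what produces "argument list too long" on a large corpus. One way around the limit is to pass the pattern quoted and expand it inside Python instead. The sketch below assumes a script that receives the pattern as a single string; `expand_pattern` is a hypothetical helper, not part of t5_tune.py, and whether t5_tune.py already expands patterns itself is not confirmed here.

```python
import glob
import os
import tempfile


def expand_pattern(pattern):
    """Expand a wildcard pattern inside Python. Hypothetical helper."""
    # glob.glob returns matches in arbitrary order, so sort for stability.
    return sorted(glob.glob(pattern))


# Demonstrate on a throwaway directory tree with two matching files.
with tempfile.TemporaryDirectory() as root:
    for sub in ("a", "b"):
        os.makedirs(os.path.join(root, sub, "x"))
        open(os.path.join(root, sub, "x", "1.txt"), "w").close()
    matches = expand_pattern(os.path.join(root, "*", "*", "*.txt"))
    match_count = len(matches)

print(match_count)
```

Because the expansion happens inside the Python process, the number of matching files no longer affects the length of the command line.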

@ayaka14732

I believe that dataroot is the path to the directories, i.e. /home/gui/dev/t5math/data/amps/mathematica, not a list of files.

@ayaka14732

ayaka14732 commented Aug 11, 2023

I understand the problem now. It should be

python t5_tune.py \
  --mathematica-dataroot="/home/gui/dev/t5math/data/amps/mathematica/*/*/*.txt"

in your case.
