--fp16 True question #78
Comments
Could you please provide more details about the experimental setup and the error encountered? Additionally, can you confirm if other scripts are running correctly?
Here is my script, scripts/train/custom_finetune.sh. I only changed DATA_PATH, IMAGE_PATH, and OUTPUT_PATH:
DATA_PATH="/home/ailab830/liavanlinux/dl_hw3/TinyLLaVA_Factory/dataset/text_files/output_dataformat.json"
deepspeed --include localhost:0,1 --master_port 29501 tinyllava/train/custom_finetune.py
And this is my error message:
File "/home/ailab830/liavanlinux/dl_hw3/.env/lib/python3.10/site-packages/transformers/trainer.py", line 1933, in _inner_training_loop
File "/home/ailab830/liavanlinux/dl_hw3/.env/lib/python3.10/site-packages/accelerate/accelerator.py", line 1605, in _prepare_deepspeed
File "/home/ailab830/liavanlinux/dl_hw3/.env/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1040, in _do_sanity_check
ValueError: Type fp16 is not supported.
Thanks for your help!
Hi, could you please check the versions of your packages? accelerate==0.27.2? deepspeed==0.14.0?
Yes, they are the same. I also re-downloaded everything, but I did not set it up in a conda environment.
from deepspeed.accelerator import get_accelerator
Please check whether this flag is True or False. If it is False, it seems that your combination of GPU, CUDA, and DeepSpeed/Accelerate does not support fp16. I am not sure whether their versions are compatible with each other.
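For reference, the check suggested above could be run roughly as follows. This is a minimal sketch, not the maintainers' exact snippet: it assumes the flag in question is `get_accelerator().is_fp16_supported()`, and the helper name `fp16_report` and its dictionary keys are made up for illustration. It also lists each GPU's CUDA compute capability via PyTorch, on the assumption (worth verifying against the installed DeepSpeed source) that the CUDA accelerator bases its fp16 answer on that value.

```python
def fp16_report():
    """Collect what DeepSpeed and PyTorch report about fp16 support on this machine."""
    report = {"deepspeed_fp16_supported": None, "gpus": []}

    try:
        from deepspeed.accelerator import get_accelerator

        # Assumed to be the flag discussed above. False here would be consistent
        # with the "Type fp16 is not supported." error from the sanity check.
        report["deepspeed_fp16_supported"] = bool(get_accelerator().is_fp16_supported())
    except Exception as exc:  # DeepSpeed missing, or the accelerator is unusable
        report["deepspeed_error"] = repr(exc)

    try:
        import torch

        if torch.cuda.is_available():
            for i in range(torch.cuda.device_count()):
                major, minor = torch.cuda.get_device_capability(i)
                report["gpus"].append(
                    {
                        "index": i,
                        "name": torch.cuda.get_device_name(i),
                        "capability": f"{major}.{minor}",
                    }
                )
    except Exception as exc:  # PyTorch missing, or the CUDA query failed
        report["torch_error"] = repr(exc)

    return report


if __name__ == "__main__":
    for key, value in fp16_report().items():
        print(f"{key}: {value}")
```

If the DeepSpeed flag comes back False while other projects train in fp16 on the same GPU, the difference may lie in DeepSpeed's own support check rather than in the hardware's ability to run half precision, so comparing the reported compute capability with the check in the installed DeepSpeed version is a reasonable next step.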
The flag is False.
I use custom_finetune.sh and have not changed any other parameter settings.
I encountered this error: raise ValueError("Type fp16 is not supported.") ValueError: Type fp16 is not supported.
The installation follows README.md.
However, I can enable fp16 in other projects on the same hardware.
Please give me some advice. Thank you!