Abnormal GPU memory consumption during training and export #1179

Open
tastelikefeet opened this issue Jun 19, 2024 · 1 comment
Assignees
Labels
bug Something isn't working

Comments

@tastelikefeet
Collaborator

Describe the bug

GPU memory usage is about 500 MB higher than normal.

Your hardware and system info

GPU: NVIDIA 4060 Ti 16GB

Additional context

Command run:
CUDA_VISIBLE_DEVICES=0 swift sft --model_id_or_path qwen/Qwen-7B-Chat --custom_train_dataset_path identity.json --save_steps 500 --lora_target_modules ALL --learning_rate 5e-5 --gradient_accumulation_steps 8 --batch_size 2

This command fails to run, and the export merge-lora command fails as well.

@tastelikefeet tastelikefeet self-assigned this Jun 19, 2024
@tastelikefeet tastelikefeet added the bug Something isn't working label Jun 19, 2024
@tastelikefeet
Collaborator Author

Fixed.
The cause was that device_map=auto was used during initialization; that mechanism does not estimate GPU memory accurately.
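
For readers hitting the same symptom, here is a minimal sketch of one way to avoid automatic placement on a single-GPU machine. It is not necessarily the change that was made in swift. `choose_device_map` is a hypothetical helper introduced here for illustration; the only assumption about the loader is that it accepts a `device_map` argument in the style of Hugging Face `from_pretrained`, where `{"": 0}` places the whole model on device 0.

```python
import os


def choose_device_map(visible_devices=None):
    """Hypothetical helper: pick a device_map value for a from_pretrained-style loader.

    With exactly one visible GPU, return {"": 0} so the whole model is pinned
    to that device and no automatic memory estimation is involved.
    Otherwise fall back to "auto".
    """
    if visible_devices is None:
        visible_devices = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    # Keep only non-empty device ids, e.g. "0,1" -> ["0", "1"].
    ids = [d.strip() for d in visible_devices.split(",") if d.strip()]
    if len(ids) == 1:
        # Index 0 refers to the first (and only) visible device.
        return {"": 0}
    return "auto"


# Assumed usage (not executed here, requires transformers and a GPU):
#   model = AutoModelForCausalLM.from_pretrained(model_dir,
#                                                device_map=choose_device_map())
```

With `CUDA_VISIBLE_DEVICES=0`, as in the command above, the helper returns `{"": 0}`, so the model is loaded directly onto the one available GPU.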
