Test error #240

Closed
leizhu1989 opened this issue Jun 25, 2024 · 3 comments
Comments

@leizhu1989

OS: Ubuntu 20.04
NVIDIA driver: 470.199
GPU: T4
PyTorch: 2.1.0
Transformers: 4.40

Running in a conda environment.

Added code to use the GPU:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Modified the model loading to:

model = AutoModel.from_pretrained(
    MODEL_PATH,
    trust_remote_code=True,
    device_map="auto"
).eval().to(device)  # move the model to the GPU

Inference works fine on CPU.

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 12.24it/s]
Traceback (most recent call last):
File "/home/zl/GLM-4/basic_demo/trans_cli_demo.py", line 53, in
device_map="auto").eval().to(device)
File "/home/zl/anaconda3/envs/glm4/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2692, in to
return super().to(*args, **kwargs)
File "/home/zl/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1160, in to
return self._apply(convert)
File "/home/zl/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/home/zl/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/home/zl/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/home/zl/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 833, in _apply
param_applied = fn(param)
File "/home/zl/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1158, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA error: CUDA-capable device(s) is/are busy or unavailable
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
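
(For reference, a minimal sketch of the two usual loading styles, assuming MODEL_PATH points at the GLM-4 checkpoint; the path shown here is an assumption, not taken from the demo. device_map="auto" already dispatches the weights via accelerate, so chaining .to(device) on top of it is normally unnecessary and can conflict; pick one of the two.)

from transformers import AutoModel

MODEL_PATH = "THUDM/glm-4-9b-chat"  # assumption: replace with your local checkpoint path

# Option A: let accelerate place the weights; no .to(device) afterwards,
# since device_map="auto" has already moved them onto the GPU(s).
model = AutoModel.from_pretrained(
    MODEL_PATH,
    trust_remote_code=True,
    device_map="auto",
).eval()

# Option B: load normally, then move the whole model to a single GPU yourself.
# import torch
# device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# model = AutoModel.from_pretrained(
#     MODEL_PATH,
#     trust_remote_code=True,
# ).eval().to(device)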

@zRzRzRzRzRzRzR
Collaborator

This looks like a GPU/driver problem. Judging from that driver version, the environment probably isn't set up to match, and that card likely can't handle this model anyway.

@zRzRzRzRzRzRzR zRzRzRzRzRzRzR self-assigned this Jun 25, 2024
@leizhu1989
Author

@zRzRzRzRzRzRzR My mistake, it's actually an A10 GPU. I'm not sure whether this is caused by the GPU driver version being too old.

@zRzRzRzRzRzRzR
Collaborator

zRzRzRzRzRzRzR commented Jun 25, 2024

Most likely, yes. Update the driver to 535 or later; installing CUDA 11.8 or 12+ is also recommended.
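
(A quick way to confirm the setup after upgrading; just a sketch using standard torch calls, with nvidia-smi for checking the driver version itself.)

import torch

print(torch.__version__)                  # e.g. 2.1.0
print(torch.version.cuda)                 # CUDA version this PyTorch build was compiled against
print(torch.cuda.is_available())          # should print True once the driver is sorted out
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. NVIDIA A10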
