OOM Issue #1923

Open
zhentingqi opened this issue Jun 4, 2024 · 3 comments
Comments

@zhentingqi

Hi! I am running evaluations but keep getting OOM errors. Here is my script:

TASKS="mmlu"
BATCH_SIZE=1
NUM_SHOTS=5


MODEL=Qwen/Qwen1.5-4B
API=vllm
lm_eval \
    --model ${API} \
    --model_args pretrained=${MODEL},dtype="float",gpu_memory_utilization=0.6,max_model_len=1024 \
    --tasks ${TASKS} \
    --device cuda:0 \
    --batch_size ${BATCH_SIZE} \
    --num_fewshot ${NUM_SHOTS} \
    --trust_remote_code

I am using an 80GB A100. I have already decreased gpu_memory_utilization and max_model_len, but the problem persists. I have tried Llama3-8b with the same hyperparameters and everything was fine. Can anyone please tell me why this happens and how I can solve it? Thanks!

@LSinev
Contributor

LSinev commented Jun 4, 2024

Check #1894 too.

@devzzzero

Try running nvidia-smi to see which processes are using up your gpu memory.
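
For example, a minimal check (these are standard nvidia-smi options, not anything specific to lm-evaluation-harness):

# Show overall GPU memory usage and the processes holding it
nvidia-smi
# List only compute processes with their memory usage
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv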

@johnwee1
Contributor

Not sure why, but my VRAM usage does not suggest that gpu_memory_utilization is being respected. I also have issues running a 7B model on an A40 (40GB VRAM), despite having no issues with the HF backend (~28GB VRAM used there). Specifically, it OOMs even before running any requests. I have verified that nothing else is running on the GPU.

However, passing enforce_eager=True in model_args to the vLLM backend fixes this issue for me by preventing CUDA graphs from being built.
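
For illustration, a minimal sketch of how that option can be appended to the comma-separated model_args from the script in the original post (same model and values as above; only enforce_eager=True is new):

MODEL=Qwen/Qwen1.5-4B
lm_eval \
    --model vllm \
    --model_args pretrained=${MODEL},dtype="float",gpu_memory_utilization=0.6,max_model_len=1024,enforce_eager=True \
    --tasks mmlu \
    --device cuda:0 \
    --batch_size 1 \
    --num_fewshot 5 \
    --trust_remote_code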
