No CUDA drivers in Azure A10 #3651

Open
WesleyYue opened this issue Jun 8, 2024 · 3 comments

@WesleyYue

Bug

  • Follow this guide
  • Expected: the vLLM server comes up. Actual: the job fails because no CUDA drivers are found.
  • I looked through the SkyPilot code, but the image-selection logic is hard to follow. From what I can tell, SkyPilot (correctly) picks the ubuntu-hpc 22.04 image for a Gen 2 instance.

Screenshot 2024-06-07 at 5 03 23 PM
Screenshot 2024-06-07 at 5 03 18 PM

To Reproduce

  1. Run sky launch -c qwen skypilot.yaml --cloud azure --region westus3
  2. Observe that the launch fails with errors about CUDA drivers not being found
  3. Confirm that the CUDA drivers are indeed missing by running nvidia-smi on the cluster (ssh qwen, then nvidia-smi)
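
The check in step 3 can be made a little more specific. The sketch below is only an illustration: the function name `driver_status` and its three status strings are made up for this example and are not part of SkyPilot or the NVIDIA tooling. It distinguishes a tool that is absent from PATH from one that is installed but exits with an error, which usually means the binary is present but cannot talk to the driver.

```shell
# Classify the state of a GPU driver CLI (hypothetical helper, not a SkyPilot API).
# Prints one of: missing, broken, ok.
driver_status() {
  local tool="${1:-nvidia-smi}"
  # `command -v` succeeds only if the tool can be found by the shell.
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "missing"
    return 0
  fi
  # The tool exists; a non-zero exit status suggests it could not reach the driver.
  if "$tool" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "broken"
  fi
}

# Example: run this on the cluster after `ssh qwen`.
echo "nvidia-smi: $(driver_status nvidia-smi)"
```

The same function could be pasted at the top of the `setup` section so that the task stops before the multi-gigabyte pip downloads when the result is not `ok`; whether that is desirable depends on how the image is expected to provide the driver.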

skypilot.yaml (modified from qwen-7b.yaml, with extra logging statements and restricted to A10 only)

envs:
  MODEL_NAME: Qwen/Qwen1.5-7B-Chat

service:
  # Specifying the path to the endpoint to check the readiness of the replicas.
  readiness_probe:
    path: /v1/chat/completions
    post_data:
      model: $MODEL_NAME
      messages:
        - role: user
          content: Hello! What is your name?
      max_tokens: 1
    initial_delay_seconds: 1200
  # How many replicas to manage.
  replicas: 1

resources:
  # accelerators: { L4, A10g, A10, L40, A40, A100:1, A100-80GB:1 }
  accelerators: { A10 }
  disk_tier: best
  ports: 8000

setup: |
  echo "[skypilot.yaml] Activating conda environment 'qwen'"
  conda activate qwen
  if [ $? -ne 0 ]; then
    echo "[skypilot.yaml] Creating new conda environment 'qwen' with Python 3.10"
    conda create -n qwen python=3.10 -y
    conda activate qwen
  fi
  echo "[skypilot.yaml] Installing required packages..."
  pip install -U vllm==0.3.2
  pip install -U transformers==4.38.0
  echo "[skypilot.yaml] Done installing packages."

run: |
  echo "[skypilot.yaml] Listing available conda environments:"
  conda env list
  echo "[skypilot.yaml] Activating conda environment 'qwen'"
  conda activate qwen
  echo "[skypilot.yaml] Listing available conda environments:"
  conda env list
  echo "[skypilot.yaml] Listing installed packages:"
  pip list
  echo "[skypilot.yaml] Setting PATH to include /sbin"
  export PATH=$PATH:/sbin
  echo "[skypilot.yaml] Starting vllm OpenAI API server with the following configuration:"
  echo "[skypilot.yaml]   - Host: 0.0.0.0"
  echo "[skypilot.yaml]   - Model: $MODEL_NAME"
  echo "[skypilot.yaml]   - Tensor Parallel Size: $SKYPILOT_NUM_GPUS_PER_NODE"
  echo "[skypilot.yaml]   - Maximum Model Length: 1024"
  python -m vllm.entrypoints.openai.api_server \
    --host 0.0.0.0 \
    --model $MODEL_NAME \
    --tensor-parallel-size $SKYPILOT_NUM_GPUS_PER_NODE \
    --max-model-len 1024 | tee ~/openai_api_server.log

Version & Commit info:

  • sky -v: skypilot, version 1.0.0.dev20240607
  • sky -c: skypilot, commit 26d902d7e47900bb6b6c897f6fda79047b35df35
@WesleyYue
Author

Full logs:

Task from YAML spec: x.yaml
W 06-07 16:53:05 aws_catalog.py:173] Failed to fetch availability zone mapping. ImportError: Failed to import dependencies for AWS. Try pip install "skypilot[aws]"
W 06-07 16:53:06 aws_catalog.py:173] Failed to fetch availability zone mapping. ImportError: Failed to import dependencies for AWS. Try pip install "skypilot[aws]"
I 06-07 16:53:06 cli.py:1112] Service section will be ignored when using `sky launch`. 
I 06-07 16:53:06 cli.py:1112] To spin up a service, use SkyServe CLI: sky serve up
I 06-07 16:53:06 optimizer.py:1264] No resource satisfying Azure({'A100': 1}, disk_tier=best, ports=['8000'], region=westus3) on Azure.
I 06-07 16:53:06 optimizer.py:1268] Did you mean: ['A100-80GB:1', 'A100-80GB:2', 'A100-80GB:4']
I 06-07 16:53:06 optimizer.py:1264] No resource satisfying Azure({'L4': 1}, disk_tier=best, ports=['8000'], region=westus3) on Azure.
I 06-07 16:53:06 optimizer.py:1268] Did you mean: ['A100-80GB:1', 'A100-80GB:2', 'A100-80GB:4']
I 06-07 16:53:06 optimizer.py:1264] No resource satisfying Azure({'A10G': 1}, disk_tier=best, ports=['8000'], region=westus3) on Azure.
I 06-07 16:53:06 optimizer.py:1268] Did you mean: ['A100-80GB:1', 'A100-80GB:2', 'A100-80GB:4']
I 06-07 16:53:06 optimizer.py:1264] No resource satisfying Azure({'A40': 1}, disk_tier=best, ports=['8000'], region=westus3) on Azure.
I 06-07 16:53:06 optimizer.py:1268] Did you mean: ['A100-80GB:1', 'A100-80GB:2', 'A100-80GB:4']
I 06-07 16:53:06 optimizer.py:1264] No resource satisfying Azure({'L40': 1}, disk_tier=best, ports=['8000'], region=westus3) on Azure.
I 06-07 16:53:06 optimizer.py:1268] Did you mean: ['A100-80GB:1', 'A100-80GB:2', 'A100-80GB:4']
I 06-07 16:53:06 optimizer.py:695] == Optimizer ==
I 06-07 16:53:06 optimizer.py:706] Target: minimizing cost
I 06-07 16:53:06 optimizer.py:718] Estimated cost: $0.5 / hour
I 06-07 16:53:06 optimizer.py:718] 
I 06-07 16:53:06 optimizer.py:843] Considered resources (1 node):
I 06-07 16:53:06 optimizer.py:913] -------------------------------------------------------------------------------------------------------
I 06-07 16:53:06 optimizer.py:913]  CLOUD   INSTANCE                   vCPUs   Mem(GB)   ACCELERATORS   REGION/ZONE   COST ($)   CHOSEN   
I 06-07 16:53:06 optimizer.py:913] -------------------------------------------------------------------------------------------------------
I 06-07 16:53:06 optimizer.py:913]  Azure   Standard_NV6ads_A10_v5     6       55        A10:1          westus3       0.45          ✔     
I 06-07 16:53:06 optimizer.py:913]  Azure   Standard_NC24ads_A100_v4   24      220       A100-80GB:1    westus3       3.67                
I 06-07 16:53:06 optimizer.py:913] -------------------------------------------------------------------------------------------------------
I 06-07 16:53:06 optimizer.py:913] 
I 06-07 16:53:06 optimizer.py:931] Multiple Azure instances satisfy A10:1. The cheapest Azure(Standard_NV6ads_A10_v5, {'A10': 1}, disk_tier=best, ports=['8000']) is considered among:
I 06-07 16:53:06 optimizer.py:931] ['Standard_NV6ads_A10_v5', 'Standard_NV12ads_A10_v5', 'Standard_NV18ads_A10_v5', 'Standard_NV36ads_A10_v5', 'Standard_NV36adms_A10_v5'].
I 06-07 16:53:06 optimizer.py:931] 
I 06-07 16:53:06 optimizer.py:937] To list more details, run 'sky show-gpus A10'.
Launching a new cluster 'qwen'. Proceed? [Y/n]: y
I 06-07 16:53:10 cloud_vm_ray_backend.py:4397] Creating a new cluster: 'qwen' [1x Azure(Standard_NV6ads_A10_v5, {'A10': 1}, disk_tier=best, ports=['8000'])].
I 06-07 16:53:10 cloud_vm_ray_backend.py:4397] Tip: to reuse an existing cluster, specify --cluster (-c). Run `sky status` to see existing clusters.
I 06-07 16:53:13 cloud_vm_ray_backend.py:1385] To view detailed progress: tail -n100 -f /Users/wesley/sky_logs/sky-2024-06-07-16-53-06-502155/provision.log
I 06-07 16:53:13 cloud_vm_ray_backend.py:1779] Launching on Azure westus3
I 06-07 16:57:13 log_utils.py:45] Head node is up.
I 06-07 17:06:59 cloud_vm_ray_backend.py:1627] Successfully provisioned or found existing VM.
I 06-07 17:07:03 cloud_vm_ray_backend.py:3215] Running setup on 1 node.
[skypilot.yaml] Activating conda environment 'qwen'

EnvironmentNameNotFound: Could not find conda environment: qwen
You can list all discoverable environments with `conda info --envs`.


[skypilot.yaml] Creating new conda environment 'qwen' with Python 3.10
Channels:
 - defaults
Platform: linux-64
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /home/azureuser/miniconda3/envs/qwen

  added / updated specs:
    - python=3.10


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    bzip2-1.0.8                |       h5eee18b_6         262 KB
    ca-certificates-2024.3.11  |       h06a4308_0         127 KB
    libffi-3.4.4               |       h6a678d5_1         141 KB
    openssl-3.0.13             |       h7f8727e_2         5.2 MB
    pip-24.0                   |  py310h06a4308_0         2.7 MB
    python-3.10.14             |       h955ad1f_1        26.8 MB
    setuptools-69.5.1          |  py310h06a4308_0        1012 KB
    sqlite-3.45.3              |       h5eee18b_0         1.2 MB
    tk-8.6.14                  |       h39e8969_0         3.4 MB
    tzdata-2024a               |       h04d1e81_0         116 KB
    wheel-0.43.0               |  py310h06a4308_0         110 KB
    xz-5.4.6                   |       h5eee18b_1         643 KB
    zlib-1.2.13                |       h5eee18b_1         111 KB
    ------------------------------------------------------------
                                           Total:        41.8 MB

The following NEW packages will be INSTALLED:

  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main 
  _openmp_mutex      pkgs/main/linux-64::_openmp_mutex-5.1-1_gnu 
  bzip2              pkgs/main/linux-64::bzip2-1.0.8-h5eee18b_6 
  ca-certificates    pkgs/main/linux-64::ca-certificates-2024.3.11-h06a4308_0 
  ld_impl_linux-64   pkgs/main/linux-64::ld_impl_linux-64-2.38-h1181459_1 
  libffi             pkgs/main/linux-64::libffi-3.4.4-h6a678d5_1 
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-11.2.0-h1234567_1 
  libgomp            pkgs/main/linux-64::libgomp-11.2.0-h1234567_1 
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-11.2.0-h1234567_1 
  libuuid            pkgs/main/linux-64::libuuid-1.41.5-h5eee18b_0 
  ncurses            pkgs/main/linux-64::ncurses-6.4-h6a678d5_0 
  openssl            pkgs/main/linux-64::openssl-3.0.13-h7f8727e_2 
  pip                pkgs/main/linux-64::pip-24.0-py310h06a4308_0 
  python             pkgs/main/linux-64::python-3.10.14-h955ad1f_1 
  readline           pkgs/main/linux-64::readline-8.2-h5eee18b_0 
  setuptools         pkgs/main/linux-64::setuptools-69.5.1-py310h06a4308_0 
  sqlite             pkgs/main/linux-64::sqlite-3.45.3-h5eee18b_0 
  tk                 pkgs/main/linux-64::tk-8.6.14-h39e8969_0 
  tzdata             pkgs/main/noarch::tzdata-2024a-h04d1e81_0 
  wheel              pkgs/main/linux-64::wheel-0.43.0-py310h06a4308_0 
  xz                 pkgs/main/linux-64::xz-5.4.6-h5eee18b_1 
  zlib               pkgs/main/linux-64::zlib-1.2.13-h5eee18b_1 



Downloading and Extracting Packages: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
#
# To activate this environment, use
#
#     $ conda activate qwen
#
# To deactivate an active environment, use
#
#     $ conda deactivate

[skypilot.yaml] Installing required packages...
Collecting vllm==0.3.2
  Downloading vllm-0.3.2-cp310-cp310-manylinux1_x86_64.whl.metadata (7.5 kB)
Collecting ninja (from vllm==0.3.2)
  Downloading ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl.metadata (5.3 kB)
Collecting psutil (from vllm==0.3.2)
  Downloading psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (21 kB)
Collecting ray>=2.9 (from vllm==0.3.2)
  Downloading ray-2.24.0-cp310-cp310-manylinux2014_x86_64.whl.metadata (13 kB)
Collecting sentencepiece (from vllm==0.3.2)
  Downloading sentencepiece-0.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.7 kB)
Collecting numpy (from vllm==0.3.2)
  Downloading numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.0/61.0 kB 3.3 MB/s eta 0:00:00
Collecting torch==2.1.2 (from vllm==0.3.2)
  Downloading torch-2.1.2-cp310-cp310-manylinux1_x86_64.whl.metadata (25 kB)
Collecting transformers>=4.38.0 (from vllm==0.3.2)
  Downloading transformers-4.41.2-py3-none-any.whl.metadata (43 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 43.8/43.8 kB 3.8 MB/s eta 0:00:00
Collecting xformers==0.0.23.post1 (from vllm==0.3.2)
  Downloading xformers-0.0.23.post1-cp310-cp310-manylinux2014_x86_64.whl.metadata (1.0 kB)
Collecting fastapi (from vllm==0.3.2)
  Downloading fastapi-0.111.0-py3-none-any.whl.metadata (25 kB)
Collecting uvicorn[standard] (from vllm==0.3.2)
  Downloading uvicorn-0.30.1-py3-none-any.whl.metadata (6.3 kB)
Collecting pydantic>=2.0 (from vllm==0.3.2)
  Downloading pydantic-2.7.3-py3-none-any.whl.metadata (108 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 109.0/109.0 kB 5.9 MB/s eta 0:00:00
Collecting aioprometheus[starlette] (from vllm==0.3.2)
  Downloading aioprometheus-23.12.0-py3-none-any.whl.metadata (9.8 kB)
Collecting pynvml==11.5.0 (from vllm==0.3.2)
  Downloading pynvml-11.5.0-py3-none-any.whl.metadata (7.8 kB)
Collecting triton>=2.1.0 (from vllm==0.3.2)
  Downloading triton-2.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.4 kB)
Collecting cupy-cuda12x==12.1.0 (from vllm==0.3.2)
  Downloading cupy_cuda12x-12.1.0-cp310-cp310-manylinux2014_x86_64.whl.metadata (2.6 kB)
Collecting fastrlock>=0.5 (from cupy-cuda12x==12.1.0->vllm==0.3.2)
  Downloading fastrlock-0.8.2-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_28_x86_64.whl.metadata (9.3 kB)
Collecting filelock (from torch==2.1.2->vllm==0.3.2)
  Downloading filelock-3.14.0-py3-none-any.whl.metadata (2.8 kB)
Collecting typing-extensions (from torch==2.1.2->vllm==0.3.2)
  Downloading typing_extensions-4.12.2-py3-none-any.whl.metadata (3.0 kB)
Collecting sympy (from torch==2.1.2->vllm==0.3.2)
  Downloading sympy-1.12.1-py3-none-any.whl.metadata (12 kB)
Collecting networkx (from torch==2.1.2->vllm==0.3.2)
  Downloading networkx-3.3-py3-none-any.whl.metadata (5.1 kB)
Collecting jinja2 (from torch==2.1.2->vllm==0.3.2)
  Downloading jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
Collecting fsspec (from torch==2.1.2->vllm==0.3.2)
  Downloading fsspec-2024.6.0-py3-none-any.whl.metadata (11 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch==2.1.2->vllm==0.3.2)
  Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch==2.1.2->vllm==0.3.2)
  Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch==2.1.2->vllm==0.3.2)
  Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch==2.1.2->vllm==0.3.2)
  Downloading nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch==2.1.2->vllm==0.3.2)
  Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch==2.1.2->vllm==0.3.2)
  Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-curand-cu12==10.3.2.106 (from torch==2.1.2->vllm==0.3.2)
  Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch==2.1.2->vllm==0.3.2)
  Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch==2.1.2->vllm==0.3.2)
  Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-nccl-cu12==2.18.1 (from torch==2.1.2->vllm==0.3.2)
  Downloading nvidia_nccl_cu12-2.18.1-py3-none-manylinux1_x86_64.whl.metadata (1.8 kB)
Collecting nvidia-nvtx-cu12==12.1.105 (from torch==2.1.2->vllm==0.3.2)
  Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.7 kB)
Collecting triton>=2.1.0 (from vllm==0.3.2)
  Downloading triton-2.1.0-0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.3 kB)
Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch==2.1.2->vllm==0.3.2)
  Downloading nvidia_nvjitlink_cu12-12.5.40-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting annotated-types>=0.4.0 (from pydantic>=2.0->vllm==0.3.2)
  Downloading annotated_types-0.7.0-py3-none-any.whl.metadata (15 kB)
Collecting pydantic-core==2.18.4 (from pydantic>=2.0->vllm==0.3.2)
  Downloading pydantic_core-2.18.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.5 kB)
Collecting click>=7.0 (from ray>=2.9->vllm==0.3.2)
  Downloading click-8.1.7-py3-none-any.whl.metadata (3.0 kB)
Collecting jsonschema (from ray>=2.9->vllm==0.3.2)
  Downloading jsonschema-4.22.0-py3-none-any.whl.metadata (8.2 kB)
Collecting msgpack<2.0.0,>=1.0.0 (from ray>=2.9->vllm==0.3.2)
  Downloading msgpack-1.0.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.1 kB)
Collecting packaging (from ray>=2.9->vllm==0.3.2)
  Downloading packaging-24.0-py3-none-any.whl.metadata (3.2 kB)
Collecting protobuf!=3.19.5,>=3.15.3 (from ray>=2.9->vllm==0.3.2)
  Downloading protobuf-5.27.1-cp38-abi3-manylinux2014_x86_64.whl.metadata (592 bytes)
Collecting pyyaml (from ray>=2.9->vllm==0.3.2)
  Downloading PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
Collecting aiosignal (from ray>=2.9->vllm==0.3.2)
  Downloading aiosignal-1.3.1-py3-none-any.whl.metadata (4.0 kB)
Collecting frozenlist (from ray>=2.9->vllm==0.3.2)
  Downloading frozenlist-1.4.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
Collecting requests (from ray>=2.9->vllm==0.3.2)
  Downloading requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Collecting huggingface-hub<1.0,>=0.23.0 (from transformers>=4.38.0->vllm==0.3.2)
  Downloading huggingface_hub-0.23.3-py3-none-any.whl.metadata (12 kB)
Collecting regex!=2019.12.17 (from transformers>=4.38.0->vllm==0.3.2)
  Downloading regex-2024.5.15-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (40 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.9/40.9 kB 2.7 MB/s eta 0:00:00
Collecting tokenizers<0.20,>=0.19 (from transformers>=4.38.0->vllm==0.3.2)
  Downloading tokenizers-0.19.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Collecting safetensors>=0.4.1 (from transformers>=4.38.0->vllm==0.3.2)
  Downloading safetensors-0.4.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.8 kB)
Collecting tqdm>=4.27 (from transformers>=4.38.0->vllm==0.3.2)
  Downloading tqdm-4.66.4-py3-none-any.whl.metadata (57 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 57.6/57.6 kB 5.6 MB/s eta 0:00:00
Collecting orjson (from aioprometheus[starlette]->vllm==0.3.2)
  Downloading orjson-3.10.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (49 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 49.7/49.7 kB 4.7 MB/s eta 0:00:00
Collecting quantile-python>=1.1 (from aioprometheus[starlette]->vllm==0.3.2)
  Downloading quantile-python-1.1.tar.gz (2.9 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting starlette>=0.14.2 (from aioprometheus[starlette]->vllm==0.3.2)
  Downloading starlette-0.37.2-py3-none-any.whl.metadata (5.9 kB)
Collecting fastapi-cli>=0.0.2 (from fastapi->vllm==0.3.2)
  Downloading fastapi_cli-0.0.4-py3-none-any.whl.metadata (7.0 kB)
Collecting httpx>=0.23.0 (from fastapi->vllm==0.3.2)
  Downloading httpx-0.27.0-py3-none-any.whl.metadata (7.2 kB)
Collecting python-multipart>=0.0.7 (from fastapi->vllm==0.3.2)
  Downloading python_multipart-0.0.9-py3-none-any.whl.metadata (2.5 kB)
Collecting ujson!=4.0.2,!=4.1.0,!=4.2.0,!=4.3.0,!=5.0.0,!=5.1.0,>=4.0.1 (from fastapi->vllm==0.3.2)
  Downloading ujson-5.10.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.3 kB)
Collecting email_validator>=2.0.0 (from fastapi->vllm==0.3.2)
  Downloading email_validator-2.1.1-py3-none-any.whl.metadata (26 kB)
Collecting h11>=0.8 (from uvicorn[standard]->vllm==0.3.2)
  Downloading h11-0.14.0-py3-none-any.whl.metadata (8.2 kB)
Collecting httptools>=0.5.0 (from uvicorn[standard]->vllm==0.3.2)
  Downloading httptools-0.6.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.6 kB)
Collecting python-dotenv>=0.13 (from uvicorn[standard]->vllm==0.3.2)
  Downloading python_dotenv-1.0.1-py3-none-any.whl.metadata (23 kB)
Collecting uvloop!=0.15.0,!=0.15.1,>=0.14.0 (from uvicorn[standard]->vllm==0.3.2)
  Downloading uvloop-0.19.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.9 kB)
Collecting watchfiles>=0.13 (from uvicorn[standard]->vllm==0.3.2)
  Downloading watchfiles-0.22.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.9 kB)
Collecting websockets>=10.4 (from uvicorn[standard]->vllm==0.3.2)
  Downloading websockets-12.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.6 kB)
Collecting dnspython>=2.0.0 (from email_validator>=2.0.0->fastapi->vllm==0.3.2)
  Downloading dnspython-2.6.1-py3-none-any.whl.metadata (5.8 kB)
Collecting idna>=2.0.0 (from email_validator>=2.0.0->fastapi->vllm==0.3.2)
  Downloading idna-3.7-py3-none-any.whl.metadata (9.9 kB)
Collecting typer>=0.12.3 (from fastapi-cli>=0.0.2->fastapi->vllm==0.3.2)
  Downloading typer-0.12.3-py3-none-any.whl.metadata (15 kB)
Collecting anyio (from httpx>=0.23.0->fastapi->vllm==0.3.2)
  Downloading anyio-4.4.0-py3-none-any.whl.metadata (4.6 kB)
Collecting certifi (from httpx>=0.23.0->fastapi->vllm==0.3.2)
  Downloading certifi-2024.6.2-py3-none-any.whl.metadata (2.2 kB)
Collecting httpcore==1.* (from httpx>=0.23.0->fastapi->vllm==0.3.2)
  Downloading httpcore-1.0.5-py3-none-any.whl.metadata (20 kB)
Collecting sniffio (from httpx>=0.23.0->fastapi->vllm==0.3.2)
  Downloading sniffio-1.3.1-py3-none-any.whl.metadata (3.9 kB)
Collecting MarkupSafe>=2.0 (from jinja2->torch==2.1.2->vllm==0.3.2)
  Downloading MarkupSafe-2.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.0 kB)
Collecting attrs>=22.2.0 (from jsonschema->ray>=2.9->vllm==0.3.2)
  Downloading attrs-23.2.0-py3-none-any.whl.metadata (9.5 kB)
Collecting jsonschema-specifications>=2023.03.6 (from jsonschema->ray>=2.9->vllm==0.3.2)
  Downloading jsonschema_specifications-2023.12.1-py3-none-any.whl.metadata (3.0 kB)
Collecting referencing>=0.28.4 (from jsonschema->ray>=2.9->vllm==0.3.2)
  Downloading referencing-0.35.1-py3-none-any.whl.metadata (2.8 kB)
Collecting rpds-py>=0.7.1 (from jsonschema->ray>=2.9->vllm==0.3.2)
  Downloading rpds_py-0.18.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.1 kB)
Collecting charset-normalizer<4,>=2 (from requests->ray>=2.9->vllm==0.3.2)
  Downloading charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (33 kB)
Collecting urllib3<3,>=1.21.1 (from requests->ray>=2.9->vllm==0.3.2)
  Downloading urllib3-2.2.1-py3-none-any.whl.metadata (6.4 kB)
Collecting mpmath<1.4.0,>=1.1.0 (from sympy->torch==2.1.2->vllm==0.3.2)
  Downloading mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB)
Collecting exceptiongroup>=1.0.2 (from anyio->httpx>=0.23.0->fastapi->vllm==0.3.2)
  Downloading exceptiongroup-1.2.1-py3-none-any.whl.metadata (6.6 kB)
Collecting shellingham>=1.3.0 (from typer>=0.12.3->fastapi-cli>=0.0.2->fastapi->vllm==0.3.2)
  Downloading shellingham-1.5.4-py2.py3-none-any.whl.metadata (3.5 kB)
Collecting rich>=10.11.0 (from typer>=0.12.3->fastapi-cli>=0.0.2->fastapi->vllm==0.3.2)
  Downloading rich-13.7.1-py3-none-any.whl.metadata (18 kB)
Collecting markdown-it-py>=2.2.0 (from rich>=10.11.0->typer>=0.12.3->fastapi-cli>=0.0.2->fastapi->vllm==0.3.2)
  Downloading markdown_it_py-3.0.0-py3-none-any.whl.metadata (6.9 kB)
Collecting pygments<3.0.0,>=2.13.0 (from rich>=10.11.0->typer>=0.12.3->fastapi-cli>=0.0.2->fastapi->vllm==0.3.2)
  Downloading pygments-2.18.0-py3-none-any.whl.metadata (2.5 kB)
Collecting mdurl~=0.1 (from markdown-it-py>=2.2.0->rich>=10.11.0->typer>=0.12.3->fastapi-cli>=0.0.2->fastapi->vllm==0.3.2)
  Downloading mdurl-0.1.2-py3-none-any.whl.metadata (1.6 kB)
Downloading vllm-0.3.2-cp310-cp310-manylinux1_x86_64.whl (41.4 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 41.4/41.4 MB 16.2 MB/s eta 0:00:00
Downloading cupy_cuda12x-12.1.0-cp310-cp310-manylinux2014_x86_64.whl (83.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 83.0/83.0 MB 8.2 MB/s eta 0:00:00
Downloading pynvml-11.5.0-py3-none-any.whl (53 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.1/53.1 kB 3.7 MB/s eta 0:00:00
Downloading torch-2.1.2-cp310-cp310-manylinux1_x86_64.whl (670.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 670.2/670.2 MB 1.4 MB/s eta 0:00:00
Downloading triton-2.1.0-0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 89.2/89.2 MB 7.9 MB/s eta 0:00:00
Downloading xformers-0.0.23.post1-cp310-cp310-manylinux2014_x86_64.whl (213.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 213.0/213.0 MB 4.1 MB/s eta 0:00:00
Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 2.3 MB/s eta 0:00:00
Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 10.0 MB/s eta 0:00:00
Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 4.9 MB/s eta 0:00:00
Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 60.0 MB/s eta 0:00:00
Downloading nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 943.0 kB/s eta 0:00:00
Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 1.1 MB/s eta 0:00:00
Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 2.4 MB/s eta 0:00:00
Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 4.9 MB/s eta 0:00:00
Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 1.1 MB/s eta 0:00:00
Downloading nvidia_nccl_cu12-2.18.1-py3-none-manylinux1_x86_64.whl (209.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 209.8/209.8 MB 4.2 MB/s eta 0:00:00
Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 kB 8.1 MB/s eta 0:00:00
Downloading numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 39.6 MB/s eta 0:00:00
Downloading pydantic-2.7.3-py3-none-any.whl (409 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 409.6/409.6 kB 36.3 MB/s eta 0:00:00
Downloading pydantic_core-2.18.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 69.9 MB/s eta 0:00:00
Downloading ray-2.24.0-cp310-cp310-manylinux2014_x86_64.whl (65.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.9/65.9 MB 10.0 MB/s eta 0:00:00
Downloading transformers-4.41.2-py3-none-any.whl (9.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.1/9.1 MB 109.7 MB/s eta 0:00:00
Downloading fastapi-0.111.0-py3-none-any.whl (91 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 92.0/92.0 kB 9.8 MB/s eta 0:00:00
Downloading ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl (307 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 307.2/307.2 kB 34.0 MB/s eta 0:00:00
Downloading psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (288 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 288.2/288.2 kB 26.0 MB/s eta 0:00:00
Downloading sentencepiece-0.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 58.0 MB/s eta 0:00:00
Downloading annotated_types-0.7.0-py3-none-any.whl (13 kB)
Downloading click-8.1.7-py3-none-any.whl (97 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.9/97.9 kB 10.3 MB/s eta 0:00:00
Downloading email_validator-2.1.1-py3-none-any.whl (30 kB)
Downloading fastapi_cli-0.0.4-py3-none-any.whl (9.5 kB)
Downloading fastrlock-0.8.2-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_28_x86_64.whl (51 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 51.3/51.3 kB 4.4 MB/s eta 0:00:00
Downloading h11-0.14.0-py3-none-any.whl (58 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.3/58.3 kB 5.9 MB/s eta 0:00:00
Downloading httptools-0.6.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (341 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 341.4/341.4 kB 33.6 MB/s eta 0:00:00
Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 75.6/75.6 kB 8.2 MB/s eta 0:00:00
Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 77.9/77.9 kB 9.0 MB/s eta 0:00:00
Downloading huggingface_hub-0.23.3-py3-none-any.whl (401 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 401.7/401.7 kB 37.1 MB/s eta 0:00:00
Downloading fsspec-2024.6.0-py3-none-any.whl (176 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 176.9/176.9 kB 19.4 MB/s eta 0:00:00
Downloading jinja2-3.1.4-py3-none-any.whl (133 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.3/133.3 kB 14.1 MB/s eta 0:00:00
Downloading msgpack-1.0.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (385 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 385.1/385.1 kB 34.0 MB/s eta 0:00:00
Downloading orjson-3.10.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (142 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 142.5/142.5 kB 14.9 MB/s eta 0:00:00
Downloading packaging-24.0-py3-none-any.whl (53 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.5/53.5 kB 5.3 MB/s eta 0:00:00
Downloading protobuf-5.27.1-cp38-abi3-manylinux2014_x86_64.whl (309 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 309.2/309.2 kB 32.3 MB/s eta 0:00:00
Downloading python_dotenv-1.0.1-py3-none-any.whl (19 kB)
Downloading python_multipart-0.0.9-py3-none-any.whl (22 kB)
Downloading PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (705 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 705.5/705.5 kB 25.2 MB/s eta 0:00:00
Downloading regex-2024.5.15-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (775 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 775.1/775.1 kB 58.5 MB/s eta 0:00:00
Downloading safetensors-0.4.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 85.8 MB/s eta 0:00:00
Downloading starlette-0.37.2-py3-none-any.whl (71 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 71.9/71.9 kB 7.5 MB/s eta 0:00:00
Downloading tokenizers-0.19.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.6/3.6 MB 120.6 MB/s eta 0:00:00
Downloading tqdm-4.66.4-py3-none-any.whl (78 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.3/78.3 kB 7.3 MB/s eta 0:00:00
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Downloading ujson-5.10.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (53 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.6/53.6 kB 5.6 MB/s eta 0:00:00
Downloading uvicorn-0.30.1-py3-none-any.whl (62 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.4/62.4 kB 6.7 MB/s eta 0:00:00
Downloading uvloop-0.19.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.4 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.4/3.4 MB 117.1 MB/s eta 0:00:00
Downloading watchfiles-0.22.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 87.2 MB/s eta 0:00:00
Downloading websockets-12.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (130 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 130.2/130.2 kB 12.7 MB/s eta 0:00:00
Downloading aioprometheus-23.12.0-py3-none-any.whl (31 kB)
Downloading aiosignal-1.3.1-py3-none-any.whl (7.6 kB)
Downloading frozenlist-1.4.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (239 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 239.5/239.5 kB 26.1 MB/s eta 0:00:00
Downloading filelock-3.14.0-py3-none-any.whl (12 kB)
Downloading jsonschema-4.22.0-py3-none-any.whl (88 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 88.3/88.3 kB 9.6 MB/s eta 0:00:00
Downloading networkx-3.3-py3-none-any.whl (1.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 66.2 MB/s eta 0:00:00
Downloading requests-2.32.3-py3-none-any.whl (64 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.9/64.9 kB 7.1 MB/s eta 0:00:00
Downloading sympy-1.12.1-py3-none-any.whl (5.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.7/5.7 MB 108.7 MB/s eta 0:00:00
Downloading anyio-4.4.0-py3-none-any.whl (86 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 86.8/86.8 kB 8.5 MB/s eta 0:00:00
Downloading attrs-23.2.0-py3-none-any.whl (60 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.8/60.8 kB 6.8 MB/s eta 0:00:00
Downloading certifi-2024.6.2-py3-none-any.whl (164 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 164.4/164.4 kB 17.5 MB/s eta 0:00:00
Downloading charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (142 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 142.1/142.1 kB 15.0 MB/s eta 0:00:00
Downloading dnspython-2.6.1-py3-none-any.whl (307 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 307.7/307.7 kB 30.3 MB/s eta 0:00:00
Downloading idna-3.7-py3-none-any.whl (66 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.8/66.8 kB 6.3 MB/s eta 0:00:00
Downloading jsonschema_specifications-2023.12.1-py3-none-any.whl (18 kB)
Downloading MarkupSafe-2.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 49.7 MB/s eta 0:00:00
Downloading referencing-0.35.1-py3-none-any.whl (26 kB)
Downloading rpds_py-0.18.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 80.7 MB/s eta 0:00:00
Downloading sniffio-1.3.1-py3-none-any.whl (10 kB)
Downloading typer-0.12.3-py3-none-any.whl (47 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 47.2/47.2 kB 4.8 MB/s eta 0:00:00
Downloading urllib3-2.2.1-py3-none-any.whl (121 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.1/121.1 kB 12.6 MB/s eta 0:00:00
Downloading nvidia_nvjitlink_cu12-12.5.40-py3-none-manylinux2014_x86_64.whl (21.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.3/21.3 MB 29.4 MB/s eta 0:00:00
Downloading exceptiongroup-1.2.1-py3-none-any.whl (16 kB)
Downloading rich-13.7.1-py3-none-any.whl (240 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 240.7/240.7 kB 24.6 MB/s eta 0:00:00
Downloading shellingham-1.5.4-py2.py3-none-any.whl (9.8 kB)
Downloading markdown_it_py-3.0.0-py3-none-any.whl (87 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 87.5/87.5 kB 9.1 MB/s eta 0:00:00
Downloading pygments-2.18.0-py3-none-any.whl (1.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 83.0 MB/s eta 0:00:00
Downloading mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Building wheels for collected packages: quantile-python
  Building wheel for quantile-python (setup.py): started
  Building wheel for quantile-python (setup.py): finished with status 'done'
  Created wheel for quantile-python: filename=quantile_python-1.1-py3-none-any.whl size=3443 sha256=8709fab3c63a2c2ac773179e20d77a3cf88eec45ddd05b8555fceb93a5c07052
  Stored in directory: /home/azureuser/.cache/pip/wheels/6d/f4/0a/0e7d01548a005f9f3fa23101f071d248da052f2a9bf2fe11c6
Successfully built quantile-python
Installing collected packages: sentencepiece, quantile-python, ninja, mpmath, fastrlock, websockets, uvloop, urllib3, ujson, typing-extensions, tqdm, sympy, sniffio, shellingham, safetensors, rpds-py, regex, pyyaml, python-multipart, python-dotenv, pynvml, pygments, psutil, protobuf, packaging, orjson, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, msgpack, mdurl, MarkupSafe, idna, httptools, h11, fsspec, frozenlist, filelock, exceptiongroup, dnspython, click, charset-normalizer, certifi, attrs, annotated-types, uvicorn, triton, requests, referencing, pydantic-core, nvidia-cusparse-cu12, nvidia-cudnn-cu12, markdown-it-py, jinja2, httpcore, email_validator, cupy-cuda12x, anyio, aiosignal, aioprometheus, watchfiles, starlette, rich, pydantic, nvidia-cusolver-cu12, jsonschema-specifications, huggingface-hub, httpx, typer, torch, tokenizers, jsonschema, xformers, transformers, ray, fastapi-cli, fastapi, vllm
Successfully installed MarkupSafe-2.1.5 aioprometheus-23.12.0 aiosignal-1.3.1 annotated-types-0.7.0 anyio-4.4.0 attrs-23.2.0 certifi-2024.6.2 charset-normalizer-3.3.2 click-8.1.7 cupy-cuda12x-12.1.0 dnspython-2.6.1 email_validator-2.1.1 exceptiongroup-1.2.1 fastapi-0.111.0 fastapi-cli-0.0.4 fastrlock-0.8.2 filelock-3.14.0 frozenlist-1.4.1 fsspec-2024.6.0 h11-0.14.0 httpcore-1.0.5 httptools-0.6.1 httpx-0.27.0 huggingface-hub-0.23.3 idna-3.7 jinja2-3.1.4 jsonschema-4.22.0 jsonschema-specifications-2023.12.1 markdown-it-py-3.0.0 mdurl-0.1.2 mpmath-1.3.0 msgpack-1.0.8 networkx-3.3 ninja-1.11.1.1 numpy-1.26.4 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.18.1 nvidia-nvjitlink-cu12-12.5.40 nvidia-nvtx-cu12-12.1.105 orjson-3.10.3 packaging-24.0 protobuf-5.27.1 psutil-5.9.8 pydantic-2.7.3 pydantic-core-2.18.4 pygments-2.18.0 pynvml-11.5.0 python-dotenv-1.0.1 python-multipart-0.0.9 pyyaml-6.0.1 quantile-python-1.1 ray-2.24.0 referencing-0.35.1 regex-2024.5.15 requests-2.32.3 rich-13.7.1 rpds-py-0.18.1 safetensors-0.4.3 sentencepiece-0.2.0 shellingham-1.5.4 sniffio-1.3.1 starlette-0.37.2 sympy-1.12.1 tokenizers-0.19.1 torch-2.1.2 tqdm-4.66.4 transformers-4.41.2 triton-2.1.0 typer-0.12.3 typing-extensions-4.12.2 ujson-5.10.0 urllib3-2.2.1 uvicorn-0.30.1 uvloop-0.19.0 vllm-0.3.2 watchfiles-0.22.0 websockets-12.0 xformers-0.0.23.post1
Collecting transformers==4.38.0
  Downloading transformers-4.38.0-py3-none-any.whl.metadata (131 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 131.1/131.1 kB 86.5 kB/s eta 0:00:00
Requirement already satisfied: filelock in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from transformers==4.38.0) (3.14.0)
Requirement already satisfied: huggingface-hub<1.0,>=0.19.3 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from transformers==4.38.0) (0.23.3)
Requirement already satisfied: numpy>=1.17 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from transformers==4.38.0) (1.26.4)
Requirement already satisfied: packaging>=20.0 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from transformers==4.38.0) (24.0)
Requirement already satisfied: pyyaml>=5.1 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from transformers==4.38.0) (6.0.1)
Requirement already satisfied: regex!=2019.12.17 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from transformers==4.38.0) (2024.5.15)
Requirement already satisfied: requests in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from transformers==4.38.0) (2.32.3)
Collecting tokenizers<0.19,>=0.14 (from transformers==4.38.0)
  Downloading tokenizers-0.15.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Requirement already satisfied: safetensors>=0.4.1 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from transformers==4.38.0) (0.4.3)
Requirement already satisfied: tqdm>=4.27 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from transformers==4.38.0) (4.66.4)
Requirement already satisfied: fsspec>=2023.5.0 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from huggingface-hub<1.0,>=0.19.3->transformers==4.38.0) (2024.6.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from huggingface-hub<1.0,>=0.19.3->transformers==4.38.0) (4.12.2)
Requirement already satisfied: charset-normalizer<4,>=2 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from requests->transformers==4.38.0) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from requests->transformers==4.38.0) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from requests->transformers==4.38.0) (2.2.1)
Requirement already satisfied: certifi>=2017.4.17 in /home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages (from requests->transformers==4.38.0) (2024.6.2)
Downloading transformers-4.38.0-py3-none-any.whl (8.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.5/8.5 MB 71.0 MB/s eta 0:00:00
Downloading tokenizers-0.15.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.6/3.6 MB 95.9 MB/s eta 0:00:00
Installing collected packages: tokenizers, transformers
  Attempting uninstall: tokenizers
    Found existing installation: tokenizers 0.19.1
    Uninstalling tokenizers-0.19.1:
      Successfully uninstalled tokenizers-0.19.1
  Attempting uninstall: transformers
    Found existing installation: transformers 4.41.2
    Uninstalling transformers-4.41.2:
      Successfully uninstalled transformers-4.41.2
Successfully installed tokenizers-0.15.2 transformers-4.38.0
[skypilot.yaml] Done installing packages.
I 06-07 17:10:28 cloud_vm_ray_backend.py:3228] Setup completed.
I 06-07 17:10:28 cloud_vm_ray_backend.py:3414] Multiple resources are specified for the task, using: Azure({'A10': 1}, disk_tier=best, ports=['8000'])
I 06-07 17:10:31 cloud_vm_ray_backend.py:3315] Job submitted with Job ID: 1
I 06-08 00:10:31 log_lib.py:412] Start streaming logs for job 1.
INFO: Tip: use Ctrl-C to exit log streaming (task will not be killed).
INFO: Waiting for task resources on 1 node. This will block if the cluster is full.
INFO: All task resources reserved.
INFO: Reserved IPs: ['<redacted>']
(task, pid=11482) [skypilot.yaml] Listing available conda environments:
(task, pid=11482) # conda environments:
(task, pid=11482) #
(task, pid=11482) base                  *  /home/azureuser/miniconda3
(task, pid=11482) qwen                     /home/azureuser/miniconda3/envs/qwen
(task, pid=11482) 
(task, pid=11482) [skypilot.yaml] Activating conda environment 'qwen'
(task, pid=11482) [skypilot.yaml] Listing available conda environments:
(task, pid=11482) # conda environments:
(task, pid=11482) #
(task, pid=11482) base                     /home/azureuser/miniconda3
(task, pid=11482) qwen                  *  /home/azureuser/miniconda3/envs/qwen
(task, pid=11482) 
(task, pid=11482) [skypilot.yaml] Listing installed packages:
(task, pid=11482) Package                   Version
(task, pid=11482) ------------------------- ------------
(task, pid=11482) aioprometheus             23.12.0
(task, pid=11482) aiosignal                 1.3.1
(task, pid=11482) annotated-types           0.7.0
(task, pid=11482) anyio                     4.4.0
(task, pid=11482) attrs                     23.2.0
(task, pid=11482) certifi                   2024.6.2
(task, pid=11482) charset-normalizer        3.3.2
(task, pid=11482) click                     8.1.7
(task, pid=11482) cupy-cuda12x              12.1.0
(task, pid=11482) dnspython                 2.6.1
(task, pid=11482) email_validator           2.1.1
(task, pid=11482) exceptiongroup            1.2.1
(task, pid=11482) fastapi                   0.111.0
(task, pid=11482) fastapi-cli               0.0.4
(task, pid=11482) fastrlock                 0.8.2
(task, pid=11482) filelock                  3.14.0
(task, pid=11482) frozenlist                1.4.1
(task, pid=11482) fsspec                    2024.6.0
(task, pid=11482) h11                       0.14.0
(task, pid=11482) httpcore                  1.0.5
(task, pid=11482) httptools                 0.6.1
(task, pid=11482) httpx                     0.27.0
(task, pid=11482) huggingface-hub           0.23.3
(task, pid=11482) idna                      3.7
(task, pid=11482) Jinja2                    3.1.4
(task, pid=11482) jsonschema                4.22.0
(task, pid=11482) jsonschema-specifications 2023.12.1
(task, pid=11482) markdown-it-py            3.0.0
(task, pid=11482) MarkupSafe                2.1.5
(task, pid=11482) mdurl                     0.1.2
(task, pid=11482) mpmath                    1.3.0
(task, pid=11482) msgpack                   1.0.8
(task, pid=11482) networkx                  3.3
(task, pid=11482) ninja                     1.11.1.1
(task, pid=11482) numpy                     1.26.4
(task, pid=11482) nvidia-cublas-cu12        12.1.3.1
(task, pid=11482) nvidia-cuda-cupti-cu12    12.1.105
(task, pid=11482) nvidia-cuda-nvrtc-cu12    12.1.105
(task, pid=11482) nvidia-cuda-runtime-cu12  12.1.105
(task, pid=11482) nvidia-cudnn-cu12         8.9.2.26
(task, pid=11482) nvidia-cufft-cu12         11.0.2.54
(task, pid=11482) nvidia-curand-cu12        10.3.2.106
(task, pid=11482) nvidia-cusolver-cu12      11.4.5.107
(task, pid=11482) nvidia-cusparse-cu12      12.1.0.106
(task, pid=11482) nvidia-nccl-cu12          2.18.1
(task, pid=11482) nvidia-nvjitlink-cu12     12.5.40
(task, pid=11482) nvidia-nvtx-cu12          12.1.105
(task, pid=11482) orjson                    3.10.3
(task, pid=11482) packaging                 24.0
(task, pid=11482) pip                       24.0
(task, pid=11482) protobuf                  5.27.1
(task, pid=11482) psutil                    5.9.8
(task, pid=11482) pydantic                  2.7.3
(task, pid=11482) pydantic_core             2.18.4
(task, pid=11482) Pygments                  2.18.0
(task, pid=11482) pynvml                    11.5.0
(task, pid=11482) python-dotenv             1.0.1
(task, pid=11482) python-multipart          0.0.9
(task, pid=11482) PyYAML                    6.0.1
(task, pid=11482) quantile-python           1.1
(task, pid=11482) ray                       2.24.0
(task, pid=11482) referencing               0.35.1
(task, pid=11482) regex                     2024.5.15
(task, pid=11482) requests                  2.32.3
(task, pid=11482) rich                      13.7.1
(task, pid=11482) rpds-py                   0.18.1
(task, pid=11482) safetensors               0.4.3
(task, pid=11482) sentencepiece             0.2.0
(task, pid=11482) setuptools                69.5.1
(task, pid=11482) shellingham               1.5.4
(task, pid=11482) sniffio                   1.3.1
(task, pid=11482) starlette                 0.37.2
(task, pid=11482) sympy                     1.12.1
(task, pid=11482) tokenizers                0.15.2
(task, pid=11482) torch                     2.1.2
(task, pid=11482) tqdm                      4.66.4
(task, pid=11482) transformers              4.38.0
(task, pid=11482) triton                    2.1.0
(task, pid=11482) typer                     0.12.3
(task, pid=11482) typing_extensions         4.12.2
(task, pid=11482) ujson                     5.10.0
(task, pid=11482) urllib3                   2.2.1
(task, pid=11482) uvicorn                   0.30.1
(task, pid=11482) uvloop                    0.19.0
(task, pid=11482) vllm                      0.3.2
(task, pid=11482) watchfiles                0.22.0
(task, pid=11482) websockets                12.0
(task, pid=11482) wheel                     0.43.0
(task, pid=11482) xformers                  0.0.23.post1
(task, pid=11482) [skypilot.yaml] Setting PATH to include /sbin
(task, pid=11482) [skypilot.yaml] Starting vllm OpenAI API server with the following configuration:
(task, pid=11482) [skypilot.yaml]   - Host: 0.0.0.0
(task, pid=11482) [skypilot.yaml]   - Model: Qwen/Qwen1.5-7B-Chat
(task, pid=11482) [skypilot.yaml]   - Tensor Parallel Size: 1
(task, pid=11482) [skypilot.yaml]   - Maximum Model Length: 1024
(task, pid=11482) INFO 06-08 00:10:36 api_server.py:229] args: Namespace(host='0.0.0.0', port=8000, allow_credentials=False, allowed_origins=['*'], allowed_methods=['*'], allowed_headers=['*'], api_key=None, served_model_name=None, lora_modules=None, chat_template=None, response_role='assistant', ssl_keyfile=None, ssl_certfile=None, root_path=None, middleware=[], model='Qwen/Qwen1.5-7B-Chat', tokenizer=None, revision=None, code_revision=None, tokenizer_revision=None, tokenizer_mode='auto', trust_remote_code=False, download_dir=None, load_format='auto', dtype='auto', kv_cache_dtype='auto', max_model_len=1024, worker_use_ray=False, pipeline_parallel_size=1, tensor_parallel_size=1, max_parallel_loading_workers=None, block_size=16, seed=0, swap_space=4, gpu_memory_utilization=0.9, max_num_batched_tokens=None, max_num_seqs=256, max_paddings=256, disable_log_stats=False, quantization=None, enforce_eager=False, max_context_len_to_capture=8192, disable_custom_all_reduce=False, enable_lora=False, max_loras=1, max_lora_rank=16, lora_extra_vocab_size=256, lora_dtype='auto', max_cpu_loras=None, device='cuda', engine_use_ray=False, disable_log_requests=False, max_log_len=None)
(task, pid=11482) INFO 06-08 00:10:36 llm_engine.py:79] Initializing an LLM engine with config: model='Qwen/Qwen1.5-7B-Chat', tokenizer='Qwen/Qwen1.5-7B-Chat', tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=1024, download_dir=None, load_format=auto, tensor_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, device_config=cuda, seed=0)
(task, pid=11482) Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
(task, pid=11482) Traceback (most recent call last):
(task, pid=11482)   File "/home/azureuser/miniconda3/envs/qwen/lib/python3.10/runpy.py", line 196, in _run_module_as_main
(task, pid=11482)     return _run_code(code, main_globals, None,
(task, pid=11482)   File "/home/azureuser/miniconda3/envs/qwen/lib/python3.10/runpy.py", line 86, in _run_code
(task, pid=11482)     exec(code, run_globals)
(task, pid=11482)   File "/home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 237, in <module>
(task, pid=11482)     engine = AsyncLLMEngine.from_engine_args(engine_args)
(task, pid=11482)   File "/home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 625, in from_engine_args
(task, pid=11482)     engine = cls(parallel_config.worker_use_ray,
(task, pid=11482)   File "/home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 321, in __init__
(task, pid=11482)     self.engine = self._init_engine(*args, **kwargs)
(task, pid=11482)   File "/home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 366, in _init_engine
(task, pid=11482)     return engine_class(*args, **kwargs)
(task, pid=11482)   File "/home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 120, in __init__
(task, pid=11482)     self._init_workers()
(task, pid=11482)   File "/home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 163, in _init_workers
(task, pid=11482)     self._run_workers("init_model")
(task, pid=11482)   File "/home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 1014, in _run_workers
(task, pid=11482)     driver_worker_output = getattr(self.driver_worker,
(task, pid=11482)   File "/home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/worker/worker.py", line 85, in init_model
(task, pid=11482)     torch.cuda.set_device(self.device)
(task, pid=11482)   File "/home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages/torch/cuda/__init__.py", line 404, in set_device
(task, pid=11482)     torch._C._cuda_setDevice(device)
(task, pid=11482)   File "/home/azureuser/miniconda3/envs/qwen/lib/python3.10/site-packages/torch/cuda/__init__.py", line 298, in _lazy_init
(task, pid=11482)     torch._C._cuda_init()
(task, pid=11482) RuntimeError: No CUDA GPUs are available
INFO: Job finished (status: SUCCEEDED).
I 06-07 17:10:43 cloud_vm_ray_backend.py:3350] Job ID: 1
I 06-07 17:10:43 cloud_vm_ray_backend.py:3350] To cancel the job:       sky cancel qwen 1
I 06-07 17:10:43 cloud_vm_ray_backend.py:3350] To stream job logs:      sky logs qwen 1
I 06-07 17:10:43 cloud_vm_ray_backend.py:3350] To view the job queue:   sky queue qwen
I 06-07 17:10:43 cloud_vm_ray_backend.py:3446] 
I 06-07 17:10:43 cloud_vm_ray_backend.py:3446] Cluster name: qwen
I 06-07 17:10:43 cloud_vm_ray_backend.py:3446] To log into the head VM: ssh qwen
I 06-07 17:10:43 cloud_vm_ray_backend.py:3446] To submit a job:         sky exec qwen yaml_file
I 06-07 17:10:43 cloud_vm_ray_backend.py:3446] To stop the cluster:     sky stop qwen
I 06-07 17:10:43 cloud_vm_ray_backend.py:3446] To teardown the cluster: sky down qwen
Clusters
NAME  LAUNCHED  RESOURCES                                                                  STATUS  AUTOSTOP  COMMAND                       
qwen  1 hr ago  1x Azure(Standard_NV6ads_A10_v5, {'A10': 1}, disk_tier=best, ports=['8...  UP      -         sky launch -c qwen x.yaml...  

@Michaelvll
Collaborator

Hmm, good catch! Does this problem also happen for other GPU types, such as A100, or is it an issue with A10 only?

@WesleyYue
Author

I tested on Standard_NC24ads_A100_v4 and Standard_NV6ads_A10_v5, and it happens only on the A10 instance.
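
For anyone comparing instance types, here is a minimal sketch of the check from step 3 (`ssh qwen && nvidia-smi`) as a script that can be run on the VM before the job starts. The function name `check_nvidia_driver` and the status strings are hypothetical and not part of SkyPilot or vLLM. It only assumes that `nvidia-smi -L` lists one `GPU N: ...` line per device when the driver is installed.

```python
import shutil
import subprocess


def check_nvidia_driver(smi_binary: str = "nvidia-smi") -> str:
    """Return a short status string describing whether the NVIDIA driver looks usable.

    Hypothetical helper for illustration; it only shells out to nvidia-smi.
    """
    path = shutil.which(smi_binary)
    if path is None:
        # Matches the symptom in this issue: the binary is not on the image at all.
        return f"missing: {smi_binary} not found on PATH"
    try:
        # `nvidia-smi -L` prints one line per GPU, e.g. "GPU 0: <name> (UUID: ...)".
        result = subprocess.run(
            [path, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"error: could not run {smi_binary} ({exc})"
    if result.returncode != 0:
        return f"error: {smi_binary} exited with code {result.returncode}"
    gpus = [line for line in result.stdout.splitlines() if line.startswith("GPU ")]
    if not gpus:
        return "error: driver responded but listed no GPUs"
    return f"ok: {len(gpus)} GPU(s) visible"


if __name__ == "__main__":
    print(check_nvidia_driver())
```

Running this on both instance types would show whether the A10 VM reports `missing` or `error` while the A100 VM reports `ok`. That separates a missing driver on the image from a problem in the conda environment.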
