Inference Test with MX C500 Passthrough on cVM

Environment Requirements

Table 1 and Table 2 list the environment requirements.

Table 1 Hardware requirements

Item    Description
CPU     New Kunpeng 920 processor model
GPU     MetaX C500

Table 2 OS requirements

Item       Version
Host OS    openEuler 24.03 LTS SP1
           openEuler 24.03 LTS SP2
           openEuler 24.03 LTS SP3

Installing the Inference Environment

  1. Obtain the VM image. (Currently, only the openEuler 24.03 LTS SP1 image is supported.)
    wget https://repo.openeuler.org/openEuler-24.03-LTS-SP1/virtual_machine_img/aarch64/openEuler-24.03-LTS-SP1-aarch64.qcow2.xz
    xz -d openEuler-24.03-LTS-SP1-aarch64.qcow2.xz
  2. Install the MetaX GPU driver on the cVM.
    1. Log in to the MetaX developer image resource center (account registration required). This procedure uses metax-driver-mxc500-2.32.0.6-rpm-aarch64.run as an example.
    2. In the Deployment Mode area, select On-premises. In the Resource Configuration Information area on the right, select MX C500 series in the Hardware drop-down list, select Linux/AArch64/Kylin V10 SP2 in the System and Compatibility drop-down list, and select 2.32.0.x (April 2025) in the Solution Version drop-down list. For details, see Figure 1.
    3. In the Installation Steps area, obtain the corresponding installation command, and then run the command to complete the installation.
    Figure 1 Installing the MetaX GPU driver on the cVM
  3. Install the Docker inference environment on the cVM.
    1. Log in to the MetaX developer image resource center (account registration required). This procedure uses vllm-metax:0.11.0-maca.ai3.3.0.11-torch2.6-py312-kylin2309a-arm64 as an example.
    2. In the left area, select AI, and then choose the inference framework (vLLM [0.10.2 or later]). Select arm64 in the Architecture area, select kylin in the OS area, and select 3.12 in the Python Version area. For details, see Figure 2.
    3. Click Copy docker pull, and then run the copied docker pull command on the cVM to pull the Docker image.
    Figure 2 Installing the Docker inference environment on the cVM
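
Before starting a container, it can help to confirm that the driver installation exposed the GPU device nodes inside the cVM. The following is a minimal sketch, not part of the official procedure. It assumes that the driver creates /dev/mxcd and /dev/dri, which are the device nodes passed to docker run later in this document. The helper name missing_nodes is hypothetical.

```shell
#!/bin/sh
# Print every path from the argument list that does not exist.
# (missing_nodes is a hypothetical helper name, not a MetaX tool.)
missing_nodes() {
    for node in "$@"; do
        [ -e "$node" ] || printf '%s\n' "$node"
    done
    return 0
}

# Assumption: these are the nodes the driver creates; they are the ones
# handed to the container with --device in the docker run command.
missing=$(missing_nodes /dev/mxcd /dev/dri)
if [ -n "$missing" ]; then
    echo "Missing device nodes (check the driver installation):"
    echo "$missing"
else
    echo "Device nodes present."
fi
```

If any node is reported as missing, the --device options of docker run will fail, so revisit the driver installation step before continuing.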

Using the vLLM Docker Image for Inference Testing

  1. Start a container from the Docker image.
    docker run -it --mount type=bind,source=/home,target=/workspace/mnt,readonly=true --device=/dev/mxcd --device=/dev/dri --group-add video e6ca53da420b /bin/bash
  2. Perform a benchmark test in Docker.
    vllm bench throughput --dataset /workspace/mnt/ShareGPT_V3_unfiltered_cleaned_split.json --model /workspace/mnt/Llama-3-8B-Instruct-HF/ -tp 1
  • The --mount type=bind,source=/home,target=/workspace/mnt,readonly=true option bind-mounts a directory in the VM, read-only, to the specified path in the container. The directory specified by source=/home holds the model and dataset in the VM. (Download the model and dataset yourself beforehand.) In this example, the dataset and the model are at /home/ShareGPT_V3_unfiltered_cleaned_split.json and /home/Llama-3-8B-Instruct-HF, respectively.
  • e6ca53da420b is the ID of the Docker image in this example. Obtain the ID of your own image by running the docker images command.
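
Because of the bind mount, the paths passed to vllm bench throughput are container paths, not VM paths. The sketch below illustrates the mapping for the two example paths, using the source and target values from the docker run command. The helper name to_container_path is hypothetical and only demonstrates the path translation; it does not call Docker.

```shell
#!/bin/sh
# Bind-mount endpoints taken from the docker run command in this section.
HOST_DIR=/home
CONTAINER_DIR=/workspace/mnt

# Translate a path under HOST_DIR into the path seen inside the container.
# (to_container_path is a hypothetical helper used for illustration only.)
to_container_path() {
    case $1 in
        "$HOST_DIR"/*)
            rel=${1#"$HOST_DIR"/}
            printf '%s\n' "$CONTAINER_DIR/$rel"
            ;;
        *)
            echo "not under $HOST_DIR: $1" >&2
            return 1
            ;;
    esac
}

# The dataset and model locations used in this example.
to_container_path /home/ShareGPT_V3_unfiltered_cleaned_split.json
to_container_path /home/Llama-3-8B-Instruct-HF
```

The two printed paths correspond to the --dataset and --model arguments of the benchmark command. If the model or dataset is stored elsewhere in the VM, adjust source= in the docker run command accordingly.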