Inference Test with MX C500 Passthrough on cVM
Environment Requirements
Installing the Inference Environment
- Obtain the VM image. (Currently, only the openEuler 24.03 LTS SP1 image is supported.)
wget https://repo.openeuler.org/openEuler-24.03-LTS-SP1/virtual_machine_img/aarch64/openEuler-24.03-LTS-SP1-aarch64.qcow2.xz
xz -d openEuler-24.03-LTS-SP1-aarch64.qcow2.xz
- Install the MetaX GPU driver on the cVM.
- For example, to install metax-driver-mxc500-2.32.0.6-rpm-aarch64.run locally, log in to the MetaX developer image resource center (account registration required).
- In the Deployment Mode area, select On-premises. In the Resource Configuration Information area on the right, select MX C500 series in the Hardware drop-down list, select Linux/AArch64/Kylin V10 SP2 in the System and Compatibility drop-down list, and select 2.32.0.x (April 2025) in the Solution Version drop-down list. For details, see Figure 1.
- In the Installation Steps area, obtain the corresponding installation command, and then run the command to complete the installation.
- Install the Docker inference environment on the cVM.
- For example, to install vllm-metax:0.11.0-maca.ai3.3.0.11-torch2.6-py312-kylin2309a-arm64, log in to the MetaX developer image resource center (account registration required).
- In the left area, select AI, and then choose the inference framework (vLLM [0.10.2 or later]). Select arm64 in the Architecture area, select kylin in the OS area, and select 3.12 in the Python Version area. For details, see Figure 2.
- Click Copy docker pull to copy the command, and then run it on the cVM to pull the Docker image.
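Before moving on, it can help to confirm that the driver installation left the expected device nodes in place. The following is a minimal sketch, assuming the driver exposes the /dev/mxcd and /dev/dri nodes that the docker run command later in this document passes through; check_paths is a hypothetical helper name, not part of any MetaX tooling.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given paths exist.
# Prints one "ok:" or "missing:" line per path and returns 1 if any is missing.
check_paths() {
    missing=0
    for p in "$@"; do
        if [ -e "$p" ]; then
            echo "ok: $p"
        else
            echo "missing: $p"
            missing=1
        fi
    done
    return "$missing"
}

# Example call on the cVM (assumed device node names, taken from the
# docker run command in this document):
#   check_paths /dev/mxcd /dev/dri || echo "driver device nodes not found"
```

The same helper can be pointed at the model directory and dataset file before starting the container, since a missing path under the bind-mount source only shows up later as a benchmark failure.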
Using the vLLM Docker Image for Inference Testing
- Start a container from the Docker image.
docker run -it --mount type=bind,source=/home,target=/workspace/mnt,readonly=true --device=/dev/mxcd --device=/dev/dri --group-add video e6ca53da420b /bin/bash
- Run a throughput benchmark inside the container.
vllm bench throughput --dataset /workspace/mnt/ShareGPT_V3_unfiltered_cleaned_split.json --model /workspace/mnt/Llama-3-8B-Instruct-HF/ -tp 1
- The --mount type=bind,source=/home,target=/workspace/mnt,readonly=true option bind-mounts the /home directory of the VM, read-only, at /workspace/mnt in the container. The directory specified by source must contain the model and the dataset, which you need to download yourself. In this example, the dataset file and the model directory are /home/ShareGPT_V3_unfiltered_cleaned_split.json and /home/Llama-3-8B-Instruct-HF, which appear in the container as /workspace/mnt/ShareGPT_V3_unfiltered_cleaned_split.json and /workspace/mnt/Llama-3-8B-Instruct-HF, respectively.
- e6ca53da420b is the ID of the Docker image, which can be obtained by running the docker images command.
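The relationship between host paths, container paths, and the image ID can be sketched as two small shell functions. This is an illustrative sketch only: to_container_path and build_run_cmd are hypothetical helper names, the defaults mirror the values used in this document, and build_run_cmd only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: translate a host path under the bind-mount source
# into the path it has inside the container.
# Usage: to_container_path HOST_PATH [SOURCE_DIR] [TARGET_DIR]
to_container_path() {
    host_path="$1"
    src="${2:-/home}"
    dst="${3:-/workspace/mnt}"
    case "$host_path" in
        "$src"/*) printf '%s%s\n' "$dst" "${host_path#"$src"}" ;;
        *) echo "not under $src: $host_path" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print the docker run command for a given image
# reference. The flags are copied from the command in this document;
# only the image and the host directory vary.
# Usage: build_run_cmd IMAGE [SOURCE_DIR]
build_run_cmd() {
    image="$1"
    host_dir="${2:-/home}"
    printf 'docker run -it --mount type=bind,source=%s,target=/workspace/mnt,readonly=true --device=/dev/mxcd --device=/dev/dri --group-add video %s /bin/bash\n' "$host_dir" "$image"
}

# Example (not executed here): look up the image ID by repository and tag
# instead of copying it by hand. IMAGE_REF is a placeholder for the name
# shown by "docker images" on your system.
#   IMAGE_ID=$(docker images -q "$IMAGE_REF")
#   build_run_cmd "$IMAGE_ID"
```

For instance, to_container_path /home/Llama-3-8B-Instruct-HF prints /workspace/mnt/Llama-3-8B-Instruct-HF, which is the value passed to --model in the benchmark command. docker run also accepts a repository:tag reference in place of the image ID, which avoids hard-coding an ID that differs between machines.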

