
Verification

Download the model file and run a test to verify the environment deployment.

  1. Obtain the DeepSeek-R1-Distill-Llama-70B model file.

    You can obtain the model file from HuggingFace or ModelScope. In this document, the model file is obtained from ModelScope: click Download model and download the model file as prompted. Save the model file to the /home/models/DeepSeek-R1-Distill-Llama-70B/ directory.
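Once the download finishes, it can help to sanity-check the directory before moving on. The following is a minimal sketch; the file names it checks are assumptions based on typical Transformers-format model repositories, not a requirement stated by this guide:

```python
from pathlib import Path

def missing_model_files(model_dir: str) -> list[str]:
    """Return the expected model files that are missing from model_dir.

    The expected names are an assumption: most Transformers-format
    repositories ship a config.json and tokenizer files alongside the
    weight shards.
    """
    expected = ["config.json", "tokenizer_config.json"]
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]

# Example: report anything missing before running inference.
# missing_model_files("/home/models/DeepSeek-R1-Distill-Llama-70B/")
```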

  2. Use vllm/examples/offline_inference/basic/basic.py for testing. The following is an example of the basic.py test code; replace the model path with your local model file path.
    # SPDX-License-Identifier: Apache-2.0
    
    from vllm import LLM, SamplingParams
    
    # Sample prompts.
    prompts = [
        "Hello, my name is",
        "The president of the United States is",
        "The capital of France is",
        "The future of AI is",
    ]
    # Create a sampling params object.
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
    
    # Create an LLM.
    # Set the local model path and the number of NPUs to use
    # (tensor_parallel_size) so that the model can run properly.
    llm = LLM(model="/home/models/DeepSeek-R1-Distill-Llama-70B/", tensor_parallel_size=8)
    # Generate texts from the prompts. The output is a list of RequestOutput objects
    # that contain the prompt, generated text, and other information.
    outputs = llm.generate(prompts, sampling_params)
    # Print the outputs.
    for output in outputs:
        prompt = output.prompt
        generated_text = output.outputs[0].text
        print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
    
    
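The tensor_parallel_size=8 in the example above spreads the weights across eight devices. A back-of-envelope way to see why a single device is not enough: a 70B-parameter model in BF16 needs roughly 140 GB for the weights alone. The sketch below is a rough heuristic, not vLLM's actual memory planner; the overhead factor and the per-device memory figure are assumptions.

```python
import math

def min_devices(params_billion: float, bytes_per_param: int,
                device_mem_gb: float, overhead: float = 1.2) -> int:
    """Rough lower bound on devices needed to hold the model weights.

    overhead is an assumed fudge factor for activations and KV cache;
    the real requirement depends on batch size and sequence length.
    """
    weights_gb = params_billion * bytes_per_param * overhead
    return math.ceil(weights_gb / device_mem_gb)

# 70B parameters at 2 bytes each (BF16), assuming 64 GB per device.
# Real deployments often use more devices than this lower bound for
# KV-cache headroom and throughput.
```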
  3. Run the test command.
    python3 basic.py

    If the model runs properly and no garbled characters are displayed, the environment is correctly configured.
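The "no garbled characters" check can also be automated when you script this verification. The following is a minimal heuristic sketch; the 10% threshold and the set of characters treated as normal whitespace are assumptions:

```python
def looks_garbled(text: str, max_nonprintable_ratio: float = 0.1) -> bool:
    """Heuristically flag model output that may be garbled.

    Flags the Unicode replacement character (bytes that failed to
    decode) outright; otherwise checks the ratio of non-printable
    characters, excluding common whitespace.
    """
    if not text:
        return False
    if "\ufffd" in text:
        return True
    nonprintable = sum(
        1 for ch in text if not ch.isprintable() and ch not in "\n\t\r"
    )
    return nonprintable / len(text) > max_nonprintable_ratio

# Example: check each generated_text from the basic.py loop.
# assert not looks_garbled(generated_text)
```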