
Testing the Video Stream Cloud Phone Density

  1. Extract the cfct_config file from the DemoVideoEngine.tar.gz package.
    cd /home/kbox_video/
    tar -xvf DemoVideoEngine.tar.gz cfct_config
    chmod 644 cfct_config
    
  2. Check BIOS configuration parameters and the GPU working mode. For details, see Configuring the BIOS and Configuring the GPU Working Mode.
  3. Use the BMC to adjust the server fan speed percentage to 100%.
    1. Log in to the BMC over SSH.
      ssh {BMC_user_name}@{BMC_IP_address}
      
    2. Set the fan speed adjustment mode to manual.
      ipmcset -d fanmode -v 1 0
      

      The command output is as follows:

      Set fan mode successfully.
      Current Mode:       manual
      Time out    :       100000000 seconds
      
    3. Set the fan speed percentage to 100%.
      ipmcset -d fanlevel -v 100
      

      The command output is as follows:

      Set fan level successfully.
      Current Mode:              manual, timeout 100000000 seconds.
      Global Manual Fan Level:   100%
      
    4. Query the fan status.
      ipmcget -d faninfo
      

      The command output is as follows:

      Current Mode: manual, timeout 100000000 seconds.
      Manual fan level:
      Fan1: 100, Fan2: 100, Fan3: 100, Fan4: 100
      
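As a convenience, the fan commands above can be wrapped in a small script. The sketch below is an assumption for illustration only: `set_fan_level` is a hypothetical helper, not part of the iBMC tooling, that validates the percentage before issuing the BMC commands.

```shell
# Hypothetical wrapper around the BMC fan commands shown above.
# set_fan_level is an assumption for illustration, not a product tool.
set_fan_level() {
  local level=$1
  # ipmcset -d fanlevel takes a percentage, so reject anything outside 0-100.
  if [ "$level" -lt 0 ] || [ "$level" -gt 100 ]; then
    echo "fan level must be between 0 and 100" >&2
    return 1
  fi
  ipmcset -d fanmode -v 1 0        # switch to manual mode first
  ipmcset -d fanlevel -v "$level"  # then apply the requested percentage
}

# Usage on the BMC shell: set_fan_level 100
```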
  4. Bind the GPU driver processes to idle cores. For details about how to bind NIC interrupts to idle cores, see "Binding NICs to CPUs" in the Kbox Cloud Phone Container Feature Guide. Perform this operation each time you restart the server.
    1. Query the GPU process ID.
      ps -ef | grep gfx
      
      The command output is as follows:
      root        1703       2  1 Aug31 ?        07:31:36 [gfx_0.0.0]
      root        1739       2  1 Aug31 ?        09:13:08 [gfx_0.0.0]
      
    2. Bind the first GPU process to an idle core.
      taskset -pc 32-33 1703
      
      The command output is as follows:
      pid 1703's current affinity list: 0-127
      pid 1703's new affinity list: 32,33
      
    3. Bind the second GPU process to an idle core.
      taskset -pc 64-65 1739
      
      The command output is as follows:
      pid 1739's current affinity list: 0-127
      pid 1739's new affinity list: 64,65
      

      If four GPU driver processes are displayed, bind the first two driver processes to cores 32 to 33 and the last two driver processes to cores 64 to 65.
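The manual binding steps above can be sketched as a script, assuming the same layout: the first half of the gfx processes go to cores 32-33 and the second half to cores 64-65. `gfx_core_range` and `bind_gfx_processes` are hypothetical helpers, not product tools.

```shell
# Hypothetical helpers sketching the manual taskset steps above.
# gfx_core_range: given a 0-based process index and the total process
# count, print the core range the process should be bound to.
gfx_core_range() {
  local idx=$1 total=$2
  if [ "$idx" -lt $((total / 2)) ]; then
    echo "32-33"   # first half of the gfx processes
  else
    echo "64-65"   # second half
  fi
}

# bind_gfx_processes: query the gfx kernel threads and bind each one.
# Run as root on the server after every restart.
bind_gfx_processes() {
  local pids total idx=0 pid
  pids=$(pgrep gfx)
  total=$(echo "$pids" | wc -w)
  for pid in $pids; do
    taskset -pc "$(gfx_core_range "$idx" "$total")" "$pid"
    idx=$((idx + 1))
  done
}
```

With two gfx processes this reproduces the two taskset commands above; with four it binds the first two to cores 32-33 and the last two to cores 64-65, as the note describes.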

  5. Scatter single-queue NIC interrupts. For a single-queue NIC, RPS can distribute receive processing across multiple cores, which prevents the performance bottleneck caused by concentrating a large number of software interrupts on a single core. NIC enp125s0f1 is used as an example. The following commands steer its receive packets to eight CPU cores (the ff000000 mask selects cores 24 to 31) and tune the related RPS settings to optimize NIC performance in this scenario. Perform this operation each time you restart the server.
    systemctl stop irqbalance.service
    systemctl disable irqbalance.service
    echo ff000000 > /sys/class/net/enp125s0f1/queues/rx-0/rps_cpus
    echo 4096 > /sys/class/net/enp125s0f1/queues/rx-0/rps_flow_cnt
    echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
    
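The rps_cpus value is a hexadecimal bitmask in which bit N selects core N, so the ff000000 written above selects cores 24 to 31. If you need a mask for a different core set, it can be computed as below; `cores_to_mask` is a hypothetical helper, and shell arithmetic limits it to cores below 63.

```shell
# Hypothetical helper: build the rps_cpus hex bitmask from a core list.
# Works for cores 0-62 only, because it relies on shell integer arithmetic.
cores_to_mask() {
  local mask=0 core
  for core in "$@"; do
    mask=$((mask | (1 << core)))
  done
  printf '%x\n' "$mask"
}

# Example: the mask used above for cores 24-31.
# cores_to_mask 24 25 26 27 28 29 30 31   # prints ff000000
```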
  6. Disable NUMA balancing and SWAP. Perform this operation each time you restart the server.
    echo 0 > /proc/sys/kernel/numa_balancing
    swapoff -a
    
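Because steps 4 to 6 must be repeated after every restart, one option is to collect the per-boot commands in a script invoked from your boot process. The path and `install_tuning_script` helper below are assumptions for illustration, not part of the product documentation.

```shell
# Hypothetical: install a per-boot tuning script. TUNING_SCRIPT and
# install_tuning_script are assumptions, not part of the product docs.
TUNING_SCRIPT="${TUNING_SCRIPT:-/usr/local/bin/kbox-tuning.sh}"

install_tuning_script() {
  cat > "$TUNING_SCRIPT" <<'EOF'
#!/bin/sh
# Re-apply per-boot tuning (see steps 4-6 of the density test setup).
echo 0 > /proc/sys/kernel/numa_balancing
swapoff -a
EOF
  chmod +x "$TUNING_SCRIPT"
}

# Hook the installed script into your init system (for example a systemd
# unit or an rc.local entry) so that it runs on every boot.
```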
  7. Set the CPU_BIND_MODE field in the cfct_config file to the value listed in Table 1 (a value of 1 indicates NUMA binding).
    Table 1 Mapping between test specifications and CPU_BIND_MODE

    Test Specifications                                                  | CPU_BIND_MODE
    ---------------------------------------------------------------------|--------------
    7260 + W6800 density test, 72 channels, Subway Surfers, 1080p@60 fps | 0

  8. Modify MODE0_CPUS0, MODE0_CPUS1, MODE0_CPUS2, and MODE0_CPUS3 in the cfct_config file to change the core binding mode to six cores per channel. MODE0_CPUS0 is used as an example.
    MODE0_CPUS0=("2,3,4,5,6,7" "8,9,10,11,12,13" "14,15,16,17,18,19" "20,21,22,23,24,25" "26,27,28,29,30,31")
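Instead of typing the core lists by hand, they can be generated. `gen_core_groups` below is a hypothetical helper that emits an assignment in the same format, given a starting core, a group count, and a cores-per-channel count.

```shell
# Hypothetical helper: print a MODE0_CPUSx-style core-group assignment.
# Arguments: first core, number of groups, cores per group.
gen_core_groups() {
  local start=$1 groups=$2 per=$3
  local out="" g c cores
  for g in $(seq 0 $((groups - 1))); do
    cores=""
    for c in $(seq 0 $((per - 1))); do
      cores="$cores,$((start + g * per + c))"
    done
    out="$out \"${cores#,}\""   # drop the leading comma
  done
  echo "MODE0_CPUS0=(${out# })" # drop the leading space
}

# gen_core_groups 2 5 6 reproduces the MODE0_CPUS0 line shown above.
```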
  9. When using the cfct_video script to start multiple containers, determine the number of containers to start based on Table 1 in step 7. The following uses 72 channels as an example. Run the following commands in sequence.

    Before a multi-channel test, log in to the game for one channel, set the image quality at the novice level (do not perform any other operations after starting the game for the first time), and log out. Then create a base data volume by referring to Creating a Base Data Volume so that the image quality of the containers started later is consistent.

    1. Start cloud phones.
      ./cfct_video start 1 72
      
    2. Start Subway Surfers.
      ./cfct_video start_game 1 72
      
    3. Capture screenshots.
      ./cfct_video screencap 1 72
      

    After screen capture is complete, download the screenshots to the local host. If all screenshots are game screens at the novice level of Subway Surfers, the startup is successful.
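The three cfct_video stages above can be chained with a small wrapper. `run_stage` and `density_test` are hypothetical helpers; setting DRY_RUN=1 prints each command instead of executing it, which is useful for checking the channel range before a run.

```shell
# Hypothetical wrapper around the cfct_video stages shown above.
# DRY_RUN=1 prints each command instead of running it.
run_stage() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "./cfct_video $*"
  else
    ./cfct_video "$@"
  fi
}

# Run start, start_game, and screencap in sequence for a channel range.
density_test() {
  run_stage start "$1" "$2"
  run_stage start_game "$1" "$2"
  run_stage screencap "$1" "$2"
}

# DRY_RUN=1 density_test 1 72 prints the three commands for 72 channels.
```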

  10. Test the performance after the client is connected to the server.
    1. Decompress the stress test tool VideoClientEmulator.tar.gz on the local host or another server.
      tar -xvpf VideoClientEmulator.tar.gz
      cd VideoClientEmulator
      
    2. Run the following command to connect to the server.
      If the test is performed on the local host, the IP address can be 127.0.0.1. An example command is as follows. The parameters that follow ./test.sh connect are the server IP address, start port number, and end port number, respectively.
      ./test.sh connect 127.0.0.1 8001 8036
      
      Check whether the number of clients displayed in the output is correct. For example:
      [INFO] checking current client num ...
      [INFO] current client process num: =36
      
    3. Run the following command to display and collect frame rate data.
      The following command is an example. The parameters that follow ./test.sh start are the server IP address, start port number, end port number, and collection time, respectively.
      ./test.sh start 127.0.0.1 8001 8036 900
      

      After the 15-minute countdown ends, an analysis report of the frame rate data is generated, similar to the following figure, and a CSV file recording single-channel data is generated in the test directory.
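Once the per-channel CSV is on disk, a quick sanity check on the average frame rate can be done with awk. This sketch assumes, purely for illustration, one fps sample per line in the second comma-separated column; the real CSV layout produced by test.sh may differ.

```shell
# Hypothetical check: average the values in column 2 of a CSV file.
# The actual column layout of the test.sh CSV may differ.
avg_fps() {
  awk -F, '{ sum += $2; n++ } END { if (n) printf "%.1f\n", sum / n }' "$1"
}

# Usage: avg_fps channel_01.csv   (file name is an assumption)
```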

  11. Collect data.
    1. Obtain the native_perf_tools_quadra.tar.gz package by referring to Video Stream Engine, decompress it, and go to the native_perf_tools folder.
    2. The native perf tools depend on the nmon tool. If nmon is not installed, install it.
      • On Ubuntu, you can run the apt command for installation.
        apt install nmon
        
      • On openEuler, you can run the yum command for installation.
        yum install nmon
        
    3. Run the following commands in sequence to collect data for 15 minutes. {prefix_number} is the prefix of the folder for storing collected data. In the following figure, 60 is used as an example.
      source env.sh
      ./native_get_perf_data.sh all {prefix_number}
      

  12. After the collection is complete, download the following files to the local host, parse them, and generate test reports.
    1. The log folder in the native_perf_tools folder on the server, which contains server performance data such as CPU, GPU, and encoding card statistics.
    2. The client.log file in the /sdcard/log directory on the mobile phone client, which records client information such as the decoding frame rate and network lag.
    3. Place the preceding files and folders in the same path, for example, D:\tmp.
      Download ExtractingData_quadra.rar and decompress it. Go to the ExtractingData folder and run the following command:
      python parse.py D:\tmp

      An output folder is generated in D:\tmp, which contains tables and images.