On-Chip Memory Bandwidth Test

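This document describes two ways to test on-chip memory bandwidth: a single-NUMA test and a single-node test.

To test the on-chip memory bandwidth of a single NUMA domain:

  1. Set the number of test threads to the maximum supported by a single NUMA domain, and bind memory allocation to the on-chip memory so that the data is placed there.
  2. Test the on-chip memory bandwidth of the single NUMA domain. A reference test command: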
    OMP_NUM_THREADS=38 OMP_PROC_BIND=close taskset -c 0-37 numactl -m 16 ./stream_c.exe

    Check the Triad value in the test output. The bandwidth of a single NUMA domain is about 380000 to 410000 MB/s:

    ------------------------------------------------------------- 
    STREAM version $Revision: 5.10 $
    -------------------------------------------------------------
    This system uses 8 bytes per array element.
    -------------------------------------------------------------
    Array size = 141648512 (elements), Offset = 0 (elements)
    Memory per array = 1080.7 MiB (= 1.1 GiB).
    Total memory required = 3242.1 MiB (= 3.2 GiB).
    Each kernel will be executed 500 times.
     The *best* time for each kernel (excluding the first iteration)
     will be used to compute the reported bandwidth.
    -------------------------------------------------------------
    Number of Threads requested = 38
    Number of Threads counted = 38
    -------------------------------------------------------------
    Your clock granularity/precision appears to be 1 microseconds.
    Each test below will take on the order of 6192 microseconds.
       (= 6192 clock ticks)
    Increase the size of the arrays if this shows that
    you are not getting at least 20 clock ticks per test.
    -------------------------------------------------------------
    WARNING -- The above is only a rough guideline.
    For best results, please be sure you know the
    precision of your system timer.
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          402552.3     0.005893     0.005630     0.007208
    Scale:         380128.4     0.006136     0.005962     0.009997
    Add:           420949.0     0.008377     0.008076     0.009797
    Triad:         403806.4     0.008782     0.008419     0.010945
    -------------------------------------------------------------
    Solution Validates: avg error less than 1.000000e-13 on all three arrays
    -------------------------------------------------------------
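
The check against the reference range can be scripted. The following is a minimal sketch, assuming the STREAM output is piped in or has been saved to a file; the function name `check_triad` and the `LOW`/`HIGH` variables are hypothetical and not part of STREAM.

```shell
#!/bin/bash
# Read STREAM output on stdin and print each Triad "Best Rate MB/s" value,
# followed by whether it falls inside the reference range quoted above.
# LOW and HIGH (hypothetical names) default to 380000 and 410000 MB/s.
check_triad() {
    awk -v low="${LOW:-380000}" -v high="${HIGH:-410000}" '
        $1 == "Triad:" {
            status = ($2 >= low && $2 <= high) ? "in-range" : "out-of-range"
            print $2, status
        }'
}
```

For example, append `| tee stream_numa.log` to the test command (the log file name is an assumption), then run `check_triad < stream_numa.log`. A value outside the range may indicate that the threads or the memory were not bound as intended.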

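To test the on-chip memory bandwidth of a single node:

  1. Use MPI to launch 16 processes. Bind each process to one NUMA domain, set its thread count to the maximum supported by a single NUMA domain, and bind it to the corresponding on-chip memory so that its data is placed there.
  2. Test the on-chip memory bandwidth of the single node. A reference test command: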
    mpirun -np 16 --map-by numa ./bw_node_mem_on_chip.sh

    A reference bw_node_mem_on_chip.sh script:

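    #!/bin/bash
    # Local rank of this process on the node, set by Open MPI (0 to 15 here).
    rank=${OMPI_COMM_WORLD_LOCAL_RANK}
    # Each rank gets its own block of 38 cores: rank 0 uses 0-37, rank 1 uses 38-75, and so on.
    start=$(($rank * 38))
    end=$(($rank * 38 + 37))
    # Pin the 38 threads to that block and allocate memory from node rank+16.
    OMP_NUM_THREADS=38 OMP_PROC_BIND=close taskset -c ${start}-${end} numactl -m $(($rank + 16)) ./stream_c.exe

    Check the Triad value in the output of each process, which is about 380000 to 410000 MB/s. The on-chip memory bandwidth of the single node is the sum of the bandwidth results of the 16 processes:
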
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          395880.0     0.006105     0.005725     0.012895
    Scale:         383998.0     0.006379     0.005902     0.013061
    Add:           414187.1     0.008566     0.008208     0.016977
    Triad:         399149.2     0.008950     0.008517     0.014249
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          401481.2     0.006033     0.005645     0.011622
    Scale:         382253.1     0.006311     0.005929     0.018508
    Add:           421359.5     0.008711     0.008068     0.021030
    Triad:         408012.3     0.008970     0.008332     0.019766
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          394696.5     0.006160     0.005742     0.022151
    Scale:         386417.5     0.006283     0.005865     0.010724
    Add:           414235.3     0.008654     0.008207     0.023914
    Triad:         398357.4     0.009105     0.008534     0.029001
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          396987.7     0.006275     0.005709     0.054602
    Scale:         380966.3     0.006411     0.005949     0.012462
    Add:           413071.2     0.008757     0.008230     0.058634
    Triad:         397979.4     0.009042     0.008542     0.016576
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          395123.1     0.006204     0.005736     0.013740
    Scale:         380645.9     0.006716     0.005954     0.056543
    Add:           414127.0     0.008792     0.008209     0.042559
    Triad:         398680.4     0.009151     0.008527     0.018464
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          397751.8     0.006195     0.005698     0.014172
    Scale:         383224.0     0.006464     0.005914     0.025427
    Add:           414825.8     0.008741     0.008195     0.023310
    Triad:         400382.1     0.009250     0.008491     0.031880
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          395599.9     0.006272     0.005729     0.055929
    Scale:         387283.4     0.006598     0.005852     0.019510
    Add:           414427.9     0.008822     0.008203     0.057628
    Triad:         398168.3     0.009365     0.008538     0.024487
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          395106.6     0.006227     0.005736     0.012475
    Scale:         383424.9     0.006604     0.005911     0.135234
    Add:           415236.5     0.008757     0.008187     0.015858
    Triad:         400236.0     0.009139     0.008494     0.016357
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          394565.4     0.006253     0.005744     0.021472
    Scale:         381486.1     0.006486     0.005941     0.012711
    Add:           413922.6     0.008824     0.008213     0.026116
    Triad:         398446.5     0.009428     0.008532     0.054950
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          396011.9     0.006345     0.005723     0.055082
    Scale:         382053.4     0.006477     0.005932     0.037756
    Add:           415951.2     0.008905     0.008173     0.059440
    Triad:         399574.2     0.009203     0.008508     0.018427
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          395665.8     0.006477     0.005728     0.029760
    Scale:         383471.3     0.006535     0.005910     0.016319
    Add:           413778.5     0.008959     0.008216     0.023425
    Triad:         399899.2     0.009258     0.008501     0.018282
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          396491.0     0.006429     0.005716     0.020900
    Scale:         383224.0     0.006531     0.005914     0.029009
    Add:           415284.9     0.008942     0.008186     0.023462
    Triad:         397846.2     0.009269     0.008545     0.023168
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          403714.9     0.006431     0.005614     0.053611
    Scale:         378237.7     0.006667     0.005992     0.055854
    Add:           421995.5     0.009127     0.008056     0.058290
    Triad:         403315.2     0.009863     0.008429     0.134397
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          404058.1     0.006639     0.005609     0.024329
    Scale:         384463.9     0.006897     0.005895     0.056308
    Add:           420477.3     0.009536     0.008085     0.042015
    Triad:         406546.5     0.009711     0.008362     0.043979
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          397186.8     0.006756     0.005706     0.055778
    Scale:         376350.9     0.006862     0.006022     0.024000
    Add:           413682.4     0.009218     0.008218     0.056509
    Triad:         399160.4     0.009873     0.008517     0.035960
    -------------------------------------------------------------
    Function    Best Rate MB/s  Avg time     Min time     Max time
    Copy:          394434.5     0.006698     0.005746     0.057386
    Scale:         384775.2     0.007042     0.005890     0.055602
    Add:           414319.5     0.009339     0.008205     0.059440
    Triad:         398491.0     0.009642     0.008531     0.024842
    -------------------------------------------------------------
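
Adding up the 16 Triad values by hand is error-prone, so the sum can be computed from the combined output. The following is a minimal sketch, assuming the output of all ranks arrives on stdin with each line intact; the function name `sum_triad` is hypothetical.

```shell
#!/bin/bash
# Sum the Triad "Best Rate MB/s" values found in STREAM output read from stdin
# and report how many Triad lines (one per process) were counted.
sum_triad() {
    awk '$1 == "Triad:" { sum += $2; n++ }
         END { printf "processes=%d total_triad_MBps=%.1f\n", n, sum }'
}
```

For example: `mpirun -np 16 --map-by numa ./bw_node_mem_on_chip.sh | sum_triad`. For the 16 Triad values in the sample output above, the total is 6404244.3 MB/s. If the reported process count is not 16, some output lines were lost or interleaved and the total should not be trusted.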