
Multi-Replica Configuration Steps for General Applications

  1. Install patchelf using yum.

    yum install patchelf
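Besides patchelf, the script in the next step calls numactl, ldd, and readlink. A minimal sketch for checking that these tools are on PATH before continuing; the helper name missing_tools is hypothetical and not part of the original procedure:

```shell
# Hypothetical helper: print every tool from the argument list that is not on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools called by numa-duplication.sh
missing=$(missing_tools patchelf numactl ldd readlink)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools are available."
fi
```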

  2. Create numa-duplication.sh.
    #!/bin/bash
     
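    # Collect the shared-library dependencies of an ELF file (recursively)
    # in the global array "dependencies"
    function get_dependencies() {
        local path=$1
        local dep resolved_dep
        # Keep only "name => /absolute/path" lines; this skips the vDSO,
        # the dynamic loader, and "not found" entries
        for dep in $(ldd "$path" 2>/dev/null | awk '$2 == "=>" && $3 ~ /^\// {print $3}'); do
            if [ -f "$dep" ]; then
                resolved_dep=$(readlink -f "$dep")
                if ! [[ " ${dependencies[*]} " =~ " ${dep} " ]]; then
                    dependencies+=("$dep")
                    get_dependencies "$resolved_dep"
                fi
            fi
        done
    }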
     
    # Rewrite each needed-library entry of an ELF file to the absolute path
    # of the library copy in the given lib directory
    function replace_needed() {
        local elf_path=$1
        local numa_lib_path=$2
        local elf_dependencies=($(patchelf --print-needed "$elf_path"))
        local dep so_real_name
        for dep in "${elf_dependencies[@]}"; do
            # Leave the dynamic loader untouched
            if [ "$dep" == "ld-linux-aarch64.so.1" ]; then
                continue
            fi
     
            so_real_name=${dependencies_hash[$dep]}
            if [ -z "$so_real_name" ]; then
                echo "Error: Dependency $dep not found. $elf_path" >&2
                exit 1
            fi
     
            patchelf --replace-needed "$dep" "$numa_lib_path/$so_real_name" "$elf_path"
        done
    }
     
    # Get the number of NUMA nodes
    numa_node_num=$(numactl --hardware | grep "available:" | awk '{print $2}')
     
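    # Get the path of the application
    app_path=$1
    if [ ! -f "$app_path" ]; then
        # Not a file path: look the name up on PATH
        app_path=$(command -v "$app_path")
    fi
    if [ -z "$app_path" ] || [ ! -f "$app_path" ]; then
        echo "Usage: $0 /path_to_app/app" >&2
        exit 1
    fi
    app_name=$(basename "$app_path")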
     
    # Collect all dependencies of the application in a global array
    declare -g dependencies=()
    get_dependencies "$app_path"
    echo "${dependencies[@]}"
     
    # Map each library name to the name of the file it resolves to
    declare -A dependencies_hash
    for dep in "${dependencies[@]}"; do
        so_name=$(basename "$dep")
        dependencies_hash[$so_name]=$(basename "$(readlink -f "$dep")")
    done
     
    # Create one copy of the application and its libraries per NUMA node
    # and patch the dependencies of every copy
    # rm -rf $app_name
    pwd_path=$(pwd)
    # Note: this fixed value overrides the node count detected with numactl
    # above; adjust it to your system
    numa_node_num=16
    pids=()
    for i in $(seq 0 $((numa_node_num-1)))
    do
        (
            path_numa="./noshared_libraries/numa_$i"
            path_numa_lib="$path_numa/lib"
            mkdir -p "$path_numa_lib"
            cp "$app_path" "$path_numa"
            replace_needed "$path_numa/$app_name" "$pwd_path/noshared_libraries/numa_$i/lib"
            for so_item in "${dependencies[@]}"
            do
                resolved_dep=$(readlink -f "$so_item")
                cp "$resolved_dep" "$path_numa_lib"
                so_name=$(basename "$resolved_dep")
                replace_needed "$path_numa_lib/$so_name" "$pwd_path/noshared_libraries/numa_$i/lib"
            done
        ) &
        pids+=($!)
    done
     
    # Wait for all copies; "exit 1" in replace_needed only ends its subshell,
    # so the failure is collected here
    status=0
    for pid in "${pids[@]}"; do
        wait "$pid" || status=1
    done
    exit $status
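The dependency list is built from the third column of the ldd output. The sketch below runs a stricter variant of that filter on illustrative ldd-style text (the library names and addresses are made up) to show which lines yield a library path. The function name extract_paths is hypothetical.

```shell
# Illustrative ldd-style output; addresses and libfoo are made up.
sample='    linux-vdso.so.1 (0x0000ffff8a5f0000)
    libm.so.6 => /lib64/libm.so.6 (0x0000ffff8a4e0000)
    libfoo.so.1 => not found
    libc.so.6 => /lib64/libc.so.6 (0x0000ffff8a330000)
    /lib/ld-linux-aarch64.so.1 (0x0000ffff8a5b0000)'

# Hypothetical helper: keep only lines of the form "name => /absolute/path ..."
# and print the path. The vDSO, the loader, and "not found" entries are dropped.
extract_paths() {
    awk '$2 == "=>" && $3 ~ /^\// {print $3}'
}

printf '%s\n' "$sample" | extract_paths
# prints:
# /lib64/libm.so.6
# /lib64/libc.so.6
```

Only resolved libraries are copied into the per-node lib directories. The loader line has no "=>" field, which matches the script's explicit skip of ld-linux-aarch64.so.1.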
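  3. Run numa-duplication.sh to package the binary. This creates a noshared_libraries folder that contains the binary and its dependencies.
    bash numa-duplication.sh /path_to_app/app
    • path_to_app is the directory that contains app, the executable that will ultimately be run. Replace it with the actual path.
    • numa-duplication.sh and app must be stored in the same folder.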
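After packaging, it can be useful to confirm that every replica directory was created. A minimal sketch, assuming the layout noshared_libraries/numa_<i>/ with a lib subdirectory as produced by the script; the helper name check_replicas is hypothetical:

```shell
# Hypothetical helper: report replicas whose binary or lib directory is missing.
# Arguments: root directory, number of replicas, application file name.
check_replicas() {
    local root="$1" count="$2" app="$3"
    local i status=0
    for i in $(seq 0 $((count - 1))); do
        if [ ! -f "$root/numa_$i/$app" ] || [ ! -d "$root/numa_$i/lib" ]; then
            echo "numa_$i: incomplete"
            status=1
        fi
    done
    return $status
}

# Example call for the 16 replicas of this guide:
#   check_replicas ./noshared_libraries 16 app && echo "all replicas present"
```

To inspect a single replica, patchelf --print-needed noshared_libraries/numa_0/app should list paths under that replica's lib directory rather than bare library names.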
  4. Create the run script run.sh (this example uses 16 processes with 16 threads each).
    #!/bin/bash
     
     
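    offset=2        # first core used within each NUMA node
    cores=38        # cores per NUMA node
    nppernuma=1     # processes per NUMA node
    thdperproc=16   # threads per process
     
    rank=${OMPI_COMM_WORLD_LOCAL_RANK}    # local rank on this host, set by Open MPI
    numaid=$((rank / nppernuma))          # NUMA node (and replica) used by this rank
    numastart=$((numaid*cores))           # first core of that NUMA node
    mode=$((rank - numaid * nppernuma))   # index of this rank within its NUMA node
    coreid=$((numastart + offset + mode * thdperproc))   # first core for this rank
    corend=$((coreid + thdperproc - 1))                  # last core for this rank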
     
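    # Default binary; switch to the per-NUMA copy if noshared_libraries
    # exists in the current working directory
    fun_c=/path_to_app/app
     
    if [[ -d noshared_libraries ]]; then
      fun_c=/path_to_app/noshared_libraries/numa_${numaid}/app
    fi
     
    # Offset added to numaid to select the memory node passed to numactl -m
    numa_offset=16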
     
    # Provide the necessary runtime parameters at …
    taskset -c ${coreid}-${corend} numactl -m $(($numa_offset+numaid)) ${fun_c} …
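The binding arithmetic in run.sh can be checked without launching MPI. The sketch below reuses the constants from run.sh and prints the core range and memory node that a given local rank would receive. The helper name binding_for_rank is hypothetical.

```shell
# Same constants as in run.sh
offset=2; cores=38; nppernuma=1; thdperproc=16; numa_offset=16

# Hypothetical helper: print "<first core>-<last core> <memory node>" for a local rank.
binding_for_rank() {
    local rank="$1"
    local numaid=$((rank / nppernuma))
    local mode=$((rank - numaid * nppernuma))
    local coreid=$((numaid * cores + offset + mode * thdperproc))
    local corend=$((coreid + thdperproc - 1))
    echo "${coreid}-${corend} $((numa_offset + numaid))"
}

for rank in 0 1 15; do
    echo "rank $rank -> $(binding_for_rank "$rank")"
done
# prints:
# rank 0 -> 2-17 16
# rank 1 -> 40-55 17
# rank 15 -> 572-587 31
```

With one process per NUMA node, each rank runs its own replica (numa_0 to numa_15) on 16 cores of its node, and numactl -m binds its memory to node 16 + numaid.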
     
    
  5. Run the application with multiple replicas.
    mpirun -x PATH -x LD_LIBRARY_PATH -bind-to none -np 16 -N 16 -x OMP_NUM_THREADS=16 run.sh

    If you run as the root user, add the --allow-run-as-root option.
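The process count given to mpirun, the number of replicas, and the constants in run.sh have to agree. A minimal sketch of such a consistency check, under the assumption that every rank must map to an existing numa_<id> directory and that all ranks of a node must fit within its cores; the helper name check_launch is hypothetical:

```shell
# Hypothetical helper: succeed only if the launch parameters are mutually consistent.
# Arguments: np, replicas, nppernuma, thdperproc, offset, cores
check_launch() {
    local np="$1" replicas="$2" nppernuma="$3" thdperproc="$4" offset="$5" cores="$6"
    # Every rank must map to an existing numa_<id> replica directory.
    if [ $(( (np + nppernuma - 1) / nppernuma )) -gt "$replicas" ]; then
        echo "not enough replicas"
        return 1
    fi
    # All ranks placed on one NUMA node must fit into that node's cores.
    if [ $(( offset + nppernuma * thdperproc )) -gt "$cores" ]; then
        echo "core range exceeds NUMA node"
        return 1
    fi
    echo "ok"
}

check_launch 16 16 1 16 2 38   # values from this guide; prints "ok"
```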