Installation Process

Table 1 lists the components to be installed and their installation directories.

The algorithm packages need to be installed only on the client, not on the controller node or compute nodes.

Table 1 Installation directories

Node | Installation Directory | Components to Be Installed
Client | /home/test/sophon/lib | sophon-ml-acc_2.11-1.2.0.jar, sophon-ml-core_2.11-1.2.0.jar, sophon-ml-kernel-2.11-1.2.0-aarch_64.jar, fastutil-8.3.1.jar (third-party open-source library)
Client | /home/test/sophon/ | JAR test package, shell script for task submission

Perform the following steps:

  1. Log in to the client node as an authorized user of the big data component, install fastutil-8.3.1.jar (the third-party open-source library on which the algorithms depend) in the /home/test/sophon/lib directory, and set the permission for the JAR file to 750.
    1. Create a lib directory.
      mkdir -p /home/test/sophon/lib
    2. Go to the directory you created.
      cd /home/test/sophon/lib
    3. Download the fastutil-8.3.1.jar file.
      wget https://repo1.maven.org/maven2/it/unimi/dsi/fastutil/8.3.1/fastutil-8.3.1.jar
    4. Set the permission for the JAR file to 750.
      chmod 750 fastutil-8.3.1.jar
  2. Copy the machine learning and graph algorithm package files to the /home/test/sophon/lib/ directory on the client and set the permission for the JAR files to 750.
    cp /opt/Spark-ml-algo-lib-1.2.0/ml-core/target/sophon-ml-core_2.11-1.2.0.jar /home/test/sophon/lib
    cp /opt/Spark-ml-algo-lib-1.2.0/ml-accelerator/target/sophon-ml-acc_2.11-1.2.0.jar /home/test/sophon/lib
    cp /opt/sophon-ml-kernel-2.11-1.2.0-aarch_64.jar /home/test/sophon/lib
    chmod 750 /home/test/sophon/lib/sophon-ml-*.jar
  3. Save the JAR file of the user-developed algorithm test tool (for example, ml-test.jar) to /home/test/sophon/ on the client. This is the parent directory of the lib directory that holds the algorithm package files.
  4. Save the shell script for task submission in the /home/test/sophon/ directory where the test JAR file is stored. An example of the shell script content is as follows:
    #!/bin/bash
    spark-submit \
    --class com.bigdata.ml.RFMain \
    --driver-class-path "./lib/*" \
    --master yarn \
    --deploy-mode client \
    --driver-cores 36 \
    --driver-memory 50g \
    --jars "lib/fastutil-8.3.1.jar,lib/sophon-ml-acc_2.11-1.2.0.jar,lib/sophon-ml-core_2.11-1.2.0.jar,lib/sophon-ml-kernel-2.11-1.2.0-aarch_64.jar" \
    --conf "spark.executor.extraClassPath=fastutil-8.3.1.jar:sophon-ml-acc_2.11-1.2.0.jar:sophon-ml-core_2.11-1.2.0.jar:sophon-ml-kernel-2.11-1.2.0-aarch_64.jar" \
    ./ml-test.jar

    Table 2 describes the statements in the script.

    Table 2 Description of the statements in the script

    Statement | Description
    spark-submit | Jobs are submitted in spark-submit mode.
    --class com.bigdata.ml.RFMain | Entry class of the test program that invokes the algorithms.
    --driver-class-path "./lib/*" | Path on the client that stores the following files. Machine learning algorithms: sophon-ml-acc_2.11-1.2.0.jar, sophon-ml-core_2.11-1.2.0.jar, sophon-ml-kernel-2.11-1.2.0-aarch_64.jar, and fastutil-8.3.1.jar. Graph analysis algorithms: sophon-graph-kernel-2.11-1.2.0-aarch_64.jar and fastutil-8.3.1.jar.
    --conf "spark.executor.extraClassPath" | JAR files required on the executors: the algorithm library and the third-party open-source library fastutil on which it depends. Machine learning algorithms: sophon-ml-acc_2.11-1.2.0.jar, sophon-ml-core_2.11-1.2.0.jar, sophon-ml-kernel-2.11-1.2.0-aarch_64.jar, and fastutil-8.3.1.jar. Graph analysis algorithms: sophon-graph-kernel-2.11-1.2.0-aarch_64.jar and fastutil-8.3.1.jar.
    --master yarn | Spark tasks are submitted to the Yarn cluster.
    --deploy-mode client | Spark tasks are submitted in client mode.
    --driver-cores | Number of cores used by the driver process.
    --driver-memory | Memory used by the driver, which cannot exceed the total memory of a single node.
    --jars | JAR files required by the algorithms.
    ./ml-test.jar | JAR file used as the test program.

    By default, logs generated while the algorithm packages run are displayed on the client console and are not saved to files. To save the logs to your local PC, import a customized log4j.properties file. For details, see Saving Run Logs to a Local PC.
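
The copy-and-permission work in steps 1 and 2, and a check of the --jars list before submission, can be scripted. The following is a minimal sketch, not part of the product: the script name stage_jars.sh, the functions stage_jar and check_jars, and the LIB_DIR variable are hypothetical names introduced for illustration. Only the default directory and the 750 permission come from the steps above.

```shell
#!/bin/bash
# stage_jars.sh (hypothetical name): copy JAR files into the library
# directory and apply the 750 permission required by steps 1 and 2.
set -euo pipefail

# Target directory; defaults to the path used in this guide.
LIB_DIR="${LIB_DIR:-/home/test/sophon/lib}"

# stage_jar SRC: copy SRC into LIB_DIR and restrict it to mode 750.
stage_jar() {
    local src="$1"
    mkdir -p "$LIB_DIR"
    cp "$src" "$LIB_DIR/"
    chmod 750 "$LIB_DIR/$(basename "$src")"
}

# check_jars LIST: verify that every path in a comma-separated list
# (the same format as the value of --jars) exists as a regular file.
# Returns 1 and prints the missing paths to stderr if any is absent.
check_jars() {
    local list="$1" jar missing=0
    local IFS=','
    for jar in $list; do
        if [ ! -f "$jar" ]; then
            echo "missing JAR: $jar" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Stage every JAR file passed on the command line.
for jar in "$@"; do
    stage_jar "$jar"
done
```

For example, running ./stage_jars.sh /opt/sophon-ml-kernel-2.11-1.2.0-aarch_64.jar copies that file to /home/test/sophon/lib and sets its mode to 750. Calling check_jars from /home/test/sophon/ with the value passed to --jars reports any library file that is still missing before spark-submit is run.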