Environment

Physical Networking

The cluster consists of one client, one controller node, and three compute nodes. Figure 1 shows the networking diagram. The controller node functions as the server, and the compute nodes serve as agent1, agent2, and agent3 of the big data cluster. In proof of concept (POC) test scenarios, the client can be deployed on the controller node.

Figure 1 Networking diagram

Hardware Requirements

Table 1 lists the hardware requirements.

Table 1 Hardware requirements

Item                   Description
Processor              Kunpeng 920 5250
Memory size            384 GB (12 x 32 GB)
Memory frequency       2666 MHz
Network                10GE for the service network and GE for the management network
Drive                  System drive: 1 x RAID 0 (1 x 1.2 TB SAS HDD)
                       Data drive: 12 x RAID 0 (1 x 4 TB SATA HDD)
RAID controller card   LSI SAS3508

OS and Software Requirements

Table 2 lists the OS and software requirements.

Table 2 OS and software requirements

Item         Description
OS           openEuler 22.03 LTS SP1
JDK          BiSheng JDK 1.8.0_342
ZooKeeper    3.6.2
Hadoop       3.2.0
Spark        3.3.1

  • The machine learning algorithm library is adapted to Spark 3.1.1 and supports the SVM, DBSCAN, DTB, and Word2Vec algorithms.
  • The library can also be adapted to Spark 2.x and 3.x versions.
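
Before deployment, it can help to compare what is installed on each node with the versions in Table 2. The following is a minimal sketch of such a comparison, not part of the product. The names REQUIRED_VERSIONS, find_mismatches, and node_inventory are illustrative assumptions, and the inventory is typed in by hand here rather than detected from the system.

```python
# Required versions copied from Table 2. All names in this sketch are
# illustrative assumptions and do not come from the product.
REQUIRED_VERSIONS = {
    "JDK": "1.8.0_342",
    "ZooKeeper": "3.6.2",
    "Hadoop": "3.2.0",
    "Spark": "3.3.1",
}


def find_mismatches(installed, required=REQUIRED_VERSIONS):
    """Return {component: (required, found)} for every component that differs.

    `installed` maps component names to the version strings found on a node.
    A component that is absent from `installed` is reported with found=None.
    """
    mismatches = {}
    for component, wanted in required.items():
        found = installed.get(component)
        if found != wanted:
            mismatches[component] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Hypothetical inventory for one node, entered by hand for illustration.
    node_inventory = {
        "JDK": "1.8.0_342",
        "ZooKeeper": "3.6.2",
        "Hadoop": "3.2.0",
        "Spark": "3.1.1",
    }
    for name, (wanted, found) in find_mismatches(node_inventory).items():
        print(f"{name}: required {wanted}, found {found}")
```

The comparison is an exact string match, which suits a pinned test environment like this one. If other Spark 2.x or 3.x versions are acceptable in your setup, the check for Spark would need to be relaxed accordingly.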