Cluster Environment
The cluster consists of one client, one controller node, and three compute nodes. Figure 1 shows the networking diagram. The controller node functions as the server, and the compute nodes are agent1, agent2, and agent3 of the big data cluster. In proof of concept (POC) test scenarios, the client can be deployed on the controller node.
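The layout described above can be written down as a small inventory, which may be useful when scripting deployment checks. This is a minimal sketch: the names server, agent1, agent2, and agent3 follow the roles in the text, while the `ClusterLayout` class, the `client_host` field, and the default client name `client` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClusterLayout:
    """Hypothetical inventory of the cluster roles described in this section."""

    controller: str = "server"
    compute_nodes: List[str] = field(
        default_factory=lambda: ["agent1", "agent2", "agent3"]
    )
    # Assumed placeholder name; in POC scenarios the client may share
    # the controller host instead of running on its own machine.
    client_host: str = "client"

    def hosts(self) -> List[str]:
        """Return the distinct hosts in the cluster, preserving order."""
        distinct: List[str] = []
        for host in [self.client_host, self.controller, *self.compute_nodes]:
            if host not in distinct:
                distinct.append(host)
        return distinct


standard = ClusterLayout()                   # separate client machine
poc = ClusterLayout(client_host="server")    # client co-located with the controller
```

With a separate client, `standard.hosts()` contains five machines; in the POC layout, `poc.hosts()` contains four because the client and the controller share one host.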
Cluster Hardware
Table 1 lists the hardware configurations in the cluster (controller node and compute nodes).
| Item | Requirement |
|---|---|
| Processor | Kunpeng 920 processor |
| Memory size | 384 GB (12 x 32 GB) |
| Memory frequency | 2666 MHz |
| NIC | 10GE for the service network and GE for the management network |
| Drive | System drive: 1 x RAID 0 (1 x 1.2 TB SAS HDD); data drive: 12 x RAID 0 (1 x 4 TB SATA HDD) |
| RAID controller card | LSI SAS3508 |
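The per-node totals implied by Table 1 can be checked with a few lines of arithmetic. This is a sketch only: the figures are raw capacities before file system overhead or HDFS replication, and the last figure assumes that data is stored on the three compute nodes only, which the table does not state.

```python
# Per-node figures taken from Table 1.
DIMM_COUNT = 12
DIMM_SIZE_GB = 32
DATA_DRIVE_COUNT = 12       # each data drive is its own single-drive RAID 0
DATA_DRIVE_SIZE_TB = 4
COMPUTE_NODE_COUNT = 3      # agent1, agent2, agent3

memory_per_node_gb = DIMM_COUNT * DIMM_SIZE_GB
raw_data_per_node_tb = DATA_DRIVE_COUNT * DATA_DRIVE_SIZE_TB

# Assumption: only the compute nodes hold data.
raw_data_compute_nodes_tb = raw_data_per_node_tb * COMPUTE_NODE_COUNT

print(f"Memory per node:        {memory_per_node_gb} GB")
print(f"Raw data per node:      {raw_data_per_node_tb} TB")
print(f"Raw data, three agents: {raw_data_compute_nodes_tb} TB")
```

The memory total matches the 384 GB listed in the table. Because each RAID 0 group contains a single drive, the groups add no striping across drives and each one exposes the capacity of its drive.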
Cluster Software
Table 2 lists the required software versions.
| Item | Requirement |
|---|---|
| OS | openEuler-20.03-LTS-SP1 |
| JDK | BiSheng JDK 1.8.0_272 |
| ZooKeeper | 3.4.9 |
| Hadoop | 3.1.1 |
| Spark | Apache Spark 2.3.2 or 2.4.6 |
- The Spark deployment mode is Spark on Yarn.
- The Kunpeng algorithm library is compatible with Apache Spark 2.3.2 and 2.4.6. Other platforms are not verified. For security purposes, you are advised to use a later version.
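The compatibility note above can be turned into a simple pre-deployment check. The sketch below is illustrative and not part of the Kunpeng algorithm library: the function names are hypothetical, and it assumes the installed version appears in some captured text as `version X.Y.Z` (for example, the banner printed by `spark-submit --version`), which should be confirmed on the target system.

```python
import re
from typing import Optional

# Spark versions listed as compatible in Table 2.
VERIFIED_SPARK_VERSIONS = {"2.3.2", "2.4.6"}


def extract_spark_version(text: str) -> Optional[str]:
    """Return the first 'version X.Y.Z' found in text, or None.

    Assumes the version is reported in this form; adjust the pattern
    if the actual output differs.
    """
    match = re.search(r"version\s+(\d+\.\d+\.\d+)", text)
    return match.group(1) if match else None


def is_verified_spark(version: Optional[str]) -> bool:
    """Report whether the version is one of those listed as compatible."""
    return version in VERIFIED_SPARK_VERSIONS


# Hypothetical captured output used only to exercise the functions.
sample_output = "Welcome to Spark   version 2.4.6"
detected = extract_spark_version(sample_output)
print(detected, is_verified_spark(detected))
```

A version that is not in the set is not necessarily incompatible; it only means that the combination has not been verified.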
