Introduction
Product Description
HiBench is a big data benchmark suite that helps you evaluate the performance, throughput, and system resource utilization of different big data platforms. It contains a set of Hadoop, Spark, and streaming workloads, including Sort, WordCount, TeraSort, Sleep, SQL, PageRank, Nutch indexing, Bayes, Kmeans, NWeight, and enhanced DFSIO. This document describes how to use HiBench to benchmark Spark on an HDP cluster.
Related Concepts
- Hadoop
Hadoop is an open-source distributed storage and computing framework that is widely used to store and process massive amounts of data. It processes data in a reliable, efficient, and scalable manner.
- Spark
Spark is a unified analytics engine for large-scale data processing. It is scalable, performs computation in memory, and has become a general-purpose platform for fast big data processing. Spark can serve as the processing engine for a variety of applications, such as real-time stream processing, machine learning, and interactive queries.
Application Scenarios
In this document, HiBench is used to measure the performance of Spark running on an HDP cluster.