Tuning Process Flow

As a stream processing framework, Flink ingests data, performs computation and analysis, and outputs the results. In a typical data processing scenario, Kafka, Flink, and Redis are used together: data is read from Kafka, Flink performs the computation and analysis, and the results are written to Redis.

The stream processing framework has certain programming specifications. Figure 1 shows the process flow.

Figure 1 Stream processing framework

Each Flink program is written following a similar pattern. The main idea is to define the Flink stream processing pipeline, align the input and output of each step, and finally write out the results. After the workflow is defined, it can be executed. Execution runs continuously and never stops until the Flink job is cancelled or killed by an external operation.
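The source → transform → sink pattern described above can be sketched in plain Python. This is an illustrative skeleton only, not the actual Flink API: the `Pipeline` class and its methods are assumptions made for the example, and a real Flink job would run this loop indefinitely over an unbounded stream.

```python
# Plain-Python sketch of the source -> transform -> sink pattern a Flink
# job follows. All names here are illustrative, not the real Flink API.
from typing import Callable, Iterable


class Pipeline:
    """Chains processing steps so each step's output feeds the next input."""

    def __init__(self, source: Iterable):
        self.source = source
        self.steps: list[Callable] = []

    def map(self, fn: Callable) -> "Pipeline":
        # Each transform is appended in order; input/output must align.
        self.steps.append(fn)
        return self

    def execute(self, sink: Callable) -> None:
        # In Flink this loop would never terminate for an unbounded source.
        for record in self.source:
            for step in self.steps:
                record = step(record)
            sink(record)


results = []
(Pipeline(range(5))
    .map(lambda x: x * 2)
    .map(lambda x: x + 1)
    .execute(results.append))
print(results)  # [1, 3, 5, 7, 9]
```

The key property mirrored from Flink is that the pipeline is first fully defined (source, transforms, sink) and only then executed.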

The yahoo-streaming-benchmark tool is used to test the Flink processing performance of the cluster. The test process includes generating data, writing the data to Kafka, reading the data from Kafka, preprocessing it, processing it in windows, and writing the results to Redis. This constitutes an end-to-end test of Flink stream processing. The main performance metrics are data throughput and processing latency.
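The windowed processing step can be illustrated with a small sketch of event-time tumbling windows. This mimics the logic only, under assumed event shapes (a timestamp plus a key); it is not the yahoo-streaming-benchmark code itself.

```python
# Hedged sketch of windowed aggregation: events carry an event time (ms)
# and a key; counts are aggregated per key in 10-second tumbling windows.
from collections import defaultdict

WINDOW_MS = 10_000  # assumed window size for the example


def window_start(event_time_ms: int) -> int:
    # Align each event to the start of its tumbling window.
    return event_time_ms - (event_time_ms % WINDOW_MS)


def aggregate(events):
    """events: iterable of (event_time_ms, key) pairs."""
    counts = defaultdict(int)
    for ts, key in events:
        counts[(window_start(ts), key)] += 1
    return dict(counts)


events = [(1_000, "ad1"), (4_000, "ad1"), (12_000, "ad1"), (15_000, "ad2")]
print(aggregate(events))
# {(0, 'ad1'): 2, (10000, 'ad1'): 1, (10000, 'ad2'): 1}
```

In the real benchmark the per-window counts are what get written to Redis, keyed by window and campaign.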

Data is generated by one Flink process and continuously written to Kafka. Another process reads the data from Kafka, processes it, and writes the results to Redis. After the Flink job completes, the results collected in Redis are analyzed.