Creating a Microarchitecture Analysis Task
Function
Microarchitecture analysis is based on Arm Performance Monitor Unit (PMU) events. It shows how instructions run on the CPU pipeline, which helps you quickly locate the performance bottlenecks of the current application on the CPU. You can then modify the program to make full use of the available hardware resources.
Prerequisites
No offline nodes exist.
You must have the root permission to perform the following operations.
- If the configuration of Paranoid is incorrect, set the Paranoid variable to -1. For example, in CentOS and openEuler, run the following command:
echo -1 > /proc/sys/kernel/perf_event_paranoid
- If a message is displayed indicating that data collection failed and the OS performance monitor is not enabled, run the following command to enable it:
echo 0 > /proc/sys/kernel/nmi_watchdog
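Before changing either setting, it can help to check the current values. The following is a minimal sketch and not part of the tool: the function name check_perf_prereqs and its optional file arguments are hypothetical, and the arguments exist only so that the check can be pointed at other files for testing. It reads the two kernel files named above and reports whether they already hold the values this section sets (-1 and 0).

```shell
#!/bin/sh
# Sketch: report whether the two kernel settings from the prerequisites
# already have the values this section sets.
# check_perf_prereqs is a hypothetical helper name, not a DevKit command.
check_perf_prereqs() {
    paranoid_file=${1:-/proc/sys/kernel/perf_event_paranoid}
    watchdog_file=${2:-/proc/sys/kernel/nmi_watchdog}
    rc=0

    # Fall back to a marker string if a file cannot be read.
    paranoid=$(cat "$paranoid_file" 2>/dev/null) || paranoid=unreadable
    watchdog=$(cat "$watchdog_file" 2>/dev/null) || watchdog=unreadable

    if [ "$paranoid" = "-1" ]; then
        echo "perf_event_paranoid: OK (-1)"
    else
        echo "perf_event_paranoid: $paranoid (expected -1)"
        rc=1
    fi

    if [ "$watchdog" = "0" ]; then
        echo "nmi_watchdog: OK (0)"
    else
        echo "nmi_watchdog: $watchdog (expected 0)"
        rc=1
    fi

    # Non-zero exit status means at least one setting still needs changing.
    return "$rc"
}
```

On a real node, calling check_perf_prereqs with no arguments reads the /proc files directly. A non-zero return status indicates that one of the echo commands above still needs to be run as root.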
Procedure
- Click the icon next to System Profiler and choose General Analysis from the drop-down list. The page for creating a task is displayed.
- Set task parameters by referring to "Task Management" and Table 1. Create a microarchitecture analysis task. See Figure 1.
Table 1 Parameters for creating a microarchitecture analysis task
Parameter
Description
Task Name
Name of the task. The name must meet the following requirements:
- Contain only letters, digits, and underscores (_).
- Contain 1 to 64 characters.
Select Nodes
Select the nodes to be analyzed. If there is only one node, this node is selected by default. A maximum of 10 nodes can be selected at a time.
Analysis Object
Select System or Application.
Mode
Select Launch application or Attach to process.
This parameter is mandatory when Analysis Object is set to Application.
Application Path
Enter the absolute path of the application to be analyzed. For example, to analyze the loop_test application stored in the /home/test directory, enter /home/test/loop_test.
This parameter is mandatory when Analysis Object is set to Application and Mode is set to Launch application.
NOTE:
- By default, applications in the /opt/ or /home/ directory are analyzed. To change this, the administrator can click the icon in the upper right corner of the home page, choose Tool Settings > System Profiler > System Settings, and enter application paths (separated by semicolons) in the Application Path text box. Only administrators can modify this parameter. Common users can only view it.
You are advised to set the application path to a path such as /home or /opt. Do not set the application path to a system directory such as /, /dev, /sys, or /boot. Otherwise, system exceptions may occur.
- The OS running user (devkitworker1) of the System Profiler must have the read and execute permissions on the applications to be analyzed.
- In the multi-node scenario, you can enable Configure Node Parameters to configure this parameter separately for each node.
(Optional) Application Parameters
Set application parameters based on the actual scenario.
This parameter is available when Analysis Object is set to Application and Mode is set to Launch application.
NOTE: You can enable Configure Node Parameters to configure this parameter separately for each node.
(Optional) Application User
Information about the OS user who runs the application. By default, the application runs as the preset devkitworker1 user. If the application can run only as a specific user, enable this option and configure the corresponding user name and password before running the application.
This parameter is available when Mode is set to Launch application. By default, this parameter is disabled.
Name
Name of the OS user who runs the application.
This parameter is mandatory when Application User is enabled.
Password
User password.
This parameter is mandatory when Application User is enabled.
Process Name
Enter a process name. The process name can be a regular expression. Enter either the process name or PID.
This parameter is mandatory when Analysis Object is set to Application and Mode is set to Attach to process.
PID
Enter the IDs of the processes to be analyzed. A maximum of 128 PIDs can be entered. Use commas (,) to separate them. Enter either the process name or PID.
This parameter is available when Analysis Object is set to Application and Mode is set to Attach to process.
NOTE:
- If Attach to process is selected, the tool attaches to the process that runs the application, identified by its PID, to trace and collect the performance data of the application in real time. The OS running user devkitworker1 of the System Profiler must have the read permission on the application.
- To query the PID, run the ps -ef | grep Program_name command.
- You can enable Configure Node Parameters to configure this parameter separately for each node.
Analysis Type
Select Microarchitecture.
Collection Mode
Data sampling mode. Select CPU (default) or Process/Threads.
Collection Duration (s)
Sampling duration, in seconds. The default value is 60. The value range is 1 to 900.
Top-Down Type
Metrics to be collected.
- level 1: Backend Bound, Bad Speculation, Frontend Bound, and Retiring are collected by default.
- Other levels: You can select the metrics to be collected.
Analysis Metric
Metrics to be analyzed. This parameter is available when Top-Down Type is set to Other levels.
- Bad Speculation: Pipeline resources wasted due to incorrect instruction speculation.
- Frontend Bound: The frontend is the part of the processor where instructions are fetched and decoded into micro-ops (uOps) for execution by the backend pipeline. This metric reflects the proportion of processor frontend resources that are underutilized.
- Backend Bound->Resource Bound: (applicable to Kunpeng 920 processors; hidden if not supported by the environment) Backend is the processor portion that performs out-of-order dispatch and execution of uOps and returns results. Resource Bound is a subclass of Backend Bound. It reflects pipeline stalls that occur when uOps are dispatched to an out-of-order execution scheduler due to insufficient resources.
- Backend Bound->Core Bound: Backend is the processor portion that performs out-of-order dispatch and execution of uOps and returns results. Core Bound is a subclass of Backend Bound. It reflects the ratio of performance bottlenecks due to insufficient CPU execution unit resources.
- Backend Bound->Memory Bound: Backend is the processor portion that performs out-of-order dispatch and execution of uOps and returns results. Memory Bound is a subclass of Backend Bound. It reflects pipeline stalls due to data read/write waiting.
(Optional) CPU Cores to Be Sampled
IDs of the CPU cores to be sampled. This parameter is available in Advanced Configurations when Collection Mode is CPU.
NOTE: Set this parameter if you want to collect the performance data of an application on specific CPU cores. Enter one or more CPU core IDs. The value ranges from 0 to the total number of CPU cores of the server minus 1. For example, if you enter 0-2,10 for a 16-core CPU, performance data of CPU cores 0, 1, 2, and 10 is collected and analyzed.
(Optional) Collection Range
Collection range. This parameter is available in Advanced Configurations when Collection Mode is set to Process/Threads. The default value is All. The options are:
- All: collects performance data of the application layer and OS kernel.
- User Mode: collects performance data of the application layer.
- Kernel Mode: collects performance data of the OS kernel.
Collection Delay (s)
Sampling delay, in seconds. The value ranges from 0 (default) to 899. This parameter can be configured in Advanced Configurations and is mandatory when Analysis Object is Application and Mode is Launch application.
NOTE: Sampling starts after the specified delay. Use this parameter to skip analysis of program startup, let the sampled program warm up, and exclude delays caused by factors such as environment detection.
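The value constraints listed in Table 1 can be checked before the form is filled in. The following is a minimal sketch under stated assumptions, not the tool's own validation: all function names are hypothetical, "letters" is taken to mean ASCII letters, and PID lists are assumed to contain no spaces.

```shell
#!/bin/sh
# Sketch of the value constraints listed in Table 1.
# All function names are hypothetical; the tool performs its own checks.

# Task Name: 1 to 64 characters, only letters, digits, and underscores.
# Assumption: "letters" means ASCII letters.
valid_task_name() {
    name=$1
    [ "${#name}" -ge 1 ] && [ "${#name}" -le 64 ] || return 1
    case $name in
        *[!A-Za-z0-9_]*) return 1 ;;
    esac
    return 0
}

# True if $1 is a non-negative integer within [$2, $3].
int_in_range() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Collection Duration (s): 1 to 900. Collection Delay (s): 0 to 899.
valid_duration() { int_in_range "$1" 1 900; }
valid_delay()    { int_in_range "$1" 0 899; }

# PID: comma-separated process IDs, at most 128 entries.
# Assumption: no spaces around the commas.
valid_pid_list() {
    list=$1
    case $list in
        ''|*[!0-9,]*|,*|*,|*,,*) return 1 ;;
    esac
    # The number of entries is the number of commas plus one.
    commas=$(printf '%s' "$list" | tr -cd ',' | wc -c)
    [ $((commas + 1)) -le 128 ]
}
```

Each function returns status 0 when the value satisfies the documented constraint, so a call such as valid_duration 60 && echo ok can be used in a wrapper script.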
- Click OK.
You can click the icons next to the task name to perform the following operations:
- Cancel the analysis task. After an analysis task is canceled, the collected information is deleted.
- Restart the analysis task. You can modify the task parameter settings before restarting. This button is available when a task is canceled or fails.
- Delete the analysis task. After a task is deleted, all data of this task is deleted. Exercise caution when performing this operation.
- Perform the analysis again. The analysis task is renamed and restarted.
- Create a task for comparing analysis results.
- Change the task or report name. The report naming rule is the same as that of a task.
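The CPU Cores to Be Sampled format described in Table 1 (for example, 0-2,10 on a 16-core CPU) can be illustrated with a short sketch. This is only an assumption about how such a list might be interpreted, not the tool's parser: the function name expand_cpu_cores is hypothetical, and the sketch does not handle IDs with leading zeros or duplicate entries.

```shell
#!/bin/sh
# Sketch: expand a CPU core list such as "0-2,10" into individual core IDs
# and reject IDs outside 0..(total_cores - 1).
# expand_cpu_cores is a hypothetical helper name, not a DevKit command.
expand_cpu_cores() {
    spec=$1
    total=$2
    case $spec in
        ''|,*|*,|*[!0-9,-]*) return 1 ;;   # only digits, commas, and hyphens
    esac
    result=
    # Split the list on commas into the positional parameters.
    old_ifs=$IFS
    IFS=,
    set -- $spec
    IFS=$old_ifs
    for part in "$@"; do
        case $part in
            *-*) first=${part%%-*}; last=${part#*-} ;;
            *)   first=$part; last=$part ;;
        esac
        # Both ends must be plain integers (rejects "", "-", and "1-2-3").
        case $first in ''|*[!0-9]*) return 1 ;; esac
        case $last  in ''|*[!0-9]*) return 1 ;; esac
        [ "$first" -le "$last" ] && [ "$last" -lt "$total" ] || return 1
        core=$first
        while [ "$core" -le "$last" ]; do
            result="$result${result:+ }$core"
            core=$((core + 1))
        done
    done
    printf '%s\n' "$result"
}
```

For example, expand_cpu_cores '0-2,10' 16 prints 0 1 2 10, matching the example in Table 1, while expand_cpu_cores '0-16' 16 fails because core 16 does not exist on a 16-core CPU.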
