Creating a Miss Event Analysis Task

Function

Miss event analysis is based on the Arm Statistical Profiling Extension (SPE). SPE samples instructions and records information about the events they trigger, including the precise program counter (PC) value. The analysis covers miss events such as LLC misses, TLB misses, remote accesses, and long latency loads, and pinpoints the code that causes them. You can then modify the program to reduce the probability of these miss events and improve program performance.

Prerequisites

  • No nodes are in the Offline state.
  • The miss event analysis function is available on openEuler 20.x or later and on openEuler-based OS releases. In addition, the Statistical Profiling Extension (SPE) must be properly configured. For details, see Configuring the SPE Environment.
  • Miss event analysis is not supported in VM and container environments.

    You must have root permissions to perform the following operations.

    1. If the perf_event_paranoid kernel parameter is not configured correctly, set it to -1 by running the following command:
      echo -1 > /proc/sys/kernel/perf_event_paranoid
      
    2. If a message is displayed indicating that data collection failed and the OS performance monitor is not enabled, run the following command to enable it:
      echo 0 > /proc/sys/kernel/nmi_watchdog
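
Before changing either setting, it can help to see what the system currently reports. The following is a minimal read-only sketch, not part of the tool: check_setting is a hypothetical helper that compares the content of a file with the value this section asks for, and it assumes both files are readable by the current user.

```shell
#!/bin/sh
# Sketch: report whether the two kernel settings already hold the values
# used in this section. Read-only; nothing is written.

# check_setting FILE EXPECTED  (hypothetical helper)
# Prints "ok" if FILE contains EXPECTED, "change" if it holds another
# value, and "missing" if FILE cannot be read.
check_setting() {
    file=$1
    expected=$2
    if [ ! -r "$file" ]; then
        echo "missing"
        return 0
    fi
    current=$(cat "$file")
    if [ "$current" = "$expected" ]; then
        echo "ok"
    else
        echo "change"
    fi
}

echo "perf_event_paranoid: $(check_setting /proc/sys/kernel/perf_event_paranoid -1)"
echo "nmi_watchdog: $(check_setting /proc/sys/kernel/nmi_watchdog 0)"
```

A result of "change" means the corresponding echo command above still needs to be run as root.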
      

Procedure

  1. Click the icon next to System Profiler.

    Choose General Analysis from the drop-down list. The page for creating a task is displayed.

  2. Set task parameters by referring to "Task Management" and Table 1.
    Create a miss event analysis task. See Figure 1.
    Figure 1 Creating a miss event analysis task
    Table 1 Parameters for creating a miss event analysis task

    Parameter

    Description

    Task Name

    Name of the task. The name must meet the following requirements:

    1. Contain only letters, digits, and underscores (_).
    2. Contain 1 to 64 characters.

    Select Nodes

    Select the nodes to be analyzed. If there is only one node, this node is selected by default. A maximum of 10 nodes can be selected at a time.

    Analysis Object

    Select System or Application.

    Mode

    Select Launch application or Attach to process.

    This parameter is mandatory when Analysis Object is set to Application.

    Application Path

    Enter the absolute path of the application to be analyzed. For example, to analyze the loop_test application stored in the /home/test directory, enter /home/test/loop_test.

    This parameter is mandatory when Analysis Object is set to Application and Mode is set to Launch application.

    NOTE:
    • By default, applications in the /opt/ or /home/ directory can be analyzed. To change this, the administrator can click the icon in the upper right corner of the home page, choose Tool Settings > System Profiler > System Settings, and enter application paths (separated by semicolons) in the Application Path text box. Only administrators can modify this parameter; common users can only view it.

      You are advised to set the application path to a path such as /home or /opt. Do not set the application path to a system directory such as /, /dev, /sys, or /boot. Otherwise, system exceptions may occur.

    • The OS running user (devkitworker1) of the System Profiler must have read and execute permissions on the applications to be analyzed.
    • In the multi-node scenario, you can enable Configure Node Parameters to configure this parameter separately for each node.

    (Optional) Application Parameters

    Set application parameters based on the actual scenario.

    This parameter is available when Analysis Object is set to Application and Mode is set to Launch application.

    NOTE:

    You can enable Configure Node Parameters to configure this parameter separately for each node.

    (Optional) Application User

    Information about the OS user who runs the application. By default, the application runs as the preset devkitworker1 user. If the application can run only as a specific user, enable this option and configure that user's name and password before running the application.

    This parameter is available when Mode is set to Launch application. By default, this parameter is disabled.

    Name

    Name of the OS user who runs the application.

    This parameter is mandatory when Application User is enabled.

    Password

    User password.

    This parameter is mandatory when Application User is enabled.

    Process Name

    Enter a process name. The process name can be a regular expression. Enter either the process name or PID.

    This parameter is mandatory when Analysis Object is set to Application and Mode is set to Attach to process.

    PID

    Enter the IDs of the processes to be analyzed. A maximum of 128 PIDs can be entered. Use commas (,) to separate them. Enter either the process name or PID.

    This parameter is available when Analysis Object is set to Application and Mode is set to Attach to process.

    NOTE:
    • If Attach to process is selected, the tool attaches to the process that runs the application by its PID and collects the application's performance data in real time. The OS running user (devkitworker1) of the System Profiler must have read permission on the application.
    • To query the PID, run the ps -ef | grep Program_name command.
    • You can enable Configure Node Parameters to configure this parameter separately for each node.

    Analysis Type

    Select Memory Access.

    Access Analysis Type

    Select Miss events.

    Sampling Duration (s)

    Sampling duration, in seconds. The default value is 5. The value range is 1 to 300.

    Sampling Interval (Instructions)

    Sampling interval, in instructions. The default value is 8192. The value ranges from 1024 to 2^32-1. This parameter is set in Advanced Settings.

    (Optional) Sampling Delay (s)

    Sampling delay, in seconds. The default value is 1. The value ranges from 0 to 299 and must be smaller than the sampling duration. You can set this parameter in Advanced Settings.

    NOTE:

    Sampling starts after the specified delay. Use this parameter to skip program startup, let the sampled program warm up, and exclude delays caused by factors such as environment detection.

    (Optional) Indicator Type

    Type of the metrics. This parameter can be set in Advanced Settings. The options are:

    • LLC Miss: Number of memory request misses in the LLC.
    • TLB Miss: Number of CPU memory access or addressing operations for which no virtual-to-physical mapping is found in the TLB.
    • Remote Access: Number of cross-CPU DRAM accesses.
    • Long Latency Load: Ratio of cross-CPU DRAM accesses where the access latency exceeds the preset minimum latency.

    Minimum Delay (Clock Cycles)

    Minimum delay. The default value is 100. The value range is 1 to 4095.

    NOTE:

    This parameter is displayed when Indicator Type is set to Long Latency Load.

    (Optional) CPU Cores to Be Sampled

    Enter the IDs of the CPU cores to be sampled. You can set this parameter in Advanced Settings.

    NOTE:
    • Set this parameter if you want to collect the performance data of an application on specific CPU cores. Enter one or more CPU core IDs, separated by commas. The value range is 0 to the total number of CPU cores of the server minus 1. For example, if you enter 0,1,2,10 on a 16-core server, performance data of CPU cores 0, 1, 2, and 10 is collected and analyzed.
    • You can enable Configure Node Parameters to configure this parameter separately for each node.

    (Optional) Sampling Range

    Sampling range. The default value is All. You can set this parameter in Advanced Settings. The options are:

    • All: collects performance data of the application layer and OS kernel.
    • User Mode: collects performance data of the application layer.
    • Kernel Mode: collects performance data of the OS kernel.

    dwarf

    Specifies whether to collect source code information for functions. This option is visible in Advanced Settings and is disabled by default. Enabling this option may increase the analysis duration.

    C/C++ Source File Directory

    Enter the C/C++ source file project directory. This parameter is available when Analysis Object is set to Application and dwarf is enabled in Advanced Settings.

    NOTICE:

    The source code of the application must comply with the general programming specifications. Otherwise, the source code of hot functions in the analysis result may not be displayed properly.

    NOTE:
    • You can use this parameter to import the source code of an application so that performance data can be viewed with the source code mapped to the assembly instructions.
    • You can enable Configure Node Parameters to configure this parameter separately for each node.
  3. Click OK.

    You can click the icons next to the task name to perform the following operations:

    • Cancel the analysis task. After an analysis task is canceled, the collected information is deleted.
    • Restart the analysis task. You can modify task parameter settings and restart an analysis task. This button is available when a task is canceled or fails.
    • Delete the analysis task. After a task is deleted, all data of this task is deleted. Exercise caution when performing this operation.
    • Perform the analysis again. The analysis task is renamed and restarted.
    • Change the task or report name. The report naming rule is the same as that of a task.
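
The value limits listed in Table 1 can be checked before the values are entered in the page. The following is a minimal sketch under stated assumptions: the function names are hypothetical and not part of the tool, "letters" is taken to mean ASCII letters, and the limits are copied from the table (task name of 1 to 64 characters, sampling duration of 1 to 300 s, sampling delay of 0 to 299 s and smaller than the duration, sampling interval of 1024 to 2^32-1).

```shell
#!/bin/sh
# Hypothetical helpers that mirror the limits in Table 1.
# Each function returns 0 (true) if the value is acceptable.

# Make A-Z and a-z mean ASCII letters only (assumption).
LC_ALL=C
export LC_ALL

# Task name: letters, digits, and underscores only, 1 to 64 characters.
valid_task_name() {
    case $1 in
        ''|*[!A-Za-z0-9_]*) return 1 ;;
    esac
    [ "${#1}" -le 64 ]
}

# in_range VALUE MIN MAX: VALUE is a non-negative integer in [MIN, MAX].
in_range() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# valid_sampling DURATION DELAY: duration 1..300 s, delay 0..299 s,
# and the delay must be smaller than the duration.
valid_sampling() {
    duration=$1
    delay=$2
    in_range "$duration" 1 300 || return 1
    in_range "$delay" 0 299 || return 1
    [ "$delay" -lt "$duration" ]
}

# Sampling interval: 1024 to 2^32-1 instructions.
valid_interval() {
    in_range "$1" 1024 4294967295
}

# Example: the default values from Table 1 with a sample task name.
valid_task_name "miss_event_01" && echo "task name accepted"
valid_sampling 5 1 && echo "sampling duration and delay accepted"
valid_interval 8192 && echo "sampling interval accepted"
```

The checks only restate the documented ranges; the tool itself remains the authority on which values it accepts.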