Debugging a Hybrid MPI/OpenMP Application

Open Multi-Processing (OpenMP) is a set of compiler directives, library routines, and environment variables for multithreaded programming on shared-memory parallel systems. It supports C, C++, and Fortran. OpenMP provides a high-level, thread-based abstraction of parallelism. Because it is simple, scalable, and portable, OpenMP is especially suitable for parallel programs on multi-core CPU hosts.

OpenMP is mainly used for fine-grained, loop-level parallelism, in which the iterations of a loop are divided among threads for execution. MPI is mainly used for coarse-grained parallelism across processes (ranks). You can debug hybrid MPI/OpenMP applications by rank or by thread.
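The two levels of parallelism can be illustrated with a minimal C sketch. It is not taken from the tool or from any particular application. The names block_range, rank_block, and local_sum are hypothetical, and the rank and size values are passed as plain integers rather than obtained from MPI_Comm_rank and MPI_Comm_size, so that the sketch compiles without an MPI installation.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open index range [begin, end) owned by one rank. */
typedef struct {
    size_t begin;
    size_t end;
} block_range;

/*
 * Coarse-grained level: split n items across `size` ranks as evenly as
 * possible. In a real MPI program, rank and size would come from
 * MPI_Comm_rank and MPI_Comm_size; here they are plain parameters
 * (an assumption of this sketch).
 */
block_range rank_block(size_t n, int rank, int size)
{
    assert(size > 0 && rank >= 0 && rank < size);
    size_t base = n / (size_t)size;
    size_t extra = n % (size_t)size;  /* the first `extra` ranks get one more item */
    size_t r = (size_t)rank;
    block_range b;
    b.begin = r * base + (r < extra ? r : extra);
    b.end = b.begin + base + (r < extra ? 1 : 0);
    return b;
}

/*
 * Fine-grained level: the iterations of this loop are divided among the
 * OpenMP threads of one rank. If the compiler is run without OpenMP
 * support, the pragma is ignored and the loop runs serially.
 */
double local_sum(const double *data, block_range b)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = (long)b.begin; i < (long)b.end; ++i) {
        sum += data[i];
    }
    return sum;
}
```

In a real hybrid application, each rank would call local_sum on its own block and the partial results would typically be combined with a collective operation such as MPI_Reduce. When debugging by rank, you inspect one such block owner at a time. When debugging by thread, you inspect the threads that share the loop inside local_sum.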

Prerequisites

You have logged in to the Kunpeng DevKit.
The program has been compiled.
Open MPI 4.1.4 has been installed. For details about how to install Open MPI, see Open MPI Installation Guide.
In the Resource Manager of VS Code, the folder of the local source program has been opened.

Debugging a Hybrid MPI/OpenMP Application Written in C/C++

MPI/OpenMP applications support C, C++, and Fortran. For details about Fortran operations, see Debugging an MPI Application Written in Fortran.

Click in the navigation pane on the left, or click Development and then click Debug under Compiler and Debugger to display the debugging page.

Select Parallel HPC application and set MPI/OpenMP application debugging parameters, as shown in Figure 1. After the configuration is complete, click Start.

Figure 1 Parallel HPC application debugging

**Table 1** Parallel HPC application debugging parameters
Parameter	Description
Configured Remote Server	Target server for debugging a parallel HPC application.
Linux User Name	Name of the Linux user who starts the MPI application. NOTE: The root user has the highest permissions. To avoid unnecessary risks to the system, we strongly recommend that you use a non-root account for debugging.
Linux User Password	Password of the Linux user.
Remember password	If this option is selected, the Linux user password of the current remote server will be remembered.
SSH Port	SSH port number of the server where the MPI application is started.
Program	MPI application. Associated application paths can be automatically displayed for selection. Grant the Linux user the read permission for the current MPI application and the read, write, and execute permissions for the directory where the application is located. NOTE: The MPI application must be an executable file. If there is no source code information in the MPI application, the debugger performs debugging in assembly mode by default.
(Optional) Program Arguments	Arguments passed to the application. If there are multiple arguments, separate them with spaces. Grant the Linux user the read, write, and execute permissions for the directory where the application is located and the execute permission for the parent directory.
Program Source Code Path	Shared path for storing the source code and MPI application. Associated working directory of source code can be automatically displayed for selection. If a shared path has been configured for the MPI application, the source code and MPI application must be stored in the shared path. Grant the Linux user the read and execute permissions for the source code directory of the current MPI application and the execute permission for the parent directory.
(Optional) Environment Variables	Environment variables required for running a parallel HPC application. Enter them in any of the following ways: (1) export PATH=$PATH:/path/to/mpi; (2) source /configure/mpi/path/file; (3) module load /mpi/modulefiles
Launch Type	The options are mpirun, Donau Scheduler, and Slurm Scheduler.
MPI Application Command	mpirun command and the corresponding command arguments. The number of ranks ranges from 1 to 2048.
Command to Run Donau Scheduler	Command to run Donau Scheduler and corresponding command arguments.
Command to Run Slurm Scheduler	srun command and corresponding command arguments.
OpenMP Application	If this parameter is selected, you need to enter the number of OpenMP threads.
OpenMP Threads	Number of OpenMP threads, which ranges from 1 to 1024.
(Optional) Deadlock Detection	If this parameter is selected, you need to specify the lock wait timeout.
(Optional) Lock Wait Timeout (s)	Amount of time a transaction waits to obtain a lock. The default value is 10. The value ranges from 10 to 60.
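
The numeric limits in Table 1 can be summarized as a small validation routine. The following C sketch is only an illustration of those ranges. The debug_settings structure, its field names, and settings_valid are hypothetical and are not part of the tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Limits taken from Table 1. */
enum {
    MIN_RANKS = 1,           MAX_RANKS = 2048,
    MIN_OMP_THREADS = 1,     MAX_OMP_THREADS = 1024,
    MIN_LOCK_TIMEOUT_S = 10, MAX_LOCK_TIMEOUT_S = 60,
    DEFAULT_LOCK_TIMEOUT_S = 10
};

/* Hypothetical settings record; the field names are illustrative only. */
typedef struct {
    int ranks;
    bool openmp_enabled;
    int omp_threads;         /* checked only when openmp_enabled is true */
    bool deadlock_detection;
    int lock_timeout_s;      /* checked only when deadlock_detection is true */
} debug_settings;

static bool in_range(int value, int lo, int hi)
{
    return value >= lo && value <= hi;
}

/* Returns true if every enabled setting lies within the documented range. */
bool settings_valid(const debug_settings *s)
{
    assert(s != NULL);
    if (!in_range(s->ranks, MIN_RANKS, MAX_RANKS)) {
        return false;
    }
    if (s->openmp_enabled &&
        !in_range(s->omp_threads, MIN_OMP_THREADS, MAX_OMP_THREADS)) {
        return false;
    }
    if (s->deadlock_detection &&
        !in_range(s->lock_timeout_s, MIN_LOCK_TIMEOUT_S, MAX_LOCK_TIMEOUT_S)) {
        return false;
    }
    return true;
}
```

For example, a configuration with 4 ranks, 8 OpenMP threads, and deadlock detection enabled with the default 10-second timeout passes this check, whereas a 5-second timeout does not.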

A message is displayed in the lower right corner, indicating that parallel HPC application debugging is starting. The tool checks whether the configuration is correct. If the configuration is incorrect, modify the configuration as prompted. If the configuration is correct, a dialog box is displayed in the lower right corner, indicating that the rank status is being read, as shown in Figure 2.
Figure 2 Reading the rank status

If parallel HPC application debugging fails to be started, rectify the fault by following instructions in Failed to Start a Parallel HPC Application Debugging Task.
If the rank status fails to be read, download the latest log file as prompted to view the failure details. See Figure 3.
Figure 3 Failed to read the rank status
If the rank status is successfully read, the hybrid MPI/OpenMP application debugging page automatically appears. The RUN AND DEBUG window, source code window, and debugging bar are displayed. The RUN AND DEBUG window consists of the debugging information and RANK INFO areas, as shown in Figure 4.
Figure 4 Rank status read successfully

Click buttons on the debugging bar to perform debugging. See Table 2. To move the debugging bar, drag the handle at its left edge to another place.

**Table 2** Description of debugging buttons
Icon	Button	Description
	Resume	Runs the code until the next breakpoint.
	Suspend	Suspends the program that is being executed.
	Skip a single step	Executes the next line of code.
	Step in	Steps into the function.
	Step out	Steps out of the function.
	Restart	Restarts debugging.
	Stop	Stops debugging.

Thread status: The dot before a thread indicates the thread status. A green dot indicates that the thread is stopped, and a red dot indicates that the thread is running.

The line of code being debugged is highlighted. You can click the code line number to set a breakpoint. You can right-click the breakpoint to edit, delete, or disable it.
You can add conditional breakpoints (expressions and hit counts). Conditional breakpoints can be modified, enabled, disabled, and deleted. An expression breakpoint indicates that the program is stopped when the expression is true. A hit count breakpoint indicates that the program is stopped when the specified number of hits is reached or exceeded.

Figure 5 Setting a breakpoint
- An expression can contain a maximum of 1024 characters.
- A hit count is a positive integer less than or equal to 2147483647 (2³¹-1).
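
The difference between the two kinds of conditional breakpoint can be modeled in a few lines of C. This is a sketch of the semantics described above, not the debugger's implementation. The hit_count_bp structure, the function names, and the sample expression i % 4 == 0 are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_HIT_COUNT 2147483647L  /* upper limit of a hit count (2^31 - 1) */

/* Hypothetical model of a hit count breakpoint. */
typedef struct {
    long threshold;  /* 1..MAX_HIT_COUNT */
    long hits;       /* number of times the line has been reached */
} hit_count_bp;

/* Called each time the line is reached; true means the program stops. */
bool hit_count_bp_reached(hit_count_bp *bp)
{
    assert(bp->threshold >= 1 && bp->threshold <= MAX_HIT_COUNT);
    if (bp->hits < MAX_HIT_COUNT) {
        bp->hits++;
    }
    return bp->hits >= bp->threshold;  /* reached or exceeded */
}

/*
 * Expression breakpoint on a loop body: counts how often the program would
 * stop over iterations 0..n-1 when the expression is "i % 4 == 0".
 */
int count_expression_stops(int n)
{
    int stops = 0;
    for (int i = 0; i < n; ++i) {
        bool expression = (i % 4 == 0);
        if (expression) {
            ++stops;
        }
    }
    return stops;
}

/*
 * Hit count breakpoint on the same loop body: returns the iteration index
 * of the first stop, or -1 if the threshold is never reached.
 */
int first_hit_count_stop(int iterations, long threshold)
{
    hit_count_bp bp = { threshold, 0 };
    for (int i = 0; i < iterations; ++i) {
        if (hit_count_bp_reached(&bp)) {
            return i;
        }
    }
    return -1;
}
```

In this model, an expression breakpoint stops on every pass where the expression holds, whereas a hit count breakpoint ignores the first passes and stops only from the pass at which the threshold is reached.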
Click on the debugging bar to restart hybrid MPI/OpenMP application debugging. After the restart, a dialog box is displayed in the lower right corner, indicating that the rank status is being read. After the rank status is read successfully, the hybrid MPI/OpenMP application debugging page is displayed.
Figure 6 Restarting a debugging task
Click the RUN AND DEBUG window on the left to view the variables (Locals and Registers), WATCH, BREAKPOINTS, and CALL STACK information.
- During debugging, you can right-click a variable expression to reset the variable value or add the variable expression to the WATCH area. Register expressions cannot be added to the WATCH area.
- In the WATCH area, you can add, modify, or delete some or all watched expressions. Only C or C++ expressions support this function.
- In the breakpoint area, you can delete a single breakpoint, or delete, enable, or disable all breakpoints.
See Figure 7.
Figure 7 Debugging information
- You can click the CALL STACK area to display the stack information, including the function name, file name, line number being executed, and address.
- In the debugging information area on the left, click a stack to display the corresponding source code or assembly code in the code area.
Click in the RANK INFO area. The COMMUNICATION SUBGROUP CHANGE page is displayed on the VS Code panel.
- Click the Change Overview tab. The communication subgroup change data is collected every 100 ms. The changes of communication subgroups are distinguished by diamonds in different colors. Blue indicates that a communication subgroup is created, purple indicates that a communication subgroup is cleared, and yellow indicates that there are communication subgroups created and cleared within 100 ms.
  Figure 8 Communication subgroup change overview
  
  Hover the mouse pointer over the diamond to see the detailed information about the communication subgroup change.
  
  Figure 9 Pop-up for a communication subgroup
- Click Change Details. On the page that is displayed, move the mouse pointer to view detailed information such as the belonging communication subgroup and rank information.
  Figure 10 Communication subgroup change details
  - If a deadlock is detected during debugging, a message will be displayed. You can click to see the deadlock details.
  - In the RANK INFO area, click to enable the function of collecting data about creating and clearing communication subgroups and of displaying the change overview on the VS Code panel.
  - In the Communication Subgroup Change area, you can click Communication subgroup created, Communication subgroups cleared, or Communication subgroups created and cleared to hide corresponding information.
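
The color coding on the Change Overview tab follows a simple rule per 100 ms sampling window, which the following C sketch models. It is an illustration only. The change_kind enumeration and the function names are hypothetical, and the CHANGE_NONE case (no diamond when nothing changes in a window) is an assumption.

```c
#include <assert.h>

#define SAMPLE_INTERVAL_MS 100L  /* collection interval stated above */

typedef enum {
    CHANGE_NONE,                /* assumption: no diamond is drawn */
    CHANGE_CREATED,             /* blue diamond */
    CHANGE_CLEARED,             /* purple diamond */
    CHANGE_CREATED_AND_CLEARED  /* yellow diamond */
} change_kind;

/* Maps a timestamp in milliseconds to the index of its sampling window. */
long window_index(long timestamp_ms)
{
    assert(timestamp_ms >= 0);
    return timestamp_ms / SAMPLE_INTERVAL_MS;
}

/* Classifies one window from the number of subgroups created and cleared in it. */
change_kind classify_window(int created, int cleared)
{
    assert(created >= 0 && cleared >= 0);
    if (created > 0 && cleared > 0) {
        return CHANGE_CREATED_AND_CLEARED;
    }
    if (created > 0) {
        return CHANGE_CREATED;
    }
    if (cleared > 0) {
        return CHANGE_CLEARED;
    }
    return CHANGE_NONE;
}

/* Color name used on the overview for each kind of change. */
const char *marker_color(change_kind kind)
{
    switch (kind) {
    case CHANGE_CREATED:             return "blue";
    case CHANGE_CLEARED:             return "purple";
    case CHANGE_CREATED_AND_CLEARED: return "yellow";
    default:                         return "none";
    }
}
```

For example, a subgroup created at 120 ms and another cleared at 180 ms fall into the same window (index 1), so that window would be marked yellow under this model.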
