Rust Process Suspension

Fault Locating

A deadlock occurs when two or more threads each wait for another to act. Because each thread holds a locked resource that another thread needs, none of them can obtain the resources it requires, and all of them are suspended indefinitely. The process stays in the running state and never exits. Figure 1 shows how to locate and rectify the fault.

Figure 1 Fault locating of the Rust process suspension problem
  1. Run the top command. If the process has been running for a long time without printing new logs, and either a single CPU core is at 100% usage (a thread is spinning) or the usage of all CPUs is 0% (the threads are blocked), the process is probably suspended.
  2. Locate the threads that may be suspended based on the service logic and run the GDB attach command for debugging.
  3. Analyze the stack information and service logic to find out the cause.
  4. Modify and recompile the code. Then perform a verification.
  5. If the problem is resolved, integrate the modification into the code.
  6. If the problem persists, add more locating information (for example, log output) to the code, and then recompile and run the program again.
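To make the deadlock definition at the start of this section concrete, the following is a minimal sketch of the classic cause: two threads that take two locks in opposite order. The function name and the use of a barrier are assumptions made for illustration. The sketch uses try_lock so that it reports the conflict instead of hanging; with lock() in its place, both threads would wait forever and the process would sit at 0% CPU.

```rust
use std::sync::{Arc, Barrier, Mutex, TryLockError};
use std::thread;

/// Hypothetical demo: each worker locks `first`, then probes `second`.
/// Returns, for each worker, whether its second lock was unavailable.
fn opposite_order_blocks() -> (bool, bool) {
    let a = Arc::new(Mutex::new(()));
    let b = Arc::new(Mutex::new(()));
    let barrier = Arc::new(Barrier::new(2));

    let spawn_worker = |first: Arc<Mutex<()>>, second: Arc<Mutex<()>>, barrier: Arc<Barrier>| {
        thread::spawn(move || {
            let _held = first.lock().unwrap();
            // Wait until both workers hold their first lock.
            barrier.wait();
            // A blocking `second.lock()` here would never return.
            let blocked = matches!(second.try_lock(), Err(TryLockError::WouldBlock));
            // Keep the first lock until both workers have probed.
            barrier.wait();
            blocked
        })
    };

    // Worker 1 locks a then b; worker 2 locks b then a.
    let t1 = spawn_worker(Arc::clone(&a), Arc::clone(&b), Arc::clone(&barrier));
    let t2 = spawn_worker(Arc::clone(&b), Arc::clone(&a), Arc::clone(&barrier));
    (t1.join().unwrap(), t2.join().unwrap())
}
```

Calling opposite_order_blocks() shows that each worker finds its second lock held by the other. Acquiring locks in one consistent order in every thread removes this cyclic wait.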

Case: System Suspension Caused by Pipe Communication

Symptom

In Rust, threads generally communicate in one of two ways: pipe (channel) communication or shared data protected by a mutex. The ownership mechanism of Rust prevents most problems caused by data races, but it does not eliminate deadlocks. If a deadlock is caused by a mutex, the GDB call stack shows that the thread is in the __lll_lock_wait state, and the symptom and fault locating process are the same as those of the C/C++ process suspension. If a deadlock is caused by pipe communication, follow the fault locating procedure described next.
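As background for the pipe case, the following is a minimal sketch of how a receiver can hang on a channel, assuming that "pipe" refers to std::sync::mpsc channels. The function name and the worker are hypothetical. The sender stays alive but never sends, so a plain recv() would block indefinitely; the sketch uses recv_timeout so that the wait is observable and ends.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical demo: wait on a channel whose sender exists but stays silent.
fn wait_on_silent_channel(timeout: Duration) -> Result<u32, RecvTimeoutError> {
    let (tx, rx) = mpsc::channel::<u32>();

    // The worker owns the sender but never calls `send`.
    let worker = thread::spawn(move || {
        thread::sleep(timeout * 3);
        drop(tx);
    });

    // `rx.recv()` would block here for as long as the sender is alive.
    let result = rx.recv_timeout(timeout);
    worker.join().unwrap();
    result
}
```

With a timeout of, say, 50 ms, the call returns Err(RecvTimeoutError::Timeout). In a real hang the receiving thread is blocked inside recv(), which is consistent with the 0% CPU usage described in the procedure.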

Fault Locating

  1. The process has been running for a long time, and the top command output shows that its CPU usage is 0%. The process is suspected to be suspended.
  2. Run the cargo build command to compile the program. The generated executable file is stored in the ./target/debug/ directory. Run the executable file under GDB. For details, see GDB.
     gdb ./target/debug/FileCyber
     (gdb) r

  3. When the deadlock occurs, press Ctrl+C to interrupt the process in GDB, and then run the info threads command to view the threads.

  4. If a thread shows pthread_mutex_lock, check the threads associated with it. In this case, however, thread 1 is in pthread_join, that is, it is waiting for the other threads to finish. According to the code logic, the process is waiting for task t1 to complete. Select thread 4 and thread 2 for further analysis.

    Switch to each selected thread with the thread command and run the bt command to view its backtrace.

    The TID of thread 4 is 33928 and that of thread 2 is 33926.

  5. Analyze the frame.

    Check the code against the bt output. The deadlock occurs because t3 never receives the message. After the corresponding logic is checked and the code is modified, the deadlock no longer occurs.
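The source does not show the actual code change. As an illustration only, the following sketch shows one common pattern for keeping a channel receiver from waiting forever: drop every sender once it is no longer needed, so that the receiver observes a disconnect instead of blocking. The function name and the worker logic are hypothetical, and std::sync::mpsc is assumed.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical demo: collect one message per worker, then return
/// once every sender has been dropped.
fn drain_until_disconnected(worker_count: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::channel::<u32>();
    let mut handles = Vec::new();

    for id in 0..worker_count {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            tx.send(id).unwrap();
            // The cloned sender is dropped when the thread ends.
        }));
    }

    // Drop the original sender. If it stayed alive, the iterator
    // below would never finish and this thread would hang.
    drop(tx);

    // `rx.iter()` ends when all senders are gone.
    let mut received: Vec<u32> = rx.iter().collect();
    for handle in handles {
        handle.join().unwrap();
    }
    received.sort_unstable();
    received
}
```

For example, drain_until_disconnected(3) returns the three worker IDs and then exits. Where a sender legitimately stays alive, recv_timeout is an alternative that lets the receiver log and recover rather than block.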

    If the fix requires modifying third-party Rust code, use the [patch.crates-io] section in Cargo.toml so that the project depends on the locally modified copy of the third-party library.
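A minimal sketch of such a Cargo.toml entry follows. The crate name some-crate, its version, and the local path are hypothetical placeholders; replace them with the real dependency and the directory that holds the modified source.

```toml
# Cargo.toml of the application.
# `some-crate`, its version, and the path below are hypothetical.
[dependencies]
some-crate = "1.0"

# Use the locally modified copy instead of the crates.io release.
[patch.crates-io]
some-crate = { path = "../some-crate-patched" }
```

The version declared in the local copy must still satisfy the requirement in [dependencies]; otherwise Cargo does not apply the patch and warns that it is unused.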