Introduction
Slurm Overview
Slurm is an open-source, highly scalable cluster management and job scheduling system for Linux clusters of all sizes. It provides the following key features:
Resource allocation
Allocates exclusive or non-exclusive access to resources to users for a specified period of time so that they can run jobs.
Job management framework
Provides a framework for starting, executing, and monitoring parallel jobs on the allocated resources.
Queues
Holds submitted jobs in a queue when they request more resources than are currently available.
Rich job scheduling policies
Provides advanced job scheduling policies, such as resource reservation, fair-share scheduling, and backfilling.
Other tools
Provides tools for tasks such as collecting job statistics and diagnosing job status.
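The queueing and backfilling features listed above can be illustrated with a small Python sketch. This is a toy model, not Slurm's actual scheduling algorithm: the names Job and schedule, the head_start_time parameter, and the simplified rule (a lower-priority job may start early only if it fits in the free nodes and finishes before the blocked job's reserved start time) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int    # number of nodes requested
    runtime: int  # requested run time, in arbitrary time units


def schedule(queue, free_nodes, head_start_time):
    """Pick the jobs that may start now (time 0) from a priority-ordered queue.

    Jobs start in order while they fit. The first job that does not fit is
    treated as blocked until head_start_time. Jobs behind it are backfilled
    only if they fit in the remaining free nodes and finish by
    head_start_time, so the blocked job is never delayed.
    """
    started = []
    blocked = False
    for job in queue:
        fits = job.nodes <= free_nodes
        if not blocked:
            if fits:
                started.append(job.name)
                free_nodes -= job.nodes
            else:
                blocked = True  # this job stays queued and keeps its reservation
        elif fits and job.runtime <= head_start_time:
            started.append(job.name)  # backfilled into the idle nodes
            free_nodes -= job.nodes
    return started


queue = [
    Job("big", nodes=8, runtime=10),   # needs more nodes than are free
    Job("long", nodes=2, runtime=20),  # fits, but would overrun the reservation
    Job("short", nodes=2, runtime=3),  # fits and finishes in time
]
started = schedule(queue, free_nodes=4, head_start_time=5)
print(started)
```

In this example only the job named "short" starts immediately: "big" waits in the queue for enough nodes, and "long" is skipped because running it now would push back the start of "big".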
Recommended Software Version
Slurm 18.08.7
Software Architecture

Slurm uses slurmctld, a centralized management process, to monitor resources and jobs.
Each compute node runs a slurmd daemon, which waits for a job, executes it, returns its status, and then waits for the next job.
slurmdbd is an optional daemon. It records job accounting information for multiple Slurm-managed clusters in a single database.
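As a rough way to see the three daemons from a login node, the following Python sketch runs one standard Slurm client command per daemon. The CHECKS mapping and the run_check helper are hypothetical names introduced here, and the sketch assumes the Slurm client commands (scontrol, sinfo, sacctmgr) are on the PATH. The sacctmgr check is only meaningful when slurmdbd is configured.

```python
import shutil
import subprocess

# Standard Slurm client commands, keyed by the daemon each one exercises.
CHECKS = {
    "slurmctld": ["scontrol", "ping"],           # asks the controller whether it is up
    "slurmd": ["sinfo", "-N"],                   # node-oriented listing of node states
    "slurmdbd": ["sacctmgr", "show", "cluster"], # lists clusters known to the database
}


def run_check(daemon):
    """Run the check for one daemon.

    Returns None if the command is not installed or times out,
    otherwise a (return code, stdout) tuple.
    """
    cmd = CHECKS[daemon]
    if shutil.which(cmd[0]) is None:
        return None
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except subprocess.TimeoutExpired:
        return None
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    for daemon in CHECKS:
        print(daemon, run_check(daemon))
```

A result of None for every daemon usually just means the Slurm client tools are not installed on the machine where the script runs.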
For more details, visit: