
Introduction

Slurm Overview

Slurm is an open-source, highly scalable cluster management and job scheduling system for Linux clusters of all sizes. It provides the following key features:

Resource allocation

Allocates exclusive or non-exclusive access to resources to users for a set period of time so that they can run jobs.

Job management framework

Provides a framework for starting, executing, and monitoring parallel jobs on the allocated resources.

Queues

Places submitted jobs in a queue when they request more resources than are currently available.

Flexible job scheduling policies

Provides advanced job scheduling policies, such as resource reservation, fair-share scheduling, and backfilling.

Other tools

Provides tools for tasks such as collecting job statistics and diagnosing job status.
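
The queueing and backfilling behavior listed above can be illustrated with a small simulation. This is only a minimal sketch and not Slurm's actual scheduler: the `Job` class and the `simulate` function are hypothetical names introduced for the example, and the backfill rule is deliberately simplified. Jobs start in submission order; when the job at the head of the queue does not fit, it is given a reserved start time, and a later job may start early only if it fits in the idle nodes and finishes by that reserved time.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int    # number of nodes requested
    runtime: int  # requested run time, in arbitrary time units


def simulate(total_nodes, jobs):
    """Return {job name: start time} for FIFO scheduling with simple backfilling.

    Hypothetical illustration only; this is not how Slurm is implemented.
    """
    for job in jobs:
        if job.nodes > total_nodes:
            raise ValueError(f"{job.name} requests more nodes than the cluster has")

    queue = list(jobs)  # pending jobs, in submission order
    running = []        # (end_time, nodes) for each running job
    starts = {}
    now = 0

    while queue:
        # Release the nodes of jobs that have finished by now.
        running = [(end, n) for end, n in running if end > now]
        free = total_nodes - sum(n for _, n in running)

        # Start jobs from the head of the queue while they fit.
        while queue and queue[0].nodes <= free:
            job = queue.pop(0)
            starts[job.name] = now
            running.append((now + job.runtime, job.nodes))
            free -= job.nodes
        if not queue:
            break

        # The head job must wait: reserve the earliest time at which
        # enough nodes will have been released for it.
        head = queue[0]
        available = free
        for end, n in sorted(running):
            available += n
            if available >= head.nodes:
                reserved_start = end
                break

        # Backfill: a later job may start now only if it fits in the idle
        # nodes and finishes by the head job's reserved start time.
        still_waiting = [head]
        for job in queue[1:]:
            if job.nodes <= free and now + job.runtime <= reserved_start:
                starts[job.name] = now
                running.append((now + job.runtime, job.nodes))
                free -= job.nodes
            else:
                still_waiting.append(job)
        queue = still_waiting

        # Advance the clock to the next job completion.
        now = min(end for end, _ in running)

    return starts
```

For example, on a 4-node cluster with jobs A (3 nodes, 10 units), B (4 nodes, 5 units), C (1 node, 5 units), and D (1 node, 20 units) submitted in that order, A starts at time 0, B has to wait until time 10, and C is backfilled at time 0 because it finishes before B's reserved start. D would run past that reservation, so it waits and starts at time 15.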

Recommended Software Version

Slurm 18.08.7

Software Architecture

Slurm uses slurmctld, a centralized management process, to monitor resources and jobs.

Each compute node runs a slurmd daemon, which waits for work, executes it, returns the status, and then waits for more work.

slurmdbd is an optional daemon. It records job accounting information for multiple Slurm-managed clusters in a single database.
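
The division of roles among these three components can be sketched as a toy model. The code below does not use any Slurm API; `Controller`, `NodeDaemon`, and the `accounting` list are hypothetical stand-ins for slurmctld, slurmd, and slurmdbd, and jobs are plain Python callables that run one at a time.

```python
from collections import deque


class NodeDaemon:
    """Hypothetical stand-in for the slurmd daemon on one compute node."""

    def __init__(self, hostname):
        self.hostname = hostname

    def run(self, job):
        # Execute the job (a plain callable) and return its result.
        return job()


class Controller:
    """Hypothetical stand-in for the central slurmctld process."""

    def __init__(self, nodes, accounting=None):
        self.idle = deque(nodes)      # nodes waiting for work
        self.pending = deque()        # submitted jobs not yet started
        self.accounting = accounting  # optional record store, like slurmdbd

    def submit(self, name, job):
        self.pending.append((name, job))

    def dispatch(self):
        """Hand each pending job to an idle node and collect the results."""
        results = {}
        while self.pending and self.idle:
            name, job = self.pending.popleft()
            node = self.idle.popleft()
            results[name] = node.run(job)
            self.idle.append(node)    # the node goes back to waiting
            if self.accounting is not None:
                self.accounting.append((name, node.hostname))
        return results


records = []
controller = Controller([NodeDaemon("node1"), NodeDaemon("node2")], accounting=records)
controller.submit("job1", lambda: 1 + 1)
controller.submit("job2", lambda: "done")
controller.submit("job3", lambda: 3 * 3)
results = controller.dispatch()
```

Passing `accounting=None` models a deployment without slurmdbd: jobs still run, but no per-job records are kept.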

For more details, visit:

https://slurm.schedmd.com/overview.html