Running and Verifying SGE
Procedure
- Use PuTTY to log in to the SGE master host as the root user.
- Run the following command to register the execution host armnode4 as a submit host (the -as option of qconf adds a submit host):
qconf -as armnode4
- Query the list of host groups. The default host group is @allhosts.
qconf -shgrpl
@allhosts
- Modify the host group information and add the execution host to the host group.
qconf -mhgrp @allhosts
Press i to enter the insert mode and enter the following information:
group_name @allhosts
hostlist armnode4
Press Esc, type :wq!, and press Enter to save the file and exit.
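If you prefer not to edit the host group in vi, the same change can probably be scripted. The following is a minimal sketch, assuming that your SGE version supports qconf -aattr for host groups; the function names add_host_to_group and host_in_group are hypothetical helpers, not part of SGE.

```shell
#!/bin/sh
# Append a host to the hostlist of a host group without opening an editor.
# Assumption: "qconf -aattr hostgroup hostlist <host> <group>" is available.
add_host_to_group() {
    host="$1"
    group="${2:-@allhosts}"
    qconf -aattr hostgroup hostlist "$host" "$group"
}

# Return 0 if the host name appears in the output of "qconf -shgrp <group>".
host_in_group() {
    host="$1"
    group="${2:-@allhosts}"
    qconf -shgrp "$group" | grep -qw -- "$host"
}

# Example (run as root on the SGE master host):
#   add_host_to_group armnode4 && host_in_group armnode4 && echo "armnode4 added"
```

The check uses a whole-word match, so it can also match a longer host name that contains armnode4 followed by a dot or hyphen.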
- Switch to a non-root user. The test user is used as an example.
su - test
- Load the SGE environment variables as the test user.
source /path/to/SGE_ROOT/default/common/settings.sh
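To confirm that the settings file was loaded, you can check that SGE_ROOT is set and that qsub is on the PATH. This is a minimal sketch; check_sge_env is a hypothetical helper name, and the messages are illustrative.

```shell
#!/bin/sh
# Verify that the SGE environment is available in the current shell.
check_sge_env() {
    if [ -z "${SGE_ROOT:-}" ]; then
        echo "SGE_ROOT is not set; source settings.sh first" >&2
        return 1
    fi
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found in PATH" >&2
        return 1
    fi
    echo "SGE environment loaded: SGE_ROOT=$SGE_ROOT"
}

# Example (as the test user, after sourcing settings.sh):
#   check_sge_env
```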
- Add the command that loads the SGE environment variables to the .bashrc file of the test user so that the variables take effect permanently.
echo "source /path/to/SGE_ROOT/default/common/settings.sh" >> /path/test/.bashrc
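Running the echo command more than once appends the same line to .bashrc each time. A minimal sketch of an idempotent variant follows; ensure_line is a hypothetical helper name, and the paths in the example are the placeholders used in this procedure.

```shell
#!/bin/sh
# Append a line to a file only if the exact line is not already present.
ensure_line() {
    line="$1"
    file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example:
#   ensure_line "source /path/to/SGE_ROOT/default/common/settings.sh" /path/test/.bashrc
```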
- Create an execution script run.sh.
Run the vi run.sh command.
Press i to enter the insert mode and add the following information:
#!/bin/bash
#$ -S /bin/bash
nodeinfo=`hostname`
echo "This is the SGE test from $nodeinfo" >> sge-test.log
Press Esc, type :wq!, and press Enter to save the file and exit.
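As an alternative to vi, the script can be written with a here-document. This is a minimal sketch; write_run_script is a hypothetical helper name. The quoted 'EOF' delimiter keeps the backticks and $nodeinfo from being expanded while the file is written.

```shell
#!/bin/sh
# Write the SGE test script to the given path (default: run.sh).
write_run_script() {
    target="${1:-run.sh}"
    cat > "$target" <<'EOF'
#!/bin/bash
#$ -S /bin/bash
nodeinfo=`hostname`
echo "This is the SGE test from $nodeinfo" >> sge-test.log
EOF
    chmod +x "$target"
}

# Example:
#   write_run_script run.sh
```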
- Submit a job.
qsub -V -cwd -o stdout.txt -e stderr.txt run.sh
Table 1 describes the command parameters.

Table 1 Common parameters in the qsub command

| Parameter | Description |
| --- | --- |
| -V | Exports the environment variables in the current shell to the job to be submitted. |
| -cwd | Runs the job in the current working directory. If this parameter is not specified, the job runs in the home directory of the current user on the compute node. |
| -o | Appends the standard output to the specified file. The default file name is $job_name.o$job_id. |
| -e | Appends the standard error output to the specified file. The default file name is $job_name.e$job_id. |
| -q | Specifies the queue to which the job is submitted. If this parameter is not specified, the system selects a queue that the user has permission to use and that has the lowest load. |
| -S | Specifies the shell that runs the commands in run.sh. The default value is tcsh. You are advised to use bash. Set this parameter to /bin/bash, or add "#$ -S /bin/bash" at the beginning of the run.sh file. If the parameter is not set to bash, the output contains "Warning: no access to tty (Bad file descriptor)." |
| -hold_jid | Specifies the jobs that must finish before the current job starts. Use commas (,) to separate multiple job IDs. |
| -N | Sets the job name. The default job name is the name of the script file submitted with qsub. |
| -p | Sets the job priority. The value ranges from -1023 to 1024. A larger value indicates a higher priority. Setting a positive value requires higher permission; common users cannot set a positive value. |
| -j y\|n | Specifies whether to merge the standard error output into the standard output, so that both are written to the file specified by -o. |
| -pe | Sets parallel environment (PE) information. |
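The parameters can be combined to chain jobs. The following is a minimal sketch that submits run.sh twice and makes the second job wait for the first by using -hold_jid. It assumes that your SGE version supports the -terse option, which prints only the job ID. The function name submit_chain, the job names step1 and step2, and the log file names are hypothetical.

```shell
#!/bin/sh
# Submit two jobs; the second starts only after the first has finished.
submit_chain() {
    script="${1:-run.sh}"
    # -terse makes qsub print only the job ID, so it can be captured directly.
    first_id=$(qsub -terse -V -cwd -N step1 -j y -o step1.log "$script") || return 1
    # -hold_jid keeps step2 in the hqw state until step1 completes.
    qsub -terse -V -cwd -N step2 -j y -o step2.log -hold_jid "$first_id" "$script"
}

# Example (as the test user, in the directory that contains run.sh):
#   submit_chain run.sh
```

Because -j y is set, each log file receives both the standard output and the standard error output of its job.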
- Manage jobs. Table 2 describes the commands and parameters for job management.

Table 2 Commands and parameters for job management

| Command | Description |
| --- | --- |
| qstat -f | Queries all jobs submitted by the current user on the current node. The possible job states are listed after this table. |
| qstat -j jobID | Queries information by job ID. |
| qstat -u user | Queries information by user. |
| qdel jobID | Deletes a job. |

The job status can be:
- qw: waiting state. A job is in qw state after being submitted. The job starts as soon as computing resources are available.
- hqw: pending state. A job in this state starts only after the jobs it depends on have finished. A job submitted by using qsub with -hold_jid specified is in this state.
- Eqw: waiting state with an error.
- r: running state.
- s: suspended state. The job is suspended temporarily because its resources are used by a job with a higher priority.
- dr: deleting state. The job was deleted after a node exited unexpectedly. The job disappears only after the node is restarted.
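To check a single job from a script, the state code can be read from the qstat output. This is a minimal sketch, assuming the default qstat column layout in which the job ID is column 1 and the state is column 5; job_state and describe_state are hypothetical helper names, and the descriptions paraphrase the states listed above.

```shell
#!/bin/sh
# Print the state code of a job, or nothing if the job is no longer listed.
# Usage: job_state <jobID> [user]
job_state() {
    qstat -u "${2:-$(id -un)}" | awk -v id="$1" '$1 == id { print $5 }'
}

# Translate a state code into a short description.
describe_state() {
    case "$1" in
        qw)  echo "waiting for resources" ;;
        hqw) echo "held until earlier jobs finish" ;;
        Eqw) echo "waiting in error state" ;;
        r)   echo "running" ;;
        s)   echo "suspended" ;;
        dr)  echo "deleted, pending node restart" ;;
        "")  echo "not in the queue (finished or deleted)" ;;
        *)   echo "other state: $1" ;;
    esac
}

# Example:
#   describe_state "$(job_state 101 test)"
```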