# Running Array Jobs
Job arrays allow you to run a group of identical or similar jobs from a single submission. The Slurm script is exactly the same for every sub-job; the only difference between sub-jobs is the environment variable `$SLURM_ARRAY_TASK_ID`. This makes arrays a good fit for data-level parallelization: let sub-job 1 (`SLURM_ARRAY_TASK_ID=1`) process data chunk 1, sub-job 2 process data chunk 2, and so on.
To do that, just add the following directive to your submission script, where `n` is the number of jobs in the array:

```bash
#SBATCH --array=1-n
```
## An example of a Slurm array job script
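A minimal sketch of such a script, assuming a hypothetical executable `./myprog` and input files named `input_1.dat` through `input_20.dat`; the resource requests are placeholders to adapt to your own workload:

```bash
#!/bin/bash
#SBATCH --job-name=array-example
#SBATCH --array=1-20               # 20 sub-jobs, SLURM_ARRAY_TASK_ID = 1..20
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --output=slurm-%A_%a.out   # %A = array job ID, %a = array task ID

# Each sub-job processes the data chunk matching its task ID:
# sub-job 1 reads input_1.dat, sub-job 2 reads input_2.dat, and so on.
./myprog input_${SLURM_ARRAY_TASK_ID}.dat
```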
Or you can specify an array job at submission time, without modifying your submission script:

```bash
sbatch --array=1-20 job.script
```
In Slurm, a job array is implemented as a group of single jobs. For example, if you submit an array job with `#SBATCH --array=1-4` and the starting job ID is 1000, the IDs of all four jobs are 1000, 1001, 1002, and 1003.
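Inside each sub-job you can see this numbering through the environment: `SLURM_ARRAY_JOB_ID` holds the shared array ID, while `SLURM_JOB_ID` is unique per sub-job. A one-line check you could drop into an array script:

```bash
# For the example above, this prints the shared array ID 1000, this
# task's index, and one of the individual job IDs 1000-1003.
echo "array ${SLURM_ARRAY_JOB_ID}, task ${SLURM_ARRAY_TASK_ID}, job ${SLURM_JOB_ID}"
```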
Note
There is a limit of 1000 jobs per array. Slurm also has a bug where it will not allow array IDs above this limit; this can be worked around with a prefix in the script (e.g. for tasks 1001-1020, use `--array=01-20` and reference the variable with a prefix: `10$SLURM_ARRAY_TASK_ID`).
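If the string prefix is awkward (single-digit task IDs lose their zero-padding in `$SLURM_ARRAY_TASK_ID`), an arithmetic offset achieves the same mapping. This is a sketch of that alternative, with the program and file names assumed for illustration:

```bash
#SBATCH --array=1-20

# Map task IDs 1-20 onto logical chunks 1001-1020 with a fixed offset;
# unlike string prefixing, this handles task IDs 1-9 correctly.
CHUNK_ID=$((1000 + SLURM_ARRAY_TASK_ID))
./myprog input_${CHUNK_ID}.dat
```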
A maximum number of simultaneously running tasks from the job array may be specified using a `%` separator. For example, `--array=0-15%4` will limit the number of simultaneously running tasks from this job array to 4.
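For example, to apply this throttle using the command-line form shown earlier:

```bash
sbatch --array=0-15%4 job.script   # 16 tasks, at most 4 running at once
```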