FAQ: Embarrassingly parallel jobs using Python and R

Setup of the problem

Very often a user wants to run a certain task N times with different input values. Here we show a recipe for running such parameter studies with Python and R under Slurm, across multiple nodes and with the Job Array feature.
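As a minimal sketch of what such a parameter study looks like in plain Python (the function task and the input values are hypothetical placeholders), note that every evaluation is independent of all others:

parameter_study.py (hypothetical sketch)
# Hypothetical parameter study: evaluate one task at N input values.
# Every evaluation is independent, which makes the problem
# embarrassingly parallel.

def task(x):
    return x * x  # stand-in for the real computation

values = [0.1, 0.2, 0.3, 0.4]         # the N input values
for counter, x in enumerate(values):  # this loop is what we want to distribute
    print(f"worker {counter}: task({x}) = {task(x)}")

For the rest of this FAQ we strip the problem down to a hello-world script that only prints a worker number: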

my_script.py
counter=0
print(f"hello from worker number {counter}")
my_script.R
counter <- 0
print(paste("hello from worker number",counter))


When we run these scripts we get the following output:

bash output
$ module load python
$ python3 my_script.py
hello from worker number 0
bash output
$ module load r
$ Rscript my_script.R
[1] "hello from worker number 0"


From bash to sbatch

Now we want to run many copies of the program using Slurm. For that purpose we write a Slurm batch script:

my_script.slurm
#!/bin/bash
module load slurm_setup
module load python
python3 my_script.py
my_script.slurm
#!/bin/bash
module load slurm_setup
module load r
Rscript my_script.R


We can still run these scripts interactively on the command line:

bash output
$ bash my_script.slurm
hello from worker number 0
bash output
$ bash my_script.slurm
[1] "hello from worker number 0"


But we can also submit the scripts to Slurm to run them on the serial queue:

bash output
$ sbatch --clusters=serial --time=00:01:00 my_script.slurm
Submitted batch job 1739218 on cluster serial
$ ls
my_script.py my_script.slurm slurm-1739218.out
$ cat slurm-1739218.out
hello from worker number 0
bash output
$ sbatch --clusters=serial --time=00:01:00 my_script.slurm
Submitted batch job 1739218 on cluster serial
$ ls
my_script.R my_script.slurm slurm-1739218.out
$ cat slurm-1739218.out
[1] "hello from worker number 0"


In the output file we see the desired result. We can also put the job options directly into the script:

my_script.slurm
#!/bin/bash
#SBATCH --clusters=serial 
#SBATCH --time=00:01:00
module load slurm_setup
module load python
python3 my_script.py
my_script.slurm
#!/bin/bash
#SBATCH --clusters=serial 
#SBATCH --time=00:01:00
module load slurm_setup
module load r
Rscript my_script.R

We can then run the Slurm batch scripts using the sbatch command:

bash output
$ sbatch my_script.slurm
Submitted batch job 1739218 on cluster serial
$ cat slurm-1739218.out
hello from worker number 0
bash output
$ sbatch my_script.slurm
Submitted batch job 1739218 on cluster serial
$ cat slurm-1739218.out
[1] "hello from worker number 0"


Going parallel

Now we would like to run several workers in parallel, identified by the variable counter. With the following scripts we start the program 4 times (--ntasks=4) and run the copies in parallel (srun), but the counter fails to distinguish the workers:

my_script.slurm
#!/bin/bash
#SBATCH --clusters=serial 
#SBATCH --time=00:01:00
#SBATCH --ntasks=4
module load slurm_setup
module load python
srun python3 my_script.py
my_script.slurm
#!/bin/bash
#SBATCH --clusters=serial 
#SBATCH --time=00:01:00
#SBATCH --ntasks=4
module load slurm_setup
module load r
srun Rscript my_script.R


because all four copies print the same counter:

bash output
$ sbatch my_script.slurm
$ cat slurm-1739220.out
hello from worker number 0
hello from worker number 0
hello from worker number 0
hello from worker number 0
bash output
$ sbatch my_script.slurm
$ cat slurm-1739220.out
[1] "hello from worker number 0"
[1] "hello from worker number 0"
[1] "hello from worker number 0"
[1] "hello from worker number 0"


We need to know which worker is running. Slurm provides this information in the environment variable SLURM_PROCID, which is set individually for every task started by srun. We therefore change the scripts from above:

my_script.py
import os
counter=int(os.environ['SLURM_PROCID'])
print(f"hello from worker number {counter}")
my_script.R
counter <- as.numeric(Sys.getenv("SLURM_PROCID"))
print(paste("hello from worker number",counter))
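In a real parameter study this counter would then select the worker's input value. A minimal sketch, assuming a hypothetical parameter list with at least as many entries as there are tasks:

my_script.py (parameter sketch)
import os

counter = int(os.environ['SLURM_PROCID'])
parameters = [0.1, 0.2, 0.3, 0.4]  # hypothetical input values, one per task
x = parameters[counter]            # each worker picks its own value
print(f"worker {counter} computes with input {x}")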


When we run the Slurm script again, we obtain the following output:

bash output
$ sbatch my_script.slurm
$ cat slurm-1739220.out
hello from worker number 0
hello from worker number 2
hello from worker number 1
hello from worker number 3
bash output
$ sbatch my_script.slurm
$ cat slurm-1739220.out
[1] "hello from worker number 1"
[1] "hello from worker number 2"
[1] "hello from worker number 3"
[1] "hello from worker number 0"


We observe that the worker numbers are not in order: all workers run in parallel, and whichever finishes first writes its output first.
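If the ordering matters, a common workaround is to let every worker write to its own file instead of the shared job output. A minimal sketch (the file name pattern result_<counter>.txt is our own choice, not mandated by Slurm):

my_script.py (per-worker output sketch)
import os

counter = int(os.environ['SLURM_PROCID'])
# one file per worker, so output lines from different workers cannot interleave
with open(f"result_{counter}.txt", "w") as f:
    f.write(f"hello from worker number {counter}\n")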

With this recipe we can also run Slurm jobs on several nodes at once. All we have to do is change the cluster, the partition, and the number of tasks.

my_script.slurm
#!/bin/bash 
#SBATCH --time=00:01:00
#SBATCH --get-user-env
#SBATCH --clusters=cm2_tiny
#SBATCH --partition=cm2_tiny
#SBATCH --ntasks=56
module load slurm_setup
module load python
srun python3 my_script.py
my_script.slurm
#!/bin/bash 
#SBATCH --time=00:01:00
#SBATCH --get-user-env
#SBATCH --clusters=cm2_tiny
#SBATCH --partition=cm2_tiny
#SBATCH --ntasks=56
module load slurm_setup
module load r
srun Rscript my_script.R


We get the following output:

bash output
$ cat slurm-132433.out
hello from worker number 19
hello from worker number 8
hello from worker number 1

.....

hello from worker number 44
hello from worker number 33
hello from worker number 43
hello from worker number 24
bash output
$ cat slurm-132433.out
[1] "hello from worker number 19"
[1] "hello from worker number 8"
[1] "hello from worker number 1"

.....

[1] "hello from worker number 44"
[1] "hello from worker number 33"
[1] "hello from worker number 43"
[1] "hello from worker number 24"


With this recipe we can now run 56 workers in parallel.

Job Chaining with Job Arrays

Sometimes we want to do larger scans, say 560 workers (for example 56 parallel tasks times 10 array tasks). For this, Slurm offers a feature called Job Arrays, which submits the same job N times as independently scheduled array tasks. The workflow is the following:

Let's say we want to run a 3-task job 3 times using the Job Array option. We also have to change the source code again:

my_script.py
import os
procid=int(os.environ['SLURM_PROCID']) 
array_task_id=int(os.environ["SLURM_ARRAY_TASK_ID"])
ntasks=int(os.environ["SLURM_NTASKS"])
counter=array_task_id * ntasks + procid
print(f"hello from worker number {counter} with procid: {procid} and array_task_id: {array_task_id}")

my_script.R
procid <- as.numeric(Sys.getenv("SLURM_PROCID"))
array_task_id <- as.numeric(Sys.getenv("SLURM_ARRAY_TASK_ID"))
ntasks <- as.numeric(Sys.getenv("SLURM_NTASKS"))
counter <- array_task_id * ntasks + procid
print(paste("hello from worker number", counter, "with procid:", procid, "and array_task_id:", array_task_id))


The counter has a parallel component (procid, from the ntasks parallel workers) and a serial component (array_task_id, from the job array).
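As a quick check of the arithmetic, with --ntasks=3 and --array=0-2 the formula enumerates every counter from 0 to 8 exactly once:

counter_check.py (sketch)
# counter = array_task_id * ntasks + procid, here with ntasks = 3
ntasks = 3
for array_task_id in range(3):    # the array tasks 0, 1, 2
    for procid in range(ntasks):  # SLURM_PROCID within one array task
        print(array_task_id * ntasks + procid)  # prints 0, 1, 2, ..., 8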

The Slurm file has to be changed too:

my_script.slurm
#!/bin/bash
#SBATCH --clusters=serial 
#SBATCH --time=00:01:00
#SBATCH --ntasks=3
#SBATCH --array=0-2
module load slurm_setup
module load python
srun python3 my_script.py
my_script.slurm
#!/bin/bash
#SBATCH --clusters=serial 
#SBATCH --time=00:01:00
#SBATCH --ntasks=3
#SBATCH --array=0-2
module load slurm_setup
module load r
srun Rscript my_script.R


We get the output in 3 different output files, one per array task!

bash output
$ sbatch my_script.slurm
$ cat slurm-1739241*
hello from worker number 1 with procid: 1 and array_task_id: 0
hello from worker number 0 with procid: 0 and array_task_id: 0
hello from worker number 2 with procid: 2 and array_task_id: 0
hello from worker number 4 with procid: 1 and array_task_id: 1
hello from worker number 5 with procid: 2 and array_task_id: 1
hello from worker number 3 with procid: 0 and array_task_id: 1
hello from worker number 7 with procid: 1 and array_task_id: 2
hello from worker number 8 with procid: 2 and array_task_id: 2
hello from worker number 6 with procid: 0 and array_task_id: 2
bash output
$ sbatch my_script.slurm
$ cat slurm-1739241*
[1] "hello from worker number 1 with procid: 1 and array_task_id: 0"
[1] "hello from worker number 0 with procid: 0 and array_task_id: 0"
[1] "hello from worker number 2 with procid: 2 and array_task_id: 0"
[1] "hello from worker number 4 with procid: 1 and array_task_id: 1"
[1] "hello from worker number 5 with procid: 2 and array_task_id: 1"
[1] "hello from worker number 3 with procid: 0 and array_task_id: 1"
[1] "hello from worker number 7 with procid: 1 and array_task_id: 2"
[1] "hello from worker number 8 with procid: 2 and array_task_id: 2"
[1] "hello from worker number 6 with procid: 0 and array_task_id: 2"


This is the version for the serial queue; for the cm2_tiny cluster it is very similar:

my_script.slurm
#!/bin/bash 
#SBATCH --time=00:01:00
#SBATCH --get-user-env
#SBATCH --clusters=cm2_tiny
#SBATCH --partition=cm2_tiny
#SBATCH --ntasks=56
#SBATCH --array=0-3
module load slurm_setup
module load python
srun python3 my_script.py
my_script.slurm
#!/bin/bash 
#SBATCH --time=00:01:00
#SBATCH --get-user-env
#SBATCH --clusters=cm2_tiny
#SBATCH --partition=cm2_tiny
#SBATCH --ntasks=56
#SBATCH --array=0-3
module load slurm_setup
module load r
srun Rscript my_script.R


This returns the output in 4 files!

bash output
$ sbatch my_script.slurm
$ cat slurm-132452_*
hello from worker number 3 with procid: 3 and array_task_id: 0
hello from worker number 11 with procid: 11 and array_task_id: 0
hello from worker number 2 with procid: 2 and array_task_id: 0
hello from worker number 6 with procid: 6 and array_task_id: 0
hello from worker number 7 with procid: 7 and array_task_id: 0

...

hello from worker number 198 with procid: 30 and array_task_id: 3
hello from worker number 208 with procid: 40 and array_task_id: 3
hello from worker number 216 with procid: 48 and array_task_id: 3
hello from worker number 223 with procid: 55 and array_task_id: 3
bash output
$ sbatch my_script.slurm
$ cat slurm-132452_*
[1] "hello from worker number 3 with procid: 3 and array_task_id: 0"
[1] "hello from worker number 11 with procid: 11 and array_task_id: 0"
[1] "hello from worker number 2 with procid: 2 and array_task_id: 0"
[1] "hello from worker number 6 with procid: 6 and array_task_id: 0"
[1] "hello from worker number 7 with procid: 7 and array_task_id: 0"

...

[1] "hello from worker number 198 with procid: 30 and array_task_id: 3"
[1] "hello from worker number 208 with procid: 40 and array_task_id: 3"
[1] "hello from worker number 216 with procid: 48 and array_task_id: 3"
[1] "hello from worker number 223 with procid: 55 and array_task_id: 3"


Debugging

Sometimes something is broken and we want to debug the Python or R script interactively in parallel mode. We can do this using an interactive srun shell. For that purpose we define a new command:

bash output
$ alias irun="salloc -pcm2_inter -n 4 srun"

If you prepend this command to the python3 or Rscript command, you can interactively see what is going on:

bash output
$ irun python3 my_script.py
salloc: Granted job allocation 150821
hello from worker number 0
hello from worker number 0
hello from worker number 0
hello from worker number 0
bash output
$ irun Rscript my_script.R
salloc: Granted job allocation 150821
[1] "hello from worker number 0"
[1] "hello from worker number 0"
[1] "hello from worker number 0"
[1] "hello from worker number 0"


and you can see where the program gives an error or produces wrong results.
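When a multi-node run misbehaves, it can also help to let every worker report the node it was placed on. A minimal sketch using only the Python standard library (run it with the irun alias from above):

where_am_i.py (sketch)
import os
import socket

# every worker reports its task id and its node
procid = os.environ.get('SLURM_PROCID', 'not set')
print(f"worker {procid} runs on host {socket.gethostname()}")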