Compiling and executing a parallel MPI C program

Let’s compile and execute the program SommeVecVecPAR.c, which computes the sum of two vectors A and B and stores the result in C.

Setting up the compilation environment:

Assuming that we want to use the Cascade partition, we first connect to one of the login nodes for this partition, for example s92node01 (see Login nodes).

To make the software modules for this partition available, we execute:

module use /applis/PSMN/debian11/Cascade/modules/all

Now we can see available modules:

module avail

Load an OpenMPI module built with the GNU compilers (GCC):

module load OpenMPI/4.1.1-GCC-10.3.0

We can check the compiler version:

mpicc --version
gcc (GCC) 10.3.0
Copyright © 2020 Free Software Foundation, Inc.

Compilation

mpicc -o SommeVecVecPAR.exe SommeVecVecPAR.c

The binary file (executable) SommeVecVecPAR.exe has been generated.

Note

If you prefer the Intel compiler, load the intel/2021a module (instead of OpenMPI/4.1.1-GCC-10.3.0) and compile with the mpiicc command (instead of mpicc).

Interactive execution (on the login node):

mpirun -np 2 ./SommeVecVecPAR.exe

The result is displayed on the screen:

Les deux vecteurs :

A =            1           2           3           4           5           6           7           8          9          10

B =            9           8           7           6           5           4           3           2          1           0

Je suis le proc            1 parmi            2  processus
A local ( proc            1  )  =            6           7           8           9          10
B local ( proc            1  )  =            4           3           2           1           0
C local ( proc            1  )  =           10          10          10          10          10
LES DEUX VECTEURS LOCAUX :
Je suis le proc            0 parmi            2  processus
A local ( proc            0  )  =            1           2           3           4           5
B local ( proc            0  )  =            9           8           7           6           5
LE VECTEUR SOMME LOCAL :
C local ( proc            0  )  =           10          10          10          10          10
LE VECTEUR SOMME :
C =           10          10          10          10          10          10          10          10          10          10
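
The output shows each of the two processes working on one contiguous half of the vectors; the order in which the processes print their lines may vary from run to run. As a rough sketch of that block distribution (plain shell arithmetic with hypothetical variable names, not the program's actual code, and assuming the vector length is divisible by the number of processes), the range of elements handled by each process can be computed like this:

```shell
n=10        # vector length, as in the example
nprocs=2    # number of MPI processes (mpirun -np 2)
chunk=$(( n / nprocs ))   # elements per process

rank=0
while [ "$rank" -lt "$nprocs" ]; do
  first=$(( rank * chunk + 1 ))      # first element owned by this process
  last=$(( (rank + 1) * chunk ))     # last element owned by this process
  echo "proc $rank handles elements $first to $last"
  rank=$(( rank + 1 ))
done
```

With these values, process 0 handles elements 1 to 5 and process 1 handles elements 6 to 10, which matches the local vectors printed above.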

Batch execution (by submitting to the Cascade partition):

We use a submission script, script.sh (written for a parallel MPI program), to submit a job that runs the program on the compute nodes.

This submission script configures the environment and then runs the binary (with its options, if any) on the compute node.
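
The content of script.sh is not reproduced here. As a minimal sketch, assuming standard Slurm #SBATCH directives and the module and file names used in this example, such a script could be created as follows. The --ntasks and --time values and the output file name patterns are assumptions chosen to match the job in this example, and the sketch assumes that the module command is available in batch shells.

```shell
# Create a minimal Slurm submission script (a sketch; adapt the values to your job).
cat > script.sh <<'EOF'
#!/bin/bash
# Job name, partition, and resources (one node, two MPI processes: assumed values).
#SBATCH --job-name=SommeVecVecPAR
#SBATCH --partition=Cascade
#SBATCH --nodes=1
#SBATCH --ntasks=2
# Wall-time limit (assumed value).
#SBATCH --time=00:10:00
# Standard output and error files: <job name>.<job ID>.<node name>.out/.err
#SBATCH --output=%x.%j.%N.out
#SBATCH --error=%x.%j.%N.err

# Configure the environment: same modules as for the compilation.
module use /applis/PSMN/debian11/Cascade/modules/all
module load OpenMPI/4.1.1-GCC-10.3.0

# Run the binary with one MPI process per Slurm task.
mpirun -np "$SLURM_NTASKS" ./SommeVecVecPAR.exe
EOF
```

In the file name patterns, %x stands for the job name, %j for the job ID and %N for the node name, so a job numbered 649 running on s92node02 would write to SommeVecVecPAR.649.s92node02.out and SommeVecVecPAR.649.s92node02.err.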

sbatch script.sh
Submitted batch job 649

The job waits until the requested resources become available, and then it moves to the Running state:

squeue
    JOBID PARTITION     NAME     USER     ST       TIME     NODES  NODELIST(REASON)
    649   Cascade      SommeVec  mylogin  R        0:00      1     s92node02

As specified in the submission script, the standard output is redirected to the file SommeVecVecPAR.649.s92node02.out and the standard error is redirected to the file SommeVecVecPAR.649.s92node02.err.

-rw-r--r-- 1 mylogin cbp     0 25 janv. 09:21 SommeVecVecPAR.649.s92node02.err
-rw-r--r-- 1 mylogin cbp   419 25 janv. 09:21 SommeVecVecPAR.649.s92node02.out

When the job has finished, we can view the output file as follows:

cat SommeVecVecPAR.649.s92node02.out

Les deux vecteurs :
A =            1           2           3           4           5           6           7           8          9          10

B =            9           8           7           6           5           4           3           2          1           0

Je suis le proc            1 parmi            2  processus
A local ( proc            1  )  =            6           7           8           9          10
B local ( proc            1  )  =            4           3           2           1           0
C local ( proc            1  )  =           10          10          10          10          10
LES DEUX VECTEURS LOCAUX :
Je suis le proc            0 parmi            2  processus
A local ( proc            0  )  =            1           2           3           4           5
B local ( proc            0  )  =            9           8           7           6           5
LE VECTEUR SOMME LOCAL :
C local ( proc            0  )  =           10          10          10          10          10
LE VECTEUR SOMME :
C =           10          10          10          10          10          10          10          10          10          10