Computing resources
===================

Our clusters are grouped into *partitions*, by CPU generation, **available** :term:`RAM` **size** and InfiniBand network:

Big picture
-----------

+-----------+---------------+----------+----------+---------+------------------+----------------------------+
| Partition | CPU family    | nb cores | RAM (GB) | Network | main Scratch     | **Best use case**          |
+===========+===============+==========+==========+=========+==================+============================+
| E5        | E5            | 16       | 62, 124, | 56Gb/s  | /scratch/E5N     | training, sequential,      |
|           |               |          | 252      |         |                  | small parallel             |
+-----------+---------------+----------+----------+---------+------------------+----------------------------+
| E5-GPU    | E5            | 8        | 124      | 56Gb/s  | /scratch/Lake    | sequential, small          |
|           |               |          |          |         |                  | parallel, GPU computing    |
+-----------+---------------+----------+----------+---------+------------------+----------------------------+
| Lake      | Sky Lake      | 32       | 94, 124, | 56Gb/s  | /scratch/Lake    | medium parallel,           |
+           +---------------+          + 190, 380 +         +                  + sequential                 +
|           | Cascade Lake  |          |          |         |                  |                            |
+-----------+---------------+----------+----------+---------+------------------+----------------------------+
| Epyc      | AMD Epyc      | 128      | 510      | 100Gb/s | /scratch/Lake    | large parallel             |
+-----------+---------------+----------+----------+---------+------------------+----------------------------+
| Cascade   | Cascade Lake  | 96       | 380      | 100Gb/s | /scratch/Cascade | large parallel             |
+-----------+---------------+----------+----------+---------+------------------+----------------------------+

See :doc:`partitions_overview` for more hardware details.

**Available** :term:`RAM` **size** may vary a little (not all RAM is available for computing, GB vs GiB, etc.).

Available resources
-------------------

Use the ``sinfo`` [#sinfo]_ command to view the list of partitions (the default one is marked with a '*') and their state (also ``sinfo -l``, ``sinfo -lNe`` and ``sinfo --summarize``):

.. code-block:: bash

    $ sinfo
    PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
    E5*          up 8-00:00:00      4   idle c82gluster[1-4]
    Cascade      up 8-00:00:00     77   idle s92node[02-78]

Or the state of a particular partition:

.. code-block:: bash

    $ sinfo -p Epyc
    PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
    Epyc         up 8-00:00:00      1    mix c6525node002
    Epyc         up 8-00:00:00     12  alloc c6525node[001,003-006,008-014]
    Epyc         up 8-00:00:00      1   idle c6525node007

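You can also filter this view by node state, for instance to get an idea of how busy a partition is before submitting. The following commands are only an illustration (``-p``, ``--states``, ``-N`` and ``-l`` are standard ``sinfo`` options; adapt the partition name to your needs):

.. code-block:: bash

    # show only the idle nodes of the Epyc partition
    $ sinfo -p Epyc --states=idle

    # same selection, one line per node, with CPU and memory details
    $ sinfo -p Epyc --states=idle -N -l
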
To see more information (CPU count and organization, :term:`RAM` size [in MiB], state/availability), use one of these:

.. code-block:: bash

    $ sinfo --exact --format="%9P %.8z %.8X %.8Y %.8c %.7m %.5D %N"
    PARTITION    S:C:T  SOCKETS    CORES     CPUS  MEMORY NODES NODELIST
    E5*          2:8:1        2        8       16  128872     4 c82gpgpu[31-34]
    E5*          2:8:1        2        8       16   64328     3 c82gluster[2-4]
    E5-GPU       2:4:1        2        4        8  128829     1 r730gpu20
    Lake        2:16:1        2       16       32  385582     3 c6420node[172-174]
    Cascade     2:48:1        2       48       96  385606    77 s92node[02-78]

    $ sinfo --exact --format="%9P %.8c %.7m %.5D %.14F %N"
    PARTITION     CPUS  MEMORY NODES NODES(A/I/O/T) NODELIST
    E5*             16  128872     4        3/1/0/4 c82gpgpu[31-34]
    E5*             16   64328     3        3/0/0/3 c82gluster[2-4]
    E5-GPU           8  128829     1        0/1/0/1 r730gpu20
    Lake            32  385582     3        1/2/0/3 c6420node[172-174]
    Cascade         96  385606    77     47/26/4/77 s92node[02-78]

    $ sinfo --exact --format="%9P %.8c %.7m %.20C %.5D %25f" --partition E5,E5-GPU
    PARTITION     CPUS  MEMORY        CPUS(A/I/O/T) NODES AVAIL_FEATURES
    E5*             16  256000       248/120/16/384    24 local_scratch
    E5*             16  128828         354/30/0/384    24 (null)
    E5*             16  257852          384/0/0/384    24 (null)
    E5*             32  257843          384/0/0/384    12 (null)
    E5*             16   64328            48/0/0/48     3 (null)
    E5*             16  128872            64/0/0/64     4 (null)
    E5-GPU           8  127000         32/128/0/160    20 gpu

``A/I/O/T`` stands for ``Allocated/Idle/Other/Total``, in CPU terms.

.. code-block:: bash

    $ sinfo -lN | less
    NODELIST    NODES PARTITION  STATE CPUS  S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
    [...]
    c82gluster4     1       E5*   idle   16  2:8:1  64328        0      1   (null) none
    s92node02       1   Cascade   idle   96 2:48:1 385606        0      1   (null) none
    [...]

.. important::

   * HyperThreading [#ht]_ is activated on all Intel nodes, but the logical cores are not available as computing resources (*real cores vs logical cores*).
   * :term:`RAM` size is in MiB, and you cannot reserve more than 94% of it per node.

Basic defaults
--------------

* default partition: E5
* default time: 10 minutes
* default cpu(s): 1 core
* default memory size: 4 GiB / core

Features
--------

Some nodes have *features* [#features]_ (``gpu``, ``local_scratch``, etc.). To request a feature/constraint, you must add the following line to your submission script: ``#SBATCH --constraint=``. Example:

.. code-block:: bash

    #!/bin/bash
    #SBATCH --job-name=my_job_needs_local_scratch
    #SBATCH --time=02:00:00
    #SBATCH --ntasks=8
    #SBATCH --mem-per-cpu=4096M
    #SBATCH --constraint=local_scratch

Only nodes whose features match the job constraints will be used to satisfy the request.

Maximums
--------

Here are some maximums of usable resources **per job**:

* maximum wall-time: 8 days ('8-0:0:0' as 'days-hours:minutes:seconds')
* maximum nodes and/or maximum cores **per job**:

+-----------+-------+-------+-----+
| Partition | nodes | cores | gpu |
+===========+=======+=======+=====+
| E5        | 24    | 384   |     |
+-----------+-------+-------+-----+
| E5-GPU    | 19    | 152   | 18  |
+-----------+-------+-------+-----+
| Lake      | 24    | 768   |     |
+-----------+-------+-------+-----+
| Epyc      | 14    | 1792  |     |
+-----------+-------+-------+-----+
| Cascade   | 76    | 7296  |     |
+-----------+-------+-------+-----+

Anything more **must be requested using** `our contact forms `_.

.. [#sinfo] You can get the complete list of parameters by referring to the ``sinfo`` manual page (``man sinfo``).
.. [#ht] `See HyperThreading `_
.. [#features] See the ``sbatch`` manual page (``man sbatch``, ``-C``, ``--constraint=``).

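Putting these elements together, here is a minimal submission sketch for a large parallel job on the Cascade partition, staying within the per-job limits above. The job name and ``my_mpi_program`` are placeholders; adjust the requested sizes to your own needs:

.. code-block:: bash

    #!/bin/bash
    #SBATCH --job-name=cascade_example      # placeholder name
    #SBATCH --partition=Cascade             # large parallel partition
    #SBATCH --nodes=4                       # well below the 76-node per-job limit
    #SBATCH --ntasks-per-node=96            # one task per physical core
    #SBATCH --mem-per-cpu=3500M             # keeps each node below the ~94% RAM cap
    #SBATCH --time=1-00:00:00               # 1 day; the maximum is 8 days

    # my_mpi_program is a placeholder for your own executable
    srun ./my_mpi_program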