Use scratch

A scratch filesystem is a temporary space where you can copy input data and write intermediate, temporary and output results from a job. Using a scratch is not mandatory, but it offers some advantages:

  • larger space compared to $HOME or group-shared datasets (/Xnfs/$GROUP),

  • higher IOPS [1].

However, scratches are shared spaces. Their performance depends on good use. They should NOT contain:

  • documentation (too many small files, and… why?) -> $HOME, /Xnfs/$GROUP

  • symbolic links (very tiny files, useless IO)

  • source code or software (small files, useless IO) -> $HOME, /Xnfs/$GROUP

  • conda, modules or libraries, virtualenv… -> $HOME, /Xnfs/$GROUP
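To check a scratch directory against the list above, a short audit script can help. The following is only a minimal sketch in Python: the function name `audit_scratch` and the 4096-byte threshold for a "small" file are assumptions made for illustration, not PSMN rules.

```python
import os
from pathlib import Path

# Hypothetical threshold: files below this size are reported as "small".
SMALL_FILE_BYTES = 4096


def audit_scratch(root, small_file_bytes=SMALL_FILE_BYTES):
    """Walk `root` and list symbolic links and small regular files."""
    symlinks, small_files = [], []
    # os.walk does not descend into symlinked directories by default,
    # but it still lists them, so they are reported below.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                symlinks.append(path)
            elif path.is_file() and path.stat().st_size < small_file_bytes:
                small_files.append(path)
    return symlinks, small_files
```

Calling `audit_scratch("/scratch/$CLUSTER/$USER/whatever")`, with the variables expanded to your own path, returns two lists of paths to review. The script only reports; it never deletes anything.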

Hint

Only input data, temporary data and output data.

Warning

Scratches are not meant for long-term storage. You should clean up as soon as your job finishes.

DO NOT STORE source code (including conda, virtualenv…), small files or symbolic links in $SCRATCH. It degrades performance very quickly, FOR EVERYONE.

When scratches are full, PSMN staff will erase files and directories blindly.

You have been warned.

Two types of scratches are available: global to a partition (or cluster), or local to a node. See Login nodes and Clusters/Partitions overview for access, distribution and paths.

Name                       Type       Snapshots  Quotas  Performance   Purpose
-------------------------  ---------  ---------  ------  ------------  -------
$SCRATCH                   glusterfs  no         no      high          large temporary files, checkpoints, raw temporary input/output
local scratch (ssd, disk)  ext4, zfs  no         no      medium, high  job specific output requiring IOPS [1] (120 days lifetime residency)

The PSMN network topology below shows how scratches connect to clusters:

Fig. 33 PSMN network synoptic (as of 2022)

Examples

Here are two ways of using scratch:

  • Manual copy

    From a login node, you can create, copy and delete files in /scratch/$CLUSTER/$USER/whatever/, before and after a job (or set of jobs).

  • Automated copy

    Inside your batch script, you can copy input data to scratch and tell your software which $TMPDIR or $SCRATCHDIR to use. Also clean up at the end of a successful job.

    This is mandatory for local scratch, on specific nodes. See Clusters/Partitions overview for paths.
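The automated workflow can be sketched as follows. This is an illustrative Python outline, not one of the PSMN example scripts: the function name `run_on_scratch` and the `input/` and `output/` sub-directory layout are hypothetical, and the scratch root is passed in by the caller (for instance a path of the form /scratch/$CLUSTER/$USER/). The software is pointed at scratch through the TMPDIR environment variable.

```python
import os
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_on_scratch(input_dir, output_dir, command, scratch_root):
    """Stage input to scratch, run `command` there, copy results back, clean up."""
    scratch_root = Path(scratch_root)
    scratch_root.mkdir(parents=True, exist_ok=True)
    # A unique per-job directory avoids collisions between concurrent jobs.
    workdir = Path(tempfile.mkdtemp(prefix="job_", dir=scratch_root))

    # 1. Copy input data to scratch (hypothetical layout: input/ and output/).
    shutil.copytree(input_dir, workdir / "input")
    (workdir / "output").mkdir()

    # 2. Run the software inside the scratch directory, with TMPDIR set to it.
    env = dict(os.environ, TMPDIR=str(workdir))
    # check=True raises CalledProcessError if the command fails, so the steps
    # below are skipped and the scratch directory is left for inspection.
    subprocess.run(command, cwd=workdir, env=env, check=True)

    # 3. Copy results back to permanent storage.
    shutil.copytree(workdir / "output", output_dir, dirs_exist_ok=True)

    # 4. Clean up scratch, only reached after a successful run.
    shutil.rmtree(workdir)
```

Because cleanup only happens after success, a failed job leaves its directory on scratch; remove it by hand once you have looked at it. `dirs_exist_ok` requires Python 3.8 or later.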

See our repository of example scripts.

If you do not feel comfortable with scripts and scratch, please ask us over a coffee.

Note

These machines were set up thanks to the preparatory work, recipes and integrations carried out on the CBP experimental platform.