Version 1 (modified by Carl Baribault, 36 hours ago)

Setting up for sacct - WIP

After your job has completed - determining cumulative core efficiency

Assumptions

See Assumptions - same as for running jobs.

Preliminary: tools available

LONI clusters

LONI clusters provide the self-contained commands seff and qshow.

  • seff

On LONI QB4 cluster:

[loniID@qbd2 ~]$ seff -h
Usage: seff [Options] <Jobid>
       Options:
       -h    Help menu
       -v    Version
       -d    Debug mode: display raw Slurm data
[loniID@qbd2 ~]$ seff -v
seff Version 2.1

  • qshow (provided by LONI)

On LONI QB4 cluster:

[loniID@qbd2 ~]$ qshow -h
** usage: qshow -n <options> <base-name> <begin #> <end #> <command>
...
Show and optionally kill user processes on remote nodes or execute
commands...
[loniID@qbd2 ~]$ qshow -v
qshow 2.74

Cypress

On Cypress, we'll use the sacct command to analyze completed jobs. Cypress runs an older version of SLURM (v14.03.0) that does not adequately support the seff command.

Here are the relevant outputs that we'll need from sacct.

sacct output column    Description                    Format
TotalCPU               Total core hours used          [DD-[hh:]]mm:ss
CPUTimeRAW             Total core hours allocated     Seconds
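
Because TotalCPU is reported as a time string while CPUTimeRAW is reported in seconds, the two values need a common unit before they can be compared. The following is a minimal Python sketch of that conversion, assuming the [DD-[hh:]]mm:ss format listed in the table. The function name total_cpu_to_seconds is hypothetical, and the handling of fractional seconds is an assumption about what some sacct versions print.

```python
def total_cpu_to_seconds(total_cpu: str) -> float:
    """Convert a sacct TotalCPU string of the form [DD-[hh:]]mm:ss to seconds.

    Hypothetical helper for illustration; not part of Slurm.
    """
    rest = total_cpu.strip()
    days = 0
    # An optional leading "DD-" gives the number of whole days.
    if "-" in rest:
        day_part, rest = rest.split("-", 1)
        days = int(day_part)

    fields = rest.split(":")
    if len(fields) == 3:        # hh:mm:ss
        hours, minutes = int(fields[0]), int(fields[1])
    elif len(fields) == 2:      # mm:ss
        hours, minutes = 0, int(fields[0])
    else:
        raise ValueError(f"unrecognized TotalCPU value: {total_cpu!r}")

    # Assumption: the seconds field may carry a fractional part.
    seconds = float(fields[-1])
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


print(total_cpu_to_seconds("02:30"))       # mm:ss
print(total_cpu_to_seconds("01:02:03"))    # hh:mm:ss
print(total_cpu_to_seconds("1-00:00:00"))  # DD-hh:mm:ss
```

The input strings could come from a query such as sacct -j <Jobid> --format=JobID,TotalCPU,CPUTimeRAW; whether that query behaves identically on Cypress's SLURM version has not been verified here.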

Cumulative core efficiency: (total core hours used) / (total core hours allocated)
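
As a worked illustration of this ratio, here is a short Python sketch. It assumes both quantities have already been expressed in the same unit (seconds here, matching CPUTimeRAW). The function name core_efficiency and the job figures are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def core_efficiency(used_seconds: float, allocated_seconds: float) -> float:
    """Return cumulative core efficiency as a fraction (used / allocated).

    Hypothetical helper for illustration; both arguments must be in seconds.
    """
    if allocated_seconds <= 0:
        raise ValueError("allocated core time must be positive")
    return used_seconds / allocated_seconds


# Made-up example job: 20 cores allocated for 1 hour of wall time,
# so CPUTimeRAW would be 20 * 3600 = 72000 seconds.
allocated = 20 * 3600
# Suppose TotalCPU reads 15:00:00, i.e. 15 core hours = 54000 seconds.
used = 15 * 3600

print(f"Cumulative core efficiency: {core_efficiency(used, allocated):.1%}")
# prints "Cumulative core efficiency: 75.0%"
```

A value well below 100% suggests that some of the allocated cores sat idle for part of the job.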
