Changes between Version 69 and Version 70 of cypress/using


Timestamp: 12/12/22 11:20:38 (17 months ago)
Author: fuji
Comment:

  • cypress/using

    v69 v70  
    33= Submitting Jobs on Cypress =
    44
    5 In this section we will examine how to submit jobs on Cypress using the SLURM resource manager. We’ll begin with the basics and proceed to examples of jobs which employ MPI, OpenMP, and hybrid parallelization schemes.
     5In this section, we will examine how to submit jobs on Cypress using the SLURM resource manager. We’ll begin with the basics and proceed to examples of jobs that employ MPI, OpenMP, and hybrid parallelization schemes.
    66
    77
    88== Quick Start for PBS users ==
    99
    10 Cypress uses SLURM to schedule jobs and manage resources resources. Full documentation and tutorials for SLURM can be found on the SLURM website at:
     10Cypress uses SLURM to schedule jobs and manage resources. Full documentation and tutorials for SLURM can be found on the SLURM website at:
    1111
    1212http://slurm.schedmd.com/documentation.html
     
    1616http://slurm.schedmd.com/rosetta.html
    1717
    18 Lastly, resource limits on Cypress divided into separate Quality Of Services (QOSs). These are analogous to the queues on Sphynx. You may choose a QOS by using the appropriate script directive in your submission script, e.g.
     18Lastly, resource limits on Cypress are divided into separate Quality of Service (QOS) levels. These are analogous to the queues on Sphynx. You may choose a QOS by using the appropriate script directive in your submission script, e.g.
    1919
    2020{{{#!bash
     
    2828=== Introduction to Managed Cluster Computing  ===
    2929
    30 For those who are new to cluster computing and resource management, let's begin with an explanation of what a resource manager is and why it is necessary. Suppose you have a piece of C code that you would like to compile and execute, for example a helloworld program.
     30For those who are new to cluster computing and resource management, let's begin with an explanation of what a resource manager is and why it is necessary. Suppose you have a piece of C code that you would like to compile and execute, for example, a HelloWorld program.
    3131
    3232{{{#!c
     
    3939}}}
    4040
    41 On your desktop you would open a terminal, compile the code using your favorite c compiler and execute the code. You can do this without worry as you are the only person using your computer and you know what demands are being made on your CPU and memory at the time you run your code. On a cluster, many users must share the available resources equitably and simultaneously. It's the job of the resource manager to choreograph this sharing of resources by accepting a description of your program and the resources it requires, searching the available hardware for resources that meet your requirements, and making sure that no one else is given those resources while you are using them.
    42 
    43 Occasionally the manager will be unable to find the resources you need due to usage by other user. In those instances your job will be "queued", that is the manager will wait until the needed resources become available before running your job. This will also occur if the total resources you request for all your jobs exceed the limits set by the cluster administrator. This ensures that all users have equal access to the cluster.
    44 
    45 The take home point here is this: in a cluster environment a user submits jobs to a resource manager, which in turn runs an executable(s) for the user. So how do you submit a job request to the resource manager? Job requests take the form of scripts, called job scripts. These scripts contain script directives, which tell the resource manager what resources the executable requires. The user then submits the job script to the scheduler.
    46 
    47 The syntax of these script directives is manager specific. For the SLURM resource manager, all script directives begin with "#SBATCH". Let's look at a basic SLURM script requesting one node and one core on which to run our helloworld program.
     41On your desktop, you would open a terminal, compile the code using your favorite C compiler, and execute it. You can do this without worry, as you are the only person using your computer and you know what demands are being made on your CPU and memory at the time you run your code. On a cluster, many users must share the available resources equitably and simultaneously. It's the job of the resource manager to choreograph this sharing of resources by accepting a description of your program and the resources it requires, searching the available hardware for resources that meet your requirements, and making sure that no one else is given those resources while you are using them.
     42
     43Occasionally the manager will be unable to find the resources you need due to usage by other users. In those instances your job will be "queued", that is the manager will wait until the needed resources become available before running your job. This will also occur if the total resources you request for all your jobs exceed the limits set by the cluster administrator. This ensures that all users have equal access to the cluster.
     44
     45The take-home point here is this: in a cluster environment, a user submits jobs to a resource manager, which in turn runs an executable(s) for the user. So how do you submit a job request to the resource manager? Job requests take the form of scripts, called job scripts. These scripts contain script directives, which tell the resource manager what resources the executable requires. The user then submits the job script to the scheduler.
     46
     47The syntax of these script directives is manager specific. For the SLURM resource manager, all script directives begin with "#SBATCH". Let's look at a basic SLURM script requesting one node and one core on which to run our HelloWorld program.
    4848
    4949{{{#!bash
     
    6060}}}
    6161
    62 Notice that the SLURM script begins with #!/bin/bash. This tells the Linux shell what flavor shell interpreter to run. In this example we use BASh (Bourne Again Shell). The choice of interpreter (and subsequent syntax) is up to the user, but every SLURM script should begin this way. This is followed by a collection of #SBATCH script directives telling the manager about the resources needed by our code and where to put the codes output. Lastly, we have the executable we wish the manager to run (note: this script assumes it is located in the same directory as the executable).
     62Notice that the SLURM script begins with #!/bin/bash. This tells Linux which shell interpreter to run; in this example, we use bash (the Bourne Again Shell). The choice of interpreter (and subsequent syntax) is up to the user, but every SLURM script should begin this way. This is followed by a collection of #SBATCH script directives telling the manager about the resources needed by our code and where to put the code's output. Lastly, we have the executable we wish the manager to run (note: this script assumes it is located in the same directory as the executable).
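(For reference, a minimal single-core script along the lines just described might look like the sketch below; the job name, time limit, and output/error file names are illustrative placeholders rather than the exact values used in the script above.)

{{{#!bash
#!/bin/bash
#SBATCH --job-name=helloworld     # illustrative job name
#SBATCH --qos=normal              # QOS to run under (normal is the default)
#SBATCH --nodes=1                 # request one node
#SBATCH --ntasks-per-node=1       # and a single core on it
#SBATCH --time=00:05:00           # illustrative wall-clock limit (hh:mm:ss)
#SBATCH --output=helloworld.out   # file for standard output
#SBATCH --error=helloworld.err    # file for standard error

./helloworld                      # the executable, assumed to be in the submission directory
}}}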
    6363
    6464'''For Workshop''':
     
    9292}}}
    9393
    94 Our job was successfully submitted and was assigned the job number 6041. We can check the output of our job by examining the contents of our output and error files. Referring back to the helloworld.srun SLURM script, notice the lines
     94Our job was successfully submitted and was assigned job number 6041. We can check the output of our job by examining the contents of our output and error files. Referring back to the helloworld.srun SLURM script, notice the lines
    9595
    9696{{{#!bash
     
    122122}}}
    123123
    124 Notice here that we've omitted some of the script directives included in our previous hello world submission script. We will still run on the normal QOS as that's the default on Cypress. However, when no output directives are given SLURM will redirect the output of our executable (including any error messages) to a file labeled with our jobs ID number. This number is assigned upon submission. Let's suppose that the above is stored in a file named oneHourJob.srun and we submit our job using the '''sbatch''' command. Then we can check on the progress of our job using squeue and we can cancel the job by executing scancel on the assigned job ID.
     124Notice here that we've omitted some of the script directives included in our previous hello world submission script. We will still run on the normal QOS, as that's the default on Cypress. However, when no output directives are given, SLURM will redirect the output of our executable (including any error messages) to a file labeled with our job's ID number. This number is assigned upon submission. Let's suppose that the above is stored in a file named '''oneHourJob.srun''' and we submit our job using the '''sbatch''' command. Then we can check on the progress of our job using '''squeue''' and we can cancel the job by executing '''scancel''' on the assigned job ID.
    125125
    126126[[Image(squeue_scancel2.png, 50%, center)]]
    127127
    128 Notice that when we run the squeue command, our job status is marked R for running and has been running for 7 seconds. The squeue command also tells us what node our job is being run on, in this case node 123. When running squeue in a research environment you will usually see a long list of users running multiple jobs. To single out your own job you can use the "-u" option flag to specify your user name.
     128Notice that when we run the '''squeue''' command, our job status is marked R for running and has been running for 7 seconds. The '''squeue''' command also tells us what node our job is being run on, in this case node 123. When running '''squeue''' in a research environment you will usually see a long list of users running multiple jobs. To single out your own job you can use the "-u" option flag to specify your user name.
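For example, the full submit/monitor/cancel cycle might look like the following sketch (the username and job ID are placeholders):

{{{#!bash
sbatch oneHourJob.srun      # submit the script; SLURM replies with the assigned job ID
squeue -u <your_username>   # list only your own jobs (replace with your actual username)
scancel <jobID>             # cancel a job, using the ID reported by sbatch/squeue
}}}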
    129129
    130130Congratulations, you are ready to begin running jobs on Cypress!
     
    214214* Age:  Jobs that have been waiting in the queue longer get higher priority.
    215215
    216 * Job Size:  Larger jobs (i.e. jobs with more CPUs/nodes requested) have higher priority to favor jobs that take advantage of parallel processesing (e.g. MPI jobs).
     216* Job Size:  Larger jobs (i.e. jobs with more CPUs/nodes requested) have higher priority to favor jobs that take advantage of parallel processing (e.g. MPI jobs).
    217217
    218218SLURM calculates each priority component as a fraction (value between 0 and 1), which is then multiplied by a weight.  The current weights are:  Fair-share:  100,000; Age:  10,000;  Job Size:  1,000.  That is, Fair-share is the major contributor to priority.  The weighted components are added to give the final priority.
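For illustration, a job with (hypothetical) fair-share, age, and job-size factors of 0.5, 0.3, and 0.1 would receive:

{{{
Priority = 100,000 * 0.5 + 10,000 * 0.3 + 1,000 * 0.1
         = 50,000 + 3,000 + 100
         = 53,100
}}}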
     
    249249}}}
    250250
    251 Again, notice that we did not need to feed any of the usual information to mpirun regarding the number of processes, hostfiles, etc. as this is handled automatically by SLURM. Another thing to note is the loading the intel-psxe (parallel studio) module. This loads the Intel instantiation of MPI including mpirun. If you would like to use OpenMPI then you should load the openmpi/gcc/64/1.8.2-mlnx-ofed2 module or one of the other OpenMPI versions currently available on Cypress. We also take advantage of a couple of SLURMS output environment variables to automate our record keeping.  Now, a record of what nodes we ran on, our job ID, and the number of tasks used will be written to the MPIoutput.out file.  While this is certainly not necessary, it often pays dividends when errors arise.
     251Again, notice that we did not need to feed any of the usual information to mpirun regarding the number of processes, hostfiles, etc., as this is handled automatically by SLURM. Another thing to note is the loading of the intel-psxe (Parallel Studio) module. This loads the Intel instantiation of MPI, including mpirun. If you would like to use OpenMPI, then you should load the openmpi/gcc/64/1.8.2-mlnx-ofed2 module or one of the other OpenMPI versions currently available on Cypress. We also take advantage of a couple of SLURM's output environment variables to automate our record-keeping.  Now, a record of what nodes we ran on, our job ID, and the number of tasks used will be written to the MPIoutput.out file.  While this is certainly not necessary, it often pays dividends when errors arise.
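As a rough sketch of this pattern (the node and task counts, time limit, and executable name below are illustrative placeholders; the module names are the ones mentioned above):

{{{#!bash
#!/bin/bash
#SBATCH --job-name=MPI_example
#SBATCH --qos=normal
#SBATCH --nodes=2                  # illustrative node count
#SBATCH --ntasks-per-node=20       # illustrative tasks per node
#SBATCH --time=01:00:00            # illustrative wall-clock limit
#SBATCH --output=MPIoutput.out
#SBATCH --error=MPIerror.err

module load intel-psxe             # Intel MPI; alternatively: module load openmpi/gcc/64/1.8.2-mlnx-ofed2

# Record-keeping via SLURM output environment variables; this output lands in MPIoutput.out
echo "Job ID:     $SLURM_JOB_ID"
echo "Nodes used: $SLURM_JOB_NODELIST"
echo "Tasks used: $SLURM_NTASKS"

mpirun ./myMPIprogram              # hypothetical executable; SLURM supplies process counts and hosts
}}}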
    252252
    253253
     
    363363== Running Many !Serial/Parallel Jobs ==
    364364=== Job Arrays ===
    365 If you are running a large number of serial jobs, it is recommended to submit them as a '''job array''' to make the best use of your allocated resources.  For example, suppose you are running 100 serial jobs using scripts located in a "scripts" folder, each of which does a serial calculation:  scripts/run1.sh, scripts/run2.sh, ..., scripts/run100.sh.  You would create an sbatch script named "run100scripts.srun" with contents:
     365If you are running a large number of serial jobs, it is recommended to submit them as a '''job array''' to make the best use of your allocated resources.  For example, suppose you are running 100 serial jobs using scripts located in a "scripts" folder, each of which does a serial calculation:  scripts/run1.sh, scripts/run2.sh, ..., scripts/run100.sh.  You would create a SLURM script named "run100scripts.srun" with contents:
    366366
    367367{{{
     
    385385}}}
    386386
    387 Make sure your scripts have executable permissions.  Then, submitting with:
     387Make sure your scripts have executable permissions.  Then, submit with:
    388388
    389389{{{
     
    410410
    411411=== Many-task computing ===
    412 If you have many tasks and each task needs a few cores, it may be beneficial to pack several tasks into one job.
     412If you have many tasks with similar run times and each task needs only a few cores, it may be beneficial to pack several tasks into one job.
    413413For example,
    414414