Changes between Version 9 and Version 10 of cypress/using

Timestamp:: 04/08/15 16:07:59 (10 years ago)
Author:: cmaggio
Comment:: —

cypress/using

-              v9
+              v10
 For those who are new to cluster computing and resource management, let's begin with an explanation of what a resource manager is and why it is necessary. Suppose you have a piece of C code that you would like to compile and execute, for example a helloworld program.
+[[Image(helloworld.png, 50%)]]
+{{{#!c
+#include <stdio.h>
+
+int main(void) {
+    printf("Hello World\n");
+    return 0;
+}
+}}}
 On your desktop you would open a terminal, compile the code with your favorite C compiler, and run the resulting executable. You can do this without worry because you are the only person using your computer, and you know what demands are being made on your CPU and memory when you run your code. On a cluster, many users must share the available resources equitably and simultaneously. It is the job of the resource manager to choreograph this sharing by accepting a description of your program and the resources it requires, searching the available hardware for resources that meet your requirements, and making sure that no one else is given those resources while you are using them.
 …
 The syntax of these script directives is manager-specific. For the SLURM resource manager, all script directives begin with "#SBATCH". Let's look at a basic SLURM script requesting one node and one core on which to run our helloworld program.
+ [[Image(hello_srun.png, 50%)]]
+{{{#!bash
+#!/bin/bash
+#SBATCH --job-name=HiWorld    ### Job Name
+#SBATCH --output=Hi.out       ### File in which to store job output
+#SBATCH --error=Hi.err        ### File in which to store job error messages
+#SBATCH --qos=normal          ### Quality of Service (like a queue in PBS)
+#SBATCH --time=0-00:01:00     ### Wall clock time limit in Days-HH:MM:SS
+#SBATCH --nodes=1             ### Node count required for the job
+#SBATCH --ntasks-per-node=1   ### Number of tasks to be launched per Node
+./helloworld
+}}}
 Notice that the SLURM script begins with #!/bin/bash. This line tells the system which shell interpreter should run the script. In this example we use Bash (the Bourne Again SHell). The choice of interpreter (and the syntax that follows from it) is up to the user, but every SLURM script should begin with an interpreter line of this kind. It is followed by a collection of #SBATCH script directives telling the manager which resources our code needs and where to put the code's output. Lastly, we have the executable we wish the manager to run (note: this script assumes it is located in the same directory as the executable).
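To make the three-part structure concrete, the sketch below shows that Bash itself treats every #SBATCH line as an ordinary comment; only the resource manager reads them. It is an illustration under stated assumptions: the script is written to a temporary file, `list_directives` is a hypothetical helper, and an `echo` stands in for the real executable so that the example runs without SLURM installed.

```shell
#!/bin/bash
# Write a throwaway SLURM-style script to a temporary file (illustration only).
demo_script=$(mktemp)
cat > "$demo_script" <<'EOF'
#!/bin/bash
#SBATCH --job-name=HiWorld
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
echo "payload ran"
EOF

# list_directives (hypothetical helper): print only the #SBATCH lines of a script.
list_directives() {
    grep '^#SBATCH' "$1"
}

# Prints the three #SBATCH lines that the resource manager would interpret.
list_directives "$demo_script"

# Run under plain Bash: the directives are skipped as comments,
# so only the payload executes and prints "payload ran".
bash "$demo_script"
```

Because the directives are comments to the shell, the same file remains a valid Bash script whether or not it is ever handed to the resource manager.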
 …
 Our job was successfully submitted and was assigned the job number 6041. We can check the output of our job by examining the contents of our output and error files. Referring back to the helloworld.srun SLURM script, notice the lines
+[[Image(output_error.png, 50%)]]
+{{{#!bash
+#SBATCH --output=Hi.out       ### File in which to store job output
+#SBATCH --error=Hi.err        ### File in which to store job error messages
+}}}
 These specify the files in which to store output written to standard output and standard error, respectively. If our code ran without issue, then the Hi.err file should be empty and the Hi.out file should contain our greeting.
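The check described above can be scripted. The following is a minimal sketch, assuming the job has already finished and that both files are in the current directory; `check_job` is a hypothetical helper name, not a SLURM command.

```shell
#!/bin/bash
# check_job (hypothetical helper): inspect a finished job's output and error files.
#   $1 = output file (e.g. Hi.out), $2 = error file (e.g. Hi.err)
check_job() {
    local out_file="$1"
    local err_file="$2"

    # -s is true when the file exists and is non-empty.
    if [ -s "$err_file" ]; then
        echo "Job wrote error messages to $err_file:"
        cat "$err_file"
        return 1
    fi

    echo "No error messages; contents of $out_file:"
    cat "$out_file"
}

# Usage, once the job has completed:
#   check_job Hi.out Hi.err
```

For the helloworld job, a successful run should report no error messages and then print the greeting stored in Hi.out.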