Changes between Version 2 and Version 3 of Workshops/JobCheckpointing/Examples/BASH
- Timestamp: 03/20/2026 01:45:57 PM
Workshops/JobCheckpointing/Examples/BASH
v2  v3
62  62  exit 0
63  63  fi
    64  done
64  65  }}}
65  66
…   …
68  69  To run the BASH checkpointing job example, defaulting to checkpointing every 20 application iterations and a total of 500 iterations, perform the following.
69  70
70      1. Edit the files '''checkpoint_runner.sh''' and '''checkpoint_signal_iter.sh''' in your current directory.
    71  1. Edit the files '''checkpoint_runner.sh''' and '''checkpoint_signal_iter.sh''' as shown above in your current directory.
71  72  For file editing with nano, etc., see [[https://wiki.hpc.tulane.edu/trac/wiki/cypress/FileEditingSoftware/Example|File Editing Example]].
72  73
73      2. Submit the job via the following command.
    74  2. Change permissions on the BASH application script, '''checkpoint_signal_iter.sh''' executable via the following command.
    75
    76  {{{
    77  [tulaneID@cypress1 ~]$ chmod u+x checkpoint_signal_iter.sh
    78  }}}
    79
    80  3. Submit the job via the following command.
74  81
75  82  {{{
…   …
77  84  }}}
78  85
79      2. Monitor the job's output via the following command, substituting the job ID for <jobID>.
    86  4. Monitor the job's output via the following command, substituting the job ID for <jobID>.
80  87
81  88  {{{
…   …
83  90  }}}
84  91
85      3. Here are normal results for the output and error files, '''log_<jobID>.err''' and '''log_<jobID>.out''', observing that the job cancelled and requeued itself many times.
    92  5. Here are normal results for the output and error files, '''log_<jobID>.err''' and '''log_<jobID>.out''', observing that the job cancelled and requeued itself many times.
86  93
87  94  {{{
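
Version 3 adds a `done` line to the script listing and a `chmod u+x` step before submission. Both kinds of problem can be caught before a job is queued. The following is a minimal sketch, not part of the wiki page: the function name `preflight` and its messages are hypothetical. It relies only on `bash -n` (parse a script without executing it) and the `-x` file test.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the wiki page):
# report whether each script exists, parses cleanly, and is executable.
preflight() {
    local script status=0
    for script in "$@"; do
        if [ ! -f "$script" ]; then
            echo "MISSING: $script"
            status=1
        elif ! bash -n "$script" 2>/dev/null; then
            # bash -n reads the script without running it; an unterminated
            # loop (for example, a missing "done") makes it exit non-zero.
            echo "SYNTAX ERROR: $script"
            status=1
        elif [ ! -x "$script" ]; then
            # The owner execute bit has not been set yet.
            echo "NOT EXECUTABLE: $script (try: chmod u+x $script)"
            status=1
        else
            echo "OK: $script"
        fi
    done
    return "$status"
}
```

Running `preflight checkpoint_signal_iter.sh` before the submit step would then flag either condition. For a script that only needs a syntax check, `bash -n` can be run on it directly.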
