Changes between Version 1 and Version 2 of Workshops/JobCheckpointing/Examples/Python
- Timestamp: 03/20/2026 01:48:46 PM
Workshops/JobCheckpointing/Examples/Python
v1  v2
78  78    To run the Python checkpointing job example, defaulting to checkpointing every 20 application iterations and a total of 500 iterations, perform the following.
79  79
80      - 1. Edit the files '''checkpoint_runner.sh''' and '''checkpoint_signal_iter.py''' in your current directory.
    80  + 1. Edit the files '''checkpoint_runner.sh''' and '''checkpoint_signal_iter.py''' as shown above in your current directory.
81  81    For file editing with nano, etc., see [[https://wiki.hpc.tulane.edu/trac/wiki/cypress/FileEditingSoftware/Example|File Editing Example]].
82  82
…   …
87  87    }}}
88  88
89      - 2. Monitor the job's output via the following command, substituting the job ID for <jobID>.
    89  + 3. Monitor the job's output via the following command, substituting the job ID for <jobID>.
90  90
91  91    {{{
…   …
93  93    }}}
94  94
95      - 3. Here are normal results for the output and error files, '''log_<jobID>.err''' and '''log_<jobID>.out''', observing that the job cancelled and requeued itself many times. (Not all cancellations were captured in the error file.)
    95  + 4. Here are normal results for the output and error files, '''log_<jobID>.err''' and '''log_<jobID>.out''', observing that the job cancelled and requeued itself many times. (Not all cancellations were captured in the error file.)
96  96
97  97    {{{
