Changes between Version 7 and Version 8 of Workshops/cypress/JobArrays


Timestamp: 08/16/25 16:52:42
Author: fuji

  • Workshops/cypress/JobArrays

    v7  v8
    19  19  }}}
    20  20
    21      -There is a python script that is
        21  +There is a Python script that is
    22  22  {{{
    23  23  [fuji@cypress1 JobArray1]$ cat hello2.py
     
    126  126  === Use Array Task ID to define the script file name ===
    127  127
    128       -Get into '''JobArray2''' directory under '''workshop''',
    129       -{{{
    130       -[fuji@cypress1 ~]$ cd workshop/JobArray2/
         128  +Get into '''JobArray2''' directory under '''hpc-workshop''',
         129  +{{{
         130  +[fuji@cypress1 ~]$ cd hpc-workshop/JobArray2/
    131  131  [fuji@cypress1 JobArray2]$ ls
    132  132  hello2.py    script01.sh  script03.sh  script05.sh  script07.sh  script09.sh   slurmscript2
     
    160  160  }}}
    161  161
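The body of '''slurmscript2''' is unchanged in this revision, so the diff folds it out above. As a rough sketch of the idea behind this section (not the actual workshop script), a job script can build the shell-script name from the array task ID; the `script%02d.sh` pattern and the `1-9` range are assumptions based on the directory listing:

{{{#!bash
#!/bin/bash
#SBATCH --qos=workshop          # Quality of Service
#SBATCH --partition=workshop    # partition
#SBATCH --job-name=job_array    # Job Name
#SBATCH --time=00:01:00         # WallTime
#SBATCH --array=1-9             # assumed: one task per script01.sh ... script09.sh

# Build the script file name from the array task ID, zero-padded to two digits,
# e.g. task 3 -> script03.sh (hypothetical construction; see slurmscript2 for the real one).
SCRIPT_NAME=$(printf "script%02d.sh" "$SLURM_ARRAY_TASK_ID")
echo "Task $SLURM_ARRAY_TASK_ID running $SCRIPT_NAME"
sh "$SCRIPT_NAME"
}}}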
         162  +=== Use Array Task ID to identify the data file ===
         163  +
         164  +Get into '''JobArray3''' directory under '''hpc-workshop''',
         165  +{{{
         166  +[fuji@cypress1 ~]$ cd hpc-workshop/JobArray3/
         167  +[fuji@cypress1 JobArray3]$ ls
         168  +data  slurmscript
         169  +}}}
         170  +
         171  +In '''data''' directory,
         172  +{{{
         173  +[fuji@cypress1 JobArray3]$ ls data
         174  +data_file_10.txt  data_file_2.txt  data_file_4.txt  data_file_6.txt  data_file_8.txt
         175  +data_file_1.txt   data_file_3.txt  data_file_5.txt  data_file_7.txt  data_file_9.txt
         176  +}}}
         177  +'''slurmscript'''
         178  +{{{#!bash
         179  +#!/bin/bash
         180  +#SBATCH --qos=workshop          # Quality of Service
         181  +#SBATCH --partition=workshop    # partition
         182  +#SBATCH --job-name=job_array    # Job Name
         183  +#SBATCH --time=00:01:00         # WallTime
         184  +#SBATCH --nodes=1               # Number of Nodes
         185  +#SBATCH --ntasks-per-node=1     # Number of tasks (MPI processes)
         186  +#SBATCH --cpus-per-task=1       # Number of threads per task (OMP threads)
         187  +#SBATCH --array=1-10            # Array of IDs=1,2,...10
         188  +
         189  +# list all (10) files in a data directory and use a job array to process each file.
         190  +# define the data directory
         191  +DATA_DIRECTORY=./data
         192  +echo Using DATA_DIRECTORY=$DATA_DIRECTORY
         193  +echo Using SLURM_ARRAY_TASK_ID=$SLURM_ARRAY_TASK_ID
         194  +# select the data file from the data directory using the SLURM task ID
         195  +DATA_FILE=$(find $DATA_DIRECTORY -type f | sort -V | sed -n "$SLURM_ARRAY_TASK_ID p")
         196  +echo Using DATA_FILE=$DATA_FILE.
         197  +# define the output directory
         198  +OUTPUT_DIRECTORY=./output
         199  +mkdir -p $OUTPUT_DIRECTORY
         200  +echo Using OUTPUT_DIRECTORY=$OUTPUT_DIRECTORY
         201  +OUTPUT_FILE=$OUTPUT_DIRECTORY/$(basename $DATA_FILE).out
         202  +# if the output file already exists, then bypass and exit
         203  +echo Checking for OUTPUT_FILE=$OUTPUT_FILE...
         204  +if [ -f $OUTPUT_FILE ]; then
         205  +   echo Found. Bypassing processing.
         206  +else
         207  +   echo Not found. Processing.
         208  +   sed -r 's/(.*)/\1 output/' $DATA_FILE >> $OUTPUT_FILE
         209  +   echo Done.
         210  +fi
         211  +}}}
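The file-selection line is the core of this script: `find` lists every file under '''data''', `sort -V` puts them in natural numeric order (`data_file_1.txt` through `data_file_10.txt`), and `sed -n "$SLURM_ARRAY_TASK_ID p"` prints only the line whose number equals the array task ID, so task ''N'' gets `data_file_N.txt`. The selection logic can be checked outside of Slurm by setting the variable by hand; this is only a sketch, run from the '''JobArray3''' directory:

{{{#!bash
#!/bin/bash
# Dry run of the selection logic: pretend to be array task 3
# and print which data file would be processed.
DATA_DIRECTORY=./data
SLURM_ARRAY_TASK_ID=3
DATA_FILE=$(find $DATA_DIRECTORY -type f | sort -V | sed -n "$SLURM_ARRAY_TASK_ID p")
echo "Task $SLURM_ARRAY_TASK_ID would process: $DATA_FILE"   # expected: ./data/data_file_3.txt
}}}

Submitting the real job with `sbatch slurmscript` starts ten array tasks; each one writes `output/data_file_N.txt.out` and skips any file whose output already exists.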
    162  212  === Cancel Jobs in Job Array ===
    163  213   Look at '''slurmscript12'''
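Individual tasks of a job array can be cancelled without killing the whole array by passing `scancel` the job ID together with the task ID; the job ID below is only a placeholder:

{{{#!bash
scancel 123456_7         # cancel only task 7 of array job 123456 (placeholder ID)
scancel "123456_[3-5]"   # cancel tasks 3 through 5
scancel 123456           # cancel every task in the array
}}}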