Changes between Version 10 and Version 11 of cypress/WGSA


Timestamp:
07/20/20 15:02:42 (4 years ago)
Author:
fuji
  • cypress/WGSA

    v10 v11  
    167167wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/GeneHancer/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
    168168}}}
     169
     170Guidance for using external resources (COSMIC, SPIDEX, CADD indel, dbNSFP) can be found [https://sites.google.com/site/jpopgen/wgsa/use-external-resources here].
     171
     172=== Procedure to run on Cypress ===
     173==== 1. Prepare input files ====
     174 Two input files are needed: a variant file and a configuration/setting file.
     175 The standard variant file is a plain-text file with TAB-delimited columns (TSV format).
     176
     177An example variant file, 'clinvar_subset.txt', can be downloaded [https://sites.google.com/site/jpopgen/wgsa/using-wgsa-via-aws/clinvar_subset.txt?attredirects=0&d=1 here].
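
Because the variant file must be TAB-delimited, a quick sanity check before running the pipeline can catch files saved with spaces or with ragged rows. The following is a minimal sketch and not part of WGSA: the function name `check_tsv` is hypothetical, and it only verifies that every line has the same number of TAB-separated fields and that there is more than one column.

```shell
#!/bin/bash
# check_tsv FILE: succeed only if every line of FILE has the same number
# of TAB-separated fields and that number is greater than one.
# (check_tsv is a hypothetical helper, not part of WGSA.)
check_tsv() {
    awk -F'\t' '
        NR == 1 { n = NF }
        NF != n { printf "line %d has %d fields, expected %d\n", NR, NF, n; bad = 1 }
        END     { if (NR == 0 || n < 2) bad = 1; exit bad }
    ' "$1"
}
```

For example, `check_tsv clinvar_subset.txt && echo "looks TAB-delimited"` prints the message only when the check passes.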
     178
     179A setting/configuration file is a plain-text file in which the user specifies the input file name, the output file name, the directories of the various resources, and the annotation options. Example template files can be found [https://sites.google.com/site/jpopgen/wgsa/using-wgsa-via-aws/example-config-file here].
     180
     181To run the pipeline on a local machine, '''the directory settings (lines 3 to 9) must be modified''' to reflect the absolute paths to the corresponding directories on the local machine.
     182
     183
     184{{{
     185input file name:                    clinvar_subset.txt                #name of the input file
     186output file name:                   clinvar_subset.txt.annotated             #name of the output file
     187resources dir:                      /lustre/project/hpcstaff/fuji/WGSA/resources/                    #the location of the resources folder
     189annovar dir:                        /lustre/project/hpcstaff/fuji/WGSA/annovar2019Oct24/annovar/     #the location of the ANNOVAR annotate_variation.pl
     191snpeff dir:                         /lustre/project/hpcstaff/fuji/WGSA/snpeff/snpEff/                #the location of the snpEff snpEff.jar
     193vep dir:                            /lustre/project/hpcstaff/fuji/WGSA/vep/ensembl-vep-release-94/   #the location of the VEP variant_effect_predictor.pl
     195.vep dir:                           /lustre/project/hpcstaff/fuji/WGSA/.vep/                         #the location of the .vep folder
     197tmp dir:                            /lustre/project/hpcstaff/fuji/WGSA/tmp/                          #the location of the tmp folder, used for VEP on-the-fly annotation
     199work dir:                           /lustre/project/hpcstaff/fuji/WGSA/work/                         #the location of the working folder, used for storing intermediate files
     201retain intermediate file:           b                            #supported option: snp or s, indel or i, both or b, no or n
     202ANNOVAR/Ensembl:                    b                            #supported option: snp or s, indel or i, both or b, no or n
     203ANNOVAR/RefSeq:                     b                            #supported option: snp or s, indel or i, both or b, no or n
     204ANNOVAR/UCSC:                       b                            #supported option: snp or s, indel or i, both or b, no or n
     205}}}
     206
     207In the example above, $WGSA_DIR='/lustre/project/hpcstaff/fuji/WGSA'.
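
Since the seven directory settings all share the $WGSA_DIR prefix, they can be rewritten in one pass when WGSA is installed somewhere else. The following is a minimal sketch, assuming the setting file uses exactly the prefix shown above; the function name `retarget_setting` and the file names in the usage line are hypothetical.

```shell
#!/bin/bash
# retarget_setting OLD_DIR NEW_DIR SRC_FILE DST_FILE
# Copy SRC_FILE to DST_FILE, replacing the OLD_DIR prefix with NEW_DIR on
# lines whose key ends in "dir:" (resources, annovar, snpeff, vep, .vep,
# tmp, work). All other lines are copied unchanged.
# (retarget_setting is a hypothetical helper, not part of WGSA.)
retarget_setting() {
    local old=$1 new=$2 src=$3 dst=$4
    # '|' is the sed delimiter so that '/' in paths needs no escaping.
    # Assumption: neither path contains '|', '&', or regex metacharacters.
    sed "/^[^#]*dir:/ s|${old}/|${new}/|" "$src" > "$dst"
}
```

A call such as `retarget_setting /lustre/project/hpcstaff/fuji/WGSA "$HOME/WGSA" example.setting my.setting` writes a copy with the new paths and leaves the original file untouched.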
     208
     209==== 2. Upload input files ====
     210 Upload the two input files to Cypress. They can be placed in any directory; here we create a directory 'WGSA_TEST' under '/lustre/project/hpcstaff/fuji/'.
     211
     212
     213{{{
     214mkdir /lustre/project/hpcstaff/fuji/WGSA_TEST
     215cd /lustre/project/hpcstaff/fuji/WGSA_TEST
     216}}}
     217
     218==== 3. Create the pipeline Slurm job script ====
     219An example Slurm job script:
     220{{{
     221#!/bin/bash
     222#SBATCH --job-name=WGSA       # Job Name
     223#SBATCH --output=WGSA.out     # File in which to store job output
     224#SBATCH --error=WGSA.err      # File in which to store job error messages
     225#SBATCH --qos=normal          # Quality of Service (like a queue in PBS)
     226#SBATCH --time=0-10:00:00     # Wall clock time limit in Days-HH:MM:SS
     227#SBATCH --nodes=1             # Node count required for the job
     228#SBATCH --ntasks-per-node=1   # Number of tasks to be launched per Node
     229#SBATCH --cpus-per-task=20    # Number of cores per task
     230#SBATCH --mem=128000          # Max RAM request 128GByte
     231
     232# Module load
     233module load java-openjdk/1.8.0
     234
     235# Set the directory where WGSA is installed
     236export WGSA_DIR=/lustre/project/hpcstaff/fuji/WGSA
     237
     238# Set 'setting/configuration file'
     239SETTING_FILE=test1000g-hg38-WGSA085.EC2.setting
     240
     241# Setup
     242echo "Understand" | java -cp $WGSA_DIR WGSA085 $SETTING_FILE -m 128 -t 20 -v hg19
     243
     244# Run job
     245sh ./${SETTING_FILE}.sh
     246}}}
     247Save it under a name such as 'Slurmscript' in the same directory as the two input files.
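
Before submitting, it can help to confirm that the setting file and the variant file it names are both in the job directory. The following is a minimal sketch, assuming the 'input file name:' entry is a path relative to the current directory and contains no '#'; the function name `preflight` is hypothetical.

```shell
#!/bin/bash
# preflight SETTING_FILE: check that the setting file exists and that the
# variant file named on its "input file name:" line is present.
# (preflight is a hypothetical helper, not part of WGSA.)
preflight() {
    local setting=$1 input
    [ -f "$setting" ] || { echo "missing setting file: $setting" >&2; return 1; }
    # Take the text after "input file name:", drop the trailing '#' comment,
    # then trim surrounding whitespace.
    input=$(sed -n 's/^input file name:\([^#]*\).*/\1/p' "$setting" \
            | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | head -n 1)
    [ -n "$input" ] || { echo "no 'input file name:' line in $setting" >&2; return 1; }
    [ -f "$input" ] || { echo "missing variant file: $input" >&2; return 1; }
    echo "OK: $setting and $input found"
}
```

Run it from the job directory with the same file name that the job script assigns to SETTING_FILE.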
     248
     249==== 4. Run the pipeline job script ====
     250
     251{{{
     252sbatch Slurmscript
     253}}}
     254
     255It will take about 7 hours to finish.
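
While the job is running, `squeue -u $USER` lists it in the queue. Once it has finished, the two log files named in the job script (WGSA.out and WGSA.err) are the first place to look. The following is a minimal sketch that only summarizes those files; the function name `report_logs` is hypothetical, and a non-empty WGSA.err does not by itself mean that the run failed.

```shell
#!/bin/bash
# report_logs [DIR]: summarize the Slurm log files named in the job script
# (WGSA.out and WGSA.err) found in DIR (default: current directory).
# (report_logs is a hypothetical helper, not part of WGSA or Slurm.)
report_logs() {
    local dir=${1:-.}
    if [ -s "$dir/WGSA.err" ]; then
        echo "WGSA.err is not empty; last lines:"
        tail -n 5 "$dir/WGSA.err"
    else
        echo "WGSA.err is empty or absent"
    fi
    if [ -f "$dir/WGSA.out" ]; then
        echo "last lines of WGSA.out:"
        tail -n 5 "$dir/WGSA.out"
    fi
}
```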