Changes between Version 10 and Version 11 of cypress/WGSA

Timestamp: 07/20/20 15:02:42
Author: fuji
Comment: (none)

cypress/WGSA

-              v10
+              v11
 wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/GeneHancer/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
 }}}
+Guidance for using external resources (COSMIC, SPIDEX, CADD indel, dbNSFP) can be found [https://sites.google.com/site/jpopgen/wgsa/use-external-resources here].
+=== Procedure to run on Cypress ===
+==== 1. Prepare input files ====
+ Two input files are needed: a variant file and a configuration/setting file.
+ The standard variant file is a plain-text file with TAB-delimited columns (TSV format).
+An example variant file, 'clinvar_subset.txt', can be downloaded [https://sites.google.com/site/jpopgen/wgsa/using-wgsa-via-aws/clinvar_subset.txt?attredirects=0&d=1 here].
+A setting/configuration file is a plain-text file in which the user provides the name of the input file, the name of the output file, the directories of the various resources, and the annotation options. Example template files can be found [https://sites.google.com/site/jpopgen/wgsa/using-wgsa-via-aws/example-config-file here].
+To run the pipeline on a local machine, '''the directory settings (lines 3 to 9) must be modified''' to reflect the absolute paths to the corresponding directories on the local machine.
+{{{
+input file name:                    clinvar_subset.txt                #name of the input file
+output file name:                   clinvar_subset.txt.annotated             #name of the output file
+resources dir:                      /lustre/project/hpcstaff/fuji/WGSA/resources/                   #the location of the resources folder
+annovar dir:                        /lustre/project/hpcstaff/fuji/WGSA/annovar2019Oct24/annovar/    #the location of the ANNOVAR annotate_variation.pl
+snpeff dir:                         /lustre/project/hpcstaff/fuji/WGSA/snpeff/snpEff/               #the location of the snpEff snpEff.jar
+vep dir:                            /lustre/project/hpcstaff/fuji/WGSA/vep/ensembl-vep-release-94/  #the location of the VEP variant_effect_predictor.pl
+.vep dir:                           /lustre/project/hpcstaff/fuji/WGSA/.vep/                        #the location of the .vep folder
+tmp dir:                            /lustre/project/hpcstaff/fuji/WGSA/tmp/                         #the location of the tmp folder, used for VEP on-the-fly annotation
+work dir:                           /lustre/project/hpcstaff/fuji/WGSA/work/                        #the location of the working folder, used for storing intermediate files
+retain intermediate file:           b                            #supported option: snp or s, indel or i, both or b, no or n
+ANNOVAR/Ensembl:                    b                            #supported option: snp or s, indel or i, both or b, no or n
+ANNOVAR/RefSeq:                     b                            #supported option: snp or s, indel or i, both or b, no or n
+ANNOVAR/UCSC:                       b                            #supported option: snp or s, indel or i, both or b, no or n
+}}}
+In the example above, $WGSA_DIR='/lustre/project/hpcstaff/fuji/WGSA'.
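Because every directory entry must be an absolute path that exists on the machine, a quick check before running can catch a mistyped path. The following is a minimal sketch, not part of WGSA: the function name `check_setting_dirs` is hypothetical, and it assumes each directory line has the form `<name> dir:   /abs/path/   #comment`, as in the example above.

```shell
# Hypothetical helper (not part of WGSA): report directories named in a
# setting file that do not exist on this machine.
# Assumes directory lines look like "<name> dir:   /abs/path/   #comment".
check_setting_dirs() {
    setting_file="$1"
    missing=0
    while IFS= read -r line; do
        line=${line%%#*}            # drop the trailing #comment
        case "$line" in
            *"dir:"*) ;;            # keep only the directory entries
            *) continue ;;
        esac
        value=${line#*dir:}         # drop the key, keep the path
        # Trim leading and trailing whitespace around the path.
        value=$(printf '%s' "$value" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
        if [ -n "$value" ] && [ ! -d "$value" ]; then
            echo "missing: $value"
            missing=1
        fi
    done < "$setting_file"
    return "$missing"
}
```

Calling `check_setting_dirs` with the path of the setting file prints one `missing:` line per absent directory and returns a non-zero status if any are absent.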
+==== 2. Upload input files ====
+ Upload the two input files to Cypress. You can place them in any directory. Here we create a directory 'WGSA_TEST' under '/lustre/project/hpcstaff/fuji/':
+{{{
+mkdir /lustre/project/hpcstaff/fuji/WGSA_TEST
+cd /lustre/project/hpcstaff/fuji/WGSA_TEST
+}}}
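Once the files are uploaded, it may help to confirm that the setting file and the variant file it names are both in the directory. The sketch below is illustrative only: `check_inputs` is a hypothetical helper, and it assumes the `input file name:` entry holds a plain file name relative to that directory, as in the example setting file.

```shell
# Hypothetical helper (not part of WGSA): confirm that the setting file and
# the variant file it names both sit in the given directory.
check_inputs() {
    dir="$1"
    setting="$2"
    if [ ! -f "$dir/$setting" ]; then
        echo "setting file not found: $dir/$setting"
        return 1
    fi
    # Take the value after "input file name:", dropping the trailing #comment.
    input=$(sed -n 's/^input file name:[[:space:]]*\([^#[:space:]]*\).*/\1/p' "$dir/$setting" | head -n 1)
    if [ -z "$input" ] || [ ! -f "$dir/$input" ]; then
        echo "variant file not found: $dir/$input"
        return 1
    fi
    echo "ok: $setting and $input are in $dir"
}
```

For the directory created above, the call would be `check_inputs /lustre/project/hpcstaff/fuji/WGSA_TEST <your setting file>`.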
+==== 3. Create the pipeline Slurm job script ====
+An example Slurm job script:
+{{{
+#!/bin/bash
+#SBATCH --job-name=WGSA       # Job Name
+#SBATCH --output=WGSA.out     # File in which to store job output
+#SBATCH --error=WGSA.err      # File in which to store job error messages
+#SBATCH --qos=normal          # Quality of Service (like a queue in PBS)
+#SBATCH --time=0-10:00:00     # Wall clock time limit in Days-HH:MM:SS
+#SBATCH --nodes=1             # Node count required for the job
+#SBATCH --ntasks-per-node=1   # Number of tasks to be launched per Node
+#SBATCH --cpus-per-task=20    # Number of cores per task
+#SBATCH --mem=128000          # Max RAM request 128GByte
+# Module load
+module load java-openjdk/1.8.0
+# Set the directory where WGSA is installed
+export WGSA_DIR=/lustre/project/hpcstaff/fuji/WGSA
+# Set the setting/configuration file
+SETTING_FILE=test1000g-hg38-WGSA085.EC2.setting
+# Setup
+echo "Understand" | java -cp $WGSA_DIR WGSA085 $SETTING_FILE -m 128 -t 20 -v hg19
+# Run job
+sh ./${SETTING_FILE}.sh
+}}}
+Save it under a name, for example 'Slurmscript', in the same directory as the two input files.
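The example script states the memory and core counts twice: once in the `#SBATCH` requests and once in the `-m 128 -t 20` arguments. One way to keep them in step is to derive the arguments from the environment variables Slurm sets for the job. This is a sketch under assumptions: `wgsa_resource_flags` is a hypothetical helper, and it assumes `-m` takes gigabytes, as the pairing of `--mem=128000` with `-m 128` in the example suggests.

```shell
# Hypothetical helper (not part of WGSA): build the -m/-t arguments from the
# resources Slurm granted. SLURM_MEM_PER_NODE is in megabytes and is set when
# --mem is given; SLURM_CPUS_PER_TASK is set when --cpus-per-task is given.
# Assumption: WGSA's -m option is in gigabytes.
wgsa_resource_flags() {
    # Fall back to the values of the example script when not under Slurm.
    mem_mb=${SLURM_MEM_PER_NODE:-128000}
    cpus=${SLURM_CPUS_PER_TASK:-20}
    echo "-m $((mem_mb / 1000)) -t $cpus"
}
```

Inside the job script, the Setup line could then pass `$(wgsa_resource_flags)` in place of the literal `-m 128 -t 20`.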
+==== 4. Run the pipeline job script ====
+{{{
+sbatch Slurmscript
+}}}
+It will take about 7 hours to finish.