Installing and Setting Up WGSA in a local directory on Cypress
These instructions are based on this page and adapted for Cypress.
Choose a directory dedicated to the pipeline, for example '/lustre/project/group/WGSA'.
Set up an environment variable and create the workspaces as follows:
export WGSA_DIR=/lustre/project/group/WGSA
mkdir $WGSA_DIR
cd $WGSA_DIR
mkdir work
mkdir tmp
chmod 777 work
chmod 777 tmp
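The workspace step can also be written so that it is safe to run more than once. The following is a minimal sketch, not part of the original instructions; the fallback to a temporary directory is an assumption added only so the snippet can be tried without touching /lustre.

```shell
#!/bin/sh
# Sketch only: falls back to a throwaway temp directory when WGSA_DIR is
# unset (an assumption for trying the snippet outside Cypress).
WGSA_DIR="${WGSA_DIR:-$(mktemp -d)}"
export WGSA_DIR

# -p makes the step safe to re-run: existing directories are not an error.
mkdir -p "$WGSA_DIR/work" "$WGSA_DIR/tmp"
chmod 777 "$WGSA_DIR/work" "$WGSA_DIR/tmp"

# Show the resulting directories and their permissions.
ls -ld "$WGSA_DIR/work" "$WGSA_DIR/tmp"
```

On Cypress, export WGSA_DIR as shown above before running it, so that the real project directory is used.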
Create a directory for ANNOVAR:
mkdir $WGSA_DIR/annovar2019Oct24
Download the ANNOVAR main package from here. The package comes as annovar.latest.tar.gz; save it to $WGSA_DIR/annovar2019Oct24 and extract it:
cd $WGSA_DIR/annovar2019Oct24
tar -zxvf annovar.latest.tar.gz
Download RefSeq and Ensembl gene models for ANNOVAR:
cd $WGSA_DIR/annovar2019Oct24/annovar
perl annotate_variation.pl -buildver hg19 -downdb -webfrom annovar refGene humandb/
perl annotate_variation.pl -buildver hg19 -downdb -webfrom annovar ensGene humandb/
perl annotate_variation.pl -buildver hg19 -downdb -webfrom annovar knownGene humandb/
perl annotate_variation.pl -buildver hg38 -downdb -webfrom annovar refGene humandb/
perl annotate_variation.pl -buildver hg38 -downdb -webfrom annovar ensGene humandb/
perl annotate_variation.pl -buildver hg38 -downdb -webfrom annovar knownGene humandb/
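The six ANNOVAR downloads differ only in genome build and gene model, so they can be expressed as a nested loop. This is a sketch under stated assumptions: `run` and `DRY_RUN` are hypothetical helpers introduced here, not part of ANNOVAR, and by default the commands are only printed so they can be reviewed first.

```shell
#!/bin/sh
# DRY_RUN and run() are hypothetical helpers, not part of ANNOVAR.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Issue one annotate_variation.pl download per build and gene model.
annovar_downloads() {
  for build in hg19 hg38; do
    for db in refGene ensGene knownGene; do
      run perl annotate_variation.pl -buildver "$build" -downdb \
        -webfrom annovar "$db" humandb/
    done
  done
}

annovar_downloads
```

To actually download, run it with DRY_RUN=0 from $WGSA_DIR/annovar2019Oct24/annovar.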
Install SnpEff (required for annotating indels with SnpEff or annotating SNVs with SnpEff on-the-fly).
Download the SnpEff v4.3t main package and save the zip file to $WGSA_DIR/snpeff:
mkdir $WGSA_DIR/snpeff
cd $WGSA_DIR/snpeff
wget http://sourceforge.net/projects/snpeff/files/snpEff_v4_3t_core.zip
unzip snpEff_v4_3t_core.zip
To use a newer version of the Java SDK, you have to log in to a compute node.
Start an interactive session:
idev -c 1 -t 4
It will take more than one hour. See here for more about 'idev'.
Once you get to a compute node, make sure your current directory is $WGSA_DIR/snpeff.
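A new interactive session may not start in the directory you left, and it may not carry over variables exported earlier, so it can help to check the location explicitly. The sketch below is only an illustration: `in_dir` is a hypothetical helper, and the temporary-directory fallback is an assumption that lets the snippet run outside Cypress.

```shell
#!/bin/sh
# in_dir is a hypothetical helper: it succeeds only if the current directory
# is the given directory (symlinks resolved on both sides).
in_dir() {
  [ -d "$1" ] && [ "$(pwd -P)" = "$(cd "$1" && pwd -P)" ]
}

# The temp-directory fallback is only so the sketch runs outside Cypress.
WGSA_DIR="${WGSA_DIR:-$(mktemp -d)}"
mkdir -p "$WGSA_DIR/snpeff"
cd "$WGSA_DIR/snpeff" || exit 1

if in_dir "$WGSA_DIR/snpeff"; then
  echo "OK: in $WGSA_DIR/snpeff"
else
  echo "Not in $WGSA_DIR/snpeff; run: cd \$WGSA_DIR/snpeff" >&2
fi
```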
Download RefSeq and Ensembl gene models for SnpEff:
module load java-openjdk/1.8.0
cd snpEff
java -jar snpEff.jar download -v hg19
java -jar snpEff.jar download -v GRCh37.75
java -jar snpEff.jar download -v hg38
java -jar snpEff.jar download -v GRCh38.86
Exit from the computing node:
exit
Install htslib, which is required for VEP API.
mkdir $WGSA_DIR/htslib
cd $WGSA_DIR/htslib
wget https://github.com/samtools/htslib/releases/download/1.9/htslib-1.9.tar.bz2
tar -vxjf htslib-1.9.tar.bz2
cd htslib-1.9
make prefix=$WGSA_DIR/htslib install
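Before moving on to VEP, it may be worth confirming that the htslib install populated the prefix. The following sketch assumes the usual htslib layout (tabix and bgzip under bin, headers under include/htslib, the static library under lib); `check_htslib` is a hypothetical helper, not something htslib provides.

```shell
#!/bin/sh
# check_htslib is a hypothetical helper. It lists expected files that are
# missing under an htslib install prefix and returns non-zero if any are.
check_htslib() {
  prefix=$1
  missing=0
  for f in bin/tabix bin/bgzip include/htslib/hts.h lib/libhts.a; do
    if [ ! -e "$prefix/$f" ]; then
      echo "missing: $prefix/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call; reports nothing extra when the install is complete.
check_htslib "${WGSA_DIR:-/lustre/project/group/WGSA}/htslib" \
  || echo "htslib install looks incomplete"
```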
Set up the environment variables:
export PATH=$WGSA_DIR/htslib/bin:$PATH
export CPATH=$WGSA_DIR/htslib/include:$CPATH
export LD_LIBRARY_PATH=$WGSA_DIR/htslib/lib:$LD_LIBRARY_PATH
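If CPATH or LD_LIBRARY_PATH is not already set, appending ":$CPATH" leaves a trailing colon, and an empty entry in these variables is generally interpreted as the current directory. A slightly more defensive variant is sketched below; it is an optional refinement rather than something the original instructions require, and the temporary-directory fallback is an assumption so the snippet runs anywhere.

```shell
#!/bin/sh
# Sketch: the temp-directory fallback exists only so this runs anywhere.
WGSA_DIR="${WGSA_DIR:-$(mktemp -d)}"

# ${VAR:+:$VAR} appends ":$VAR" only when VAR is set and non-empty,
# so an unset CPATH or LD_LIBRARY_PATH does not leave a trailing colon.
export PATH="$WGSA_DIR/htslib/bin${PATH:+:$PATH}"
export CPATH="$WGSA_DIR/htslib/include${CPATH:+:$CPATH}"
export LD_LIBRARY_PATH="$WGSA_DIR/htslib/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

echo "CPATH=$CPATH"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

These exports last only for the current shell session, so they need to be repeated (or placed in a file that you source) in every session that runs VEP.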
Install VEP (required for annotating indels with VEP or annotating SNVs with VEP on-the-fly)
Download VEP 94 main package and save it to $WGSA_DIR/vep:
mkdir $WGSA_DIR/vep
cd $WGSA_DIR/vep
wget https://github.com/Ensembl/ensembl-vep/archive/release/94.zip
unzip 94.zip
Install the VEP API to $WGSA_DIR/vep and download the RefSeq and Ensembl gene models to $WGSA_DIR/.vep:
cd $WGSA_DIR/vep/ensembl-vep-release-94/
mkdir $WGSA_DIR/.vep
export DEST_DIR=$WGSA_DIR
export PERL5LIB=$WGSA_DIR
perl INSTALL.pl -c $WGSA_DIR/.vep --ASSEMBLY GRCh37
Go through the steps of the installation process, following the guidance at http://useast.ensembl.org/info/docs/tools/vep/script/vep_tutorial.html. When asked for the cache files, choose “242 : homo_sapiens_merged_vep_94_GRCh37.tar.gz”. When asked for FASTA files, choose “27 : homo_sapiens”. When asked for the plugins, choose “7:LOF”. Downloading the FASTA file is required for the current version of WGSA.
*This takes a very long time…
perl INSTALL.pl -c $WGSA_DIR/.vep --ASSEMBLY GRCh38
When asked for the cache files, choose “243 : homo_sapiens_merged_vep_94_GRCh38.tar.gz”. When asked for FASTA files, choose “54: homo_sapiens”. When asked for the plugins, choose “n”, as LOF has already been installed.
*This takes a very long time…
Change the permissions for these directories…
chmod 777 $WGSA_DIR/.vep/Plugins
chmod 777 $WGSA_DIR/.vep/homo_sapiens/94_GRCh37
chmod 777 $WGSA_DIR/.vep/homo_sapiens/94_GRCh38
Install LOFTEE LOF plugin for VEP API
cd $WGSA_DIR/.vep/Plugins
wget https://github.com/konradjk/loftee/archive/v0.1.1-beta.zip
unzip -j v0.1.1-beta.zip
rm v0.1.1-beta.zip
Download the pipeline programs and other resources
cd $WGSA_DIR
wget http://web.corral.tacc.utexas.edu/WGSAdownload/WGSA085.class
mkdir $WGSA_DIR/resources
cd $WGSA_DIR/resources
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/javaclass/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/hg19/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/hg38/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/precomputed/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/SpliceAI/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/GRASP/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/human_ancestor_GRCh37_e71/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/Neandertal/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/GWAS_catalog/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/GenoCanyon/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/clinvar/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/GeneHancer/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
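The resource downloads all use the same wget options and differ only in the directory name, so the list can be driven by a loop. This is a sketch under stated assumptions: `run`, `DRY_RUN`, `BASE_URL` and `wgsa_resources` are names introduced here for illustration, and by default the commands are only printed so the list can be reviewed before anything is fetched.

```shell
#!/bin/sh
# DRY_RUN and run() are hypothetical helpers. With DRY_RUN=1 (the default
# here) each command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

BASE_URL=http://web.corral.tacc.utexas.edu/WGSAdownload/resources

# Mirror each resource directory with the same options as the commands above.
wgsa_resources() {
  for name in javaclass hg19 hg38 precomputed SpliceAI GRASP \
              human_ancestor_GRCh37_e71 Neandertal GWAS_catalog \
              GenoCanyon clinvar GeneHancer; do
    run wget "$BASE_URL/$name/" --recursive --continue --timestamping \
      --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
  done
}

wgsa_resources
```

Run it with DRY_RUN=0 from $WGSA_DIR/resources to perform the downloads. Because the options include --continue and --timestamping, re-running after an interruption should pick up where the transfer stopped rather than start over.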