
Version 10 (modified by fuji, 4 years ago)

Installing and Setting Up WGSA in a Local Directory on Cypress

These instructions are based on this page and adapted for Cypress.

Choose a directory dedicated to the pipeline, for example '/lustre/project/group/WGSA'.

Set up an environment variable and create the workspaces as follows:

export WGSA_DIR=/lustre/project/group/WGSA
mkdir $WGSA_DIR
cd $WGSA_DIR
mkdir work
mkdir tmp
chmod 777 work
chmod 777 tmp
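
The workspace setup above can be wrapped in a small helper so that it is safe to rerun, since plain `mkdir` fails when the directory already exists. This is only a sketch: the function name `setup_wgsa_workspace` is hypothetical, and the demo runs against a throwaway directory from `mktemp` rather than the real project path.

```shell
#!/bin/sh
# Hypothetical helper (not part of WGSA): create the pipeline root with its
# work/ and tmp/ subdirectories, and open both up as the steps above do.
setup_wgsa_workspace() {
    root="$1"
    mkdir -p "$root/work" "$root/tmp" || return 1
    chmod 777 "$root/work" "$root/tmp"
}

# Demo on a throwaway directory; on Cypress you would pass
# /lustre/project/group/WGSA (or your own project path) instead.
demo_root="$(mktemp -d)/WGSA"
setup_wgsa_workspace "$demo_root"
setup_wgsa_workspace "$demo_root"   # rerunning is harmless thanks to mkdir -p
ls -ld "$demo_root/work" "$demo_root/tmp"
```

Because `mkdir -p` also creates missing parent directories, the helper works whether or not the pipeline root exists yet.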

Create a directory for ANNOVAR:

mkdir $WGSA_DIR/annovar2019Oct24

Download the ANNOVAR main package from here. The package comes as annovar.latest.tar.gz. Save it to $WGSA_DIR/annovar2019Oct24 and extract it:

cd $WGSA_DIR/annovar2019Oct24
tar -zxvf annovar.latest.tar.gz

Download the RefSeq, Ensembl, and UCSC knownGene gene models for ANNOVAR:

cd $WGSA_DIR/annovar2019Oct24/annovar
perl annotate_variation.pl -buildver hg19 -downdb -webfrom annovar refGene humandb/
perl annotate_variation.pl -buildver hg19 -downdb -webfrom annovar ensGene humandb/
perl annotate_variation.pl -buildver hg19 -downdb -webfrom annovar knownGene humandb/
perl annotate_variation.pl -buildver hg38 -downdb -webfrom annovar refGene humandb/
perl annotate_variation.pl -buildver hg38 -downdb -webfrom annovar ensGene humandb/     
perl annotate_variation.pl -buildver hg38 -downdb -webfrom annovar knownGene humandb/    
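
The six commands above differ only in the genome build and the gene model, so they can be generated with a nested loop. The sketch below only prints the commands so they can be reviewed first. The function name `annovar_download_cmds` is hypothetical; the flags are copied from the commands above.

```shell
#!/bin/sh
# Print one annotate_variation.pl download command per (build, gene model)
# pair. Nothing is downloaded here; the output is meant to be reviewed first.
annovar_download_cmds() {
    for build in hg19 hg38; do
        for db in refGene ensGene knownGene; do
            echo "perl annotate_variation.pl -buildver $build -downdb -webfrom annovar $db humandb/"
        done
    done
}

annovar_download_cmds
```

To run the downloads, the printed commands could be piped to `sh` from inside $WGSA_DIR/annovar2019Oct24/annovar, for example with `annovar_download_cmds | sh`.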

Install SnpEff (required for annotating indels with SnpEff or annotating SNVs with SnpEff on-the-fly).

Download the SnpEff v4.3t main package and save the zip file to $WGSA_DIR/snpeff:

mkdir $WGSA_DIR/snpeff
cd $WGSA_DIR/snpeff
wget http://sourceforge.net/projects/snpeff/files/snpEff_v4_3t_core.zip
unzip snpEff_v4_3t_core.zip

To use a newer version of the Java SDK, you have to log in to a computing node.

Start an interactive session:

idev -c 1 -t 4

It will take more than one hour. See here for more about 'idev'.

Once you are on a computing node, make sure your current directory is $WGSA_DIR/snpeff.

Download RefSeq and Ensembl gene models for SnpEff:

module load java-openjdk/1.8.0
cd snpEff
java -jar snpEff.jar download -v hg19
java -jar snpEff.jar download -v GRCh37.75
java -jar snpEff.jar download -v hg38
java -jar snpEff.jar download -v GRCh38.86
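
Since the downloads above are long, it can be useful to confirm that all four databases arrived before leaving the computing node. The sketch below assumes SnpEff's default layout, in which each database is stored as data/<genome>/snpEffectPredictor.bin under the snpEff directory. That layout is an assumption here, and the function name `check_snpeff_dbs` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report which of the four databases are present under
# <snpEff dir>/data. The data/<genome>/snpEffectPredictor.bin layout is an
# assumption about SnpEff's default data.dir; adjust if your config differs.
check_snpeff_dbs() {
    snpeff_dir="$1"
    missing=0
    for genome in hg19 GRCh37.75 hg38 GRCh38.86; do
        if [ -f "$snpeff_dir/data/$genome/snpEffectPredictor.bin" ]; then
            echo "ok       $genome"
        else
            echo "MISSING  $genome"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every database was found.
    [ "$missing" -eq 0 ]
}

# The helper returns non-zero if anything is missing, so report it here
# instead of aborting the shell session.
check_snpeff_dbs "${WGSA_DIR:-.}/snpeff/snpEff" || echo "Some databases still need downloading."
```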

Exit from the computing node:

exit

Install htslib, which is required for the VEP API:

mkdir $WGSA_DIR/htslib
cd $WGSA_DIR/htslib
wget https://github.com/samtools/htslib/releases/download/1.9/htslib-1.9.tar.bz2
tar -vxjf htslib-1.9.tar.bz2
cd htslib-1.9
make prefix=$WGSA_DIR/htslib install

Set up the environment variables:

export PATH=$WGSA_DIR/htslib/bin:$PATH
export CPATH=$WGSA_DIR/htslib/include:$CPATH
export LD_LIBRARY_PATH=$WGSA_DIR/htslib/lib:$LD_LIBRARY_PATH
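
If CPATH or LD_LIBRARY_PATH is not set beforehand, the exports above leave a trailing ':' in the value, and an empty path element is treated as the current working directory. A slightly more defensive variant is sketched below. The default value for WGSA_DIR is only an assumption for illustration.

```shell
#!/bin/sh
# Assumed default for illustration; set WGSA_DIR as in the first step.
WGSA_DIR="${WGSA_DIR:-/lustre/project/group/WGSA}"

# ${VAR:+:$VAR} expands to ":<old value>" only when VAR is set and non-empty,
# so no stray ':' is left behind when the variable was previously unset.
export PATH="$WGSA_DIR/htslib/bin${PATH:+:$PATH}"
export CPATH="$WGSA_DIR/htslib/include${CPATH:+:$CPATH}"
export LD_LIBRARY_PATH="$WGSA_DIR/htslib/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

echo "CPATH=$CPATH"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

These exports last only for the current shell session, so they have to be repeated in any new session that runs the VEP installation or the pipeline.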

Install VEP (required for annotating indels with VEP or annotating SNVs with VEP on-the-fly)

Download the VEP 94 main package and save it to $WGSA_DIR/vep:

mkdir $WGSA_DIR/vep
cd $WGSA_DIR/vep
wget https://github.com/Ensembl/ensembl-vep/archive/release/94.zip
unzip 94.zip

Install the VEP API to $WGSA_DIR/vep and download the RefSeq and Ensembl gene models to $WGSA_DIR/.vep:

cd $WGSA_DIR/vep/ensembl-vep-release-94/
mkdir $WGSA_DIR/.vep
export DEST_DIR=$WGSA_DIR
export PERL5LIB=$WGSA_DIR
perl INSTALL.pl -c $WGSA_DIR/.vep --ASSEMBLY GRCh37

Go through the steps of the installation process, following the guidance at http://useast.ensembl.org/info/docs/tools/vep/script/vep_tutorial.html. When asked for the cache files, choose "242 : homo_sapiens_merged_vep_94_GRCh37.tar.gz". When asked for FASTA files, choose "27 : homo_sapiens". When asked for the plugins, choose "7:LOF". Downloading the FASTA files is required for the current version of WGSA.

*This takes a very long time…

perl INSTALL.pl -c $WGSA_DIR/.vep --ASSEMBLY GRCh38

When asked for the cache files, choose "243 : homo_sapiens_merged_vep_94_GRCh38.tar.gz". When asked for FASTA files, choose "54: homo_sapiens". When asked for the plugins, choose "n", as LOF has already been installed.

*This takes a very long time…

Change the permissions for these directories:

chmod 777 $WGSA_DIR/.vep/Plugins
chmod 777 $WGSA_DIR/.vep/homo_sapiens/94_GRCh37
chmod 777 $WGSA_DIR/.vep/homo_sapiens/94_GRCh38
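
To confirm that the permission changes above took effect, the mode of each directory can be listed in one pass. The sketch below is only illustrative: the function name `show_vep_dir_modes` is hypothetical, and the directory names are taken from the chmod commands above. After `chmod 777`, each existing directory should show `drwxrwxrwx`.

```shell
#!/bin/sh
# Hypothetical helper: print the mode string of each directory that the
# chmod commands above target, flagging any that do not exist yet.
show_vep_dir_modes() {
    vep_root="$1"
    for sub in Plugins homo_sapiens/94_GRCh37 homo_sapiens/94_GRCh38; do
        if [ -d "$vep_root/$sub" ]; then
            # The first 10 characters of `ls -ld` are the file type and
            # permission bits, e.g. drwxrwxrwx.
            echo "$(ls -ld "$vep_root/$sub" | cut -c1-10)  $sub"
        else
            echo "(missing)   $sub"
        fi
    done
}

show_vep_dir_modes "${WGSA_DIR:-.}/.vep"
```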

Install the LOFTEE LOF plugin for the VEP API:

cd $WGSA_DIR/.vep/Plugins
wget https://github.com/konradjk/loftee/archive/v0.1.1-beta.zip
unzip -j v0.1.1-beta.zip
rm v0.1.1-beta.zip

Download the pipeline programs and other resources:

cd $WGSA_DIR
wget http://web.corral.tacc.utexas.edu/WGSAdownload/WGSA085.class
mkdir $WGSA_DIR/resources
cd $WGSA_DIR/resources
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/javaclass/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/hg19/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/hg38/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*" 
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/precomputed/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/SpliceAI/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/GRASP/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/human_ancestor_GRCh37_e71/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/Neandertal/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/GWAS_catalog/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/GenoCanyon/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/clinvar/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
wget http://web.corral.tacc.utexas.edu/WGSAdownload/resources/GeneHancer/ --recursive --continue --timestamping --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
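
The twelve wget commands above share the same options and differ only in the resource name, so they can be driven from a single list. The sketch below is meant to be run from $WGSA_DIR/resources. The `DRY_RUN` switch and the function name `fetch_resource` are hypothetical additions for this example; the URL, resource names, and wget options are copied from the commands above.

```shell
#!/bin/sh
# Base URL and resource names are copied from the commands above.
BASE_URL="http://web.corral.tacc.utexas.edu/WGSAdownload/resources"
RESOURCES="javaclass hg19 hg38 precomputed SpliceAI GRASP human_ancestor_GRCh37_e71 Neandertal GWAS_catalog GenoCanyon clinvar GeneHancer"

# DRY_RUN is a hypothetical switch for this sketch: by default the commands
# are only printed for review; run with DRY_RUN=0 to perform the downloads.
DRY_RUN="${DRY_RUN:-1}"

fetch_resource() {
    # Rebuild the argument list as the full wget command for resource $1.
    set -- wget "$BASE_URL/$1/" --recursive --continue --timestamping \
        --no-host-directories --cut-dirs=2 --no-parent --reject="index.html*"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

for name in $RESOURCES; do
    fetch_resource "$name"
done
```

Because the original commands already pass --continue and --timestamping, rerunning the loop after an interrupted transfer should pick up where it stopped rather than starting over.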