Frequently Asked Questions
How can I filter my variants using a list of gene names?
This filtering can be done in several ways within a VarSeq project. For example, suppose you have the following list of ACMG genes that you want to use to filter the variants in your project.
BRCA1,BRCA2,TP53,STK11,MLH1,MSH2,MSH6,PMS2,APC,MUTYH,VHL,MEN1,RET,PTEN,RB1,SDHD,SDHAF2,SDHC,SDHB,TSC1,TSC2,WT1,NF2,COL3A1,FBN1,TGFBR1,TGFBR2,SMAD3,ACTA2,MYLK,MYH11,MYBPC3,MYH7,TNNT2,TNNI3,TPM1,MYL3,ACTC1,PRKAG2,GLA,MYL2,LMNA,RYR2,PKP2,DSP,DSC2,TMEM43,DSG2,KCNQ1,KCNH2,SCN5A,LDLR,APOB,PCSK9,RYR1,CACNA1S
First, annotate your data against a gene source that uses the same gene naming convention as the list above. To do this, go to Add > Annotation... and select the RefSeq Genes 105v2, NCBI source from your Local folder.
Then from your open project go to Add > Computed Data... and under the Gene > Project/Cohort selection choose Match Gene List.
On the second dialog select your gene name field from your gene annotation results.
Now give the new field an informative name (e.g. In ACMG Genes?) and paste in the above gene list. Then click OK.
The result is a True/False column in your variant table. Right-click on this column and select Create Filter Card for this Column to filter on it.
Another option for this type of workflow is to run the sample-specific version of this tool. See Match Gene List (Per Sample) for further details.
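The True/False column described above can be pictured with a short Python sketch. This is only an illustration of the matching idea, not VarSeq's implementation: the function names, the variant records, and the assumption that a gene field may hold several comma-separated names are all hypothetical.

```python
def parse_gene_list(text):
    """Split a comma-separated gene list, tolerating stray whitespace."""
    return {name.strip() for name in text.split(",") if name.strip()}


def in_gene_list(gene_field, gene_set):
    """Return True if any gene name in the annotation field is in gene_set.

    Assumption of this sketch: a field with several overlapping genes
    stores them as a comma-separated string. Matching is exact, which is
    why the annotation source must use the same naming convention as the list.
    """
    if not gene_field:
        return False
    names = {name.strip() for name in gene_field.split(",")}
    return not names.isdisjoint(gene_set)


# A shortened gene list and made-up variant records, for illustration only.
acmg = parse_gene_list("BRCA1,BRCA2,TP53, SDHD")
variants = [
    {"id": "var1", "gene": "BRCA1"},
    {"id": "var2", "gene": "TTN"},
    {"id": "var3", "gene": "SDHD,TIMM8B"},
    {"id": "var4", "gene": ""},
]
flags = {v["id"]: in_gene_list(v["gene"], acmg) for v in variants}
```

Here `flags` plays the role of the computed column: a variant is flagged when at least one of its gene names appears in the pasted list.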
How can I compute coverage statistics for my sample BAM files?
To run coverage statistics on your BAM files, you must pair each BAM file with its sample during the initial import. See Associate Sample BAM Files for further information. Each BAM file should be unique to its sample and have a corresponding index file (BAI) in the same directory.
You will also need a region file that defines the areas where coverage will be computed. Either a BED file or an interval source may be used. If you use a BED file, it must also have an index (TBI) in the same directory.
If you need to compute an index for your BAM or BED files, go to Tools > Manage Data Sources and use the Browse button to navigate to the directory where your files are saved. Once the files are visible in the dialog, right-click on each one and select Computations on Source, then check Genomic Index and click OK on the following dialog.
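Before indexing, it can be useful to see which files are still missing an adjacent index. The following Python sketch checks for one; it is not part of VarSeq. The naming conventions it assumes (`sample.bam.bai` or `sample.bai` for BAM files, and `<file>.tbi` for region files) are common defaults and may differ from your setup.

```python
from pathlib import Path


def find_index(data_file):
    """Return the path of an adjacent index file, or None if none is found."""
    path = Path(data_file)
    if path.suffix == ".bam":
        # Assumed BAI naming conventions: sample.bam.bai or sample.bai
        candidates = [Path(str(path) + ".bai"), path.with_suffix(".bai")]
    else:
        # Assumed TBI naming convention: <file>.tbi next to the region file
        candidates = [Path(str(path) + ".tbi")]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


def missing_indexes(data_files):
    """List the files that have no index in the same directory."""
    return [str(f) for f in data_files if find_index(f) is None]
```

Calling `missing_indexes` with a list of your BAM and BED paths returns the files that still need a Genomic Index computed.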
Once indexing is complete, go to Add > Computed Data... and under the Sample options select the Targeted Region Coverage algorithm.
On the next dialog, select the BED or interval track that defines the regions of interest and optionally specify an additional depth threshold.
Results are produced in two forms. The first is the per-region output provided in the Coverage Regions table.
The second is the overall sample statistics provided in the Samples table.
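To make the two output forms concrete, here is a minimal Python sketch of per-region and per-sample coverage summaries computed from per-base read depths. It is an assumption-laden illustration rather than the Targeted Region Coverage algorithm itself: the statistic names, the default threshold of 20, and the region labels and depth values are all made up for the example.

```python
from statistics import mean


def region_coverage(depths, threshold=20):
    """Summarize per-base read depths for one target region."""
    if not depths:
        raise ValueError("region has no bases")
    at_or_above = sum(1 for d in depths if d >= threshold)
    return {
        "mean_depth": mean(depths),
        "min_depth": min(depths),
        "max_depth": max(depths),
        # Percentage of bases whose depth meets the threshold
        "pct_at_threshold": 100.0 * at_or_above / len(depths),
    }


def sample_coverage(regions, threshold=20):
    """Pool the bases of all regions into one sample-level summary."""
    all_depths = [d for depths in regions.values() for d in depths]
    return region_coverage(all_depths, threshold)


# Hypothetical regions with made-up per-base depths.
regions = {
    "chr17:100-104": [30, 32, 28, 10],
    "chr13:200-202": [5, 15],
}
per_region = {name: region_coverage(d) for name, d in regions.items()}
overall = sample_coverage(regions)
```

In this sketch `per_region` corresponds loosely to the rows of a per-region table and `overall` to a single row of sample-level statistics.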
How can I prioritize my variants based on a known phenotype?
VarSeq implements PhoRank, a gene ranking algorithm modeled on the Phevor algorithm.
The PhoRank algorithm will rank the genes that overlap the variants in your data based on their proximity to user-specified phenotypes.
To run this algorithm, you must first annotate your variants against a gene source. To do this, go to Add > Annotation... and select the gene source from your Local annotation folder.
Then go to Add > Computed Data... to select PhoRank. There are two options for running this tool. If you have a single list of phenotypes that applies to all the samples in your dataset, select the Gene > Project/Cohort > Variant PhoRank Gene Ranking option.
If you have different phenotypes for each sample, select the Gene > Per Sample > Sample PhoRank Gene Ranking option.
For this example we will run the Variant PhoRank Gene Ranking version. After selecting the version that fits your needs, you will be prompted to select the gene name field from your gene annotation source.
On the next dialog enter your phenotype(s) of interest and click OK.
The results will be available in the Variants by Gene table as well as the Variants table. The results can be sorted by either Gene Rank or Gene Score, and filter cards can be created from these values using the right-click menu.