Frequently Asked Questions

How can I filter my variants using a list of gene names?

This filtering can be done in several ways within a VarSeq project. For example, say you have the following list of ACMG genes that you want to use to filter the variants in your project:

BRCA1,BRCA2,TP53,STK11,MLH1,MSH2,MSH6,PMS2,APC,MUTYH,VHL,MEN1,RET,PTEN,RB1,
SDHD,SDHAF2,SDHC,SDHB,TSC1,TSC2,WT1,NF2,COL3A1,FBN1,TGFBR1,TGFBR2,SMAD3,
ACTA2,MYLK,MYH11,MYBPC3,MYH7,TNNT2,TNNI3,TPM1,MYL3,ACTC1,PRKAG2,GLA,MYL2,
LMNA,RYR2,PKP2,DSP,DSC2,TMEM43,DSG2,KCNQ1,KCNH2,SCN5A,LDLR,APOB,PCSK9,RYR1,
CACNA1S

First, annotate your data against a gene source that uses the same gene naming convention as the list above. To do this, go to Add > Annotation... and select the RefSeq Genes 105v2, NCBI source from your Local folder.

Then, from your open project, go to Add > Computed Data... and under Gene > Project/Cohort choose Match Gene List.

Select Match Gene List Algorithm

On the second dialog, select the gene name field from your gene annotation results.

Select Gene Name Field from Annotation Source

Now give the new field an informative name (e.g. In ACMG Genes?) and paste in the above gene list. Then click OK.

Add in Gene List

The result is a True/False column in your variant table. Right-click this column and select Create Filter Card for this Column.

Create Filter Card for these results
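The matching step above can be pictured with a short Python sketch. This is only an illustration of the idea of a True/False gene-list column, not VarSeq's implementation; the function names, the variant records, and the shortened gene list are all hypothetical. It assumes exact, case-sensitive matching, which is why the annotation source must use the same naming convention as the list.

```python
def parse_gene_list(text):
    """Split a comma- and newline-separated gene list into a set of symbols."""
    pieces = text.replace("\n", ",").split(",")
    return {name.strip() for name in pieces if name.strip()}


def in_gene_list(variant_genes, gene_list):
    """Return True if any gene annotated on the variant is in the list."""
    return any(gene in gene_list for gene in variant_genes)


# Hypothetical data: a shortened gene list and three annotated variants.
acmg_genes = parse_gene_list("BRCA1,BRCA2,TP53,\nSTK11,MLH1")
variants = [
    {"id": "var1", "genes": ["BRCA1"]},
    {"id": "var2", "genes": ["TTN"]},
    {"id": "var3", "genes": ["NBR2", "BRCA1"]},
]

# One True/False flag per variant, analogous to the computed column.
flags = {v["id"]: in_gene_list(v["genes"], acmg_genes) for v in variants}
print(flags)
```

A variant that overlaps several genes is flagged True if any one of them is in the list, which is an assumption of this sketch rather than documented VarSeq behavior.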

Another option for this type of workflow is to run the sample-specific version of this tool. See Match Gene List (Per Sample) for further details.

How can I compute coverage statistics for my sample BAM files?

To run coverage statistics on your BAM files, you must pair each BAM file with its sample during the initial import; see Associate Sample BAM Files for further information. Each BAM file should be unique to its sample and have a corresponding index file (BAI) in the same directory.

You will also need a region file that defines the areas where coverage will be computed. A BED file or interval source may be used to define the regions. If you use a BED file, it must also have an index (TBI) in the same directory.
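Before starting, it can help to confirm that every BAM and BED file has its index next to it. The following Python sketch does that check outside of VarSeq. The index naming patterns it looks for (sample.bam.bai or sample.bai for BAM files, and the data file name plus .tbi for BED files) are assumptions based on common conventions, and the file paths are hypothetical.

```python
from pathlib import Path

# Assumed mapping from data file extension to index extension.
INDEX_SUFFIX = {".bam": ".bai", ".bed": ".tbi", ".gz": ".tbi"}


def expected_index_paths(data_file):
    """List the index file names this sketch accepts for a data file."""
    data_file = Path(data_file)
    suffix = INDEX_SUFFIX.get(data_file.suffix.lower())
    if suffix is None:
        return []
    # Index named by appending to the full file name, e.g. sample.bam.bai
    paths = [data_file.with_name(data_file.name + suffix)]
    if data_file.suffix.lower() == ".bam":
        # Also accept the index named after the stem, e.g. sample.bai
        paths.append(data_file.with_suffix(suffix))
    return paths


def has_index(data_file):
    """Return True if any accepted index file exists beside the data file."""
    return any(p.is_file() for p in expected_index_paths(data_file))


# Hypothetical paths; replace with your own files.
for path in ["data/sample1.bam", "data/targets.bed.gz"]:
    status = "indexed" if has_index(path) else "index missing"
    print(f"{path}: {status}")
```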

If you need to compute an index for your BAM or BED files, go to Tools > Manage Data Sources and use the Browse button to navigate to the directory where your files are saved. Once the files are visible in the dialog, right-click each one and select Computations on Source, then check Genomic Index and click OK on the following dialog.

Compute Index for BAM files

Once indexing of your files is complete, go to Add > Computed Data... and under the Sample options select the Targeted Region Coverage algorithm.

Select Targeted Region Coverage Statistics Algorithm

On the next dialog, select the BED or interval track that defines the regions of interest and optionally specify an additional depth threshold.

Select Region Track

Results are produced in two forms. The first is the per-region output provided in the Coverage Regions table.

Coverage Regions Statistics Output

The second is the overall sample statistics provided in the Samples table.

Sample Coverage Statistics Output
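To make the two result forms concrete, here is a minimal Python sketch of how per-region and overall sample coverage summaries relate to each other and to a depth threshold. It is not the Targeted Region Coverage algorithm itself: the statistic names, the region labels, and the per-base depths are hypothetical, and the columns VarSeq actually reports may differ.

```python
def region_stats(depths, threshold):
    """Summarize per-base read depths for one region (must be non-empty)."""
    covered = sum(1 for d in depths if d >= threshold)
    return {
        "bases": len(depths),
        "mean_depth": sum(depths) / len(depths),
        "min_depth": min(depths),
        # Percentage of bases at or above the chosen depth threshold.
        "pct_at_threshold": 100.0 * covered / len(depths),
    }


def sample_stats(regions, threshold):
    """Pool every base across all regions into one sample-level summary."""
    all_depths = [d for depths in regions.values() for d in depths]
    return region_stats(all_depths, threshold)


# Hypothetical per-base depths for two small target regions.
regions = {
    "chr1:100-104": [30, 32, 28, 10, 0],
    "chr2:500-503": [50, 55, 60, 45],
}

per_region = {name: region_stats(d, threshold=20) for name, d in regions.items()}
overall = sample_stats(regions, threshold=20)

for name, stats in per_region.items():
    print(name, stats)
print("sample", overall)
```

The sample-level summary here pools bases rather than averaging the per-region means, so longer regions carry more weight; that is a design choice of the sketch.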

How can I prioritize my variants based on a known phenotype?

VarSeq implements a gene ranking algorithm, PhoRank, modeled on the Phevor algorithm.

The PhoRank algorithm will rank the genes that overlap the variants in your data based on their proximity to user-specified phenotypes.
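As a rough intuition for ranking by proximity, the toy Python sketch below ranks genes by how few links separate them from a phenotype term in a small graph. It is explicitly not the PhoRank or Phevor algorithm, whose scoring is not described here; the graph, the node names, and the use of breadth-first distance are all assumptions made for illustration.

```python
from collections import deque


def distances_from(graph, sources):
    """Breadth-first search: fewest edges from any source to each node."""
    dist = {s: 0 for s in sources}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def rank_genes(graph, phenotypes, genes):
    """Order reachable genes by distance to the phenotypes, then by name."""
    dist = distances_from(graph, phenotypes)
    reachable = [g for g in genes if g in dist]
    return sorted(reachable, key=lambda g: (dist[g], g))


# Hypothetical graph linking phenotype terms (P*) and genes (GENE_*).
graph = {
    "P1": ["GENE_A", "P2"],
    "P2": ["P1", "GENE_B"],
    "GENE_A": ["P1"],
    "GENE_B": ["P2"],
    "P3": ["GENE_C"],
    "GENE_C": ["P3"],
}

ranking = rank_genes(graph, ["P1"], ["GENE_B", "GENE_A", "GENE_C"])
print(ranking)
```

In this toy graph GENE_C is not connected to the chosen phenotype, so the sketch leaves it out of the ranking entirely.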

To run this algorithm, you must first annotate your variants against a gene source. To do this, go to Add > Annotation... and select the gene source from your Local annotation folder.

Then go to Add > Computed Data... to select PhoRank. There are two options for running this tool. If you have a single list of phenotypes that applies to every sample in your dataset, select Gene > Project/Cohort > Variant PhoRank Gene Ranking.

Variant PhoRank Gene Ranking

If you have different phenotypes for each sample, select the Gene > Per Sample > Sample PhoRank Gene Ranking option.

Sample PhoRank Gene Ranking

For this example, we will run the per-variant version of PhoRank. After selecting the version that fits your needs, you will be prompted to select the gene name field from your gene annotation source.

Select Gene Name Field

On the next dialog, enter your phenotype(s) of interest and click OK.

Enter in phenotype(s) to be used for ranking.

The results will be available in the Variants by Gene table as well as the Variants table. They can be sorted by either Gene Rank or Gene Score, and filter cards can be created from these values from the right-click menu.

Results of the PhoRank Algorithm