3.8.3. VarSeq CNV Reference Manager

Reference Location

The reference sample manager looks for reference samples stored in the Reference Samples folder. The Reference Samples folder can be set by going to Tools -> Options and changing the CNV References Folder. This folder will be used by default when adding the CNV algorithm to a VarSeq project.

Exploring Current Reference Folder

Open the Manage Reference Samples dialog by going to Tools -> Manage Reference Samples…. This dialog displays all of the reference samples in the current Reference Samples folder. Each reference sample has the following attributes:

  • Sample Name - The name of the sample. If the reference file was created with the Manage Reference Samples dialog, this value is read from the input BAM file. If the file was created by VarSeq, the sample name is the one used when the variants were imported.

  • Panel Name - The name of the BED or TSF file used when computing target coverage. When coverage is computed on whole genomes using the binned coverage method, this is Binned Coverage Statistics together with the number of bins.

  • Panel Hash - Used to match sample coverage to the corresponding reference samples. It is computed from the chromosome, start, and stop of the regions spanned by the panel. Reference samples with the same panel hash belong to the same reference set.

  • Target Count - The number of target regions in the panel or binned genome.
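
The role of the panel hash can be illustrated with a short sketch. The manual does not state how VarSeq actually computes the hash, so the use of SHA-256, the tab-separated encoding, and the function names `panel_hash` and `group_by_panel` below are assumptions made purely for illustration. The sketch only shows the idea that samples whose panels cover identical (chromosome, start, stop) regions end up in the same reference set.

```python
import hashlib
from collections import defaultdict


def panel_hash(regions):
    """Return a hex digest for a panel given (chrom, start, stop) tuples.

    Hypothetical scheme: this is not VarSeq's actual hash function.
    """
    digest = hashlib.sha256()
    # Sort so the hash does not depend on the order the regions are listed in.
    for chrom, start, stop in sorted(regions):
        digest.update(f"{chrom}\t{start}\t{stop}\n".encode("utf-8"))
    return digest.hexdigest()


def group_by_panel(samples):
    """Group sample names by the hash of their panel regions."""
    groups = defaultdict(list)
    for name, regions in samples.items():
        groups[panel_hash(regions)].append(name)
    return dict(groups)


# Made-up example panels and sample names.
panel_a = [("chr1", 100, 200), ("chr2", 500, 900)]
panel_b = [("chr1", 100, 250)]
samples = {
    "ref1": panel_a,
    "ref2": list(reversed(panel_a)),  # same regions, different order
    "ref3": panel_b,
}
reference_sets = group_by_panel(samples)
```

Here `ref1` and `ref2` fall into one reference set because their regions are identical, while `ref3` forms its own set because one stop coordinate differs.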

Adding Reference Samples

Click the Add References button in the Manage Reference Samples dialog to add reference samples to the current Reference Samples folder.

Selecting Sample BAMs

The first step in adding reference samples is selecting the BAM files that will be used to compute the coverage statistics for the samples. Select the BAMs by using the Add buttons or by dragging the files into the dialog.

The BAM files for the references should go through the same prep and secondary analysis as the samples of interest. Inconsistencies between the samples of interest and the reference samples can lead to unpredictable results. Once all of the BAMs for the reference samples have been selected, click Next.

Selecting Panel Type

Select Target References to compute coverage across a set of target regions; you must then select an interval track that defines the target regions. Alternatively, select Binned References if you are computing whole genome coverage. With Binned References, you have the option to select a blacklist interval track, which masks out regions of the genome that you do not want to include in your analysis.
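
As a rough illustration of binned references with a blacklist, the sketch below splits each chromosome into fixed-width bins and drops every bin that overlaps a blacklisted interval. The bin size, the half-open coordinate convention, the choice to discard a whole bin rather than trim it, and the helper names `make_bins`, `overlaps`, and `mask_bins` are all assumptions for this example. The manual does not describe how VarSeq applies the blacklist internally.

```python
def make_bins(chrom_lengths, bin_size):
    """Split each chromosome into half-open (chrom, start, stop) bins."""
    bins = []
    for chrom, length in chrom_lengths.items():
        for start in range(0, length, bin_size):
            # The last bin is clipped to the chromosome end.
            bins.append((chrom, start, min(start + bin_size, length)))
    return bins


def overlaps(a, b):
    """True if two half-open intervals on the same chromosome overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def mask_bins(bins, blacklist):
    """Keep only bins that do not overlap any blacklisted interval."""
    return [b for b in bins if not any(overlaps(b, x) for x in blacklist)]


# Made-up chromosome length and blacklist interval.
chrom_lengths = {"chr1": 1000}
bins = make_bins(chrom_lengths, 250)
kept = mask_bins(bins, [("chr1", 300, 320)])
```

With these inputs, four bins are created and the bin spanning 250 to 500 is removed because it overlaps the blacklisted interval.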

Reference Sample Algorithm Selection

Select which coverage algorithm to run.

After you have made your selection, click Create to start the computation.

Computing Coverage

The computation step first computes genomic indexes for the selected BAMs if they do not already exist. Coverage is then computed over the selected regions of the BAM files. Click Close to view the new reference samples in the Manage Reference Samples dialog.
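
If you want to know ahead of time which BAMs will need an index, a check along the following lines can help. It assumes the two common index naming conventions (`sample.bam.bai` and `sample.bai`). How VarSeq itself detects an existing index is not described here, and `find_index` and `bams_missing_index` are hypothetical helper names.

```python
from pathlib import Path


def find_index(bam_path):
    """Return the path of an existing BAM index, or None if none is found."""
    bam = Path(bam_path)
    # Two common conventions: sample.bam.bai and sample.bai.
    candidates = [Path(str(bam) + ".bai"), bam.with_suffix(".bai")]
    for candidate in candidates:
        if candidate.exists():
            return candidate
    return None


def bams_missing_index(bam_paths):
    """Return the BAM paths that have no index file next to them."""
    return [p for p in bam_paths if find_index(p) is None]
```

Calling `bams_missing_index` on the list of selected BAM paths returns the files for which the computation step would have to build an index first.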