3.10.25. Gene List Coverage Statistics

Gene List Coverage Statistics computes sample-level coverage statistics on a gene-by-gene basis. For each gene, coverage is computed using the exons of the clinically relevant transcript as the target regions. Total coverage and strand-based coverage are computed from the quality-filtered pileup depth for each region. Aggregate statistics are then computed for each sample across all of the regions to provide a high-level overview of the sample’s coverage.

Requirements

BAM File

Each sample must be paired with a BAM file during the initial data import (see Associate Sample BAM Files). Each BAM file should be unique to its sample and have a corresponding index file (.bai) in the same directory.

Gene Track

The Gene track is used to look up genes by their user-defined names. The exons of these genes define the regions over which coverage statistics are computed.

Options

  • Exon Padding: The number of bases on each side of the exons to include in the target region.

  • Only Include Coding Exons: If checked, non-coding UTR regions are excluded from each gene's targets.

  • Values to Match: The names of the genes to match. Genes that are not found in the gene track are underlined in red. Genes that are found in the alias field are underlined in blue and are converted to their primary name before the algorithm runs.

  • Include alias field for string matching: If checked, the Alias field of the gene track is searched in addition to the Gene Name field. If the value is found in the Alias field, the corresponding Gene Name is used by the algorithm.

  • Additional Depth Thresholds: The percentage of bases in each region is computed by default for depths of 1x, 20x, 100x, and 500x. An additional depth may be specified to augment these fields. The percentage of bases at this depth will be computed for each region.

  • Count only proper pair flagged (0x02) reads: If checked, only reads flagged as properly paired are counted. These are reads for which both ends were mapped, and mapped within a reasonable distance of each other given the expected distance provided to the alignment software.
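
The Exon Padding, Only Include Coding Exons, and proper pair options can be illustrated with a minimal Python sketch. It is not the product's implementation: the helper names, the half-open (start, stop) coordinates, and the choice to clip to the coding range before padding are all assumptions made for illustration, and overlapping padded exons are not merged here.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed half-open (start, stop) coordinates

PROPER_PAIR = 0x02  # SAM FLAG bit named in the option above


def is_proper_pair(flag: int) -> bool:
    """Return True when the SAM FLAG has the proper-pair bit (0x02) set."""
    return bool(flag & PROPER_PAIR)


def target_regions(
    exons: List[Interval],
    cds_start: int,
    cds_stop: int,
    padding: int = 0,
    coding_only: bool = False,
) -> List[Interval]:
    """Hypothetical helper: turn a transcript's exons into target regions."""
    targets = []
    for start, stop in exons:
        if coding_only:
            # Clip the exon to the coding range; UTR bases fall outside it.
            start, stop = max(start, cds_start), min(stop, cds_stop)
            if start >= stop:
                continue  # the exon is entirely UTR, so it is dropped
        # Assumption: padding is added after clipping, on both sides.
        targets.append((max(0, start - padding), stop + padding))
    return targets


exons = [(100, 200), (300, 400), (500, 600)]
print(target_regions(exons, cds_start=150, cds_stop=350, padding=10, coding_only=True))
```

With these inputs the third exon lies entirely in the UTR and is dropped, while the first two are clipped to the coding range and then widened by 10 bases on each side.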

Sample Coverage Output

The fields from the file used to define the regions will be included in addition to the fields that are computed by the coverage statistics algorithm.

  • Span: The width of the region. Computed from the difference between the stop and start positions.

  • Mean Depth: The mean coverage depth for all of the bases in the region.

  • Mean Forward Depth: The mean coverage depth for all of the bases in the region on the Forward Strand.

  • Mean Reverse Depth: The mean coverage depth for all of the bases in the region on the Reverse Strand.

  • Mean Filtered Depth: The mean coverage depth for reads that were filtered out of this region. Reads are filtered when they have poor mapping quality, which indicates they may map to multiple regions.

  • Min Depth: The minimum total depth (forward depth + reverse depth) across the region pileup.

  • Min Forward Depth: The minimum forward strand depth across the region pileup.

  • Min Reverse Depth: The minimum reverse strand depth across the region pileup.

  • Max Depth: The maximum total depth (forward depth + reverse depth) across the region pileup.

  • Max Forward Depth: The maximum forward strand depth across the region pileup.

  • Max Reverse Depth: The maximum reverse strand depth across the region pileup.

  • % 1x: The percentage of bases with a coverage depth of at least 1 in the region.

  • % 20x: The percentage of bases with a coverage depth of at least 20 in the region.

  • % 100x: The percentage of bases with a coverage depth of at least 100 in the region.

  • % 500x: The percentage of bases with a coverage depth of at least 500 in the region.
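
As a rough illustration of how the per-region fields above relate to one another, the following Python sketch derives them from per-base forward and reverse depths. The function name `region_stats` and the input layout (one depth value per base, already quality filtered) are assumptions for this example, not the software's actual interface.

```python
from typing import Dict, Sequence


def region_stats(
    forward: Sequence[int],
    reverse: Sequence[int],
    thresholds: Sequence[int] = (1, 20, 100, 500),
) -> Dict[str, float]:
    """Hypothetical helper: summarize one region from per-base strand depths."""
    if len(forward) != len(reverse) or len(forward) == 0:
        raise ValueError("forward and reverse must be non-empty and equal length")

    # Total depth at each base is forward depth + reverse depth.
    total = [f + r for f, r in zip(forward, reverse)]
    span = len(total)  # assumed to equal stop - start

    stats: Dict[str, float] = {
        "Span": span,
        "Mean Depth": sum(total) / span,
        "Mean Forward Depth": sum(forward) / span,
        "Mean Reverse Depth": sum(reverse) / span,
        "Min Depth": min(total),
        "Min Forward Depth": min(forward),
        "Min Reverse Depth": min(reverse),
        "Max Depth": max(total),
        "Max Forward Depth": max(forward),
        "Max Reverse Depth": max(reverse),
    }
    # Percentage of bases whose total depth is at least each threshold.
    for t in thresholds:
        stats[f"% {t}x"] = 100.0 * sum(d >= t for d in total) / span
    return stats


stats = region_stats(forward=[10, 20, 0, 30], reverse=[10, 0, 0, 70])
print(stats["Mean Depth"], stats["% 20x"])
```

Passing an extra value in `thresholds` mirrors the Additional Depth Thresholds option: the same percentage calculation is simply repeated for the extra depth.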

Output of the Coverage Regions Table

The coverage statistics algorithm will generate a ‘Coverage Regions’ table view. This table will include records for all of the regions in the region file.

Searching the Regions Table

The regions table can be searched by right-clicking a column title and selecting search this column. This makes it possible to find coverage regions that fall above or below a user-defined threshold for that field.

Variants by Region Table

This composite table view includes all of the regions that cover one or more variants from the filtered Variant table. The regions appear in the left-hand table and the corresponding variants in the right-hand table. To view the variants that fall within a region, select that region's row in the region table.

Output in the Variant Table

Variants will be matched to any regions they fall within. The values for each of the matching regions will be listed in their respective fields which are appended to the Variant table.
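
The matching of variants to regions can be sketched as a simple interval containment test. This is an illustrative Python example only: the tuple layouts, the half-open coordinates, and the gene labels are hypothetical, and a real implementation would likely use an interval index rather than a linear scan.

```python
from typing import Dict, List, Tuple

Region = Tuple[str, int, int, str]  # (chrom, start, stop, label), half-open
Variant = Tuple[str, int]           # (chrom, position)


def regions_for_variant(variant: Variant, regions: List[Region]) -> List[Region]:
    """Return every region that contains the variant's position."""
    chrom, pos = variant
    return [r for r in regions if r[0] == chrom and r[1] <= pos < r[2]]


def variants_by_region(
    variants: List[Variant], regions: List[Region]
) -> Dict[Region, List[Variant]]:
    """Group variants under each region that covers at least one of them."""
    grouped: Dict[Region, List[Variant]] = {}
    for v in variants:
        for r in regions_for_variant(v, regions):
            grouped.setdefault(r, []).append(v)
    return grouped


# Hypothetical regions and variants for illustration.
regions = [
    ("chr1", 100, 200, "GENE_A"),
    ("chr1", 150, 300, "GENE_B"),
    ("chr2", 100, 200, "GENE_C"),
]
variants = [("chr1", 160), ("chr1", 250), ("chr2", 50)]
print(variants_by_region(variants, regions))
```

A variant that falls inside overlapping regions is listed under each of them, and regions covering no variants are left out of the grouping, which matches the behavior described for the composite view.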

Output in the Samples Table

Summary statistic fields are appended to the Samples Table. These fields provide summary information computed across all of the regions.

  • Sample Mean Depth: The average coverage of the sample over all of the regions. The average is weighted by the size of each region, giving the average depth over all of the bases that fall within the regions.

  • Sample Mean Forward Depth: The average coverage of the sample over all of the regions on the Forward Strand. The average is weighted by the size of each region, giving the average forward strand depth over all of the bases that fall within the regions.

  • Sample Mean Reverse Depth: The average coverage of the sample over all of the regions on the Reverse Strand. The average is weighted by the size of each region, giving the average reverse strand depth over all of the bases that fall within the regions.

  • Sample %1x: The percentage of bases in all of the regions with at least 1x coverage.

  • Sample %20x: The percentage of bases in all of the regions with at least 20x coverage.

  • Sample %100x: The percentage of bases in all of the regions with at least 100x coverage.

  • Sample %500x: The percentage of bases in all of the regions with at least 500x coverage.

Note

If an Additional Depth Threshold was specified, a corresponding sample-level field will also be computed.
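
To show why the sample-level averages are weighted by region size, here is a minimal Python sketch. The function name `weighted_by_span` and the (span, value) input pairs are assumptions for illustration. Applying the same weighting to per-region percentages gives the pooled percentage of bases only if regions do not overlap, which is also assumed here.

```python
from typing import Sequence, Tuple


def weighted_by_span(regions: Sequence[Tuple[int, float]]) -> float:
    """Span-weighted average of a per-region value.

    Each item is (span, value), where value is a per-region field such as
    Mean Depth or % 20x. Weighting by span makes the result equal to the
    value computed over all bases pooled together.
    """
    total_bases = sum(span for span, _ in regions)
    if total_bases == 0:
        raise ValueError("regions must cover at least one base")
    return sum(span * value for span, value in regions) / total_bases


# Two regions: 100 bases at 30x mean depth and 300 bases at 10x mean depth.
print(weighted_by_span([(100, 30.0), (300, 10.0)]))

# The same weighting applied to per-region % 20x values.
print(weighted_by_span([(100, 100.0), (300, 50.0)]))
```

An unweighted average of the two mean depths would give 20, overstating coverage because the smaller region would count as much as the larger one; the weighted figure is 15.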