Binned Region Coverage

Sample-level coverage statistics provide basic coverage information over fixed-width bins, computed from each sample’s BAM file. Total and strand-specific coverage are computed from the quality-filtered pileup depth for each region. Aggregate statistics are then computed for each sample across all of the defined regions to provide a high-level overview of the sample’s coverage.

Note

This algorithm is designed for the consistent coverage profile of WGS data. It is generally used as the input to the CNV Caller on Binned Regions.

Requirements

BAM File

Each sample must be paired with a BAM file during the initial data import (see Associate Sample BAM Files). Each BAM file should be unique to the sample and have a corresponding index file (.bai) in the same directory.
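A quick check that an index file sits next to a BAM file before import can be sketched in Python. The function name find_bam_index is hypothetical and not part of the software, and the sketch assumes the index follows one of the two common naming conventions (sample.bam.bai or sample.bai).

```python
from pathlib import Path


def find_bam_index(bam_path):
    """Return the path of an index file adjacent to the BAM, or None.

    Hypothetical helper: checks the two common naming conventions,
    sample.bam.bai and sample.bai, in the BAM file's own directory.
    """
    bam = Path(bam_path)
    candidates = [
        bam.with_name(bam.name + ".bai"),  # e.g. sample.bam.bai
        bam.with_suffix(".bai"),           # e.g. sample.bai
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

A return value of None suggests the BAM file still needs to be indexed before it is associated with a sample.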

Options

  • Bin Size: Defines the size in base pairs of the equally spaced regions over which coverage will be computed. Each bin will generate its own record in the final output.
  • Additional Depth Threshold: The percentage of bases in each region is computed by default for depths of 1x, 20x, 100x, and 500x. An additional depth may be specified to augment these fields. The percentage of bases at this depth will be computed for each region.
  • Masked Regions: The masked region file is used to specify regions to be excluded from coverage computation. A BED file or interval source may be used to define the regions; the file must be indexed.
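As an illustration of how the Bin Size and Additional Depth Threshold options shape the output, the following Python sketch builds fixed-width bins and the list of depth thresholds. It is not the software's implementation. The function names are hypothetical, and the coordinate convention (0-based, half-open, with a shorter final bin at the end of a contig) is an assumption.

```python
DEFAULT_THRESHOLDS = (1, 20, 100, 500)  # default depths listed above


def make_bins(contig_length, bin_size):
    """Return (start, stop) bins of width bin_size covering a contig.

    Assumption: 0-based, half-open coordinates; the last bin is
    truncated at the contig end rather than padded.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    return [
        (start, min(start + bin_size, contig_length))
        for start in range(0, contig_length, bin_size)
    ]


def depth_thresholds(additional=None):
    """Return the default thresholds plus an optional additional depth."""
    thresholds = set(DEFAULT_THRESHOLDS)
    if additional is not None:
        thresholds.add(additional)
    return sorted(thresholds)
```

For example, make_bins(2500, 1000) yields three bins, each of which would correspond to one record in the output.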

Sample Coverage Output

The fields from the file used to define the regions will be included in addition to the fields that are computed by the coverage statistics algorithm.

  • Span: The width of the region, computed as the difference between the stop and start positions.
  • Mean Depth: The mean coverage depth for all of the bases in the region.
  • Mean Forward Depth: The mean coverage depth for all of the bases in the region on the Forward Strand.
  • Mean Reverse Depth: The mean coverage depth for all of the bases in the region on the Reverse Strand.
  • Mean Filtered Depth: The mean coverage depth for reads that were filtered out of this region. The reads that are filtered have a poor mapping quality, indicating they may map to multiple regions.
  • Min Depth: The minimum total depth (forward depth + reverse depth) across the region pileup.
  • Min Forward Depth: The minimum forward strand depth across the region pileup.
  • Min Reverse Depth: The minimum reverse strand depth across the region pileup.
  • Max Depth: The maximum total depth (forward depth + reverse depth) across the region pileup.
  • Max Forward Depth: The maximum forward strand depth across the region pileup.
  • Max Reverse Depth: The maximum reverse strand depth across the region pileup.
  • % 1x: The percentage of bases with a coverage depth of at least 1 in the region.
  • % 20x: The percentage of bases with a coverage depth of at least 20 in the region.
  • % 100x: The percentage of bases with a coverage depth of at least 100 in the region.
  • % 500x: The percentage of bases with a coverage depth of at least 500 in the region.
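The per-region fields above can be illustrated with a short Python sketch that takes per-base forward and reverse depths for one region. This is a simplified model, not the software's implementation: the function name and the list-based inputs are assumptions, the inputs are taken to be already quality filtered, and Mean Filtered Depth is omitted because it requires the filtered reads themselves.

```python
def region_stats(forward, reverse, thresholds=(1, 20, 100, 500)):
    """Compute per-region coverage fields from per-base strand depths.

    forward, reverse: equal-length lists with one depth value per base.
    """
    if len(forward) != len(reverse) or not forward:
        raise ValueError("depth lists must be non-empty and equal length")

    # Total depth at each base is forward depth + reverse depth.
    total = [f + r for f, r in zip(forward, reverse)]
    span = len(total)

    stats = {
        "Span": span,
        "Mean Depth": sum(total) / span,
        "Mean Forward Depth": sum(forward) / span,
        "Mean Reverse Depth": sum(reverse) / span,
        "Min Depth": min(total),
        "Min Forward Depth": min(forward),
        "Min Reverse Depth": min(reverse),
        "Max Depth": max(total),
        "Max Forward Depth": max(forward),
        "Max Reverse Depth": max(reverse),
    }
    # Percentage of bases whose total depth is at least each threshold.
    for t in thresholds:
        covered = sum(1 for d in total if d >= t)
        stats[f"% {t}x"] = 100.0 * covered / span
    return stats
```

Passing an extra value in thresholds mirrors the effect of the Additional Depth Threshold option.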

Output of the Coverage Regions Table

The coverage statistics algorithm will generate a ‘Coverage Regions’ table view. This table will include records for all of the regions in the region file.

Searching the Regions Table

The regions table can be searched by right-clicking on a column title and selecting search this column. This allows for the examination of coverage regions that fall above or below user-defined thresholds for the field.

Variants by Region Table

This composite table view includes all of the regions that cover one or more variants from the filtered Variant table. The regions appear in the left-hand table, and the corresponding variants in the right-hand table. The variants that fall within each region can be viewed by changing the row selection in the region table.

Output in the Variant Table

Variants will be matched to any regions they fall within. The values for each of the matching regions will be listed in their respective fields, which are appended to the Variant table.
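Because the bins have a fixed width, the bin containing a variant can be found with integer division. The Python sketch below illustrates the idea only. The function name is hypothetical, and it assumes 0-based positions, bins that start at position 0 of each contig, and no truncation of the final bin.

```python
def bin_for_position(position, bin_size):
    """Return (bin_index, start, stop) for the bin containing position.

    Assumption: 0-based position and half-open bins [start, stop).
    """
    if position < 0 or bin_size <= 0:
        raise ValueError("position must be >= 0 and bin_size > 0")
    index = position // bin_size
    start = index * bin_size
    return index, start, start + bin_size
```

With a bin size of 1000, a variant at position 1500 falls in the second bin, which spans 1000 to 2000.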

Output in the Samples Table

Summary statistic fields are appended to the Samples Table. These fields provide summary information computed across all of the regions.

  • Sample Mean Depth: The average coverage of the sample over all of the regions. The average is weighted by the size of each region to give the average depth over all of the bases that fall within the regions.
  • Sample Mean Forward Depth: The average coverage of the sample over all of the regions on the Forward Strand. The average is weighted by the size of each region to give the average depth over all of the bases that fall within the regions.
  • Sample Mean Reverse Depth: The average coverage of the sample over all of the regions on the Reverse Strand. The average is weighted by the size of each region to give the average depth over all of the bases that fall within the regions.
  • Sample %1x: The percentage of bases in all of the regions with at least 1x coverage.
  • Sample %20x: The percentage of bases in all of the regions with at least 20x coverage.
  • Sample %100x: The percentage of bases in all of the regions with at least 100x coverage.
  • Sample %500x: The percentage of bases in all of the regions with at least 500x coverage.

Note

If an Additional Depth Threshold was specified, a corresponding sample-level field will also be computed.
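The size-weighted averaging described for the sample-level fields can be sketched in Python as follows. The function name and the (span, value) input format are assumptions made for illustration. The same weighting applies to the percentage fields, since weighting each region's percentage by its span gives the percentage of bases across all regions.

```python
def weighted_average(regions):
    """Return the span-weighted average of per-region values.

    regions: list of (span, value) pairs, where value is a per-region
    mean depth or a per-region percentage of bases at some depth.
    """
    total_bases = sum(span for span, _ in regions)
    if total_bases == 0:
        raise ValueError("regions must cover at least one base")
    return sum(span * value for span, value in regions) / total_bases
```

For example, a 1000 bp region with a mean depth of 30 and a 500 bp region with a mean depth of 60 give a sample mean depth of 40, not the unweighted 45.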