Annotate Overlapping CNVs

This algorithm annotates CNVs against the CNV regions in the selected annotation source by identifying which source regions each CNV overlaps. The minimum required similarity is configurable, and the algorithm can optionally require that the CNV types match.

Requirements

An annotation source to use for annotating the regions or variants.

Options

The user may specify the following options:

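  • Minimum similarity coefficient: The minimum similarity coefficient (size of the intersection divided by the size of the union) that a region in the annotation source must reach to count as a match for a CNV.
  • Match CNV Type: When enabled, a region matches only if its CNV type is the same as that of the CNV being annotated.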
  • Field of CNV Type: The name of the CNV type field to match on.
  • Include counts of matching CNVs: This option will include counts for each CNV type.
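
Taken together, the first three options can be read as a filter over candidate regions from the annotation source. The sketch below is illustrative only and is not the tool's actual implementation: the function name `select_matches`, the record keys `similarity` and `cnv_type`, the default threshold of 0.5, and the choice to treat the threshold as inclusive are all assumptions made for this example.

```python
def select_matches(query_type, candidates, min_similarity=0.5,
                   match_type=False, type_field="cnv_type"):
    """Keep candidates that reach the similarity threshold and,
    if match_type is set, share the query CNV's type.

    Each candidate is a dict with a precomputed "similarity" value
    (hypothetical layout, chosen for this sketch).
    """
    matches = []
    for cand in candidates:
        # Assumption: a candidate exactly at the threshold still matches.
        if cand["similarity"] < min_similarity:
            continue
        # Only compare types when the "Match CNV Type" option is on.
        if match_type and cand.get(type_field) != query_type:
            continue
        matches.append(cand)
    return matches


candidates = [
    {"similarity": 0.8, "cnv_type": "Gain"},
    {"similarity": 0.6, "cnv_type": "Loss"},
    {"similarity": 0.2, "cnv_type": "Gain"},
]
print(len(select_matches("Gain", candidates)))                   # 2
print(len(select_matches("Gain", candidates, match_type=True)))  # 1
```

With type matching off, both regions above the threshold are kept; with it on, only the region whose type equals the query's type remains.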

Output

The output includes columns for the selected source. If multiple features match the CNV, the results are joined together in a list for each field. If a CNV has no overlapping feature, the fields are filled with missing values.
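
The joining behaviour can be pictured with a small sketch. This is an assumption-laden illustration rather than the tool's real output code: the function name `annotate_row`, the use of Python lists for joined values, and `None` as the missing value are all hypothetical.

```python
def annotate_row(matches, fields):
    """Build the annotation columns for one CNV.

    matches: list of dicts, one per matching feature in the source.
    fields:  names of the source columns to carry into the output.
    """
    row = {}
    for field in fields:
        if matches:
            # One list entry per matching feature, in match order,
            # so values at the same index belong to the same feature.
            row[field] = [m.get(field) for m in matches]
        else:
            # No overlapping feature: the field is left missing.
            row[field] = None
    return row


two_hits = [{"id": "a", "freq": 0.1}, {"id": "b", "freq": 0.2}]
print(annotate_row(two_hits, ["id", "freq"]))  # {'id': ['a', 'b'], 'freq': [0.1, 0.2]}
print(annotate_row([], ["id", "freq"]))        # {'id': None, 'freq': None}
```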

The following additional fields are also included in the output:

  • # Matched: The number of events overlapping this region.
  • # Gains: The number of duplications overlapping this region.
  • # Losses: The number of deletions overlapping this region.
  • # Matches Type: The number of events overlapping this region with matching CNV type.
  • Region: The genomic position of the region.
  • Span: The similarity of the two overlapping regions, defined as the size of the intersection divided by the size of the union.
  • Similarity Coefficient: The similarity of the two overlapping regions, measured as the Jaccard index: the size of the intersection divided by the size of the union.
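
As a worked example of the similarity coefficient, the following minimal sketch computes the Jaccard index for two regions. It assumes both regions lie on the same chromosome and are given as half-open intervals [start, end); the function name `jaccard` and the coordinate convention are assumptions for illustration, not part of the tool.

```python
def jaccard(start_a, end_a, start_b, end_b):
    """Jaccard index of two half-open intervals [start, end)."""
    # Overlap length, clamped to zero when the regions are disjoint.
    intersection = max(0, min(end_a, end_b) - max(start_a, start_b))
    # Union length = sum of both lengths minus the shared part.
    union = (end_a - start_a) + (end_b - start_b) - intersection
    return intersection / union if union > 0 else 0.0


# Regions 100-200 and 150-250 share 50 bases out of a 150-base union.
print(jaccard(100, 200, 150, 250))  # 50 / 150, about 0.333
```

Identical regions score 1.0 and disjoint regions score 0.0, so a higher minimum similarity coefficient demands a closer agreement in both position and size.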