LoH Caller
This algorithm calls Loss of Heterozygosity (LoH) events based on variant allele frequency. It is modeled on the H3M2 algorithm.
H3M2 uses a heterogeneous Hidden Markov Model (HMM) that incorporates inter-marker distances to identify LoH events from allele frequency data. The model has three hidden states representing the homozygous, non-homozygous, and trisomy states of the genome, while the observations are the variant allele frequency values at each position. The emission probability distributions are defined by two truncated Gaussian mixture models.
In these mixture models, each component has its own weight, a parameter modulates the spread of the distributions, and the observed variable is the allele frequency. The state variable takes one of three values, denoting the homozygous, heterozygous, and trisomy states respectively.
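To make the idea of a truncated Gaussian mixture emission concrete, the following is a minimal Python sketch. It is not the LoH Caller's implementation: the component weights, means, standard deviations, and the `spread` multiplier are all hypothetical values chosen for illustration, and only two mixtures (homozygous and heterozygous) are shown.

```python
import math

def truncated_gaussian_pdf(x, mean, sd, lo=0.0, hi=1.0):
    """Density of a Gaussian(mean, sd) truncated to the interval [lo, hi]."""
    if not lo <= x <= hi:
        return 0.0

    def cdf(v):
        return 0.5 * (1.0 + math.erf((v - mean) / (sd * math.sqrt(2.0))))

    # Renormalize so the density integrates to 1 over [lo, hi].
    norm = cdf(hi) - cdf(lo)
    pdf = math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
    return pdf / norm

def mixture_pdf(vaf, weights, means, sds, spread=1.0):
    """Weighted sum of truncated Gaussians; `spread` scales every component's sd."""
    return sum(
        w * truncated_gaussian_pdf(vaf, m, s * spread)
        for w, m, s in zip(weights, means, sds)
    )

# Hypothetical parameters (assumptions, not values from the LoH Caller):
# homozygous sites cluster near VAF 0 and 1; heterozygous sites add a peak at 0.5.
HOM = dict(weights=[0.5, 0.5], means=[0.0, 1.0], sds=[0.05, 0.05])
HET = dict(weights=[0.25, 0.5, 0.25], means=[0.0, 0.5, 1.0], sds=[0.05, 0.1, 0.05])

for vaf in (0.02, 0.5, 0.98):
    print(vaf, mixture_pdf(vaf, **HOM), mixture_pdf(vaf, **HET))
```

Under these assumed parameters, a VAF near 0.5 is far more likely under the heterozygous mixture, while a VAF near 0 or 1 favors the homozygous mixture. Increasing `spread` widens every component, which makes the model more tolerant of noisy allele frequencies.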
The transition probability for moving from a non-homozygous state to either the homozygous or the trisomy state depends on three quantities: a parameter giving the likelihood of moving out of the non-homozygous state, the genomic distance between adjacent marker positions, and a parameter that modulates the effect of the genomic distance on the transition probabilities.
The probability of moving out of a trisomy or homozygous state is defined similarly, using a separate parameter to specify the likelihood of the state change.
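The sketch below illustrates how distance-dependent transitions can drive state calling. It rests on several assumptions that the text above does not confirm: the exponential form in `change_probability`, the way each row of the transition matrix is split among target states, the uniform initial distribution, and the use of Viterbi decoding. The names `p_from_nonhom`, `p_from_other`, and `d_norm` are hypothetical labels for the two likelihood parameters and the distance-modulating parameter.

```python
import math

STATES = ("homozygous", "non_homozygous", "trisomy")

def change_probability(p, distance, d_norm):
    """Assumed form: near 0 for adjacent markers, approaching p for distant ones."""
    return p * (1.0 - math.exp(-distance / d_norm))

def transition_matrix(distance, p_from_nonhom, p_from_other, d_norm):
    """Rows are source states. Assumes each state moves to either other state
    with equal probability (requires both p values to be at most 0.5)."""
    q1 = change_probability(p_from_nonhom, distance, d_norm)
    q2 = change_probability(p_from_other, distance, d_norm)
    matrix = {}
    for src in STATES:
        q = q1 if src == "non_homozygous" else q2
        matrix[src] = {dst: (1.0 - 2.0 * q if dst == src else q) for dst in STATES}
    return matrix

def safe_log(v):
    # Floor avoids log(0) when a transition or emission probability is zero.
    return math.log(max(v, 1e-300))

def viterbi(positions, emissions, p_from_nonhom, p_from_other, d_norm):
    """Most likely state path; emissions[i][s] is P(observation i | state s)."""
    score = {s: safe_log(1.0 / len(STATES)) + safe_log(emissions[0][s]) for s in STATES}
    back = []
    for i in range(1, len(positions)):
        trans = transition_matrix(
            positions[i] - positions[i - 1], p_from_nonhom, p_from_other, d_norm
        )
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda r: score[r] + safe_log(trans[r][s]))
            pointers[s] = prev
            new_score[s] = score[prev] + safe_log(trans[prev][s]) + safe_log(emissions[i][s])
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

With this assumed form, two markers that sit close together almost never change state between them, so a single noisy allele frequency inside a dense run is unlikely to break the run. Widely spaced markers are allowed to switch state more freely.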
Requirements
The algorithm requires the Variant Allele Frequency (VAF) and Genotype Quality (GQ) sample-level fields.
Output
The LoH Caller algorithm will generate a LoHs table view. This table will include records for all called events.
- Region: Genomic coordinates (Chr: Start-Stop)
- # Samples: Number of samples in the event
- Span: The width of the event, computed as the difference between the stop and start positions.
- LoH Event: True if the LoH is present in the current sample.
- LoH State: The state of the event; either LoH or Trisomy.
- Variants Considered: Number of variants in the LoH event
- Percent in Expected State: Percentage of variants with VAF consistent with the called LoH state.
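As an illustration of how some of these columns relate to the underlying variants, the sketch below derives them for a single called event. The function name `summarize_event`, the expected VAF values for each state, and the tolerance `tol` are assumptions made for this example. The actual criterion the LoH Caller uses to decide whether a VAF is consistent with a state is not specified here.

```python
def summarize_event(chrom, variants, state, tol=0.1):
    """Summarize one called event.

    variants: list of (position, vaf) tuples sorted by position.
    state: "LoH" or "Trisomy".
    tol: hypothetical tolerance around the expected VAF values.
    """
    # Assumed expected VAFs: 0 or 1 for LoH, 1/3 or 2/3 for trisomy.
    expected = {"LoH": (0.0, 1.0), "Trisomy": (1 / 3, 2 / 3)}[state]
    start, stop = variants[0][0], variants[-1][0]
    in_state = sum(
        1 for _, vaf in variants if any(abs(vaf - e) <= tol for e in expected)
    )
    return {
        "Region": f"{chrom}: {start}-{stop}",
        "Span": stop - start,  # difference between stop and start positions
        "LoH State": state,
        "Variants Considered": len(variants),
        "Percent in Expected State": 100.0 * in_state / len(variants),
    }

event = summarize_event(
    "1", [(100, 0.0), (250, 0.98), (400, 0.5), (900, 1.0)], "LoH"
)
print(event)
```

In this example, three of the four variants have a VAF within the assumed tolerance of 0 or 1, so the event spans 800 bases with 75% of its variants in the expected state.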