Two approaches are available for computing linkage disequilibrium
(LD), depending upon the method used for imputing the two-marker
haplotype frequencies upon which the LD computations depend:
expectation-maximization (EM) or the composite haplotype method (CHM).
3.6.2. Computing LD using the Composite Haplotype Method (CHM)
3.6.3. The D-Prime Statistic
If the minor allele frequencies of the respective markers are small,
the magnitude of the $D$ statistic cannot get very large,
even if the markers are in almost complete linkage disequilibrium,
compared to the magnitude it could have had if the allele frequencies
of the markers were almost equal.
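
As a rough illustration of this ceiling, the sketch below evaluates the standard upper bound on a positive $D$ for two bi-allelic markers, min(p(1 - q), (1 - p)q), where p and q are the frequencies of one allele at each marker. The function name `max_positive_d` and the example frequencies are hypothetical and chosen only for illustration.

```python
def max_positive_d(p, q):
    """Largest positive D attainable for allele frequencies p and q.

    Assumes the standard bound min(p * (1 - q), (1 - p) * q).
    """
    return min(p * (1 - q), (1 - p) * q)


# Both minor alleles rare: D can be at most about 0.0475.
rare = max_positive_d(0.05, 0.05)

# Allele frequencies equal at 0.5: D can reach 0.25.
common = max_positive_d(0.5, 0.5)

print(rare, common)
```

With rare alleles the largest attainable $D$ is roughly a fifth of what it is when the alleles are equally frequent, which is why the raw magnitude is hard to compare across marker pairs.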
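The D-prime statistic was designed to compensate for
this. $D'$ is defined as $D$ normalized by
the maximum possible value that $D$ could have
given the allele frequencies in each of the markers.
Writing $p_i$ and $q_j$ for the frequencies of allele $i$ of the first
marker and allele $j$ of the second marker,
$$D'_{ij} = \frac{D_{ij}}{\min\left(p_i q_j,\ (1 - p_i)(1 - q_j)\right)}$$
if $D_{ij} < 0$, and
$$D'_{ij} = \frac{D_{ij}}{\min\left(p_i (1 - q_j),\ (1 - p_i) q_j\right)}$$
otherwise.
The overall D-prime statistic is defined as
$$D' = \sum_i \sum_j p_i q_j \left| D'_{ij} \right|$$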
For EM, the above formula is used directly on the values of
$p_i$, $q_j$, and $D_{ij}$, where the $D_{ij}$
are obtained from the haplotype frequencies imputed using the technique of
Computing LD using Expectation-Maximization (EM).
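
The EM-based computation can be sketched as follows, assuming the two-marker haplotype frequencies have already been imputed and are supplied as a table. The sketch further assumes the conventional definitions: allele frequencies are the row and column sums of the table, each pairwise disequilibrium is the haplotype frequency minus the product of its two allele frequencies, each pairwise value is divided by its sign-dependent maximum, and the overall statistic is the frequency-weighted sum of the absolute normalized values. The function name `overall_d_prime` and the input layout are hypothetical, not part of any particular software interface.

```python
def overall_d_prime(hap_freq):
    """Overall D' from hap_freq[i][j], the frequency of the haplotype
    carrying allele i at the first marker and allele j at the second.

    Assumes the frequencies are already imputed and sum to 1.
    """
    p = [sum(row) for row in hap_freq]          # allele frequencies, marker 1
    q = [sum(col) for col in zip(*hap_freq)]    # allele frequencies, marker 2
    total = 0.0
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            d_ij = hap_freq[i][j] - p_i * q_j   # pairwise disequilibrium
            if d_ij < 0:
                d_max = min(p_i * q_j, (1 - p_i) * (1 - q_j))
            else:
                d_max = min(p_i * (1 - q_j), (1 - p_i) * q_j)
            if d_max > 0:                       # skip alleles with zero frequency
                total += p_i * q_j * abs(d_ij) / d_max
    return total


# Two alleles per marker, only two haplotypes observed: complete LD.
complete = overall_d_prime([[0.3, 0.0], [0.0, 0.7]])

# Haplotype frequencies equal to products of allele frequencies: no LD.
independent = overall_d_prime([[0.06, 0.14], [0.24, 0.56]])

print(complete, independent)
```

The first table gives a value of 1 up to floating-point rounding even though the allele frequencies are unequal, and the second gives a value of 0 up to rounding.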
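For multi-allelic CHM, we use
$$D'_{ij} = \frac{\Delta_{ij}}{\min\left(p_i q_j,\ (1 - p_i)(1 - q_j)\right)}$$
if $\Delta_{ij} < 0$, and
$$D'_{ij} = \frac{\Delta_{ij}}{\min\left(p_i (1 - q_j),\ (1 - p_i) q_j\right)}$$
otherwise, where $\Delta_{ij}$ denotes the CHM disequilibrium estimate for
allele $i$ of the first marker and allele $j$ of the second marker,
with the overall D-prime statistic being defined as
$$D' = \sum_i \sum_j p_i q_j \left| D'_{ij} \right|$$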
For bi-allelic CHM, we use the same formulas as for multi-allelic CHM,
except that for the final $D'$, we take the original overall $D'$
obtained as above and apply a Hardy-Weinberg correction to it,
where the quantities entering the correction
are defined as in Bi-Allelic.