3.10.7. Count Alleles
This algorithm counts the number of alternate alleles in the genotype field across all of the samples.
Requirements
Requires a genotype (GT) sample-level field.
Options
Sample Grouping: Optionally takes a categorical sample-level field and counts the alleles separately for each category. You can add such fields during the import process or use a default field such as Affection Status.
Remove No-Calls Genotypes: By default, the # Alleles field includes no-call genotypes such as ./., so it is generally twice the number of samples. If you select this option, no-calls are excluded from # Alleles and the computed Allele Frequencies change to match. This may make sense in multi-sample calling pipelines, but beware: an allele frequency can be high simply because the variant appears in only a few samples and every other sample was considered a no-call.
Output Sample Names: When selected, a new Sample Names field is created that lists the names of the samples containing a variant genotype when the number of samples with this condition passes the specified threshold.
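The effect of the Remove No-Calls Genotypes option on allele frequency can be illustrated with a minimal Python sketch. This is not the product's implementation: the function name `allele_frequency`, its `remove_no_calls` parameter, and the rule that any genotype containing `.` is a no-call are assumptions made for illustration, and the site is assumed to have a single alternate allele.

```python
def allele_frequency(genotypes, remove_no_calls=False):
    """Return (alt_count, n_alleles, frequency) for one bi-allelic site.

    Hypothetical helper: genotypes are VCF-style GT strings such as "0/1".
    """
    alt_count = 0
    n_alleles = 0
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles:
            # Assumption: a genotype with any missing allele is a no-call.
            if not remove_no_calls:
                n_alleles += 2  # no-calls are assumed to be bi-allelic
            continue
        n_alleles += len(alleles)
        alt_count += sum(1 for a in alleles if a != "0")
    frequency = alt_count / n_alleles if n_alleles else 0.0
    return alt_count, n_alleles, frequency


# Two heterozygous samples and eight no-calls.
genotypes = ["0/1", "0/1"] + ["./."] * 8

default_result = allele_frequency(genotypes)
filtered_result = allele_frequency(genotypes, remove_no_calls=True)

print(default_result)   # (2, 20, 0.1)
print(filtered_result)  # (2, 4, 0.5)
```

With the no-calls removed, the same two alternate alleles yield a frequency of 0.5 instead of 0.1, which is the inflation the option description warns about.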
Output
Allele Counts: Counts of each alternate allele for each site across all samples. In most cases, there is only a single alternate and so the count is the number of observations of this allele across all chromosomes of the samples.
For example, a homozygous variant for a sample gets a count of 2, while a heterozygous genotype gets a count of 1.
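Allele Frequencies: The Allele Counts divided by the total number of alleles (# Alleles). By default, missing genotypes are assumed to be bi-allelic, so each one adds 2 to the total.
# Alleles: Total number of alleles across all samples. By default this includes no-call genotypes; with Remove No-Calls Genotypes selected, only alleles in called genotypes are counted.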
# Hets: Count of the number of heterozygous genotypes across all samples.
# HomoVar: Count of the number of homozygous (or hemizygous) non-reference called genotypes across all samples.
# Samples: Count of the number of samples that have one or more variant alleles.
Sample Names: (Optional) The names of the samples containing a variant genotype (not reference or missing).
Homozygous Sample Names: (Optional) The names of the samples containing a homozygous variant genotype.
Heterozygous Sample Names: (Optional) The names of the samples containing a heterozygous variant genotype.
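The output fields above can be sketched for a single site as follows. This is an illustrative approximation under stated assumptions, not the product's code: `summarize_site` is a hypothetical helper, genotypes are VCF-style GT strings, any genotype containing `.` is treated as missing, and the default behavior (missing genotypes add 2 to # Alleles) is used.

```python
from collections import Counter


def summarize_site(genotypes):
    """Summarize one site from a dict of sample name -> GT string.

    Hypothetical helper; dictionary keys mirror the output field names.
    """
    alt_counts = Counter()
    n_alleles = 0
    variant_names, hom_names, het_names = [], [], []

    for name, gt in genotypes.items():
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles:
            n_alleles += 2  # missing genotype assumed bi-allelic (default)
            continue
        n_alleles += len(alleles)
        alts = [a for a in alleles if a != "0"]
        alt_counts.update(alts)
        if not alts:
            continue  # reference genotype
        variant_names.append(name)
        if len(set(alleles)) == 1:
            hom_names.append(name)  # homozygous or hemizygous non-reference
        else:
            het_names.append(name)

    return {
        "Allele Counts": dict(alt_counts),
        "Allele Frequencies": {a: c / n_alleles for a, c in alt_counts.items()},
        "# Alleles": n_alleles,
        "# Hets": len(het_names),
        "# HomoVar": len(hom_names),
        "# Samples": len(variant_names),
        "Sample Names": variant_names,
        "Homozygous Sample Names": hom_names,
        "Heterozygous Sample Names": het_names,
    }


example = {"S1": "1/1", "S2": "0/1", "S3": "0/0", "S4": "./."}
summary = summarize_site(example)
print(summary["Allele Counts"])       # {'1': 3}
print(summary["# Alleles"])           # 8
print(summary["Allele Frequencies"])  # {'1': 0.375}
```

In this example the homozygous sample S1 contributes 2 to the allele count and the heterozygous sample S2 contributes 1, while the missing genotype of S4 still adds 2 to # Alleles.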