Sample Statistics

Sample-level statistics are computed for each sample over its called sites. They provide a high-level view of the types of variants found in each sample and can be used to make quality control decisions.

Requirements

The project must have one or more samples imported.

Output to the Samples Table

The selected statistics will be appended to the Samples Table. These fields provide a summary of each sample's variants.

TiTv Ratio

Counts of the two classes of single nucleotide variations: transitions and transversions.

  • Transition Count The number of transitions for each sample.
  • Transversion Count The number of transversions for each sample.
  • TiTv Ratio The ratio of transitions to transversions.
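
As a minimal sketch of how these three fields relate, the following Python classifies each SNV by its reference and alternate base and divides the two counts. The function names and the input format (a list of (ref, alt) base pairs, each a true single-base substitution) are illustrative assumptions and are not taken from the product.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def is_transition(ref, alt):
    """A transition swaps a purine for a purine or a pyrimidine for a pyrimidine."""
    pair = {ref.upper(), alt.upper()}
    return pair <= PURINES or pair <= PYRIMIDINES

def titv(snvs):
    """Return (transition count, transversion count, ratio) for (ref, alt) pairs.

    Assumes every pair is a real SNV (ref differs from alt).
    The ratio is None when there are no transversions.
    """
    ti = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    tv = len(snvs) - ti
    ratio = ti / tv if tv else None
    return ti, tv, ratio
```

For example, the pairs A>G and C>T are transitions, while A>C and G>T are transversions, so that set of four SNVs would give a ratio of 1.0 under this sketch.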

Coding TiTv Ratio

When run after the Annotate Transcripts algorithm, filtering by exon regions can be selected. Counts of the two classes of single nucleotide variations, transitions and transversions, are computed for exonic regions.

  • Transition Count The number of transitions for each sample in exon features.
  • Transversion Count The number of transversions for each sample in exon features.
  • TiTv Ratio The ratio of transitions to transversions in exon features.

Variant Count

  • Var Count The total number of non-missing, non-reference alleles for each sample.
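
Note that this field counts alleles rather than genotypes. A minimal sketch, assuming each genotype is represented as a tuple of allele indices where 0 is the reference allele and None is a missing allele (a hypothetical encoding chosen for illustration):

```python
def var_count(genotypes):
    """Count non-missing, non-reference alleles across one sample's genotypes."""
    return sum(
        1
        for gt in genotypes
        for allele in gt
        if allele is not None and allele != 0
    )
```

Under this encoding, a heterozygous genotype (0, 1) contributes one allele and a homozygous alternate genotype (1, 1) contributes two.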

Coding Variant Count

When run after the Annotate Transcripts algorithm, filtering by exon regions can be selected. Counts are reported only for features in exon regions.

  • Var Count The total number of non-missing, non-reference alleles for each sample in exon features.

Singleton Count

The number of times that a sample has an alternate allele that is not found in any other sample at that site. To be counted, a sample may be either homozygous or heterozygous for the singleton alternate allele.

  • Singleton Count The number of singletons for each sample.
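
The definition above can be sketched as follows. The data layout (one dict per site mapping a sample name to a tuple of allele indices, with 0 for reference and None for missing) and the function name are assumptions made for this example. The sketch counts one singleton per alternate allele carried by exactly one sample, whether that sample is heterozygous or homozygous for it.

```python
from collections import Counter, defaultdict

def singleton_counts(sites):
    """Return a Counter of singleton counts keyed by sample name."""
    counts = Counter()
    for genotypes in sites:
        carriers = defaultdict(set)  # alternate allele index -> samples carrying it
        for sample, alleles in genotypes.items():
            for allele in alleles:
                if allele:  # skips 0 (reference) and None (missing)
                    carriers[allele].add(sample)
        for samples in carriers.values():
            if len(samples) == 1:  # allele seen in exactly one sample
                counts[next(iter(samples))] += 1
    return counts
```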

SNV Count

The number of single nucleotide variants (SNVs) for each sample.

  • SNV Het Count The number of heterozygous SNVs for each sample.
  • SNV Hom Count The number of homozygous SNVs for each sample.
  • SNV Het/Hom Ratio The ratio of heterozygous SNVs to homozygous SNVs.

Note

Hemizygous SNVs, multi-nucleotide polymorphisms (MNPs), and complex variants are not included in this calculation.

Indel Count

The number of insertions and deletions (Indels) for each sample.

  • Indel Het Count The number of heterozygous Indels for each sample.
  • Indel Hom Count The number of homozygous Indels for each sample.
  • Indel Het/Hom Ratio The ratio of heterozygous Indels to homozygous Indels.

Note

Hemizygous Indels are not included in this calculation.

Heterozygous Rate

The number and ratio of heterozygous genotypes for each sample.

  • Het Count The number of heterozygous genotypes for each sample.
  • Het Ratio The Het Count divided by the number of non-reference, non-missing genotypes for the sample.

Homozygous Rate

The number and ratio of homozygous genotypes for each sample.

  • Hom Count The number of homozygous genotypes for each sample.
  • Hom Ratio The Hom Count divided by the number of non-reference, non-missing genotypes for the sample.

Hemizygous Rate

The number and ratio of hemizygous genotypes for each sample.

  • Hemi Count The number of hemizygous genotypes for each sample.
  • Hemi Ratio The Hemi Count divided by the number of non-reference, non-missing genotypes for the sample.

Reference Rate

The number and ratio of reference genotypes for each sample.

  • Ref Count The number of reference genotypes for each sample.
  • Ref Ratio The Ref Count divided by the number of non-missing genotypes for the sample.

Note

It is important to remember that the number of reference calls made for a sample can change for gVCF files depending on the other samples it is imported with. A homozygous reference call is inserted for a given sample at any genomic site where another sample has a variant and the given sample has a covered region.
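
The Heterozygous, Homozygous, Hemizygous, and Reference Rate fields differ mainly in their denominators, which a short sketch may make clearer. The encoding below is an assumption for illustration only: each genotype is a tuple of allele indices (0 for reference, None for missing), a one-element tuple stands for a hemizygous call, "homozygous" is taken to mean homozygous for an alternate allele, and a hemizygous reference call is counted as reference. The product may classify these cases differently.

```python
def classify(gt):
    """Classify one genotype given as a tuple of allele indices."""
    if not gt or any(a is None for a in gt):
        return "missing"
    if all(a == 0 for a in gt):
        return "ref"
    if len(gt) == 1:
        return "hemi"
    return "hom" if len(set(gt)) == 1 else "het"

def genotype_rates(genotypes):
    """Return the count and ratio fields for one sample's genotypes."""
    counts = {"ref": 0, "het": 0, "hom": 0, "hemi": 0, "missing": 0}
    for gt in genotypes:
        counts[classify(gt)] += 1

    # Het, Hom, and Hemi ratios share the non-reference, non-missing denominator.
    non_ref = counts["het"] + counts["hom"] + counts["hemi"]
    # The Ref ratio uses all non-missing genotypes.
    non_missing = non_ref + counts["ref"]

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "Het Count": counts["het"],
        "Het Ratio": ratio(counts["het"], non_ref),
        "Hom Count": counts["hom"],
        "Hom Ratio": ratio(counts["hom"], non_ref),
        "Hemi Count": counts["hemi"],
        "Hemi Ratio": ratio(counts["hemi"], non_ref),
        "Ref Count": counts["ref"],
        "Ref Ratio": ratio(counts["ref"], non_missing),
    }
```

Because the reference calls inserted for gVCF input change only the Ref Count and the non-missing total, the Het, Hom, and Hemi ratios in this sketch would be unaffected by which other samples are imported, while the Ref Ratio would change.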

Call Rate

The ratio of genotypes which are non-missing for each sample to the total number of genomic sites in the project. Hemizygous called genotypes are treated as non-missing.

  • Called Genotypes Number of non-missing genotypes for each sample.
  • Call Rate Ratio of Called Genotypes to total genomic sites in the project.
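
A minimal sketch of the two fields, assuming the same hypothetical encoding of genotypes as tuples of allele indices with None for a missing allele. A one-element tuple (a hemizygous call) counts as called as long as its allele is not missing.

```python
def call_rate(sample_genotypes, total_sites):
    """Return (called genotypes, call rate) for one sample.

    total_sites is the number of genomic sites in the project, which may be
    larger than the number of genotypes supplied for the sample.
    """
    called = sum(
        1
        for gt in sample_genotypes
        if gt and all(allele is not None for allele in gt)
    )
    rate = called / total_sites if total_sites else None
    return called, rate
```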

Gender Inference

By specifying a gender chromosome and a heterozygous threshold, the gender of a sample can be inferred from its heterozygous rate on that chromosome.

  • Gender Chromosome Het Ratio The ratio of the number of heterozygous variants in the specified gender chromosome to the total number of variants in the gender chromosome.
  • Inferred Gender The gender (Female or Male) of each sample.
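
The inference can be sketched as a simple threshold test. The direction of the comparison is an assumption here: the sketch calls a sample Female when the het ratio is above the threshold and Male otherwise, which matches the usual expectation for the X chromosome, but the source does not state the exact rule. The function name and genotype encoding (tuples of allele indices, 0 for reference, None for missing) are also hypothetical.

```python
def gender_from_het_ratio(genotypes, het_threshold):
    """Return (het ratio, inferred gender) from calls on the gender chromosome."""
    # Keep only non-missing genotypes with at least one non-reference allele.
    variants = [
        gt for gt in genotypes
        if gt and None not in gt and any(a != 0 for a in gt)
    ]
    if not variants:
        return None, None  # nothing to infer from
    het = sum(1 for gt in variants if len(set(gt)) > 1)
    ratio = het / len(variants)
    # Assumption: a ratio above the threshold is called Female, otherwise Male.
    return ratio, ("Female" if ratio > het_threshold else "Male")
```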

Variant Type Count

The variant classification for each site at which the sample has a non-reference, non-missing genotype. The classification is completed at the variant site level.

  • SNP Count The number of sites classified as single nucleotide polymorphisms where the sample has a non-missing, non-reference genotype.
  • MNP Count The number of sites classified as multi-nucleotide polymorphisms where the sample has a non-missing, non-reference genotype.
  • Ins Count The number of sites classified as insertions where the sample has a non-missing, non-reference genotype.
  • Del Count The number of sites classified as deletions where the sample has a non-missing, non-reference genotype.
  • DelIns Count The number of sites which have insertion and deletion alleles where the sample has a non-missing, non-reference genotype.
  • Complex Count The number of sites classified as complex where the sample has a non-missing, non-reference genotype.
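
To illustrate what classifying "at the variant site level" means, the sketch below assigns one type to a site from its reference allele and all of its alternate alleles, independent of any one sample's genotype. The rules are simplified assumptions for illustration (left-padded, normalized alleles; any mixture other than insertion plus deletion treated as complex) and should not be read as the product's actual classification logic.

```python
def allele_type(ref, alt):
    """Classify a single alternate allele against the reference allele."""
    if len(ref) == len(alt):
        return "SNP" if len(ref) == 1 else "MNP"
    if len(alt) > len(ref) and alt.startswith(ref):
        return "Ins"
    if len(ref) > len(alt) and ref.startswith(alt):
        return "Del"
    return "Complex"

def site_type(ref, alts):
    """Classify a site from the types of all of its alternate alleles."""
    types = {allele_type(ref, alt) for alt in alts}
    if types == {"Ins", "Del"}:
        return "DelIns"  # site carries both insertion and deletion alleles
    if len(types) == 1:
        return types.pop()
    return "Complex"  # assumption: any other mixture is treated as complex
```

Each per-sample count would then be the number of sites of a given type at which that sample has a non-missing, non-reference genotype.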