Predict Phenotypes From Existing Results

Overview

This feature predicts phenotypes from existing Allele Substitution Effect (ASE) values and fixed effect coefficients, combined with the genotype and fixed effect data in the spreadsheet being predicted.

If the initial spreadsheet is in genotypic format, it will be numerically recoded to ensure that the major/minor alleles are the same as those used in GBLUP, Bayes, or K-Fold.

Note

This method uses (with a genotypic spreadsheet) or assumes (with a numerically recoded spreadsheet) an additive genetic model.

Predict Phenotypes From Existing Results Dialog Window

Options

  • Computation Method(s): The following methods are available to represent genotype values:

    • As is: Genotype values will be coded as 0, 1, or 2 (additive model). This is how Bayes C/C\pi treats them.
    • Centered: Genotype values will be coded under the additive model and will then have the mean subtracted. This is how GBLUP treats them.
  • Impute Missing Genotypic Data As: Missing genotypic data can be imputed by either of the following methods:

    • Homozygous major allele: All missing genotypic data will be recoded to 0.

    • Numerically as average value: All missing genotypic data will be recoded to the average of all non-missing genotype calls (using the additive model).

      Note

      If Correct for Gender (see below) is also selected, and there is non-missing data for both males and females in a given marker, averages for males and females will be computed and used separately.

  • Correct For Gender: Assumes the column is coded as if the male were heterozygous for the X-Chromosome allele in question. For the GBLUP and Bayesian implementations, please see Correcting for Gender and Gender Correction.

    Note

    This option is only available if there is a marker map and it contains at least one column in a chromosome that is listed in the assembly file as an allosome. The drop-down list will contain only chromosomes that are both allosomes and present in this spreadsheet.

  • Transformed Data: If the data were standardized before the calculations, as is done in Bayes C/C\pi, the mean and standard deviation can be entered here, and the resulting predicted phenotypes will be transformed using these values. Please see Standardizing Phenotype Values.

  • Correct for Additional Covariates: Allows additional fixed effects to be added to this model from this spreadsheet. Fixed effect covariates can be binary, integer, real-valued, categorical, or genotypic. In all cases, if a marker is chosen as an additional fixed effect, it will not be included in the analysis in any other way. To begin, check this option, then click on Add Columns to get a choice of spreadsheet columns to use.

  • Model Values: Select the spreadsheets containing the ASE and fixed effect coefficient values. The fixed effect spreadsheet should, but is not required to, have a “Reference Covariate?” column to ensure that the reference factors in any categorical covariates match between the fixed effect spreadsheet and the spreadsheet the prediction is being performed on.
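
The genotype coding and imputation options above can be illustrated with a minimal Python sketch. It is not the software's implementation. The function names impute_missing and center, the "M"/"F" sex labels, the choice to impute before centering, and the fallback to the overall average when only one sex has data are all assumptions made for illustration. Missing calls are represented as None, and each column holds additive codes (0, 1, 2) for one marker.

```python
from statistics import mean


def impute_missing(column, method="average", sexes=None):
    """Fill None entries in one marker column of additive codes (0, 1, 2).

    method="major"   -> missing calls become 0 (homozygous major allele).
    method="average" -> missing calls become the mean of the observed calls;
                        if `sexes` is given and both sexes have observed
                        calls, each sex uses its own mean.
    """
    if method not in ("major", "average"):
        raise ValueError(f"unknown imputation method: {method!r}")

    observed = [g for g in column if g is not None]
    labels = sexes if sexes is not None else [None] * len(column)

    default = 0.0  # used for "major", or when nothing was observed
    per_sex = {}
    if method == "average" and observed:
        default = float(mean(observed))
        by_sex = {}
        for g, s in zip(column, labels):
            if g is not None and s is not None:
                by_sex.setdefault(s, []).append(g)
        if len(by_sex) >= 2:  # assumption: two sex labels, both with data
            per_sex = {s: float(mean(v)) for s, v in by_sex.items()}

    return [
        per_sex.get(s, default) if g is None else float(g)
        for g, s in zip(column, labels)
    ]


def center(column):
    """Subtract the column mean from every value (the "Centered" option)."""
    m = mean(column)
    return [g - m for g in column]


calls = [0, 2, None, 1, 1]
as_is = impute_missing(calls, "average")  # "As is" coding after imputation
centered = center(as_is)                  # "Centered" coding
```

Here the single missing call is replaced by the average of the four observed calls, and centering then shifts the column so that its mean is zero.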

Output

A single spreadsheet will be created:

Predicted Phenotypes: Contains the predicted phenotypes for all samples from the original spreadsheet.

Model

We can predict phenotype values with the following model:

\hat y = X \hat \beta + M \hat \alpha

where \hat y are the predicted phenotypes, X is the fixed effects matrix, \hat \beta are the fixed effect coefficients, M is the genotype matrix, and \hat \alpha are the ASE values. Please see the Bayes Problem Statement and the GBLUP Problem Statement for more information on the model.
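
As a rough illustration of the model above, the following Python sketch computes \hat y = X \hat \beta + M \hat \alpha row by row. It is a sketch under stated assumptions rather than the software's code: the function name predict_phenotypes and its parameters are hypothetical, the first column of X is assumed to be an intercept, and the optional back-transformation (mean + standard deviation × prediction) is an assumed reading of the Transformed Data option.

```python
def predict_phenotypes(X, beta, M, alpha, pheno_mean=None, pheno_sd=None):
    """Return y_hat = X * beta + M * alpha for each sample (row).

    X     : list of rows of fixed effect values (assumed to include an intercept)
    beta  : fixed effect coefficients, one per column of X
    M     : list of rows of numeric genotype codes, one column per marker
    alpha : ASE values, one per marker
    pheno_mean, pheno_sd : if both are given, predictions are mapped back
        from the standardized scale (assumption: y = mean + sd * y_std).
    """
    if len(X) != len(M):
        raise ValueError("X and M must have the same number of samples")

    y_hat = []
    for x_row, m_row in zip(X, M):
        fixed = sum(x * b for x, b in zip(x_row, beta))      # X * beta
        genetic = sum(m * a for m, a in zip(m_row, alpha))   # M * alpha
        y_hat.append(fixed + genetic)

    if pheno_mean is not None and pheno_sd is not None:
        y_hat = [pheno_mean + pheno_sd * y for y in y_hat]
    return y_hat


# Two samples, an intercept plus one covariate, and three markers.
X = [[1, 0], [1, 1]]
beta = [2.0, 0.5]
M = [[0, 1, 2], [2, 0, 1]]
alpha = [0.1, -0.2, 0.3]

predicted = predict_phenotypes(X, beta, M, alpha)
rescaled = predict_phenotypes(X, beta, M, alpha, pheno_mean=10.0, pheno_sd=2.0)
```

For the first sample the fixed effect part is 2.0 and the genetic part is 0.4, giving a predicted value of about 2.4; the second sample gives about 3.0.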