3. Perform Regression with ROH Covariates

We will now run a regression analysis using the fractions in the First Column - Cluster of Runs spreadsheet as covariates. First, merge that spreadsheet with the phenotype spreadsheet.

  • Open Phenotype Dataset - Sheet 1 and select File > Join or Merge Spreadsheets.

A Navigator Window Chooser window will appear prompting you to select a spreadsheet to join with.

  • Select the First Column - Cluster of Runs spreadsheet and click OK.
  • Within the Join or Merge Spreadsheets dialogue, enter the New Dataset Name: Pheno + First Column - Cluster of Runs.
  • Choose to create the Spreadsheet as Child of: Current Spreadsheet, leave the remaining options at their defaults and click OK.

Figure 13. Merged Pheno + Common ROH spreadsheet

A new spreadsheet (Figure 13) will be created containing case/control status, population and the common ROH covariates.
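For readers who want to see what the merge amounts to outside the GUI, the following is a minimal sketch in plain Python. It assumes both spreadsheets are keyed by a sample identifier and that only samples present in both are kept (an inner join); the tool's default join behaviour may differ. The sample IDs, column names and values are hypothetical and are not taken from the tutorial data.

```python
# Hypothetical miniature versions of the two spreadsheets, keyed by sample ID.
phenotype = {
    "S1": {"case_control": 1, "population": "A"},
    "S2": {"case_control": 0, "population": "B"},
    "S3": {"case_control": 1, "population": "A"},
}
roh_clusters = {
    "S1": {"cluster_1": 1.0, "cluster_2": 0.0},
    "S2": {"cluster_1": 0.5, "cluster_2": 1.0},
    "S4": {"cluster_1": 0.0, "cluster_2": 0.0},
}


def join_on_sample(left, right):
    """Inner join: keep samples found in both tables and combine their columns."""
    merged = {}
    for sample, left_row in left.items():
        if sample in right:
            merged[sample] = {**left_row, **right[sample]}
    return merged


merged = join_on_sample(phenotype, roh_clusters)
for sample, row in merged.items():
    print(sample, row)
```

In this sketch S3 and S4 are dropped because each appears in only one table, so the merged result has one row per shared sample with the phenotype columns followed by the ROH columns.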

You can now perform association testing using either Numeric Association Analysis or Regression Analysis. This tutorial uses Regression Analysis because it allows you to control for population stratification in a second step.

  • Left-click the Case/Control column header to set the column as the dependent variable (magenta).
Figure 14. Regression Parameters

  • Select Analysis > Numeric Regression Analysis and, under Selection Parameters, choose the radio button Regress once on each of the 24 numeric columns.
  • Next, select the Output Parameters tab and check Output data for P-P/Q-Q plots (Output –log10(P) will be checked automatically). Your window should match Figure 14. Click Run.
Figure 15. P-value Plot

A Regression Results spreadsheet will appear.
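Conceptually, "regress once on each column" means fitting a separate model of the dependent variable against each numeric column and recording one p-value per column. The sketch below illustrates that idea with ordinary least squares from SciPy as a stand-in; the tool may fit a different model (for example, logistic regression for a binary dependent variable), so this is an illustration under stated assumptions rather than a reproduction of its output. The data are simulated and the column names are hypothetical.

```python
import math
import random

from scipy.stats import linregress

random.seed(0)
n_samples = 200

# Hypothetical data: binary case/control status plus two ROH covariate columns.
status = [random.randint(0, 1) for _ in range(n_samples)]
columns = {
    # Simulated to be associated: values tend to be higher in cases.
    "cluster_1": [0.6 * s + random.random() for s in status],
    # Simulated to be unrelated: pure noise.
    "cluster_2": [random.random() for _ in range(n_samples)],
}

# One regression per column: status ~ intercept + slope * column.
results = {}
for name, values in columns.items():
    fit = linregress(values, status)
    results[name] = -math.log10(fit.pvalue)
    print(f"{name}: p = {fit.pvalue:.3g}, -log10(p) = {results[name]:.2f}")
```

Taking –log10 of each p-value turns small p-values into tall peaks, which is why that column is the one plotted against genomic position.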

  • Right-click the –log10 Full-Model P column header (column 3) and select Plot Variable in Genome Browser. The resulting plot (Figure 15) shows association on chromosome 1. As mentioned earlier, population is a possible confounding factor, so we should control for it in the analysis.
  • Close the plot and the regression results spreadsheet.
  • Open the Pheno + First Column - Cluster of Runs spreadsheet and select Analysis > Numeric Regression Analysis.
Figure 16. Full vs. Reduced Model Parameters

  • Under Regress on each of the 24 numeric columns, check Correct for covariate(s). The Reduced Model Covariates section becomes active.
  • Click on Add Covariate and select Population from the list. Click Add and then Close. Verify that your window matches Figure 16 and click Run.
  • Plot the new –log10 P-values as you did in the previous analysis.

The plot (Figure 17) shows that population stratification explains some of the variation in our data. However, even after controlling for this confounding variable, we still see significant association.

Figure 17. P-value Plot Full vs. Reduced Model
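The full versus reduced model comparison can be sketched as follows. One common way to compare nested linear models is an F-test on the drop in residual sum of squares when the column of interest is added to a model that already contains the covariate; whether the tool uses exactly this test is an assumption. The helper names, the 0/1 coding of population and the simulated data are all hypothetical.

```python
import numpy as np
from scipy.stats import f as f_dist


def rss(design, y):
    """Residual sum of squares from an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def nested_f_test_p(y, reduced_cols, extra_col):
    """P-value for adding extra_col to a model with an intercept and reduced_cols."""
    n = len(y)
    reduced = np.column_stack([np.ones(n)] + list(reduced_cols))
    full = np.column_stack([reduced, extra_col])
    rss_reduced, rss_full = rss(reduced, y), rss(full, y)
    df_extra = full.shape[1] - reduced.shape[1]
    df_resid = n - full.shape[1]
    f_stat = ((rss_reduced - rss_full) / df_extra) / (rss_full / df_resid)
    return float(f_dist.sf(f_stat, df_extra, df_resid))


rng = np.random.default_rng(0)
n = 2000
population = rng.integers(0, 2, n).astype(float)  # hypothetical 0/1 coding
causal = rng.random(n)                            # column with a direct effect
confounded = population + rng.normal(0, 0.5, n)   # tracks population only

# Simulated status depends on population and on the causal column.
y = (rng.random(n) < 0.1 + 0.3 * population + 0.4 * causal).astype(float)

p_confounded_raw = nested_f_test_p(y, [], confounded)
p_confounded_adj = nested_f_test_p(y, [population], confounded)
p_causal_adj = nested_f_test_p(y, [population], causal)

print(f"confounded, no covariate:   p = {p_confounded_raw:.3g}")
print(f"confounded, with covariate: p = {p_confounded_adj:.3g}")
print(f"causal, with covariate:     p = {p_causal_adj:.3g}")
```

In this simulation the column that merely tracks population looks strongly associated until population is included in the reduced model, while the column with a direct effect remains associated after the correction. That is the same pattern of reasoning applied to the two plots in this section.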
