7. Genotype Association Analysis

After assuring the quality of the data, association testing can be performed.

A. Genotype Association Testing

  • Open the Filtered Data for Association Testing spreadsheet and make sure the Phenotype 1 Binary column header is still set as the dependent variable.

  • Choose Analysis > Genotype Association Tests.

  • Make sure the Additive Model: (dd) -> (Dd) -> (DD) radio button is selected and check only Correlation/Trend test and Exact form of Cochran-Armitage test under Test Statistic or Method.

  • Under Multiple Testing Correction, make sure Bonferroni Adjustment is checked and uncheck the other options.

  • Check Output data for P-P/Q-Q Plots and click Run.

  • Upon completion, a new spreadsheet named Association Tests (Additive Model) is created. It displays several association statistics for each SNP (Figure 7-1).

    Figure 7-1. Association test results
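As a rough illustration of what the Bonferroni Adjustment option computes, the sketch below multiplies each raw p-value by the number of tests and caps the result at 1. It assumes one test per marker. The function name and the second and third p-values are hypothetical, and the software's own output may differ in detail.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values: min(1, p * number_of_tests)."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]

# Hypothetical raw p-values for three markers; only the first value
# appears in this tutorial, the other two are made up for illustration.
raw = [2.729e-7, 0.04, 0.5]
adjusted = bonferroni(raw)
for p, p_adj in zip(raw, adjusted):
    print(f"raw p = {p:.3e}  Bonferroni p = {p_adj:.3e}")
```

In a real genome-wide run the multiplier is the number of SNPs tested, so only very small raw p-values remain significant after adjustment.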

Examine the results from the Exact Cochran-Armitage Test when a SNP has a significant p-value but the contingency table of Case Status by Number of Minor Alleles has at least one count less than 5. In that case, the assumptions of the Correlation/Trend test are violated.

In this study, the most significant marker, SNP_A-2070191 (Corr/Trend p-value = 2.729e-7), has the following contingency table:

             dd     Dd     DD    Total
  Case      100    112     19      231
  Control   148     71      5      224
  Total     248    183     24      455

This results in an Exact Armitage P-Value of 2.416e-7. The two p-values differ little because all cell counts are 5 or higher.
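To make the Correlation/Trend calculation concrete, here is a minimal sketch that computes a trend statistic from the contingency table above. It assumes genotype scores of 0, 1, 2 for dd, Dd, DD and a statistic of (N - 1) * r^2 referred to a chi-squared distribution with 1 degree of freedom; the classical Cochran-Armitage form uses N * r^2 instead. The function name is illustrative and is not part of the software. The exact test is not reproduced here.

```python
import math

def trend_test(cases, controls, scores=(0, 1, 2)):
    """Correlation/trend test on a 2 x 3 genotype table.

    Returns (chi_squared, p_value) with 1 degree of freedom.
    Assumption: the statistic is (N - 1) * r^2, where r is the correlation
    between genotype score and case status (case = 1, control = 0).
    """
    n = sum(cases) + sum(controls)
    sum_x = sum(s * (a + b) for s, a, b in zip(scores, cases, controls))
    sum_xx = sum(s * s * (a + b) for s, a, b in zip(scores, cases, controls))
    sum_y = sum(cases)
    sum_xy = sum(s * a for s, a in zip(scores, cases))
    sxx = sum_xx - sum_x ** 2 / n
    syy = sum_y - sum_y ** 2 / n
    sxy = sum_xy - sum_x * sum_y / n
    r_squared = sxy ** 2 / (sxx * syy)
    chi_squared = (n - 1) * r_squared
    # Upper tail of a chi-squared distribution with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_squared / 2))
    return chi_squared, p_value

cases = (100, 112, 19)     # dd, Dd, DD counts for cases
controls = (148, 71, 5)    # dd, Dd, DD counts for controls
chi_squared, p_value = trend_test(cases, controls)
# A True value here would suggest consulting the exact test instead.
small_cells = min(cases + controls) < 5
print(f"X^2 = {chi_squared:.3f}, p = {p_value:.3e}, any count < 5: {small_cells}")
```

Under these assumptions the p-value should land close to the Corr/Trend value reported for this marker, and the small-cell check comes back negative because the smallest count is exactly 5.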

B. Generating Q-Q Plots

Q-Q plots are generated by plotting the expected chi-squared values against the observed chi-squared values.

  • From Association Tests (Additive Model), select Plot > XY Scatter Plots. Two list views will appear.
  • Select Corr/Trend expected X^2 (seventh down) in the left list box and Corr/Trend X^2 (sixth down) in the right list box.
  • Click Plot.
  • Select Graph 1 in the Graph Control Tree.
  • Under the Add Item tab select f(x) = m(x) + b and click Add.

This will generate a straight line with a slope of 1 and y-intercept of 0. You should have a Q-Q plot that looks like Figure 7-2.

Figure 7-2. Q-Q plot of association results

  • To change the weight and color of this line, select its associated graph item in the Graph Control Interface and choose the color and weight you like.
  • When you’ve finished, close the Plot Viewer and rename its associated node in the Project Navigator to Q-Q Plot.

Similarly, you might also plot a P-P plot by using the expected -log10 P on the X axis and -log10 P on the Y.
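The expected values used in these plots can be sketched as follows. This is an illustration only: it assumes the i-th smallest of n observed statistics is paired with the chi-squared (1 df) quantile at probability (i - 0.5) / n, which may not be the plotting position the software uses. The function name and the observed values are hypothetical.

```python
import math
from statistics import NormalDist

def qq_expected(observed_chi2):
    """Pair sorted observed chi-squared values (1 df) with expected values.

    Returns a list of (expected_chi2, observed_chi2, expected_neg_log10_p).
    Assumption: plotting position (i - 0.5) / n for the i-th smallest value.
    """
    n = len(observed_chi2)
    std_normal = NormalDist()
    pairs = []
    for i, obs in enumerate(sorted(observed_chi2), start=1):
        prob = (i - 0.5) / n
        # A chi-squared(1) quantile is the square of a standard normal quantile.
        z = std_normal.inv_cdf((1 + prob) / 2)
        expected_chi2 = z * z
        # The matching expected p-value is the upper-tail probability 1 - prob.
        expected_neg_log10_p = -math.log10(1 - prob)
        pairs.append((expected_chi2, obs, expected_neg_log10_p))
    return pairs

# Hypothetical observed statistics, not taken from the tutorial data.
observed = [0.02, 0.45, 1.10, 3.90, 26.4]
for expected, obs, _ in qq_expected(observed):
    print(f"expected X^2 = {expected:.3f}  observed X^2 = {obs:.3f}")
```

Points that follow the line with slope 1 and intercept 0 behave as expected under the null hypothesis; points rising well above it at the upper end correspond to the strongest associations.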

C. Generating P-Value Plots

  • From the Association Tests (Additive Model) spreadsheet, right-click on the Corr/Trend -log10 P column (2) and select Plot Variable in Genome Browser.

Notice the full-domain view now has chromosome bands and the X-axis is represented by chromosome and physical position (Figure 7-3).

Figure 7-3. P-value plot in genome browser

There are many ways to zoom in the genome browser: double-clicking on a chromosome in the full domain view (upper band), double-clicking a cytoband or gene in the Annotation Tracks pane (lower band), manually selecting a chromosome and/or position in the Graph Attributes tab (at the top above the Full Domain Band), or “rubber band” zooming in the plot view itself.

  • Double-click the 6 in the Full Domain View since this is the location of the most significant p-values.

Zooming displays the karyogram view of chromosome 6. More information about SNPs is available with different annotation tracks.

  • Zoom further into the peak (click and drag on the x-axis) and left-click on the top-most point in the plot (Figure 7-4).

Figure 7-4. Zoomed in around the peak in the p-value plot

This displays the marker name, its p-value, chromosome, and position in the Data Console (bottom-left pane), along with additional links to online resources.

  • You can see which gene(s) this SNP and others in the peak reside in by looking at the Annotation Tracks pane at the bottom.

Figure 7-5. Observing the gene track in the Annotation Tracks pane

Note

You can increase the size of the Annotation Tracks window by dragging the top of the pane up (Figure 7-5 with additional zooming).

D. Creating a Manhattan Plot

Manhattan plots are popular in publications because they color-code markers by chromosome, making it easy to see where significant markers reside.

  • First, right-click anywhere on the plot view and select Reset Zoom.
  • Select the Corr/Trend -log10 P graph item in the Graph Control Interface.
  • Under the Color tab, select the By Variable radio button and click Select variable....
  • Choose Chromosome from the list and click OK.

Figure 7-6. Manhattan plot

This will split the graph into 22 different colors, one for each chromosome (Figure 7-6). You can change the color of each chromosome by selecting its respective node in the Color tab.
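For readers who want to reproduce the idea outside the plot viewer, the sketch below shows one way to lay out Manhattan plot points: chromosomes are placed end to end along the x-axis and each chromosome gets its own color. The function name, the marker positions, and the evenly spaced hue scheme are assumptions made for illustration and do not describe how the software chooses its colors.

```python
import colorsys

def manhattan_layout(markers):
    """Give each marker a genome-wide x position and a per-chromosome color.

    markers: list of (chromosome, position, neg_log10_p) tuples.
    Returns a list of (x, neg_log10_p, rgb_color) tuples.
    Assumption: each chromosome's extent is its largest observed position.
    """
    lengths = {}
    for chrom, pos, _ in markers:
        lengths[chrom] = max(lengths.get(chrom, 0), pos)
    chroms = sorted(lengths)
    offset, color, running = {}, {}, 0
    for i, chrom in enumerate(chroms):
        offset[chrom] = running
        # Evenly spaced hues so every chromosome has a distinct color.
        color[chrom] = colorsys.hsv_to_rgb(i / len(chroms), 0.8, 0.8)
        running += lengths[chrom]
    return [(offset[c] + pos, y, color[c]) for c, pos, y in markers]

# Hypothetical markers: (chromosome, position, -log10 p).
markers = [(1, 1_000, 0.3), (1, 5_000, 1.2), (6, 2_000, 6.5), (6, 3_000, 4.0)]
points = manhattan_layout(markers)
for x, y, rgb in points:
    print(x, y, tuple(round(v, 2) for v in rgb))
```

The resulting tuples can be passed to any scatter-plot routine that accepts per-point RGB colors.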

E. Saving Plots as Images

You can save all displayed plots in the plot view to a number of popular image formats.

  • Select Graph 1 in the Graph Control Tree and uncheck the Legend box in the Graph tab.
  • Choose File > Save as Image.

This will bring up a preview window (Figure 7-7). Here you can manipulate various image parameters.

  • Uncheck the Full domain view option under Graph Options.
  • You can also change the size and margins of the image.
  • Next, Browse to a folder where you want the image saved, give it the name Manhattan Plot and click Save.
  • Click Save again at the bottom of the preview window to save the image.
  • Once the image is saved, close the Plot Viewer and rename its associated node in the Project Navigator to Manhattan Plot.

Figure 7-7. Save as Image preview window

You have now performed a cursory genome-wide association study on a case/control phenotype. For more challenging analyses, try running association tests and regression on the other phenotypes. If you click on the first node in the Project Navigator, SNP_GWAS_Tutorial, you will get more information on what can be found with each phenotype.