2. Calculate Principal Components and Eigenvalues

Now that your data is in the correct orientation, the next step is to calculate as many principal components as possible (the limit is the number of samples minus 1), along with their eigenvalues, using only the autosomes.

A. Run Numeric Principal Component Analysis

  • Open Pheno + LogRs - Sheet 1 and select Quality Assurance > Numeric Principal Component Analysis.
  • Set the parameters in the Numeric Principal Component Analysis window as shown in Figure 2a, where Find up to ____ components is set to your total number of samples minus 1.
  • Make sure Center data by marker is checked and click Run.
Figure 2a. Numeric PC parameters.


Upon completion, two spreadsheets are created: Principal Components (Center by Marker) and Eigenvalues (Center by Marker).
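For readers who want to see roughly what this step computes, the following is a minimal NumPy sketch, not the software's own implementation. It assumes a samples-by-markers matrix of log ratios, that centering by marker means subtracting each marker's mean across samples, and that eigenvalues are scaled by the number of samples minus 1. The function name numeric_pca and its arguments are hypothetical.

```python
import numpy as np


def numeric_pca(log_r, max_components=None):
    """PCA of a samples x markers matrix with each marker (column) centered.

    Hypothetical helper for illustration only. Returns the per-sample
    component scores and the matching eigenvalues, largest first.
    """
    x = np.asarray(log_r, dtype=float)
    n_samples = x.shape[0]

    # After centering each marker, the columns sum to zero across samples,
    # so at most n_samples - 1 components carry any variance.
    limit = n_samples - 1
    k = limit if max_components is None else min(max_components, limit)

    centered = x - x.mean(axis=0)

    # Singular values come back sorted in descending order.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)

    # Assumed scaling: eigenvalues of the sample covariance matrix.
    eigenvalues = s ** 2 / (n_samples - 1)
    components = u[:, :k] * s[:k]
    return components, eigenvalues[:k]
```

Calling numeric_pca(matrix) with no second argument mirrors the setting used above, where the requested number of components equals the number of samples minus 1.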

B. Generate Scree Plot of PC Eigenvalues

  • From the PC Eigenvalues (Center by Marker) spreadsheet, right-click on the Eigenvalue column header and select Plot Variable.

This generates a scree plot of the PC eigenvalues (Figure 2b). There should be an apparent bend in the plot, referred to as the “elbow”.
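Visual inspection is the method used in this tutorial, but a simple numeric check can help confirm where the bend lies. The sketch below uses one common heuristic, chosen here as an assumption rather than taken from the software: rescale both axes to the range 0 to 1 and pick the component that lies farthest below the straight line joining the first and last eigenvalues. The function name find_elbow is hypothetical.

```python
def find_elbow(eigenvalues):
    """Return the 1-based component number at the bend of a scree curve.

    Heuristic (an assumption, not the tutorial's method): after rescaling
    both axes to [0, 1], choose the point farthest below the chord that
    joins the first and last eigenvalues.
    """
    values = [float(v) for v in eigenvalues]
    n = len(values)
    if n < 3:
        raise ValueError("need at least three eigenvalues to locate an elbow")

    top, bottom = max(values), min(values)
    if top == bottom:
        raise ValueError("eigenvalues are all equal; there is no elbow")

    best_component, best_gap = 1, float("-inf")
    for i, value in enumerate(values):
        x = i / (n - 1)                         # component position in [0, 1]
        y = (value - bottom) / (top - bottom)   # eigenvalue in [0, 1]
        gap = 1.0 - x - y                       # vertical gap below the chord
        if gap > best_gap:
            best_component, best_gap = i + 1, gap
    return best_component
```

Treat the result as a starting point for choosing a search range, not as the final number of components; a noisy scree plot can place the heuristic elbow a few components away from the one you would pick by eye.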

Figure 2b. Zoomed scree plot with potential elbow highlighted.


C. Determine Min and Max Number of PCs

Rather than examining every possible number of principal components to find the ideal one, you can usually narrow the search by visual inspection, selecting a range that covers the elbow.

  • Determine the minimum and maximum number of principal components to search. For this example, the minimum is set to 1 and the maximum to 60 to ensure that the optimum number of components is not missed. The minimum could instead have been set closer to the potential elbow region, for example 20.
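The range selection can also be written down as a small helper. This is only a sketch: the function name pc_search_range and the padding arguments are hypothetical, and the elbow position itself still comes from your reading of the scree plot. The helper pads the elbow on both sides and clamps the result between 1 and the number of samples minus 1.

```python
def pc_search_range(elbow, n_samples, pad_below, pad_above):
    """Return (minimum, maximum) numbers of components to search.

    Hypothetical helper: widens the range around an elbow estimate and
    keeps it between 1 and n_samples - 1, the most components available.
    """
    upper_limit = n_samples - 1
    if not 1 <= elbow <= upper_limit:
        raise ValueError("elbow must lie between 1 and n_samples - 1")
    if pad_below < 0 or pad_above < 0:
        raise ValueError("padding values must not be negative")

    minimum = max(1, elbow - pad_below)
    maximum = min(upper_limit, elbow + pad_above)
    return minimum, maximum
```

A generous pad_above costs extra computation but lowers the risk of cutting the range off before the optimum, which is the same trade-off behind the wide range chosen in this example.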