Welcome to the CNV PCA Search Tutorial!


Updated: March 1st, 2014

Level: Advanced

Packages: CNV Analysis, Power Seat

For both microarray and aCGH data, significant bias can be introduced by batch effects (plate, machine, and site variation), genomic waves, and population stratification. Other sources of variation include sample extraction procedures, cell types, temperature fluctuation, and even ambient ozone levels in a lab. These can lead to complications ranging from poorly defined copy number segments to false and non-replicable findings. The principal component analysis methods in SVS 8 let you correct for these sources of variation simultaneously while significantly improving signal-to-noise ratios. But the question is, “How many principal components should you correct for to remove the batch effects without also removing the signal?”

This tutorial walks you through a combined approach, using both PCA and association analysis techniques, to determine the optimal number of principal components to correct for in copy number data.
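To make "correcting for k principal components" concrete, here is a minimal sketch in Python with NumPy. It is not the SVS implementation. It assumes a samples-by-markers matrix of log ratios with no missing values, and the function name `correct_for_pcs` is hypothetical. The sketch centers each marker, subtracts the part of the data explained by the top k components, and adds the marker means back. Restoring the means is a choice made for this illustration.

```python
import numpy as np

def correct_for_pcs(log_ratios, k):
    """Remove the top-k principal components from a samples x markers matrix.

    Illustrative only: the name and the mean-restoring step are assumptions,
    not a description of how SVS performs its correction.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    X = np.asarray(log_ratios, dtype=float)
    col_means = X.mean(axis=0)
    centered = X - col_means
    # SVD of the centered data; the rows of Vt are the principal axes.
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    # Portion of the data explained by the first k components.
    top = (U[:, :k] * s[:k]) @ Vt[:k, :]
    # Subtract that portion and restore the per-marker means.
    return centered - top + col_means
```

Calling `correct_for_pcs(X, k)` for increasing values of `k` gives a series of corrected matrices. Too small a `k` leaves batch effects in place, and too large a `k` starts to remove real copy number signal, which is the trade-off this tutorial explores.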


To complete this tutorial you will need:

  • A marker-mapped spreadsheet containing a case/control affection status column and log ratio data. If the phenotype of interest is quantitative, you will need to transform it into a dichotomous trait by applying a cut-off value.
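If your phenotype is quantitative, the cut-off transformation can be sketched as follows. This is a plain Python illustration, not an SVS function. The name `dichotomize`, the 1/0 coding for case/control, and the choice to treat values strictly above the cut-off as cases are all assumptions, so adjust them to suit your trait.

```python
def dichotomize(values, cutoff):
    """Map a quantitative phenotype to case (1) / control (0) status.

    Values strictly above the cut-off become cases; missing values (None)
    stay missing. Both conventions are assumptions for this sketch.
    """
    status = []
    for value in values:
        if value is None:
            status.append(None)
        else:
            status.append(1 if value > cutoff else 0)
    return status
```

For example, `dichotomize([1.0, 5.5, None, 3.0], 3.0)` labels only the second sample as a case and leaves the missing value missing.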


The dataset used in this tutorial is the phenotype and log ratio data provided in the CNV Quality Assurance tutorial.

We hope you enjoy the experience and look forward to your feedback.