The data visualization techniques outlined below allow you to examine your copy number data further.
Open the plot you created previously. Deselect the p-value plot so that only the heat map is visible.
The copy number variations become more apparent with this color scheme. Since the heat map is sorted by case/control status, color-consistent streaks on either the lower or upper half of the plot could signal an association. Although at first glance there are no obvious differences between cases and controls, there are a couple of regions of interest. Figure 4-1 highlights large CNVs for individual samples in red, and copy number variations common among all the samples in blue.
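The visual check described above (looking for streaks confined to the case half or the control half of the heat map) can be approximated numerically. The following is a minimal sketch, not part of the software used in this tutorial: the function name `group_mean_difference`, the sample IDs, and the toy values are all hypothetical, and it assumes you have exported the per-marker segment means together with each sample's case/control status.

```python
from statistics import mean


def group_mean_difference(segment_means, is_case):
    """Return, per marker, mean(cases) - mean(controls) of the segment means.

    segment_means: dict mapping sample ID -> list of per-marker segment means
    is_case:       dict mapping sample ID -> True (case) or False (control)
    """
    cases = [values for sample, values in segment_means.items() if is_case[sample]]
    controls = [values for sample, values in segment_means.items() if not is_case[sample]]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    n_markers = len(cases[0])
    return [
        mean(row[i] for row in cases) - mean(row[i] for row in controls)
        for i in range(n_markers)
    ]


# Hypothetical toy data: four samples, three markers.
segment_means = {
    "S1": [0.0, 0.5, 0.0],
    "S2": [0.0, 0.5, 0.0],
    "S3": [0.0, 0.0, 0.0],
    "S4": [0.0, 0.0, 0.0],
}
is_case = {"S1": True, "S2": True, "S3": False, "S4": False}

diff = group_mean_difference(segment_means, is_case)
print(diff)  # a large absolute value marks a marker worth a closer look
```

A marker where the difference is far from zero corresponds to a streak that appears in only one half of the sorted heat map. This is only a screening aid; it is not a substitute for a formal association test.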
Let’s look at one of the common CNVs first.
The plot should now look like Figure 4-2, with three common CNVs shown. Notice that not every sample has exactly the same starting and ending boundaries for each region. This is common when using the univariate segmentation method. The consequence is that the output spreadsheet containing the “first column” of each CNV segment will contain a substantial amount of redundant data. Since the same feature (probably a common indel region) is likely being detected in each of the subjects, the overlapping area is probably the most accurate representation of the underlying biology. The multivariate segmentation method determines the most likely endpoints by comparing data across all samples, which results in uniform endpoints for common CNV regions.
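To make the idea of the "overlapping area" concrete, here is a minimal sketch that intersects the per-sample boundaries of one CNV region. It is an illustration only and is not how either segmentation method works internally; the function name `consensus_region` and the positions are hypothetical.

```python
def consensus_region(segments):
    """Return the (start, end) interval shared by every sample's segment.

    segments: list of (start, end) positions for the same CNV region,
              one tuple per sample. Returns None if the list is empty
              or the segments have no common overlap.
    """
    if not segments:
        return None
    start = max(s for s, _ in segments)  # latest start across samples
    end = min(e for _, e in segments)    # earliest end across samples
    return (start, end) if start <= end else None


# Hypothetical boundaries called in three samples for one common CNV.
per_sample = [(1000, 5000), (1200, 5200), (900, 4800)]
print(consensus_region(per_sample))  # (1200, 4800)
```

Taking the intersection is a conservative choice: it keeps only the stretch that every sample supports, at the cost of trimming markers near the edges that some samples include.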
Let’s add the Database of Genomic Variants annotation track to see how this region has been cataloged, and the Affymetrix 500K probe track to see how dense the genotyping was around these areas.
The plot now looks like Figure 4-3 (you can make the Annotation Tracks section larger by clicking and dragging up the separation bar at the bottom of the heat map).
Notice that these regions have been extensively cataloged by the Database of Genomic Variants. You can zoom into each region to explore them further.
Now let’s take a look at the individual samples with larger chromosomal aberrations.
You can now plot the LRs and segmentation covariates together to investigate these samples of interest.
Notice the large gain on the left side of the plot (Figure 4-4).
To see which sample this is, zoom in on the Y-axis and click on the green streak.
Let’s add this sample’s LRs to the plot.
Now let’s plot the segmentation covariates on top of the LRs as before.
The graph should look like Figure 4-5. Notice the shift in the LRs and segmentation covariates for that region.
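The shift visible in the plot can also be quantified. The sketch below assumes the LRs are the per-marker log ratios for a single sample, exported as a plain list in genomic order; the function name `log_ratio_shift`, the marker indices, and the values are hypothetical and are not taken from the tutorial data.

```python
from statistics import mean


def log_ratio_shift(log_ratios, start, end):
    """Mean log ratio inside markers [start, end) minus the mean outside.

    log_ratios: list of per-marker log ratios for one sample
    start, end: marker indices bounding the candidate segment
    """
    inside = log_ratios[start:end]
    outside = log_ratios[:start] + log_ratios[end:]
    if not inside or not outside:
        raise ValueError("segment must be non-empty and leave markers outside it")
    return mean(inside) - mean(outside)


# Hypothetical log ratios: markers 3 to 5 sit inside a gain.
lrs = [0.0, 0.125, -0.125, 0.5, 0.75, 0.25, 0.0, 0.0]
shift = log_ratio_shift(lrs, 3, 6)
print(shift)  # positive for a gain, negative for a loss
```

A clearly positive value agrees with the gain seen in the segmentation covariates, while a value near zero would suggest the segment is not well supported by the underlying LRs.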
Congratulations! You have now worked your way through an entire copy number analysis project, no easy task. We wish you the best of luck on your own study. As always, if you have any questions or need help with any portion of this tutorial or your own analysis, please give us a call. We’d be happy to help!