# Copy Number Analysis

## Copy Number Analysis Overview

### Copy Number Variation

In a diploid genome, a base pair normally has two copies, one on each homologous chromosome. (A base pair on the X chromosome in men will normally have only one copy.) Even if the two copies carry different alleles, there are still considered to be two copies.

However, under certain circumstances, and especially in the case of certain diseases, a base pair, or even an entire chromosome, may be replicated more than two times, appear just once, or be deleted entirely. The number of copies of a base pair is termed “copy number”, and variation in this number is termed “copy number variation” (CNV).

For both micro-array and array CGH (aCGH) scans, the more copies there are of a base pair, the higher the total intensity will be, irrespective of which alleles may be present, even if the base pair is a polymorphism. Typically, a lot of processing is needed to transform intensity data to a quantile-normalized log base-2 (log2) ratio of intensities of observations versus a reference population. When the intensities of the observations are the same as the reference population median for a given base pair, the log2 ratio will be equal to zero. Amplifications over the reference standard will be significantly greater than zero, and deletions will be significantly less than zero.
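As a minimal sketch of the final step only, the log2 ratio for each marker can be computed as shown here. The function name `log2_ratios` and the input layout are assumptions made for illustration. This is not SVS code, and it omits the quantile normalization and other processing mentioned above.

```python
import math
from statistics import median


def log2_ratios(sample_intensities, reference_intensities):
    """Return log2(sample intensity / reference median) for each marker.

    sample_intensities: one intensity per marker for a single sample.
    reference_intensities: one list of reference-population intensities
    per marker (hypothetical layout, chosen for this example).
    """
    ratios = []
    for value, ref_values in zip(sample_intensities, reference_intensities):
        ref_median = median(ref_values)
        ratios.append(math.log2(value / ref_median))
    return ratios


# Three markers: equal to the reference median, doubled, and halved.
sample = [100.0, 200.0, 50.0]
reference = [[90.0, 100.0, 110.0]] * 3
result = log2_ratios(sample, reference)
print(result)
```

A marker at the reference median maps to zero, a doubled intensity to a positive value, and a halved intensity to a negative value.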

### Copy Number Analysis Package (CNV Analysis)

Golden Helix SVS supports reading micro-array and aCGH log2 ratio data from the
Affymetrix, Agilent, Illumina, and NimbleGen platforms, as well as
processing Affymetrix CEL files to generate log2 ratios, with the object
of determining where CNVs occur. Log2 ratio data from other platforms
may also be prepared and analyzed by Golden Helix SVS by first converting it to the
Affymetrix CNT text file format (see *Affymetrix CNT File Format*). Subsequently,
association analysis can be performed on the log2 ratios directly or on
related covariates over found CNV regions.

An abbreviated workflow for CNV association analysis is as follows:

- Import and/or prepare log2 ratio data.
- Perform quality assurance on log2 ratios.
- Execute CNAM (Copy Number Analysis Method) optimal segmenting on the “cleaned” log2 ratios.
- Create a new covariate spreadsheet consisting of average log2 ratios over found CNV regions for each sample.
- Import a spreadsheet with phenotypic data such as case-control status.
- Join the log2 ratio covariate spreadsheet with the spreadsheet containing phenotypic data.
- Perform association analysis on a phenotype with the covariates from the joined spreadsheet.

## Preparing Log2 Ratio Data

The CNAM (Copy Number Analysis Method) optimal segmenting algorithm uses log2 ratio data as input. Therefore, before you can begin copy number analysis, you must import log2 ratio data into an SVS project.

For the Affymetrix 500k, SNP 5.0, and SNP 6.0 arrays, SVS supports
reading CEL intensity files and calculating normalized log2 ratios for
copy number segmentation and analysis in SVS. For details about
reading CEL intensity files, see *Affymetrix CEL Files*.

For the Affymetrix 10k, 100k, and 500k arrays, you may use the
Affymetrix CNAT Batch Analysis tool to create CNT files; or for the
100k, 500k, and SNP 6.0 arrays use Genotyping Console to create CNCHP
files. These files contain normalized log2 ratios and can be imported
into SVS for analysis. See *Affymetrix Files* for
instructions on creating and importing these files. Once the files have
been parsed to extract the log2 ratio values, the data is ready for copy
number analysis.

For the Illumina platform, you must use GenomeStudio with the
SVS DSF Export Plug-In to export the log2 ratio values from
your GenomeStudio project. For instructions on how to install and use
the plug-in, see *Exporting DSF Data from GenomeStudio using Plugin version 4.0*.

For the Agilent platform, you must use the Agilent import menu item and
select the correct intensity field for importing the log2 ratio data.
See *Agilent Files* for more information.

For the NimbleGen platform, you must use the NimbleGen Data Summary
Files import menu item and select at least one field to import. See
*NimbleGen Data Summary Files* for more information.

You may also create a minimal data file with your own normalized log2
ratio data using the Affymetrix CNT file format. This file must contain
log2 ratio data and marker map data. See *Affymetrix CNT File Format* for details. You
can then import the Affymetrix CNT files to create a spreadsheet of log2
ratio data.

## Using CNAM (Copy Number Analysis Method) Optimal Segmenting

CNAM Optimal Segmenting represents the second step for performing copy
number analysis after importing log2 ratio data into a project and
applying an appropriate genetic marker map. See *Genetic Marker Maps Overview* for more
information. CNAM Optimal Segmenting uses both the genetic marker map
information and the log2 ratios in the spreadsheet to discover regions
of markers in which the log2 ratios vary significantly from segment to
segment. While the genome has numerous regions of copy number variation,
these regions are approximated by the segments found with the CNAM
Optimal Segmenting algorithm. These segments will, with high
probability, be where there are regions of copy number loss, neutral or
gain in the data.

Upon segmenting, at least two new spreadsheets are created in the current SVS project: the segment means spreadsheet and the covariates spreadsheet. The segment means spreadsheet lists every region computed, its beginning and ending marker, and the segment mean log2 ratio value for every sample within that region. A covariate segment is created for every start and end position found across all samples, so each sample has exactly the same number of covariates. The value of each of a sample’s covariates is the segment mean of that sample’s segment containing the covariate’s start and end positions. The covariates spreadsheet can be output in one of two formats: either a column is created for every active marker in the spreadsheet that was segmented, or a column is created for the first marker in every covariate segment. Optionally, a Wiggle file containing the locations of these regions may also be generated.

Options and other fields within the CNAM Optimal Segmenting tool are
described below (see *CNAM Optimal Segmenting Window*).

### Log2 Ratio Spreadsheet

In order to use CNAM optimal segmenting on a spreadsheet, a spreadsheet
must contain log2 ratios and have a genetic marker map applied. From
this spreadsheet, select **Numeric** > **CNAM Optimal Segmenting**.

### Selecting Chromosomes

For large datasets, it is better to segment only one chromosome, or a few chromosomes, at a time. Because CNAM Optimal Segmenting does not segment across chromosomal boundaries, subdividing the segmenting by chromosome will not change the results.

To select a chromosome or a few chromosomes, use the **Select** >
**Activate by Chromosomes** option then select the chromosomes you wish
to segment. It is not necessary to create a subset spreadsheet, as the
segmenting algorithm will only run on active numeric columns.

### Chromosome Segmenting Options

Variations of the CNAM Optimal Segmenting algorithms for obtaining the
regions of CNV are documented in the Formulas and Theories chapter, see
*CNAM Optimal Segmentation Algorithm*. Certain parameters for this algorithm may be
changed within these segmenting options.

#### Algorithm

CNAM offers two types of segmenting methods, univariate and multivariate. These methods are based on the same algorithm, but use different criteria for determining cut-points denoting CNV boundaries.

The multivariate method segments all samples simultaneously, finding general CNV regions which may be similar across all samples. This method is preferable for finding very small CNV regions. For a given sample, the covariate is the mean of the log2 ratios within each segment for that sample. These covariates can then be used for association analysis. This model makes the tenuous assumption that for a given disease, the beginning and end of a CNV region will be similar for subsets of the cases. That is, if the regions are conserved for enough cases it is expected there is sufficient power to find a statistical association. If this assumption holds true, very small CNV regions can be found because the signal will be assessed over multiple samples.

In reality there may not always be consistent CNV regions across
multiple samples. The univariate method segments each sample separately,
finding the cut-points of each segment for each sample individually.
A spreadsheet is then created showing all unique cut-points found among all
samples. That is, the univariate method discovers the optimal segments for each
sample and outputs the mean, for every sample, of every unique segment
found across all samples. This output can be displayed in one of two
formats ready for subsequent association analysis or for plotting
results. The output spreadsheets are discussed in *Outputs from CNAM Optimal Segmenting*.

#### Univariate Outlier Removal

The univariate outlier removal option helps to address the influence of
large negative or large positive values on determining segment
boundaries. It works by excluding found cut-points that bracket
single-marker segments before running permutation tests to determine the
strength of the segment. This option is only valid when the minimum
number of markers per segment is set to 1. If outliers are not removed
and the minimum number of markers per segment is set to a number greater
than one, a single-marker outlier could force adjacent markers to form
a segment that is driven only by that outlier. This would inflate
the number of segments that have the minimum number of markers allowed,
and would place boundaries incorrectly if the number of markers in the
region is actually less than the minimum number of markers allowed
in a segment. If the minimum number of markers in a segment is set to
one with the univariate outlier removal box not checked, single-marker
segments would be found, but they would not be deemed significant
by permutation testing. As a result, the algorithm would settle on fewer
segments, at the expense of the larger, real segments. See
*CNAM Optimal Segmentation Algorithm* for more details on this option.

#### Use Moving Window

If “Moving Window” is selected, the segmenting is performed using a moving window of markers that sweeps across each chromosome. Segmentation is constrained to the window, and then the results from each region are combined to produce the whole-chromosome results. This can greatly reduce the run time of the algorithm for large chromosomes, but may also introduce edge effects. CNAM chooses window boundaries in such a way that edge effects are reduced, but still cannot guarantee globally optimal results when using a moving window. The run log contains details on what window boundaries are chosen by CNAM.

#### Moving window size (markers per window)

The number of consecutive markers analyzed in the moving window. This option is only available if “Use Moving Window” is selected. Smaller moving window sizes speed up the run time of the algorithm, especially when not using hardware acceleration. Note, however, that a moving window carries a somewhat higher risk of false discoveries, because anomalies can arise from examining one window of data at a time instead of the whole chromosome. Permutation testing reduces this risk.

#### Max segments

CNAM uses the following basic approach to segment a chromosome:

1. Identify candidate segments by performing a least-squares fit of segment boundaries to the log2 ratios.

2. Perform permutation testing to discard segments that don’t vary significantly from their neighbors. Continue discarding until only significant segments remain.

This option determines how many candidate segments CNAM starts with for each chromosome. It can be thought of as an upper limit on the number of segments that CNAM will be able to detect. Larger values make it possible to detect more copy number variations, but also slow down CNAM since more candidate segments must be found and permutation tested.
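The first step can be illustrated with a textbook dynamic program that finds the least-squares split of a sequence into exactly `k` segments. This is a sketch of the general technique under stated assumptions, not CNAM's implementation (the Formulas and Theories chapter documents that). The function name `optimal_segmentation` and its parameters are hypothetical.

```python
def optimal_segmentation(values, k, min_markers=1):
    """Split values into exactly k contiguous segments with minimal total
    sum of squared deviations from the segment means.

    Returns (segment start indices, total sum of squares).
    """
    n = len(values)
    if k < 1 or n < k * min_markers:
        raise ValueError("not enough markers for the requested segments")

    # Prefix sums give the squared-error cost of any span in O(1).
    prefix = [0.0]
    prefix_sq = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def cost(i, j):
        """Sum of squared deviations from the mean for values[i:j]."""
        total = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting values[:j] into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m * min_markers, n + 1):
            for i in range((m - 1) * min_markers, j - min_markers + 1):
                if best[m - 1][i] == inf:
                    continue
                candidate = best[m - 1][i] + cost(i, j)
                if candidate < best[m][j]:
                    best[m][j] = candidate
                    back[m][j] = i

    # Walk the back-pointers from the last segment to the first.
    starts = []
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    starts.reverse()
    return starts, best[k][n]


# A neutral region, a gain, and another neutral region.
log2_values = [0.0] * 5 + [1.0] * 5 + [0.0] * 5
starts, total_sse = optimal_segmentation(log2_values, 3)
print(starts, total_sse)
```

This version costs on the order of `k * n**2` operations, which is one way to see why a larger number of candidate segments or a longer chromosome increases run time.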

There are two ways to specify this limit:

1. **Per 10,000 markers in a chromosome** By default CNAM will look for
up to 10 segments per 10,000 markers in a chromosome. This per-10k limit
can be changed to suit your needs. For example, if a sample has 40,000 markers in
Chromosome 1, CNAM would by default search for up to 40 CNVs in that chromosome. All
chromosomes are treated as having at least 10,000 markers when computing
this limit.

2. **Single Constant Value** Alternatively, you can
specify a single limit that will be used for all chromosomes (or windows,
if the moving window option is enabled) regardless of size or number of markers.
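The arithmetic behind these two modes can be sketched as follows. The function name `max_segments` and its parameters are hypothetical, and rounding up for marker counts that are not a multiple of 10,000 is an assumption, since the rounding rule is not stated here.

```python
import math


def max_segments(n_markers, per_10k=10, constant=None):
    """Upper limit on candidate segments for one chromosome.

    constant: if given, the single constant value is used as-is.
    Otherwise the limit scales with the marker count, treating every
    chromosome as having at least 10,000 markers.
    Rounding up is an assumption made for this sketch.
    """
    if constant is not None:
        return constant
    effective_markers = max(n_markers, 10_000)
    return math.ceil(effective_markers * per_10k / 10_000)


print(max_segments(40_000))               # default per-10k limit
print(max_segments(2_500))                # small chromosome, floor applies
print(max_segments(40_000, constant=25))  # single constant value
```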

When performing univariate segmentation, smaller values for “Max segments” are ideal for detecting large rare variants. Smaller values also help keep the run time manageable. Large “Max segments” values can detect common smaller variants, but also suffer from increased false positives due to additional multiple testing. Consider using the multivariate algorithm to detect small, common CNVs.

When performing multivariate segmentation, more segments are usually needed to detect smaller, common CNVs. The multivariate algorithm’s performance also scales better with more segments, so increasing this will have less effect on run-time compared to the univariate algorithm.

#### Min #markers per segment

This constrains the algorithm to only find CNV regions with this minimum number of markers in each segment.

This parameter allows you to prevent finding CNV regions based on short spans of noise. In general, permutation testing should prevent small spurious segments from showing up, so a good default for this parameter is 1 marker, with univariate outlier removal turned on for univariate analysis. For multivariate analysis, a minimum of 1 marker is still a good default. It is important to take into account any outliers in the log2 ratios for a sample. Outliers can still drive the segmentation results even after permutation testing, although their effect is minimized. To remove their effect, use the univariate outlier removal option.

#### Max pairwise segment p-value

The “Max segments” parameter sets an upper bound on the number of segments found. However, the problem remains to determine the actual number of valid CNV regions in the data. The process used is as follows: once a set of $k$ segments is found, each pair of adjacent segments is compared through a permutation testing procedure. If every pair is statistically significant according to the “Max pairwise segment p-value”, then the $k$-way split is retained. Otherwise, the algorithm repeatedly decreases $k$ by one until every segment is significantly different from its neighbors or no segments are found, whichever comes first.

Larger p-values increase sensitivity by rejecting fewer segment pairs, but also increase the false-discovery rate. Conversely, smaller p-values decrease the false-discovery rate but also decrease sensitivity. Smaller p-values also require more permutations to test accurately, and can significantly increase the segmentation running time.

CNAM uses random permutation testing to estimate the p-value for each segment pair. CNAM evaluates a number of random permutations of the log ratios from the segment pair that depends on $p$, where $p$ is this parameter. Each permutation is checked to see if it has a better split (smaller sum of squared deviations from the means) than the original input segments. If the proportion of random permutations that have a better split is greater than $p$, then the pair is rejected as insignificant.
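A permutation test of this kind for one pair of adjacent segments might look like the sketch below. It is an illustration under assumptions, not CNAM's code: the names `pair_is_significant`, `p_max`, and `n_permutations` are hypothetical, the number of permutations is passed in explicitly, and a permutation is counted as having a "better split" when its best cut point at any position gives a smaller sum of squared deviations than the original pair.

```python
import random


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_sse(values, cut):
    """Total squared error when values is cut into two segments at cut."""
    return sse(values[:cut]) + sse(values[cut:])


def best_split_sse(values):
    """Smallest two-segment squared error over all cut positions."""
    return min(split_sse(values, cut) for cut in range(1, len(values)))


def pair_is_significant(left, right, p_max, n_permutations, rng):
    """Return True if the left/right split survives permutation testing."""
    pooled = left + right
    observed = split_sse(pooled, len(left))
    better = 0
    for _ in range(n_permutations):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        if best_split_sse(shuffled) < observed:
            better += 1
    # Reject the pair when too many permutations split better by chance.
    return better / n_permutations <= p_max


rng = random.Random(0)
neutral = [-0.02, 0.01, 0.0, 0.02, -0.01] * 2
gain = [0.98, 1.01, 1.0, 1.02, 0.99] * 2
clear_pair = pair_is_significant(neutral, gain, 0.05, 200, rng)

noise_a = [0.1, -0.1, 0.1, -0.1]
noise_b = [-0.1, 0.1, -0.1, 0.1]
noise_pair = pair_is_significant(noise_a, noise_b, 0.05, 200, rng)
print(clear_pair, noise_pair)
```

The well-separated pair is retained, while the pair drawn from the same noise is rejected because almost any shuffle splits at least as well.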

#### Segment Means Output

These options select which segment means output to generate, see
*Outputs from CNAM Optimal Segmenting* for details.

#### Log Output

Here you can enable the **Full Logging** option. This option outputs
extra messages that more thoroughly detail CNAM’s activity.

### Hardware Options

Several options exist to improve CNAM’s performance on modern computers.

#### Number of CPU Threads

Both the Univariate and Multivariate algorithms can take advantage of multi-processor or multi-core machines by performing some of their work in parallel threads. It is usually a good idea to match this number to the number of computational cores you have available on your system. The number of cores detected will be displayed to the right of this option.

This option only affects the number of threads on the system CPU, not on any hardware-accelerated devices (such as GPUs). For accelerated devices, CNAM automatically chooses the ideal number of threads. However, some operations (such as permutation testing) do not use hardware acceleration, so this option should still be set correctly even when using hardware acceleration.

#### Use Hardware Acceleration

CNAM can take advantage of Graphics Processing Units (GPUs) and other OpenCL-compatible devices to speed up segmentation. This option can dramatically improve performance without sacrificing accuracy. To use this option, you will need a device that supports OpenCL, such as a modern graphics card. You will also need up-to-date drivers to ensure full support.

If you have more than one OpenCL capable device, you can use the device
drop-down to choose which one you want CNAM to use. Currently, CNAM does
not support using multiple OpenCL devices simultaneously. For device
details and troubleshooting information, click the **OpenCL Info...**
button.

Note

For Windows Remote Desktop users, most GPUs cannot be used when running SVS via Remote Desktop. This is because Remote Desktop sessions use a special video driver that is incompatible with OpenCL. Hopefully a work-around will be available in the future.

#### Specify Memory Limit

This option allows users to fine-tune the memory usage for multivariate segmentation. When left unchecked, SVS will estimate a good memory limit based on your current hardware. Specifying this parameter can improve performance on high-end hardware, or improve stability on low-end hardware.

If you are using a GPU, this option limits the amount of video memory CNAM will use. If using a CPU, it will limit the amount of system RAM used. It is a good idea to make this limit smaller than the total amount of memory available in order to leave room for the operating system, device drivers, and other software.

### Optional Output Files

On the Optional Output tab, checking the Optional Bookmark File Output
box exports the segment means to a file in UCSC Wiggle Track (WIG) format
for Genome Browser import. Use the **Browse** button to select the file
name.

If WIG files are output while using the univariate segmenting algorithm, the **Browse** button will instead have you select a directory, because a separate WIG file is generated for each sample. Each file is named after its sample.

### Excluding Markers

If desired, markers can be excluded from the segmenting algorithm and its results by inactivating the columns corresponding to those markers.

### Run Log

A log can be viewed during the computation by clicking on the Run Log tab after selecting the desired options and before running CNAM. The log informs you of the progress in segmenting sub-regions of the total region of markers being analyzed. If the number of segments found in a given window is equal to the maximum number of segments per window, a warning message will be printed in red, suggesting the user consider increasing that parameter.

Note

During processing, a normal progress bar is also shown in a separate window.

## Outputs from CNAM Optimal Segmenting

### CNV Covariates Spreadsheet

The first one or two spreadsheets created upon segmenting are covariates spreadsheets. A covariates spreadsheet contains the average log2 ratio value for each sample within each segment of markers. The rows correspond to the samples, the columns correspond to the overall segments which have been determined, and the data are the average log2 ratio values. The spreadsheet also has the original marker map applied. There are two covariates spreadsheet variants, described below; one or both can be output from the segmenting results.

#### First column of each segment

In this variant, each column is identified by the chromosome number
and the beginning marker of each segment (see *Segment Covariates Output for First Marker of Segment Only*).
The markers are identified by chromosome position. A new column is
created every time there is a new cut-point over all of the samples.
This creates common segments for all samples, although for a
particular sample there may be more columns than there are
cut-points. In the case where a new column is introduced but a
cut-point was not found for that sample, the CNV segment mean is
repeated for all columns in a segment. A new segment mean only occurs
for each new segment for each sample. This spreadsheet is ideal for
association testing because it has fewer columns to test, so the
Bonferroni multiple testing correction is less severe.
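How per-sample cut-points combine into a common set of columns can be sketched as follows. The function name `first_marker_covariates` and the input layout (marker indices standing in for chromosome positions) are assumptions for illustration, and every sample is assumed to have a segment starting at the first marker.

```python
from bisect import bisect_right


def first_marker_covariates(segments_by_sample):
    """Build one column per unique segment start across all samples.

    segments_by_sample: {sample: [(start_marker_index, segment_mean), ...]}
    with each list sorted by start (hypothetical layout).
    Returns (column starts, {sample: row of segment means}).
    """
    all_starts = sorted(
        {start for segs in segments_by_sample.values() for start, _ in segs}
    )
    table = {}
    for sample, segs in segments_by_sample.items():
        starts = [start for start, _ in segs]
        row = []
        for column_start in all_starts:
            # Find this sample's segment that contains the column start.
            index = bisect_right(starts, column_start) - 1
            row.append(segs[index][1])
        table[sample] = row
    return all_starts, table


segments = {
    "S1": [(0, 0.0), (40, -0.8)],
    "S2": [(0, 0.1), (25, 0.6), (60, 0.0)],
}
columns, covariates = first_marker_covariates(segments)
print(columns)
print(covariates)
```

Sample `S1` has only two segments but still gets four columns, with its segment mean repeated wherever another sample introduced a cut-point.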

#### One column per marker

In this variant, each column is identified by the marker name (see
*Segment Covariates Output for Every Marker*). A new column is created for every marker present
in the original segmented spreadsheet. The mean segment value is
repeated for every marker in each segment found, and only changes when
cut-points are reached. This spreadsheet is ideal to use for plotting
as there is a value for every marker, better demonstrating copy number
amplifications and deletions.

### CNV Segment List Spreadsheet

The third spreadsheet created upon segmenting is a list of CNV
segment means (see *Segment List Spreadsheet*). This spreadsheet contains
columns for the chromosome name, segment base start position, segment
base end position, segment mean, the number of markers in the segment,
the segment column start index, and the segment column end index.
Spreadsheet rows correspond to segments for each sample. If a sample
has 1,000 segments, then there will be 1,000 rows for the sample
before the next sample starts.

### Segment Run Log

The segment run log displayed while segmenting is
saved and output as the Segment Run Log for future reference (see
*Segment Run Log*). This log records whether the maximum number of segments
was reached and whether the window size was doubled to avoid
potential edge issues with the segmenting algorithm.

### Wiggle Track (WIG) File

If you chose to output WIG files they will be saved in the directory you selected.

## Manipulating and Analyzing CNAM Output

Several options for further investigating CNAM output, such as the Segment List spreadsheet, are located in the CNV sub-menu of the Analysis menu in the spreadsheet toolbar. The available functions are outlined below.

### Discretize CN Segment Covariates with Counts

Discretizes the Segment Covariates based on a two- or three-state copy number model, specified by the user.

If the three-state model is selected, two thresholds must be specified. The values in the covariate spreadsheet are replaced by a -1 if the segment mean is below the lower threshold, a 0 if it is between two thresholds, and a 1 if it is above the upper threshold.

If a two-state model is selected, one threshold must be specified. The two state models include a copy number loss model and a copy number gain model. If the loss model is selected, a value less than the threshold is indicated with a 1 and values above the threshold are indicated with a 0. If the gain model is selected, a value greater than the threshold is indicated with a 1 and values below the threshold are indicated with a 0.

If there is a missing value then it will still be indicated as missing.
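The thresholding rules above can be written out as a short sketch. The function names are hypothetical, `None` stands in for a missing value, and treating a value exactly equal to a threshold as neutral (0) is an assumption, since the text does not say how ties are handled.

```python
def discretize_three_state(value, lower, upper):
    """Map a segment mean to -1 (loss), 0 (neutral), or 1 (gain)."""
    if value is None:
        return None  # missing stays missing
    if value < lower:
        return -1
    if value > upper:
        return 1
    return 0


def discretize_two_state(value, threshold, model):
    """Map a segment mean to 1 or 0 under a 'loss' or 'gain' model."""
    if value is None:
        return None  # missing stays missing
    if model == "loss":
        return 1 if value < threshold else 0
    if model == "gain":
        return 1 if value > threshold else 0
    raise ValueError("model must be 'loss' or 'gain'")


segment_means = [-0.9, 0.05, None, 0.7]
three_state = [discretize_three_state(v, -0.3, 0.3) for v in segment_means]
loss_only = [discretize_two_state(v, -0.3, "loss") for v in segment_means]
gain_only = [discretize_two_state(v, 0.3, "gain") for v in segment_means]
print(three_state, loss_only, gain_only)
```

The threshold values used here are arbitrary examples, not recommended settings.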

A second spreadsheet will also be created reporting the number of copy number loss, neutral and gain values for each marker in the segmentation covariates spreadsheet. Also included in the output are the mean value for the marker and the absolute difference between the threshold values and the mean marker value. If a two-state model is used the appropriate count column will contain only zeros and the absolute difference column not used will be filled with missing values.

### Count Number of Segments per Sample

This function can be used on the **Segment List** spreadsheet that is output
from CNAM Optimal Segmenting or the **Homozygous Runs...** spreadsheet that is output
from the Runs of Homozygosity tools.

This function will count the number of segments or runs found in each sample and will provide the total combined length for each sample. The count and length information is output as a new child node spreadsheet of the Segment List or Homozygous Runs spreadsheets.
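The per-sample summary amounts to a group-by over the rows of the list spreadsheet, as in this sketch. The function name `summarize_segments`, the row layout, and the use of `end - start` as the segment length are assumptions made for illustration.

```python
from collections import defaultdict


def summarize_segments(rows):
    """Count segments and sum their lengths for each sample.

    rows: iterable of (sample, chromosome, start, end) tuples
    (hypothetical layout). Length is taken as end - start.
    """
    summary = defaultdict(lambda: {"count": 0, "total_length": 0})
    for sample, _chromosome, start, end in rows:
        summary[sample]["count"] += 1
        summary[sample]["total_length"] += end - start
    return dict(summary)


segment_rows = [
    ("S1", "1", 100, 600),
    ("S1", "2", 1000, 1250),
    ("S2", "1", 50, 150),
]
per_sample = summarize_segments(segment_rows)
print(per_sample)
```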

### Sample Statistics for Discretized Segment List

Summary statistics are calculated for the Discretized Segment List spreadsheet created by the Discretize CN Segment List function. The following summary measures are computed by chromosome:

- Loss Count by Chromosome
- Loss Minimum Segment Length by Chromosome
- Loss Maximum Segment Length by Chromosome
- Loss Mean Segment Length by Chromosome
- Gain Count by Chromosome
- Gain Minimum Segment Length by Chromosome
- Gain Maximum Segment Length by Chromosome
- Gain Mean Segment Length by Chromosome

Each of the measures listed above is output in a separate spreadsheet with samples as rows and one column of summaries per chromosome. All of the above measures are also calculated over all chromosomes and contained in a separate output spreadsheet, All Chromosome Segment Statistics.

Note

All lengths are reported in units of kilobase pairs (kb).

### Discretize CN Segment List

Discretizes the Segment List based on a two- or three-state copy number model, specified by the user.

If the three-state model is selected, two thresholds must be specified. The segment mean values in the segment list spreadsheet are replaced by a -1 if the segment mean is below the lower threshold, a 0 if it is between the two thresholds, and a 1 if it is above the upper threshold.

If a two-state model is selected, one threshold must be specified. The two state models include a copy number loss model and a copy number gain model. If the loss model is selected, a value less than the threshold is indicated with a 1 and values above the threshold are indicated with a 0. If the gain model is selected, a value greater than the threshold is indicated with a 1 and values below the threshold are indicated with a 0.

Missing values will still be indicated as missing.

### Create Sparse Segment Matrix

Converts either the segment list spreadsheet or the runs of homozygosity (ROH) spreadsheet into a sparse matrix with one column per segment per sample.

To create unique column names, the sample name, chromosome name and start position are concatenated, as this combination is assumed to be unique. A single column is created for each row in the segment list or ROH spreadsheet. All values in a column are missing (0’s for an ROH sparse matrix) except the value for the particular sample that owns the segment, which holds the segment mean (1 for an ROH sparse matrix).
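The shape of the resulting matrix can be sketched as below for the segment list case. The function name `sparse_segment_matrix`, the row layout, and the underscore separator in the column names are assumptions, since the exact naming format is not given here. `None` stands in for a missing value.

```python
def sparse_segment_matrix(segment_rows, samples, missing=None):
    """Create one column per segment row.

    segment_rows: iterable of (sample, chromosome, start, segment_mean)
    tuples (hypothetical layout).
    Returns (column names, {sample: row of values}).
    """
    columns = []
    matrix = {sample: [] for sample in samples}
    for owner, chromosome, start, mean in segment_rows:
        # Sample + chromosome + start is assumed to be unique.
        columns.append(f"{owner}_{chromosome}_{start}")
        for sample in samples:
            matrix[sample].append(mean if sample == owner else missing)
    return columns, matrix


rows = [
    ("S1", "1", 100, -0.8),
    ("S2", "1", 250, 0.6),
]
names, sparse = sparse_segment_matrix(rows, ["S1", "S2"])
print(names)
print(sparse)
```

For an ROH input, the same layout would apply with `missing=0` and a value of 1 in place of the segment mean.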

In the marker map created for the spreadsheet, both the Position and Stop position (“End Position” in the list spreadsheets) are included. This enables GenomeBrowse to plot the values of the sparse matrix as intervals in the heat map.

## Visualizing Copy Number Analysis Results

There are several ways to visualize copy number analysis results, depending on which results you want to inspect. The methods described below are not exhaustive, but they are typical of how CNV data is viewed.

### Log2 Ratios

If the quantile-normalized intensities of the log2 ratio data are to be
plotted for visual inspection for the presence of CNVs, this can be done
from the log2 ratio spreadsheet. From the log2 ratio spreadsheet,
go to **GenomeBrowse** **>** **Numeric Value Plot** for row marker mapped
spreadsheets, or **GenomeBrowse** **> New Window** and click on the **Add**
button in the middle of the window to add samples in either case.

See *Numeric Value Plot* for more information.

### CNV Segment Mean Covariates

If the segment mean covariates of the log2 ratio data are to be plotted
for visually inspecting CNVs, this can be done from the CNV Segment
Covariates spreadsheet (preferably the spreadsheet with a column for
every marker). To plot the data, go to **GenomeBrowse** **>** **Numeric Value Plot**
for row marker mapped spreadsheets, or **GenomeBrowse** **> New
Window** and click on the **Add** button in the middle of the window to
add samples in either case. See *Numeric Value Plot* for more information.

A heat map of segmentation covariates is a good tool for visually
detecting interesting copy number regions. This can be done with either
the segment mean covariates or the discretized covariates. It is useful
to first sort the samples by case/control status in order to easily
inspect the top and bottom halves of the heat map. To do this, merge the
CNV Segment Covariates spreadsheet with a phenotype spreadsheet
containing case/control status for the samples, then sort on the column
containing case/control status. From this spreadsheet, select **GenomeBrowse
> Heat Map**. See *genomicHeatMap* for more information.

### CNV Segment Means Histogram

Plotting the histogram of segment means can be useful in visually
identifying thresholds between copy number states of loss, neutral and
gain for a dataset. To plot the histogram, select **Plot** >
**Histograms** from the CNV Segment List Spreadsheet, and from the
Histogram Parameters dialog select the “Segment Mean” column. Additional
parameters can be changed from their defaults. See
*Histograms* for more information.

### CNV Segment Counts Histogram

Plotting the histogram of segment counts can also be useful in visually
identifying noisy samples with a large number of segments. To plot the
histogram, select **Plot** > **Histograms** from the CNV Segment
Counts Spreadsheet, and from the Histogram Parameters dialog select the
“Segment Counts” column. Additional parameters can be changed from their
defaults. See *Histograms* for more information.

### Log2 Ratios and CNV Segments Together

It can be a useful visual tool to plot CNV segments on top of original log2 ratio data. To do this, make sure a marker map is applied to both the original log2 ratio transposed spreadsheet and the CNV segments transposed spreadsheet.

From the log2 ratio spreadsheet, plot the sample or
samples of interest by going to **GenomeBrowse** **> Numeric Value Plot**
for row marker mapped spreadsheets, or **GenomeBrowse** **> New Window**
and click on the **Add** button in the middle of the window to add
samples in either case. See *Numeric Value Plot* for more information.

Next, select the first log2 ratio graph node and right-click to select the
**Add Item(s)** tab in the plot tree. Click on the **Project** location.
In the node selection list, select the CNV segments spreadsheet. Once
the samples have been loaded into the plot data panel, select
the same sample whose log2 ratio values were plotted in the graph.

In general, it is usually desired to have the CNV segments be plotted as a step line, and to leave the log2 ratio values as points on the plot. To change the options for the segment mean covariate log2 ratios, select the appropriate item under the log2 ratio plot in the Plot Tree panel on the left-hand side of the window. On the Display tab change the line style to Left Step. The line can be brought to the front of the viewer by clicking on the name of the item and dragging it above the first item in the log2 ratio plot container. The items can be renamed by right-clicking on the name and selecting “Edit Title”.

More graphs can be added for additional samples by selecting “Add” in
the tool bar or right-clicking in the Plot Tree. Again select the
**Project** location, then select both occurrences of
another sample, one from the log2 ratio spreadsheet and one from the segmentation
covariate results spreadsheet. The above procedure can be repeated for as many
samples as desired. See *GenomeBrowse: The Genomic Scale Data Visualization Tool* for more information.