GenomeBrowse Options for Specific Plot Types

The options dialog contains various controls for setting global VarSeq options. Available controls include a choice for where new plots are added to the plot view, which axes position tracking is enabled on, download save target, and color customization.

The various types of data sources that can be visualized in VarSeq are listed below.

Numeric Value Plot

The Numeric Value Plot plots genomic coordinates on the X-axis against the value of the selected field on the Y-axis. This plot type is used to look for trends associated with genomic position or with the values of one variable.

Numeric Value Plot

Plot Description

Value plots can be drawn from any field consisting of numeric data from an annotation source or file that can be visualized in GenomeBrowse.

Controls

Display Tab for Plot Container

On the Display tab, the controls include:

  • Labels
  • Value
  • Connector
    • Type
    • Size
  • Smoothing
    • Type
    • Window radius
  • Y-Range

The Labels control provides the ability to change the data field that provides the feature labels that are drawn on the plot. More labels will appear the closer the plot is zoomed in.

The Value control allows the data field that provides the data points for the plot to be changed. By default this value is the field selected when the plot was created. If there are multiple items in the plot, changing the value control at the top level will set the value for each plot item to the same value.

The Connector control allows for connecting the data points drawn in the plot. Options include:

  • None: (Default) Data points are not connected
  • Drop Line: Connect all data points to the x-axis with a vertical line.
  • Line: Connect all data points with a line
  • Left Step: Connect all data points by first stepping vertically and then horizontally to the next data point.
  • Mid Step: Connect all data points by placing the vertical step at the midpoint of the horizontal step. I.e. step half-way horizontally, then the full vertical step, then the remaining half-step horizontally.

The connector width is specified in the integer selector box beside the Connector control option. This value indicates the thickness of the connector lines.
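The Left Step and Mid Step connectors described above can be sketched as the list of vertices a renderer would join with straight segments. This is a minimal illustration only: the function names and the (x, y) tuple representation are assumptions for this example, not part of GenomeBrowse.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def left_step(points: List[Point]) -> List[Point]:
    """Step vertically first, then horizontally to the next data point."""
    if not points:
        return []
    path = [points[0]]
    for (x0, _y0), (x1, y1) in zip(points, points[1:]):
        path.append((x0, y1))  # vertical step at the current x position
        path.append((x1, y1))  # horizontal run to the next data point
    return path


def mid_step(points: List[Point]) -> List[Point]:
    """Place the vertical step at the midpoint of the horizontal step."""
    if not points:
        return []
    path = [points[0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        x_mid = (x0 + x1) / 2
        path.append((x_mid, y0))  # first half of the horizontal step
        path.append((x_mid, y1))  # full vertical step
        path.append((x1, y1))     # remaining half of the horizontal step
    return path
```

For two points (0, 1) and (2, 3), left_step yields (0, 1), (0, 3), (2, 3), while mid_step yields (0, 1), (1, 1), (1, 3), (2, 3).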

The Smoothing control allows for smoothing data based on a specified range of values. The smoothing options include:

  • None
  • Mean Symmetric
  • Median Symmetric
  • Mean Asymmetric
  • Median Asymmetric

The window radius is specified in the integer selector box beside the Smoothing control option. This value indicates the number of points to use for smoothing on either side of the point being smoothed. For example, a window radius of 2 replaces each point with a 5-point median or mean value.

The difference between Symmetric and Asymmetric smoothing is how the boundary cases are handled.
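The effect of the window radius can be illustrated with a short sketch. It assumes that the window is simply truncated at the ends of the data; the text above does not say how the Symmetric and Asymmetric variants treat the boundaries, so this example should not be read as either of them. The function name and arguments are hypothetical.

```python
from statistics import mean, median


def smooth(values, radius, method="mean"):
    """Replace each point with the mean or median of its neighborhood.

    ASSUMPTION: near the ends of the data the window is truncated to the
    points that exist. The actual boundary handling may differ.
    """
    if method not in ("mean", "median"):
        raise ValueError("method must be 'mean' or 'median'")
    stat = mean if method == "mean" else median
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - radius)
        hi = min(len(values), i + radius + 1)
        smoothed.append(stat(values[lo:hi]))  # up to 2 * radius + 1 points
    return smoothed
```

With a radius of 2, an interior point is replaced by a statistic over 5 points: itself plus two neighbors on each side.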

For the other controls that are common with most plot types, see Display Controls.

Style Tab

On the Style tab, the controls include:

  • Style By
    • Field
    • Save
  • Style
    • Color
    • Shape
    • Size
  • Restyle
    • Method
    • Various Styling Options

The Style By control enables the user to select a single dimension in which colors can be used to discriminate between complementary data categories. A dimension can be selected by clicking the “Style By” button. Fields available from the source that can be used for styling are available in the list. Selecting a numeric field will enable the Cutoff control to specify a threshold value to use for splitting the style of the data. To save the style, click the Save button.

The Style list allows for the specification of the style of the data drawn in the plot. There are controls for changing the color, shape and size of the data points. If a field is specified to Style By then there will be controls for each group as determined by the field and threshold selected.

To change the shape and size of all categories, first change the shape and size with Style By = None then set the Style By to the desired field such as Chromosome.

The Restyle control allows for all styles in all selected plot items to be recolored or reshaped incrementally. When a single plot is selected, it has the same effect as selecting all of its items. The available methods include:

  • From Current: Uses the first style as the starting point and increments the colors and shape by the specified amount for each remaining style. An increment of 0 sets all of the colors and/or shapes to the starting values.
  • Color Gradient: Set the starting color and then specify the Hue, Saturation and Value increments.
  • Color From: Set the starting color then specify the color increment.
  • Shape From: Set the starting shape then specify the shape increment.

Filter Tab

A Filter can be used to control which features are drawn in the plot.

To add a filter either click on Insert or right-click anywhere in the Filter list box and select Insert.

Please see Filter Controls for more information.

Layout Tab

On the Layout tab general plot controls can be changed. See Layout Controls for more information.

Add Tab

On the Add tab there are buttons for adding additional items or line items to the plot. Clicking on the Add Item(s) button opens the Add Data Sources dialog of the data source library. Clicking on the Add Line Item(s) button opens the Line Parameters dialog to add a horizontal line, a vertical line, or a line with a slope and intercept. See Line Items for more information.

Display Tab for Plot Items

On the Display tab, the controls include:

  • Labels
  • Value
  • Connector
  • Smoothing

The Labels control provides the ability to change the data field that provides the feature labels that are drawn on the plot. More labels will appear the closer the plot is zoomed in.

The Value control allows the data field that provides the data points for the item to be changed. By default this value is the field selected when the item was created.

The Connector control allows for connecting the data points drawn in the plot. Options include:

  • None: (Default) Data points are not connected
  • Drop Line: Connect all data points to the x-axis with a vertical line.
  • Line: Connect all data points with a line
  • Left Step: Connect all data points by first stepping vertically and then horizontally to the next data point.
  • Mid Step: Connect all data points by placing the vertical step at the midpoint of the horizontal step. I.e. step half-way horizontally, then the full vertical step, then the remaining half-step horizontally.

The connector width is specified in the integer selector box beside the Connector control option. This value indicates the thickness of the connector lines.

The Smoothing control allows for smoothing data based on a specified range of values. The smoothing options include:

  • None
  • Mean Symmetric
  • Median Symmetric
  • Mean Asymmetric
  • Median Asymmetric

The window radius is specified in the integer selector box beside the Smoothing control option. This value indicates the number of points to use for smoothing on either side of the point being smoothed. For example, a window radius of 2 replaces each point with a 5-point median or mean value.

The difference between Symmetric and Asymmetric smoothing is how the boundary cases are handled.

For the other controls that are common with most plot types, see Display Controls.

Style Tab for Plot Items

  • Style By
    • Field
    • Save
  • Style
    • Color
    • Size
    • Shape
  • Restyle
    • Method
    • Various Styling Options

The Style By control enables the user to select a single dimension in which colors can be used to discriminate between complementary data categories. A dimension can be selected by clicking the “Style By” button. Fields available from the source that can be used for styling are available in the list. Selecting a numeric field will enable the Cutoff control to specify a threshold value to use for splitting the style of the data. To save the style, click the Save button.

The Style list allows for the specification of the style of the data drawn in the plot. There are controls for changing the color, shape and size of the data points. If a field is specified to Style By then there will be controls for each group as determined by the field and threshold selected.

To change the shape and size of all categories, first change the shape and size with Style By = None then set the Style By to the desired field such as Chromosome.

The Restyle control allows for styles in all selected plot items to be recolored or reshaped incrementally. The available methods include:

  • From Current: Uses the first style as the starting point and increments the colors and shape by the specified amount for each remaining style. An increment of 0 sets all of the colors and/or shapes to the starting values.
  • Color Gradient: Set the starting color and then specify the Hue, Saturation and Value increments.
  • Color From: Set the starting color then specify the color increment.
  • Shape From: Set the starting shape then specify the shape increment.

Filter Tab for Plot Items

A Filter can be used to control which features are drawn in the plot.

To add a filter either click on Insert or right-click anywhere in the Filter list box and select Insert.

See Filter Controls for more information.

Detail View

Clicking on a plot container for value plots will print out information about the data source including all of the fields in the source.

Clicking on a data point in the plot will display the value, the label for the feature, and any applied styling.

Line Items

Line items can be added to value plots by either right-clicking on the plot and selecting Add Line Item(s) or by clicking on the Add tab for the plot container and clicking on the Add Line Item(s) button.

The Line Parameters dialog includes the following controls:

  • Line Type Selection
  • Slope/Intercept
  • Color
  • Width

The Line Type Selection control allows one of the three available line types to be selected: Horizontal, Vertical or Slope/Intercept.

The Slope/Intercept controls will change depending on the selected line type.

  • A Horizontal line is specified by a numeric y-intercept value.
  • A Vertical line is specified by genomic coordinates (Chr#:Position) or by an x-intercept value.
  • A Slope/Intercept line is specified by a numeric slope and numeric y-intercept value.
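The three line types can be thought of as three ways of producing the two endpoints of a segment that spans the visible plot area. The sketch below is illustrative only; the function name, the string labels for the line types and the numeric x coordinate (rather than a Chr#:Position value) are assumptions for this example.

```python
def line_endpoints(kind, x_min, x_max, y_min, y_max,
                   value=0.0, slope=0.0, intercept=0.0):
    """Return the two endpoints of a line item clipped to the visible x-range.

    kind is 'horizontal' (value is the y-intercept), 'vertical' (value is
    the x-intercept) or 'slope_intercept' (y = slope * x + intercept).
    """
    if kind == "horizontal":
        return (x_min, value), (x_max, value)
    if kind == "vertical":
        return (value, y_min), (value, y_max)
    if kind == "slope_intercept":
        return ((x_min, slope * x_min + intercept),
                (x_max, slope * x_max + intercept))
    raise ValueError(f"unknown line type: {kind}")
```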

The Color control sets the color of the line.

The Width control sets the thickness of the line. This thickness is absolute, so it will be the same size on screen regardless of the current zoom.

Special Features

If a numeric feature has both a chromosome start and stop position then the value will be drawn as an interval. In this case it is recommended that the shape drawn for the values be changed to one of the shapes that stretches better such as a rectangle or plus sign.

Variant Maps

A Variant Map provides a visual interpretation of genotypic data for one or more samples. Variant maps can be created from variant call format (VCF) files or annotation sources with sample level variant calls.

Variant Map Plot

Plot Description

The variants for each sample in a variant map are displayed in rows along the genomic x-axis. For deletions and substitutions, the variant may be drawn to cover multiple bases.

Variant data that matches the reference (non-variant) is downplayed in visual significance at close zooms. This makes the actual variants more obvious. Optionally, reference allele matches can be hidden at all zoom levels.

For wide-zoom views, the variants are displayed as a gray scale density plot, indicating the locations of variant data.

When zoomed in close enough, the variants are colored according to the GenomeBrowse global color options. Each sample’s row is split vertically to allow for indication of zygosity. A variant of a single color therefore indicates a homozygous alternate variant call, whereas a two color variant indicates a heterozygous variant call. Missing calls are displayed as question marks (?) in light gray. In this way missing variants and half-called variants can be indicated as well. By default variants are labeled with the variant call or “genotype”.

  • Single Nucleotide Variants (SNVs) are colored as a single base.
  • Insertions are drawn with a zero-width I bar at the location of the inserted base(s).
  • Deletions are represented as a magenta solid block representing the missing base(s).
  • Substitution variants are drawn with each base indicating the variant alleles at that position.
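The zygosity categories described above (single color for homozygous alternate, two colors for heterozygous, question marks for missing or half-called) can be summarized with a small classifier. This is a sketch under stated assumptions: a diploid call is given as two allele strings, a missing allele is written as "?", and the function name and category labels are made up for this example.

```python
def classify_call(allele_a, allele_b, ref, missing="?"):
    """Classify a diploid genotype call relative to the reference allele."""
    n_missing = (allele_a, allele_b).count(missing)
    if n_missing == 2:
        return "missing"
    if n_missing == 1:
        return "half-called"
    if allele_a == allele_b:
        # Both halves of the row would be drawn in the same color.
        return "homozygous reference" if allele_a == ref else "homozygous alternate"
    # The two halves of the row would be drawn in different colors.
    return "heterozygous"
```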

The y-axis corresponds to the Sample Label.

Controls

Display Tab

On the Display tab, the controls include:

  • Reference Alleles
  • Labels

The Reference Alleles control will draw a line for samples that have the reference allele(s) at close zooms if checked. Otherwise, reference allele matches are not drawn.

The Labels control allows for selection of the data field that provides labels for the marker or column of variants when zoomed in close enough.

For the other controls that are common with most plot types, see Display Controls.

Filter Tab

On the Filter tab, there are two filter boxes. The top filter box can be used to filter variants based on field data, such as a variant level quality score or another INFO field in a VCF file. The bottom filter box can be used to filter samples based on sample names.

To add a filter either click on Insert or right-click anywhere in the appropriate Filter list box and select Insert.

Please see Filter Controls for more information.

Group By Tab

If sample field metadata that can be used for grouping is available from the plot source, these fields can be selected on the Group By tab. If a numeric field is chosen, a cutoff or group split-point can be specified as well. The resulting groups will be displayed as rows in the box below. Each group can be hidden or shown using the check box, and its color can be changed by clicking on the color button and choosing a new color.

Layout Tab

On the Layout tab general plot controls can be changed. See Layout Controls for more information.

Detail View

Clicking a marker or column of variants in a variant map displays information about the marker from the data source in the data console. The information will include, if available, the marker label, a list of alleles found at the marker, all associated data fields from the data source, and a list of each sample’s genotype at the marker.

When zoomed in close enough, a particular sample’s variant can also be clicked. The information displayed is the same as for the marker but will also include the clicked genotype near the top of the report.

Linkage Disequilibrium

The LD plot is a triangular heat map of the LD statistics (D', R^2) between pairs of variants across the genome in a variant annotation source.

Note

Linkage Disequilibrium (LD) is the non-random association of alleles in a population. LD is useful during analysis to identify linkage relationships of interesting variants. LD can be calculated from most sources with more than one sample, in particular multiple-sample VCF files.

Plot Description

The variants are the individual pentagons along the spine of the LD graph. Because LD is a pairwise computation, LD is not available at large zooms. At large zooms the plot will show the density of the variants in a “rug plot”. To see LD values, zoom into a region of interest; the maximum zoom is 2*10^6-1.

Linkage Disequilibrium for region in Chromosome 14

Controls

Display Tab

On the Display tab, the controls specific to LD are:

  • Statistic
  • Labels

The Statistic control gives the option to display either the LD R^2 or the D' statistic in the plot.
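For readers unfamiliar with the two statistics, the sketch below computes D' and R^2 for one pair of biallelic markers using the standard textbook definitions. It assumes phased haplotypes coded as 0 (reference) and 1 (alternate), and it reports the absolute value of D'. How the software estimates these values from unphased genotype calls is not described here, so this is not a description of its method; the function name is hypothetical.

```python
def ld_stats(haplotypes):
    """Return (D', R^2) for a list of (allele_at_A, allele_at_B) haplotypes.

    ASSUMPTION: haplotypes are phased and coded 0/1. Returns None when
    either marker is monomorphic, because LD is undefined in that case.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None

    r_squared = d * d / denom
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    return d_prime, r_squared
```

Two markers whose alternate alleles always occur together give D' = 1 and R^2 = 1; markers that occur independently give 0 for both.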

The Labels control allows for selection of the data field that provides labels for the markers.

See Display Controls for more information on controls available for all plot types.

Filter Tab

A Filter can be used to control which features are drawn in the plot.

To add a filter either click on Insert or right-click anywhere in the Filter list box and select Insert.

Please see Filter Controls for more information.

Layout Tab

On the Layout tab general plot controls can be changed. The control specific to LD is:

  • Invert

The Invert control allows for the specification of the location of the spine of the visualization to be either along the top or the bottom of the plot. This control is checked by default, corresponding to the spine at the top of the plot.

See Layout Controls for more information on controls available for all plot types.

Detail View

The Data Console provides a detailed HTML-formatted text output. There are two elements comprising an LD graph:

  • LD Value
  • Summary information for each variant or marker

The summary information includes the name of the variant or marker, and the allele information for each contributor to the LD calculation.

Heat Maps

A Heat Map is an intensity plot of numeric values from a data source containing multiple samples such as a multi-sample VCF file. The X-axis consists of the variants from the source. The Y-axis consists of sample labels.

Heat maps are useful to find non-random patterns in the data, particularly for read depth or quality scores.

Heat map of Read Depth from a VCF file

Plot Description

Controls

Display Tab

On the Display tab, the controls include:

  • Aggregation [Method]
  • Auto-Compute Color Values
  • Color Values
  • Presets

The Aggregation control allows for the specification of how values are aggregated when there are not enough pixels available to draw all of the selected data. In that case a decision needs to be made as to what value to show for each pixel. The options available are Mean, Minimum, Maximum and Extreme.
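The idea of aggregation can be sketched as assigning each value to a pixel bin and reducing each bin to one number. Mean, Minimum and Maximum are self-explanatory. The meaning of Extreme is not defined in this section, so the sketch below assumes it means the value farthest from zero; that interpretation, the binning rule and the function names are assumptions for illustration only.

```python
def aggregate(values, method):
    """Reduce the values that fall on one pixel to a single value."""
    if method == "mean":
        return sum(values) / len(values)
    if method == "minimum":
        return min(values)
    if method == "maximum":
        return max(values)
    if method == "extreme":
        # ASSUMPTION: "Extreme" is taken to be the value with the largest
        # magnitude; the documentation does not define it.
        return max(values, key=abs)
    raise ValueError(f"unknown aggregation method: {method}")


def bin_to_pixels(values, n_pixels, method="mean"):
    """Spread values over n_pixels bins and aggregate each non-empty bin."""
    bins = [[] for _ in range(n_pixels)]
    for i, value in enumerate(values):
        bins[i * n_pixels // len(values)].append(value)
    return [aggregate(b, method) for b in bins if b]
```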

The Auto-Compute Color Values check box indicates whether the colors should be assigned based on automatically computed values. The method for auto-computing the values for the colors is as follows:

\begin{align*}
n &= \text{number of color values} \\
k &= \frac{6}{n-1} \\
\bar{x} &= \text{mean of all numeric data in the source} \\
\sigma^2 &= \text{variance of the numeric data} \\
\sigma &= \sqrt{\sigma^2} = \text{standard deviation} \\
i &= \text{index of color value; } i \in [0,n) \\
a_i &= i - \frac{n-1}{2} \\
z_i &= a_i \cdot k \\
SP_i &= \bar{x} + z_i \cdot \sigma
\end{align*}
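The formulas above place the n color values evenly between three standard deviations below and three standard deviations above the mean. A minimal sketch of the computation follows. The formulas do not say whether the population or the sample variance is used; the population variance is assumed here, and the function name is hypothetical.

```python
from statistics import fmean, pstdev


def color_values(data, n):
    """Compute the n split points SP_i = mean + z_i * sigma.

    ASSUMPTION: sigma is the population standard deviation (pstdev).
    """
    if n < 2:
        raise ValueError("at least two color values are required")
    k = 6 / (n - 1)
    x_bar = fmean(data)
    sigma = pstdev(data)
    points = []
    for i in range(n):
        a_i = i - (n - 1) / 2  # centered index
        z_i = a_i * k          # ranges from -3 to +3
        points.append(x_bar + z_i * sigma)
    return points
```

For data with mean 10 and standard deviation 2, three color values fall at 4, 10 and 16.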

The Color Values box lists all of the colors and the values associated with the colors (whether manually specified or auto-computed). To edit the colors, double-click on the color box for the color value. To edit the value, uncheck Auto-Compute Color Values and right-click on a value. New points can be added or values can be deleted from the right-click menu as well.

The Presets buttons are different preset color combinations to use for coloring the heat map. The default color combination is Gain/Loss.

For the other controls that are common with most plot types, see Display Controls.

Filter Tab

On the Filter tab, there are two filter boxes. The top filter box is to filter features based on a feature field in the source of the data. The bottom filter box is to filter features based on a sample field in the source of the data.

To add a filter either click on Insert or right-click anywhere in the appropriate Filter list box and select Insert.

Please see Filter Controls for more information.

Group By Tab

If sample field metadata that can be used for grouping is available from the plot source, these fields can be selected on the Group By tab. If a numeric field is chosen, a cutoff or group split-point can be specified as well. The resulting groups will be displayed as rows in the box below. Each group can be hidden or shown using the check box, and its color can be changed by clicking on the color button and choosing a new color.

Layout Tab

On the Layout tab general plot controls can be changed. See Layout Controls for more information.

Detail View

Clicking on a value in a Heat Map displays the number of features clicked on as well as the mean, minimum and maximum values in the heat map bin. When there is only one feature clicked on, the value for all three statistics will be the value of the feature. The sample used for computing the three summary statistics is also listed in the data console.

BAM File Type

Please see Read Alignment Sources for information on visualization of data from this file type.

File Information

BAM files can be added into a GenomeBrowse window by either selecting them from the Add Data Sources dialog (Add dialog) or by dragging the files into an open GenomeBrowse window.

Before the data can be visualized, GenomeBrowse must first compute index (BAI) and coverage (COVIDF) files for the BAM. Once the index file has been computed, visualization of the data will be available only for zoomed-in regions of the plot until the coverage file has been computed. If a BAI file was provided, GenomeBrowse will not generate another index but will instead use the existing file, as long as it is saved in the same directory as the BAM file.

BAM files need to be sorted to be loaded into GenomeBrowse. Additionally, to be able to compute index and coverage files, GenomeBrowse needs to be able to identify the reference sequence that corresponds to the genome build associated with the data in the BAM.

GenomeBrowse will use the BAM header information to identify the correct reference sequence by matching the chromosome names and lengths from the header exactly to that information in an available genome assembly file. See Genome Assemblies for more information on assembly files. Once the correct reference sequence is identified for the BAM, you must download a local copy of the reference sequence for GenomeBrowse to use in computing the index and coverage files. Please see Downloading Data for information on downloading the correct reference sequence.

If the BAM file is unsorted or if the header is not formatted correctly, it is recommended that a third party tool such as SAMtools, http://samtools.sourceforge.net, be used to edit the file into the correct format.

BED File Type

BED files can contain either interval data or gene information. Please see Interval Sources for information on interval BED files and see Gene Sources for information on gene sources.

File Information

BED files need to be sorted to be loaded into GenomeBrowse. If a file is not sorted, it is recommended that a text editor or spreadsheet editor such as MS Excel be used to get the data in the correct order.
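As an alternative to sorting by hand, a BED file can be sorted with a few lines of script. The sketch below assumes that "sorted" means ordered by chromosome name and then by start and end position, and that header lines (starting with "#", "track" or "browser") should stay at the top. The exact ordering GenomeBrowse expects is not stated here, so treat the sort key as an assumption; the function name is hypothetical.

```python
def sort_bed_lines(lines):
    """Return BED lines with headers first and records sorted by position.

    ASSUMPTION: records are tab-delimited and are ordered by chromosome
    name (as text), then numeric start, then numeric end.
    """
    header, records = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith(("#", "track", "browser")):
            header.append(line)
        else:
            records.append(line)

    def position(record):
        fields = record.split("\t")
        return fields[0], int(fields[1]), int(fields[2])

    return header + sorted(records, key=position)
```

To apply it to a file, read the lines with open(path).readlines(), pass them to sort_bed_lines, and write the result back out joined with newlines.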

Once the file has been sorted, it can be added into a GenomeBrowse window by either selecting it from the Add Data Sources dialog (Add dialog) or by dragging the file into an open GenomeBrowse window. Adding a BED file will also compress and index the file. Once the file has been compressed and indexed then only the compressed and indexed files are needed for visualization in GenomeBrowse.

Cytoband Sources

Cytoband sources contain cytoband information, including Giemsa stain results.

Cytoband Plot

Plot Description

The plot displays a karyotype view of the cytobands (cytogenetic bands) within each chromosome.

Controls

Display Tab

On the Display tab general plot controls can be changed. See Display Controls for more information.

Layout Tab

On the Layout tab general plot controls can be changed. See Layout Controls for more information.

Special Features

If one of the human assemblies is chosen for the genome build then the appropriate cytoband source shipped with the software will be set as the domain view plot by default.

If a cytoband source is available for your species/build it is recommended that it be set as the domain view plot.

To set a plot as the domain view plot, right-click on the plot and select Set as Domain View Plot.

Interval Sources

Interval sources contain features of variable width and may have multiple data fields associated with each feature interval.

Interval Source Plot

Plot Description

Displays information for genomic intervals. There can be multiple overlapping intervals, so they will be stacked on the vertical axis to avoid visual overlap. The style of each interval can be specified to convey meaning. For instance, the dbNSFP Gene Annotation source contains information about genes with nonsynonymous variants. The color of each interval is based on the Inheritance Type information in the source.

Controls

Display Tab

On the Display tab, the controls include:

  • Labels

The Labels control provides the ability to change the data field that provides labels for the features. Labels will be displayed when zoomed in close enough.

For the other controls that are common with most plot types, see Display Controls.

Style Tab

On the Style tab, the controls include:

  • Style By
    • Field
    • Save
  • Style
    • Color
    • Shape
  • Restyle
    • Method
    • Various Styling Options

The Style By control enables the user to select a single dimension in which colors can be used to discriminate between complementary data categories. A dimension can be selected by clicking the “Style By” button. Fields available from the source that can be used for styling are available in the list. Selecting a numeric field will enable the Cutoff control to specify a threshold value to use for splitting the style of the data. To save the style, click the Save button.

The Style list allows for the specification of the style of the data drawn in the plot. There are controls for changing the color and shape of the data points. If a field is specified to Style By then there will be controls for each group as determined by the field and threshold selected.

The Restyle control allows for styles in all selected plot items to be recolored or reshaped incrementally. The available methods include:

  • From Current: Uses the first style as the starting point and increments the colors and shape by the specified amount for each remaining style. An increment of 0 sets all of the colors and/or shapes to the starting values.
  • Color Gradient: Set the starting color and then specify the Hue, Saturation and Value increments.
  • Color From: Set the starting color then specify the color increment.
  • Shape From: Set the starting shape then specify the shape increment.

Filter Tab

A Filter can be used to control which features are drawn in the plot.

To add a filter either click on Insert or right-click anywhere in the Filter list box and select Insert.

Please see Filter Controls for more information.

Layout Tab

On the Layout tab general plot controls can be changed. See Layout Controls for more information.

Detail View

The data console contains information on the feature from the data source as well as the genomic position of the interval.

Gene Sources

Gene sources contain information on genes. Most gene sources also include coding region information.

Gene Source Plot

Plot Description

Gene sources draw genes as intervals using rectangles to indicate exons and a directional line to indicate introns. Genes on the forward strand are colored blue while genes on the reverse strand are colored green by default. UTR regions are a slightly darker shade of blue or green depending on the strand orientation of the gene.

Feature labels are drawn dynamically depending on the zoom range. Whenever possible gene names are drawn at larger zoom ranges. As the zoom range becomes smaller exon labels become visible and hover labels indicating the codon amino acids are available by mousing over the alternating light/dark codon segments of exon regions.

For mitochondrial codon amino acids, the alternate MT codon table is used to label gene features, see: https://www.mun.ca/biology/scarr/MGA2-03-28_mtDNA_code.jpg
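To illustrate how an alternate mitochondrial table changes codon labels, the sketch below translates codons with the standard genetic code and with the vertebrate mitochondrial code (NCBI translation table 2), which differs from the standard code at four codons. That the software uses exactly this table is an assumption based on the human mtDNA reference linked above; the function and variable names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}

# Differences in the vertebrate mitochondrial code (NCBI table 2).
MITO_OVERRIDES = {"AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"}
MITOCHONDRIAL = {**STANDARD, **MITO_OVERRIDES}


def translate(sequence, mitochondrial=False):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    table = MITOCHONDRIAL if mitochondrial else STANDARD
    codons = (sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3))
    return "".join(table[codon] for codon in codons)
```

The sequence ATATGAAGA reads as I, stop, R with the standard table, but as M, W, stop with the mitochondrial table.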

Controls

Display Tab

On the Display tab, the controls include:

  • Genes [Draw Mode]

The Genes control allows the user to specify the draw mode for genes. Options include:

  • Auto: Depending on the zoom range and the vertical height of the gene plot, either show all genes and transcripts or collapse all transcripts into one gene interval drawing.
  • Compact: Always try to collapse all transcripts into one gene interval drawing. It will not always be possible to collapse all transcripts, but the preference will be to collapse regardless of the zoom range and plot height.
  • Expanded: Always draw all transcripts as separate gene intervals regardless of the zoom region/plot height.

The other two controls are available for most plots. See Display Controls for more information.

Filter Tab

A Filter can be used to control which features are drawn in the plot.

To add a filter either click on Insert or right-click anywhere in the Filter list box and select Insert.

A basic filter can be added based on the fields of the source. A custom filter can also be specified using the muParserX syntax.

Layout Tab

On the Layout tab general plot controls can be changed. See Layout Controls for more information.

Detail View

Clicking on a gene feature will display information about the gene from the gene source in the data console. The information will include, if available, the gene and transcript names, the strand, if it is coding or not, the pre-spliced size, the post-spliced size and all of the exon region information in a tabular form.

The Exon Table includes the exon number, size, start relative to the start of the chromosome, the start relative to spliced RNA, and start relative to coding RNA.
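The relationship between the chromosome start and the spliced RNA start of each exon can be sketched as a running sum of exon sizes. The example assumes a forward-strand transcript with exons given as 0-based, half-open (start, end) pairs; the coding RNA column is omitted because it also depends on the coding start position. The function name and dictionary keys are made up for this illustration.

```python
def exon_table(exons):
    """Build rows with exon number, size, chromosome start and spliced start.

    ASSUMPTION: forward strand, 0-based half-open (start, end) coordinates.
    """
    rows = []
    spliced_offset = 0  # position of the exon start within the spliced RNA
    for number, (start, end) in enumerate(sorted(exons), start=1):
        size = end - start
        rows.append({
            "exon": number,
            "size": size,
            "chrom_start": start,
            "spliced_start": spliced_offset,
        })
        spliced_offset += size  # introns are skipped in the spliced RNA
    return rows
```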

Also included are hyperlinks to gene databases if they were included in the gene source. The hyperlinks either search the databases by gene name or transcript name. A web browse

Read Alignment Sources

Read alignment sources contain reads, generally short nucleotide sequences, typically aligned to a reference genome. Currently only binary sequence alignment/map (BAM) files, generally from a secondary analysis pipeline, are supported as read alignment data sources.

BAM File Plot

Plot Description

Read alignment sources are visualized in two different ways: coverage plots and pile-up plots. The coverage is a measure of read depth or read count across the genome. The pile-up plot shows individual reads from the read alignment source stacked up on the vertical axis to avoid visual overlap.

Coverage Plot

From a whole genome or large zoom region, the coverage plot shows stacked histograms of the reads (read depth). The histograms are split into two groups to emphasize the strand of the read, either the forward or reverse strand.

When zoomed in close enough the coverage plot switches to a detail view showing stacked histograms of the nucleotide counts for the bins. Each bin is just one base-pair wide and the histograms count all of the nucleotides from the reads spanning that base-pair.
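The strand-split read depth described above can be sketched as a per-base count kept separately for forward and reverse reads. The example assumes each read is a (start, end, strand) tuple with 0-based, half-open coordinates and strand "+" or "-"; this representation and the function name are assumptions, and the sketch ignores the precomputed coverage file that the software uses.

```python
def strand_coverage(reads, region_start, region_end):
    """Count reads covering each base of a region, split by strand."""
    width = region_end - region_start
    depth = {"+": [0] * width, "-": [0] * width}
    for start, end, strand in reads:
        # Clip the read to the region before counting.
        for pos in range(max(start, region_start), min(end, region_end)):
            depth[strand][pos - region_start] += 1
    return depth
```

Stacking the "+" and "-" lists on top of each other at every position gives the total read depth.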

Pile-up Plot

The pile-up plot displays all the reads from the read alignment source. Many of the reads may overlap, so the reads are stacked or piled up along the y-axis. The depth of the stack is often similar to the read depth, but there are often empty spaces in the stack, causing it to be taller than the actual number of reads spanning a given position. If the reads provided by the read alignment source are RNA-Seq reads aligned to a DNA reference sequence, they may be aligned across introns (DNA intervals that are not represented in RNA). Such read alignments are displayed with thin (gray by default) bars between the two ends of the read.
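One common way to stack overlapping reads is a greedy row assignment: each read goes into the first row whose last read has already ended. The sketch below shows that general technique; it is not necessarily the layout algorithm the software uses, and the function name and the half-open (start, end) read representation are assumptions.

```python
def stack_reads(reads):
    """Assign each (start, end) read to the lowest row where it fits.

    Returns a list of (start, end, row) tuples in start order.
    """
    row_ends = []  # end position of the last read placed in each row
    placed = []
    for start, end in sorted(reads):
        for row, last_end in enumerate(row_ends):
            if start >= last_end:  # no overlap with this row's last read
                row_ends[row] = end
                placed.append((start, end, row))
                break
        else:
            # Every existing row is still occupied, so open a new one.
            row_ends.append(end)
            placed.append((start, end, len(row_ends) - 1))
    return placed
```

Because a read can only reuse a row after the previous read in that row has ended, rows can contain gaps, which is one reason a stack may be taller than the read depth at a given position.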

For a whole genome or wide zoom view, the pile-up plot displays a shape that approximates the way the stacks of reads will look as the view is zoomed in. Intron spanning reads are piled at the bottom of the stacks so that the collection of reads that span an intron region will appear with a large (gray by default) rectangle between them. The wide zoom views are primarily useful for navigating to regions which contain data and maintaining a spatial reference while adjusting the view.

The way read alignments are displayed in the pile-up plot can be changed to better suit various browsing needs. None of these changes affect wide zoom views. The coloring can be changed to emphasize either mismatches from the reference sequence, or the read strand (forward or reverse). The stacking can be entirely above the axis or split so that forward reads are stacked above the axis and reverse reads are stacked below. Read pairings can also be indicated (by thin light gray lines between reads), but for paired alignments to be displayed all reads must be stacked above the axis.

Controls

Display Tab

On the Display tab, the controls include:

  • Emphasize:
    • Mismatches
    • Strand
  • Stack:
    • Above Axis
    • Split By Strand
    • Paired Ends
  • Per-base Quality Shading

The Emphasize control provides the ability to change the features emphasized. If Mismatches is selected, then plot coloring will be adjusted so that alleles that do not match the reference allele will be more visible. If Strand is emphasized, plot coloring will be adjusted so that the read’s strand color will be bolder.

The Stack control provides the ability to change how the reads are stacked. If Above Axis is selected, all reads are stacked above the X-axis. If Split By Strand is selected, reads on the forward strand are placed above the X-axis whereas reads on the reverse strand are placed below the X-axis. If Paired Ends is selected, reads are stacked to connect paired end reads.

The Per-base Quality Shading control enables or disables quality proportional blending of each base’s color. High quality bases will be rendered more vibrantly and low quality bases will be blended into their background read color. This results in higher quality mismatches being visually more obvious than low quality ones.
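One way to picture quality proportional blending is as a linear interpolation between the read's background color and the base's own color. The following is a minimal sketch under stated assumptions: the function name, the RGB tuples, the linear weighting, and the cap of 40 on the quality score are all illustrative choices, not documented GenomeBrowse behavior.

```python
def shade_base(base_rgb, read_rgb, quality, max_quality=40):
    """Blend a base color into the read color in proportion to base quality.

    max_quality is an assumed cap; qualities at or above it give the
    full base color, and a quality of 0 gives the read color.
    """
    weight = max(0, min(quality, max_quality)) / max_quality
    return tuple(
        round(read + (base - read) * weight)
        for base, read in zip(base_rgb, read_rgb)
    )


mismatch_red = (250, 0, 0)
read_gray = (200, 200, 200)
for q in (0, 20, 40):
    print(q, shade_base(mismatch_red, read_gray, q))
```

A high quality mismatch keeps its full color, while a low quality one fades toward the read color, which matches the visual effect described above.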

Filter Tab

On the Filter tab, the controls include:

  • Flag Zero Quality Alignments
  • Filter Multi-Mapped Alignments
    • Mapping Quality Threshold
  • Filter Duplicate Alignments
  • Filter Vendor Failed Alignments

These controls allow for different coloring or removal of certain reads that are marked with commonly used flags in the BAM file format. The flags provided are typically used by alignment programs to indicate a potential problem or lack of certainty in a read’s alignment, so it may be useful to highlight such reads or remove them from the visualization.

The Flag Zero Quality Alignments control enables or disables special coloring for alignments marked with zero mapping quality. Such alignments will be colored light gray when the option is enabled.

The Filter Multi-Mapped Alignments control disables or enables display of read alignments marked as “secondary alignment” or having zero mapping quality. Such alignments are hidden when the control is checked. The Mapping Quality Threshold will also be enabled when the control is checked. Setting the mapping quality threshold to a value greater than one will not only filter out zero mapping quality alignments, but also those with mapping quality less than the specified value.

The Filter Duplicate Alignments control disables or enables display of reads marked as “PCR or optical duplicate”. Such reads will be hidden when the control is checked.

The Filter Vendor Failed Alignments control disables or enables display of reads marked with “not passing quality controls”. Such reads will be hidden when the control is checked.
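The filter logic above can be summarized as a small predicate over a read's flag field and mapping quality. The flag bit values are those defined by the SAM/BAM format (0x100 secondary alignment, 0x200 not passing quality controls, 0x400 PCR or optical duplicate); the function name, its keyword arguments, and the default threshold of 1 are hypothetical and only mirror the controls described here.

```python
# Flag bits as defined by the SAM/BAM format.
SECONDARY = 0x100
QC_FAIL = 0x200
DUPLICATE = 0x400


def is_hidden(flag, mapq, *, filter_multi_mapped=False, mapq_threshold=1,
              filter_duplicates=False, filter_vendor_failed=False):
    """Return True if a read would be hidden by the enabled filters."""
    if filter_multi_mapped:
        # Zero mapping quality is always filtered; a larger threshold
        # additionally filters reads with mapq below that value.
        if (flag & SECONDARY) or mapq < max(mapq_threshold, 1):
            return True
    if filter_duplicates and (flag & DUPLICATE):
        return True
    if filter_vendor_failed and (flag & QC_FAIL):
        return True
    return False


print(is_hidden(0x400, 60, filter_duplicates=True))
print(is_hidden(0, 5, filter_multi_mapped=True, mapq_threshold=10))
```

The same flag tests could be applied to reads loaded with any BAM library, since the bit values are part of the file format rather than of a particular tool.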

Layout Tab

On the Layout tab general plot controls can be changed. See Layout Controls for more information.

Detail View

Clicking on a read in the pile-up plot will display information about the read from the read alignment source in the data console. The information will include, if available, the read name, mapping quality, mate chromosome, mate position, template length, mismap probability, strand, several flag values, and the CIGAR operation string in tabular form. The entire sequence of the read will also be included along with the corresponding list of base quality scores.
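Mapping quality and mismap probability are related through the Phred scale used by the SAM/BAM format, where the mapping quality is -10·log10 of the probability that the mapping position is wrong. The sketch below shows that conversion; whether the data console derives its mismap probability in exactly this way is an assumption, and the function name is hypothetical.

```python
def mismap_probability(mapq):
    """Convert a Phred-scaled mapping quality to a mismap probability.

    In the SAM/BAM format a mapping quality of 255 means the value is
    not available, so None is returned in that case.
    """
    if mapq == 255:
        return None
    return 10 ** (-mapq / 10)


for mapq in (0, 10, 20, 60):
    print(mapq, mismap_probability(mapq))
```

Under this relation a mapping quality of 0 corresponds to a probability of 1, which is why zero quality alignments are commonly flagged or filtered.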

For wide-zoom views, clicking on a pile-up plot will display the maximum read depth for the aggregated region under the click position.

Clicking on a stacked histogram in the coverage plot will display information about the nucleotide counts at the associated genomic position. The information will include a table of matches, mismatches and deletions by type and nucleotide along with their counts, percentages, and mean qualities. It will also include a table of insertions and non-insertions existing at the nearest base-pair boundary along with their counts, percentages, and mean qualities.

For wide-zoom views, clicking on a coverage plot will display a table of mean read depth by strand (forward or reverse) for the aggregated region under the click position. The aggregation bin size will be shown above the table.

Allele Sequence Sources

This source contains allele sequence information. It is also known as a reference sequence.


Allele Sequence Source

Plot Description

The data displayed from this source depends on the zoom range and the height given to the plot. At zoom ranges greater than 500 base-pairs, the proportions of A, G, C, and T alleles (nucleotides) in the sequence are displayed as stacked histograms. At zoom ranges less than or equal to 500 base-pairs, the alleles of the forward strand sequence are displayed by themselves.

If the height of the allele sequence plot is increased to allow more drawing room, both the forward and reverse sequences are drawn with the forward strand on the top and the reverse strand on the bottom.

As the height of the plot is further increased hypothetical codon triplets are drawn based on both the forward and reverse strand sequences.
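The three quantities this plot draws (nucleotide proportions for a window, the reverse strand sequence, and candidate codon triplets) can be sketched as follows. The helper names are hypothetical, and the choices to exclude N from the proportions and to drop incomplete trailing triplets are assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def base_proportions(window):
    """Proportion of each of A, G, C, T among the called bases in a window."""
    counts = {base: window.count(base) for base in "AGCT"}
    total = sum(counts.values())
    return {base: (n / total if total else 0.0) for base, n in counts.items()}


def reverse_complement(sequence):
    """Reverse strand sequence, read 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def triplets(sequence, frame=0):
    """Split a sequence into complete codon-sized triplets for a reading frame."""
    return [sequence[i:i + 3] for i in range(frame, len(sequence) - 2, 3)]


forward = "ATGCCGTA"
print(base_proportions(forward))
print(reverse_complement(forward))
print(triplets(forward, frame=1))
print(triplets(reverse_complement(forward), frame=0))
```

Calling `triplets` with frames 0, 1, and 2 on both the forward sequence and its reverse complement gives the six possible reading frames.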

Controls

Display Tab

On the Display tab general plot controls can be changed. See Display Controls for more information.

Layout Tab

On the Layout tab general plot controls can be changed. See Layout Controls for more information.

Data Console

If the zoom range is greater than 500 base-pairs the nucleotide count and percentages for the particular stacked histogram clicked on will be displayed as well as the size of the window used to compute the stacked histogram.

If the zoom range is less than or equal to 500 base-pairs, and a single nucleotide is clicked on then the nucleotide and position information will be displayed in the console.

If hypothetical codons are visible and clicked on then information about the hypothetical codon will be displayed in the console.

Special Features

For gaps in a sequence a nucleotide of N will be displayed in the track. This indicates that there is no known nucleotide at that particular location.

Variant Sites

Sources contain categorical/string values for positions included in the track. Here the focus is on the location of the variant and the information contained for each variant as a whole.


Variant Site Plot

Plot Description

Variant sites mark the location of all variants in the source. If alleles are detected in the source, variants are colored based on their nucleotide bases or drawn as an insertion “I” bar or a deletion rectangle. If allele information is not detected in the source, variants are drawn as gray flags marking the location.

If there are numeric fields in a variant source, these fields can be plotted as a numeric value plot. See: Numeric Value Plot.

Controls

Display Tab

On the Display tab, the controls include:

  • Labels

The Labels control provides the ability to change the data field that provides the feature labels that are drawn on the plot at close enough zooms.

For the other controls that are common with most plot types, see Display Controls.

Filter Tab

A Filter can be used to control which features are drawn in the plot.

To add a filter either click on Insert or right-click anywhere in the Filter list box and select Insert.

Please see Filter Controls for more information.

Layout Tab

On the Layout tab general plot controls can be changed. See Layout Controls for more information.

Detail View

Clicking on a variant plot will print out information about the data source, including all of the fields in the source. Clicking on a data point in the plot will display the value and the label for the feature, as well as any applied styling.

General Control Panels

Display Controls

The display controls for a plot may contain the following options:

  • Chromosome Shading: This control enables or disables alternating background shading for adjacent chromosomes. The shading will only be visible when zoomed out far enough, regardless of this control’s setting.

  • Background: This control displays the current background color for a selected plot and allows a new background color to be set.

  • Feature Labels: This control enables or disables labeling of features within selected plot(s) or item(s).

  • Y-Range: Enter the numeric y-axis range to change the y-axis extents of the plot. If the current zoom mode is Fit Data or Auto, the zoom mode will be automatically changed to Hold when a new y-axis range is entered.

    Beneath the y-range control is a set of y-axis zoom mode selection buttons. Available zoom modes are:

    • Manual - The y-axis zoom is controlled manually and all zoom controls are enabled. This mode can be accessed using the hot-keys r or m.

    • Hold - The y-axis zoom is controlled manually but vertical panning on the plot canvas is disabled, protecting against accidental changes to the y-axis zoom. All zoom controls are enabled. This mode can be accessed using the hot-keys e or h.

    • Fit Data - The y-axis zoom is changed dynamically as the x-axis zoom changes to show all the data on the vertical axis. All vertical zoom controls are disabled. This mode can be accessed using the hot-keys w or f.

    • Auto - The y-axis zoom is changed dynamically as the x-axis zoom changes. When zooming in close on the x-axis the y-axis will be zoomed in as well to automatically improve the detail of the vertical axis in proportion to the horizontal axis. This mode is only available on Heat Map, Alignment Pile-up, Value, and Variant Map plots. It can be accessed using the hot-key q.

Filter Controls

There are two types of filter controls available. All plot types have the ability to filter data using feature filters. Variant Maps and Heat Maps also have the ability to filter data using sample filters.

See Expression Editor for more information.

General Filter

The general filter takes a feature level field such as Chromosome or other genome-wide data field available in the source and filters the data drawn based on the specified criteria.

Sample Filter

The sample filter takes a sample-wise field such as the sample names and removes samples from the entire plot if they do not meet the specified criteria.

Layout Controls

The layout controls for a plot may contain the following options:

  • Title: The check box indicates whether or not to display the title. The Edit button allows the title of the plot to be changed. Full font, color and styling controls are available. Quick edit of a title is available by double-clicking on the title in the plot view. Right-clicking on the title in the plot view or the plot tree also provides the title editing option.
  • Location: The check box indicates whether or not the location of the plot’s data sources should be displayed at the top right corner of the plot. The location can be a useful reference point at a glance and may help to differentiate between sources with the same or similar names.
  • X-Axis: Specifies whether or not to display an x-axis scale next to the plot.
    • Label: Specifies whether or not to show the x-axis label. An edit button is also provided to edit the x-axis label text and style.
  • Y-Axis: Specifies whether or not to show the y-axis scale next to the plot.
    • Label: Specifies whether or not to show the y-axis label. An edit button is also provided to edit the y-axis label text and style.
  • Height: Specify the height of the plot canvas in pixels. This control can be used to ensure that multiple plots are exactly the same height.