Custom Plotting Interface and Specialized Plots

Golden Helix SVS now (as of version 7.6.0) has integrated the Python matplotlib library to enable the creation of specialized plots via Python scripts and the custom plot interface. The addition of this library dramatically expands the number of plot types that can be created within Golden Helix SVS.

The utilities included within the custom plotting interface, as well as all of the specialized plots shipped with Golden Helix SVS with the exception of Dendrograms and Heatmap, are described below. (While Dendrograms and Heatmap is a customized plot feature, it is also considered a custom feature of RNA Sequencing Analysis, and as such, is presented in that section.)

Custom Plotting Interface

Viewing a Custom Plot

There are two ways to modify the view of a custom plot: panning mode and zooming mode. The user can switch to either mode by selecting the desired option in the View menu or by clicking its icon in the shortcuts toolbar. In point mode, the plot view cannot be modified.

Note

When zooming or panning within the Dendrograms and Heatmap plot (see Dendrograms and Heatmap), the motions of the dendrograms and the heatmap will be coordinated with each other.

The user can undo a view change by clicking on the back arrow in the shortcuts toolbar or choosing View > Back. The user can undo all view changes by clicking on the green icon in the shortcuts toolbar or choosing View > Reset to Original.

Note

While panning and zooming within a 3D plot are available, the mechanisms work completely differently. Please see Scatter Plot 3D for more details.

Configure Subplots

This dialog allows the user to customize the location of the axes with respect to the figure canvas. If the canvas contains several plots in the form of a grid, the vertical and horizontal space within the grid can also be customized. To open the dialog, either choose Edit > Subplots Config or click on the 5th icon from the left in the shortcuts toolbar.

The Subplot Configuration Tool contains six adjustable bars. Four correspond to the positions of the axes edges (top, left, right and bottom), and two correspond to the spacing between plots: wspace for the horizontal space and hspace for the vertical space. The last two are only applicable if the figure canvas contains several plots. For example, moving the top bar down will immediately move the top axis of the open plot down. The options take effect immediately, and there is no need to save in this dialog.
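The six bars carry the same names as the keyword arguments of matplotlib's Figure.subplots_adjust, so the effect of the dialog can be sketched in a script. This is a minimal illustration that assumes the dialog maps directly onto that call. The 2x2 grid and the numeric values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# A 2x2 grid, so that the spacing parameters have a visible effect.
fig, axes = plt.subplots(2, 2)

# Edge positions are fractions of the figure width or height.
fig.subplots_adjust(
    left=0.10,
    right=0.95,
    bottom=0.10,
    top=0.90,
    wspace=0.30,  # horizontal gap, as a fraction of the average axes width
    hspace=0.40,  # vertical gap, as a fraction of the average axes height
)

# The current layout can be read back from the figure.
params = fig.subplotpars
print(params.left, params.right, params.bottom, params.top,
      params.wspace, params.hspace)
```

As in the dialog, these settings live only on the in-memory figure and are lost unless the figure is saved as an image.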

Note

Edits will not be saved after the plot is closed. The user must save the plot as an image to save the customizations in a permanent manner.

Note

This feature is not available for the Dendrograms and Heatmap plot (see Dendrograms and Heatmap) or for the 3D plot (see Scatter Plot 3D).

Customize Figure

This dialog allows the user to customize several features about the plot (or each subplot if applicable). A tab will exist for each modifiable object in the plot. These objects will always include the “Axes” object of the plot and may include several tabs corresponding to different lines in the plot. To open the dialog, either choose Edit > Customize or click on the 6th icon from the left in the shortcuts toolbar.

The Axes tab will include options to modify the Title of the plot, the minimum and maximum value of both axes, the labels of both axes and the scales of both axes. The _lineX tabs will include options to make the label more informative (e.g. Female Median Bar) and modify the style, width and color of the line. The user can also add a marker to the line with style, size and color options. Each line in the plot will have its own tab and can be customized individually.
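For readers who script their own plots, the options on these tabs resemble standard matplotlib setter calls. The sketch below is an illustration under that assumption and does not show how the dialog is implemented. The data, labels and colors are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
(line,) = ax.plot([1, 2, 3, 4], [10, 100, 1000, 10000])

# Counterparts of the Axes tab: title, axis limits, axis labels and scales.
ax.set_title("Example title")
ax.set_xlim(0, 5)
ax.set_ylim(1, 100000)
ax.set_xlabel("Sample index")
ax.set_ylabel("Value")
ax.set_yscale("log")

# Counterparts of a line tab: label, line style, width and color.
line.set_label("Female Median Bar")
line.set_linestyle("--")
line.set_linewidth(2.0)
line.set_color("tab:red")

# Optional marker with its own style, size and color.
line.set_marker("o")
line.set_markersize(6)
line.set_markerfacecolor("black")
```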

If the figure canvas contains several plots, the user will be prompted to select which plot to edit before the Customize dialog appears.

Click OK to apply the options and view the updated plot.

Note

Edits will not be saved after the plot is closed. The user must save the plot as an image to save the customizations in a permanent manner.

Note

For any categorically-labeled axis, only the axis label is available for modification.

Note

This feature is not available for the Dendrograms and Heatmap plot (see Dendrograms and Heatmap) or for the 3D plot (see Scatter Plot 3D).

Saving a Plot

Choose File > Save as... to save the image.

Autocorrelation Plots

This feature will create one or more autocorrelation plots from the selected columns. The additional options are the confidence interval level (entered as a percentage, such as 95 for 95% confidence intervals), whether to use an unbiased calculation of the auto-covariance, and whether to use a Fast Fourier Transform to calculate the autocorrelation.

Note

This feature uses the statsmodels.graphics.tsaplots.plot_acf function to create the plot. That function is based on the matplotlib.pyplot.xcorr function. Please see the statsmodels and matplotlib documentation for more information.
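To make the options concrete, the following pure-Python sketch shows the difference between the biased and unbiased auto-covariance and the conversion of a percentage confidence level into a significance level. It is not the statsmodels implementation. The function names and the sample series are hypothetical, and the sketch assumes the biased estimate divides by n while the unbiased estimate divides by n minus the lag.

```python
def confidence_to_alpha(confidence_percent):
    """Convert a percentage such as 95 into a significance level such as 0.05."""
    return 1.0 - confidence_percent / 100.0


def autocorrelation(values, max_lag, unbiased=False):
    """Return autocorrelations for lags 0..max_lag by direct summation."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]

    def autocovariance(lag):
        total = sum(centered[t] * centered[t + lag] for t in range(n - lag))
        # Biased estimate divides by n; unbiased divides by the number of terms.
        return total / ((n - lag) if unbiased else n)

    variance = autocovariance(0)
    return [autocovariance(lag) / variance for lag in range(max_lag + 1)]


series = [1.0, 2.0, 3.0, 4.0, 5.0]
biased = autocorrelation(series, max_lag=2)
adjusted = autocorrelation(series, max_lag=2, unbiased=True)
alpha = confidence_to_alpha(95)
print(biased, adjusted, alpha)
```

A Fast Fourier Transform computes the same sums more efficiently for long series, so that option should affect speed rather than the plotted values.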

Example Autocorrelation Plot

Autocorrelation Plot with 95% Confidence Intervals

Columnwise Venn Diagram

This plot type allows the user to create a Venn diagram using two, three, four or five columns. All column types are allowed, but the inclusion specification will differ depending on the column type.

In the first dialog, select the columns you wish to plot. In the example below, five columns are selected, one of each type.

Columnwise Venn Diagram Step 1

Columnwise Venn Diagram - Step 1 dialog

In the second dialog, specify the inclusion criteria. If the selected column contains integer or real values, the criterion is specified with a logical comparison operator and a threshold value. For example, you could plot a Venn diagram of all samples with integer values below a chosen threshold in one column and all samples with real values > 1.1 in another column.

Binary, Genotype and Categorical inclusion criteria can be specified by checking the appropriate values to include.
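Conceptually, each column's inclusion criterion defines a set of samples, and each region of the Venn diagram counts the samples in a particular combination of those sets. The sketch below illustrates that idea with plain Python sets. The helper names, the sample data and the choice to exclude missing values are assumptions made for illustration and do not describe the shipped script.

```python
import operator

# Comparison operators that a threshold criterion might offer.
COMPARISONS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def threshold_members(values, op, threshold):
    """Indices of samples whose numeric value satisfies `value op threshold`."""
    compare = COMPARISONS[op]
    return {i for i, v in enumerate(values)
            if v is not None and compare(v, threshold)}


def checked_members(values, allowed):
    """Indices of samples whose value is one of the checked categories."""
    allowed = set(allowed)
    return {i for i, v in enumerate(values) if v in allowed}


# Hypothetical spreadsheet columns for six samples.
int_col = [-2, 0, 3, -1, 5, 0]
real_col = [0.5, 1.2, 2.0, 1.5, 0.1, 3.3]
color_col = ["Blue", "Red", "Yellow", "Green", "Blue", "Red"]

set_a = threshold_members(int_col, "<=", 0)
set_b = threshold_members(real_col, ">", 1.1)
set_c = checked_members(color_col, ["Blue", "Green", "Red"])

# The center of the diagram holds the samples that meet every criterion.
center = set_a & set_b & set_c
print(sorted(center), len(center))
```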

In the figure below, the center of the resulting Venn diagram will count all samples that have all of the following characteristics:

  • values <= 0 in column 1
  • values equal to 1 in column2
  • values equal to Blue, Green or Red in column4
  • values > 0 in column9
  • values equal to A_A, A_C or C_C in column13.

Columnwise Venn Diagram Step 2

Columnwise Venn Diagram - Step 2 dialog

In the final step, specify the node name, plot labels and circle colors for each column.

Columnwise Venn Diagram Step 3

Columnwise Venn Diagram - Step 3 dialog

The resulting plot will look similar to the figure below.

Columnwise Venn Diagram

Meta-Analysis Forest Plot

This plot type takes the output from Meta-Analysis and generates Forest Plots for the top results. See Meta-Analysis for more information.

In the first dialog, select how many results to generate a Forest Plot for.

Note

It is recommended that the data be sorted to put the most significant results at the top of the spreadsheet before running this script.

Forest Plot Dialog

Meta-Analysis Forest Plot Dialog

The result plot will look similar to the figure below. Plot controls can be used to modify shapes and colors.

Forest Plot

Meta-Analysis Forest Plot for Top Result

N by N Scatter Plots

This plot type allows the user to investigate how several numeric variables are related. The user chooses N numeric columns from a spreadsheet to create an NxN plot grid. The plots on the diagonal contain histograms and density plots, while each off-diagonal plot contains the scatter plot for the corresponding pair of columns.

Optionally the user can choose to apply a grouping variable to all scatter plots. The grouping variable must be in the form of a categorical column with at most 10 unique values. A legend will be added to the bottom of the plot grid.
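The layout described above can be sketched with plain matplotlib calls. This is an illustration only and not the shipped script: the column names and random data are hypothetical, the density curves on the diagonal are omitted for brevity, and the legend placement is an approximation.

```python
import random

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

random.seed(0)
n_samples = 60

# Hypothetical numeric columns and a categorical grouping column.
columns = {
    "height": [random.gauss(170, 10) for _ in range(n_samples)],
    "weight": [random.gauss(70, 12) for _ in range(n_samples)],
    "age": [random.uniform(20, 60) for _ in range(n_samples)],
}
groups = [random.choice(["Female", "Male"]) for _ in range(n_samples)]

names = list(columns)
n = len(names)
levels = sorted(set(groups))

fig, axes = plt.subplots(n, n, figsize=(8, 8), squeeze=False)
for row, y_name in enumerate(names):
    for col, x_name in enumerate(names):
        ax = axes[row][col]
        if row == col:
            # Diagonal: distribution of a single column.
            ax.hist(columns[x_name], bins=10)
        else:
            # Off-diagonal: one scatter series per group level.
            for level in levels:
                idx = [i for i, g in enumerate(groups) if g == level]
                ax.scatter([columns[x_name][i] for i in idx],
                           [columns[y_name][i] for i in idx],
                           s=8, label=level)
        if row == n - 1:
            ax.set_xlabel(x_name)
        if col == 0:
            ax.set_ylabel(y_name)

# A single legend for the whole grid, placed along the bottom edge.
handles, labels = axes[0][1].get_legend_handles_labels()
fig.legend(handles, labels, loc="lower center", ncol=len(levels))
```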

3x3 Scatterplot split on Gender

Scatter Plot 3D

This plot type allows you to investigate directly how three numeric variables (corresponding to three columns you choose from a spreadsheet) are related. The scatter points will be placed into a virtual three-dimensional plot which is shown as a two-dimensional view on your screen.

You may “pan”, or change your point of view of this three-dimensional plot, by holding down the left mouse button and moving the mouse to rotate the view in any direction you want.

You may zoom the (virtual) three-dimensional plot itself by holding down the right mouse button and moving the mouse up or down. The scatter points will appear to blow up or shrink down in 3D space, and the axis coordinates will be adjusted accordingly.

Note

The “point”, “pan” and “zoom” buttons in the toolbar and the corresponding options in the View menu have no effect on 3D plots. Panning with the left mouse button and zooming with the right mouse button are always available for 3D plots.

You may optionally choose to apply a grouping variable to the scatter plot. The grouping variable must come from a categorical column, and must have at most 10 unique values. A legend will be placed under the 3D plot image explaining the groupings.
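A comparable grouped 3D scatter plot can be sketched with matplotlib's 3D axes. The sketch below uses hypothetical random data and group names, and it sets the viewing angle programmatically with view_init as a stand-in for rotating with the mouse. It does not show how the shipped plot is built.

```python
import random

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401  (needed on older matplotlib)

random.seed(1)

# Hypothetical rows: three numeric values plus a categorical group.
points = [
    (random.random(), random.random(), random.random(),
     random.choice(["A", "B", "C"]))
    for _ in range(90)
]

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")

# One scatter series per group level, so that each gets a legend entry.
for level in sorted({p[3] for p in points}):
    subset = [p for p in points if p[3] == level]
    ax.scatter([p[0] for p in subset],
               [p[1] for p in subset],
               [p[2] for p in subset],
               label=level)

ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")

# Elevation and azimuth, in degrees, define the point of view.
ax.view_init(elev=30, azim=45)

# Legend placed underneath the plot area.
ax.legend(loc="upper center", bbox_to_anchor=(0.5, -0.05), ncol=3)
```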

3D Scatterplot split on Ethnicity

Plot Proportion by Group with Confidence Intervals

This plot can be used to investigate the proportion of cases (or 1’s) in a binary column over several groups. The user must select a binary column and a group column of type categorical, genotypic or binary.

The proportion of 1’s is calculated for each category as p = \frac{\sum(column)}{n}, where n is the number of non-missing values. The sum of the column is equal to the number of times a 1 occurs. Missing values are not included. The margin of error is calculated as 1.95 \sqrt{p(1-p)/n}. The endpoints of the confidence interval are calculated by subtracting the margin of error from, and adding it to, the proportion. The endpoints are adjusted if they fall outside of the range [0,1].
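The calculation can be checked by hand with a short pure-Python sketch. The function name and sample data are hypothetical, and the sketch assumes that "adjusted" means the endpoints are clipped to 0 and 1. The multiplier defaults to the 1.95 stated above; the usual normal quantile for a 95% interval is approximately 1.96, so the argument is left configurable.

```python
import math


def proportion_with_interval(values, z=1.95):
    """Return (p, lower, upper) for a binary column; None marks a missing value."""
    observed = [v for v in values if v is not None]
    n = len(observed)
    if n == 0:
        raise ValueError("no non-missing values in this group")
    p = sum(observed) / n  # the sum counts the 1's
    margin = z * math.sqrt(p * (1 - p) / n)
    # Assumed adjustment: clip the endpoints to the valid range [0, 1].
    lower = max(0.0, p - margin)
    upper = min(1.0, p + margin)
    return p, lower, upper


# Hypothetical binary column split by a grouping column.
by_group = {
    "Female": [1, 1, 0, 0, None],
    "Male": [1, 0],
}
results = {name: proportion_with_interval(vals)
           for name, vals in by_group.items()}
for name, (p, lower, upper) in results.items():
    print(name, p, lower, upper)
```

For the first group, p = 2/4 = 0.5 and the margin is 1.95 × 0.25 = 0.4875, which gives an interval of roughly 0.0125 to 0.9875. For the second group, the unclipped interval extends beyond both 0 and 1, so both endpoints are clipped.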

Proportion of Cases by Gender with 95% confidence intervals

Side by Side Box Plots

This plot can be used to investigate the distribution of a numeric variable over several groups. The user must select a numeric (real or integer) column and a group column of type categorical, genotypic or binary.

Assuming N unique values exist in the grouping column, N vertical box plots will be created.
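The grouping step can be sketched as follows: values are collected per group, and one vertical box is drawn for each unique group value. The data and group names are hypothetical, and skipping rows with a missing value or missing group is an assumption.

```python
from collections import defaultdict

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical numeric column and grouping column (None marks a missing value).
values = [4.1, 5.3, 6.0, 3.8, 7.2, 5.5, None, 6.6, 4.9, 5.1]
groups = ["A_A", "A_C", "C_C", "A_A", "C_C", "A_C", "A_A", "C_C", "A_A", "A_C"]

# Collect the non-missing values for each group.
by_group = defaultdict(list)
for value, group in zip(values, groups):
    if value is not None and group is not None:
        by_group[group].append(value)

names = sorted(by_group)

fig, ax = plt.subplots()
# One vertical box per unique group value, drawn at positions 1..N.
result = ax.boxplot([by_group[name] for name in names])
ax.set_xticks(range(1, len(names) + 1))
ax.set_xticklabels(names)
ax.set_ylabel("Value")
```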

Box plots of a continuous phenotype split on gender

Stacked Histogram

A stacked histogram can be used to show how the distributions of several data subsets relate to the overall data distribution. The user must select a numeric (real or integer) column and a group column of type categorical, genotypic or binary. The grouping column must have at most 10 unique values.

In the stacked histogram the group order is sorted such that the largest group is on the bottom and the smallest group is on the top. A legend will be added to the top right corner of the plot.
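The ordering rule can be sketched with matplotlib's stacked histogram, where the first dataset passed is drawn at the bottom of each bar. The data and group names below are hypothetical, and the sketch does not show how the shipped plot is built.

```python
import random

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

random.seed(2)

# Hypothetical continuous phenotype split on a binary phenotype.
by_group = {
    "Case": [random.gauss(5.5, 1.0) for _ in range(40)],
    "Control": [random.gauss(5.0, 1.0) for _ in range(100)],
}

# Largest group first, so that it ends up at the bottom of the stack.
order = sorted(by_group, key=lambda name: len(by_group[name]), reverse=True)

fig, ax = plt.subplots()
counts, bin_edges, patches = ax.hist(
    [by_group[name] for name in order],
    bins=8,
    stacked=True,
    label=order,
)
ax.legend(loc="upper right")
```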

Stacked Histogram of a continuous phenotype split on a binary phenotype