# Custom Plotting Interface and Specialized Plots

As of version 7.6.0, Golden Helix SVS integrates the Python `matplotlib`
library to enable the creation of specialized plots via Python scripts and the
custom plot interface. The addition of this library dramatically expands the
number of plot types that can be created within Golden Helix SVS.

The utilities included within the custom plotting interface, as well
as all of the specialized plots shipped with Golden Helix SVS with the
exception of *Dendrograms and Heatmap*, are described
below. (While *Dendrograms and Heatmap* is a customized plot
feature, it is also considered a custom feature of
*RNA Sequencing Analysis*, and as such, is presented in that section.)

## Custom Plotting Interface

### Viewing a Custom Plot

There are two ways to modify the view of a custom plot: panning mode and zooming mode. The user can switch to either mode by selecting it in the **View** menu or by clicking its icon in the shortcuts toolbar. In point mode, the plot view cannot be modified.

Note

When zooming or panning within the Dendrograms and Heatmap
plot (*Dendrograms and Heatmap*), the motions of the
dendrograms and the heatmap will be coordinated with each other.

The user can undo a view change by clicking on the back arrow in the
shortcuts toolbar or choosing **View > Back**. The user can undo all
view changes by clicking on the green icon in the shortcuts toolbar or
choosing **View > Reset to Original**.

Note

While panning and zooming within a 3D plot are available,
the mechanisms work completely differently. Please see
*Scatter Plot 3D* for more details.

### Configure Subplots

This dialog allows the user to customize the location of the axes with
respect to the figure canvas. If the canvas contains several plots in
the form of a grid, the vertical and horizontal space within the grid
can also be customized. To open the dialog, either choose **Edit
> Subplots Config** or click on the 5th icon from the left in the
shortcuts toolbar.

The Subplot Configuration Tool contains six adjustable bars: four for
the positions of the axes (top, left, right and bottom) and two for the
spacing between plots, hspace (vertical) and wspace (horizontal). The
last two apply only if the figure canvas contains several plots. For
example, moving the **top** bar down immediately moves the top axis of
the open plot down. The options take effect immediately, so there is no
need to save in this dialog.
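
For readers scripting their own plots, the same six settings exist in `matplotlib` as the parameters of `Figure.subplots_adjust`. The following is a minimal sketch, assuming a standalone matplotlib figure rather than an SVS plot window; the 2 x 2 grid and the numeric values are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Illustrative 2 x 2 grid of empty subplots (not an SVS plot).
fig, axes = plt.subplots(2, 2)

# left/right/bottom/top are fractions of the figure width or height.
# wspace (horizontal gap) and hspace (vertical gap) are fractions of the
# average axes width and height, and only matter when there are several plots.
fig.subplots_adjust(left=0.10, right=0.95, bottom=0.10, top=0.85,
                    wspace=0.30, hspace=0.40)

# The current settings can be read back from the figure.
print(fig.subplotpars.top, fig.subplotpars.hspace)
```

Lowering `top` here has the same visual effect as dragging the **top** bar down in the dialog.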

Note

Edits will not be saved after the plot is closed. To keep the customizations permanently, the user must save the plot as an image.

Note

This feature is not available for either the Dendrograms and
Heatmap plot (*Dendrograms and Heatmap*) or the 3D plot
(*Scatter Plot 3D*).

### Customize Figure

This dialog allows the user to customize several features about the
plot (or each subplot if applicable). A tab will exist for each
modifiable object in the plot. These objects will always include the
“Axes” object of the plot and may include several tabs corresponding
to different lines in the plot. To open the dialog, either choose
**Edit > Customize** or click on the 6th icon from the left in the
shortcuts toolbar.

The Axes tab will include options to modify the Title of the plot, the
minimum and maximum value of both axes, the labels of both axes and
the scales of both axes. The **_lineX** tabs will include options to
make the label more informative (e.g. Female Median Bar) and modify
the style, width and color of the line. The user can also add a
marker to the line with style, size and color options. Each line in
the plot will have its own tab and can be customized individually.
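
The options in these tabs have direct counterparts in the `matplotlib` API. The following is a minimal sketch of the equivalent calls, assuming a standalone matplotlib figure with made-up data; it is not the code SVS runs, and the title, labels and colors are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
(line,) = ax.plot([1, 2, 3, 4], [10, 100, 1000, 10000])  # made-up data

# Counterparts of the Axes tab: title, scales, limits and labels.
ax.set_title("Example Title")
ax.set_yscale("log")
ax.set_xlim(0, 5)
ax.set_ylim(1, 100000)
ax.set_xlabel("Sample index")
ax.set_ylabel("Value")

# Counterparts of a line tab: label, line style, width, color and marker.
line.set_label("Female Median Bar")
line.set_linestyle("--")
line.set_linewidth(2.0)
line.set_color("green")
line.set_marker("o")
line.set_markersize(6)
line.set_markerfacecolor("red")
```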

If the figure canvas contains several plots, the user will be prompted to select which plot to edit before the Customize dialog appears.

Click **OK** to apply the options and view the result.

Note

Edits will not be saved after the plot is closed. To keep the customizations permanently, the user must save the plot as an image.

Note

For any categorically-labeled axis, only the axis label is available for modification.

Note

This feature is not available for either the Dendrograms and
Heatmap plot (*Dendrograms and Heatmap*) or the 3D plot
(*Scatter Plot 3D*).

### Saving a Plot

Choose **File > Save as...** to save the image.

## Autocorrelation Plots

This feature creates one or more autocorrelation plots from the selected columns. Additional options include the confidence interval level (entered as a percentage, such as 95 for 95% confidence intervals), whether to use an unbiased calculation of the auto-covariance, and whether to use a Fast Fourier Transform to calculate the autocorrelation.

Note

This feature uses the *statsmodels.graphics.tsaplots.plot_acf* function to
create this plot. That function is based on the *matplotlib.pyplot.xcorr*
function. Please see the statsmodels and matplotlib documentation for more
information.
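
To illustrate what the unbiased option changes, the following is a minimal pure-Python sketch. It assumes the usual definition, in which the auto-covariance at lag k is divided by n for the default (biased) estimate and by n - k for the unbiased one. The helper `autocorrelation` and the sample series are hypothetical, this is not the code SVS or statsmodels runs, and the FFT-based calculation is not shown.

```python
def autocorrelation(series, nlags, adjusted=False):
    """Autocorrelation at lags 0..nlags (hypothetical helper).

    adjusted=False divides each auto-covariance by n (biased estimate);
    adjusted=True divides by n - k (unbiased estimate).
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]

    def autocov(k):
        total = sum(centered[t] * centered[t + k] for t in range(n - k))
        return total / ((n - k) if adjusted else n)

    variance = autocov(0)
    return [autocov(k) / variance for k in range(nlags + 1)]


# The dialog takes the confidence level as a percentage; many statistical
# routines expect the corresponding significance level instead.
confidence_percent = 95
alpha = 1 - confidence_percent / 100

series = [1, -1, 1, -1, 1, -1, 1, -1]  # made-up alternating series
biased = autocorrelation(series, nlags=2)
unbiased = autocorrelation(series, nlags=2, adjusted=True)
print(biased)    # [1.0, -0.875, 0.75]
print(unbiased)  # [1.0, -1.0, 1.0]
```

The biased estimate shrinks toward zero as the lag grows because fewer products are summed while the divisor stays at n.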

## Columnwise Venn Diagram

This plot type allows the user to create a Venn diagram using two, three, four or five columns. All column types are allowed, but the inclusion specification differs depending on the column type.

In the first dialog, select the columns you wish to plot. In the example below, five columns are selected, one of each type.

In the second dialog, specify the inclusion criteria. If the column selected contains integer or real values, then the criteria can be specified by a logical comparison operator and a threshold value. In other words, you could plot a Venn diagram of all samples with integer values < 0 in one column and all samples with real values > 1.1 in another column.

Binary, Genotype and Categorical inclusion criteria can be specified by checking the appropriate values to include.

In the figure below, the center of the resulting Venn diagram will
count all samples that have **all** of the following characteristics:

- values <= 0 in column 1
- values equal to 1 in column 2
- values equal to Blue, Green or Red in column 4
- values > 0 in column 9
- values equal to A_A, A_C or C_C in column 13
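
The logic behind the center count can be sketched in a few lines of Python. This is a minimal illustration, assuming each column's inclusion criterion is a simple predicate and each sample is a row; the column keys, the sample rows and the variable names are all made up for the example and do not come from SVS.

```python
# Hypothetical sample rows keyed by illustrative column names.
rows = [
    {"column1": -2, "column2": 1, "column4": "Blue",   "column9": 0.7, "column13": "A_A"},
    {"column1": 0,  "column2": 1, "column4": "Red",    "column9": 2.1, "column13": "C_C"},
    {"column1": 3,  "column2": 1, "column4": "Green",  "column9": 1.5, "column13": "A_C"},
    {"column1": -1, "column2": 0, "column4": "Blue",   "column9": 0.2, "column13": "A_A"},
    {"column1": -5, "column2": 1, "column4": "Yellow", "column9": 4.0, "column13": "A_C"},
]

# One inclusion criterion per column, mirroring the list above.
criteria = {
    "column1": lambda v: v <= 0,                           # integer threshold
    "column2": lambda v: v == 1,                           # binary value
    "column4": lambda v: v in {"Blue", "Green", "Red"},    # categorical values
    "column9": lambda v: v > 0,                            # real threshold
    "column13": lambda v: v in {"A_A", "A_C", "C_C"},      # genotype values
}

# Each circle of the diagram is the set of samples meeting one criterion.
members = {
    col: {i for i, row in enumerate(rows) if test(row[col])}
    for col, test in criteria.items()
}

# The center region is the intersection of all circles.
center = set.intersection(*members.values())
print(len(center))  # 2
```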

In the final step, specify the node name, plot labels and circle colors for each column.

The resulting plot will look similar to the figure below.

## Meta-Analysis Forest Plot

This plot type takes the output from Meta-Analysis and generates Forest Plots
for the top results. See *Meta-Analysis* for more information.

Note

It is recommended that the data be sorted to put the most significant results at the top of the spreadsheet before running this script.

In the first dialog (*Meta-Analysis Forest Plot Dialog (Screen 1)*), select the number of
top results for which to generate a Forest Plot and the overall output column to
consider. Then click *Next >*.

The second dialog (*Meta-Analysis Forest Plot Dialog (Screen 2)*) will allow you to rename your
studies so that the labels of the plot can be more informative. By
default Study #1, Study #2, etc. will be used based on the column
headers of the Meta-Analysis output.

The output of this feature (*Meta-Analysis Forest Plot for Top Result*) will be one spreadsheet
and one plot for each marker selected in the first screen of the
dialog. The spreadsheet will contain the summary information for each
study along with overall information for the Meta-Analysis. The plot
will look similar to the figure below. Plot controls can be used to
modify shapes and colors on the plot.

Included in the plot is a vertical dashed line representing the overall meta-analyzed measure of effect, and a vertical solid line, which is the line of no effect. If the confidence interval for an individual study overlaps the line of no effect, this indicates that, at the given level of confidence, that individual study’s effect size does not significantly differ from “no effect”. The same applies to the meta-analyzed measure of effect: if the confidence interval around the diamond overlaps the line of no effect, the overall meta-analyzed result cannot be said to differ from “no effect” at the given level of confidence.
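
The overlap rule described above amounts to a simple interval check. The following sketch is illustrative only: the helper `overlaps_no_effect`, the study intervals and the overall interval are made up, and the position of the line of no effect is an assumption (0 for difference or log-scale measures; it would be 1 for ratio measures).

```python
def overlaps_no_effect(lower, upper, no_effect=0.0):
    """Return True if the interval [lower, upper] contains the no-effect value.

    no_effect=0.0 is an assumption suited to differences or log-scale effects;
    use 1.0 for measures reported as ratios.
    """
    return lower <= no_effect <= upper


# Hypothetical confidence intervals (lower, upper) for two studies.
studies = {"Study #1": (-0.10, 0.42), "Study #2": (0.05, 0.31)}
overall = (0.08, 0.27)  # hypothetical interval around the overall estimate

# True means the study's effect does not significantly differ from "no effect".
not_significant = {
    name: overlaps_no_effect(lo, hi) for name, (lo, hi) in studies.items()
}
overall_differs = not overlaps_no_effect(*overall)

print(not_significant)  # {'Study #1': True, 'Study #2': False}
print(overall_differs)  # True
```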

## N by N Scatter Plots

This plot type allows the user to investigate how several numeric variables are related. The user will choose N numeric columns from a spreadsheet and create an NxN plot grid. The diagonal will have histograms and density plots, while the off-diagonal plots will contain the corresponding scatter plots.

Optionally, the user can choose to apply a grouping variable to all scatter plots. The grouping variable must be in the form of a categorical column with at most 10 unique values. A legend will be added to the bottom of the plot grid.

## Scatter Plot 3D

This plot type allows you to investigate directly how three numeric variables (corresponding to three columns you choose from a spreadsheet) are related. The scatter points will be placed into a virtual three-dimensional plot which is shown as a two-dimensional view on your screen.

You may “pan”, or change your point of view of this three-dimensional plot, by holding down the left mouse button and moving the mouse to rotate the view in any direction you want.

You may zoom the (virtual) three-dimensional plot itself by holding down the right mouse button and moving the mouse up or down. The scatter points will appear to blow up or shrink down in 3D space, and the axis coordinates will be adjusted accordingly.

Note

Neither the “point”, “pan” and “zoom” buttons in the toolbar
nor the menu options in the **View** menu have any
effect on 3D plots. Panning with the left mouse button and
zooming with the right mouse button are always
available for 3D plots.

You may optionally choose to apply a grouping variable to the scatter plot. The grouping variable must come from a categorical column, and must have at most 10 unique values. A legend will be placed under the 3D plot image explaining the groupings.

## Plot Proportion by Group with Confidence Intervals

This plot can be used to investigate the proportion of cases (or 1’s) in a binary column over several groups. The user must select a binary column and a group column of type categorical, genotypic or binary.

The proportion of 1’s is calculated for each category as $\hat{p} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $n$ is the number of non-missing values and $x_i$ are the non-missing values in that category. The sum of the column is equal to the number of times a 1 occurs. Missing values are not included. A margin of error is then calculated for each category. The endpoints of the confidence interval are calculated by subtracting the margin of error from the proportion and adding it to the proportion. The endpoints are adjusted if they fall outside of the range [0,1].
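
As a rough illustration of this calculation, the following sketch assumes the common normal-approximation (Wald) margin of error, z * sqrt(p(1 - p) / n), and a 95% confidence level. Neither the formula nor the level is stated in this section, so both are assumptions, and the helper `proportion_ci` is hypothetical. Missing values are represented as `None`, and at least one non-missing value is assumed.

```python
import math
from statistics import NormalDist


def proportion_ci(values, confidence=0.95):
    """Proportion of 1's with a clamped confidence interval (hypothetical helper)."""
    observed = [v for v in values if v is not None]  # missing values are excluded
    n = len(observed)
    p = sum(observed) / n  # the sum equals the number of 1's

    # ASSUMPTION: Wald margin of error based on the normal approximation.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p * (1 - p) / n)

    # Endpoints are adjusted so they stay within [0, 1].
    lower = max(0.0, p - margin)
    upper = min(1.0, p + margin)
    return p, lower, upper


# Made-up binary values for one group, with two missing entries.
p, lower, upper = proportion_ci([1, 0, None, 1, 0, 1, 1, None, 0, 1])
print(p)  # 0.625
```

With very small groups the unadjusted endpoints easily leave [0, 1], which is why the clamping step matters; for example, `proportion_ci([1, 0])` returns `(0.5, 0.0, 1.0)` under these assumptions.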

## Side by Side Box Plots

This plot can be used to investigate the distribution of a numeric variable over several groups. The user must select a numeric (real or integer) column and a group column of type categorical, genotypic or binary.

Assuming N unique values exist in the grouping column, N vertical box plots will be created.

## Stacked Histogram

A stacked histogram can be used to show how the distributions of several data subsets relate to the overall data distribution. The user must select a numeric (real or integer) column and a group column of type categorical, genotypic or binary. The grouping column must have at most 10 unique values.

In the stacked histogram the group order is sorted such that the largest group is on the bottom and the smallest group is on the top. A legend will be added to the top right corner of the plot.
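
The ordering rule can be reproduced in a standalone `matplotlib` script. The following is a minimal sketch with made-up values and group labels; it is not the SVS implementation. It relies on the fact that `Axes.hist` with `stacked=True` draws the first dataset at the bottom of each bar.

```python
from collections import defaultdict

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up numeric column and grouping column.
values = [1.0, 1.5, 2.0, 2.2, 2.8, 3.1, 3.3, 4.0]
groups = ["A", "B", "A", "A", "C", "B", "A", "B"]

# Split the numeric values by group.
by_group = defaultdict(list)
for value, group in zip(values, groups):
    by_group[group].append(value)

# Largest group first, so it ends up at the bottom of the stack.
ordered = sorted(by_group, key=lambda g: len(by_group[g]), reverse=True)

fig, ax = plt.subplots()
counts, edges, _ = ax.hist([by_group[g] for g in ordered],
                           bins=4, stacked=True, label=ordered)
ax.legend(loc="upper right")

print(ordered)  # ['A', 'B', 'C']
```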