Frequently Asked Questions

Can you walk me through installing and registering SVS 8.4.0 on my Windows machine?

Download the SVS executable (e.g., SVS-Win64-8.4.0.exe) to your machine from the link provided, then double-click the executable to begin the installation.

Install SVS

Figure 1-1. SVS Setup Wizard - 1st Dialog

  1. Click Next > on the first dialog (Figure 1-1).

Figure 1-2. SVS Setup Wizard - 2nd Dialog

  2. Select the SVS install directory. By default the setup wizard creates a folder called Golden Helix SVS in C:\Program Files\. Both the path and the folder name can be changed to the user's preference. Then click Next > (Figure 1-2).

Figure 1-3. SVS Setup Wizard - 3rd Dialog

  3. The third dialog gives you the option to create a Start Menu folder for SVS. Select your options and click Next > (Figure 1-3).

Figure 1-4. SVS Setup Wizard - 4th Dialog

  4. The fourth dialog allows you to create Quick Launch and Desktop icons. Select your options and click Next > (Figure 1-4).
  5. The fifth dialog is a summary of the options selected on the previous dialogs. Click Install to finish the install process, or click < Back to make any necessary changes to the selected options.
  6. Check Launch SVS on the last install dialog and click Finish.

Login and Register Dialog

If you are opening SVS for the first time, or for the first time on a particular machine, the first dialog you will see on launching SVS is the dialog to log in or register your SVS license. You may also see this dialog if you chose not to have your credentials saved or have previously logged out of SVS.

Register

If you do not have an existing Golden Helix account, click on the Register tab and fill out the registration form. Once the required fields have been filled in and the license agreement has been accepted, the Register & Log In button will become active. You can optionally uncheck the “Stay logged in” option to have SVS not store credentials locally and require logging in again after relaunching the software. See Figure 1-5.

The Golden Helix account will also be the credentials used to log into the answers.goldenhelix.com community site.

SVS Welcome Screen

Figure 1-5. SVS Welcome Screen - Register Tab

Login

If you have an existing Golden Helix account with a valid SVS license, enter your email address and password. You can optionally uncheck the “Stay logged in” option to have SVS not store credentials locally and require logging in again after relaunching the software. After the account information is filled in, click Log In. See Figure 1-6.

Log In Page

Figure 1-6. Log in to access SVS

Activate a License Key

Once logged in, if it is the first time logging in on a particular computer, you will need to activate a license key. Either click on Activate License Key in the lower right hand corner of the Welcome Screen, or go to Help > Activate a SVS License Key. Type or paste in the provided license key and press Verify. Once the key has been verified, the Activate button will become active if there are available activations. To activate SVS for the particular user account and machine, click Activate.

SVS can be used with limited capabilities without an active license key. This is called “Viewer Mode”.

Where are the downloaded genome assemblies and annotation data saved?

All user-downloaded content is saved in the AppData directory on your computer by default. You can find a listing of the paths from the SVS Welcome Screen, before opening a project, by going to Tools > Global Product Options. The paths are listed under the Applications options list, in the Custom Paths section.

AppData Custom Paths

Custom Download Paths

Any of these paths can be changed by selecting the Browse button and navigating to the preferred directory. For example, since annotation data sources can take up a significant amount of space, it is acceptable to move the tracks to an external or network drive so they are not using up the available disk space on your C drive.

To view the downloaded content in these locations you can go to Tools > Open Folder and select the corresponding folder.
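Before moving a download folder, it can help to know how much disk space it currently uses. The following Python sketch is one way to measure that. The function name is illustrative only, and the folder you pass in would be whichever path is listed under Custom Paths on your machine; the demo uses a temporary folder as a stand-in.

```python
import tempfile
from pathlib import Path


def folder_size_bytes(folder):
    """Total size in bytes of all files under folder, including subfolders."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


# Demo on a throwaway folder standing in for an annotations directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "tracks").mkdir()
    (Path(tmp) / "tracks" / "example_track.bin").write_bytes(b"\0" * 2048)
    print(folder_size_bytes(tmp))  # 2048
```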

I have a lot of RAM installed on my computer. How can SVS take advantage of that?

SVS has two settings that allow users to control how much RAM SVS can access. From the SVS Welcome Screen, before opening a project, go to Tools > Global Product Options and select Memory Usage on the Applications options list.

Memory Usage Settings

Dataset memory cache limit - This will most noticeably impact the performance of opening and navigating spreadsheets in SVS. We use this number to keep as much of the dataset in memory as possible. If it’s large enough to encompass the entire dataset then most plotting and analysis operations will see massive improvements on their second and subsequent passes over the data (the first pass brings it into the cache). This setting is a soft memory limit for ALL the datasets in a project though so it will dynamically discard the least recently accessed chunks (columns) of datasets as you navigate around your project looking at different spreadsheets.

Transpose and Analysis Memory Usage - This is a soft threshold for how much memory we should take advantage of when doing a dataset transpose operation or other memory-intensive analysis operations. The only analysis operations that utilize this are ones that are in essence doing an in-place transpose, such as the CNV segmentation algorithm, which segments samples (rows) while the data is stored on disk in columns (LogR values at a single genomic position). Some import operations that also do an internal transpose, such as Affymetrix CEL import, use this threshold as well. Because transpose operations are essentially about reading data from disk in an inefficient manner, the more memory utilized, the fewer disk accesses required and hence the faster the operation. If the whole dataset can be held in the provided limit, the transpose operation will be optimally fast.

We recommend setting these limits to no more than 50% of the total available RAM on the machine, because these thresholds are soft limits and do not represent the total amount of memory SVS will consume.
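As a rough illustration of the 50% guideline, the Python sketch below computes an upper bound for a limit from the amount of RAM installed. The function name, and the choice to report the result in megabytes, are assumptions made for this example; check the units shown in the Memory Usage dialog before entering a value.

```python
def max_recommended_limit_mb(total_ram_gb, fraction=0.5):
    """Upper bound (in MB) for a memory limit.

    fraction is the share of installed RAM to allow; 0.5 follows the
    50% guideline in this FAQ.
    """
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return int(total_ram_gb * 1024 * fraction)


# Example: a machine with 32 GB of RAM installed.
print(max_recommended_limit_mb(32))  # 16384
```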

I have downloaded an add-on script. How can I use it?

Once you have downloaded the script to your computer from our Scripts Repository, you will need to save it in your SVS User Scripts location, which can be found by going to Tools > Open Folder > User Scripts Folder.

User Scripts Folder

Top Level User Scripts Folder

The user scripts directory is arranged to mirror the SVS menu structure so that scripts can be accessed similarly to tools that are already available within the software. Any script that needs to be run from a spreadsheet should be saved in the corresponding Spreadsheet folder, and any script that needs to be run from the SVS Project Navigator should be saved in the corresponding SVS folder. All scripts available from our website list the recommended directory location both on the website and in the included PDF documentation. Below are two specific examples.

Example 1: Linear and Logistic Regression with Interactions

This script needs to be run from a spreadsheet with numeric values so it will be saved in the /Spreadsheet/Numeric/ folder.

Numeric folder

Numeric User Scripts Folder

Example 2: Import Unsorted VCF Files

This script needs to be run from the SVS Project Navigator as a new spreadsheet will be created so it will be saved in the /SVS/Import/ folder.

Import folder

Import User Scripts Folder
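Copying a script into place can also be done programmatically. The following Python sketch copies a downloaded file into a subfolder of the User Scripts folder, creating the subfolder if it is missing. The function name, the file name, and the folder used in the demo are hypothetical; on a real machine the root would be the folder opened by Tools > Open Folder > User Scripts Folder.

```python
import shutil
import tempfile
from pathlib import Path


def install_user_script(script_file, user_scripts_root, menu_subfolder):
    """Copy a downloaded script into the matching User Scripts subfolder.

    menu_subfolder is the recommended location, e.g. "Spreadsheet/Numeric".
    """
    target_dir = Path(user_scripts_root) / menu_subfolder
    target_dir.mkdir(parents=True, exist_ok=True)  # create missing folders
    return Path(shutil.copy2(script_file, target_dir))


# Demo with stand-in paths inside a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    downloaded = Path(tmp) / "example_script.py"  # hypothetical file name
    downloaded.write_text("# placeholder script\n")
    user_scripts = Path(tmp) / "UserScripts"  # stand-in for the real folder
    installed = install_user_script(downloaded, user_scripts, "Spreadsheet/Numeric")
    print(installed.relative_to(user_scripts).as_posix())
    # Spreadsheet/Numeric/example_script.py
```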

How can I back-up my SVS project or recover data from a corrupt project?

Back-up Copy

All data from an SVS project is stored in the project folder. To find the save location go to Tools > Open Folder > Project Folder. This location contains the project file (Project_Name.ghp) as well as several folders (coverage, Data, genomes, map, tmp, etc.) that contain the actual data that is inside of the project.

The easiest way to back up an SVS project is to make a copy of the project through the SVS menu options. From the open project in SVS go to File > Save a Copy of Project. This will prompt you for the save location and the name of the copy.

Save Copy of Project

Save a Copy of SVS Project
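Because all project data lives in the project folder, a file-level copy of that folder is another way to keep a backup. The Python sketch below copies a project folder to a backup location with a timestamp in the name. It assumes the project is closed while the copy is made; the function name and the demo paths are illustrative only, and the menu option above remains the documented route.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def backup_project_folder(project_dir, backup_root):
    """Copy a whole project folder to backup_root, adding a timestamp suffix."""
    project_dir = Path(project_dir)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    destination = Path(backup_root) / f"{project_dir.name}_{stamp}"
    shutil.copytree(project_dir, destination)  # fails if destination exists
    return destination


# Demo on a fake project folder inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "Project_Name"
    (project / "map").mkdir(parents=True)
    (project / "Project_Name.ghp").write_text("placeholder")
    copy = backup_project_folder(project, Path(tmp) / "backups")
    print((copy / "Project_Name.ghp").exists())  # True
```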

Recover Data

If you are unable to open an SVS project, there are a couple of options to recover the data. Start by opening the project folder (Tools > Open Folder > Project Folder).

  1. When a project is opened in SVS a temporary project file Project_Name.ghp.tmp will be created in the same location as the original file. You can try renaming this file by removing the .tmp extension and then opening the renamed file.

  2. All of the project data (except plots) is stored in the /data/ and /map/ folders at this location. Spreadsheet data is stored in the /data/ folder in DSF format, and marker map files are stored in the /map/ folder in DSM format.

    The DSF files can be dragged and dropped into a new project to recreate the spreadsheets. See Golden Helix DSF File.

    The DSM files will need to be moved to your Marker Maps folder (Tools > Open Folder > Marker Maps Folder) and then reapplied to each spreadsheet (File > Apply Genetic Marker Map).

    Note

    When you use the raw data stored in the project folder to recreate your project, the spreadsheet order and the child/parent relationships between spreadsheets are lost. Each DSF file creates a Top-Level spreadsheet node in the new project, in the order in which the files are imported.
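To see what is available for recovery, the files described above can be listed with a short Python sketch. It assumes the temporary project file ends in .ghp.tmp, as stated above, and that DSF and DSM files use the lowercase extensions .dsf and .dsm, which is an assumption of this example. The function name is illustrative only.

```python
import tempfile
from pathlib import Path


def recovery_candidates(project_dir):
    """List files in a project folder that may help recover its data."""
    project_dir = Path(project_dir)
    return {
        # Temporary project file created when the project is opened.
        "temp_project_files": sorted(project_dir.glob("*.ghp.tmp")),
        # Spreadsheet data (extension .dsf is assumed).
        "spreadsheets": sorted((project_dir / "data").glob("*.dsf")),
        # Marker maps (extension .dsm is assumed).
        "marker_maps": sorted((project_dir / "map").glob("*.dsm")),
    }


# Demo on a fake project folder.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "data").mkdir()
    (project / "map").mkdir()
    (project / "Project_Name.ghp.tmp").write_text("")
    (project / "data" / "sheet1.dsf").write_text("")
    (project / "map" / "map1.dsm").write_text("")
    found = recovery_candidates(project)
    print({kind: len(files) for kind, files in found.items()})
    # {'temp_project_files': 1, 'spreadsheets': 1, 'marker_maps': 1}
```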

How can I activate my variants based on regions defined in a BED file?

GeneName     TranscriptName  Chr     StartPosition   EndPosition     Strand
ATAD3B       NM_031921       1       1407164         1431582         +
NADK         NM_001198995    1       1682671         1690081         -
PODN         NM_001199080    1       53527724        53551174        +
PODN         NM_001199081    1       53527885        53551174        +
CSF1R        NM_005211       5       149432854       149492935       -
NKX2-5       NM_001166175    5       172659107       172662315       -
BDNF         NM_170733       11      27676442        27723180        -
DZANK1       NM_001099407    20      18364011        18447829        -
TLR8         NM_138636       X       12924739        12941288        +

SVS can annotate or filter directly from the BED file. From your variant spreadsheet go to DNA-Seq > Annotate and Filter Variants, Add the BED file (GeneRegion) in the Select Source dialog, and then click Next >.

Select BED File

Then on the options dialog select to filter variants not in overlapping regions and select your options for the information you would like included in the annotation output spreadsheet.

Filter Options

Choose Filter Options

In the original spreadsheet only those variants in the defined regions will be left active, and a subset spreadsheet will be created for those active variants.
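The idea behind region-based filtering can be sketched in a few lines of Python. This is not how SVS implements the filter; it only illustrates the overlap test, using a few rows of the region table above. The function names are hypothetical, and treating both StartPosition and EndPosition as inclusive is an assumption of this example.

```python
from collections import defaultdict

# A few rows copied from the region table above.
REGIONS_TEXT = """\
GeneName TranscriptName Chr StartPosition EndPosition Strand
ATAD3B NM_031921 1 1407164 1431582 +
PODN NM_001199080 1 53527724 53551174 +
PODN NM_001199081 1 53527885 53551174 +
TLR8 NM_138636 X 12924739 12941288 +
"""


def load_regions(text):
    """Parse the whitespace-delimited table into {chromosome: [(start, end, transcript)]}."""
    regions = defaultdict(list)
    for line in text.strip().splitlines()[1:]:  # skip the header row
        gene, transcript, chrom, start, end, strand = line.split()
        regions[chrom].append((int(start), int(end), transcript))
    return regions


def transcripts_overlapping(regions, chrom, position):
    """Transcripts whose region contains the position (both ends inclusive)."""
    return [t for start, end, t in regions.get(chrom, []) if start <= position <= end]


regions = load_regions(REGIONS_TEXT)
print(transcripts_overlapping(regions, "1", 53527800))  # ['NM_001199080']
print(transcripts_overlapping(regions, "1", 53527900))  # ['NM_001199080', 'NM_001199081']
print(transcripts_overlapping(regions, "X", 1))         # []
```

A variant would stay active when the returned list is non-empty and be inactivated otherwise.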

How can I add Gene Name or RS ID to my spreadsheet’s marker map?

We have an add-on script Add Annotation Data to Marker Map that can do this for you.

  • You will need to download a local copy of the annotation track that contains the information you want added to your map. Go to Tools > Manage Data Sources, select the track from our Public Annotations repository, and click Download in the lower left corner.

    • For adding gene names, any of our gene annotation sources can be used. For example, if you have human data from the GRCh_37_g1k build, then Ensembl Genes 75, Ensembl can be used to add Ensembl gene names to an existing marker map.
    Download Gene Annotation Source

    Download Ensembl Gene Annotation

    • For adding RS IDs, any of the dbSNP annotation sources can be used, for example dbSNP 138, NCBI.

      Download RS ID Annotation Source

      Download dbSNP Annotation

  • Now launch the script from your spreadsheet and select the downloaded annotation source. This tool will create an augmented marker map in your marker maps folder, so give the new map an informative name and click Next >.

Select dbSNP Track

Select dbSNP Track for Annotation

  • Choose the field in the track that contains the required information. For RS ID the Identifier field from dbSNP should be used and for gene name the corresponding name field should be used.
Select Field to Add

Select dbSNP Identifier Field

Note

This script can only add one field at a time to the marker map, so if you would like to add additional fields from the track you will need to repeat this process for each subsequent field.

I get a warning about duplicate markers when trying to append two datasets together. How can I inactivate these duplicates?

Append Error Message

Error message when Appending

We have a script Inactivate Duplicate Column Headers that can solve this issue.

  • From each spreadsheet that you will be appending, run the script to inactivate the duplicates. The script will keep the first appearance of the duplicate and inactivate each subsequent appearance of that column header.
Dups Inactivated

Number of Duplicates Inactivated

  • Once the duplicates are inactive you should be able to run the Append Spreadsheet function without error.

Note

We also have scripts available that can Inactivate Duplicate Row Labels and Inactivate Duplicate Row Values. These scripts allow you to either inactivate all but the first occurrence of the duplicate or all occurrences.
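The two duplicate-handling behaviours described above (keep the first occurrence, or inactivate every occurrence) can be illustrated with a small Python sketch. This is not the code of the add-on scripts; the function names are hypothetical, and each function returns one True/False flag per header, where True means the column stays active.

```python
from collections import Counter


def keep_first_occurrence(headers):
    """Active flags that keep only the first appearance of each header."""
    seen = set()
    flags = []
    for header in headers:
        flags.append(header not in seen)
        seen.add(header)
    return flags


def drop_all_duplicates(headers):
    """Active flags that inactivate every header appearing more than once."""
    counts = Counter(headers)
    return [counts[header] == 1 for header in headers]


headers = ["rs1", "rs2", "rs1", "rs3", "rs2"]
print(keep_first_occurrence(headers))  # [True, True, False, True, False]
print(drop_all_duplicates(headers))    # [False, False, False, True, False]
```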

I need to append reference samples to my data to perform PCA but my marker labels are different in each spreadsheet.

Study-Ref Spreadsheets

Mismatched Column Headers

When joining datasets in SVS with either the Append Spreadsheets or Join or Merge Spreadsheets functions, you are required to have matching row labels and column headers for the information you would like combined between the two sets of data.

The reference dataset uses RS IDs for marker names and the study population has the same information available in the marker map.

RSIDs in MM

RS ID field in Marker Map of Study Population

We will rename the study markers to be RS IDs so that the two sets can be correctly joined together. Please see How can I add Gene Name or RS ID to my spreadsheet’s marker map? if your data does not currently have RS IDs included as a marker map field.

From the study spreadsheet go to Edit > Recode > Rename Marker Mapped Labels and select the RS ID field from the marker map.

Rename Labels

Rename Marker Mapped Labels

A new spreadsheet is then created with RS IDs as the marker names which can now be joined to your reference samples.

RSIDs Column Names

Headers Renamed to RS IDs
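The effect of the rename on a later join can be sketched in Python. The marker labels and RS IDs below are made up for illustration, the function name is hypothetical, and keeping the original label when a marker has no RS ID is an assumption of this example rather than documented SVS behaviour.

```python
def rename_to_rsids(headers, rsid_by_marker):
    """Replace each marker label with its RS ID, keeping the label if none is mapped."""
    return [rsid_by_marker.get(header) or header for header in headers]


# Made-up study markers and marker map field values.
study_headers = ["SNP_A-1", "SNP_A-2", "SNP_A-3"]
rsid_field = {"SNP_A-1": "rs123", "SNP_A-2": "rs456", "SNP_A-3": ""}

renamed = rename_to_rsids(study_headers, rsid_field)
print(renamed)  # ['rs123', 'rs456', 'SNP_A-3']

# Only headers present in both datasets can be matched when joining.
reference_headers = {"rs456", "rs123", "rs789"}
shared = [header for header in renamed if header in reference_headers]
print(shared)  # ['rs123', 'rs456']
```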

How can I import Illumina data for analysis in SVS?

Illumina provides data in several formats, including raw intensity data files (.idat), Final Report text files, and variant call data in VCF format.

Intensity Data Files (idat)

SVS does not support importing idat files directly. The data must first be processed through Illumina’s GenomeStudio or BeadStudio software. Once the data is imported into one of these programs, it can be exported into a format that SVS can accept.

Supported export formats are the Illumina Final Report format and the Golden Helix DSF format, the latter using the Plug-ins we have available.

Illumina Final Report Files

Example Illumina Final Report

Illumina Final Report files are delimited text files that can come with a variety of information, including genotypes for several strands, Log R Ratios, B Allele Frequencies, GC Scores and mapping information. For import into SVS at minimum the file must contain SNP and Sample columns and one other information field to be imported. The files can come in one sample per file or multiple samples in the same file.

The header data (lines between [Header] and [Data] in the above screenshot) are not required and will not be included in the standard import. The column header line (starts with “SNP Name”) is required to correctly identify which columns contain the required information to build the SVS spreadsheet and correctly match up corresponding alleles to form the genotypes.
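The layout described above can be illustrated with a short Python sketch that skips the optional [Header] block and reads the rows that follow the column header line. The sample content is made up, the function name is hypothetical, and the assumptions that the file is tab-delimited and that the sample column is named "Sample ID" belong to this example; SVS's own importer is not shown here.

```python
import csv
import io

# Made-up, tab-delimited sample in the Final Report layout.
SAMPLE_REPORT = (
    "[Header]\n"
    "Num SNPs\t2\n"
    "Num Samples\t1\n"
    "[Data]\n"
    "SNP Name\tSample ID\tAllele1 - Top\tAllele2 - Top\tGC Score\n"
    "rs1\tS1\tA\tG\t0.91\n"
    "rs2\tS1\tC\tC\t0.88\n"
)


def read_final_report(handle, required=("SNP Name", "Sample ID")):
    """Return the data rows as dictionaries keyed by column header."""
    lines = handle.read().splitlines()
    start = 0  # files without a [Header] block start at the column header line
    for index, line in enumerate(lines):
        if line.strip() == "[Data]":
            start = index + 1
            break
    reader = csv.DictReader(lines[start:], delimiter="\t")
    missing = [name for name in required if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return list(reader)


rows = read_final_report(io.StringIO(SAMPLE_REPORT))
print(len(rows))  # 2
print(rows[0]["SNP Name"], rows[0]["Allele1 - Top"], rows[0]["Allele2 - Top"])  # rs1 A G
```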

If your data comes in one large Final Report file then you will want to go to Import > Illumina > Import Single Illumina Final Report and follow the prompts to import the data. If your data comes in several files, either one sample per file or several unique samples per file, then you will want to go to Import > Illumina > Illumina Final Report to import the data.

Please see Illumina for further details.

Note

If you have data in a similar format but without the column header information you can use our Import Tall Skinny Format script to import the data. This script is restricted to one file at a time for import, so if you have multiple files it may be easiest to add in the column header line to each file so the Illumina Final Report tool can be used.

Variant Call Format (VCF)

Sorted VCF files that follow the 1000 Genomes specifications can be imported into SVS by going to Import > Import Sorted VCF Files and following the prompts.
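If you are unsure whether a VCF file is sorted, a quick check can be written in Python. In this sketch "sorted" is taken to mean that the records of each chromosome are contiguous and that positions never decrease within a chromosome; that definition and the function name are assumptions of this example, not a statement of what the SVS importer checks.

```python
def vcf_is_sorted(lines):
    """True if records are grouped by chromosome with non-decreasing positions."""
    finished = set()  # chromosomes whose block of records has ended
    current = None
    last_pos = -1
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta-information, header, and blank lines
        chrom, pos = line.split("\t", 2)[:2]
        pos = int(pos)
        if chrom != current:
            if chrom in finished:
                return False  # chromosome reappears after another one
            if current is not None:
                finished.add(current)
            current, last_pos = chrom, pos
        elif pos < last_pos:
            return False  # position goes backwards within a chromosome
        else:
            last_pos = pos
    return True


sorted_records = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t.\t.\t.",
    "1\t250\t.\tC\tT\t.\t.\t.",
    "2\t50\t.\tG\tA\t.\t.\t.",
]
unsorted_records = sorted_records + ["1\t300\t.\tT\tC\t.\t.\t."]

print(vcf_is_sorted(sorted_records))    # True
print(vcf_is_sorted(unsorted_records))  # False
```

To check a real file, the function can be given an open text file handle, for example `with open(path) as handle: vcf_is_sorted(handle)`.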

Please see Import VCFs and Variant Files for further details.

Can I analyze my sequencing data with SVS?

DNA Sequence Data

SVS can accept BAM files for visualization of aligned sequence data and can perform analysis on variant call data provided in VCF format. If you only have the raw sequence data (FASTQ) available for your samples you will need to have a Secondary DNA-Seq pipeline perform alignment and variant calling before SVS can be used for analysis.

We have several blog posts available that discuss NGS Analysis including the tools and formats that are available to process your data for Tertiary Analysis with SVS.

A Hitchhiker’s Guide to Next Generation Sequencing

NGS Tools and Formats for Secondary Analysis

RNA Sequence Data

As with DNA-Seq data, SVS can accept BAM files for visualization of RNA-Seq data, but it requires gene (or isoform) count data to perform further analysis, for example differential expression analysis using DESeq Analysis. Count data is generally provided by a Secondary RNA-Seq pipeline in some form of delimited text.

This type of data can be imported into SVS using our standard Import > Text or Import > Third Party tools. Once the data is imported, the genomic mapping information (chromosome, start position and stop position) must be converted to a genetic marker map and applied to the count data. You can find an example workflow for creating and applying genetic marker maps in the Marker Map Tutorial available on our website.

However, the easiest way to import count data is to format it according to the requirements of our RNA-Seq Tabularized Quantification import tool. The importer will automatically import and correctly format all of the data, including the marker map information, so it is directly available for analysis.

Note

If your count data was provided by Cufflinks in one delimited text file per sample then we have an add-on script available that can convert this output to our Tabularized Quantification format. Please email Golden Helix Support if you need access to the script.