Genome Assemblies

A genome assembly defines the chromosomes for a particular species and build. This definition includes the chromosome names and lengths, as well as the order in which they are arranged when displayed in a genome browser. GenomeBrowse uses the current genome assembly so that plotted features are arranged according to its chromosome definition. The species and build specified by the selected genome assembly are used to manage annotation and data sources within GenomeBrowse. This functionality is intended to help prevent accidental alignment of annotation data to the wrong genome assembly.
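As an illustration of the definition above, the following is a minimal sketch of a genome assembly as a data structure: a species, a build, and an ordered list of chromosome names and lengths. The class names, the `offsets` helper, and the toy lengths are assumptions made for this example. They do not describe the file format or internal API that GenomeBrowse actually uses.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Chromosome:
    name: str
    length: int  # length in base pairs


@dataclass
class GenomeAssembly:
    """Hypothetical in-memory model of a genome assembly."""
    species: str
    build: str
    chromosomes: List[Chromosome]  # list order is the display order

    def offsets(self, gap: int = 0) -> Dict[str, int]:
        """Genome-wide start offset of each chromosome, in display order."""
        result: Dict[str, int] = {}
        position = 0
        for chrom in self.chromosomes:
            result[chrom.name] = position
            position += chrom.length + gap
        return result

    def to_linear(self, chrom_name: str, pos: int) -> int:
        """Map a (chromosome, position) pair onto one genome-wide axis."""
        return self.offsets()[chrom_name] + pos


# Toy lengths chosen for readability, not real chromosome sizes.
toy = GenomeAssembly(
    species="Example species",
    build="Build 1",
    chromosomes=[Chromosome("1", 1000), Chromosome("2", 800), Chromosome("X", 500)],
)
print(toy.offsets())  # {'1': 0, '2': 1000, 'X': 1800}
```

Because a feature's plotted position depends on both the chromosome lengths and their order, plotting the same feature against a different assembly moves it, which is why the species and build are tracked alongside the data.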

A genome assembly can be selected from the list of bundled genome assemblies, downloaded from the Golden Helix genome assemblies data repository, created from a marker mapped spreadsheet, or created from a marker map. This enables the user to create a genome assembly specific to the data at hand or for a species not in the list of bundled species.

The current genome assembly can be selected through the tool bar controls of GenomeBrowse. The genomes available can be viewed through the project navigator menu item, Tools > Manage Genome Assemblies.

Bundled Genome Assemblies

Basic genome assemblies for several species are made available with SVS. The initial list of chromosome definitions was based on those available within the Integrated Genome Browser (IGB), http://www.bioviz.org [Nicol2009]. Additional species and builds have been added to the bundled genome assemblies since then.

If genome information for a particular species and build is available but is not yet included in the bundled builds, contact Golden Helix, and we can convert that information into a genome assembly that can be made available to all customers through the genome assemblies repository. Alternatively, the genome assembly can be generated at the same time as converting a Reference Sequence to a TSF annotation source. See Convert a 2Bit File or Converting a FASTA File. In SVS, a genome assembly can also be created from marker map information.

Creating Genome Assemblies

If a genome assembly is not available in the bundled set, or a genome assembly is required based on a subset of chromosomes and/or markers, then one can be created based on a marker mapped spreadsheet or a marker map. From the project navigator, choose the menu item Tools > Manage Genome Assemblies (see Genome Assemblies dialog accessed via the project navigator).

Genome assemblies dialog accessed via the plot viewer

Genome Assemblies dialog accessed via the project navigator

Creating a Genome Assembly from a Marker Mapped Spreadsheet

To create a genome assembly from a marker mapped spreadsheet, open the Manage Genome Assemblies dialog from the project navigator (see above) and select From Marker Mapped Spreadsheet.... A spreadsheet selection dialog will appear with only marker mapped spreadsheets active. Select the spreadsheet from which the genome assembly should be created. A dialog will appear where the name of the genome assembly and the associated species and build can be specified. The default genome assembly name is based on the applied genetic marker map and the spreadsheet name. This name is usually very long and should be shortened if possible. Select the most appropriate species and build or enter entirely new values. Take care to use exactly the same values everywhere this species/build combination is identified in SVS. If the values do not match exactly, SVS may have difficulty associating various types of data with that species and build.

Creating a Genome Assembly from a Marker Map File

To create a genome assembly from a marker map file, open the Manage Genome Assemblies dialog from the project navigator (see above) and select From Marker Map File.... A marker map selection dialog will appear listing all marker maps available in the Genetic Marker Maps directory. Select the desired marker map file. A dialog will appear where the name of the genome assembly and the associated species and build can be specified. The default genome assembly name is based on the selected marker map name. Select the most appropriate species and build or enter entirely new values. Take care to use exactly the same values everywhere this species/build combination is identified in SVS. If the values do not match exactly, SVS may have difficulty associating various types of data with that species and build.
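The documentation does not state how chromosome lengths are derived when an assembly is built from marker data. The sketch below shows one plausible approach under a stated assumption: each chromosome's length is estimated as the largest marker position seen on it, and chromosomes are ordered by first appearance in the map. The function name and the marker tuple layout are hypothetical and are not part of SVS.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical marker record: (marker name, chromosome, position in base pairs)
Marker = Tuple[str, str, int]


def chromosomes_from_markers(markers: Iterable[Marker]) -> List[Tuple[str, int]]:
    """Estimate (chromosome, length) pairs from marker positions.

    Assumption: the largest marker position on a chromosome stands in
    for its length. Order follows first appearance in the marker map.
    """
    extents: Dict[str, int] = {}
    for _name, chrom, pos in markers:
        # dict keeps insertion order, so first-seen order is preserved
        extents[chrom] = max(extents.get(chrom, 0), pos)
    return list(extents.items())


example_markers = [
    ("m1", "1", 100),
    ("m2", "1", 950),
    ("m3", "2", 400),
    ("m4", "X", 75),
]
print(chromosomes_from_markers(example_markers))
# [('1', 950), ('2', 400), ('X', 75)]
```

An assembly built this way covers only the chromosomes present in the marker data, which fits the use case of creating an assembly from a subset of chromosomes or markers.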

Downloading a Genome Assembly from Golden Helix

To download a genome assembly from Golden Helix, open up the Manage Genome Assemblies dialog from the project navigator (see above) and select Download from Golden Helix. A list of available genome assemblies will be displayed. Select one or more genome assemblies to download and click OK. The downloaded genome assemblies will be saved in the User Genome Assemblies Folder and are available immediately for use in a project.

Viewing the User Genome Assembly Folder

User-created genome assemblies are stored in the user genome assemblies folder in the application directory. This directory can be viewed by clicking the User Genome Assemblies Folder button in either of the Genome Assemblies windows.

Switching Genome Assemblies

The genome assembly can be changed in one of two ways: through the GenomeBrowse tool bar controls or through the current project options dialog (Tools > Current Project’s Options).

Using Tool Bar Controls to Switch Genomes

GenomeBrowse displays genome assembly information on the tool bar. The control menu contains a list of all system and user genome assemblies. Recently used genome assemblies are listed at the top, and all genome assemblies are listed in alphabetical order under the black bar.

To change the genome to a different species or build, select the desired genome assembly from the list.

After the current genome assembly is changed, if the reference sequence track is not found in the local annotation folder, the user will be asked whether it should be downloaded. Selecting Yes starts the download of the annotation track. Analysis and visualization of the data can continue while the file is being downloaded, except for BAM files. Selecting No skips the download, in which case coverage for BAM files cannot be computed.

Note

  1. Switching the genome assembly causes the data to be re-plotted with the new mapping and resets the zoom to show all of the data.
  2. If plotted annotation tracks do not correspond to the current species and build, their features may not be correctly aligned to the genome.
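The second note can be pictured as a comparison of species/build pairs. The sketch below assumes that the comparison is an exact string match, in line with the advice elsewhere in this section to use exactly the same species and build values everywhere. The `BuildKey` type and `matches_assembly` function are hypothetical, and the species and build strings are illustrative values only.

```python
from typing import NamedTuple


class BuildKey(NamedTuple):
    """Hypothetical species/build identifier attached to a data source."""
    species: str
    build: str


def matches_assembly(track: BuildKey, assembly: BuildKey) -> bool:
    """True only when species and build match exactly (assumed rule)."""
    return track == assembly


current = BuildKey("Homo sapiens", "GRCh37")

print(matches_assembly(BuildKey("Homo sapiens", "GRCh37"), current))   # True
print(matches_assembly(BuildKey("Homo sapiens", "GRCh38"), current))   # False
print(matches_assembly(BuildKey("Homo sapiens", "grch37"), current))   # False
```

Under this assumed rule, even a difference in letter case makes two labels distinct, which is why consistent species and build values matter when creating a new genome assembly.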

Managing Genome Assembly Views

The Hide/Show eyeball icon is located in the GenomeBrowse tool bar to the right of the genome assembly control menu. Clicking this icon allows the user to manage the segments present in the selected assembly.

Manage Genome Assembly Views

The Genome Segment Order and Visibility dialog will contain a list of each unique segment identified in the data loaded into the GenomeBrowse window. Segments can represent chromosomes, scaffolds, contigs or a combination of all three.

The user can show or hide segments by checking or unchecking the box in the Visible column. A checked box will say Yes, indicating that the data from that segment will remain visible in the GenomeBrowse plot. An unchecked box will say No, indicating that the segment and all data in it will be hidden.

Segment order can also be rearranged by highlighting a segment and using the up/down arrows to the right of the segment list. The single arrows move the segment one position at a time, and the double arrows move the segment all the way to the top or bottom of the list. Additionally, the user may specify the amount of visible space between segments by entering the base pair distance in the Space box.

At any point, the user may restore the original order and visibility of the segments, based on the genome assembly file, by clicking Restore Defaults. Once changes are complete, click OK to apply them in the GenomeBrowse window.
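The behavior of the Genome Segment Order and Visibility dialog can be modeled as operations on an ordered list. The following is a minimal sketch of that idea, not the GenomeBrowse implementation. The class and method names are hypothetical, and it assumes that Restore Defaults resets order and visibility but leaves the spacing value unchanged.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    name: str
    visible: bool = True


class SegmentView:
    """Hypothetical model of segment order, visibility, and spacing."""

    def __init__(self, default_order: List[str], spacing: int = 0) -> None:
        self._default_order = list(default_order)
        self.spacing = spacing  # base pairs shown between segments
        self.segments = [Segment(name) for name in default_order]

    def _index(self, name: str) -> int:
        return next(i for i, s in enumerate(self.segments) if s.name == name)

    def set_visible(self, name: str, visible: bool) -> None:
        self.segments[self._index(name)].visible = visible

    def move(self, name: str, delta: int) -> None:
        """Single arrows: delta=-1 moves up one place, delta=+1 moves down."""
        i = self._index(name)
        j = max(0, min(len(self.segments) - 1, i + delta))
        self.segments.insert(j, self.segments.pop(i))

    def move_to_top(self, name: str) -> None:
        self.segments.insert(0, self.segments.pop(self._index(name)))

    def move_to_bottom(self, name: str) -> None:
        self.segments.append(self.segments.pop(self._index(name)))

    def restore_defaults(self) -> None:
        """Reset order and visibility to the assembly's original definition."""
        self.segments = [Segment(name) for name in self._default_order]

    def visible_order(self) -> List[str]:
        return [s.name for s in self.segments if s.visible]


view = SegmentView(["1", "2", "X", "Y"])
view.move("X", -1)           # order is now 1, X, 2, Y
view.move_to_bottom("1")     # order is now X, 2, Y, 1
view.set_visible("Y", False)
print(view.visible_order())  # ['X', '2', '1']
view.restore_defaults()
print(view.visible_order())  # ['1', '2', 'X', 'Y']
```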