Genetic Marker Maps and Affymetrix Library Files
Genetic Marker Maps Overview
Genetic marker maps contain chromosome and distance data for individual SNP or CN markers relative to some origin, as well as other data related to the genetic markers.
Golden Helix SVS is not normally used to analyze genetic marker map data in its own right, although this is possible. Rather, genetic marker map data is associated with one or more data files containing the genetic marker data that is actually being studied. Frequently, a team of analysts will have one master genetic marker map that it uses for all analyses relating to a specific study or project.
Genetic marker map files can be downloaded from the Golden Helix data server or the Affymetrix NetAffx service, or imported from a text file with the values separated by commas, spaces, tabs, or another character.
At a bare minimum, there must be three columns in the genetic marker map. These columns must include:
- Identifiers for the genetic markers
- Chromosome information
- Distance or position information (absolute within the current chromosome)
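Before importing a text marker map, it can help to confirm that every row supplies the three required values. The following is a minimal sketch in Python, run outside of SVS. The sample data, the header names (Marker, Chromosome, Position) and the function names are hypothetical, and the check that positions are numeric is an assumption rather than a documented SVS rule.

```python
import csv
import io

# Hypothetical sample file; the header names are illustrative only.
SAMPLE = """Marker,Chromosome,Position
rs1001,1,15000
rs1002,1,27500
rs2001,X,4200
"""

def read_marker_map(text, delimiter=","):
    """Parse delimited marker map text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

def check_minimum_columns(rows, name_col, chrom_col, pos_col):
    """Return a list of problems; an empty list means all three columns are usable."""
    problems = []
    for number, row in enumerate(rows, start=1):
        # Each of the three required columns must hold a non-empty value.
        for col in (name_col, chrom_col, pos_col):
            if not (row.get(col) or "").strip():
                problems.append(f"row {number}: no value for {col}")
        # Assumption: a position or distance should parse as a number.
        position = (row.get(pos_col) or "").strip()
        if position:
            try:
                float(position)
            except ValueError:
                problems.append(f"row {number}: position {position!r} is not numeric")
    return problems

rows = read_marker_map(SAMPLE)
print(check_minimum_columns(rows, "Marker", "Chromosome", "Position"))
```

An empty list indicates that the identifier, chromosome and position columns are all populated in every row.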
Optionally, the genetic marker map can contain additional columns in any combination. These columns can contain any of the other fields in an Affymetrix Annotation file, or other fields listed in a text marker map. A marker map can be applied to a spreadsheet containing genetic SNP or CNV data. This creates a resulting spreadsheet whose columns are sorted first by chromosome and then by distance. In addition, marker map information is stored as metadata associated with either the columns or the rows of the resulting spreadsheet, depending on whether the marker names are column headers or row labels.
A marker mapped spreadsheet will have a green node icon.
Genetic marker maps need to be applied to spreadsheets for various analyses and for plotting data in a Genome Browser. In the spreadsheet, the marker map data can be viewed by clicking on the green Map button in the upper left-hand corner of the spreadsheet. Marker map fields can be hidden so that only the desired fields are shown. To do this, right-click on the Map button and uncheck the undesired fields by clicking on each field name. To show a field again, click on its name again. At least one field must be checked at all times, but this can be any of the fields in the marker map. To hide all fields, left-click on the Map button again.
- Versions of SVS on or after release 7.0 allow for more than six fields to be in the marker map.
- The current marker map sort order (for versions of SVS on or after release 4.4.2) is, as stated above, chromosome and distance.
- Versions of SVS between releases 4.3.0 and 4.4.2 sorted marker maps and their mapped spreadsheet columns according to chromosome, region name, gene name, and finally distance.
- Versions of SVS before release 4.3.0 sorted marker maps according to chromosome, region name, and distance.
- Versions of SVS on or before release 3.1.0 left the columns of the resulting spreadsheet in the same order as the original spreadsheet. Their maps, however, were still considered to be sorted (by chromosome, region name, and distance).
- Versions of SVS on or after release 8.3.1 are no longer case sensitive when applying a marker map to a spreadsheet.
Convert Text File into Marker Map DSM Format
The process of importing a genetic marker map begins by selecting the Tools > Manage Marker Maps menu item which brings up the Genetic Marker Map window (see Manage Genetic Marker Maps Window).
Clicking on the Convert Text File button will bring up the Import Text Marker Map dialog. The Browse button opens a standard file manager so that a file can be selected. The type of delimiter used in the text file needs to be specified for the file format. If a delimiter other than comma, whitespace, or tab is used in the file, it can be specified by selecting “Other ->” in the drop-down menu and entering the delimiter in the text box to the right of the menu. If the text marker map uses a custom encoding for missing values, that can be specified on the Advanced Options tab.
Affymetrix marker maps obtained by exporting annotations as a tab-delimited text file from GDAS may have an extra row at the top of the file. Other Affymetrix annotation files could have 15 rows of header information, depending on how they were obtained. These extra rows can be ignored by checking the Skip button on the Advanced Options tab and choosing the number of rows to skip.
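To preview how a file will read with a given delimiter, number of skipped rows and missing-value code, the same three settings can be mimicked in a short Python sketch run outside of SVS. The function name, the sample text and the "---" missing-value code are hypothetical. Note also that Python's csv module accepts only a single-character delimiter, so runs of whitespace are not handled here.

```python
import csv

# Hypothetical export with one extra row above the real header line.
SAMPLE_EXPORT = (
    "Exported annotation listing\n"
    "Probe Set ID\tChromosome\tPhysical Position\n"
    "SNP_A-0001\t1\t1000\n"
    "SNP_A-0002\t2\t---\n"
)

def load_text_marker_map(text, delimiter="\t", skip_rows=0, missing="?"):
    """Return (header, records) after skipping leading rows.

    Values equal to the missing-value code are stored as None.
    """
    lines = text.splitlines()[skip_rows:]
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)  # first row after the skipped rows holds column names
    records = []
    for fields in reader:
        record = {
            name: (None if value == missing else value)
            for name, value in zip(header, fields)
        }
        records.append(record)
    return header, records

header, records = load_text_marker_map(
    SAMPLE_EXPORT, delimiter="\t", skip_rows=1, missing="---"
)
print(header)
print(records[1])
```

If the header printed is not the expected list of column names, the number of rows to skip is probably wrong for that file.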
After you select OK, the text marker map will be scanned and the Choose Columns to Use dialog will appear. Columns for the marker name, chromosome and position must be specified, and additional columns can be imported from the marker map as well. Clicking OK will convert the text marker map file into a DSM file for use in any project. DSM is a binary format for storing marker map information, optimized for use as reference information. If a marker map with the same name already exists in the Marker Maps folder in the program directory, a version number will be appended to the marker map file name.
Download from Golden Helix
Genetic marker maps can be downloaded from Golden Helix’s server through the Manage Genetic Marker Maps dialog. To do so, click the Download from Golden Helix button in the Manage Genetic Marker Maps dialog (Tools > Manage Marker Maps), or select Download > Marker Maps from the project navigator. More than one map can be selected by clicking on one map in the list and then Ctrl-clicking each additional map to be downloaded.
Data that is available through Affymetrix NetAffx™ is also available on the Golden Helix server, eliminating the need to go to more than one location to download maps for different arrays.
Download Affymetrix Annotation Files
Affymetrix NetAffx™ provides array design and annotation information for Affymetrix GeneChip™ array results. You can sign up for and use the NetAffx™ Analysis Center through the Affymetrix website at http://www.affymetrix.com/.
Golden Helix SVS is able to communicate with NetAffx™ through a web service interface, allowing you to download and update genetic marker map information mappable to Affymetrix data. See Affymetrix Files for importing Affymetrix data files.
To import genetic marker map information from the Affymetrix NetAffx™ service, click the Download Affymetrix Annotations button in the Manage Genetic Marker Maps dialog (Tools > Manage Marker Maps), or go to Download > Affymetrix Marker Maps. You will be prompted for login credentials for the NetAffx™ service. Login credentials can be obtained free of charge by registering on the Affymetrix website. Your login credentials will be saved in the Global Program Options and will be used for future NetAffx™ downloads. If the login credentials need to be changed, this can be done by going to Tools > Global Program Options from the Program Welcome Screen.
Once authenticated, the latest annotation listings will be shown in the dialog (see Download Affymetrix Annotations Window).
Select one or more annotation files and click Download to download, combine and sort the annotation listings into a map DSM file. Note that the 50k and 250k annotation files come in pairs. For example, be sure to select both Mapping50K_Hind240.na26.annot.csv and Mapping50K_Xba240.na26.annot.csv.
In this example, once downloaded, the annotations are combined and appear in the Marker Maps folder as Mapping50K_Hind240-Xba240_na26_annot20080709.dsm.
If your Internet connection goes through a proxy requiring authentication, Golden Helix SVS may not be able to communicate with the Affymetrix NetAffx™ service directly. You may need to configure your proxy settings from the Global Options dialog. Strict firewalls may also preclude communication with NetAffx™.
Marker Map Utilities
The Utilities menu provides useful utilities for working with marker maps. User-provided scripts can be added to this menu by placing them under the “MarkerMapManager/Scripts” folder of the UserScripts location.
Add Annotation Data to Marker Map
This utility creates a copy of the marker map in which the specified annotation data from overlapping interval(s) is added to each marker. A project must be open to run this utility.
Managing the MarkerMaps Folder
The following actions can be performed on the genetic marker map folder:
- Sort by any listed field by clicking on the field header.
- Reorder field columns by clicking and dragging to a new position.
- Rename a marker map or its file name by right-clicking on the name.
If a marker map is loaded into the folder with the same default name as an already existing marker map, a version number will be appended to the marker map file name.
Importing Genetic Marker Map as Spreadsheet
Marker maps can be imported as spreadsheets by clicking on Import Map as Spreadsheet in the Manage Genetic Marker Maps window.
As of Golden Helix SVS version 7.0, marker maps no longer need to be imported into the project to be applied to a dataset. See below for instructions on how to apply a genetic marker map.
Removing Files from the MarkerMaps Folder
To keep the marker maps folder manageable, it may be desirable to delete marker maps from the folder. To do so, right-click on the marker map to be removed in the list box and select Delete. If outdated marker maps need to be kept for later reference but are not to be used regularly, they can be moved to a different folder (see below). This keeps the marker maps folder manageable and prevents an outdated map from being applied.
Moving DSM Files from One Folder to Another
To move a DSM file from the marker maps folder to a different location, select the “View Marker Map Folder” option. This brings up a file directory navigator window with the marker maps folder open. Any DSM file can be copied and moved to a different directory using normal file directory management actions appropriate for the particular operating system in use.
In the same manner, DSM files saved in other directories can be moved to the marker maps folder. After clicking on View MarkerMaps Folder, navigate to the appropriate directory, copy or cut the desired DSM file, navigate back to the marker maps folder, and paste the DSM file into this directory. This is useful for sharing marker maps between computers or users.
Only DSM files in the marker maps folder will be seen in the Marker Map list. Moving text files into this directory is not the correct way to import a text marker map. See Convert Text File into Marker Map DSM Format for instructions on how to convert a text marker map to the DSM file format.
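For teams that share marker maps regularly, the copy step can also be scripted with ordinary file operations. The following Python sketch is run outside of SVS and makes several assumptions: the function name is hypothetical, the folder paths must be supplied by the user, and refusing to overwrite an existing file is a precaution chosen here, not SVS behavior.

```python
import shutil
from pathlib import Path

def copy_dsm_file(source, destination_folder):
    """Copy one DSM file into another folder and return the new path.

    Only files with a .dsm extension are accepted, since text marker maps
    must be converted rather than copied. Existing files are never replaced.
    """
    source = Path(source)
    destination_folder = Path(destination_folder)
    if source.suffix.lower() != ".dsm":
        raise ValueError(f"{source.name} is not a DSM file")
    target = destination_folder / source.name
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    destination_folder.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves the file's modification time where possible.
    return Path(shutil.copy2(source, target))
```

A call such as copy_dsm_file(shared_file, marker_maps_folder), with both paths supplied by the user, copies the file and leaves the original in place.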
Applying a Genetic Marker Map to a Spreadsheet
A genetic marker map can be applied to a spreadsheet that has marker names as either column headers or row labels. To do so, select File > Apply Genetic Marker Map from the spreadsheet menu.
In order to apply a genetic marker map you first need to have the desired marker map in the marker maps folder. All markers that can be mapped will be reordered in marker map order by chromosome first and then position. Any markers that cannot be mapped will be moved to the beginning of the spreadsheet, and the marker map fields will be left blank.
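The reordering described above can be sketched in a few lines of Python. This is an illustration of the rule, not the SVS implementation. The function names and the dictionary layout are hypothetical, and the ordering of non-numeric chromosomes (such as X or Y) after the numeric ones is an assumption.

```python
def chromosome_key(chrom):
    """Sort numeric chromosomes numerically, then any others alphabetically.

    Assumption: this ordering is chosen for the sketch and may differ from SVS.
    """
    if chrom.isdigit():
        return (0, int(chrom), "")
    return (1, 0, chrom)

def apply_marker_map(markers, marker_map):
    """Return marker names in mapped order.

    markers: marker names in their current spreadsheet order.
    marker_map: dict of marker name -> (chromosome, position).
    Unmapped markers come first, keeping their original relative order.
    """
    unmapped = [name for name in markers if name not in marker_map]
    mapped = [name for name in markers if name in marker_map]
    mapped.sort(
        key=lambda name: (chromosome_key(marker_map[name][0]), marker_map[name][1])
    )
    return unmapped + mapped

example_map = {
    "rs1": ("1", 100),
    "rs2": ("1", 50),
    "rs3": ("2", 10),
    "rsX": ("X", 5),
}
print(apply_marker_map(["rs3", "rsX", "rs1", "unknown", "rs2"], example_map))
# prints ['unknown', 'rs2', 'rs1', 'rs3', 'rsX']
```

The marker named "unknown" is not in the map, so it moves to the front, and the remaining markers follow in chromosome and then position order.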
Selecting a Genetic Marker Map
All available marker maps in the program Marker Maps folder are displayed in the list box (see Select a Genetic Marker Map Window) with several parameters to aid in determining the best marker map to use. If there are several versions of a marker map, the number of markers in the map, the date last modified, or the available field names will indicate any differences between the versions. The columns can also be reordered by left-clicking on the column header and dragging to the left or the right.
Setting Apply Options
Normally, if the spreadsheet has multiple markers with the same name, it will be reordered so that the markers with the same name are adjacent, and all markers in the spreadsheet contained in the marker map will contain the map information in the marker map fields.
Checking the “Drop duplicate markers” option will, on the other hand, map the first occurrence of any marker contained in the marker map and delete all other occurrences of the same marker from the mapped spreadsheet.
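The effect of this option can be illustrated with a short Python sketch that keeps only the first occurrence of each mapped marker name. The function name is hypothetical, and leaving duplicate names that are not in the marker map untouched is an assumption, since the behavior for unmapped duplicates is not stated here.

```python
def drop_duplicate_markers(markers, mapped_names):
    """Keep the first occurrence of each mapped marker and drop later repeats.

    markers: marker names in spreadsheet order.
    mapped_names: set of marker names present in the marker map.
    Assumption: names that are not in the map are left as they are.
    """
    seen = set()
    kept = []
    for name in markers:
        if name in mapped_names:
            if name in seen:
                continue  # a later occurrence of an already mapped marker
            seen.add(name)
        kept.append(name)
    return kept

print(drop_duplicate_markers(["rs1", "rs2", "rs1", "zz", "zz"], {"rs1", "rs2"}))
# prints ['rs1', 'rs2', 'zz', 'zz']
```

Only the second "rs1" is removed; without the option, both occurrences would be kept and placed next to each other in the mapped spreadsheet.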
Indicating Direction to Apply Genetic Marker Map
Marker names can be either column headers or row labels. It is important to select the right option for the spreadsheet to be mapped. If the wrong option is selected, an error message indicates that no columns or rows could be mapped. If both the column headers and the row labels are marker names, the map can be applied in only one direction, not both.
Applying a Different Genetic Marker Map
A marker map can be changed by applying a new marker map to a previously marker mapped spreadsheet. A new navigator spreadsheet node will be created with the new marker map as a child of the original marker mapped spreadsheet. If the spreadsheet with the old marker map applied is no longer needed in the project, a top-level spreadsheet can be created for the new marker mapped spreadsheet. Then all the unnecessary nodes can be deleted. See Create Top-Level Spreadsheet for more information.
Dropping a Genetic Marker Map from a Spreadsheet
Sometimes it might be desirable to remove a marker map from a marker mapped spreadsheet. This can be done by selecting File > Drop Genetic Marker Map from the spreadsheet menu. A child node is created as a descendant of the marker mapped spreadsheet.
Exporting an Applied Genetic Marker Map to a DSM file
To export an applied genetic marker map, select File > Export Genetic Marker Map, then click Browse to select the name and location of the output DSM file. You can change the name of the dataset by editing the text in the text box. You also have the option of exporting the marker map information for only the active data or for all data.
Downloading Affymetrix Library (CDF) Files
Affymetrix library files can be downloaded by going to Download > Affymetrix Library Files. You will be prompted for login credentials for the NetAffx™ service. Login credentials can be obtained free of charge by registering on the Affymetrix website. Your login credentials will be saved in the Global Program Options and will be used for future NetAffx™ downloads. If the login credentials need to be changed, this can be done by going to Tools > Global Program Options from the Program Welcome Screen.
Once authenticated, the latest library file listings will be shown in the dialog (see Download Affymetrix Library Files Window).
If the location of your library files is different from the path listed, choose it by clicking Browse. Then click Download.