A common trade-off faced by program and tool developers is balancing the convenience of a Graphical User Interface (GUI) against the raw power and automation of command-line-oriented programs. Often, flexibility of tool use and data manipulation is sacrificed for informative visual displays and interactive dialogs. Golden Helix, Inc. has developed a solution that not only takes advantage of both paradigms but also opens the way for cross-communication with other tools used in data analysis. By providing full access to the features of Golden Helix SVS through the Python scripting language, this program provides functionality and support for collaboratively developing new analysis techniques.
Key features of this tool are:
To open the Python shell window, select Tools > Open Python Shell from either the Welcome screen or the main Project Navigator Window.
A Python shell can be executed with or without an open project. The Python shell window has several buttons on a tool bar, which are from left to right:
It may be useful when developing a script to experiment interactively in the Golden Helix SVS shell.
While describing the Python Application Programming Interface (API), some terms are used repeatedly. “Object” is one of them: it refers to a named Python data structure that has methods associated with it. For the purposes of our APIs, the term “Module” is equivalent to “Object”. Different scripting commands create objects with access to object-specific methods.
There is one object that is always available to you, whether from the Python shell or from within a script. This is the ghi object. You can also retrieve it explicitly with the command import ghi.
Methods are accessed using the object name followed by a period then the method name and any parameters in parentheses. For example: ghi.getObject(3). Sometimes a method will change the state of an object’s data, require interaction with the user or return data or new objects.
At any time the dir() command can be used to see all of the objects currently available. This command produces a listing of all the current objects in the shell. To see the methods available for the ghi object introduced above, use the dir command again–only this time put ghi in the parentheses; dir(ghi). When an object’s name is used as a parameter in the dir command, a listing of all the methods that the object contains will be provided.
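As a small illustration of the dir() behavior described above, the following sketch trims a dir() listing down to the names that do not start with an underscore. The helper public_names and the Demo class are made up for this example and are not part of the SVS API; the class only stands in for an object such as ghi so the code can run anywhere.

```python
def public_names(obj):
    """Return the names reported by dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


class Demo(object):
    """Stand-in object used only for this illustration."""

    def open_item(self):
        return "opened"

    def close_item(self):
        return "closed"


# dir() returns names in alphabetical order, so the result is sorted.
names = public_names(Demo())
```

Inside the Golden Helix SVS shell, calling public_names(ghi) should give the same listing as dir(ghi) minus the names that begin with an underscore.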
Detailed descriptions of the scripting commands provided by Golden Helix SVS are found in the section Python Application Programming Interface (API). A brief form of help is also available from within the Python Shell by using the help() command. Using this command with the name of an object in the parentheses displays a help message for the object, such as: help(objectName). Help is also available for any method of an object by typing help(objectName.methodName). For example, in the ghi object there are several methods including openProject, newProject, and importData. To get help on importData, type help(ghi.importData) at the Python shell prompt.
There are several ways to begin developing customized script routines. You could open a text editor and save your script commands in a .py file with the help of the Tools > Open UserScripts Folder menu option. Alternatively, you could use the Python Editor Window that is part of Golden Helix SVS found under the Tools > Open Python Editor menu option. The final option is to first open the Python shell window using the Tools > Open Python Shell menu option, enter script commands interactively and later copy them to the Python Editor Window to create and save a Python script.
There are four major areas of the Python Editor Window. These are the menu bar, the toolbar, the Scripts Folder Bar, and the text box for writing the script (see Python Editor Window – Shipped example script Filter Samples By Call Rate.py open). In addition, in the lower right corner, there is an indicator of the line and column positions of the edit cursor within the current text box.
The toolbar contains icons for most menu items. The icons on the toolbar are listed below in order from left to right:
All possible locations to save a script and have it available through the Golden Helix SVS menu structure are listed on the Scripts Folder Bar. These folder locations and how they might be used are described below.
Scripts that are to be run on a spreadsheet should be saved in the appropriate subdirectory of this folder. The subdirectories correspond to the menus available from the Spreadsheet Viewer (the Column subdirectory corresponds to the Column Header Menu). Each subdirectory also has options for creating a new script in that folder, to create a new folder, or to launch a file browser window from your operating system which is open to that subdirectory folder.
Scripts to be run from the Spreadsheet Editor need to be saved in the appropriate subdirectory of this folder. The subdirectories correspond to the menus available from the Spreadsheet Editor (the Column subdirectory corresponds to the Column Menu). Each subdirectory also has options for creating a new script in that folder, to create a new folder, or to launch a file browser window from your operating system which is open to that subdirectory folder.
Scripts not involving a specific spreadsheet, involving multiple spreadsheets, or those to be run from the Project Navigator Window should be saved in the appropriate subdirectory of this folder. The subdirectories correspond to the menus available from the Spreadsheet Viewer (the Column subdirectory corresponds to the Column Header Menu). Each subdirectory also has options for creating a new script in that folder, to create a new folder, or to launch a file browser window from your operating system which is open to that subdirectory folder.
A new script can be created through the File > New Script menu option, or by selecting the appropriate directory and subdirectory for the script to be saved and then selecting “New Script...”.
For example, if a script to calculate the global mean of a spreadsheet was to be written and available in the Analysis menu of a Spreadsheet this can be done in two ways. These are:
Each script to be run from a spreadsheet or from the Project Navigator Window needs a function definition and a command to try the function. An example of these commands is as follows.
def global_mean(): #defines the function
    #Enter Python commands to calculate the global mean here
    pass #placeholder so that the function body is valid until commands are added
try: #tries to run the defined function
    global_mean()
except: #if there is an error then print an error message in the Python Shell
    ghi.error()
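The body of global_mean() depends on the spreadsheet API, which is documented separately. As a rough sketch of the calculation alone, the function below averages every non-missing cell in a list of rows. Representing the spreadsheet as plain Python lists, marking missing cells with None, and the name mean_of_cells are all assumptions made for this example; they are not the SVS data model.

```python
def mean_of_cells(rows):
    """Mean of all non-missing numeric cells in a list of rows.

    Missing cells are assumed to be None (an assumption of this sketch).
    Returns None when there is nothing to average.
    """
    total = 0.0
    count = 0
    for row in rows:
        for cell in row:
            if cell is None:
                continue  # skip missing values
            total += cell
            count += 1
    if count == 0:
        return None
    return total / count


# Two rows with one missing value each: (1 + 2 + 3 + 6) / 4
sample = [[1.0, 2.0, None], [3.0, None, 6.0]]
mean_value = mean_of_cells(sample)
```

In a real script, a calculation like this would sit inside global_mean(), with the rows read from the current spreadsheet.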
Each script to be run from the Spreadsheet Editor, on the entire spreadsheet or on multiple columns, needs to have a function definition defined exactly as follows. This type of script does not need a command to try the function.
'''
(This is a block comment in Python)
Function definition for editing a spreadsheet or multiple columns.
Input: dataEditModel: Python object for editing the spreadsheet data
'''
def editData(dataEditModel): #This line must be written exactly this way
    #Enter Python commands here
    pass #placeholder so that the function body is valid until commands are added
Each script to be run from a column in the Spreadsheet Editor needs to have a function definition defined exactly as follows. This type of script does not need a command to try the function.
'''
(This is a block comment in Python)
Function definition for editing a column.
Input: dataEditModel: Python object for editing the spreadsheet data
       colIndex: the column number for the selected column
'''
def editColumn(dataEditModel, colIndex): #This line must be written exactly this way
    #Enter Python commands here
    pass #placeholder so that the function body is valid until commands are added
Note
The best way to create a script is to use a provided script as a guide for creating a new script. Scripts that are provided are a guide to the required Python syntax.
A Python script can be run by selecting the script from a menu or from the command line. An alternative way to run a script is by directly selecting the script to run. This script can be selected using the Tools > Run Python Script menu option. Selecting this option from the Tools menu will display a dialog prompting for the location of the script to be run. The path can either be entered manually, or the Browse button can be used to browse to the script path. Once the location of the script is filled in, click Run Script to start the script.
Scripts written by Golden Helix and other users to enhance the capabilities of the software can be obtained by going to Downloads > Add-on Scripts. All scripts that are available can be downloaded and saved into the appropriate scripts folder.
The prompt dialog allows the script writer to create a comprehensive GUI dialog to obtain the information a script requires from the user. The parameters of this application programming interface (API) are too extensive to describe solely in the function definition, so they are detailed below. Please refer to ghi.promptDialog() for additional information.
An example prompt dialog can be generated by the following code:
>>> myUserInput = ghi.promptDialog([{"name":"string","label":"Enter string:",
...     "type":"string","tooltip":"Any string will do",
...     "default":"Hello World!"},
...     {"name":"myInt","label":"Enter integer:",
...     "type":"integer","min":0,"max":100,"default":50},
...     {"name":"value","label":"Enter double:",
...     "type":"double","min":-1,"default":100.0001},
...     {"name":"method","label":"Select method",
...     "type":"combobox","list":["method 1", "method 2",
...     "method 3"]}], title = "User Input", width = 400)
>>>
The above example (see Example of a Prompt Dialog for User Input) displays a dialog box with four input fields: three text boxes and a drop-down list. The first text box is for a string, the second for an integer and the third for a real-valued number. The drop-down list allows the user to specify a method.
The following optional keywords can be specified with the promptDialog command after the list of input fields.
- scrollableLayout: bool If True, the layout for the input dialog will be placed in a scrollable frame. This is useful if you have too many inputs to fit on a normal height screen.
- width: integer and height: integer Specify minimum sizes for the input dialog. This will not shrink the dialog to be smaller than is required to display all the widgets when not using a scrollable layout but can be used to make the dialog wider or taller than it otherwise would be.
- helpUrl: string Provide a URL of a web page that provides help documentation for the current script or action.
- title: string Provide a window title for the input dialog.
- okText: string Provide alternative text for the OK button on the input dialog. For example, you may want the button to say Run or Next > if you intend to have further dialogs of user input.
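The optional keywords can be collected in a dictionary and passed along with the item list. The sketch below relies only on the keyword names listed above. The wrapper run_dialog and the stand-in fake_prompt are hypothetical and exist only so the example can run outside SVS; inside SVS you would pass ghi.promptDialog as the first argument instead.

```python
def run_dialog(prompt_fn, items, **options):
    """Call a promptDialog-style function with the item list and keyword options."""
    return prompt_fn(items, **options)


def fake_prompt(items, **options):
    """Stand-in for ghi.promptDialog: returns each item's default, shows no dialog."""
    fake_prompt.last_options = options
    return dict((item["name"], item.get("default")) for item in items if "name" in item)


items = [{"name": "myInt", "label": "Enter integer:", "type": "integer", "default": 50}]
options = {
    "title": "User Input",      # window title
    "width": 400,               # minimum width
    "okText": "Run",            # replaces the text on the OK button
    "scrollableLayout": True,   # place the inputs in a scrollable frame
}
result = run_dialog(fake_prompt, items, **options)
```

Keeping the options in one dictionary makes it easy to reuse the same title and button text across several dialogs in a multi-step script.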
The input fields of the promptDialog() function are specified in a list of dictionaries, each with some required and optional attributes. Every item requires a type field. Items with a type that produces an output require a name field, which will be used as the key in the results dictionary to store the output for that item. For example, to access the integer value you would use the command myUserInput['myInt'] since the name attribute is defined as ‘myInt’.
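Because a missing type or name only surfaces when the dialog is built, it can help to check the item list beforehand. The following is a minimal sketch of such a check, based only on the two rules stated above. The function find_item_problems is hypothetical, and treating "prompt" as the only type without output is an assumption of this example; extend no_output_types to match the types you use.

```python
def find_item_problems(items, no_output_types=("prompt",)):
    """Return a list of messages for items missing a 'type' or a 'name'.

    no_output_types lists types that produce no output and so need no name;
    only "prompt" is included by default, which is an assumption of this sketch.
    """
    problems = []
    for index, item in enumerate(items):
        if "type" not in item:
            problems.append("item %d: missing 'type'" % index)
            continue
        if item["type"] not in no_output_types and "name" not in item:
            problems.append("item %d: missing 'name'" % index)
    return problems


items = [
    {"name": "myInt", "label": "Enter integer:", "type": "integer"},
    {"label": "Enter string:", "type": "string"},   # no name: flagged
    {"label": "Just a note", "type": "prompt"},     # no output, no name needed
    {"name": "orphan", "label": "No type given"},   # no type: flagged
]
problems = find_item_problems(items)
```

An empty list from find_item_problems means every item has a type and every output-producing item has a key under which its value can be looked up.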
There are a number of data entry methods available. Most items require a label attribute for providing a user prompt. Items with an explicit label to the left of the input widget allow setting the checkable attribute to True to turn that label into a checkbox. If the checkbox is not checked, the input widget is disabled and the item returns None for its value. The checkbox is checked by default, but this can be changed with the checked attribute.
Each entry may have an optional tooltip attribute, which is a message that appears when the user hovers the mouse over the field. Labels are listed at the left of the data field. You can make an input optional by setting its required attribute to False. Most input types also allow specifying a default value. The type attribute defines which type of data entry field is to be constructed.
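Since an unchecked checkable item yields None, a script usually needs a fallback for that case. The sketch below shows one way to handle it. The item name "threshold", the helper value_or_fallback, and the two result dictionaries are invented for illustration; the dictionaries stand in for what the dialog might return.

```python
def value_or_fallback(result, name, fallback):
    """Return result[name], or fallback when the item was unchecked (value None)."""
    value = result.get(name)
    if value is None:
        return fallback
    return value


# Item definition using the checkable, checked, and tooltip attributes.
threshold_item = {
    "name": "threshold",
    "label": "Apply threshold:",
    "type": "double",
    "default": 0.05,
    "checkable": True,   # the label becomes a checkbox
    "checked": False,    # start unchecked, so the input starts disabled
    "tooltip": "Leave unchecked to skip thresholding",
}

# Plain dictionaries standing in for possible dialog results.
unchecked_result = {"threshold": None}
checked_result = {"threshold": 0.01}

skipped = value_or_fallback(unchecked_result, "threshold", 1.0)
chosen = value_or_fallback(checked_result, "threshold", 1.0)
```

Comparing against None explicitly, instead of testing truthiness, keeps legitimate values such as 0 or an empty string from being replaced by the fallback.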
The available types and their specific behavior and attributes are as follows:
- integer: prompts user for an integer. Does error checking to ensure that a valid integer has been entered. Aliases: int.
- Required attribute:
- label:string: the input label
- Optional attributes:
- default:integer: specifies a default value shown in the dialog window.
- min:integer: specifies that integer must be greater than or equal to the specified minimum value.
- max:integer: specifies that integer must be less than or equal to the specified maximum value.
- checkable:bool: allows the input to be required if it is checked, otherwise not required.
- checked:bool: default status if checkable:True.
- double: prompts user for a double precision (64-bit) number. Does error checking to see that a valid double has been entered. Aliases: float, real.
- Required attribute:
- label: the input label
- Optional attributes:
- default:double: specifies a default value shown in the dialog window.
- min:double: specifies that user-entered double must be greater than or equal to the specified minimum value.
- max:double: specifies that user-entered double must be less than or equal to the specified maximum value.
- checkable:bool: allows the input to be required if it is checked, otherwise not required.
- checked:bool: default status if checkable:True.
- string: prompts user for a string. If a required input, checks that the input string is not blank.
- Required attribute:
- label: the input label
- Optional attributes:
- default:string: specifies a default value shown in the dialog window.
- checkable:bool: allows the input to be required if it is checked, otherwise not required.
- checked:bool: default status if checkable:True.
- checkbox: provides a checkbox with the text provided by the label as the prompt. A checked box corresponds to a value of True and an unchecked box to False. Aliases: check
- Required attribute:
- label: the input label
- Optional attributes:
- default:bool: the checked state.
- combobox: takes an additional non-optional attribute, “list”, which contains a list of strings to form a list of choices for the user to choose from. For example, the dictionary entry "list":["item 1", "item 2"] specifies a combobox with two possible values to choose from, “item 1” and “item 2”. The first item is specified by default. Returns the selected string from the list. Aliases: combo.
- Required attribute:
- label: the input label
- Optional attributes:
- default:string: the string value of the item to be selected.
- checkable:bool: allows the input to be required if it is checked, otherwise not required.
- checked:bool: default status if checkable:True.
- radio: Similar to combobox but presents each option as a radio button. Requires “list”, which contains a list of strings to form a list of choices for the user to choose from. For example, the dictionary entry "list":["item 1", "item 2"] specifies two radio buttons to choose from, “item 1” and “item 2”. The first item is specified by default. Returns the selected string from the list. Alias: radio.
- Required attribute:
- label: the input label
- Optional attributes:
- default:string: the string value of the item to be selected.
- orientation:integer (one of below): specifies the layout of the buttons
- checkable:bool: allows the input to be required if it is checked, otherwise not required.
- checked:bool: default status if checkable:True.
- dir: Prompts the user for a directory. A Browse button will pop up the operating system’s directory chooser. Alias: directory.
- Required attribute:
- label: the input label
- Optional attributes:
- default:string: the full path to an existing directory.
- fileopen: Prompts the user for an existing file. A Browse button will pop up the operating system’s file open chooser.
- Required attribute:
- label: the input label
- Optional attributes:
- default:string: the full path to an existing file.
- filter:string: The filter string for the open file dialog. Multiple filters can be separated by a double semicolon. For example "Python File (*.py);;All Files (*)" would provide two filters in the dialog, defaulting to showing only files ending in .py.
- filesave: Prompts the user for a new file. A Browse button will pop up the operating system’s file save as chooser.
- Required attribute:
- label: the input label
- Optional attributes:
- default:string: the full path to a file.
- filter:string: The filter string for the save file dialog. Multiple filters can be separated by a double semicolon. For example "Python File (*.py);;All Files (*)" would provide two filters in the dialog, defaulting to showing only files ending in .py.
- files: Prompts the user for multiple existing files. A Browse button will pop up the operating system’s file open chooser. Returns the list of selected files.
- Required attribute:
- label: the input label
- Optional attributes:
- filter:string: The filter string for the open file dialog. Multiple filters can be separated by a double semicolon. For example "Python File (*.py);;All Files (*)" would provide two filters in the dialog, defaulting to showing only files ending in .py.
- markermap: Prompts the user for a marker map from the marker maps folder.
- Required attribute:
- label: the input label
- spreadsheet: Prompts the user for a spreadsheet from the open project (requires a project be open). Returns the selected spreadsheet’s ID. Alias: ss.
- Required attribute:
- label: the input label
- Optional attributes:
- default:integer: The ID of a spreadsheet.
- mapped:bool: If True, only allow for mapped spreadsheets to be selected.
- requirements:integer list: A list of requirements that a spreadsheet must meet in order to be selectable. The following is a list of possible requirements:
- spreadsheets: Prompts the user for multiple spreadsheets from the open project (requires a project be open). Returns the list of selected spreadsheets’ IDs.
- Required attribute:
- label: the input label
- Optional attributes:
- mapped:bool: If True, only allow for mapped spreadsheets to be selected.
- requirements:integer list: A list of requirements that a spreadsheet must meet in order to be selectable. Uses the same requirements as the spreadsheet input type.
- column: Prompts the user for a column from a specified spreadsheet. A select button will pop up a column chooser dialog. Returns the selected column’s index. Alias: col.
- Required attributes:
- spreadsheet:integer or string: ID of the spreadsheet or the name of the previous item of type “spreadsheet” to select the spreadsheet from which columns are chosen.
- label: the input label
- Optional attributes:
- default:integer: a valid column index.
- types:integer list: Changes the column type filter in the column chooser dialog to allow selection only of columns of the given types. Items should be one of ghi.const.Type*. For example "types":[ghi.const.TypeReal, ghi.const.TypeInteger] would only allow the selection of real and integer columns.
- filter:integer: A filter on the columns used from the chosen spreadsheet. This uses the same logically ORed filter options as spreadsheet.dataModel(). For example, to only allow Active data use "filter":ghi.const.FilterActive.
- allowLabel:bool: indicates whether or not the row labels are an appropriate input and thus can be selected by the user.
- columns: Prompts the user for a list of columns from a specified spreadsheet. A select button will pop up a column chooser dialog. Returns the selected columns’ indexes. Alias: cols.
- Required attributes:
- spreadsheet:integer or string: ID of the spreadsheet or the name of the previous item of type “spreadsheet” to select the spreadsheet from which columns are chosen.
- label: the input label
- Optional attributes:
- types:integer list: Changes the column type filter in the column chooser dialog to allow selection only of columns of the given types. Items should be one of ghi.const.Type*. For example "types":[ghi.const.TypeReal, ghi.const.TypeInteger] would only allow the selection of real and integer columns.
- filter:integer: A filter on the columns used from the chosen spreadsheet. This uses the same logically ORed filter options as spreadsheet.dataModel(). For example, to only allow Active data use "filter":ghi.const.FilterActive.
- allowLabel:bool: indicates whether or not the row labels are an appropriate input and thus can be selected by the user.
- markermapfield: Prompts the user for a marker map field from the marker map of a provided spreadsheet. A select button will pop up a field chooser dialog. Returns the selected field’s index. Alias: mapfield.
- Required attributes:
- spreadsheet:integer or string: ID of the spreadsheet or the name of the previous item of type “spreadsheet” to select the spreadsheet from which columns are chosen.
- label: the input label
- Optional attributes:
- default:integer: a valid map field index.
- types:integer list: Changes the field type filter in the field chooser dialog to allow selection only of map fields of the given types. Items should be one of ghi.const.Type*. For example "types":[ghi.const.TypeInteger] would only allow the selection of integer fields.
- allowLabel:bool: indicates whether or not the row labels are an appropriate input and thus can be selected by the user.
- coordsys: Prompts the user to select a coordinate system (species and build combination) from the system provided list built from available genome maps. The default coordinate system will be the global or project default. Returns the coordinate system string for use when building Genome Browser tracks.
- default:string: coordinate system string.
- track: Prompts the user for a Genome Browser track from the project, user data folder or system folder. Returns the selected track’s URL. Alias: annotation.
- Required attribute:
- label: the input string
- Optional attributes:
- default:string: a track URL.
- source: one of “local”, “network” or “all”: Selects the source of the Genome Browser track. Local indicates tracks that have been entirely copied to a local IDF file. Network indicates tracks that are streamed on demand from data.goldenhelix.com. All provides both source options in the selection dialog. Defaults to “local”.
- trackType:string: Filters the track selector lists to only contain tracks of the given type. Example types are “Interval”, “Cytoband”, “Gene”, “Probe”, “Allele Sequence”, and “Intensity”. Defaults to no track type filter.
- idf: Prompts the user to select a new or existing IDF file. The dialog defaults to the user app data folder; tracks can be appended to existing IDF files or placed in new IDF files. This prompt may be desirable for scripts that need to write track information to an IDF file.
- Required attribute:
- label: the input string.
- Optional attributes:
- default:string: full path to an IDF file.
- prompt: displays a label only. This type requires no user input and produces no output. It does not require a name attribute.
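To tie a few of the types above together, the sketch below builds a filter string from description and pattern pairs and defines a column item that refers to a preceding spreadsheet item by name. The helper make_filter and the item names "script", "sheet", and "phenotype" are invented for this example. The list would be passed to ghi.promptDialog inside SVS, which is not called here.

```python
def make_filter(pairs):
    """Join (description, pattern) pairs into a filter string separated by ';;'."""
    return ";;".join("%s (%s)" % (description, pattern) for description, pattern in pairs)


# The first filter in the string is the one the dialog shows by default.
file_filter = make_filter([("Python File", "*.py"), ("All Files", "*")])

items = [
    {"name": "script", "label": "Script to open:", "type": "fileopen",
     "filter": file_filter},
    {"name": "sheet", "label": "Spreadsheet:", "type": "spreadsheet"},
    # The column chooser refers to the spreadsheet item above by its name.
    {"name": "phenotype", "label": "Phenotype column:", "type": "column",
     "spreadsheet": "sheet", "allowLabel": False},
]
```

Referring to the spreadsheet item by name, instead of a fixed spreadsheet ID, lets the column chooser follow whichever spreadsheet the user selects in the same dialog.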
There are some types that do not provide input and do not contribute to the returned output list, but allow for the dialogs to have a richer layout and organization. The name, default and required attributes are not used for these types, although label usually is.
These types are containers for other types and have a few attributes in common.
First, they require an items attribute, which is a list of dictionary items like the one provided to promptDialog. These are the items that will be inside the container. Containers can contain container items so multiple levels of items can be constructed for a maximum flexibility in displaying items to the user.
Second, containers have an optional layout attribute that is one of “vertical”, “horizontal” or “columns”, where the default is “vertical”. These are described below.
- vertical: All items are laid out vertically, one on top of the other.
- horizontal: All items are laid out horizontally, one after the other.
- columns: Items are laid out in vertical columns. This requires that the items attribute is a list of lists of items instead of just a list of items. The outer list defines the columns. For example, "items":[ [item1, item2], [item3, item4, item5], [item6] ] would produce a three-column layout with items 1 and 2 in the first column, items 3, 4 and 5 in the second and item 6 in the third.
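For the columns layout, the nested list can be built from a flat list of items. The sketch below is one way to do that. The helper split_into_columns and the generated field items are hypothetical. The container's type value is deliberately left out of container_attributes because this section does not name the container types; it must be added before the dictionary is used.

```python
def split_into_columns(items, column_count):
    """Distribute items over at most column_count lists, filling each column in order."""
    if column_count < 1:
        raise ValueError("column_count must be at least 1")
    per_column = -(-len(items) // column_count)  # ceiling division
    if per_column == 0:
        return []
    return [items[start:start + per_column]
            for start in range(0, len(items), per_column)]


fields = [{"name": "f%d" % i, "label": "Field %d" % i, "type": "string"}
          for i in range(1, 6)]
columns = split_into_columns(fields, 2)

# Attributes for a container item; its "type" key is omitted on purpose (see above).
container_attributes = {"layout": "columns", "items": columns}
```

With five fields and two columns, the first column receives three items and the second receives two.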
If the user cancels, an empty list is returned. It is wise to first check if the list is empty before trying to access its elements. If there is an error in syntax, a Python exception is returned with a description of the error. If the user hits OK, and there is any error in the input, the user is told by the dialog what the problem is, which may be remedied from within the dialog. If there are no errors, a dictionary of the user inputs is returned.
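The check for a cancelled dialog can be wrapped in a small helper so that every script handles it the same way. The following is a minimal sketch. The function read_inputs is hypothetical, and the literal values passed to it stand in for what ghi.promptDialog might return.

```python
def read_inputs(result, names):
    """Return the requested values as a tuple, or None if the dialog was cancelled.

    An empty result (the documented return value on cancel) is treated as
    cancellation, so no element is accessed in that case.
    """
    if not result:
        return None
    return tuple(result[name] for name in names)


# Stand-ins for the two outcomes: a cancelled dialog and an accepted one.
cancelled = read_inputs([], ["myInt", "method"])
accepted = read_inputs({"myInt": 42, "method": "method 2", "value": 1.5},
                       ["myInt", "method"])
```

A script would typically return early when read_inputs gives None and unpack the tuple otherwise.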