New Product Add-Ons
- The VS-CNV 2.0 algorithm is now available as an add-on licensed product to VarSeq! Building on the first version's probabilistic model for calling CNVs on targeted gene panels, version 2.0 now scales to calling large events and chromosomal aneuploidy events on large gene panels and exomes. Included in this is the ability to call Loss of Heterozygosity (LOH) events and integrate those calls with the CNV algorithm. See the CNV Caller on Target Regions and LoH Caller sections for more details.
- The PhoRank algorithm has been updated extensively to improve the ranking of genes relevant to sample phenotypes. Also, input phenotypes have been extended to include OMIM provided syndromes and phenotypes. The OMIM content add-on is required for this feature.
- A new Match Gene List from Phenotypes algorithm is available that allows you to define a gene list using input phenotype terms. This algorithm also supports input terms from OMIM as well as HPO. See Match Genes Linked to Phenotypes.
- A number of updates and improvements have been made to the gene annotation algorithm that provides per-transcript variant
- New fields in the Transcript Interactions source provide the reference and alternate amino acid for each transcript in the interactions table. Additionally, the number of exons and the amino acid position are provided.
- The HGVS p. field was updated for synonymous and frameshift variants to be the long form. For synonymous variants, this includes the reference amino acid. For frameshift variants this includes the amino acid in the altered sequence.
- Intronic variants now have their nearest exon reported in the Exon Number field, and a new field 5’ Exon Number can be used to determine which two exons the intronic variant is between.
- In the case where a variant overlaps multiple genes, the order of the genes is consistent between all fields in the Summary source.
- For VSReports a number of improvements have been made to improve customization. See Customizing Reports for more details.
- Included a new VSReports template that can be used to facilitate N-of-One submissions.
- A new Recently Closed Tabs menu allows you to reopen tabs you have closed during the current VarSeq session. Right clicking on any tab or tab bar location allows you to access this menu.
- Variant sets and other table record sets can now have the last change reverted. For example, in the right-click menu over the check-boxes as well as in the toolbar menu, there is now an Undo last change to Primary Findings option.
- The “Web Browser” tab of VarSeq has been upgraded to be based on the open source version of Chrome. It now has a direct page Print capability, integrated downloads, better session management and overall performance improvements.
- VCF import updates:
- Better handling of indels with a reported END field specified.
- Fixed issue when field promotion was specified along with advanced left-align options for multiple sample import.
- The Allelic Depth field is no longer an allele matching field unless it is a list.
- The Variant Allele Frequency is now automatically computed upon the import of TCGC data.
- Allow for computing the Variant Allele Frequency on a subset of samples that have data to support the calculation.
- Now able to correctly handle VCF files with an INFO level Samples field included in the data.
- The max operator in the Expression Editor was not functional when no score was assigned to a variant. Now the Max operator checks for missing values before taking the Max.
- Fixed issue with Aggregate Compute Fields algorithm when using the length function on a matrix string field.
- In the Expression Editor, missing string array values now return the correct NA value for invalid expressions. This allows users to add expressions such as split(HGVScName, ".").
- Now correctly saving custom visibility options for sample fields when project is opened or a table is cloned.
- Fixed naming issue for XLSX tabs when exporting multiple tables in VSPipeline.
- Fixed an issue preventing annotation tracks from being exported as text files from the Data Source Library.
- Export of Assessment Catalogs from the Data Source Library no longer contains a blank first row.
- A number of improvements were made related to selection of variants and samples in tables, including:
- Variant selection is now synchronized across tables. If multiple tables contain a given variant, changing the selection in one table will select that row in the other tables. This works for other table types like Coverage, Genes and CNV as well.
- When a variant or CNV is selected and a GenomeBrowse view tab is open, a blue highlight marker is placed over the genomic interval defined by the variant or CNV. This is similar to existing mouse-anchor selection that spans all plots but is done automatically and is visually distinct.
- Changing the current variant clears the GenomeBrowse console view so it cannot be misinterpreted as relating to the now updated plot views.
- The Assessment Catalog view relies on the selection of the current variant. Previously it would show out-of-date variants when a change of sample results in the current variant not being selected. It now clears to no variant selection in sync with the table.
- The sample selected using the project toolbar is automatically selected in any open Samples table.
- Display names for tabs can now be changed from the right-click context menu for the tab.
- Improved automatic association of BAM files to VCFs upon data import. Included the option to browse for BAMs for each sample through the file explorer.
- Project Templates that plot coverage statistics in the GenomeBrowse console view from the Targeted Region on Coverage algorithm will automatically reload the plots in the GenomeBrowse console upon data sync.
- Decreased the buffer size when importing variants to the Assessment Catalog for a smoother progress bar upon import.
- When display names have been updated these new identifiers will be included in the Export dialog to create unique entries.
- Creating a new Variant Set in a project with hundreds of samples now completes in a reasonable amount of time.
- Added new context arguments thisForm and varInput to the variant-level autofill function for current variant input context. This allows one form field to respond to changes from another form field in VSReports.
- Modifying the checked state of a Variant Set and updating entries in an Assessment Catalog now produce useful log messages about the change in the project Log tab.
- The Plot BAM for Current Sample option is now properly hidden if BAM files were not associated upon data import.
- Updated the ACMG gene list included with the Exome Trio Template to include genes from the ACMG-SFv2.0 recommendation list.
- The export to VCF tool now creates compressed vcf.gz files by default.
- An error will notify users when Split Variants Based on Unique Genotypes is selected as an advanced import option if the required Genotype field is not present in the data.
- Coverage Region updates:
- Sample name will now be appended to column group name when output is first generated.
- Algorithm will now run on multiple cores.
- A new “Mean Filtered Depth” field can be used to detect targets with a high level of filtered coverage due to read multi-mapping.
- You can set the CNV Reference Folder from the Options dialog of VarSeq as well as from the algorithm options for the CNV Caller.
- Export to Excel now includes the option to choose a custom delimiter for lists.
- When closing a project, you will be prompted for any un-saved reports or assessment catalog changes to be saved. This includes reports or catalogs hosted on VSWarehouse.
- Fixed a crash that occasionally occurred when switching samples with a VSWarehouse-based Assessment Catalog open.
- Complex VSReports based reports that evaluated some JS resulting in undefined values would crash when rendering. This has been resolved by using a different JS evaluation engine when running the report render function. There should be no changes in the resulting rendered report.
- When running a bulk upload of fields into an assessment catalog, the values were not properly assigned if some of the fields were left unmapped. This has been resolved.
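The Expression Editor fix for the max operator in the notes above (checking for missing values before taking the maximum) can be illustrated with a short Python sketch. This is not VarSeq's implementation: the function name `max_ignoring_missing`, the use of `None` to stand in for a missing score, and the choice to return `None` when every value is missing are all assumptions made for illustration.

```python
from typing import Iterable, Optional


def max_ignoring_missing(values: Iterable[Optional[float]]) -> Optional[float]:
    """Return the largest non-missing value, or None if every value is missing.

    None is a stand-in for a missing score (an assumption for this sketch).
    """
    # Drop missing entries first so max() never compares None with a number.
    present = [v for v in values if v is not None]
    return max(present) if present else None


scores = [0.2, None, 0.9]
print(max_ignoring_missing(scores))
print(max_ignoring_missing([None, None]))
```

Filtering before calling `max` avoids both the comparison error a missing entry would raise and the error `max` raises on an empty sequence.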
New Product Add-Ons
- CNV calling on NGS target data is now available as an add-on licensed product to VarSeq! In its initial release, we are recommending it only for the application of targeted gene panels on NGS data. It will run with Exomes, but it is expected that further tuning and region-specific adjustments may be needed. See the CNV Caller on Target Regions section for more details.
- You can now plot sample-level fields imported with your VCF or from the output of algorithms (such as the CNV algorithm) and have the values track the project Current sample. This dramatically improves many interpretation workflows that utilize a GenomeBrowse view.
- String and Enum fields can be plotted directly. Enum fields will automatically be colored by their values.
- Sample-level fields for variants or other tables with genomic coordinates such as Target Coverage or CNVs can now be plotted as a Heat Map for all samples using the new “Plot for All Samples” context menu on their columns.
- All fields can now be searched using the “Query Column Values” context menu item. This makes it easy to filter non-variant tables (such as the new CNV events table) on things like the CNV State field. You can also search the genomic position or range column at the table level.
- Fixed errors that occurred when importing variants into a configured Assessment Catalog from an input source that contained indels.
- Fixed an issue where the bottom two rows of a table did not allow copying of the cell contents to the clipboard.
- Progress reporting was fixed for XLSX export in VSPipeline.
- Table filters are now case insensitive. In the previous 1.4.1 release they were made case-sensitive unintentionally.
- Fixed issue using the split function on string array fields within the Expression Editor.
- The context menu for column groups now supports immediate toggling of the field visibility of the fields for the given source. This feature was available in 1.4.0 in a different context menu and didn’t make it into the table replacement that shipped in 1.4.1. The new implementation handles sources with many fields compactly and easily supports toggling all fields on or off.
- Selecting multiple rows of the variant, coverage or CNV table will zoom the GenomeBrowse view to include all the selected genomic features.
- You can now import VCF files with no records in them as well as export empty tables. This comes in handy when dealing with small gene panels on a project-per-sample basis.
- The log for the import algorithm now displays the common base path of imported VCF files.
- Updated family import options for multi-allelic splitting to reduce duplicate variant data being created.
- When out of date annotations are updated, they now retain their position in tables in which they are visible.
- The table has been completely rewritten with a multitude of usability improvements as well as improved memory efficiency, removing previous limitations on the number of variants viewable in VarSeq. See the updated Table View section for more details.
- Drilling down on details in the table has been rebuilt with in-place details about the current cell and row. These details can be snapped to the right of the table and remain synced with the current row. See more about the new Detail View.
- When shrinking columns to increase the number visible in a table, we now shorten column names dynamically to their symbol. Thus columns like “Read Depth (RD)” will be shortened to “RD”. Similarly, when the space is available, the full column name will be used. Note that this also pertains to “Variant Sets” toggle-fields used to flag variants to report, which previously were “fixed” to their short two-letter symbol form.
- Previously, the Group by Genes feature had a limit of 65,536 variants being in a single gene. This limit has been removed.
- Added Aggregate Compute Fields algorithm which allows for summary computations on sample fields. See Aggregate Compute Fields for more information.
- Added support for Assessment Catalogs hosted on VSWarehouse™ to be utilized by VarSeq.
- No longer allow sample specific variant sets to be created for projects with no samples.
- Fixed BAM hyperlinks in Samples table so they will work if the path contains “&” or “,”.
- When saving Assessment Catalog fields of type “Multi-Item Select” with more than one selected item, only the first selected item was keeping its selected states. It now saves and restores properly.
- Fixed behavior in Note tabs where editing the title of an existing note would sometimes indent all the contents of the note.
- Informative error messages are now provided when selecting incorrect sources to be used for subsetting data on the import dialog.
- Allow for import of VCF info fields that contain a dash in the identifier.
- Fixed error when selecting TakeAll option for merge behavior on import of variant sites fields.
- Fixed bug in report template that inhibited auto-downloading of required OncoMD annotation sources.
- Significant speedups can be expected when importing data, especially data with large numbers of samples. Under the hood, new compression algorithms are being used to store project data.
- The export dialogs have now been rebuilt as wizards with more descriptive prompts and a running log describing the export status. The back end has also been optimized with significant speedups noticeable on large exports.
- GenomeBrowse views now have a right-hand dock window to display the details of the last clicked item. It also has had the button to add plot items renamed to “Plot”.
- Clicking the histogram button on a numeric filter card now pops up a drill down on that field that includes a histogram chart.
- When a table is sorted, there is now an interactive purple sort indicator beside the filter indicator that allows you to toggle the sort on and off, as well as change its direction and remove it.
- Sorting a field that is a sample field for the current sample gets updated as the current sample changes. For example, you can sort your table by Zygosity, and as you change the current sample, the table will re-sort on the Zygosity of the newly selected sample.
- You can now select multiple rows using Shift and Ctrl and toggle the checked state of variants for use in reports and variant sets.
- While variants are importing, all incoming fields are now immediately visible in the table.
- You can now interact with fields as they are being computed, including sorting on them and getting their descriptions.
- The selected gene in a Variant by Genes table is now preserved as the filters are updated, as long as that gene remains in the filtered variant set.
- Made the fields in the Manage VSWarehouse™ dialog read only.
- Allow the sample algorithm fields like Zygosity to be deleted.
- The annotation source selected by default for the region import option is the current RefSeq Genes source for the selected project assembly. Dialog will display source name with the path to the source listed as the tool tip.
- Updated Human Phenotype Ontology and Gene Ontology annotations sources used with the PhoRank algorithm. See Sample PhoRank Gene Ranking for further details.
- The import algorithm will now recognize <*> in gVCF files as a spanning alternate allele that can be used to fill in reference calls when importing multiple samples together.
- Updated sample level fields naming for family samples.
- BAM file selection for association on import of samples was optimized to better handle files that are stored in different directories.
- Changed sort order of samples in projects to use natural string sorting of sample names.
- The Assessment Catalog view now has a “Refresh” button that allows you to update to changes from the database that occurred since you selected the variant.
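The natural string sorting of sample names mentioned in the notes above can be sketched in Python as follows. This is one common way to build a natural sort key and is not taken from VarSeq itself. The helper name `natural_key`, the sample names, and the case-insensitive comparison are all assumptions for illustration.

```python
import re
from typing import List, Union


def natural_key(name: str) -> List[Union[int, str]]:
    """Split a name into alternating text and digit chunks.

    re.split with a capturing group always yields text chunks at even
    indexes and digit runs at odd indexes, so digit runs can be compared
    as integers and text chunks as lowercase strings.
    """
    parts = re.split(r"(\d+)", name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


samples = ["Sample10", "Sample2", "Sample1"]
print(sorted(samples))                   # plain string order
print(sorted(samples, key=natural_key))  # natural order
```

With plain string sorting, "Sample10" sorts before "Sample2" because the comparison is character by character. The natural key compares the numeric parts as integers, so "Sample2" comes first.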
Projects created with VarSeq 1.4.1 or newer will not be able to be opened by previous versions. If asked to open a project that is newer than the supported version, VarSeq provides a warning.
- Added APIs to automatically generate rendered reports from VSPipeline.
- Added an FAQ section to the VarSeq Manual. See Frequently Asked Questions.
- Prevent VarSeq crashes that could occur when deleting table views. This crash was more likely to occur with cloned tables in projects with a template.
- Fixed bug that prevented subsetting on import using a region source. Gene sources should now also remember the “Exon Only”/“Full Transcript” parameter when set in a template.
- Prevent VSPipeline hang when adding variants to a record set and then exporting tables when using a batch script or specifying all operations in one command.
- Fixed bug that prevented VSPipeline from importing sample fields.
- Fixed bug that caused VSPipeline to re-run PhoRank when project is opened in VarSeq.
- Allow download and update of annotation sources that are no longer valid.
- Always open an empty web view when opening a new Web Browser tab. This corrects the inconsistent behavior of which site the tab would open on view creation.
- Updated shipped Example Projects to use up to date versions of annotation sources.
- Added sample filling for enumerated values.
- Updated VSPipeline to auto-import all sample fields.
- CADD variant scores are now available in VarSeq. See CADD for more information.
- Added Variant PhoRank algorithm to rank genes based on their relevance to user-specified phenotypes for all samples. This algorithm will also work in projects without any sample information. See Variant PhoRank Gene Ranking.
- VCF export option now supports and defaults to compressed file format.
- Prevent crashing when trying to plot sources that are not available:
- Plot Unfiltered Variants
- Plot Numeric Field...
- Fixed bug that triggers unnecessary dialog pop up when Add button is clicked.
- Fixed an issue that prevented sort order from being updated when record values were changed. To update the sort order, reselect the sort option from the right-click menu.
- Fixed issue exporting Coverage Region table for multiple samples in the project.
- Respect all field merge options when importing data from a template.
- When template contains a computed field that depends on another computed field, wait for the first to complete before starting the second.
- Fixed bug for computed fields when converting Float64 variables to Float.
- Fixed progress in convert wizard when left align is turned on.
- Prompt for download when using Add > Computed Data > Annotate Transcripts and Add > Computed Data > Annotate Regions instead of annotating against the remote source.
- Fixed issue downloading network sources during batch workflows in VSPipeline.
- Handling of multiple genotype calls for a single sample at a position has been updated to be more representative of the variant changes. These updates include:
- Allelic primitives now sums the allelic depth when collapsing two matching (complementary) features at the same position for the same sample.
- Allelic primitives now does not sum quality metrics such as read depth when combining alternates to make a homozygous variant genotype.
- When multi-allelic sites are split on import, use all valid alternate alleles at that site when counting alleles in the Count Alleles algorithm.
- Include all alleles present in a trio in the Alternate allele list including the alternates not present in the proband.
- When left-aligning and splitting variants into allelic primitives, merge phased heterozygous calls into a single homozygous call when collapsing the variants.
- Fixed bug that causes VarSeq to crash when selecting sample fields visibility if sample fields were not imported.
- Fixed issue with missing report fields from the ACMG Hereditary Gene Panel Template.
- Allow Evernote to open a new note even when it cannot authenticate a shared notebook. Only show notebooks that can be authenticated in the available list.
- Fixed the human GRCh_38 assembly so each chromosome has the correct length specified.
- Fixed issue when importing variants from a TSF file and selecting to subset the import by region.
- Fixed issues with computing lengths of chromosomes when converting FASTA data to create a new genome assembly.
- Renamed computed “Alt Allele Freq” field to “Var Allele Freq” to ease confusion with the “Alternate Allele Frequencies” in population catalogs. The Var Allele Freq field is computed off of read depth statistics provided by the variant caller.
- Reuse the “Group by Genes” algorithm for PhoRank if available. This prevents extra group by genes columns and keeps all associated algorithms on the same axis.
- Add Cancel button to Plot Numeric Field dialog.
- Added deprecated flags to the Data Source Library so older annotation sources could be hidden automatically when the current box is checked.
- Allow download of OMIM and CADD secure remote sources if licensed to use those sources.
- Added optimization to Count Alleles algorithm to improve speed for large sample sizes.
- Optimized VCF export for large projects that are filtered to a small set of variants.
- Turned on Allelic Primitives by default for all import types.
- Allelic Primitives updates:
- Now performed before Multi-Allelic splitting.
- Adds phasing information when a variant is split into multiple variants.
- On Linux and Mac OS X, import now uses the maximum number of open file handles allowed. Changing this parameter for the operating system allows the user to increase the maximum number of file handles for a more efficient import process. For help increasing the soft-limit on Linux, RHEL or Mac OS X contact email@example.com.
- Updated Template Import to support the following:
- Save deletion of fields during sample import
- Save filter selection during sample import
- Save and use subset by region during import
- Updated verbiage and default options on version update dialog.
- Updated Count Alleles by Gene documentation.
- Fixed an issue that prevented the sample selector from being displayed when no affected samples exist.
- Fixed export options to allow for non-visible fields to be included in the exported files.
- Guard against crashing VSPipeline when passing invalid arguments.
- Fixed field merge behavior and field upgrade behavior when importing more variants into an existing project.
- Prevent merging variants across samples with sample type of “cancer” when the alternate alleles do not match.
- Do not move DP and FILTER fields to sample level fields when there is more than one sample in a single source.
- Add guard in dialog creation when computing Targeted Region Coverage to prevent crashes when using X-terminals or MacOSX.
- Updated import default options to not make fields into lists unless it is required.
- Fixed missing value reporting for sample Coverage Statistics.
- Fixed issue with appending samples from multiple files when not all samples have data in each file.
- Fixed merge behavior for read depth field when sample has multiple values at a site and advanced options for allelic primitives and multi-allelic split are selected.
- Removed Chr M/MT aliasing to prevent merging these two chromosomes from hg19 and 1kg assemblies when merging samples with different M/MT assemblies.
- Fixed bug with allelic primitives and multi-allelic split not sorting variants after splitting. Ensures consistent results when reimporting the generated TSF file into a new project or merging new samples in.
- VSPipeline polishes:
- Disabled instantiation of Titan Grid for table views to prevent crashes when importing millions of variants when visualization in a table is not required.
- Include all fields when using a sample_fields_file on import. Explicitly setting expected fields is only required if the name is different from one of the auto-detected names.
- Dampened progress messages to only show non-redundant messages.
- Removed status of import and algorithms from table view and moved to the toolbar on the main window.
- Added “Delete Algorithm” option to download prompt for annotation sources.
- When algorithms are added that require a download, show the download pending in the algorithm status list. If the download is canceled from the download manager, have the status dialog indicate a download is required.
- Upgraded progress bar at the bottom of each window to be click-able and once expanded will show progress for each individual running task as well as the number of tasks waiting.
- Added support for choices (combo) in report template params.
- On import, when associating BAM files, the option for clearing the detected BAM files has been added. Click on the trash can to clear the list of BAM files. The auto-detection of sample name matching to BAM files has also been fixed to select BAM files that exactly match the sample name.
- Changed field type for imported Identifier field to string array to help improve merging of data from multiple files.
- When importing sample information from a text file any additional fields are imported in their original sort order instead of alphabetically.
- Added VarSeq version to project log and reformatted log columns to improve legibility.
- When merging samples ensure the “Samples” column is the first column in the sample table.
- Updated Cancer Gene Panel report template
- Updated ACMG Hereditary Gene Panel report template
- Computing Alt Allele Freq (AF) separately for each sample so that different fields can be used in the formula for each sample.
- Fixed bug preventing annotation against OncoMD variant sources
- Prevent splitting HTML encodings in Excel export.
- Use Project TEMP folder for Excel export instead of the system Temp folder.
- New Sample Statistics algorithm added which includes computed statistics for each sample over the called variant sites. See Sample Statistics for more information.
- Added new functions to the Compute Fields algorithm including: round, trim, and stdev.
- Added API for skipping header lines in the sample information file in VSPipeline: sample_fields_header_value and sample_fields_ignore_value.
- Adjusted Data Source Library Ctrl+A / Ctrl+Shift+A behavior to modify only items in the current view.
- Fixed Aggregate Counts of Variables Crash.
- Fixed Expression Editor Crash.
- Fixed Append Records option when importing data into a project created with a template.
- Fixed program freeze during import via VSPipeline when no variants are created.
- Fixed VSPipeline hangs on import command for re-import.
- Fixed VSPipeline wait_for_download to wait for annotation refresh or DAL propagation
- Fixed Text Export list delimiter to accept symbols.
- Fixed unchecking signature bug in VSReports introduced in 1.3.2
- Fixed XLSX Export bug causing unreadable and missing data in Excel.
- Count Alleles algorithm with a grouping variable selected no longer crashes VarSeq when launched while import is working.
- Left Alignment of variants is now working correctly on import.
- Export to VCF now creates single sample VCF files for all export options.
- Fixed bug that caused aggressive truncating in the data console.
- Look ahead farther when reading VCF files to ensure proper ordering of features.
- Added support for computing boolean arrays from both variant and sample field using the Compute Fields algorithm.
- Updated EULA.
- Optimized drawing of numeric value plots in GenomeBrowse to improve speed of drawing and redrawing at the whole genome scale.
- Updated truncating in data console to be less aggressive.
- Updated Annotation version control to skip tracks with no series name listed.
- Rebranded Database and Variant Catalog to Assessment Catalog.
- Added a license gate to Coverage Statistics.
- Added Allele Count to Count by Genes.
- Polished Compute Fields dialog.
- Updated manually associating BAM files to support BAMs in multiple directories.
- Added Hyperlink to BAM file path in Sample Table to load in GenomeBrowse
- Updated project names with dots in file names to match folder name.
- Updated import to support old VS projects (v1.2.0 and older).
- Added option to XLSX and Text export to fill in missings with a symbol.
- Reorganized the annotation algorithms in the selection dialog for Computed Data....
- Updated shipped templates.
- Added Gene Name, Transcript Name, and Strand columns to Aux field group.
- Updated Save Template as Project to include sort information for non-sample fields.
- Updated Genotype Zygosity algorithm to better handle half called and multi-allelic genotypes.
- Improved Project Saving On Load.
- Added parameters for subsetting variants on Import to regions defined in an annotation source including +/- BP window and if subsetting down to Exon Only regions or Full Transcripts (Regions).
- Polished variant collapsing transform algorithm on Import.
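The notes above state that the Var Allele Freq field is computed from read depth statistics provided by the variant caller, without giving the formula. As a rough illustration only, the sketch below assumes the per-allele depths follow the VCF `AD` convention (reference depth first, then one depth per alternate allele) and takes the fraction of reads supporting any alternate allele. The function name `var_allele_freq` and the handling of missing data are hypothetical, and VarSeq's actual computation may differ.

```python
from typing import Optional, Sequence


def var_allele_freq(allelic_depths: Sequence[int]) -> Optional[float]:
    """Fraction of reads supporting non-reference alleles.

    allelic_depths is assumed to hold the reference depth first,
    followed by one depth per alternate allele (VCF AD convention).
    """
    total = sum(allelic_depths)
    if len(allelic_depths) < 2 or total == 0:
        # Not enough data to support the calculation.
        return None
    return sum(allelic_depths[1:]) / total


print(var_allele_freq([30, 10]))  # 10 alternate reads out of 40 total
print(var_allele_freq([0, 0]))    # no reads, so no frequency
```

Returning `None` when there is no usable depth mirrors the idea of computing the value only for samples that have data to support it.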
- Added Annotate Transcripts algorithm to the Add > Computed Data menu. Running the Annotate Transcripts through the Computed Data dialog includes the option to modify parameters which includes specifying the splice site boundaries. See annotTranscriptLink for more information.
- Added Match Sample Gene List algorithm to the Add > Computed Data menu. This new algorithm determines gene matches based on sample specific gene lists.
- Added VSPipeline command to import new samples plus the existing samples into a new project.
project_open path="D:/ExistingProject" import project_path="D:/NewProjectPath" file="D:/NewFileToImport.vcf"
- Allow for import from and export to paths containing extended unicode characters. This includes support for Japanese and Arabic languages among others.
- Fixed reported time to import data in the detailed information of the Log.
- Prevent creating a variant database without a path specified.
- VSPipeline bugs fixed:
- Batch files containing the exit command no longer crash.
- Exporting all samples (affected_only = False in the foreach_sample command) no longer crashes.
- Fixed the issue with renaming samples in the Import Wizard.
- Removed empty dialog when canceling out of the “Project Newer Than Software” warning.
- Update table when using table expressions and switching between samples.
- Prevent crash when trying to change a Container to a Local folder in the Data Source Library.
- Prevent crash when opening the Data Source Library if the source tree contains a local source that was watching a folder that was deleted while the program was closed.
- Ensure that the Data Source Library downloads to the annotation folder set in Tools > Options ... unless a different path is set through GenomeBrowse options.
- Updated shipped VSReports templates to fix error when selecting variants that are missing optional OMIM annotations that are used to auto-fill report fields.
- Fixed issue with Convert Wizard that caused the data preview to be missing for all supported file types.
- Guard Variant Assessment database schema removal crash (clicking on the “-” button) when a row/field has not been selected.
- In GenomeBrowse for BAM alignment plots, remember the edited value for Filter Multi-Mapped Alignments when the option is checked and unchecked.
- Fixed issue with loading feature information into the Details view when clicking on Interval sources in GenomeBrowse.
- Better error handling for coverage statistics and mapping of regions back to variants.
- Make sure that all executables have the correct permissions for Linux x64, RHEL and Mac OSX builds (aria2c, assistant, etc.).
- After updating, relaunch VarSeq instead of trying to launch Golden Helix SVS.
- Allow Import to use multiple threads for faster import of data in multiple files.
- Import no longer creates a sample table when importing sites only variant files.
- Allow for coverage computation for VCF files in GenomeBrowse that do not contain a Genotype (GT) field but do have other sample level FORMAT fields.
- Renamed Annotation Download Window buttons to make it clear that downloaded tracks will not be deleted through this dialog.
- Created a more informative error message when the Database Folder path is not valid.
- Update Alternate Allele Frequency calculation to pull data from the sample level FORMAT fields when available.
- Make error message more informative when downloading annotation sources to a location that is out of space.
- Importing more samples or variant sites into a project is now available through File > Import... when data already exists in the project.
- Added new algorithms:
- Count Variants by Gene: Counts the number of homozygous, heterozygous, and all variants per gene.
- Aggregate Counts per Gene: Counts the number of unique variants per sample group or over all samples per gene.
- The number of variants in each record set is now visible on filter cards as a tag indicating how many flagged variants pass through the filter. The tags are clickable and also have right-click menus to update the table to the tagged variants.
- Now prompt with an information icon when a new version of an annotation source is available.
- Added VSPipeline commands:
- to get the user’s encrypted password
- to create, set and modify record sets
- A new mini demo project Example Tumor-Normal Pair Analysis is available by going to File > Open Example Project.
- Import variant bugs fixed:
- Prevent gVCF files from consuming linearly increasing amounts of RAM on import which can max out the available RAM.
- Import unsorted VCF files in the correct genomic order.
- Fixed Advanced Options on import to allow for moving VCF fields from the variant sites to sample level locations.
- Prevent crash on import of some “sites-only” VCF files.
- Prevent crash when running the Genotype algorithm on a VCF file with an additional INFO level “Alternates” field.
- VSReports bugs fixed:
- Prevent crash when unlinking hyperlinks in rich text dialogs when configuring report templates.
- Ensure genes are matched against all OMIM gene name fields (GeneNames and AlternativeGeneNames) to obtain OMIM information for variants in reports.
- Fixed issue with progress not showing when annotating against a batch source like OMIM genes.
- Made Save Project as... work when a report view exists in the project.
- Ensure GenomeBrowse can plot BAM files using Plot Sample Alignment when the path was specified from a sample information file on import.
- VSPipeline bugs fixed:
- Make the table wait for complicated filter chains before exporting data to ensure the filter chain is completely finished with all computations.
- Added field documentation for:
- Exon number for the Annotate by Transcript algorithm.
- Mendel Error sample field and algorithm description.
- Adjusted variant/gene batch annotation for secure sources to perform more queries per batch to increase speed on annotating against sources such as OMIM genes and OMIM variants.
- Added HTML format flags into annotation Source Editor so visualization of these fields can be improved through HTML formatting.
- Updated dbNSFP voting algorithm to include the new FATHMM MKL Coding Pred for versions 3.0 or higher.
- Remove unused data files in the project’s data folder on project save.
- Release file handles after closing a project to remove file locks on project data. This will allow a project folder to be deleted without having to close VarSeq.
- Changed “Project in use” dialog buttons to be “Cancel” and “Override at your own risk” for clarity.
- Allow filter cards that are linked to an algorithm input to have their fields changed as long as the source stays the same. For example, you can change the field used in filtering a Compound Het algorithm output.
- Prevent crash when opening a new note in a project that does not have an existing note.
- Fixed crash when importing and merging TSF files containing sample matrix enumerated lists.
- Prevent hang when exporting non-variant tables from VSPipeline.
- Fixed crash when deleting the Variant Sites column group (imported data) when there are variants imported and also when no variants were imported due to filters specified on import.
- Ensure project templates verify license flags before running.
- Forced update of File menu for an open project on project change to make sure the correct import/add/export options are displayed.
- Updated the import algorithm to handle * (a symbolic allele representing a deletion) from GATK version 3.4.
- In the Details View, added elide pop-ups for table cells that contain more than 50 characters.
- In the right-click menu for cells in the Table view, only display the first 50 characters of the cell as the Copy option. This keeps the menu a reasonable size but all text will be copied.
- Allow segment queries using “M” instead of “MT” if the genome assembly lists “M” as an alias for “MT”.
- Start progress sooner when adding a new annotation. This removes the delay after adding an algorithm when there were several already existing algorithms.
- Allow algorithms on source groups to be rerun. This allows for algorithms to be rerun in old projects when the algorithm has been updated in newer versions of the program. To rerun an algorithm, right-click on an algorithm source group column header and select “Rerun <algorithm>”.
- Fixed alignment and sizes of combo boxes in the Data Source Library to ensure that part of the text will always be shown for the source type selection.
- VSReports Polishes:
- Save selection of report template between tab or view instances.
- Added a vertical scroll bar to the Configure Report Template dialog.
- Added rich text edit controls to the methods, limitations and background widgets in the Configure Report Template dialog.
- Included an example of modifying a report template in the VSReports documentation. See: Example: Adding a Data Field to a Report.
- Update required and optional sources when changing report templates.
- VSPipeline Polishes:
- Have Excel XLSX tab names be consistent with the tab names when exporting from VarSeq GUI.
- When the license is expired, look for an available non-expired license and auto-fill in the key in the Activate a VarSeq License Key dialog. This still requires the user to validate, accept the EULA and then activate the key.
- VarSeq Reports are now available! Golden Helix has added the ability to generate clinical reports for a select set of variants using custom templates to meet individual specifications. For more information see Report View and contact your account manager to get access to VarSeq Reports.
- Added the ability to create record sets or flags to select variants, samples, or coverage regions either by hand or by selecting/adding all variants in the current table view. See Record Sets for more information.
- Added a software updater that, starting with the upgrade from 1.2.2 to later versions, downloads only modified files. This makes the upgrade faster and allows the update to be installed at the same time or at a later time.
- A new mini demo project Example YRI Exome Trio Analysis is available by going to File > Open Example Project.
- Support plotting numeric array fields in GenomeBrowse.
- Created menu item Tools > Options that allows the user to change the location of the Common Data and User Data folders in VarSeq.
- Prevent crash when exporting to VCF files in which the Genotype G_T or Zygosity fields were moved above the numeric GT field and other sample FORMAT fields.
- Fixed searching indexed fields in GenomeBrowse when annotation sources are loaded from Public Repository.
- Fixed coverage regions being listed out of order and not in natural chromosome order.
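Natural chromosome order sorts numeric chromosomes by value (1, 2, ..., 22) ahead of X, Y and MT, rather than alphabetically (1, 10, 11, ...). The following is a minimal Python sketch of such an ordering, not VarSeq's implementation: the `(chrom, start, stop)` tuples, the function names and the handling of a `chr` prefix are all assumptions made for illustration.

```python
import re

# Rank for non-numeric chromosomes; they are placed after all numeric ones.
_SPECIAL = {"X": 1, "Y": 2, "M": 3, "MT": 3}

def chrom_key(chrom):
    """Sort key: numeric chromosomes by value, then X, Y, M/MT, then others."""
    name = re.sub(r"^chr", "", chrom, flags=re.IGNORECASE).upper()
    if name.isdigit():
        return (0, int(name), "")
    if name in _SPECIAL:
        return (1, _SPECIAL[name], "")
    return (2, 0, name)  # anything else (e.g. unplaced contigs), alphabetical

def sort_regions(regions):
    """Sort (chrom, start, stop) tuples in natural genomic order."""
    return sorted(regions, key=lambda r: (chrom_key(r[0]), r[1], r[2]))

regions = [("10", 500, 900), ("2", 100, 200), ("X", 5, 50),
           ("1", 300, 400), ("2", 50, 80)]
print(sort_regions(regions))
```

A plain string sort would place chromosome 10 before chromosome 2; the tuple key avoids that by comparing the numeric value first.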
- Computation on computed fields now working correctly. For example after running the Allele Counts algorithm, creating a computed field of Allele Counts / Alleles now returns the correct allele frequency value instead of all missing values.
- Added the functionality to the table_export_xlsx and table_export_multiple_xlsx commands for VSPipeline to ignore the saved export path and instead use the current working directory.
- Correctly collapse intergenic sequence ontology and effect fields for the Annotate by Transcript algorithm (intergenic_variant sequence ontology should be collapsed into the “Other” Effect category for Clinically Relevant transcripts).
- On MacOS X, when importing data through the Import Wizard and linking BAM files, after selecting the last BAM file, link the last BAM file even if the cursor is still in the file selection box when clicking “OK”.
- Fixed issue with hyperlinks in internal Web Browser when links are set to open new views.
- Fixed error caused by multiple algorithms writing to the same file which caused the database schema to become locked and not finish.
- Allow offline activation of VarSeq license when server connection is refused by local network.
- Excel XLSX Export:
- Changed default file name for multiple table Excel export to be the sample name as it is for single table Excel export.
- Changed the tab names for Excel export to be more informative with the limited characters allowed for tab names.
- Fixed display of the “Merge Behavior” column in the advanced option dialog for Variant Import on MacOS X for those options that cannot be changed. The behavior was blurred so that it was unreadable.
- Fixed display of the Coverage Stats options dialog on MacOS X.
- Created the menu item File > Open Example Project to open menu containing a list of complete example projects shipped with VarSeq.
- The visible column groups and columns for each table type are now stored such that:
- When changing the table type in the table selection box, if the table has been previously viewed the visibility preferences are restored.
- The visibility preferences for that particular table type are lost when the Table View tab is closed.
- When a table type has not previously been viewed, the preferences are copied from an existing view of that table type if it exists.
- The preferences for each table tab and its type are preserved when the VarSeq project is saved, closed and reopened.
- When visualizing annotation sources in GenomeBrowse set labels in the following preferred order: “Identifier” > “Ref/Alt” > “Gene Name” > “Name”
- Changed the controls for opening an existing note, creating a new note and deleting an existing note to match the new style of table selection.
- Polished the default names for the Note View initial view and for an untitled note.
- Added the gvcf extension to the recognized import file selection list.
- Allow selection of a remote source for specifying regions when computing coverage statistics and download the source before running the algorithm.
- Allow indexing of string array fields for annotation sources. This supports querying against these fields in GenomeBrowse.
- Coverage statistics are now available in VarSeq. Coverage statistics are
computed directly from BAM files for regions specified in a BED or other
interval source. The statistics can either be viewed per variant, per region,
or per sample. See Targeted Region Coverage for more information.
- BAM files can be linked to samples on import through the Import Wizard or through the import command for VSPipeline.
- Moved the following toolbar items to the global toolbar for the main window:
- Export tables
- Exporting tables now allows for the explicit selection of a table to export. Multiple tables can still be exported simultaneously for XLSX export.
- Plot data in GenomeBrowse
- Current sample selection
- Added controls for switching the type of a table, for example from a variant table to a variants by gene table or to a sample table.
- Added a sample table containing sample fields specified on import. Any sample statistics that are computed are also added to this table, i.e. coverage statistics.
- Now shipping a mini demo project with VarSeq that can be selected from the Recent Project list when VarSeq is initially installed in Viewer mode.
- Converted Annotate Transcript output for Sequence Ontology (Clinically Relevant) and Effect (Clinically Relevant) fields to categorical arrays to prevent algorithm error caused by too many unique strings for a categorical field.
- Fixed issue with computations not finishing for filter cards created from string fields where the name of the field contained an operator string (ex. “in”).
- Fixed Y-axis zoom in GenomeBrowse when not in automatic mode.
- Name of sample in table column groups has been fixed to be the display name, i.e. Proband (NA12878) now instead of NA12878 when displaying all samples or only affected samples.
- Fixed the broken revert link for Database Views to reset an assessment to a previous state.
- Double-clicking on VarSeq project files (.vsproject) to open the project directly in VarSeq is now working on Windows again.
- Filter Card layout updates:
- Moved expression output number outside of the expression toggle layout.
- Removed menu button on filter cards, the context menu is still available.
- Added a toggle check box on all filter cards to enable/disable a filter card or container.
- Removed filter card configure dialog box, instead the configure options are expandable by clicking on the wrench button.
- Source/field selection is opened upon adding a filter
- More informative auto titles (which now update as the filters change, not just on changing the configure options.)
- All new filter cards are auto-titled by default.
- Manual title can be set by double-click edit.
- Return to auto title by double-click edit + click auto button.
- Added the option to “Show All” or “Show Affected Only” to the bottom of the sample list in the current sample selection drop-down menu.
- When data is imported as Tumor/Normal data, instead of “Show Affected Only” those options will be “Show Tumor Only”.
- Changes to VarSeq Viewer Mode:
- Menu items and actions are no longer hidden if they are not available in viewer mode. Instead, trying to select an action that is not available will open a dialog with the option to request a free trial or enter a license key. It should now be obvious why certain actions are restricted and that the software is in viewer mode.
- Made segment filters more robust in handling spaces when getting segment names out of the text box.
- Updated the file menu to have Add/Import/Export actions that match the updated main toolbar actions.
- The Genotype algorithm has been moved from within the import function and will now appear as an optional algorithm that can be run. Genotypes can now be computed by going to Add > Computed Data... and selecting the Genotype algorithm under the Data Transformation options.
- Removed restrictions on sample limits that can be viewed in the sample
table in the Import Wizard.
- Editing is now allowed when importing more than 100 samples.
- If importing more than 1000 samples, double-click on cells to edit them.
- Added filter card and filter container descriptions to the detailed report for the Detail View generated when clicking on a filter card or container.
- On Lin64, renamed libstdc++.so.6 to libstdc++.so.6.rename-if-needed so that VarSeq will run on Ubuntu 15.04.
- Added the ability to lock a filter chain and the ability to create a template with a locked filter chain to prevent changes to the filter chain after it is locked.
- Report one clinically relevant transcript per gene in a region instead of one for all genes that overlap a variant.
- Prevent crash when logging out while downloading sources from the Public Annotations data repository.
- Proxy settings are now saved correctly for all platforms.
- Import bugs fixed:
- Fixed import of 23andMe samples that were previously imported or converted to TSF files.
- Use original sample names for matching with sample information from a sample field/information file even if the sample names have been renamed in the Import Wizard before loading a sample field/information file.
- Variants are now correctly sorted in genomic order when imported with the advanced Allelic Primitives options selected.
- Present correct error when trying to perform computations on a source and at the same time plot the source in GenomeBrowse instead of updating the time remaining for the completion of the task.
- Prevent crash when the current sample is set to an unaffected sample and the table includes PhoRank and you try to display only “Affected Samples”.
- Make sure that at most 65530 links are created when exporting XLSX files to prevent Microsoft Excel from treating the file as corrupted.
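The link cap described above can be illustrated with a minimal Python sketch. This is not VarSeq's export code: the `(text, url)` cell representation and the `cap_hyperlinks` helper are hypothetical, and only the 65530 limit comes from the note itself. In this sketch, cells past the limit keep their text but lose the link.

```python
MAX_LINKS = 65530  # limit stated in the release note above

def cap_hyperlinks(cells, max_links=MAX_LINKS):
    """Return cells with at most max_links hyperlinks.

    Each cell is a (text, url_or_None) pair. Once the limit is reached,
    later cells keep their text but their url is dropped.
    """
    capped, used = [], 0
    for text, url in cells:
        if url is not None and used < max_links:
            used += 1
            capped.append((text, url))
        else:
            capped.append((text, None))
    return capped

# Placeholder data: 70000 cells that each carry a link.
cells = [(f"rs{i}", f"https://example.org/{i}") for i in range(70000)]
result = cap_hyperlinks(cells)
print(sum(1 for _, url in result if url is not None))
```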
- Variant Databases no longer expect the Ref/Alt column to be the second column in the table. This bug caused variant databases to get into an unusable state.
- Merging TSF files in the Convert Wizard now correctly merges the segment list.
- Filter containers now display the first error message generated by the filters within the container. The filter view also displays the first error message at the bottom of the view. This prevents error messages from being hidden within collapsed cards.
- Ensure that filter cards for samples specified by relationship to the current sample (Mother/Father/Normal) display errors when the specified sample label is not found.
- No longer creating lists for certain variant sites fields when importing a single VCF file.
- Removed rich text from filter card titles and template documentation. This ensures consistency when opening a project or template across various operating systems.
- Fixed placement of top left stash button when the toolbar is hidden.
- Added parameter sample_fields_renamed_col to the import command for VSPipeline.
- No longer including file merge information when a single variant source is imported.
- Now a primary view window is tracked. This is the only window that will have the global toolbar. Closing non-primary views will just close that view. Closing the primary view will prompt for save action and close VarSeq.
- Allow multiple instances of VarSeq and other Golden Helix products to download files from the Data Source Library simultaneously.
- Tumor/Normal Support: Added the ability to import tumor/normal pairs through the import wizard. See Select Relationship for more information.
- Added a Tumor-Normal Template with a generic workflow to find variants present in the tumor samples not present in the matched normal samples.
- PhoRank can now read phenotype mappings from a “Phenotype” sample entity field if sample information is loaded from a text file. This allows PhoRank to be included in templates.
- Now able to import variants from text files provided by 23andMe. See Importing Data for further details.
- Filtering and importing of variants located only in regions specified by an annotation file can now be performed on import. See Import Summary for more information.
- Include helpful information on INS/DEL/SUB/MNP variants in the detail view for a variant report. The length and type of variant is now included after the Ref/Alt string in the title.
- New pipeline commands were added to:
- Accept the EULA from the command shell
- Download required sources
- Get a list of pending tasks
- Wait for variant annotation to complete
- Set an environment variable GOLDENHELIX_USERDATA on launch to specify the path of the VarSeq properties file “vsprops.json”.
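As a rough illustration of the GOLDENHELIX_USERDATA setting, the Python sketch below builds an environment that a wrapper script could pass to a launched process. The path is a placeholder, the helper name is hypothetical, and the note above does not say whether the variable should hold the folder or the full path to “vsprops.json”, so check the VSPipeline documentation before relying on either form.

```python
import os

def env_with_userdata(userdata_path, base_env=None):
    """Return a copy of the environment with GOLDENHELIX_USERDATA set."""
    env = dict(os.environ if base_env is None else base_env)
    env["GOLDENHELIX_USERDATA"] = str(userdata_path)
    return env

# Placeholder path. The returned dict can be passed as env= to
# subprocess.run(...) when launching the program.
env = env_with_userdata("/path/to/userdata", base_env={"PATH": "/usr/bin"})
print(env["GOLDENHELIX_USERDATA"])
```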
- Pipeline bugs fixed:
- If an invalid template name is specified when creating a project, do not create the project folder.
- Batch command can now be run without a license
- If an invalid file is used to specify sample fields, provide an informative error message instead of crashing
- Fixed a bug which can cause successive exports to contain different columns
- Convert Wizard bugs fixed:
- Fixed error when a CSV file was selected for converting that caused the comma delimiter to be treated as part of the field data.
- Import Wizard bugs fixed:
- Fixed problem preventing importing variants from certain VCF files that resulted in the error “No data was provided to write the source”.
- Do not compute Alt Allele Frequency when the frequency should be missing. This bug resulted in frequency values less than 0.
- Fixed issue with missing View menu on MacOS.
- Handle full screen mode on MacOS better.
- Fixed issue with Lin64 version that caused VarSeq to shut down when opening and closing the Select dialog in the filter card control dialog.
- Fixed a crash that could occur if a filter card is moved during the evaluation of the filter chain
- Fixed issue in VCF export when subfields were being exported when the field group option was not selected.
- Prevent accidental deletion of filter expressions from within a filter card when pressing delete on the keyboard.
- Fixed <SHIFT>+. and . keyboard shortcuts. Now . will display the filter chain output and <SHIFT>+. will display the filter chain input for all unlocked table views.
- Algorithm crashes fixed:
- Prevent crash when the reference length does not match the segment length and annotating by transcript.
- Prevent Compound Het algorithm from crashing when no probands are present in data.
- Fixed issue with identifying Compound Hets when a float array frequency field was selected.
- Allow GenomeBrowse to plot data from a VCF file that contains string arrays instead of giving the error that mapFields cannot be converted from StringArray to String.
- Ensure that the location needed by VarSeq for the properties file exists, if it doesn’t then the location is created on launch of the application.
- Fixed issue with Data Source Library preventing downloading from the public repository a previously downloaded file that is no longer present in the user annotations folder.
- Import Wizard polishes:
- Make it more obvious when an invalid segment is entered into the genomic regions input box and prevent importing until a valid segment is entered or the option is cleared.
- Report any genomic region subset specified in the Import Wizard in the node change log.
- Provided more informative errors when unable to sort VCF data on import.
- Allow entity information to be imported from a text file regardless of the number of samples.
- Allow missing affection status when importing from an entity text file.
- Trim white space from the sample field information after splitting on the specified delimiter.
- Automatically set sample fields named either “Phenotype” or “HPO” (case-insensitive) to the optional Phenotype field that can be used for automatically filling in the PhoRank phenotype fields.
- Remove quotes from string fields.
- Handle string list fields when including the field in the sample information.
- Prevent upgrading INFO level filter field to a sample matrix field if one already exists in the VCF file.
- Added less strict matching for pedigree fields when importing family information from a text file.
- Removed ‘END’ field from VCFs that are importing through the merge transform.
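The sample information handling listed above (splitting on a delimiter, trimming white space, removing quotes, and recognizing a “Phenotype” or “HPO” column case-insensitively) can be sketched in Python as follows. This is an illustrative approximation, not the Import Wizard's parser: the function names are hypothetical, and the naive split does not handle delimiters that appear inside quoted values.

```python
def parse_sample_row(line, delimiter="\t"):
    """Split one row, trim white space, and strip surrounding quotes."""
    values = []
    for raw in line.rstrip("\n").split(delimiter):
        value = raw.strip()
        # Remove one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values.append(value)
    return values

def find_phenotype_column(header):
    """Index of the first column named 'Phenotype' or 'HPO' (any case), else None."""
    for i, name in enumerate(header):
        if name.lower() in ("phenotype", "hpo"):
            return i
    return None

header = parse_sample_row('Sample, "Affected" , hpo', delimiter=",")
row = parse_sample_row(' NA12878 ,true,  "HP:0001250" ', delimiter=",")
print(header, row, find_phenotype_column(header))
```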
- After import, if no data could be imported, the error message is clarified to indicate it could be due to filters placed on the data during import.
- Allow for help and shell commands to be parsed from the command line without needing to prefix with -c
- Remove an empty set of temporary folders from the VSPipeline working directory on Linux and RHEL after exporting to XLSX.
- Allow shell commands to be quoted or unquoted when entered from the command line
- Command result output now supports displaying duplicate key values
- Import command now accepts parameters to subset by FILTER field or genomic region
- VSPipeline returns exit code 1 if any command fails during execution
- Error messages have been made more specific during license verification
- Warn user if sources need to be downloaded for template to complete, but still allow data to be exported
- Output annotation progress
- Use current working directory to resolve relative paths passed into commands
- Add argument to import command to subset using a gene source
- The parameter affected has been changed to affected_only to allow user to iterate over only affected samples or all samples
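Because VSPipeline returns exit code 1 when any command fails, a calling script can stop as soon as a step fails. The Python sketch below shows one way to do that with `subprocess`. The actual `vspipeline` arguments are deliberately omitted because they are not specified here, so the demonstration uses a stand-in process that exits with code 1.

```python
import subprocess
import sys

def run_step(argv):
    """Run a command and raise if it exits with a non-zero code."""
    result = subprocess.run(argv)
    if result.returncode != 0:
        raise RuntimeError(f"{argv[0]} failed with exit code {result.returncode}")
    return result.returncode

# In a real pipeline argv would be the VSPipeline invocation, e.g.
# ["vspipeline", ...] (arguments omitted; consult the VSPipeline docs).
# Here a stand-in process that exits with code 1 simulates a failed command.
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
try:
    run_step(failing)
    failed = False
except RuntimeError as err:
    failed = True
    print(err)
```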
- Convert Wizard polishes:
- Added support for converting BED files with only chromosome, start position, and stop position.
- Added support for vcf.gz (gzip) files to be run through the Convert Wizard.
- Improved accuracy of progress tracking and time estimation for long running tasks.
- Now only showing the appropriate labels for the available samples in the filter card configuration options.
- Registering for VarSeq now has “Keep me informed!” unchecked by default, and the tool tip has been updated to state: “Stay updated with exclusive e-books, timely invitations to webcasts and events and other communications from Golden Helix.”
- Recent projects that no longer exist are now shaded a lighter gray on the welcome screen and are italicized (except on MacOS) in the File > Recent Projects menu. The missing projects can now be cleaned up or all projects can be removed from the list with new menu options.
- On Linux and Windows, tabbing from the keyboard will now indicate focus in the Data Source Library on panel buttons including Locations, Plot Data and Information.
- In the Data Source Library, using the keyboard arrows to select a source will now update the information about that source in the Information Pane.
- Excel export now provides truncated text instead of an error message when a cell contains more than 32,767 characters.
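The 32,767 character figure is Excel's limit on the text a single cell can hold. A minimal Python sketch of truncating to that limit follows. The helper name is hypothetical, and whether VarSeq appends any marker to truncated text is not stated above, so this version simply cuts the text.

```python
EXCEL_CELL_LIMIT = 32767  # maximum number of characters in one Excel cell

def truncate_for_excel(text, limit=EXCEL_CELL_LIMIT):
    """Return text unchanged if it fits in a cell, otherwise cut it at limit."""
    return text if len(text) <= limit else text[:limit]

long_value = "A" * 40000  # placeholder for an oversized annotation value
print(len(truncate_for_excel(long_value)))
```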
- Keep VarSeq maximized when adding another view to the window.
- Updated hyperlink in Identifier field to default to the GRCh37 archive for COSMIC identifiers.
- Modified VCF export options so the default selection is a strict subset of the imported data, not dropping any fields or including any annotation sources.
- Log View Polishes:
- Add template series name, version and author to project log if these fields are not empty.
- If an Alt Read Ratio cannot be computed this information is now reported in the Log View.
- Clear completed downloads from the download manager after closing VarSeq.
- Ensure that correct messages are provided that differentiate between sources that need downloading or algorithms that need input. Also, if there are sources that need to be downloaded this message is presented right after opening a project instead of after import or other preceding algorithms are finished.
- New Add-On Features:
- Integration with MedGenome’s Oncology Mutation Database (OncoMD) to streamline annotating and filtering variants against cancer-specific genetic alterations. See MedGenome OncoMD for further information.
- VarSeq can now be run from the command line with vspipeline to automate and make workflows accessible in clinical and research pipelines. See VarSeq Pipeline Runner for more information.
- Added the option to filter variants on import by the FILTER field, using the FILTER values specified in the headers of all VCF files selected for import. If no filters are chosen, all variants will be imported.
- Added support for importing gVCF files, see gVCF Conventions for specifications on this format.
- Updated the Compound Heterozygous Variants and Regions algorithm to include advanced parameter options. The tool now allows de novo mutations to be used when classifying Compound Het regions and also allows for one missing parent in the trio.
- New algorithms (Add > Compute Additional Data...):
- Frequency Aware Zygosity computes the variant zygosity with a correction based on an alternate allele frequency from a population catalog. See Frequency Aware Zygosity for more information.
- Mendel Error computes the Mendel Error status for a child’s genotypes given at least one parent. See Mendel Error for more information.
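To illustrate the general idea behind a Mendel error check, the Python sketch below tests whether a child's genotype could have been inherited from the parents that are available. It is a textbook-style approximation for autosomal diploid sites, not the VarSeq algorithm: the function name, the genotype representation and the True/False result are assumptions, and the actual Mendel Error status values are described in the Mendel Error documentation.

```python
def is_mendel_error(child, mother=None, father=None):
    """Return True if the child's genotype cannot be inherited from the given parents.

    Genotypes are 2-tuples of alleles, e.g. ("A", "G"); a missing parent is None.
    Autosomal diploid sites only; at least one parent is required.
    """
    if mother is None and father is None:
        raise ValueError("at least one parent genotype is required")
    c = sorted(child)
    if mother is not None and father is not None:
        # One allele must come from each parent.
        return not any(sorted((m, f)) == c for m in mother for f in father)
    parent = mother if mother is not None else father
    # With one parent known, the child must carry at least one of its alleles.
    return not any(a in parent for a in child)

print(is_mendel_error(("A", "G"), mother=("A", "A"), father=("G", "G")))
print(is_mendel_error(("G", "G"), mother=("A", "A"), father=("G", "G")))
print(is_mendel_error(("G", "G"), mother=("A", "A")))
```

With only one parent, fewer inconsistencies are detectable, which is why the single-parent branch applies the weaker shared-allele test.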
- Added the ability to inspect and set the view and data user IDs.
- In the Import Wizard, prevent selection of invalid files when selecting a text file that specifies sample information.
- Fixed an issue that allowed the last variant of a preceding chromosome to be included in a chromosome segment filter on the Chr:Pos column when reimporting data into an existing project.
- Ensure that the genome field is set in all project files when created from an empty template.
- Disabled Histogram plots for fields with all missing numeric data.
- Fixed issue for Mac OSX when recovering projects saved on mounted drives.
- Fixed issue preventing importing data into a project saved on a mounted/network drive on Linux and CentOS.
- Ensure that changes which modify a project can be saved, including:
- Showing or hiding table columns
- Changing the current sample
- Changing the current tab
- When exporting data as a VCF file, FILTER categories are now properly converted back to an INFO field with multiple FILTER values separated by a semi-colon (;).
- Expression Editor fixes:
- Convert computed binary arrays to enumerated arrays.
- Prevent crashes caused by MuParserX from trying to convert floats or doubles to integers.
- On RHEL, no longer create a QtProject.conf file, which could erroneously result in a file lock.
- Updated default RefSeq gene annotation source based on GRCh37_g1k build to include Locus Reference Genome (LRG) identifiers, updated source name is RefSeq Genes 105v2, NCBI.
- Updated Annotate Transcript by including the use of canonical transcripts to create clinically relevant annotations.
- Update Count Alleles algorithm:
- Added # Alleles to the Count Alleles algorithm. This contains the total number of observed alleles in the called genotypes (plus two for each missing genotype). This number is the denominator used to calculate Allele Frequencies.
- Added missing category counts for the grouping variable if present in samples.
- Made outliers visible by default for Histogram plots and adjusted outliers button to be consistent with other hyperlinked text.
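The # Alleles definition above (observed alleles in called genotypes, plus two for each missing genotype) and its use as the allele frequency denominator can be sketched in Python. This is an illustration under simple assumptions, not the Count Alleles implementation: genotypes are diploid tuples of allele strings, `None` marks a missing genotype, and the function name is hypothetical.

```python
def count_alleles(genotypes, alt="1"):
    """Return (alt_count, n_alleles, alt_frequency) for diploid genotypes.

    Each genotype is a 2-tuple of allele strings, or None when missing.
    Following the note above, a missing genotype adds two to n_alleles.
    """
    alt_count = 0
    n_alleles = 0
    for gt in genotypes:
        if gt is None:
            n_alleles += 2
            continue
        n_alleles += len(gt)
        alt_count += sum(1 for allele in gt if allele == alt)
    freq = alt_count / n_alleles if n_alleles else None
    return alt_count, n_alleles, freq

# Three called genotypes and one missing genotype.
genotypes = [("0", "1"), ("1", "1"), ("0", "0"), None]
print(count_alleles(genotypes))
```

Here three alternate alleles over eight total alleles give a frequency of 0.375; the missing sample still contributes two alleles to the denominator.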
- Import Wizard polishes:
- Only disable the option to append records after at least two sources have been added that do not have the same samples in both files.
- Allow for more values to be detected as Affected/Unaffected when specifying
sample information from a text file. These values include:
- Affected: true, t, 1 (case insensitive)
- Unaffected: false, f, 0 (case insensitive)
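The value matching listed above can be sketched as a small Python helper. It covers only the values named in this note. Any other values the Import Wizard accepts are not listed here, and returning `None` for unrecognized or empty input is an assumption of this sketch.

```python
_AFFECTED = {"true", "t", "1"}
_UNAFFECTED = {"false", "f", "0"}

def parse_affection(value):
    """Map a text value to True (affected), False (unaffected), or None."""
    token = value.strip().lower()  # matching is case insensitive
    if token in _AFFECTED:
        return True
    if token in _UNAFFECTED:
        return False
    return None  # unrecognized or missing

print([parse_affection(v) for v in ["TRUE", "t", "1", "False", "F", "0", ""]])
```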
- Allow the specifying of a field in a sample information text file that contains renamed sample names. This allows the mapping of sample names to new names and fills in the Renamed Sample field in the import wizard.
- Providing more informative errors when there are write permission issues to the folder where the data is stored.
- Allow for genotype import from VCF files with no explicit GT format field defined in the header of the file.
- When site and sample level fields have the same name, data is no longer merged between the two when site fields are moved to sample level fields on import. Instead, separate fields are created to maintain unique data entries.
- Updated the Database View to immediately use the current variant.
- Export polishes:
- For VCF files only:
- The matrix filter field now uses the identifier “FT” as per the specification
- The messages from the VCF write are now shown at completion
- On completion of the export, the dialog now has the option to open the folder where the file was written.
- The final step of the write dialog is now smaller and fits the content.
- Add information into the license activation dialog (Help > Activate a VarSeq License Key) that lists any add-ons included with the license.
- Update Group By algorithm to maintain current visible columns from the variant table in the new secondary table view.
- Expression Editor polishes:
- Removed matrix functions that have no function in VarSeq
- Renamed strlen, strsplit, joinlist functions to the more common len, split and join functions.
- Polished function documentation to be more useful and have more consistent descriptions.
- Added string concatenation button
- Provided functions that operate on array types, for example sum(array) will provide a sum of the list of numeric values in the array.
- New Import Wizard features:
- Now have the option to modify the field types from VCF files from site to sample and handle conflicts between site fields from multiple files.
- Sample information can now be imported from a text file. Text files can include pedigree formatted data or a simple two column list of sample names and affection status. See Import Sample Information from Text File for more information.
- New variant database features:
- Added option to export an existing database schema to reapply to a new schema in the Manage Database dialog.
- Added Project and Sample fields to automatically fill in the project and/or current sample when the variant was cataloged.
- Added option to manually enter a variant without having to have that variant selected in a table.
- Added outlier detection for Histograms which allows the user to filter outliers from the plot improving visualization of the data.
- Count Alleles algorithm now accepts new sample level fields for grouping counts.
- Export > Export to Text now includes the option to export column group names for the data. Each column group title is repeated so there is a corresponding value for each exported column.
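The repeated column group titles described above might look like the following sketch. It assumes the group titles are written as a separate header row above the field names and that the file is tab-delimited; neither detail is stated in these notes, and the group and field names in the usage example are made up.

```python
import csv
import io
from typing import List, Sequence, Tuple


def export_with_group_row(columns: List[Tuple[str, str]],
                          rows: List[Sequence[object]]) -> str:
    """Write a tab-delimited table with a column group row above the field row.

    columns: (group title, field name) pairs, one per exported column.
    The group title is repeated so every column has a corresponding value.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([group for group, _ in columns])   # repeated group titles
    writer.writerow([field for _, field in columns])   # field names
    writer.writerows(rows)                             # data rows
    return buffer.getvalue()
```

With columns `[("Variant Info", "Chr"), ("Variant Info", "Pos"), ("ClinVar", "Significance")]`, the first output line repeats "Variant Info" twice so that each exported column has a group value.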
By default, the VCF Filter field is imported per-sample when merging VCF files. Because the previous behavior was to keep unique values across all samples, new projects may have more precise and stringent filtering based on this field.
- Fixed user credentials not being set after modifying a database, or accessing the database schema dialog.
- Provide an algorithm out of date error for the CompoundHet filter card when disabling or enabling prerequisite filter cards.
- No longer crashes when importing all unaffected samples into a template that requires at least one affected sample, such as a template with CompoundHet.
- Fixed index assignment operator for arrays so Expression Editor can use specific entries to calculate new fields.
- When exporting data using the Export to Text option the default extension is now set as *.txt if nothing is entered by the user.
- Templates should now work if a field used to set up the template is present in the data but of a different type (i.e. sample vs site field).
- Fixed the Annotate Transcript algorithm:
- Exonic frameshifts are now classified as such even if they result in a downstream stop gain, unless the variant changes the affected codon(s) themselves to a stop gain.
- A mutation which is 1 base exonic of a splice site is not called a splice_acceptor or donor.
- Insertions bordering splice sites are annotated as either frameshift or splice_region.
- Changed SSL (libssl.so.10 and libcrypto.so.10) bundled with RHEL build to prevent SSL Handshake errors when logging in.
- Fixed Histogram reports to have intuitive buckets and clearer notation to indicate which values are being captured by the bucket for integer fields.
- Installing VarSeq will now clean up the install directory and remove files and libraries that are no longer used by the program.
- Removed tab corner decorations e.g. (+) button unless there are tabs in the docking target area.
- Ensure that a note is only created after it is opened and that changes are preserved when a tab is closed even if the project is not saved. Changes that are not saved are still discarded if the project is closed without saving changes.
- Updates to Database views:
- Modified the view to have vertical scroll bars for the top portion (fields in the database) and the bottom portion (recent assessments). This allows for resizing the variant database view.
- Modified the view to line wrap the tool tips for fields to make them more readable.
- Updates to the Data Source Library views:
- When Data Source Library is launched from the Welcome Screen the available tracks are filtered down to show only those from the default human assembly. The assembly filter can be changed by selecting the new source from the drop down menu.
- Rearranged annotation source columns in Data Source Library so when no assembly is selected the species and build information appears directly after the source name.
- In the Import Wizard, the option to choose an assembly is now available in the Advanced Options for the final import algorithm selection page. The assembly file must match that used for alignment and the reference sequence has to be a local source in order to be used for the Left Align algorithm.
- Now prompt user to download a local copy of the data source when the public copy is selected for annotation from the Data Source Library.
- Updated the default field list for the dbNSFP annotation algorithm. The first column group is reserved for the prediction scores that contribute to the voting score only; all fields are available in the second column group.
- Count Alleles output headers have been abbreviated to # Het and # HomoVar to reduce the width of the columns.
- Adjusted spacing for the Excel XLSX export dialog window to remove extra spacing.
- Add PhoRank algorithm to support phenotype gene ranking. See Sample PhoRank Gene Ranking for more information.
- Added Variant Databases as a default location in the Data Source Library under Local Annotations if it exists.
- Added the ability to connect to local databases through the Database View to make user notes and editable annotations against variants. See databaseView for more information.
- Added a project note editor to take and store rich text notes in a project. See Note View for more information.
- Added a log view to provide a detailed description of the actions taken in a project. See Log View for more information.
- Added an option to import data as “Cancer Samples” with default parameters optimized for this data.
- Added an option to filter cards to support Segment filtering. See Filter View for full details.
- Replaced Excel XLS export with Excel XLSX to accommodate more variants and columns.
- GenomeBrowse Features:
- Save As Image now includes the option for saving in SVG format.
- Added support for managing Genome Assembly views; genome segment order and visibility can now be changed. See Managing Genome Assembly Views for further details.
- Added an advanced option to the Convert Wizard that allows MNP variants to be split into allelic primitives. See Allelic Primitives for specifics on this option.
- Added a full screen toggle switch (<F11>) to handle sharing of projects across platforms where one or more windows were made full screen on Mac. This switch now allows the full screen windows to be adjusted on Windows, Linux and RHEL.
- Allow arrays of missing strings to be classified as missing for filter chains.
- Fixed the variant source name when created by merging files. The number of additional files was off by one.
- Removed common prefix after matching variants to alternate alleles or individual genotypes.
- Fixed some edge case issues that occurred when selecting the Allelic Primitives advanced option for import of variant data.
- Fixed alternate allele frequency calculations for multi-allelic sites so the matching alternate allele depth is being used.
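As a rough illustration of matching each alternate allele to its own depth, the sketch below computes one frequency per alternate allele from a VCF-style AD list (reference depth first, then one depth per alternate allele). The function name is hypothetical, and dividing by the sum of all allele depths is an assumption; these notes do not state the formula VarSeq uses.

```python
from typing import List, Optional


def alt_allele_frequencies(allele_depths: List[int]) -> List[Optional[float]]:
    """Return one frequency per alternate allele.

    allele_depths follows the VCF AD convention: [ref, alt1, alt2, ...].
    Assumption: the denominator is the total depth across all listed alleles.
    """
    total = sum(allele_depths)
    if total == 0:
        # No reads at the site: the frequency is undefined for every alternate.
        return [None] * (len(allele_depths) - 1)
    # Entry i + 1 of AD belongs to alternate allele i, so each alternate
    # is paired with its own depth rather than the first alternate's depth.
    return [depth / total for depth in allele_depths[1:]]
```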
- Fixed the <Tab> control in Rich Text Edit dialogs including Save Project as Template. <Tab> and modifiers now work as expected.
- Updated help URLs on the Import Wizard to link to the import section of the manual.
- Fixed URLs in algorithm selection documentation to correctly link to manual and external web browsers when appropriate.
- For variants with non-standard Ref/Alt pairs, the Annotate Transcript tool will annotate based only on position, and the Sequence Ontology and Effect(Combined) entries will be listed as Invalid since they cannot be produced for these variants.
- Fixed exporting to VCF format when no genotype field is included.
- Fixed the --project command line argument and double-clicking on VarSeq projects so that both open the project in VarSeq.
- GenomeBrowse bugs fixed:
- Plots now correctly maintain y-axis zoom levels when selecting Save As Image or when adjusting the x-axis position.
- Data Source Library bugs fixed:
- Fixed right-click Open Folder on location sub-folders in the location pane of the Data Source Library.
- Fixed sorting of fields to be in numeric order for certain fields including the size of the sources.
- Ensure that the download window opens on top of viewers when a download of an annotation source is initiated from a Data Source Library dialog.
- Made folder path in the New Project dialog a hyperlink to open a system file explorer at that location.
- Updated Import Options:
- Added an option for the multi-allelic split algorithm to treat family, individual, and cancer samples differently. See Importing Data for the description of the options.
- The multi-allelic split option no longer strips common prefixes when reference and alternate alleles are the same. It also keeps reference/alternate pairs where the alternate is not present in any sample genotypes, combining them in their own record.
- Samples with the same name will now be merged into separate sample records by default.
- Polished information about advanced options and added direct links to the manual for more information about advanced import algorithm options.
- Have left align use the project genome assembly’s reference sequence by default. In the advanced options for the summary page an option is now available to specify the assembly to use for left aligning the data.
- Made export dialogs consistently sized and added more informative message about available options.
- Added the current sample name (when available) to the Table name and the current filter button, i.e. Filter Chain: NA19240.
- Prevented the upper right corner stash button from blocking the add ‘+’ tab button.
- Changed column menu item to be “Search this Column” instead of “Add Table Filter for this Column” to clarify usage of the tool. See Table View for full details. Additionally removed table filter search options from primary table of split table view as they are not applicable to these column types.
- Updated ! operator type in Expression Editor to Unary. See Expression Editor for a description of this operator.
- Added informative error messages when the source types selected are not appropriate for annotation.
- Added informative error messages when the source requires additional computation such as a genomic index before the source can be used for annotation. The error message now provides the name of the source(s) that could not be used for annotation.
- Added Delete Algorithm button to algorithm error messages to make it easier to delete and clean up column groups with errors.
- Replaced Evernote connection status messages with more informative messages to indicate when Evernote needs to be connected or when a note is loaded.
- Added Select All and De-select All keyboard support (<Ctrl> + A and <Ctrl> + <Shift> + A respectively) to all web views.
- Added search pop-up and keyboard search support (<Ctrl> + F) for all web views.
- Updated Count Alleles algorithm:
- Added Hemizygous calls to the Homozygous Variant counts output as this is how they are treated for annotation and filtering purposes.
- Made the Allele Counts output match alternate alleles, so one count is now provided per alternate allele.
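The two Count Alleles changes above can be sketched as follows. This is a simplified illustration and not the VarSeq algorithm: the function name is hypothetical, genotypes are assumed to be VCF-style strings such as `0/1` or `1`, and skipping genotypes that contain a missing allele is an assumption.

```python
import re
from collections import Counter
from typing import Iterable


def zygosity_counts(genotypes: Iterable[str], alt_index: int = 1) -> Counter:
    """Count heterozygous and homozygous variant calls for one alternate allele.

    Hemizygous calls (a single allele equal to alt_index) are added to the
    homozygous variant count. Call once per alternate allele index to get
    one set of counts per alternate.
    """
    counts: Counter = Counter()
    for genotype in genotypes:
        alleles = re.split(r"[/|]", genotype)
        if "." in alleles:
            continue  # assumption: missing or half-called genotypes are skipped
        matching = sum(1 for allele in alleles if int(allele) == alt_index)
        if matching == 0:
            continue  # this sample does not carry the alternate allele
        if matching == len(alleles):
            counts["# HomoVar"] += 1  # homozygous, or hemizygous when haploid
        else:
            counts["# Het"] += 1
    return counts
```

Calling the function once for each alternate allele index gives the per-alternate counts described in the second item.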
- Adjusted layout of Add Location dialog in the Data Source Library to prevent the dialog from being too small for its contents.
- On Windows and Linux, to prevent a tab from docking, press <Ctrl> before ripping the tab out of a window.
- Changed column grouping names for Annotate Transcript functions to include name of source used to easily differentiate annotations from separate sources.
- Add downloads to the top of the download list in the Downloads dialog.
- Better handling of font scaling when projects are saved across platforms.
- Updates to shipped templates:
- Exome Trio Template now includes filter chains for X-Linked and Known Rare Pathogenic workflows.
- Updated gene panel templates to use the most recent versions of COSMIC and ClinVar annotations available.
- Reevaluated container logic when switching logic (AND to OR or vice versa) from filter container right-click menu.
- Fixed a bug that classified missing genotypes as hemizygous when merging VCF files and running the Genotype Zygosity algorithm.
- In the Expression Editor and Compute Fields algorithm, the “Chr” field is now correctly handled as a string and the “Start” and “Stop” fields are correctly handled as integers. This will now allow you to add fields to your table such as “Chromosome”, “Start”, “Stop”, “Stop - Start <= 2”, etc.
- Fixed the progress updates for exporting tables to text files.
- On import, handle splitting alleles when there are multiple affected samples and some variants are half-called (i.e. 1/. or 2|.).
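For readers unfamiliar with half-called genotypes, the sketch below shows one way to recognize them in VCF-style genotype strings. The helper names are hypothetical and the code only illustrates the notation (`1/.`, `2|.`); it does not reproduce the import logic.

```python
import re
from typing import List, Optional, Tuple


def parse_genotype(genotype: str) -> Tuple[List[Optional[int]], bool]:
    """Split a genotype string into allele indexes and a phased flag.

    Missing alleles (".") are returned as None.
    """
    phased = "|" in genotype
    alleles = [None if allele == "." else int(allele)
               for allele in re.split(r"[/|]", genotype)]
    return alleles, phased


def is_half_called(genotype: str) -> bool:
    """True when some, but not all, alleles of the genotype are missing."""
    alleles, _ = parse_genotype(genotype)
    missing = [allele is None for allele in alleles]
    return any(missing) and not all(missing)
```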
- Fixed a bug in the Matched Gene List algorithm when running the algorithm from a gene table which placed the project in an invalid state.
- Fixed “p.” notation for transcript annotation in certain cases involving invalid transcripts.
- Fixed error that occurred when clicking on intermediate filter output immediately after annotating variants with an annotation source.
- Ensured that the Data Source Library is refreshed after downloading sources. In certain cases when the annotation path in the properties file used had inconsistent path delimiters the refresh failed.
- Moved variant count and notification widget to the tool bar when the tool bar is above the table (default location) to prevent the widget from blocking the right most column group header(s).
- Added a transcript status column to the Variant + Transcript Interactions column group for transcript annotation. This column indicates if the variant is in an invalid or complete transcript or in an intergenic region.
- Updated the Exome Trio Template to show the details pane by default.
- VCF import now includes the option to merge (or not) files that have the same sample names.
- Added additional keyboard support, see the Reference section of the Getting Started Guides.
- Now multiple annotation sources can be selected during Annotate Data Source....
- Added the Match Gene List annotation algorithm to filter variants based on presence in a specified list of genes. See: Match Gene List for more information.
- Added support to copy fields from the Table View.
- Categorical tables are now created for string column reports when there are fewer than 40 unique strings in the column.
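The threshold in this item can be pictured with a short sketch. Only the limit of 40 unique strings comes from the note above; the function name, the return shape, and the ordering by descending count are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple

MAX_UNIQUE_STRINGS = 40  # "fewer than 40 unique strings" per the note above


def categorical_table(values: Iterable[str]) -> Optional[List[Tuple[str, int]]]:
    """Return (value, count) rows, or None when the column has too many categories."""
    counts = Counter(values)
    if len(counts) >= MAX_UNIQUE_STRINGS:
        return None  # too many distinct values for a categorical table
    return counts.most_common()  # rows ordered by descending count
```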
- Histograms and field reports can now be created directly from a numeric filter card by clicking on the histogram icon.
- Fixed duplication and incorrect ordering of chromosomes on import of some VCF data files into VarSeq.
- Fixed table row highlighting to have better contrast on Linux.
- Adding uncompressed files to the import wizard now correctly uses template family settings.
- Now VCF import properly handles the case of merging files when one or more files have an AF (Allele Frequency field) and one or more do not have this field.
- Fixed left shifting of deletions on import of VCF files to VarSeq.
- Fixed left shifting of InDels by one base pair when exporting data to VCF files.
- Fixed crash on import of data with allelic primitives transformation when:
- The data is from Ion Torrent
- Variants are missing alternates, or
- Have format fields with an incorrect type.
- Other allelic primitives bugs fixed:
- Remove common prefixes from the Ref/Alt pairs.
- Prevent left shifting on insertions and deletions.
- Reduce Number = “A” fields to values matching the new Alternates list.
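Removing a common prefix from a Ref/Alt pair, as mentioned in the first item of this list, can be sketched as below. The function name is hypothetical, and keeping at least one base in each allele is an assumption that follows the usual VCF convention; these notes do not describe the exact rule VarSeq applies.

```python
from typing import Tuple


def strip_common_prefix(position: int, ref: str, alt: str) -> Tuple[int, str, str]:
    """Trim leading bases shared by ref and alt, shifting the position to match.

    Assumption: at least one base is kept in each allele, so an insertion
    such as C -> CA keeps its anchor base.
    """
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        position += 1  # the record now starts one base further right
    return position, ref, alt
```

For example, `strip_common_prefix(100, "GCAT", "GCTT")` drops the shared leading `GC` and moves the start to position 102.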
- Fixed crash on left-align for:
- Multi-allelic sites, or
- Sites with an invalid Ref/Alt value, or
- When there is a repeat sequence region.
- Other left-align bugs fixed:
- Fixed the overly aggressive stripping of bases on multi-allelic sites.
- Prevent duplication of records on import of data.
- Prevent right-shifting of variants.
- Ensure transformation is insensitive to field reordering.
- Fixed genotype representation after performing left align and allelic primitives transformations.
- Made COSMIC mutation ID hyperlinks in the details view individually clickable.
- Prevent Genotype Zygosity algorithm from adding empty zygosity columns to Compound Heterozygous Variant column groups.
- Compound Heterozygous Variants and Regions can now be computed without any other filter cards in the filter chain.
- Disabling a card tracked by a table now updates the table to pass-through the filter value.
- Fixed crash when deleting a column group in a secondary table while an annotation algorithm is running.
- GenomeBrowse bugs fixed:
- Clicking on a folder URL now launches a file explorer window at that location instead of trying to launch and failing.
- In the Convert Source Wizard, fixed the bug that made the last field invisible in the field edit list when a computed field was added.
- Updated algorithms to handle hemizygous variant calls.
- Added documentation of how to manually add AppData links for server installs. See Linux Configuration in a Shared Environment.
- Export to Excel now includes an option to not export hyperlinks.
- Removed limits on Ref/Alt when exporting to a VCF file.
- When exporting data to a VCF file, the field cardinality of fields with type “A” is read and written to the file so that the fields remain type “A”.
- Moved the Lock/Unlock action for tables into the filter button in the Table View.
- Make string table filters and string filter cards case insensitive.
- Give options for table filtering of Boolean fields (True/False) “==True” and “==False” instead of using numeric filtering options.
- Allow reordering of source group sub-fields for collated sample fields.
- Close the source selection dialog when annotation sources are selected from Annotate with Data Source... and Download is clicked.
- Present a more informative error when running the Group by algorithm on a field when there are more than the maximum number of variants allowed in a particular group.
- Updates to the Compound Het algorithm:
- The selected fields for the gene and allele frequency field parameters are now more prominent. The Cancel button on the allele frequency field selection page was renamed to Skip, since clicking it does not cancel the algorithm but skips the input of the optional parameter.
- After computing compound heterozygous variants and regions on the variants in the table, a filter card is added automatically to the Filter Chain.
- The compound het algorithm can now have different inputs for each affected sample.
- Made all URLs work as expected.
- Direct COSMIC ID URLs to COSMIC and RS ID URLs to dbSNP.
- Compute Alternate Allele Frequency on FreeBayes output using this formula: .
- Add field documentation to the field Hide/Show column and column group dialog.
- Apply the allelic primitives and left align transformation options used when creating a template on projects created from templates.
- Add a spin cursor to the filter chain button in the Table View when evaluating a filter chain.
- Changed default visible outputs of Annotate Transcript algorithm to include Tx Name, CDot, and PDot fields.
- When creating a project from a template, hide columns from the data not seen in the table in the template.
- Created a tooltip and status tip for filter cards to indicate source and field information at a glance.
- Created a report for filter cards which displays in the Detail View when a filter card is clicked on.
- Allele Counts can now be split into affection status categories for the allele counts algorithm. This will allow you to filter out variants present in unaffected samples from the affected samples’ variants.
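A minimal sketch of splitting allele counts by affection status is shown below. The function name, the input dictionaries, and the category labels are hypothetical; the code only illustrates why the split makes it possible to drop variants that appear in unaffected samples.

```python
import re
from typing import Dict


def allele_counts_by_status(genotypes: Dict[str, str],
                            affected: Dict[str, bool],
                            alt_index: int = 1) -> Dict[str, int]:
    """Count copies of one alternate allele separately for each status group.

    genotypes maps sample name to a VCF-style genotype string;
    affected maps sample name to its affection status.
    """
    counts = {"Affected": 0, "Unaffected": 0}
    target = str(alt_index)
    for sample, genotype in genotypes.items():
        group = "Affected" if affected[sample] else "Unaffected"
        # Each allele in the genotype that equals the alternate adds one copy.
        counts[group] += sum(1 for allele in re.split(r"[/|]", genotype)
                             if allele == target)
    return counts
```

A variant could then be kept only when the "Unaffected" count is zero and the "Affected" count is greater than zero.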
- Cloning a table now locks the table to the set of variants in the table when cloned. Clicking on variants in the filter chain will not change the variants seen in the cloned table. This preserves the list of variants.
- Added ability to duplicate filter cards and containers. This is in the filter menu.
- Fixed the high memory usage when running the Compound Heterozygous Mutations algorithm.
- Fixed the delete column group messages to be consistent and use the actual column group name to prevent confusion as to what will be deleted.
- Fixed file gate to not allow compressed BED files in the import wizard.
- Empty template was simplified to a single table view.
- Moved variant count and status menu into a floating widget in the corner of the table.
- Viewer Mode no longer allows you to add filters or delete sources.
- Inactivating a table filter now updates the table tab title to read “Unfiltered” to indicate no filter is applied.
- Open the filter configuration dialog automatically when adding a filter directly from the filter chain.
- Changed the Impact column in the Annotate Transcript output to be Effect and changed the categories to LoF (Loss of Function), Missense, and Other.
- Changed the default for Save Project as Template... to be build specific.
- Templates were updated to reflect the latest changes in the software.
- Now supports dragging and dropping of filter cards. This feature allows for easily adjusting your chain of existing filters both vertically and horizontally across the filter workspace.
- Added support for filter cards based on sample fields for specific unrelated samples.
- Added the ability to delete algorithm results from the right click menu of the column groups in the table view.
- VCF Import: Fixed merging of files with non-overlapping chromosomes and the same samples included in each file.
- Fixed crash due to uninitialized variable in Variant Classification.
- Fixed ability to rename samples on import.
- Fixed crash when registering VarSeq from an empty VarSeq Viewer project.
- Fixed crash when selecting File > Save Project As...
- Fixed table glitch that caused columns to appear misaligned when moving to new table view.
- Fixed the Count Alleles algorithm to count missing genotypes from merged VCF files as diploid.
- Updated the VCF Merge algorithm to encode missing genotypes for “holes” in the merged data as “?_?” instead of “?” for consistency.
- Updated the icons used in toolbars to be more consistently styled and representative of their actions.
- Added animated spinner to table filter to indicate process is still running.
- Updated histogram layout for real and integer columns so labeling is visible for each bin and added artificial width to single value bins for better visualization.
- More informative error message when user is unable to connect to the Public Annotation repository.
- Increased network timeouts when loading our public data repository.
- Now supports annotating against the dbNSFP reduced predictions only source.
- VCF Import: Fixed merging files containing data from non-overlapping chromosomes.
- Fixed update download link to point to the correct download page.
- Variant Classification: Handle the following edge cases:
- Mixed Indels overlapping a transcript start no longer cause a crash.
- Transcript deletion is now handled and called a Transcript Loss or ablation.
- Fixed annotating against network sources. This is now allowed, with a warning that it could take a long time.
- Fixed out of date filter resulting from re-running Compound Heterozygous algorithm with a new filter set.
- Fixed a bug resulting in missing features on import using multi-allelic split when importing more than one affected sample. This was only seen when importing exactly one variant.
- Changed the Annotate Transcript field “% Tx Dist” to be a percent instead of a fraction.
- Allow scientific notation in table filters for numeric fields/columns.
- Tables now add sources when an algorithm starts computing and errors are now shown in the status button.
- Changed which column groups and fields are shown by default in the table for various annotation algorithms.
- Tweaked error messages, warnings and status to be more informative.
- Added documentation for Annotate Transcript column groups and fields.
- Updated table filter styling and interface.
- Polished default table toolbar layout, moved to two rows for toolbar in empty projects.
Initial Public Release
- VarSeq has been released! The entire manual describes all of the features contained within VarSeq.
- In the future, new features, polishes and bugs fixed will be described in the release notes section.
- To report any feature requests, polish items, or bugs please email firstname.lastname@example.org.