VSWarehouse is a scalable genetic data warehouse for VarSeq.
VarSeq can connect to any accessible VSWarehouse server and, once connected, can interact with the Projects, Annotations and Reports hosted on that server.
With your own VSWarehouse instance, you can:
Store variants for all samples in one centralized, queryable and scalable database
Organize samples into projects, annotate allele frequencies, host clinical reports with VSReports, and centralize captured knowledge of variants into assessment catalogs
Use the rich web interface or API to query the archive of samples for retrospective analysis, customized alerts and integration with other lab processes
3.13.1. Connecting to a VSWarehouse Server
VSWarehouse runs on a single, often locally managed server that uses the standard web protocols HTTP or HTTPS to communicate with the VarSeq client.
Open the VSWarehouse Manage dialog at any time from Tools > Manage VSWarehouse ….
First, enter the hostname of your VSWarehouse server and click Connect. Any connection issues are displayed in the details view below. If you have never connected to the given server before, you may be prompted to Activate your user credentials on the server first.
VSWarehouse is licensed by the number of active VarSeq users that can connect to it. If prompted, click the link; you will be taken to a login page (use your VarSeq login credentials) and then to a license key Verify and Activate page.
Once you have successfully activated on the server, you can return to VarSeq and click Reconnect.
The VSWarehouse server you last connected to is always saved, and your VarSeq credentials are used to perform all actions.
The administrators of the server can set per-resource permissions for individual users. Each tab lists only the resources you have permission to modify; for example, the Projects tab shows only the projects you can modify.
3.13.2. Managing and Using Warehouse Projects
The Projects section displays the projects that have been created on the VSWarehouse instance, along with details about the current version of each, such as its size and the number of variants it contains.
The Open Link button in the bottom left allows you to open the selected project in an external browser.
Projects on VSWarehouse utilize the same workflow engine that powers VarSeq, and thus you have full control over the annotations and algorithms run on your warehouse projects by providing it with a project template.
A new project can either use the existing open VarSeq project as a template, or use a previously saved project template.
To create a project, click the Create button on the Projects tab. This opens the Create Warehouse Project dialog. Provide a Name and Description for the project. Both can be edited later, but they are used as the names of the project and its annotations.
Whether you are using the current project’s annotations and algorithms as a template, or specifying a previously saved template, you will get some summary information about the selected template.
Note that you will be warned if you do not have an Allele Counts algorithm in the selected template, as this algorithm is what provides the expected and highly useful allele counts and frequencies of all uploaded samples.
You may also get warnings about extra algorithms in your template that will not work in VSWarehouse. It is suggested that you create a specially tailored project with just the annotations and algorithms you would like to have in your VSWarehouse instance.
After a template has been uploaded you will have the option to add samples from the current project.
Your project template defines not only which annotations and algorithms (such as Count Alleles) are run on your uploaded samples, but also the default visibility of fields and their source groups. Before running this step, use the Show/Hide Columns and Groups dialog on your spreadsheet to pick the fields that you would like to query and see in results views for your new warehouse project. You can edit these visibility preferences later through the web-based warehouse Manage interface.
The Edit button opens the Edit Warehouse Project dialog, which is similar to the create dialog and allows you to update the Name, the Description, and the project template used by the project.
Note that when updating the template, you can choose either to have the updated template take effect the next time samples are added or queued samples are imported, or to start an import of the existing samples using the new annotations or algorithms defined by the updated workflow template.
Adding Samples To Projects
You can add samples incrementally to your VSWarehouse project from any open VarSeq project.
A new project is not displayed in the VSWarehouse web interface until samples have been added to it.
Select a project, click Add Samples, and the Add Samples to Warehouse wizard will launch.
Following the wizard, you will be able to:
Edit the names of samples and their other attributes.
Subset which samples from your project you want to upload using the check-boxes on the sample rows.
Select which of your optional sample fields you want to upload (for example, BAM File Path is not generally helpful and is unchecked by default).
Optionally add further sample fields using the From Text File button. From the VSWarehouse web interface, you can filter on any of these fields in the Explore Samples table to find or subset samples of interest.
On the Review page, specify an identifier for this uploaded set of samples.
Specify whether to run an import of these samples into the project immediately or to queue it for the next batch import (scheduled by the VSWarehouse administrator, often daily or weekly).
Sample names must be unique within their warehouse project. If you set the name of a sample to one that has already been uploaded to the project, the sample is automatically renamed to make it unique. For example, “NA12878” might become “NA12878-2”. A renamed sample can therefore be a useful signal for detecting and removing potentially duplicate samples from your upload. You can also use the Project_Sample button to rename all your samples to the concatenation of the current VarSeq project name and the original sample name.
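The uniqueness rule above can be illustrated with a short Python sketch. It is not VSWarehouse code: the function names are hypothetical, the suffix scheme beyond the documented “NA12878” to “NA12878-2” example is an assumption, and the underscore separator for the Project_Sample style is assumed from the button's label.

```python
def make_unique(name, existing):
    """Return name unchanged if unused, else append -2, -3, ... until unique.

    Assumption: only the "-2" case appears in the documentation; continuing
    with "-3", "-4" is a guess at how further collisions would be handled.
    """
    if name not in existing:
        return name
    suffix = 2
    while f"{name}-{suffix}" in existing:
        suffix += 1
    return f"{name}-{suffix}"


def project_sample_name(project, sample):
    """Concatenate project and sample names (the "_" separator is assumed)."""
    return f"{project}_{sample}"


def plan_upload(names, already_uploaded):
    """List (original, final, was_renamed) for each sample in an upload."""
    taken = set(already_uploaded)
    plan = []
    for name in names:
        final = make_unique(name, taken)
        taken.add(final)  # later samples in the same upload must avoid it too
        plan.append((name, final, final != name))
    return plan


if __name__ == "__main__":
    for original, final, renamed in plan_upload(["NA12878", "NA12891"], {"NA12878"}):
        flag = "possible duplicate" if renamed else "ok"
        print(f"{original} -> {final} ({flag})")
```

Checking the `was_renamed` flag before uploading mirrors the suggestion above: a sample that would be renamed is a candidate duplicate worth reviewing.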
3.13.3. Using Warehouse Variants Annotation Sources
Each resource in VSWarehouse that stores variants (Projects, Reports, and Catalogs) provides those variants as annotation sources that can be added to your VarSeq project and even become part of a saved project template workflow.
Annotations from Projects can also be plotted as annotation tracks by selecting them and clicking Plot in GenomeBrowse. This is especially useful as you investigate the genomic context of a variant of interest. While you may not have a direct annotation from your warehouse, you may find nearby variants that share potentially similar interpretations.
Annotations of variants recorded in uploaded Reports are broken down by their respective report sections. These are real-time queries against all submitted reports, and can be used as table annotations but not plotted directly.
3.13.4. Warehouse Hosted Reports
By default, the VSReports templates used in VarSeq save the entered sample data, results and per-variant assessments in the current project folder, along with the final rendered HTML version of the report.
With VSWarehouse, these report templates can be centrally managed and versioned and, most importantly, the sample-level and variant-level report data is saved and indexed.
To move a local VSReports template to a VSWarehouse server, click Create and provide the requested information about the local report and the genome coordinate system the report variants will be in.
All the global report template parameters that you can freely edit for a local report are frozen once added to VSWarehouse. So make any changes before uploading by going to the Configure Report Template menu from the gear icon in the open report tab. See VSReport View.
If you make changes to your local report, such as updating its rendering or adding or removing fields, you can Edit your warehouse report at any time and select Update Report Template. You can also change its Name and Description.
To add sample reports to your warehouse server, select a report and click Open to open a Report tab with the selected VSWarehouse report open. You can also reach the same view by opening a new Report tab and using the report selector in the top-left of the window.
When you “Sync” the current sample in an open VSWarehouse Report, you are either creating a new report sample record or updating an existing one with the provided sample and report findings.
A browser tab will be opened to the rendered report, but note it is now hosted on the VSWarehouse instance and not locally in the project folder.
3.13.5. Warehouse Hosted Assessment Catalogs
VSWarehouse can host Assessment Catalogs. This makes it easy to share and iteratively collaborate on a centralized Assessment Catalog, which can then be used as an annotation source. Additionally, VSWarehouse assessment catalogs can be viewed and manipulated using the standard filtering and search functionality that is available across VSWarehouse.
To create a new Assessment Catalog on a warehouse server, click Create. This opens the assessment catalog schema editor, which allows you to specify the name, description, and genome coordinate system, as well as create a schema or load an existing one. Once finished, click Save to create the catalog. This opens a dialog with a link to the catalog creation job on the warehouse server. Once the job is complete, the Assessment Catalog will appear in the warehouse server catalog list.
If you want to change the schema or the meta information for an assessment catalog that has already been created, click the Edit button. For more information on the schema editor, see Editing Assessment Catalog Schemas. When finished, save the changes; the catalog will reflect them once it has finished updating.
To open a catalog, make sure that you have a project open, then click Open. This opens an Assessment Catalog View in the project, linked to the selected assessment catalog. From there, the catalog can be edited using the same controls that are used for local assessment catalogs. To learn more about editing variants, see Record Entry.
By default, all of the fields in an assessment catalog are available when it is used as an annotation source. To use it as an annotation source, add the corresponding source from the annotations tab as described in Using Warehouse Variants Annotation Sources.
To move a local Assessment Catalog to a VSWarehouse server, create a new Assessment Catalog on the VSWarehouse server using the schema from the existing Assessment Catalog. Next, open the assessment catalog import wizard, which allows you to select the local Assessment Catalog as an input source and transfer the existing records by mapping fields between the two sources, as shown in Batch Assessment Import.
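As a mental model of what "mapping fields between the two sources" means, the following Python sketch translates records from one schema to another using an explicit field map. It is only an illustration under stated assumptions: the function names and the field names (Classification, Notes, classification) are hypothetical, and the import wizard's actual behavior is documented in Batch Assessment Import, not here.

```python
def map_record(record, field_map, defaults=None):
    """Translate one source record into the target schema.

    field_map maps source field name -> target field name (hypothetical names).
    Source fields that are not in field_map are dropped; target fields can be
    pre-filled through defaults.
    """
    target = dict(defaults or {})
    for source_field, target_field in field_map.items():
        if source_field in record:
            target[target_field] = record[source_field]
    return target


def unmapped_fields(records, field_map):
    """Report source fields that no mapping covers, so nothing is lost silently."""
    seen = set()
    for record in records:
        seen.update(record)
    return sorted(seen - set(field_map))


if __name__ == "__main__":
    local_records = [{"Classification": "Pathogenic", "Notes": "reviewed"}]
    field_map = {"Classification": "classification"}
    print([map_record(r, field_map) for r in local_records])
    print("Not mapped:", unmapped_fields(local_records, field_map))
```

Listing unmapped fields before a transfer is a simple way to decide whether the target schema needs another field or the data can be left behind.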