VarSeq CNV Command Line Runner

With the addition of the “Pipeline Runner” add-on to your license, the CNV command line runner can be run from a command shell to automate the calling of CNVs and LoHs.

Note

To add the Pipeline Runner to your license, contact info@goldenhelix.com.

Command Line Arguments

The command line runner supports a number of commands, which can be executed by running vscnv <username> <password> <command> <arguments>. Running a command with no arguments will display a help message describing the accepted arguments.

The vscnv commands have a number of optional arguments that modify command behavior. To specify one of these arguments simply provide --<argument>=<value> after the command name. An example using two command line arguments (inputVCF and callLoh) is shown below.

> vscnv user@goldenhelix.com password target \
        --inputVCF=filename.vcf.gz \
        --callLoh bamFile.bam \
        targetFile.bed output.tsv

Note that, for boolean arguments (such as callLoh), no value is specified.
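The argument syntax above can be wrapped in a small shell function so that optional arguments are added only when needed. The following is a minimal sketch, not part of vscnv: the function name call_target and the variables VSCNV, VSCNV_USER and VSCNV_PASSWORD are hypothetical names chosen for illustration. It only assembles the same target call shown above.

```shell
#!/bin/sh
# Hypothetical wrapper variables (not defined by vscnv itself):
#   VSCNV           path to the vscnv executable (defaults to "vscnv")
#   VSCNV_USER      license user name
#   VSCNV_PASSWORD  license password
VSCNV="${VSCNV:-vscnv}"

# call_target BAM BED OUT [VCF]
# Runs the "target" command. When a VCF is given, adds --inputVCF=<vcf>
# and the boolean --callLoh flag (which takes no value).
call_target() {
    bam=$1 bed=$2 out=$3 vcf=${4:-}
    if [ -n "$vcf" ]; then
        set -- "--inputVCF=$vcf" --callLoh
    else
        set --
    fi
    "$VSCNV" "$VSCNV_USER" "$VSCNV_PASSWORD" target "$@" "$bam" "$bed" "$out"
}
```

For example, call_target bamFile.bam targetFile.bed output.tsv filename.vcf.gz issues the same command as the example above, while omitting the fourth argument drops both optional arguments.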

Command Specification Tips

Any command parameter that contains a space must be quoted. Quotes may be double quotes (") or single quotes ('). Nesting quotes, or quoting values within quotes, may be achieved by using single quotes within double quotes, or by escaping the nested quotes with backslashes (\).

Backslashes (\) may be used in file path parameter values on Windows systems, but keep in mind that whenever a backslash is followed by an escapable character, it is treated as an escape rather than a backslash. For this reason, double-backslashes or escaped backslashes (\\) should be preferred. Note that forward slashes (/) work in file paths on any system including Windows and may therefore be simpler to use in all cases.
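Since forward slashes work on every system, one way to sidestep backslash escaping is to normalize paths before passing them to vscnv. The helper below is a minimal sketch for a POSIX shell; to_forward_slashes is a hypothetical name and is not part of vscnv.

```shell
# to_forward_slashes PATH
# Prints PATH with every backslash replaced by a forward slash.
# Hypothetical helper for illustration only.
to_forward_slashes() {
    printf '%s\n' "$1" | tr '\\' '/'
}
```

The result still needs to be quoted if it contains spaces, for example --referenceSamplesPath="$(to_forward_slashes 'C:\data\ref samples')".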

Gene Panel Workflow

The following example illustrates a typical gene panel workflow. This workflow first adds samples to the reference set, and then calls CNVs using the added reference samples.

> vscnv user@goldenhelix.com password addtoreference \
        --referenceSamplesPath="/home/user/refpath/" \
        /home/user/bams/*.bam targetFile.bed

The above command computes coverage for all samples in the folder /home/user/bams/ and adds the samples to the reference set. The command specifies a custom reference sample folder at /home/user/refpath/ and computes coverage over the target regions defined in targetFile.bed. Next, we will call CNVs using the newly added reference samples.

> vscnv user@goldenhelix.com password target \
        --referenceSamplesPath="/home/user/refpath/" \
        input.bam targetFile.bed output.tsv

The above command calls CNVs using coverage computed from input.bam. The program output is saved to output.tsv, and reference samples are selected from the folder /home/user/refpath/.
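When several samples need to be called against the same reference set, the target command can be repeated in a loop. The sketch below assumes one output file per sample, named after the BAM file; the function name call_panel_batch, the output naming scheme, and the variables VSCNV, VSCNV_USER and VSCNV_PASSWORD are illustrative assumptions rather than vscnv features.

```shell
#!/bin/sh
# Hypothetical wrapper variables (not defined by vscnv itself).
VSCNV="${VSCNV:-vscnv}"

# call_panel_batch REFDIR BED OUTDIR BAM...
# Runs "target" once per BAM file and writes OUTDIR/<sample>.tsv,
# where <sample> is the BAM file name without its .bam extension.
# Stops at the first failing call.
call_panel_batch() {
    refdir=$1 bed=$2 outdir=$3
    shift 3
    for bam in "$@"; do
        sample=$(basename "$bam" .bam)
        "$VSCNV" "$VSCNV_USER" "$VSCNV_PASSWORD" target \
            --referenceSamplesPath="$refdir" \
            "$bam" "$bed" "$outdir/$sample.tsv" || return 1
    done
}
```

A call such as call_panel_batch /home/user/refpath/ targetFile.bed /home/user/results /home/user/bams/*.bam would then process every BAM file in the folder in turn.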

Exome Workflow

The exome workflow is similar to the gene panel workflow described above. As before, we begin by adding samples to the reference set.

> vscnv user@goldenhelix.com password addtoreference \
        --referenceSamplesPath="/home/user/refpath/" \
        /home/user/bams/*.bam targetFile.bed

Next, we will call CNVs using the newly added reference samples, but this time we will specify a few additional parameters.

> vscnv user@goldenhelix.com password target \
        --referenceSamplesPath="/home/user/refpath/" \
        --inputVCF="variants.vcf" \
        --callLoh \
        --filterOutFlaggedCNVs \
        input.bam targetFile.bed output.tsv

The inputVCF parameter specifies a VCF file containing variants for the sample, which the algorithm will use to obtain variant allele frequencies and call losses of heterozygosity. The callLoh parameter instructs the algorithm to call LoH events and use these events as evidence when calling CNVs. The parameter filterOutFlaggedCNVs directs the algorithm to remove all flagged CNV events.

Whole Genome Workflow

The final example covers a whole genome workflow. As in the previous two examples, we will begin by adding samples to the reference set, but we will now compute coverage using regions defined by equal width bins, instead of a predefined target list.

> vscnv user@goldenhelix.com password addtoreference \
        --referenceSamplesPath="/home/user/refpath/" \
        /home/user/bams/*.bam

Notice that, in the above command, we have omitted the target BED file. This will cause the algorithm to compute coverage over equal width bins. The default size of each bin is 1 million base pairs, but other sizes may be specified with the binSize parameter. Next, we will use the bin command to call CNVs using the binned regions.

> vscnv user@goldenhelix.com password bin \
        --referenceSamplesPath="/home/user/refpath/" \
        input.bam output.tsv

This is very similar to the command used to call CNVs in our gene panel workflow, with two important differences. First, we are now using the bin command in place of target, and second, we have omitted the BED file parameter, since equal width bins are to be used in place of pre-specified target regions.
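The two whole genome steps can be chained in one script. The sketch below passes a custom bin size using the --<argument>=<value> syntax described earlier. Which commands accept binSize, and the expected format of its value, are assumptions here and should be checked against each command's help message. The names run_vscnv, wgs_workflow, VSCNV, VSCNV_USER and VSCNV_PASSWORD are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper variables (not defined by vscnv itself).
VSCNV="${VSCNV:-vscnv}"

# run_vscnv COMMAND ARGS...
# Prepends the credentials to every vscnv call.
run_vscnv() {
    "$VSCNV" "$VSCNV_USER" "$VSCNV_PASSWORD" "$@"
}

# wgs_workflow REFDIR BINSIZE INPUT_BAM OUT REF_BAM...
# 1. Adds the reference BAMs using equal width bins (no BED file).
# 2. Calls CNVs on INPUT_BAM with the bin command.
# ASSUMPTION: --binSize is accepted by both commands; verify with the help output.
wgs_workflow() {
    refdir=$1 binsize=$2 input=$3 out=$4
    shift 4
    run_vscnv addtoreference --referenceSamplesPath="$refdir" \
        --binSize="$binsize" "$@" || return 1
    run_vscnv bin --referenceSamplesPath="$refdir" \
        --binSize="$binsize" "$input" "$out"
}
```

Stopping after a failed addtoreference step avoids calling CNVs against an incomplete reference set.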