2. The Filter View Interface

Highlighted Filter View

Figure 2-1: The VarSeq window with the filter view highlighted and detailed

  1. First, we will examine the input and output number of variants in the filter chain. You should see a number at the top right of the filter chain (as noted in Figure 2-1) that indicates how many variants were imported from the VCF file for the current sample. This number should be 14,597. See also Figure 2-2.

    Input Variants

    Figure 2-2: The number of variants from the VCF file for the current sample

    You should also see a number at the bottom right of the filter chain (as noted in Figure 2-1) that indicates how many variants remain for the current sample after applying all enabled filters in the current filter chain. This number should be 1. See also Figure 2-3.

    Output Variants

    Figure 2-3: The number of variants from the VCF file for the current sample after applying all of the filters in the filter chain

    Note

    Two filters are currently not enabled. These filters are grayed out, and all variants pass through them (a disabled card has no effect on the variants in the filter chain).

    You should also notice a number on each filter card, enabled or otherwise. This is the number of variants that remain in the filter chain after applying that particular filter.
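
    The counting behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not VarSeq's implementation: the `FilterCard` class, the `run_chain` function, and the sample variants are all hypothetical names and data made up for this example. It shows each card reporting the number of variants remaining after it, with a disabled card passing everything through unchanged.

    ```python
    from dataclasses import dataclass
    from typing import Callable


    @dataclass
    class FilterCard:
        """Hypothetical stand-in for a filter card (not a VarSeq API)."""
        name: str
        predicate: Callable[[dict], bool]
        enabled: bool = True


    def run_chain(variants, cards):
        """Apply cards in order; record how many variants remain after each card."""
        remaining = list(variants)
        counts = []
        for card in cards:
            if card.enabled:
                remaining = [v for v in remaining if card.predicate(v)]
            # A disabled card still shows a count, but it equals its input count.
            counts.append((card.name, len(remaining)))
        return remaining, counts


    # Made-up variants with read depth (dp) and genotype quality (gq) values.
    variants = [
        {"id": "v1", "dp": 150, "gq": 99},
        {"id": "v2", "dp": 40, "gq": 99},
        {"id": "v3", "dp": 120, "gq": 10},
        {"id": "v4", "dp": 300, "gq": 60},
    ]
    cards = [
        FilterCard("DP >= 100", lambda v: v["dp"] >= 100),
        FilterCard("GQ >= 20", lambda v: v["gq"] >= 20, enabled=False),
        FilterCard("DP >= 200", lambda v: v["dp"] >= 200),
    ]

    output, counts = run_chain(variants, cards)
    for name, count in counts:
        print(f"{name}: {count}")
    ```

    In this toy chain the disabled card reports the same count as the card before it, which mirrors how a grayed-out card leaves the variant set untouched.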

  2. Next, we will examine a filter in more detail. In the current project, the filters are shown in a collapsed state. To examine a filter, expand its card to show the filter criteria. Click on the open square (see Figure 2-4) for the Read Depths (DP) (Current) card.

    Expand Icon for RD Card

    Figure 2-4: Click on the open square for the Read Depths (DP) (Current) card to expand the card and see the filtering criteria

    This filter is a numeric filter, which means there is a numeric threshold box along with options for how to filter based on that threshold. Currently, a variant is kept for the current sample only if its read depth is greater than or equal to 100 reads. See Figure 2-5.

    Expanded Numeric Card

    Figure 2-5: 13,222 variants had a read depth of less than 100. The number of variants with a read depth score greater than or equal to 100 was 797. 578 variants were recorded as missing for the current sample.
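
    As a rough illustration of how a numeric card divides its input, the sketch below splits values into three groups for a "greater than or equal to" rule: at or above the threshold, below it, and missing. The function name `partition_by_threshold` and the sample read depths are hypothetical, and the sketch makes no claim about how VarSeq treats missing values beyond counting them separately, as Figure 2-5 does.

    ```python
    def partition_by_threshold(values, threshold):
        """Count values at or above the threshold, below it, and missing (None)."""
        at_or_above = below = missing = 0
        for value in values:
            if value is None:
                missing += 1
            elif value >= threshold:
                at_or_above += 1
            else:
                below += 1
        return at_or_above, below, missing


    # Made-up read depths; None marks a missing value.
    read_depths = [250, 100, 99, None, 12, 57, None, 480]

    at_100 = partition_by_threshold(read_depths, 100)
    at_50 = partition_by_threshold(read_depths, 50)
    print(at_100)
    print(at_50)

    # The three groups always add up to the number of input values.
    assert sum(at_100) == len(read_depths)

    # The same bookkeeping holds for the counts reported in Figure 2-5.
    assert 13_222 + 797 + 578 == 14_597
    ```

    Lowering the threshold can only move values from the "below" group into the "at or above" group, which is why more variants pass when the threshold is reduced.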

  3. To see what happens to the filter chain when a numeric filter changes, change the value of 100 to 50. You should see that 1,077 variants now pass this filter. See Figure 2-6.

    Changed RD Threshold

    Figure 2-6: Changed the read depth threshold from 100 to 50 which resulted in the filter chain updating the resulting variants

    Now, change the threshold back to 100 and collapse the card by clicking on the horizontal line in the upper right corner of the card. See Figure 2-7.

    Collapse Read Depth Card

    Figure 2-7: Collapse the Read Depth card to hide the threshold and criteria information to conserve space

  4. Filter containers are a great way to group filter cards. To make a filter container, right-click inside the filter chain window and select Add Filter Container. See Figure 2-8. Rename the filter container to ‘Quality Control Filters’ by double-clicking on the filter container title. See Figure 2-9.

    Add Filter Container

    Figure 2-8: Right click in the empty space underneath the filter chain then select Add Filter Container

    Filter Container name change

    Figure 2-9: You can change the name of any filter card by double-clicking on the filter card or filter container.

    To change the order of the filtering operations, click on a card and drag it to the desired location. We will demonstrate this by moving the Read Depths (DP) (Current) and the Genotype Qualities (GQ) (Current) filter cards into the Quality Control Filters filter container.

    First, click on the Read Depths (DP) (Current) filter card, drag it, and hover inside the Quality Control Filters filter container until the black horizontal I-bar appears under the Quality Control Filters filter container title. Then release the card. See Figure 2-10.

    Add read depth

    Figure 2-10: Move the Read Depths filter into the Quality Control Filters filter container

    Repeat this process with the Genotype Qualities (GQ) (Current) card. See Figure 2-11.

    Add genotype qualities

    Figure 2-11: Both Read Depth and Genotype Qualities filter cards are within the Quality Control Filters filter container

    Finally, move the Quality Control Filters filter container to the top of the filter chain: click on the filter container, drag it, and hover over the top of the filter chain until the black horizontal I-bar appears over the All MAF card. See Figure 2-12.

    Filter container moved

    Figure 2-12: Move the Quality Control Filters filter container to the top of the filter chain.

  5. To add an annotation source to the project, click on the Add Icon and select Add Variant Annotation. See Figure 2-13.

    Add annotation source

    Figure 2-13: Use the Add Icon to add annotation sources to your project

    We will add an additional population frequency database. To do this, select the Public Annotations folder and then select 1kG Phase3-Variant Frequencies 5a with Genotype Counts, GHI. You can also use the Filter search bar at the top of the window to easily search for annotation sources. See Figure 2-14.

    Add frequency track

    Figure 2-14: Add 1kG Phase3-Variant Frequencies 5a with Genotype Counts, GHI to the project

    Once added, scroll all the way to the right in the variant table to see 1kG Phase3-Variant Frequencies 5a with Genotype Counts, GHI within the table along with all of its associated fields. Any field in the variant table can be added to the filter chain by right-clicking on the field and selecting Add to Filter Chain. Let’s do this with the Allele Frequencies field. In addition, when you click on any column header in the variant table, the Details view updates to show information about the source, as well as a table of the values contained in the column and the proportion of the input variants that fall into each category. See Figure 2-15.

    Add field from annotation

    Figure 2-15: Add the Allele Frequencies field from the 1kG Phase3-Variant Frequencies 5a with Genotype Counts, GHI source to the filter chain. Also note the updated details view

    Now, set the numeric threshold to 0.3 for the Allele Frequencies filter card, then move the filter card under the All MAF filter card. See Figure 2-16.

    Set Threshold

    Figure 2-16: Set the threshold for allele frequency to 0.3 and move the filter card under the All MAF filter card.

  6. Next, expand the All MAF filter card and click on the yellow histogram icon. This will give a histogram of the minor allele frequencies for input variants for this card. See Figure 2-17.

Set Threshold

Figure 2-17: The histogram of the All MAF (NHLBI) values for the variants input into the filter.
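
For readers who want a feel for what such a histogram summarizes, here is a minimal Python sketch that counts frequency values into equal-width bins. It is an assumption-laden illustration rather than a description of how VarSeq draws its plot: the function name `histogram`, the bin count, the 0.0 to 0.5 range, and the sample values are all made up for this example.

```python
def histogram(values, bin_count=10, low=0.0, high=0.5):
    """Count values into equal-width bins over [low, high]; skip missing values."""
    counts = [0] * bin_count
    width = (high - low) / bin_count
    for value in values:
        if value is None or not (low <= value <= high):
            continue
        # Clamp so that a value equal to `high` lands in the last bin.
        index = min(int((value - low) / width), bin_count - 1)
        counts[index] += 1
    return counts


# Made-up minor allele frequencies; None marks a missing value.
maf_values = [0.001, 0.004, 0.02, 0.049, 0.12, 0.31, 0.5, None]

counts = histogram(maf_values, bin_count=5, low=0.0, high=0.5)
for i, count in enumerate(counts):
    print(f"bin {i}: {'#' * count}")
```

A distribution like this one, with most values in the lowest bin, is the kind of shape that makes it easier to judge where a frequency threshold will cut.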