3. Import Phenotypic Data

Phenotype information is needed for most, but not all, analyses in SVS. In association and regression analysis, it typically supplies the dependent variable (e.g. case-control status) and the independent variables (e.g. gender, age). If you only have pedigree information, Affection Status is the phenotype variable to use as your dependent variable.

  1. Phenotype information usually comes in the form of a text file or Excel spreadsheet. To import a text file, from the Project Navigator, go to Import > Text. Here you will specify how your data is formatted and which column you want to use as the row labels. Under the Advanced Options tab, you can specify the following:

    • How your missing data is encoded in your text file
    • Whether or not there is genotypic data and how its alleles are delimited
    • How many header rows to skip, if any
    • The base numeric type
    • How real valued columns should be encoded

    The skip header rows option applies to files that contain ancillary information before the data you want to import begins, as highlighted in the Illumina Final Report file in Figure 3. See Text File for more information.

_images/illumina_text_file.png

Figure 3. Illumina text file
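To make the import options above concrete, here is a minimal Python sketch of the same ideas applied outside SVS: skipping ancillary header rows, choosing a row-label column, and mapping missing-value codes to an empty value. It is not how SVS reads files; the function `read_phenotypes`, its parameters, and the sample data are hypothetical and exist only for illustration.

```python
import csv

def read_phenotypes(text, skip_rows=0, delimiter="\t", label_column=0,
                    missing=("", "?", "NA")):
    """Parse delimited phenotype text into {row_label: {column: value}}.

    Hypothetical helper for illustration; not part of SVS.
    """
    # Drop the ancillary rows that precede the real column header.
    lines = text.splitlines()[skip_rows:]
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)
    table = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        label = row[label_column]
        # Store every non-label column, turning missing codes into None.
        table[label] = {
            name: (None if value in missing else value)
            for i, (name, value) in enumerate(zip(header, row))
            if i != label_column
        }
    return table

# Made-up file with three ancillary rows before the data starts.
sample = (
    "[Header]\n"
    "Content\texample\n"
    "[Data]\n"
    "Sample\tStatus\tAge\n"
    "S1\t1\t34\n"
    "S2\t0\t?\n"
)

phenotypes = read_phenotypes(sample, skip_rows=3)
print(phenotypes["S2"])
```

Here `skip_rows=3` plays the role of the skip header rows option, `label_column` selects the row labels, and `missing` lists the codes treated as missing data.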

  1. If your phenotype data is in an Excel spreadsheet, from the Project Navigator, go to Import > Third Party. Click the Browse button to locate your file. Third Party supports many file formats; to import an Excel file, select Excel (*.xls) or Excel 2007 (*.xlsx) from the file type drop-down (Figure 4). The import produces a phenotype spreadsheet. See Third Party File for more information.

_images/third_party.png

Figure 4. Third Party file format selection dialog

  1. For SVS to perform the correct statistical tests, phenotype data must be in the proper format. SVS generally detects the format of each variable in a dataset well upon import, but the detected format may not be what the researcher intended (e.g. categorical data represented as numbers will be interpreted as integers). Use the Spreadsheet Editor (Edit > Edit this Spreadsheet) to make sure every variable is in the proper format.

    For more information on using the Spreadsheet Editor, see Editing a Spreadsheet in the Golden Helix SVS Manual.
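The pitfall described in this step, numerically coded categories being read as integers, can be illustrated with a short Python sketch. The detection logic below is a deliberately naive stand-in and is not SVS's actual algorithm; `infer_type`, `recode`, and the status labels are hypothetical names chosen for the example.

```python
def infer_type(values):
    """Guess a column type as a naive importer might.

    Hypothetical illustration only; not SVS's detection logic.
    """
    present = [v for v in values if v is not None]  # skip missing entries
    for name, cast in (("integer", int), ("real", float)):
        try:
            for v in present:
                cast(v)
            return name  # every present value parsed with this cast
        except ValueError:
            continue
    return "categorical"

def recode(values, labels):
    """Replace coded values with explicit category labels, keeping missing as None."""
    return [labels.get(v, v) if v is not None else None for v in values]

# Case-control status coded as 0/1, with one missing value.
status = ["1", "0", "1", None]
print(infer_type(status))

# Recoding to text labels makes the categorical intent explicit.
recoded = recode(status, {"0": "Control", "1": "Case"})
print(infer_type(recoded))
```

A 0/1 status column looks like an integer column to this kind of detection, which is why a column's type may need to be corrected by hand after import.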
