To run PBAT in SVS 7 you need, at minimum, a spreadsheet containing pedigree information (Family ID, Patient ID, Mother ID, Father ID, Sex, and Affection Status) and genetic data (either genotypes or continuous variables, such as log ratios). A fundamental change from previous versions of Golden Helix PBAT is how phenotype information is handled: to access phenotype data in SVS 7, you must first join it with your pedigree and genetic data. The following steps lead you through importing each data type separately and then merging them into a single spreadsheet.
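Before importing, it can help to confirm that a pedigree file actually carries the six fields listed above. The following is a minimal sketch in Python, run outside of SVS; the header names in `REQUIRED_FIELDS` and the helper `missing_pedigree_fields` are assumptions made for illustration, and the column labels in your own file may differ.

```python
import csv
import io

# Hypothetical header names; the actual column labels in your file may differ.
REQUIRED_FIELDS = [
    "Family ID", "Patient ID", "Father ID",
    "Mother ID", "Sex", "Affection Status",
]

def missing_pedigree_fields(csv_text):
    """Return the required pedigree fields that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader, [])]
    return [field for field in REQUIRED_FIELDS if field not in header]

# Made-up example row, not taken from the tutorial data.
example = (
    "Family ID,Patient ID,Father ID,Mother ID,Sex,Affection Status\n"
    "F1,S1,S2,S3,1,2\n"
)
print(missing_pedigree_fields(example))
```

An empty list means every expected field was found; any names returned are the ones to add or relabel before import.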
Before you can begin, you need to create a new project.
The first file to import is CEU - PED.csv, contained within the downloaded zip file. This is a comma-delimited file with pedigree information for the CEU HapMap samples (Phase III).
Note
If the default options (?/0/1) are used, the spreadsheet will not be recognized as a pedigree spreadsheet.
This will create a new pedigree spreadsheet called CEU - PED Pedigree Dataset - Sheet 1 (Figure 2a).
Note
Pedigree spreadsheets are denoted as such by a pedigree icon in the Project Navigator as well as blue headers for pedigree columns at the front of the spreadsheet. If your imported spreadsheet has neither of these, it will not be recognized as a pedigree spreadsheet and certain analysis options will not be present.
Next, you need to import CEU - SIM - PHENO.csv. This is a comma-delimited file with simulated phenotype information. It is used for demonstration purposes only.
This will create a new spreadsheet called CEU - SIM - PHENO - Dataset - Sheet 1 (Figure 2b).
Last, you need to import CEU - GENO - Chr22.dsf. This file contains actual genotypes on chromosome 22 for the CEU samples, which were generated by a combination of Affymetrix and Illumina platforms.
This will create a new marker-mapped spreadsheet called CEU - GENO - Chr 22 - Sheet 1 (Figure 2c).
Now that you have all three spreadsheets in the project, you need to join them together. When joining spreadsheets, it doesn’t matter which one you start from. However, if you want certain data (e.g. phenotype data) located toward the front of the spreadsheet for easier viewing, initiate the join from that spreadsheet. When pedigree data is available (and denoted as such), it will always occupy the first few columns of the spreadsheet.
This will create a new spreadsheet called PED + PHENO - Sheet 1. Now join this one with the genotype spreadsheet.
You now have all the data in one spreadsheet, CEU All - Sheet 1, and are ready for analysis.
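Conceptually, the two joins above line up rows from the three spreadsheets by sample ID. The sketch below illustrates that idea in plain Python, outside of SVS. It assumes an inner join (only samples present in every table are kept), which may not match the join options you choose in SVS, and the function `join_by_sample`, the sample IDs, the trait column, and the marker name `rs0001` are all made up for illustration.

```python
def join_by_sample(*tables):
    """Inner-join dicts keyed by sample ID; columns follow the order of the tables."""
    if not tables:
        return {}
    # Keep only the sample IDs that appear in every table.
    shared = set(tables[0])
    for table in tables[1:]:
        shared &= set(table)
    joined = {}
    for sample in tables[0]:  # preserve the row order of the first table
        if sample in shared:
            row = {}
            for table in tables:
                row.update(table[sample])
            joined[sample] = row
    return joined

# Made-up toy data, not taken from the tutorial files.
pedigree = {
    "S1": {"Family ID": "F1", "Sex": "1"},
    "S2": {"Family ID": "F1", "Sex": "2"},
}
phenotype = {
    "S1": {"Trait": "0.8"},
    "S2": {"Trait": "1.3"},
    "S9": {"Trait": "2.1"},  # no pedigree or genotype record, so it is dropped
}
genotype = {
    "S1": {"rs0001": "A_G"},
    "S2": {"rs0001": "G_G"},
}

combined = join_by_sample(pedigree, phenotype, genotype)
for sample, row in combined.items():
    print(sample, row)
```

In this sketch the column order simply follows the order in which the tables are passed, which mirrors the advice to start the join from the spreadsheet whose columns you want toward the front. SVS itself always places recognized pedigree columns first.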
Note
In addition to family-based association testing with genotypes as covariates, you can also test for association with various CNV covariates. PBAT CNV Analysis is not covered in this tutorial, but it proceeds in the same manner as PBAT Genotype Analysis, except that you join your CNV data, rather than a genotype spreadsheet, with your pedigree and phenotype information. To learn more about processing CNV data, see the Copy Number Variation (CNV) Analysis Tutorial.