6. References


Abt, M., Lim, Y., Sacks, J., Xie, M., and Young, S. S. (2001), ‘A Sequential Approach for Identifying Lead Compounds in Large Chemical Databases’, Statistical Science, 16, 154-168.


Affymetrix (2007), ‘CNAT 4.0: Copy Number and Loss of Heterozygosity Estimation Algorithms for the GeneChip® Human Mapping 10/50/100/250/500K Array Set’, Revision Version 1.2


Anders and Huber (2010), ‘Differential expression analysis for sequence count data’, Genome Biology 2010, 11:R106


Biggs, D., B. deVille, and E. Suen (1991). ‘A method of choosing multiway partitions for classification and decision trees’, Journal of Applied Statistics 18, 49.


Bolstad, B.M., Irizarry, R.A., Astrand, M., Speed, T.P. (2003) ‘A Comparison of Normalization Methods for High Density Oligonucleotide Array Data based on Variance and Bias’. Bioinformatics Vol 19 no. 2, p.185–193


Bolstad, Ben (2001), ‘Probe Level Quantile Normalization of High Density Oligonucleotide Array Data’, Division of Biostatistics, University of California, Berkeley.


Browning, Brian L., and Browning, Sharon R. (2009) ‘A Unified Approach to Genotype Imputation and Haplotype-Phase Inference for Large Data Sets of Trios and Unrelated Individuals’, Appendix 1, Am. J. Hum. Genet. 84(2): 210–223.


Carlson, C., Eberle, M., Rieder, M., Yi, Q., Kruglyak, L., Nickerson, D., (2004), ‘Selecting a Maximally Informative Set of Single-Nucleotide Polymorphisms for Association Analysis Using Linkage Disequilibrium’, Am. J. Hum. Genet. 74, 106–120.


Chiano M. N., Clayton D. G. (1998), ‘Fine genetic mapping using haplotype analysis and the missing data problem.’ Ann. Hum. Genet. 62, 55–60.


Cohen J. C., Kiss R. S., Pertsemlidis A., Marcel Y. L., McPherson R., Hobbs H. H. (2004), ‘Multiple rare alleles contribute to low plasma levels of HDL cholesterol.’ Science 305 (5684):869-72.


Davydov EV, Goode DL, Sirota M, Cooper GM, Sidow A, et al. (2010) ‘Identifying a High Fraction of the Human Genome to be under Selective Constraint Using GERP++.’ PLoS Comput Biol 6(12): e1001025. doi:10.1371/journal.pcbi.1001025.

Visit http://mendel.stanford.edu/sidowlab/downloads/gerp/


Dempster, A. P., Laird, N. M., Rubin D., (1977), ‘Maximum likelihood from incomplete data via the EM algorithm.’ J of the Royal Stat Soc B 39: 1-38.


B. Devlin, Kathryn Roeder, ‘Genomic Control for Association Studies’, Biometrics, Vol. 55, No. 4 (Dec., 1999), pp. 997–1004


Diskin SJ, Li M, Hou C, Yang S, Glessner J, Hakonarson H, Bucan M, Maris JM, Wang K., ‘Adjustment of genomic waves in signal intensities from whole-genome SNP genotyping platforms’, Nucleic Acids Research 36:e126, 2008


Durstenfeld, Richard, (July 1964), ‘Algorithm 235: Random permutation’ Communications of the ACM Vol 7 no. 7, p.420.


Emigh, T. H., (1980), ‘Comparison of tests for Hardy-Weinberg Equilibrium’ Biometrics 36: 627–642.


Excoffier L, Slatkin M (1995) ‘Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population.’ Molecular Biology and Evolution 12: 921–927.


Fallin D, Schork NJ (2000) ‘Power of omnibus likelihood ratio test for haplotype-based case-control studies.’ Am J Hum Genet 67(S2): 214 (abstract).


Fardo DW, Ionita-Laza I, Lange C (2009) ‘On Quality Control Measures in Genome-Wide Association Studies: A Test to Assess the Genotyping Quality of Individual Probands in Family-Based Association Studies and an Application to the HapMap Data.’ PLoS Genet 5(7): e1000572. doi:10.1371/journal.pgen.1000572


Fernando R.L. (2009) ‘Genomic Selection: Bayesian Methods’ [PDF Document]. Retrieved from http://www.ans.iastate.edu/stud/courses/short/2009/B-Day2-3.pdf


Fernando R.L. (2009) ‘Rohan Fernando’s implementation of Bayes CPi’ [Computer Program]. Retrieved from http://www.ans.iastate.edu/stud/courses/short/2009/B-Day3/BayesCPi.R


Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, et al. (2002) ‘The structure of haplotype blocks in the human genome.’ Science 296: 2225–2229.


The Genome Sequencing Consortium (2001 Feb 15). ‘Initial sequencing and analysis of the human genome’. Nature, 409(6822), 860–921.


Greene, W.H. Econometric Analysis, 3rd Ed. Prentice Hall, NJ, (1997), pp 882–886.


Habier D., Fernando R.L., Kizilkaya K., Garrick D.J. (2011) ‘Extension of the bayesian alphabet for genomic selection’, BMC Bioinformatics, 12:186, doi:10.1186/1471-2105-12-186


Halko N., Martinsson P.G., Tropp J.A. (2010) ‘Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions’, arXiv:0909.4061v2 [math.NA] 14 Dec 2010


Hartigan J.A., Wong M.A. (1979) ‘Algorithm AS 136: A K-Means Clustering Algorithm’, Journal Of The Royal Statistical Society: Series C (Applied Statistics), 28(1), 100.


Hawkins, D. M., (2002). ‘Fitting multiple change-points to data’, Computational Statistics and Data Analysis, 37, 323–341.


Hawkins, D. M., and Musser, B. J. (2001), ‘Feature selection with nondeterministic recursive partitioning’, Proceedings of the American Statistical Association [CD-ROM] Alexandria, VA: ASA.


Hawkins, D. M. and Musser, B. J., (1999) ‘One tree or a forest? Alternative dendrographic models’, Computing Science and Statistics, 30, 534–542.


Hawkins, D. M., Young, S. S., and Rusinko, A., (1997), ‘Analysis of a large structure-activity data set using recursive partitioning’, Quantitative Structure Activity Relationships, 16, 296–302.


Hawkins, D. M. and McKenzie, D. P., (1995). ‘A data-based comparison of some recursive partitioning procedures’, Proceedings, Statistical Computing Section, American Statistical Association, 245–252.


Hawkins, D. M. (1995). ‘FIRM: Formal Inference-based Recursive Modeling, release 2.’ Technical Report 546, University of Minnesota, School of Statistics.


Hawkins, D. M. and G. V. Kass (1982). ‘Automatic interaction detection’. In D. M. Hawkins (Ed.), Topics in Applied Multivariate Analysis. Cambridge University Press.


Hawkins, D. M. and Merriam, D. F. (1973) ‘Optimal zonation of digitized sequential data’. Jour. Math Geology, v. 5, no. 4, p. 389–395.


Hawkins, D. M. (1972) ‘On the choice of segments in piecewise approximation’. Jour. Inst. Math. Applications, v. 9, no. 2, p. 250–256.


Hill, D. A., L. M. Delaney, and S. Roncal (1997). ‘A Chi-Squared Automatic Interaction Detection (CHAID) analysis of factors determining critical outcomes’, The Journal of Trauma: Injury, Infection and Critical Care 42, 62–66.


Hooton, T. M., Haley, R.W., Culver, D. H., White, J. W., Morgan, W. M., and Carroll, R. J., (1981), ‘The joint associations of multiple risk factors with the occurrence of nosocomial infections’, American Journal of Medicine, 70, 960–970.


Hosmer, David W., and Lemeshow, Stanley, Applied Logistic Regression, second edition, John Wiley and Sons, 2000. See pp. 1 – 42 for a discussion of standard error and other related statistics for logistic regressions, with standard error specifically shown on p. 35.


Horvath, S., Xu, X., Lake, S.L., Silverman, E.K., Weiss, S.T. and Laird, N.M. (2004), ‘Family-based tests for associating haplotypes with general phenotype data: application to asthma genetics’, Genet Epidemiol, 26, 61–69.


Huang, H. C., T. K. Lin, and P. W. Ngui (1993). ‘Analyzing a mental health survey by Chi-Squared Automatic Interaction Detection’, Annals of the Academy of Medicine 22, 332–337.


Ionita-Laza, Iuliana, Perry, George H., Raby, Benjamin A., Klanderman, Barbara, Lee, Charles, Laird, Nan M., Weiss, Scott T., and Lange, Christoph, (2007). ‘On the Analysis of Copy-Number Variations in Genome-Wide Association Studies: A Translation of the Family-Based Association Test’, Genetic Epidemiology 32, 1–11.


Kang HM, et al (2008). ‘Efficient control of population structure in model organism association mapping’, Genetics, 178, 1709–1723.


Kang HM, et al (2010). ‘Variance component model to account for sample structure in genome-wide association studies’, Nature Genetics 42, 348–354.


Karolchik, D., Hinrichs, A.S., Furey, T.S., et al (2004 Jan 1). ‘The UCSC Table Browser data retrieval tool’. Nucleic Acids Res., 32(Database issue), D493–6.


Kass, G. V., (1980), ‘An exploratory technique for investigating large quantities of categorical data’, Applied Statistics, 29, 119–127.


Kass, G. V. (1975). ‘Significance testing in, and some extensions of Automatic Interaction Detection.’ Ph. D. thesis, University of the Witwatersrand, Johannesburg.


Kent, W.J. (2002 April). ‘BLAT - the BLAST-like alignment tool’. Genome Res. 12(4), 656–64.


Kent, W.J., Sugnet, C.W., Roskin, K.M., Pringle T.H., Zahler, A.M., Haussler, D. (2002, June). ‘The human genome browser at UCSC’, Genome Res., 12(6), 996–1006.


Knapp, M. (1999), ‘A Note on Power Approximations for the Transmission/Disequilibrium Test.’ Am J Hum Genet 64:1177–1185.


Lange C, DeMeo D, Laird NM (2002) ‘Power and design considerations for a general class of family-based association tests: Quantitative traits.’ Am J Hum Genet 71:1330–1341.


Lange C, Laird NM (2002) ‘On a general class of conditional tests for family-based association studies in genetics: the asymptotic distribution, the conditional power and optimality considerations.’ Genetic Epidemiology 23:165–180.


Lange C, Laird NM (2002) ‘Analytical sample size and power calculations for a general class of family-based association tests: dichotomous traits.’ Am J Hum Genet 71:575–584.


Lee S, et al (2012) ‘Optimal Unified Approach for Rare-Variant Association Testing with Application to Small-Sample Case-Control Whole-Exome Sequencing Studies’ Am J Hum Genet 91:224–237


Lee SH, Yang J, Goddard ME, Visscher PM, Wray NR (2012) ‘Estimation of pleiotropy between complex diseases using SNP-derived genomic relationships and restricted maximum likelihood’, Bioinformatics. 2012 Oct 28(19): 2540-2542. PubMed ID: 22843982


Lee S, Wu MC, Lin X (2012) ‘Optimal tests for rare variant effects in sequencing association studies’ Biostatistics 13, 4, pp. 762-775


Lencz T, Lambert C, DeRosse P, Burdick KE, Morgan V, Kane JM, Kucherlapati R, Malhotra AK. ‘Runs of homozygosity reveal highly penetrant recessive loci in schizophrenia.’ PNAS 2007 104(50) 19942-19947; doi:10.1073/pnas.0710021104.


Leviyang S., Hamilton M. B. (2010) ‘Properties of Weir and Cockerham’s FST estimators and associated bootstrap confidence intervals.’ Theoretical population biology 79.1 (2011): 39-52


Li B, Leal S (2008) ‘Methods for Detecting Associations with Rare Variants for Common Diseases: Application to Analysis of Sequence Data’ Am J Hum Genet 83:311–321.


Liu D, Leal S (2010) ‘A Novel Adaptive Method for the Analysis of Next-Generation Sequencing Data to Detect Complex Trait Associations with Rare Variants Due to Gene Main Effects and Interactions’ PLoS Genet 6(10): e1001156. doi:10.1371/journal.pgen.1001156.


Liu H, Tang Y, Zhang HH (2009) ‘A new chi-square approximation to the distribution of non-negative definite quadratic forms in non-central normal variables’ Computational Statistics and Data Analysis 53 (2009) 853-856


Magi R and Morris AP (2010) ‘GWAMA: software for genome-wide association meta-analysis’, BMC Bioinformatics 2010, 11:288


Mehta C, Patel N (1983) ‘A network algorithm for performing Fisher’s exact test in r x c contingency tables’ J. Am. Stat. Assoc. 78:427–434


Mehta C, Patel N (1986) ‘FEXACT: a FORTRAN subroutine for Fisher’s exact test on unordered r x c contingency tables’ ACM Transactions on Mathematical Software (TOMS) Volume 12 Issue 2 pp. 154–161.


Morgenthaler S, Thilly WG (2007) ‘A strategy to discover genes that carry multi-allelic or mono-allelic risk for common diseases: a cohort allelic sums test (CAST).’ Mutat Res. 615(1-2):28–56.


Musser, B. J. (1999) ‘Extensions to Recursive Partitioning’ Ph.D. Thesis, University of Minnesota School of Statistics.


Neves H.H.R., Carvalheiro R., Queiroz S.A. (2012) ‘A comparison of statistical methods for genomic selection in mice population’, BMC Genetics, 13:100, 1471-2156/13/100


Ng A.Y., Jordan M.I., Weiss Y. (2002) ‘On spectral clustering: Analysis and an algorithm.’ Advances in neural information processing systems 2 (2002): 849-856.


Nicol JW, Helt GA, Blanchard SG Jr, Raja A, Loraine AE (2009) ‘The Integrated Genome Browser: free software for distribution and exploration of genome-scale datasets.’ Bioinformatics. 25(20):2730-1. PubMed PMID: 19654113; PubMed Central PMCID: PMC2759552


Nielsen D, Ehm M, Weir BS (1998) ‘Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus.’ Am J Hum Genet 63: 1531–1540.


Patterson N, Price AL, Reich D (2006) ‘Population Structure and Eigenanalysis.’ PLoS Genet 2(12): e190. doi:10.1371/journal.pgen.0020190.


Pollard KS, Hubisz MJ, Rosenboom K, Siepel A (2010) ‘Detection of non-neutral substitution rates on Mammalian phylogenies.’ Genome Res 20:110-121, 2010. Visit http://compgen.cshl.edu/phast/background.php


Price, Alkes L., Patterson, Nick J., Plenge, Robert M., Weinblatt, Michael E., Shadick, Nancy A., Reich, David (2006). ‘Principal Components Analysis Corrects for Stratification in Genome-Wide Association Studies’. Nature Genetics 38, 904–909.


Purcell, Shaun, et al (2007). ‘PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses’. Am J Hum Genet 81(3): 559–575.


Reber S. C. (2013). ‘Discovery and Visual Analysis of Tracts of Homozygosity in the Human Genome’. MA Thesis. Kent State University, 2013


Remington, D. L., et al (2001) ‘Structure of linkage disequilibrium and phenotypic associations in the maize genome’. Proceedings of the National Academy of Sciences 98.20 (2001): 11479-11484.


Rhead, B., Karolchik, D., et al (2009 Nov 11). ‘The UCSC Genome Browser database: update 2010’. Nucleic Acids Res. Epub, 38(Database issue), D613–9.


Segura V, Vilhjálmsson BJ, Platt A, Korte A, Seren Ü, et al. (2012) ‘An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations’, Nature Genetics, 44, 825–830.


Nihar Sheth, Xavier Roca, Michelle L. Hastings, Ted Roeder, Adrian R. Krainer and Ravi Sachidanandam, ‘Comprehensive splice-site analysis using comparative genomics’, Nucleic Acids Research, Vol. 34, No. 14 (Aug., 2006), 3955–3967


Sorensen D., Gianola D. ‘Likelihood, Bayesian, and MCMC Methods in Quantitative Genetics’, New York: Springer-Verlag, 2002. Print.


Storey, John D. (2002) ‘A direct approach to false discovery rates’, J. R. Statist. Soc. B 64, Part 3, pp. 479–498.


Taylor, J.F. (2013) ‘Implementation and accuracy of genomic selection’, Aquaculture, http://dx.doi.org/10.1016/j.aquaculture.2013.02.017


VanRaden, P.M. (2008) ‘Efficient Methods to Compute Genomic Predictions’, J. Dairy Sci, 91, pp. 4414–4423.


Vansteelandt S, Lange C (2006). ‘A unifying approach for haplotype analysis of quantitative traits in family-based association studies: Testing and estimating gene-environment interactions with complex exposure variables’. COBRA Preprint Series Year 2006 Paper 11.


Vilhjalmsson B (2012) ‘mixmogam’ https://github.com/bvilhjal/mixmogam. Commit a40f3e2c95.


Wainschtein, P., et al. (2019) ‘Recovery of trait heritability from whole genome sequence data’. BioRxiv preprint https://www.biorxiv.org/content/10.1101/588020v1.full


Weir BS (1996) ‘Genetic Data Analysis II.’ Sinauer Associates.


Weir, B.S., Cockerham, C. Clark (1984). ‘Estimating F-Statistics for the Analysis of Population Structure’. Evolution 38(6), 1984, pp. 1358-1370.


Willer C, Li Y, Abecasis G (2010). ‘METAL: fast and efficient meta-analysis of genomewide association scans’. Bioinformatics Applications Note, Vol. 26 no. 17 2010, pp. 2190-2191.


Wright, Sewall (1922) ‘Coefficients of Inbreeding and Relationship’. The American Naturalist, Vol. 56, No. 645 (Jul - Aug, 1922), pp. 330-338.


Xie X, Ott J (1993) ‘Testing linkage disequilibrium between a disease gene and marker loci.’ Am J Hum Genet 53, 1107 (abstract).


Yang J, Lee SH, Goddard ME and Visscher PM (2011) ‘GCTA: a tool for Genome-wide Complex Trait Analysis’, Am J Hum Genet., Jan 88(1): 76-82. PubMed ID: 21167468


Zaykin DV, Westfall PH, Young SS, Karnoub MA, Wagner MJ, Ehm MG. (2002) ‘Testing association of statistically inferred haplotypes with discrete and continuous traits in samples of unrelated individuals.’ Human Heredity, 53:79–91.


Zaykin DV, Ehm, MG, Weir BS (2001) ‘Evaluating new haplotyping methods for predicting clinical response using dense maps of single nucleotide polymorphisms (SNPs).’ Work in progress. Presented at Bioinformatics Seminar Series, Research Triangle Institute, NC.


Zaykin DV, Nielsen DM (2000) ‘Hardy-Weinberg disequilibrium (HWD) fine mapping for case-control samples.’ Am J Hum Genet 67: 1238(S).


Zhang L, Orloff MS, Reber S, Li S, Zhao Y, et al. (2013) ‘cgaTOH: Extended Approach for Identifying Tracts of Homozygosity.’ PLoS ONE 8(3): e57772. doi:10.1371/journal.pone.0057772


Zhao JH, Curtis D, Sham PC (2000) ‘Model-free analysis and permutation tests for allelic associations.’ Human Heredity 50: 133–139.