Doctoral Thesis: Linear Mixed Models in Statistical Genetics

Defended on Thursday, 6 July 2017

Abstract

One of the goals of statistical genetics is to elucidate the genetic architecture of phenotypes (i.e., observable individual characteristics) that are affected by many genetic variants (e.g., single-nucleotide polymorphisms; SNPs). A particular aim is to identify specific SNPs that are robustly associated with a given phenotype using a so-called genome-wide association study (GWAS).

Although GWAS sample sizes have increased in recent years, the number of SNPs still tends to vastly exceed the number of individuals in a sample. Hence, multiple regression cannot be used to estimate the effects of all SNPs on a phenotype jointly, because the least-squares problem has no unique solution. Instead, the linear mixed model (LMM) has become a popular tool in statistical genetics. By placing a reasonable prior on SNP effects, LMMs can be used to jointly estimate SNP effects and to infer their contribution to phenotypic variance.
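
The contrast between multiple regression and a model with a prior on SNP effects can be illustrated with a small simulation. The sketch below is a hypothetical example and not code from the thesis: the sample size, number of SNPs, allele frequency, heritability value (h2), and function names are all assumptions chosen for illustration. It assumes independent normal priors with equal variance on standardized SNP effects, in which case the best linear unbiased predictor of the effects coincides with a ridge estimator whose penalty equals the ratio of residual variance to per-SNP effect variance.

```python
import numpy as np

def ridge_blup(X, y, lam):
    """Ridge/BLUP estimate in its n-by-n form: X' (X X' + lam I_n)^{-1} y."""
    n = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)

def ridge_primal(X, y, lam):
    """Same estimate in its p-by-p form: (X' X + lam I_p)^{-1} X' y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Assumed toy dimensions: far more SNPs (p) than individuals (n).
rng = np.random.default_rng(0)
n, p, h2 = 50, 200, 0.5

# Simulated allele counts (0, 1, or 2), standardized per SNP.
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
sd = X.std(axis=0)
X = (X - X.mean(axis=0)) / np.where(sd > 0, sd, 1.0)

# Phenotype: many small SNP effects plus noise, total variance about 1.
beta = rng.normal(0.0, np.sqrt(h2 / p), size=p)
y = X @ beta + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)

# With p > n the matrix X'X is singular, so least squares is not unique.
rank = np.linalg.matrix_rank(X)

# Penalty implied by the assumed prior: residual variance / effect variance.
lam = (1.0 - h2) / (h2 / p)
beta_hat = ridge_blup(X, y, lam)
```

The n-by-n form is the practical one when SNPs greatly outnumber individuals, since it only requires solving a system of the size of the sample.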

In this dissertation, I investigate several aspects of LMMs and related methods, such as ridge regression and LD-score regression. In addition, an LMM is used to develop an online tool, called MetaGAP, which quantifies the statistical power of a GWAS in the presence of heterogeneity across the underlying subsamples. Using MetaGAP, I show that ongoing GWAS efforts are well-powered even for considerably heterogeneous phenotypes. This prediction is bolstered by a GWAS of reproductive choices, reported here, that finds twelve robustly associated SNPs.

I conclude that current GWAS sample sizes enable researchers to uncover parts of the genetic architecture of complex social-scientific outcomes and posit that GWAS efforts will soon attain sufficient predictive accuracy for useful applications throughout the social sciences.

Keywords

genome-wide association study, meta-analysis, cross-study heterogeneity, reproductive behavior, population stratification, linear mixed model, ridge regression

Time frame

2013 - 2016

Preferred reference

R. de Vlaming, Linear Mixed Models in Statistical Genetics, Promotors: Prof. dr. Patrick Groenen, Prof. dr. Philipp Koellinger, Prof. dr. Roy Thurik, http://hdl.handle.net/1765/100428

Author

Ronald de Vlaming

Supervisory Team

Patrick Groenen
Professor of Statistics
  • Promotor
Roy Thurik
Professor of Economics and Entrepreneurship
  • Promotor

Committee Members

Jonathan Beauchamp
Matthew Keller
Danielle Posthuma