A comparison of methods for inferring causal relationships between genotype and phenotype using multi‐omics data


Many novel associations between common genetic variants and complex human disease have been successfully identified using genome‐wide association studies (GWAS). However, a typical GWAS gives little insight into the biological mechanisms through which these associated genetic variants are implicated in disease. Indeed, rather than directly influencing disease risk, the variants implicated by GWAS are typically in linkage disequilibrium with the true causal variants. Understanding the causal role of genetic variants in disease etiology, and moving towards therapeutic interventions, is therefore not simple. Integrating additional data, such as transcriptomic, proteomic and metabolomic data measured in relevant tissue in the same individuals for whom we have genomic (i.e. GWAS) data, could potentially provide further insight into disease pathways. Yet open questions remain about how to assess the causal direction of association between these variables.

We review currently available statistical methods that use a genetic variant as a directional anchor for inferring causality between variables. We consider Mendelian Randomisation, Structural Equation Modelling, a Causal Inference Test and several Bayesian methods. We present a simulation study assessing the performance of the methods under different conditions, assuming throughout that we have a single genetic variant and two phenotypic variables that are associated with one another, although the underlying causal relationship may vary. In particular, we consider how the causal inference is affected by the presence of common environmental factors influencing the observed traits.
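To make the setting concrete, the following is a minimal sketch of one such scenario. It is not the simulation design used in the study. It assumes a single biallelic variant G, a causal path G -> X -> Y, and an unobserved common environmental factor U acting on both traits. All effect sizes, the allele frequency, the sample size and the function names are illustrative assumptions. The sketch compares a naive regression of Y on X with the Mendelian Randomisation Wald ratio estimate, which uses G as the directional anchor.

```python
import numpy as np


def simulate(n=50_000, beta_xy=0.5, confounding=1.0, seed=0):
    """Simulate G -> X -> Y with a shared environmental factor U.

    All parameter values are illustrative assumptions, not taken from the study.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # genotype coded 0/1/2
    u = rng.normal(size=n)                          # unobserved common factor
    x = 0.5 * g + confounding * u + rng.normal(size=n)
    y = beta_xy * x + confounding * u + rng.normal(size=n)
    return g, x, y


def slope(a, b):
    """Ordinary least squares slope from regressing b on a."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)


def wald_ratio(g, x, y):
    """Mendelian Randomisation Wald ratio: (G-Y slope) / (G-X slope)."""
    return slope(g, y) / slope(g, x)


g, x, y = simulate()
naive = slope(x, y)        # distorted by the shared factor U
mr = wald_ratio(g, x, y)   # anchored on the genetic variant

print(f"true effect of X on Y: 0.50")
print(f"naive regression:      {naive:.2f}")
print(f"MR Wald ratio:         {mr:.2f}")
```

Under these assumptions the naive estimate is inflated by the shared factor, whereas the Wald ratio should lie close to the simulated effect, because G is independent of U by construction. Setting `beta_xy=0` or reversing the direction of the X and Y equations gives other scenarios of the kind described above.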

Genetic Epidemiology 2015; 39(7):529-599