July 2, 2019
At the heart of my research are probabilistic graphical models and their relation to causal inference, as well as their potential to aid translational medicine and health sciences. We aim to model and analyse complex biomedical data and routinely collected health data. Therefore we strive to develop novel statistical methodology specifically tailored to the research question and the nature of the data at hand.
Causal discovery is an inspiring technical challenge, where we try to learn about the causal mechanisms underlying a data generating process. To tackle this problem we mainly work with causal diagrams which provide a convenient representation of causal relationships in the form of directed acyclic graphs (DAGs).
By going through networks compatible with observed data we can gain insights about the underlying causal mechanisms and derive putative intervention effects, though under very strict assumptions. To account for the uncertainty in the estimation of the DAG structure we follow a fully Bayesian approach to derive a posterior distribution of putative causal effects, implemented both for cross-sectional (Moffa et al. 2017) as well as by means of dynamic Bayesian networks for longitudinal (Kuipers et al. 2019) binary data. A current objective is to extend this methodology beyond the treatment of binary data to address problems with mixed data and non-linear relationships between the variables.
Target populations under study often display heterogeneity in their characteristics. To describe the joint probability distributions of measured variables we can consider mixtures of Bayesian networks, as in the genomic case study (Kuipers et al. 2018) modelling genomic data from the TCGA database to derive molecularly driven patient stratifications. Work is ongoing to extend the scope of applications of this method to mental health data. Following the same clustering approach will allow us to account for heterogeneity between patients, so that we can accommodate for potentially different underlying mechanisms across the population.
More generally DAGs are also the underlying representation of Bayesian networks for describing multivariate probability distributions. Structure learning of DAGs from data is challenging mainly due to the size and properties of the space (most notably the acyclicity contraint). By exploiting the combinatorial representation of DAGs we could derive novel and efficient algorithms to sample DAGs uniformly (Kuipers and Moffa 2015) as well as from the posterior distribution of DAGs given the data thanks to a partition MCMC algorithm (Kuipers and Moffa 2017). Blending partition MCMC in a hybrid procedure (Kuipers, Suter and Moffa 2018) allows us to reduce the complexity to that of a constraint based method. The R package provides an implementation to handle both Gaussian and categorical data. Relevant lines of research include extending this methodology for structure learning to mixed data, as well as considering measures of independence which allow us to handle non-linearities.
Jack Kuipers, Giusi Moffa, Elizabeth Kuipers, Daniel Freeman and Paul Bebbington “Links between psychotic and neurotic symptoms in the general population: an analysis of longitudinal British National Survey data using Directed Acyclic Graphs.”” Psychological Medicine, 2019. https://doi.org/10.1017/S0033291718000879
Jack Kuipers, Thomas Thurnherr, Giusi Moffa, Polina Suter, Jonas Behr, Ryan Goosen, Gerhard Christofori, Niko Beerenwinkel. “Mutational Interactions Define Novel Cancer Subgroups.” Nature Communications, 2018. https://doi.org/10.1038/s41467-018-06867-x.
Jack Kuipers, Polina Suter, and Giusi Moffa. “Efficient Structure Learning and Sampling of Bayesian Networks.” ArXiv:1803.07859, 2018. http://arxiv.org/abs/1803.07859.
Giusi Moffa, Gennaro Catone, Jack Kuipers, Elizabeth Kuipers, Daniel Freeman, Steven Marwaha, Belinda R Lennox, Matthew R Broome and Paul Bebbington. “Using Directed Acyclic Graphs in Epidemiological Research in Psychosis: An Analysis of the Role of Bullying in Psychosis.” Schizophrenia Bulletin, 2017. https://doi.org/10.1093/schbul/sbx013
Jack Kuipers and Giusi Moffa. “Partition MCMC for Inference on Acyclic Digraphs.” Journal of the American Statistical Association, 2017. https://doi.org/10.1080/01621459.2015.1133426
Jack Kuipers and Giusi Moffa. “Uniform random generation of large acyclic digraphs.” Statistics and Computing, 2015. https://doi.org/10.1007/s11222-013-9428-y
Assistant professor of statistics
Department of Mathematics and Computer Science, University of Basel
Spiegelgasse 1, 4051 Basel, Switzerland