Variational approximations in the medical sciences

In this new sub-project we will develop machine learning models and tools for Gaussian variational approximations (GVAs) and apply those models to health applications.

We want to fit regression models efficiently for family-based data from the Swedish health and population registers. The Swedish Multi-generation Register and the Swedish Twin Register include small clusters, for which we would like to model binary, Poisson and time-to-event outcomes. These models will be well suited to causal inference. Existing methods either provide poor estimates for small clusters or scale very poorly. GVAs allow fast estimation of complex random effects models and scale to larger datasets. For survival outcomes, we will extend our class of generalized survival models and accelerated failure time models using GVAs and O’Sullivan splines [M5.2, M5.4, D5.1-D5.3]. We will also adapt an existing algorithm developed for generalized linear mixed models [M5.1, D5.4].

We also aim to develop model-based approaches to study breast tissue organization and its association with breast cancer risk. We have applied complex feature extraction methods to mammographic images pre-segmented by tissue type [10,11], but model-based approaches, e.g. hidden Potts models, would be better suited. Exact approaches are, however, not computationally feasible. Variational approximations have previously been proposed for image segmentation and for spatial models, but the two have not been integrated.

We will investigate the statistical properties of the algorithms and implement them efficiently in C++ [M5.3, D5.5]. We will explore scaling the algorithms to multi-core/multi-node facilities. A machine-learning post-doc will be jointly supervised by Kjellström, Humphreys and Clements, who have expertise in machine learning, variational approximations, and statistical computing, respectively.
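To illustrate the kind of computation a GVA involves for the clustered Poisson outcomes mentioned above, the following minimal sketch evaluates the Gaussian variational lower bound for a single cluster with a random intercept. It is an illustration under stated assumptions, not the project's algorithm: the model (one random intercept u ~ N(0, tau2) per cluster, log link), the function names and the argument names are all hypothetical, and Python is used only for brevity. With q(u) = N(mu, s2), the expectation of exp(u) is exp(mu + s2/2), so the bound is available in closed form and no numerical integration is needed. The brute-force quadrature function is included only as a reference value for comparison.

```python
import math


def gva_lower_bound(y, eta, mu, s2, tau2):
    """Gaussian variational lower bound for one cluster (hypothetical sketch).

    Assumed model: y_j | u ~ Poisson(exp(eta_j + u)), u ~ N(0, tau2).
    y     : counts observed in the cluster
    eta   : fixed-effect linear predictors, one per observation
    mu, s2: mean and variance of the approximation q(u) = N(mu, s2)
    tau2  : random-intercept variance
    """
    # E_q[log p(y | u)], using E_q[exp(u)] = exp(mu + s2 / 2)
    loglik = sum(
        yj * (ej + mu) - math.exp(ej + mu + 0.5 * s2) - math.lgamma(yj + 1)
        for yj, ej in zip(y, eta)
    )
    # E_q[log p(u)] under the N(0, tau2) random-effect distribution
    logprior = -0.5 * math.log(2 * math.pi * tau2) - (mu * mu + s2) / (2 * tau2)
    # Entropy of the Gaussian q
    entropy = 0.5 * math.log(2 * math.pi * math.e * s2)
    return loglik + logprior + entropy


def log_marginal(y, eta, tau2, lo=-10.0, hi=10.0, n=20001):
    """Reference log marginal likelihood by trapezoidal integration over u.

    Only for checking the bound in this small example; the integration
    limits assume tau2 is of order one.
    """
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        u = lo + k * h
        logf = sum(
            yj * (ej + u) - math.exp(ej + u) - math.lgamma(yj + 1)
            for yj, ej in zip(y, eta)
        )
        logf += -0.5 * math.log(2 * math.pi * tau2) - u * u / (2 * tau2)
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * math.exp(logf)
    return math.log(total * h)


# Made-up data for a cluster of three observations
y = [2, 0, 1]
eta = [0.1, -0.3, 0.2]
print(gva_lower_bound(y, eta, mu=0.0, s2=0.25, tau2=1.0))
print(log_marginal(y, eta, tau2=1.0))
```

For any choice of mu and s2 the first value lies below the second. Fitting would maximize the sum of such bounds over the variational parameters of every cluster together with the model parameters, and because each cluster contributes a separate closed-form term, this kind of computation is straightforward to spread across cores.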