Medical image analysis and deep learning, with applications to prostate biopsies and mammograms

The ability to digitise large quantities of medical images, together with recent progress in deep learning and in stochastic modelling of highly structured systems, offers an opportunity to change and improve diagnostic procedures for screening.

Prostate biopsies

Gleason score (GS) is a well-established prognostic factor for prostate cancer, assigned by visually inspecting the stained biopsy slide to determine the cell morphology. Although GS is routinely used for important clinical treatment decisions, its inter-assessor variability is very high. Deep neural networks have great potential in digital pathology, with recent studies demonstrating classification performance equal to or better than that of human experts. We have unique and comprehensive biopsy data from the STHLM3 prostate cancer diagnostic trial, comprising >80,000 core biopsies from the 7,400 men who underwent biopsy, together with the disease status of each biopsy core as assessed by a single urological pathologist. We have implemented an automated pipeline that processes high-resolution histopathology images and segments them into tiles, which are used to train deep convolutional neural networks (CNNs) [M3.3-M3.5, D3.1, D3.2]. Models are developed using TensorFlow and trained on GPUs. A pilot study based on 4,535 biopsy images gave promising results for cancer detection (AUC = 0.97). We are exploring cloud-based GPU computing for model training. To improve the understanding of deep CNNs, we are developing methodologies that provide a molecular-based interpretation of the CNNs through studies combining imaging and molecular profiles [M3.7, M3.8, D3.6-D3.8]. There is an urgent need for GPU nodes to enable efficient training and optimisation of the deep learning models (Laure, Dowling). We have also established collaborations with colleagues at Uppsala/eSSENCE (Wählby).
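The tiling step and the AUC metric mentioned above can be illustrated with a minimal sketch. It is not the project's pipeline: the function names, the grey-level background threshold of 220 and the 50% minimum tissue fraction are all hypothetical choices made for illustration, and a small list of lists stands in for a whole-slide image.

```python
def tile_origins(width, height, tile, stride):
    """Top-left (x, y) corners of all tiles that fit fully inside the image."""
    xs = range(0, width - tile + 1, stride)
    ys = range(0, height - tile + 1, stride)
    return [(x, y) for y in ys for x in xs]


def tissue_fraction(image, x, y, tile, background=220):
    """Fraction of tile pixels darker than the assumed background grey level."""
    dark = sum(
        1
        for row in image[y:y + tile]
        for value in row[x:x + tile]
        if value < background
    )
    return dark / (tile * tile)


def select_tiles(image, tile, stride, min_tissue=0.5):
    """Keep only tiles whose tissue fraction reaches the (assumed) minimum."""
    height, width = len(image), len(image[0])
    return [
        (x, y)
        for x, y in tile_origins(width, height, tile, stride)
        if tissue_fraction(image, x, y, tile) >= min_tissue
    ]


def auc(scores, labels):
    """Area under the ROC curve as the probability that a positive case
    scores higher than a negative one (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy greyscale image: dark "tissue" on the left, white background on the right.
image = [[50, 50, 255, 255] for _ in range(4)]
print(select_tiles(image, tile=2, stride=2))         # [(0, 0), (0, 2)]
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))       # 0.75
```

In practice the tiles would be read from the slide file at a chosen magnification and passed to the CNN, and the AUC would be computed on held-out biopsies. The pairwise AUC above is quadratic in the number of cases and is meant only to make the definition concrete.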

Mammograms

To design optimal individualised breast cancer screening strategies, it is essential to quantify the relationships between novel biomarkers and screening sensitivity, tumour aggressiveness and self-detection. We have described a number of novel image-based markers of risk and screening efficiency, some statistical and some adapted from techniques in computer science (visualisation of tissue organisation). We have recently begun using deep learning to discriminate subtypes of microcalcification clusters and linear substructures for short-term risk prediction models that can be implemented clinically [M3.2, D3.3]. We will now develop statistical inference methods for Markov random fields, using computational approaches developed in statistical physics and computer science, in order to extend studies of tissue organisation to other image modalities (MRI, tomography) [M3.1, M3.6, D3.5]. The project will combine expertise in image analysis, statistics, epidemiology and medical physics, and will use cloud-based computing for distributed Bayesian updating.
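As a concrete, deliberately simple example of inference for a Markov random field, the sketch below estimates the coupling strength of an Ising model on a binary pixel grid by maximum pseudo-likelihood. This technique is chosen here only for illustration and is not necessarily the method the project will develop. The Ising form, the free boundary conditions, the grid search over candidate couplings and all function names are assumptions.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def neighbour_sum(grid, i, j):
    """Sum of the 4-connected neighbours of site (i, j), free boundaries."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols:
            total += grid[ni][nj]
    return total


def log_pseudo_likelihood(grid, coupling):
    """Sum over sites of log P(spin | neighbours) for spins in {-1, +1}.

    For an Ising model with coupling J and neighbour sum m,
    P(s | m) = 1 / (1 + exp(-2 * J * s * m)).
    """
    total = 0.0
    for i, row in enumerate(grid):
        for j, spin in enumerate(row):
            total += log_sigmoid(2.0 * coupling * spin * neighbour_sum(grid, i, j))
    return total


def estimate_coupling(grid, candidates):
    """Grid search for the candidate coupling with the highest pseudo-likelihood."""
    return max(candidates, key=lambda c: log_pseudo_likelihood(grid, c))


# A checkerboard pattern: every pixel disagrees with all of its neighbours,
# so the pseudo-likelihood favours the most negative coupling on offer.
checkerboard = [[1 if (i + j) % 2 == 0 else -1 for j in range(4)] for i in range(4)]
print(estimate_coupling(checkerboard, [-1.0, -0.5, 0.0, 0.5, 1.0]))
```

Pseudo-likelihood replaces the intractable joint likelihood with a product of per-site conditionals, which is why it needs no partition function. A positive estimated coupling indicates spatially smooth structure and a negative one indicates alternating structure.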

Investigators