Data Science

Research in e-Science is becoming increasingly data-driven, drawing on methods from the neighboring fields of Data Science, Machine Learning, and AI. The aim of the Data Science MCP is to bring together researchers interested in data-driven methods in order to spread and develop knowledge across the different application areas of SeRC.

Data-driven methodologies, especially Deep Learning, have developed at a rapid pace over the last decade. A prominent development, important for e-Science applications, is the advance of deep generative models. These allow more systematic analysis and modeling of systems from sparser observations than traditional Deep Learning approaches, and they can also be used to create new data, new designs, and scientific insights. Impactful examples are DALL-E 2 and Stable Diffusion, methods for generating images from text. In the coming years, the Data Science MCP will explore, exploit, and develop methods along these lines in a systematic way.

The scientific work in the MCP is organized into seven subprojects, listed below, addressing different scientific and application areas within the SeRC field of interest. Research is conducted independently within each subproject, but researchers in the different subprojects interact and exchange knowledge in the context of three larger, overlapping themes covering different aspects of deep generative models. The aim is to have a more general impact: benefiting all of SeRC by spreading knowledge about recent developments in this class of methods and their applications through a course, and benefiting society and industry at large through ambitious blog posts.


People involved can be found here.


Subproject 1: Deep Generative Flow
Leaders: Hossein Azizpour and Ricardo Vinuesa

Subproject 2: Deep Stochastic Computational Methods
Leader: Henrik Hult

Subproject 3: Deep Synthetic Diagnostic Data
Leaders: Claes Lundström and Gabriel Eilertsen

Subproject 4: Deep Humanlike Representation Learning
Leaders: Hedvig Kjellström and Johan Lundström

Subproject 5: Data-Driven Deep Protein
Leaders: Björn Wallner, Arne Elofsson, and Hossein Azizpour

Subproject 6: Data-Driven Methods in Materials Science
Leader: Rickard Armiento

Subproject 7: Scientific Machine Learning for Coastal Ocean Data
Leaders: Elias Jarlebring and Inga Koszalka