Data-Centric Engineering

Mathematical Sciences

Below you will find Data-Centric Engineering projects offered by supervisors within the School of Mathematical Sciences

This is not an exhaustive list. If you have your own research idea, or if you are a prospective PDS candidate, please return to the main DCE Research page for further guidance, or contact us at  

Data-driven analysis and stochastic modelling of power-grid frequency dynamics

A fundamental understanding of the current power grid system with its production and demand fluctuations is necessary to develop potential pathways towards a future 100% sustainable system.

The project will investigate specific questions within this complex area of energy research, using stochastic analysis and data-driven approaches to work towards a quantitative understanding of the fluctuating aspects of the energy system. This includes a detailed analysis of fluctuations in power production itself, analysis of power-production time series of renewable generators (wind and solar), demand trends and fluctuations, interaction with the energy market, and more. To achieve a better understanding, we will perform a statistical analysis of measured frequency fluctuations around the nominal frequency of 50 Hz, which mirror demand and production fluctuations very efficiently. Building on previous work (B. Schäfer, C. Beck, et al., 2018; Gorjão et al., 2020), in this project we will use new mathematical models based on generalized stochastic differential equations to realistically model and predict frequency fluctuations, and compare them with data measured in the UK and in various other European power grids. A recent development is the use of machine learning techniques as forecasting tools in this context.
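
As a minimal illustration of the kind of stochastic modelling involved, the sketch below simulates frequency deviations around the nominal 50 Hz as an Ornstein-Uhlenbeck process, the simplest special case of the generalized stochastic differential equations mentioned above. All parameter values here are illustrative, not fitted to any real grid data.

```python
import numpy as np

def simulate_frequency(T=10_000, dt=1.0, theta=0.01, sigma=0.002, seed=0):
    """Euler-Maruyama simulation of d(omega) = -theta*omega dt + sigma dW,
    where omega is the deviation from the nominal 50 Hz (toy parameters)."""
    rng = np.random.default_rng(seed)
    omega = np.zeros(T)  # deviation from 50 Hz, in Hz
    for t in range(1, T):
        omega[t] = (omega[t - 1]
                    - theta * omega[t - 1] * dt          # restoring drift
                    + sigma * np.sqrt(dt) * rng.standard_normal())  # noise
    return 50.0 + omega  # absolute frequency trace

freq = simulate_frequency()
print(freq.mean())  # hovers near the nominal 50 Hz
```

In practice the drift and diffusion terms would be estimated from measured frequency data, and heavier-tailed or state-dependent noise could replace the Gaussian term.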

Supervisors: Prof Christian Beck & Dr Benjamin Schaefer

Emulators for the analysis of computer experiments

A common feature of industrial experimentation is the use of simulation and computer experiments to at least partially replace complicated and expensive physical experiments. The analysis of computer experiments often builds a surrogate model termed an emulator. The emulator has a variety of uses, including sensitivity analysis, forecasting, history matching and design of experiments.

This project is concerned with practical aspects of the construction and use of emulators, with special focus on a type of emulator called supersaturated models. During the project we aim to survey the methodology and apply it to some case studies, developing emulators and using them for subsequent analysis. Throughout the project we also expect to compare with other techniques, although this comparison is meant to be indicative of practice rather than exhaustive.
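
To give a flavour of emulation, the sketch below fits a simple Gaussian-process-style interpolating emulator (with a squared-exponential kernel) to a handful of runs of a toy "simulator". The simulator function, kernel and length-scale are illustrative assumptions; they stand in for an expensive computer code and are not part of the project itself.

```python
import numpy as np

def rbf_kernel(X1, X2, length=0.3):
    """Squared-exponential kernel between two sets of 1-D inputs."""
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * d2 / length ** 2)

def simulator(x):
    """Toy stand-in for an expensive computer experiment."""
    return np.sin(2 * np.pi * x)

# A handful of expensive runs serve as training data for the emulator.
X_train = np.linspace(0.0, 1.0, 8)
y_train = simulator(X_train)

# Interpolating posterior mean (small jitter for numerical stability).
K = rbf_kernel(X_train, X_train) + 1e-8 * np.eye(len(X_train))
alpha = np.linalg.solve(K, y_train)

# The cheap emulator can now be queried at many new inputs.
X_new = np.linspace(0.0, 1.0, 50)
y_emul = rbf_kernel(X_new, X_train) @ alpha

print(np.max(np.abs(y_emul - simulator(X_new))))  # small emulation error
```

Once built, such an emulator can be evaluated thousands of times for sensitivity analysis or design of experiments at negligible cost compared with rerunning the simulator.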

Supervisor: Dr Hugo Maruri-Aguilar

Discovering Bayesian network structure through time

Network analysis has recently been used to study economic crises and has been shown to contribute to a better understanding of complex systems of interconnected institutions. Moreover, modelling high-dimensional time series with network dependence is a frequent yet challenging problem in real-world applications, one that has recently been addressed by incorporating an overlapping community structure in the network. In this project, we will study the Bayesian approach to a novel strand of literature based on Network Autoregressive models.

This class of models accounts for community and cross-sectional dependence structures in time series, and the idea behind this project is to apply machine learning techniques to study the dynamic and panel structure of networks. This can be done through a multilayer representation of the networks characterizing the dependence processes.
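
The basic mechanism can be sketched with a first-order network autoregression, in which each node's value depends on its own past and on the average of its neighbours' past values. The small ring network, coefficients and noise level below are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(1)

n, T = 5, 200
A = np.zeros((n, n))
for i in range(n):                        # ring network adjacency
    A[i, (i + 1) % n] = A[i, (i - 1) % n] = 1.0
W = A / A.sum(axis=1, keepdims=True)      # row-normalised neighbour weights

alpha, beta = 0.3, 0.4                    # network and autoregressive effects
X = np.zeros((T, n))                      # one time series per node
for t in range(1, T):
    X[t] = (alpha * W @ X[t - 1]          # neighbours' past values
            + beta * X[t - 1]             # node's own past value
            + 0.1 * rng.standard_normal(n))

print(X.shape)  # T observations for each of the n nodes
```

In the project, such coefficients (and the community structure they may vary over) would be inferred with Bayesian methods rather than fixed in advance; alpha + beta < 1 here keeps the toy process stationary.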

To gain flexibility, we will employ Bayesian methods, their nonparametric extensions, and their connections to machine learning techniques.

Keywords: Bayesian inference; time series; network modeling; autoregressive; econometrics

Supervisors: Dr Alex Shestopaloff & Dr Luca Rossini 

Quantifying uncertainty in unsupervised machine learning tasks

Grouping similar data objects into clusters is one of the fundamental tasks in statistics and machine learning. Identifying these clusters and studying their relationships help us understand the structure of a dataset and serve as the basis for subsequent analysis.

As an unsupervised learning task, clustering has been applied across a wide range of scientific domains, for example detecting galactic clusters in astronomy, identifying communities in social networks, segmenting images in computer vision, clustering genes, and many more. A variety of clustering algorithms have been developed for different applications. Most of these algorithms return a single cluster estimate, whose configuration is optimal under a chosen loss function. However, as with other statistical procedures, randomness in the data propagates uncertainty throughout the learning process, and this gives rise to a range of possible data partitions. Consequently, in many practical situations, having access to a range of potential clusterings is more helpful than a single estimate, whether for decision making or for assessing worst-case scenarios. In this project, we will look at Bayesian uncertainty quantification (UQ) for clustering problems.

We adopt the Bayesian approach because it allows us to conduct UQ and also obtain point estimates naturally from posterior distributions. UQ for cluster analysis is relatively unexplored in the literature, and we believe that this project will contribute significantly to the knowledge and development of UQ algorithms in this area.
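
One common way to summarise uncertainty over partitions is a pairwise co-clustering matrix: the proportion of sampled partitions in which two points land in the same cluster. The sketch below builds such a matrix from repeated randomly initialised k-means fits as a cheap stand-in; in the Bayesian setting the same summary would be computed from posterior samples of the partition. The toy data and the tiny 1-D k-means are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two well-separated groups plus one ambiguous point in between.
X = np.concatenate([rng.normal(-2, 0.3, 10), rng.normal(2, 0.3, 10), [0.0]])

def kmeans_labels(X, k=2, iters=20):
    """Tiny 1-D k-means with random initialisation from the data points."""
    centres = rng.choice(X, size=k, replace=False)
    for _ in range(iters):
        labels = np.argmin(np.abs(X[:, None] - centres[None, :]), axis=1)
        centres = np.array([X[labels == j].mean() if np.any(labels == j)
                            else centres[j] for j in range(k)])
    return labels

# P[i, j] estimates how often points i and j are clustered together.
n, runs = len(X), 200
P = np.zeros((n, n))
for _ in range(runs):
    lab = kmeans_labels(X)
    P += (lab[:, None] == lab[None, :]).astype(float)
P /= runs

print(P[0, 1], P[0, -1])  # a confidently paired pair vs. the middle point
```

Points deep inside a group co-cluster in nearly every fit, while the ambiguous middle point's entries fall strictly between 0 and 1, which is exactly the kind of uncertainty a single point estimate of the partition would hide.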

Supervisor: Dr William Yoo