Modules

Programme Structure

Below is a full list of all modules which are expected to be available to students on this programme across the semesters. Please note that this is for information only and may be subject to change.

In the first semester, you will complete the compulsory modules that provide a foundation in data analytics, machine learning, and the statistics of data analysis. You will build upon this knowledge working with industry-standard tools and software.

In the second semester, you will decide on your specialism. This part of the programme has been designed to prepare you for specific career paths, so what you study will depend on what you wish to do in the future.

Our specialisms, also known as streams, can be found below. We've highlighted the skills you will develop and the potential career paths for each stream.

Applied Machine Learning

You will learn to use mathematical, computational and statistical techniques to design, implement, and evaluate machine learning models to solve real-world problems. Potential career opportunities include developing and deploying predictive tools in the Finance, Healthcare, Retail, and Manufacturing sectors, among others.

Pattern Recognition and Deep Learning

You will learn to use mathematical, computational and statistical techniques to extract key insights from different types of data. Potential career opportunities can be found in the Tech, Finance, Healthcare, Retail, Automotive, and Manufacturing sectors.

Statistical Inference

You will learn to use mathematical, computational and statistical techniques to analyse data, draw meaningful conclusions, and contribute to evidence-based decision-making across application domains. Potential career opportunities can be found in the Finance, Healthcare, Retail, and Manufacturing sectors, among others.

In the third semester, over the summer, you will work alongside one of our researchers on an independent research project. This will consolidate your learning and allow you to develop strong applied data science skills.

Semester A Compulsory Modules

Machine Learning with Python

This module will introduce you to some of the most widely used techniques in machine learning (ML). After reviewing the necessary background mathematics, we will investigate various ML methods, such as linear regression, polynomial regression and classification with logistic regression. The module covers a wide range of practical applications, with an emphasis on hands-on numerical work using Python. At the end of the module, you will be able to formalise an ML task, choose the appropriate method to process it numerically, implement the ML algorithm in Python, and assess the method’s performance.

Probability and Statistics for Data Analysis

This module will teach the probabilistic and statistical foundations which underpin the MSc Data Analytics. This module begins by covering some of the essential theoretical notions of probability and the distributions of random variables which underpin statistical methods. It then describes different types of statistical tests of hypotheses and addresses the questions of how to use them and when to use them. This material is essential for data analytics in applications of statistics in psychology, the life or physical sciences, business and economics.

Storing, Manipulating and Visualising Data

The ability to store, manipulate and display data in appropriate ways is of great importance to data scientists. This module will introduce you to many of the most widely used techniques in the field. The emphasis of this module is on using a variety of tools interactively, rather than on programming as such. It will cover best practices for data visualisation as well as various methods for preparing data for further analysis.

Programming in Python

This module introduces you to the Python programming language. After learning about data types, variables and expressions, you will explore the most important features of the core language including conditional branching, loops, functions, classes and objects. We will also look at several of the key packages (libraries) that are widely used for numerical programming and data analysis.

Semester B

Applied Machine Learning stream

Two compulsory modules:

Computational Statistics with R

This module introduces modern methods of statistical inference for small samples, which use computational methods of analysis, rather than asymptotic theory. The techniques covered in the module include non-parametric tests, bootstrap, and cross-validation. Most of these methods are now used regularly in modern business, finance, and science. Finally, the module includes the implementation of all the proposed methods with the statistics software R.

Forecasting with AI

This module introduces students to methods for forecasting using both classical techniques and the latest AI-driven approaches. Students learn methods such as ARIMA, basic machine learning models, and neural networks (LSTM). In doing so, the module is centred on fundamental techniques used across industries. The module provides students with practical skills in applying forecasting methods to real-world datasets. Emphasis is placed on understanding core forecasting concepts and practical implementation using Python.

Choose two elective modules from:

Bayesian Statistics

The module aims to introduce you to the Bayesian paradigm. The module will show you some of the problems with frequentist statistical methods, show you that the Bayesian paradigm provides a unified approach to problems of statistical inference and prediction, enable you to make Bayesian inferences in a variety of problems, and illustrate the use of Bayesian methods in real-life examples.

Topics include:

The Bayesian paradigm: likelihood principle, sufficiency and the exponential family, conjugate priors, examples of prior to posterior analysis, mixtures of conjugate priors, non-informative priors, two sample problems, predictive distributions, constraints on parameters, point and interval estimation, hypothesis tests, nuisance parameters.

Linear models: use of non-informative priors, normal priors, two and three stage hierarchical models, examples of one way model, exchangeability between regressions, growth curves, outliers and influential observations.
Approximate methods: normal approximations to posterior distributions, Laplace’s method for calculating ratios of integrals, Gibbs sampling, finding full conditionals, constrained parameter and missing data problems, graphical models. Advantages and disadvantages of Bayesian methods.
Examples: appropriate examples will be discussed throughout the course. Possibilities include epidemiological data, randomised clinical trials, radiocarbon dating.
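
As a flavour of the "prior to posterior analysis" with conjugate priors listed above, here is a minimal sketch in Python. It assumes a Beta prior on a success probability and binomial data; the function names and the numbers are illustrative assumptions, not taken from the module materials.

```python
from fractions import Fraction


def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data.

    With a Beta(alpha, beta) prior and a binomial likelihood, the
    posterior is Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, as an exact fraction."""
    return Fraction(alpha, alpha + beta)


# Hypothetical data: a uniform Beta(1, 1) prior, then 7 successes and 3 failures.
prior = (1, 1)
posterior = beta_binomial_update(*prior, successes=7, failures=3)

print(posterior)              # (8, 4)
print(beta_mean(*posterior))  # 2/3
```

The posterior mean of 2/3 sits between the prior mean (1/2) and the observed proportion (7/10), which is the usual behaviour of a conjugate update.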

SAS for Business Intelligence

This module is key for students wishing to further their understanding of the visualisation techniques used in business decision processes using the powerful SAS Visual Analytics software.

You will apply the power of SAS analytics to massive amounts of data, gain valuable insights into visualisation techniques to uncover relevant patterns, and be empowered to make quicker informed decisions.

Optimisation for Business Processes

Optimisation refers to the selection of the best alternative, according to some criterion, from a set of available alternatives.

This module introduces standard models from mathematical optimisation, such as network flows and linear programmes, and their use in solving real-world optimisation problems in staff and project scheduling, commodity trading, production, and sales. Tutorials focus on modelling real-world optimisation problems based on data, and on the use of software such as R, Excel, and Gurobi to solve optimisation problems and make better decisions.

Financial Data Analytics

This module will provide you with a general understanding of current applications of data analytics to finance, in particular to derivatives and investment banking. It will introduce a range of analytical tools such as volatility surface management, yield curve evolution and FX volatility/correlation management. It will also provide an overview of some standard tools in the field such as Python, R, Excel/VBA and the Power BI Excel functionality. You are not expected to have any familiarity with coding or any of the topics above, as the module will develop these from scratch. It will give you the understanding of the field needed to prepare for a career in finance in roles such as trading, structuring, management, risk management and quantitative positions in investment banks and hedge funds.

Digital and Real Asset Analytics

This module will introduce students to the elementary mathematics and analytics of investment for digital and real assets. Taking a practical approach, it will develop an understanding of the analytics of several asset classes that are currently included in investment portfolios, such as commodities, real estate, art and cryptoassets, and of how these assets' statistical properties fit in the context of the portfolio. The module focuses on the concepts and characteristics of digital and real assets. It will introduce students to the mathematics of the Theory of Storage for commodities, the mathematics of indexes and their uses in the real estate and art markets, trading algorithms, and cryptocurrency investment strategies such as staking, DeFi, and non-fungible tokens. This module is particularly useful for students considering a career in financial mathematics, finance, investment management, investment banking, consultancy or asset management.

Pattern Recognition and Deep Learning stream

Two compulsory modules:

Neural Networks and Deep Learning

This module introduces you to several state-of-the-art methodologies for machine learning with neural networks (NNs). After discussing the basic theory of constructing and calibrating NNs, we consider various types of NN suitable for different purposes, such as convolutional NNs, recurrent NNs, autoencoders and generative adversarial networks. This module includes a wide range of practical applications; you will implement each type of network using Python for your weekly coursework assignments, and will calibrate these networks to real datasets.

Advanced Machine Learning

This module builds on the earlier module "Machine Learning with Python", covering a number of advanced techniques in machine learning, such as dimensionality reduction, support vector machines, decision trees, random forests, and clustering. Although the underlying theoretical ideas are clearly explained, this module is very hands-on, and you will implement various applications using Python in the weekly coursework assignments.

Choose two elective modules from:

Graphs and Networks

This module addresses one of the most important “hot topics” in mathematics research – the study of networks – and is essential for understanding the characteristics and universal structural properties of complex networks. Complex networks are usually the outcome of stochastic dynamics, but they are not completely random. You will learn how to disentangle randomness from the structural organisational principles of complex networks, and how several major types of complex network can be described and artificially generated by mathematical models. Networks characterise the underlying structure of a large variety of complex systems, from the Internet to social networks and the brain. This module is designed to teach you the mathematical language needed to describe complex networks, their basic properties and dynamics. The broad aim is to provide you with the key skills required for fundamental research in complex networks and for applying network theory to specific network problems arising in academic or industrial environments. You will gain experience in solving problems related to complex networks and will learn the necessary language to formulate models of network-embedded systems.

Topics include:

Basic concepts used in studying complex networks (e.g. adjacency matrices, degree distributions and correlations, graph distances)
Basic tools used to study complex networks (e.g. connected components, k-cores, communities, motifs, centrality measures)
Models for complex networks: the small-world model, growing network models and the configuration model
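
To make two of the basic concepts above concrete, the following short Python sketch builds an adjacency matrix for a small undirected graph and reads off its degree distribution. The four-node edge list is a made-up example, and the variable names are illustrative assumptions rather than anything prescribed by the module.

```python
from collections import Counter

# Hypothetical undirected graph on 4 nodes, given as an edge list.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n_nodes = 4

# Adjacency matrix: entry [u][v] is 1 when nodes u and v share an edge.
adjacency = [[0] * n_nodes for _ in range(n_nodes)]
for u, v in edges:
    adjacency[u][v] = 1
    adjacency[v][u] = 1  # undirected, so the matrix is symmetric

# The degree of a node is the sum of its row in the adjacency matrix.
degrees = [sum(row) for row in adjacency]

# Degree distribution: how many nodes have each degree.
degree_distribution = Counter(degrees)

print(degrees)                    # [2, 2, 3, 1]
print(dict(degree_distribution))  # {2: 2, 3: 1, 1: 1}
```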

Forecasting with AI

This module introduces students to methods for forecasting using both classical techniques and the latest AI-driven approaches. Students learn methods such as ARIMA, basic machine learning models, and neural networks (LSTM). In doing so, the module is centred on fundamental techniques used across industries. The module provides students with practical skills in applying forecasting methods to real-world datasets. Emphasis is placed on understanding core forecasting concepts and practical implementation using Python.

Bayesian Statistics

The module aims to introduce you to the Bayesian paradigm. The module will show you some of the problems with frequentist statistical methods, show you that the Bayesian paradigm provides a unified approach to problems of statistical inference and prediction, enable you to make Bayesian inferences in a variety of problems, and illustrate the use of Bayesian methods in real-life examples.

Topics include:

The Bayesian paradigm: likelihood principle, sufficiency and the exponential family, conjugate priors, examples of prior to posterior analysis, mixtures of conjugate priors, non-informative priors, two sample problems, predictive distributions, constraints on parameters, point and interval estimation, hypothesis tests, nuisance parameters.

Linear models: use of non-informative priors, normal priors, two and three stage hierarchical models, examples of one way model, exchangeability between regressions, growth curves, outliers and influential observations.
Approximate methods: normal approximations to posterior distributions, Laplace’s method for calculating ratios of integrals, Gibbs sampling, finding full conditionals, constrained parameter and missing data problems, graphical models. Advantages and disadvantages of Bayesian methods.
Examples: appropriate examples will be discussed throughout the course. Possibilities include epidemiological data, randomised clinical trials, radiocarbon dating.

Computational Statistics with R

This module introduces modern methods of statistical inference for small samples, which use computational methods of analysis rather than asymptotic theory. Some of these methods, such as permutation tests and bootstrapping, are now used regularly in modern business, finance and science.

The techniques developed will be applied to a range of problems arising in business, economics, industry and science. Data analysis will be carried out using the user-friendly, but comprehensive, statistics package R.

Topics include:

Probability density functions: the empirical cdf; q-q plots; histogram estimation; kernel density estimation.
Nonparametric tests: permutation tests; randomisation tests; link to standard methods; rank tests.
Data splitting: the jackknife; bias estimation; cross-validation; model selection.
Bootstrapping: the parametric bootstrap; the simple bootstrap; the smoothed bootstrap; the balanced bootstrap; bias estimation; bootstrap confidence intervals; the bivariate bootstrap; bootstrapping linear models.
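
As an illustration of the simple bootstrap and bootstrap confidence intervals listed above, here is a minimal sketch. The module itself uses R; Python is used here only for illustration, and the function name `bootstrap_ci`, its default settings and the sample data are all assumptions. It computes a percentile interval, which is only one of several ways to construct a bootstrap confidence interval.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000,
                 level=0.95, seed=0):
    """Percentile bootstrap confidence interval for a statistic.

    Resamples the data with replacement, recomputes the statistic on
    each resample, and returns the empirical quantiles of the results.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    lower_idx = round(tail * n_resamples)
    upper_idx = round((1 - tail) * n_resamples) - 1
    return estimates[lower_idx], estimates[upper_idx]


# Hypothetical small sample.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 2.5]
lower, upper = bootstrap_ci(data)
print(f"95% percentile bootstrap CI for the mean: ({lower:.2f}, {upper:.2f})")
```

No distributional assumption is made about the data, which is why methods of this kind suit the small-sample settings the module describes.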

Statistical Inference stream

Two compulsory modules:

Bayesian Statistics

The module aims to introduce you to the Bayesian paradigm. The module will show you some of the problems with frequentist statistical methods, show you that the Bayesian paradigm provides a unified approach to problems of statistical inference and prediction, enable you to make Bayesian inferences in a variety of problems, and illustrate the use of Bayesian methods in real-life examples.

Topics include:

The Bayesian paradigm: likelihood principle, sufficiency and the exponential family, conjugate priors, examples of prior to posterior analysis, mixtures of conjugate priors, non-informative priors, two sample problems, predictive distributions, constraints on parameters, point and interval estimation, hypothesis tests, nuisance parameters.

Linear models: use of non-informative priors, normal priors, two and three stage hierarchical models, examples of one way model, exchangeability between regressions, growth curves, outliers and influential observations.
Approximate methods: normal approximations to posterior distributions, Laplace’s method for calculating ratios of integrals, Gibbs sampling, finding full conditionals, constrained parameter and missing data problems, graphical models. Advantages and disadvantages of Bayesian methods.
Examples: appropriate examples will be discussed throughout the course. Possibilities include epidemiological data, randomised clinical trials, radiocarbon dating.

Computational Statistics with R

This module introduces modern methods of statistical inference for small samples, which use computational methods of analysis, rather than asymptotic theory. The techniques covered in the module include non-parametric tests, bootstrap, and cross-validation. Most of these methods are now used regularly in modern business, finance, and science. Finally, the module includes the implementation of all the proposed methods with the statistics software R.

Choose two elective modules from:

Graphs and Networks

This module addresses one of the most important “hot topics” in mathematics research – the study of networks – and is essential for understanding the characteristics and universal structural properties of complex networks. Complex networks are usually the outcome of stochastic dynamics, but they are not completely random. You will learn how to disentangle randomness from the structural organisational principles of complex networks, and how several major types of complex network can be described and artificially generated by mathematical models. Networks characterise the underlying structure of a large variety of complex systems, from the Internet to social networks and the brain. This module is designed to teach you the mathematical language needed to describe complex networks, their basic properties and dynamics. The broad aim is to provide you with the key skills required for fundamental research in complex networks and for applying network theory to specific network problems arising in academic or industrial environments. You will gain experience in solving problems related to complex networks and will learn the necessary language to formulate models of network-embedded systems.

Topics include:

Basic concepts used in studying complex networks (e.g. adjacency matrices, degree distributions and correlations, graph distances)
Basic tools used to study complex networks (e.g. connected components, k-cores, communities, motifs, centrality measures)
Models for complex networks: the small-world model, growing network models and the configuration model

Neural Networks and Deep Learning

This module introduces you to several state-of-the-art methodologies for machine learning with neural networks (NNs). After discussing the basic theory of constructing and calibrating NNs, we consider various types of NN suitable for different purposes, such as convolutional NNs, recurrent NNs, autoencoders and generative adversarial networks. This module includes a wide range of practical applications; you will implement each type of network using Python for your weekly coursework assignments, and will calibrate these networks to real datasets.

Forecasting with AI

This module introduces students to methods for forecasting using both classical techniques and the latest AI-driven approaches. Students learn methods such as ARIMA, basic machine learning models, and neural networks (LSTM). In doing so, the module is centred on fundamental techniques used across industries. The module provides students with practical skills in applying forecasting methods to real-world datasets. Emphasis is placed on understanding core forecasting concepts and practical implementation using Python.

Advanced Machine Learning

This module builds on the earlier module "Machine Learning with Python", covering a number of advanced techniques in machine learning, such as dimensionality reduction, support vector machines, decision trees, random forests, and clustering. Although the underlying theoretical ideas are clearly explained, this module is very hands-on, and you will implement various applications using Python in the weekly coursework assignments.

Data Analytics Project and Dissertation

Each Data Analytics MSc student is required to complete a 60-credit project dissertation. A typical MSc project dissertation consists of about 30 word-processed pages covering a specific research-level topic in data analytics, usually requiring the student to understand, explain and elaborate on results from one or more journal articles and/or to perform computation, simulations, or analysis. An MSc project should help prepare a good student for PhD research or independent work in industry, and may even allow an excellent student to carry out some research.

The MSc dissertation projects offered by the School of Mathematical Sciences cover a large variety of scientific topics, among them time series analysis, exploratory data analysis on a dataset, performance and comparative analysis of state-of-the-art techniques, theoretical models of data, complex systems, dynamical systems, topological data analysis, experimental design with data, and statistical aspects of data analytics techniques.
