Below is a full list of all modules which are expected to be available to students on this programme across the semesters. Please note that this is for information only and may be subject to change. Students will take four modules each semester. A particular feature of this programme is that you can take up to three modules on machine learning, covering everything from elementary techniques (such as linear regression) to advanced deep neural network design, with a focus throughout on practical implementation.
This module will introduce you to some of the most widely used techniques in machine learning (ML). After reviewing the necessary background mathematics, we will investigate various ML methods, such as linear regression, polynomial regression and classification with logistic regression. The module covers a very wide range of practical applications, with an emphasis on hands-on numerical work using Python. At the end of the module, you will be able to formalise an ML task, choose the appropriate method to process it numerically, implement the ML algorithm in Python, and assess the method’s performance.
This module will teach the probabilistic and statistical foundations which underpin the MSc Data Analytics. This module begins by covering some of the essential theoretical notions of probability and the distributions of random variables which underpin statistical methods. It then describes different types of statistical tests of hypotheses and addresses the questions of how to use them and when to use them. This material is essential for data analytics in applications of statistics in psychology, the life or physical sciences, business and economics.
The ability to store, manipulate and display data in appropriate ways is of great importance to data scientists. This module will introduce you to many of the most widely used techniques in the field. The emphasis of this module is on a variety of tools used interactively rather than on programming as such. It will cover best practices for data visualisation as well as various methods for preparing data for further analysis.
This module introduces you to the Python programming language. After learning about data types, variables and expressions, you will explore the most important features of the core language including conditional branching, loops, functions, classes and objects. We will also look at several of the key packages (libraries) that are widely used for numerical programming and data analysis.
This module focuses on the use of computers for solving applied mathematical problems. Its aim is to provide you with the computational tools to solve problems which you are likely to encounter during your MSc, and to develop a sound understanding of a programming language used in applied sciences. The topics covered will include the basics of scientific programming, numerical solution of ordinary differential equations, random numbers and Monte Carlo methods, simulation of stochastic processes, and algorithms for complex network analysis and modelling. The emphasis of the module will be on numerical aspects of mathematical problems, with a focus on applications rather than theory.
Topics include:
This module builds on the earlier module "Machine Learning with Python", covering a number of advanced techniques in machine learning, such as dimensionality reduction, support vector machines, decision trees, random forests, and clustering. Although the underlying theoretical ideas are clearly explained, this module is very hands-on, and you will implement various applications using Python in the weekly coursework assignments.
The module aims to introduce you to the Bayesian paradigm. The module will show you some of the problems with frequentist statistical methods, show you that the Bayesian paradigm provides a unified approach to problems of statistical inference and prediction, enable you to make Bayesian inferences in a variety of problems, and illustrate the use of Bayesian methods in real-life examples.
The Bayesian paradigm: likelihood principle, sufficiency and the exponential family, conjugate priors, examples of prior to posterior analysis, mixtures of conjugate priors, non-informative priors, two sample problems, predictive distributions, constraints on parameters, point and interval estimation, hypothesis tests, nuisance parameters.
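To give a flavour of what "prior to posterior analysis" with a conjugate prior looks like, here is a small worked example. It is an illustration only and is not taken from the module materials; the Beta-binomial model, the symbols a, b, n, x and the numbers used are all assumptions chosen for simplicity.

```latex
% Assumed model: x successes in n independent trials, success probability \theta
\theta \sim \mathrm{Beta}(a, b), \qquad x \mid \theta \sim \mathrm{Binomial}(n, \theta)

% Posterior is proportional to prior times likelihood
p(\theta \mid x) \;\propto\; \theta^{a-1}(1-\theta)^{b-1} \cdot \theta^{x}(1-\theta)^{n-x}
               \;=\; \theta^{a+x-1}(1-\theta)^{b+n-x-1}

% So the posterior is again a Beta distribution (conjugacy)
\theta \mid x \sim \mathrm{Beta}(a + x,\; b + n - x), \qquad
\mathbb{E}[\theta \mid x] = \frac{a + x}{a + b + n}

% Illustrative numbers: uniform prior a = b = 1, with x = 7 successes in n = 10 trials
\theta \mid x \sim \mathrm{Beta}(8, 4), \qquad \mathbb{E}[\theta \mid x] = \frac{8}{12} = \frac{2}{3}
```

Because the posterior stays in the same family as the prior, updating reduces to adding the observed counts to the prior parameters.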
This module introduces modern methods of statistical inference for small samples, which use computational methods of analysis rather than asymptotic theory. Some of these methods, such as permutation tests and bootstrapping, are now used regularly in modern business, finance and science.
The techniques developed will be applied to a range of problems arising in business, economics, industry and science. Data analysis will be carried out using the user-friendly, but comprehensive, statistics package R.
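As a rough illustration of the two methods named above, the following sketch shows a two-sample permutation test and a percentile bootstrap confidence interval. The module itself uses R; Python is used here purely for illustration, and the function names, default settings and sample data are hypothetical rather than taken from the module.

```python
import random
from statistics import mean


def permutation_test(x, y, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in means (illustrative)."""
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign the pooled observations to the two groups
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_x]) - mean(pooled[n_x:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return (hits + 1) / (n_perm + 1)


def bootstrap_ci(x, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean (illustrative)."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample
    means = sorted(mean(rng.choices(x, k=len(x))) for _ in range(n_boot))
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up small samples, for demonstration only
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4]
group_b = [4.1, 4.3, 3.9, 4.6, 4.4, 4.0]

p_value = permutation_test(group_a, group_b)
ci_low, ci_high = bootstrap_ci(group_a)
print(f"permutation p-value: {p_value:.4f}")
print(f"95% bootstrap CI for mean of group_a: ({ci_low:.2f}, {ci_high:.2f})")
```

Neither function relies on a normal approximation: the reference distribution is built by reshuffling or resampling the observed data, which is why such methods suit small samples.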
This module will provide students with a general understanding of current applications of data analytics to finance and in particular to derivatives and investment banking. It will introduce a range of analytical tools such as volatility surface management, yield curve evolution and FX volatility/correlation management. It will also provide you with an overview of some standard tools in the field such as Python, R, Excel/VBA and the Power BI Excel functionality. Students are not expected to have any familiarity with coding or any of the topics above, as the module will develop these from scratch. It will provide you with the understanding of the field necessary to prepare for a career in finance in roles such as trading, structuring, management, risk management and quantitative positions in investment banks and hedge funds.
This module addresses one of the most important “hot topics” in mathematics research – the study of networks – and is essential for understanding the characteristics and universal structural properties of complex networks. Complex networks are usually the outcome of stochastic dynamics, but they are not completely random. You will learn how to disentangle randomness from the structural organisational principles of complex networks and how several major types of complex network can be described and artificially generated by mathematical models. Networks characterise the underlying structure of a large variety of complex systems, from the Internet to social networks and the brain. This course is designed to teach students the mathematical language needed to describe complex networks, their basic properties and dynamics. The broad aim is to provide students with the key skills required for fundamental research in complex networks, and necessary for the application of network theory to specific network problems arising in academic or industrial environments. The students will acquire experience in solving problems related to complex networks and will learn the necessary language to formulate models of network-embedded systems.
This module introduces you to several state-of-the-art methodologies for machine learning with neural networks (NNs). After discussing the basic theory of constructing and calibrating NNs, we consider various types of NN suitable for different purposes, such as convolutional NNs, recurrent NNs, autoencoders and generative adversarial networks. This module includes a wide range of practical applications; you will implement each type of network using Python for your weekly coursework assignments, and will calibrate these networks to real datasets.
Optimisation refers to the selection of the best alternative, according to some criterion, from a set of available alternatives.
This module introduces standard models from mathematical optimisation, such as network flows and linear programmes, and their use in solving real-world optimisation problems in staff and project scheduling, commodity trading, production, and sales. Tutorials focus on the modelling of real-world optimisation problems based on data, and on the use of software such as R, Excel, and Gurobi to solve optimisation problems and make better decisions.
This module is key for students wishing to further their understanding of the visualisation techniques used in business decision processes using the powerful SAS Visual Analytics software.
You will apply the power of SAS analytics to massive amounts of data, gain valuable insights into visualisation techniques to uncover relevant patterns, and be empowered to make quicker informed decisions.
Time Series Analysis refers to the use of statistical and machine learning methods for inference on datasets containing variables collected over time, with the ultimate goal of forecasting the values of these variables at some future time.
This module introduces key concepts such as trend and seasonality decomposition, autocorrelation, autoregressive and moving average models, and exponential methods. Tutorials focus on the use of the R software environment in the analysis of real-world time series data.
This module introduces you to the fundamentals of modern time series analysis. We aim to be comprehensive, looking at both theory and applications for different time series models that are widely used in practice. To this end, we will use R and RStudio as our main software for data analysis and you will gain hands-on experience in applying methods learned to real-world case studies.
Each Data Analytics MSc student is required to complete a 60-credit project dissertation. A typical MSc project dissertation consists of about 30 word-processed pages, covering a specific research-level topic in data analytics, usually requiring the student to understand, explain and elaborate on results from one or more journal articles and/or to perform computations, simulations, or analysis. An MSc project should help prepare a good student for PhD research or independent work in industry and even allow an excellent student the possibility of doing some research.
Possible areas of the MSc dissertation projects offered by the School of Mathematical Sciences include a large variety of different scientific topics, among them time series analysis, exploratory data analysis on a dataset, performance and comparative analysis of state-of-the-art techniques, theoretical models of data, complex systems, dynamical systems, topological data analysis, experimental design with data, and statistical aspects of data analytics techniques.