Uncertainty in machine learning: when no decision is the best decision
PhD student Gerardo Durán-Martín tells us about Bayesian machine learning and why not making a decision is sometimes the most intelligent decision of all.
What was your route to a PhD in Mathematics?
I did my undergraduate studies in Applied Mathematics at the Marista University in Mexico City and then went to work in industry. I was a Risk Manager at Santander Bank and then co-founded a start-up dedicated to understanding financial markets through machine learning. After a couple of years, I decided to pursue a great interest of mine: understanding the mathematics behind machine learning. I was accepted into the MSc in Mathematics at Queen Mary to pursue that interest and, by the end of the programme, I knew I wanted to continue with a PhD in the field. I applied and was awarded an EPSRC studentship, so I’m currently in my first year of the PhD under the supervision of Alex Shestopaloff, Kevin Murphy, and Luca Rossini.
What is your area of research?
I’m focused on Bayesian machine learning with applications to time-series analysis. Our goal is to find data-efficient and robust algorithms that can model predictive distributions capable of characterising the uncertainty in complex time-varying systems.
How would you explain your research to someone who isn’t from a mathematics background?
The fields of artificial intelligence, statistics, and machine learning are all about making decisions. Every single day, a person has to make all sorts of decisions: what to wear for work or school, what to eat, what to do in their leisure time, what books to read, etc. There are also bigger, more important decisions in life: what to study at university, if and when to get married, whether to save up for a car, and so forth.
If the goal in life is to maximise happiness, one might think that a really intelligent person would be capable of making decisions throughout their life that maximise their happiness. However, on many occasions, we don’t have enough information to determine what the best decision to make is. This could be because we haven’t experienced enough of the world to decide what to do or because we haven’t been in a position to question the ramifications of that decision.
Can you give us an example?
Let’s imagine asking a five-year-old girl whether she would like to study for a PhD in Anthropology at age 27. We can safely assume that she won’t be too sure about whether or not studying for a PhD will maximise her future happiness. So if her answer were “I don’t know”, we should take this answer as valid since, most likely, she won’t even know what a PhD is!
In the same way, an agent should not be considered intelligent based only on the number of correct decisions it can make. In some scenarios, it may not have enough information to make a decision, so in this sense, showing uncertainty in decision making is actually a sign of intelligence. Sometimes, making no decision is the best decision. Additionally, if the agent were to obtain more information that could help guide its decisions, it should take that information into account in future decisions.
What does this look like in your research?
In my research, we build mathematical models that make decisions based on previous beliefs and new information. The way we combine past beliefs with new information is through the lens of Bayesian statistics. Bayesian statistics allows us not only to estimate the future, but also to quantify how certain we are about those predictions. We seek models that understand what happened in the past to make decisions in the future and, at the same time, are able to adapt if the environment changes.
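As a rough illustration of combining a prior belief with new information, here is a minimal sketch of a textbook Beta-Bernoulli update. It is not the model used in this research; the function names, the uniform prior, and the counts (7 successes, 3 failures) are all assumptions made for the example.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate update: a Beta(alpha, beta) prior plus observed counts
    gives a Beta(alpha + successes, beta + failures) posterior."""
    return alpha + successes, beta + failures


def beta_mean_var(alpha, beta):
    """Mean and variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total ** 2 * (total + 1))
    return mean, var


# Hypothetical prior belief: every success probability is equally plausible.
prior = (1.0, 1.0)

# Hypothetical new information: 7 successes and 3 failures.
posterior = update_beta(*prior, successes=7, failures=3)

prior_mean, prior_var = beta_mean_var(*prior)
post_mean, post_var = beta_mean_var(*posterior)

print(f"prior:     mean={prior_mean:.3f}, variance={prior_var:.4f}")
print(f"posterior: mean={post_mean:.3f}, variance={post_var:.4f}")
```

The posterior mean moves towards the observed frequency, and the variance shrinks, which is one simple way of expressing that the model has become more certain after seeing data.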
Is there any other research being done similar to yours?
Machine learning is very popular nowadays. It is looked at from different perspectives and has many applications, from face recognition to weather forecasting or, in my research, making optimal decisions in financial markets.
Can you give us an example of how your research might be used in the real world?
The purpose of an investment firm is to maximise its return on investment. A pension fund, for example, seeks to maximise (given a certain level of risk) the amount of money it will give people once they retire. In normal market conditions, we might hope to have a mathematical model that is capable of making accurate investment decisions up to a tolerable level of uncertainty. If the markets were to change, say during an economic crisis, there may be no mathematical model or amount of data that could describe what the best decision is. Our model should therefore recognise this and either let humans make the decision or wait until it has enough information to make one.
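The idea of deferring when uncertainty is too high can be sketched as a simple decision rule. This is only an illustration under stated assumptions, not the method used in this research: the function name `decide`, the threshold `max_std`, the action labels, and the sample returns are all hypothetical.

```python
import statistics


def decide(predicted_returns, max_std=0.05):
    """Pick an action from samples of a predictive distribution of returns.

    If the spread of the samples exceeds max_std (an assumed tolerance),
    the rule defers to a human instead of acting.
    """
    mean = statistics.mean(predicted_returns)
    spread = statistics.stdev(predicted_returns)
    if spread > max_std:
        return "defer"
    return "invest" if mean > 0 else "hold"


# Hypothetical predictive samples in calm markets: tightly clustered.
calm = [0.02, 0.03, 0.01, 0.025, 0.015]

# Hypothetical predictive samples during a crisis: widely dispersed.
crisis = [0.30, -0.25, 0.10, -0.40, 0.20]

print(decide(calm))
print(decide(crisis))
```

With the calm samples the rule acts, while with the crisis samples the spread is above the tolerance and the rule hands the decision back to a human.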
Who are your supervisors and what are their areas of expertise?
Alex Shestopaloff is a Lecturer in Statistics at the School of Mathematical Sciences at Queen Mary. His area of expertise is statistical computing, in particular, Markov Chain Monte Carlo methods for performing Bayesian inference for complex stochastic models.
Kevin Murphy is a Senior Staff Research Scientist at Google. His area of expertise is probabilistic approaches to machine learning, Bayesian inference and decision making under uncertainty, and applications to "AI problems" like understanding images, text, and other data types.
Luca Rossini is a tenure-track assistant professor in Statistics at the University of Milan. He focuses on econometrics and statistics. In particular, he’s interested in Bayesian methods applied to time series models and to graph/network theory.