Statistics and Data Science Seminar

Date

Room

Speaker

Title
03/04/2024 2:00 PM

MB-503

Prof. Maria Grith (Erasmus University Rotterdam)

Neural Tangent Kernel in Implied Volatility Forecasting: A Nonlinear Functional Autoregression Approach

Implied volatility (IV) forecasting is inherently challenging due to its high dimensionality across moneyness levels and maturities, and its nonlinearity in both spatial and temporal aspects. We utilize implied volatility surfaces (IVS) to represent comprehensive spatial dependence and model the nonlinear temporal dependencies within a series of IVS. Leveraging advanced kernel-based machine learning techniques, we introduce the functional Neural Tangent Kernel (fNTK) estimator within the Nonlinear Functional Autoregression framework, specifically tailored to capture intricate relationships within implied volatilities. We establish the connection between NTK and functional kernel regression, emphasizing its role in contemporary nonparametric statistical modeling. Empirically, we analyze S&P 500 Index options from January 2009 to December 2021, encompassing more than 6 million European calls and puts, thereby showcasing the superior forecast accuracy of fNTK. We demonstrate the significant economic value of having an accurate implied volatility forecaster within trading strategies. Notably, short delta-neutral straddle trading, supported by fNTK, achieves a Sharpe ratio ranging from 1.45 to 2.02, resulting in a relative enhancement in trading outcomes ranging from 77% to 583%.
10/04/2024 2:00 PM

MB-503

Dr Nicolas Hernandez (QMUL)

Simultaneous predictive confidence bands for functional time series models

Functional Time Series (FTS) are sequences of dependent random elements taking values in some functional space. Most of the research in this domain focuses on producing a predictor able to forecast the next function, having observed a part of the sequence. For this, the Autoregressive Hilbertian process is a suitable framework. Here, we address the problem of constructing simultaneous predictive confidence bands for a stationary FTS. The method is based on an entropy measure for stochastic processes. To construct predictive bands, we use a functional bootstrap procedure that allows us to estimate the prediction law and a Reproducing Kernel Hilbert Space (RKHS) to represent the functions, considering then the basis associated with the reproducing kernel. We then classify the points on the projected space according to those that belong to the minimum entropy set (MES) and those that do not. We map the minimum entropy set back to the functional space and construct a band using the regularity property of the RKHS. The proposed methodology is illustrated through artificial and real data sets.
09/02/2012 4:30 PM

M203

Steve Bush, School of Mathematical Sciences, University of Technology, Sydney

Optimal Designs for Stated Choice Experiments that Incorporate Position Effects

Choice experiments are widely used in transportation, marketing, health and environmental research to measure consumer preferences. From these consumer preferences, we can calculate willingness to pay for an improved product or state, and hence make policy decisions based on these preferences.

In a choice experiment, we present choice sets to the respondent sequentially. Each choice set consists of m options, each of which describes a product or state, which we generically call an item. Each item is described by a set of attributes, the features that we are interested in measuring. Respondents are asked to select the most preferred item in each choice set. We then use the multinomial logit model to determine the importance of each attribute.

In some situations we may be interested in whether an item's position within the choice set affects the probability that the item is selected. This problem is reminiscent of donkey voting in elections, and can also be seen in the design of tournaments, where the home team is expected to have an advantage.

In this presentation, we give an overview of stated choice experiments, and then discuss a model that incorporates position effects for choice experiments with arbitrary m. This is an extension of the model proposed by Davidson and Beaver (1977) for m = 2. We give optimal designs for the estimation of attribute main effects plus the position effects under the null hypothesis of equal selection probabilities. We conclude with some simulations that compare how well optimal designs and near-optimal designs estimate the attribute main effects and position effects.
26/01/2012 10:56 AM

M203

Helen Warren, London School of Hygiene and Tropical Medicine

Derivation and Assessment of Robustness Criteria for Vulnerability of Block Designs in the Event of Observation Loss

This presentation summarises the main content of my PhD research into
the robustness of incomplete block designs. It introduces a Vulnerability
Measure to determine the likelihood of a design becoming disconnected, with
inestimable treatment contrasts, as a result of random observation loss. For
any general block design, formulae have been derived and a program has been
written to calculate and output the vulnerability measures.

Comparisons are made between the vulnerability and optimality of designs.
The vulnerability measures can aid in design construction, be used as a pilot
procedure to ensure the proposed design is sufficiently robust, or as a method
of design selection by ranking the vulnerability measures of a set of competing
designs in order to identify the least vulnerable design. In particular, this can
distinguish between non-isomorphic BIBDs. By observing combinatorial relationships
between concurrences and block intersections of designs, this ranking
method is compared with other approaches in the literature that consider the
effects on the efficiency of BIBDs, by either the loss of two complete blocks, or
the loss of up to three random observations.

The loss of whole blocks of observations is also considered, presenting improvements
on bounded conditions for the maximal robustness of designs.
Special cases of design classes are considered, e.g. complement BIBDs and repeated
BIBDs, as well as non-balanced designs such as Regular Graph Designs.
17/07/2011 3:00 PM

Mathematics Lecture Theater

Hugo Maruri-Aguilar, School of Mathematical Sciences, Queen Mary, University of London

Smooth polynomial methods for computer simulations

Smooth supersaturated polynomial interpolators (Bates et al. 2009)
are an alternative approach to modelling computer simulations. They have
the flexibility of polynomial modelling, while avoiding the
inconvenience of undesired polynomial oscillations (i.e. Runge's
phenomenon). Smooth polynomials have been observed to be most
effective for small sample sizes, although their use is not
restricted in this respect. The talk will survey the smooth
polynomial technique, comparing with traditional alternatives like
kriging or thin-plate splines. Extensions and examples will be
presented.

This is joint work with Henry Wynn and Ron Bates (LSE).
02/06/2011 5:30 PM

M203

Wai Yin Yeung, School of Mathematical Sciences, Queen Mary, University of London

The power of biased coin designs

The biased coin design introduced by Efron (1971, Biometrika) is a design for allocating patients in clinical
trials which helps to maintain the balance and randomness of the experiment. Its power is studied by Chen (2006,
Journal of Statistical Planning and Inference) and compared with that of repeated simple random sampling when
there are two treatment groups and patients’ responses are normally distributed. Another design similar to Efron’s
biased coin design called the adjustable biased coin design has been developed by Baldi Antognini and Giovagnoli
(2004, Journal of the Royal Statistical Society Series C) for patient allocation. Both designs aim to balance the
number of patients in two treatment groups. Baldi Antognini (2008, Journal of Statistical Planning
and Inference) shows theoretically that the adjustable biased coin design is uniformly more powerful than Efron’s biased coin
design. This means that the adjustable biased coin design gives a more balanced trial than Efron’s biased coin design.
Moreover, the biased coin design methods can also be applied to patients grouped by prognostic factors in order
to balance the number of patients in two treatments for each of the factors. This is called the covariate-adaptive
biased coin design by Shao, Yu and Zhong (2010, Biometrika). It is believed that the covariate-adaptive biased
coin design gains more power than Efron’s biased coin design and recently the covariate-adjustable biased coin
design is also under investigation. However, the case when there is an interaction between covariates has not been
looked at in detail for any of the above designs.

This talk will consist of three parts. First, numerical values for the simulated power of the adjustable
biased coin design, which has not been studied before, will be shown and compared with the simulated power of
Efron’s biased coin design. Then, the powers of repeated simple random sampling and the biased coin design will
be studied when responses are binary. The theoretical calculations and exact numerical results will then be given
for the unconditional powers of the two designs for binary responses. Finally, the expression for the power of
covariate-adaptive randomization by normal approximation will be introduced. Numerical values for the normal
approximation will also be given to compare with the exact value of the biased coin design. In addition, for
the covariate-adaptive biased coin design, the ideas of global and marginal balance will be introduced and
compared when there are interactions between the covariates.
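
As a rough illustration of the allocation rule behind Efron's biased coin design, the sketch below assigns each new patient to the currently under-represented arm with probability p, and tosses a fair coin when the arms are balanced. The function name, the arm labels and the seed handling are hypothetical choices made for this example; p = 2/3 is the value suggested by Efron (1971). It is not the speaker's code and it does not compute power.

```python
import random

def efron_biased_coin(n_patients, p=2/3, seed=None):
    """Allocate patients to arms "A" and "B" with Efron's biased coin rule.

    p is the probability of choosing the under-represented arm
    (hypothetical helper written for illustration only).
    """
    rng = random.Random(seed)
    counts = {"A": 0, "B": 0}
    allocations = []
    for _ in range(n_patients):
        diff = counts["A"] - counts["B"]
        if diff == 0:
            prob_a = 0.5          # balanced: fair coin
        elif diff < 0:
            prob_a = p            # A is behind: favour A
        else:
            prob_a = 1 - p        # A is ahead: favour B
        arm = "A" if rng.random() < prob_a else "B"
        counts[arm] += 1
        allocations.append(arm)
    return allocations

if __name__ == "__main__":
    alloc = efron_biased_coin(50, seed=1)
    print(alloc.count("A"), alloc.count("B"))
```

Setting p = 1/2 recovers simple randomization, while values of p closer to 1 force balance more strongly at the cost of predictability.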
05/05/2011 5:30 PM

M103

Prof. Dr. Vladimir V. Anisimov, Senior Director, Research Statistics Unit, Quantitative Sciences, GlaxoSmithKline

Statistical techniques for predictive patient recruitment, randomization and drug supply modelling in multicentre clinic

A large clinical trial for testing a new drug usually involves a large number of patients and is
carried out in different countries using multiple clinical centres. The patients are recruited in
different centres, after a screening period they are randomized to different treatments according
to some randomization scheme and then get a prescribed drug. The design of a multicentre clinical trial
consists of several stages including statistical design (choosing a statistical model for the analysis
of patient responses, randomization scheme, sample size needed for testing hypotheses, etc.) and
predicting patient recruitment and the drug supply needed to cover patient demand.

The talk is devoted to advanced statistical techniques for modelling and
predicting stochastic processes describing the behaviour of a trial over time. For modelling patient
recruitment, an innovative predictive analytic statistical methodology is developed [1,2,3].
Patient flows are modelled by using Poisson processes with random delays and gamma distributed
rates. ML and Bayesian techniques for estimating parameters using recruitment data and asymptotic
approximations for creating predictive bounds in time for the number of patients in centres/regions
are developed. This also allows us to evaluate the optimal number of clinical centres needed to complete
the trial before the deadline with a given probability and to predict trial performance.
This technique is extended further to predicting the number of different events in trials with
waiting time to response. Closed-form analytic expressions for the predictive distributions are
derived. Implementation in oncology trials is considered.
The technique for predicting the number of patients randomized to different treatments for the basic
randomization schemes – unstratified and centre/region-stratified, is developed and the impact of
randomization process on the statistical power and sample size of the trial is also investigated [4].
Using these results, an innovative risk-based statistical approach to predicting the amount of drug
supply required to cover patient demand with a given risk of stock-out is developed [3]. Software
tools in R for patient recruitment, event modelling and drug supply modelling based on these
techniques have been developed. These tools are being implemented at GSK and have already led to
significant benefits and cost savings.
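
The Poisson-gamma idea for recruitment can be illustrated with a small simulation. The sketch below is a simplified stand-in rather than the methodology of the talk: it ignores the random centre start-up delays, assumes every centre recruits over the same horizon, and uses Monte Carlo quantiles in place of the closed-form and asymptotic predictive bounds mentioned in the abstract. All function names and parameter values are hypothetical.

```python
import random

def simulate_total_recruited(n_centres, horizon, alpha, beta, rng):
    """One simulated total number of patients recruited by time `horizon`.

    Each centre recruits as a Poisson process whose rate is drawn from a
    gamma distribution with shape `alpha` and rate `beta` (assumed values).
    """
    total = 0
    for _ in range(n_centres):
        # random.gammavariate takes shape and scale, so scale = 1 / beta
        rate = rng.gammavariate(alpha, 1.0 / beta)
        if rate <= 0:
            continue  # guard against numerical underflow to zero
        # Count Poisson arrivals via exponential inter-arrival times
        t = rng.expovariate(rate)
        while t <= horizon:
            total += 1
            t += rng.expovariate(rate)
    return total

def predictive_bounds(n_centres, horizon, alpha, beta,
                      n_sims=2000, level=0.9, seed=0):
    """Monte Carlo predictive interval for the total number recruited."""
    rng = random.Random(seed)
    totals = sorted(
        simulate_total_recruited(n_centres, horizon, alpha, beta, rng)
        for _ in range(n_sims)
    )
    lo_idx = int((1 - level) / 2 * n_sims)
    hi_idx = min(n_sims - 1, int((1 + level) / 2 * n_sims))
    return totals[lo_idx], totals[hi_idx]

if __name__ == "__main__":
    # 20 centres, 10 time units, mean rate alpha / beta = 2 per unit time
    print(predictive_bounds(20, 10.0, alpha=2.0, beta=1.0))
```

Under these assumptions the expected total is n_centres * horizon * alpha / beta, which gives a quick sanity check on the simulated interval.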

References
[1] Anisimov, V.V., Fedorov, V.V., Modeling, prediction and adaptive adjustment of recruitment in
    multicentre trials. Statistics in Medicine, Vol. 26, No. 27, 2007, pp. 4958-4975.
[2] Anisimov, V.V., Recruitment modeling and predicting in clinical trials, Pharmaceutical
    Outsourcing. Vol. 10, Issue 1, March/April 2009, pp. 44-48.
[3] Anisimov, V.V., Predictive modelling of recruitment and drug supply in multicenter clinical
    trials. Proc. of the Joint Statistical Meeting, Washington, USA, August, 2009, pp. 1248-1259.
[4] Anisimov, V., Impact of stratified randomization in clinical trials, In: Giovagnoli A., Atkinson
    AC., Torsney B. (Eds), MODA 9 - Advances in Model-Oriented Design and Analysis. Physica-Verlag/Springer,
    Berlin, 2010, pp. 1-8.
[5] Anisimov, V., Drug supply modeling in clinical trials (statistical methodology), Pharmaceutical
    Outsourcing, May/June, 2010, pp. 50-55.
[6] Anisimov, V.V., Effects of unstratified and centre-stratified randomization in multicentre clinical
    trials. Pharmaceutical Statistics, v. 10, iss. 1, 2011, pp. 50-59.
31/03/2011 5:30 PM

M203

Roger Sugden, School of Mathematical Sciences, QMUL

Prediction under unequal probability sampling

I consider a very simple prediction problem and contrast two
classical approaches with the Bayesian approach: firstly
in the case of no selection (or selection at random) and
secondly with limited design information in the form of
unequal probability weights for the sampled units.

I find the Bayesian approach much less ad hoc than the
alternatives!
24/03/2011 4:30 PM

M203

Muddakkir Manas Khadim, School of Mathematical Sciences, QMUL

An Algorithm for Generating a Response Surface Split-plot Design

The estimation of the variance components of a response surface model for a
Split-plot design has been of much interest in recent years. Different techniques
are available for estimating these variance components, including REML, a
Bayesian approach, replication of the center point runs and a randomization-based
approach. The available number of degrees of freedom is also an
important issue when estimating these variance components. In our talk, we
will present an algorithm for generating a D-optimal Split-plot design such that
the generated design has a required number of degrees of freedom for estimating
the variance components using the randomization-based approach. One advantage
of using this approach is that it gives pure error estimates of the variance
components.
17/03/2011 4:30 PM

M203

M. Sofia Massa, Department of Statistics, University of Oxford

Graphical models combination

In some recent applications, the interest is in combining information about relationships between variables from independent studies performed under partially comparable circumstances. One possible way of formalising this problem is to consider combinations of families of distributions respecting conditional independence constraints with respect to a graph G, i.e., graphical models. In this talk I will start by giving a brief introduction to graphical models and by introducing some motivating examples of the research question. Then I will present some relevant types of combinations and associated properties. Finally I will discuss some issues related to the estimation of the parameters of the combination.
10/03/2011 4:30 PM

M203

Shahrul Mt-Isa, Imperial College London

Improving Evidence-Based Risk-Benefit Decision-Making of Medicines for Children

Risk-benefit assessment for decision-making based on evidence is a subject of continuing interest. However, randomised clinical trial evidence of risks and benefits is not always available, especially for drugs used in children, mainly due to ethical concerns about children being subjects of clinical trials. This thesis appraises risk-benefit evidence from published trials in children for the case study, assesses the risk-benefit balance of drugs, proposes a framework for risk-benefit evidence synthesis, and demonstrates the extent of its contribution.

The review shows that trial designs lack safety planning, leading to inconsistent safety reporting and a lack of efficacy evidence. The General Practice Research Database (GPRD) data was exploited to synthesise evidence of risks of cisapride and domperidone in children with gastro-oesophageal reflux as a case study. Efficacy data are only available through review evidence.

Analysis of prescribing trends does not identify further risk-benefit issues but suggests that the lack of evidence has led to inappropriate prescribing in children. Known adverse events are defined from the British National Formulary and quantified. The proportional reporting ratio technique is applied to other clinical events to generate potential safety signals. Signals are validated and analysed for confirmatory association through covariate adjustment in regressions. The degree of association between signals and drugs is assessed using Bradford Hill’s criteria for causation. Verified risks are known adverse events with 95% statistical significance, and signals in the abdominal pain group and the bronchitis and bronchiolitis group.
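
For readers unfamiliar with the proportional reporting ratio (PRR), a minimal sketch of the standard 2x2 calculation follows. The function name and the counts in the usage example are hypothetical and are not taken from the GPRD analysis described here.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR from a 2x2 table of report counts.

    a: drug of interest, event of interest
    b: drug of interest, all other events
    c: all other drugs, event of interest
    d: all other drugs, all other events
    """
    if a + b == 0 or c + d == 0 or c == 0:
        raise ValueError("PRR is undefined for these counts")
    # Proportion of the event among reports for the drug of interest,
    # divided by the same proportion among reports for all other drugs
    return (a / (a + b)) / (c / (c + d))

if __name__ == "__main__":
    # Hypothetical counts: 10 of 100 reports vs 5 of 200 reports
    print(proportional_reporting_ratio(10, 90, 5, 195))
```

A PRR well above 1 indicates that the event is reported disproportionately often for the drug of interest, which is why it is used to flag potential signals for further validation.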

The drugs’ risk-benefit profiles are illustrated using the two verified signals and an efficacy outcome. Sensitivity of input parameters is studied via simulations. The findings are used to hypothetically advise risk-benefit aspects of trial designs. The value of information from this study varies between stakeholders and the keys to communicating risks and benefits lie in presentation and understanding. The generalisability and scope of the proposed methods are discussed.
24/02/2011 4:30 PM

M203

Gemma Stephenson, National Oceanography Centre, Southampton

Using derivative information in the statistical analysis of computer models

Complex deterministic dynamical models are an important tool for climate prediction. Often though, such models are computationally too expensive to perform the many runs required. In this case one option is to build a Gaussian process emulator which acts as a surrogate, enabling fast prediction of the model output at specified input configurations. Derivative information may be available, either through the running of an appropriate adjoint model or as a result of some analysis previously performed. An emulator would likely benefit from the inclusion of this derivative information. Whether further efficiency is achieved, however, depends on the computational cost of obtaining the derivatives. Results of the emulation of a radiation transport model, with and without derivatives, are presented.

The knowledge of the derivatives of complex models can add greatly to their utility, for example in the application of sensitivity analysis or data assimilation. One way of generating such derivatives, as suggested above, is by coding an adjoint model. In climate science in particular adjoint models are becoming increasingly popular, despite the initial overhead of coding the adjoint and the subsequent, additional computational expense required to run the model.

We suggest an alternative method for generating partial derivatives of complex model output, with respect to model inputs. We propose the use of a Gaussian process emulator which can be used to estimate derivatives even without any derivative information known a priori. We show how an emulator can be employed to provide derivative information about an intermediate complexity climate model, C-GOLDSTEIN, and compare the performance of such an emulator to the C-GOLDSTEIN adjoint model.
17/02/2011 3:30 PM

Mathematics Lecture Theater

Ben Parker, School of Mathematical Sciences, Queen Mary, University of London

Design of Experiments for Markov Chains, or how often should we open the box?

Suppose we have a system that we wish to make repeated measurements on,
but where measurement is expensive or disruptive. Motivated by an
example of probing data networks, we model this as a black box system:
we can either choose to open the box or not at any time period, and our
aim is to infer the parameters that govern how the system evolves over
time.

By regarding this system evolution as an experiment that is to be
optimised, we present a method for finding optimal time points at which
to measure, and discuss some numerical results.

We show how we can generalise this result to find optimal measurement
times for any system that evolves according to the Markov principle.

This is joint work with Steven Gilmour and John Schormans (Queen Mary).
17/02/2011 4:30 PM

M203

Stefanie Biedermann, University of Southampton

Optimal Designs for Indirect Regression

In many real life applications, it is impossible to observe the feature of interest
directly. For example, non-invasive medical imaging techniques rely on indirect
observations to reconstruct an image of the patient’s internal organs. In this paper,
we investigate optimal designs for such indirect regression problems.

We use the optimal designs as benchmarks to investigate the efficiency of designs
commonly used in applications. Several examples are discussed for illustration.
Our designs provide guidelines to scientists regarding the experimental conditions
at which the indirect observations should be taken in order to obtain an accurate
estimate for the object of interest.

This is joint work with Nicolai Bissantz and Holger Dette (Bochum) and Edmund
Jones (Bristol).
10/02/2011 4:30 PM

M203

Theodore Papamarkou, Non-Communicable Disease (NCD) Research Group, Strangeways Research Laboratory, University of Cambridge

Patterns of Ethno-Linguistic and Genomic Diversity in Sub-Saharan Africa

Sub-Saharan African populations are characterized by a relatively complex genetic
architecture. Their excessive allele frequency differentiation, linkage disequilibrium
patterns and haplotype sharing have been understudied. The aim of our newly launched
project on African diversity is to understand the genetic diversity among sub-Saharan
African populations and its correlation with ethnic, archaeological and linguistic
variation. Ultimately, the study is hoping to disentangle past population histories
and therefore detect the evolutionary history of sub-Saharan African populations,
who are the origin of anatomically modern humans. Additionally, subsequent
genome-wide association studies, mostly related to lipid metabolism, are expected
to identify previously unsuspected biological pathways involved in disease etiology.

The talk is meant to broadly address the scope of the study and to outline the
associated statistical challenges.
03/02/2011 4:30 PM

M203

Dave Bray, School of Mathematical Sciences (Queen Mary) and Department of Mechanical Engineering (Imperial College)

Nanoparticles Dispersion: A Quantitative Measurement

Nanoparticle clustering within composite materials is known to affect the performance of the material, such as its toughness, and can ultimately cause its mechanical failure.

The type of nanoparticle dispersion is often judged through micrographs of the material, obtained using an electron or atomic force microscope. However, no standard quantitative method is in use for classifying these materials into good (homogeneous) and poor (heterogeneous) dispersion. For materials scientists it is of pressing concern that a suitable method is found to measure particle dispersion, to enable further progress in understanding the effect of morphology on the material properties.

This talk aims to be of general interest, providing the engineering background, proposed method and measurement results of test cases.
27/01/2011 4:30 PM

M203

Ron Bates, Design Systems Engineering, Rolls-Royce plc

Stochastic analysis models in the development of turbomachinery

The main focus of this seminar is the industrial application of tools
for stochastic analysis within the standard engineering design process.

Various applications will be discussed and the use of statistical methods
in engineering will be highlighted.

The talk will also explore aspects of uncertainty management, and highlight
some of the challenges faced in delivering practical stochastic analysis
methods for the engineering community.
21/01/2011 3:00 PM

M513

Robert Mee, University of Tennessee, Knoxville

One-Step RSM Using Fractional Box-Behnken Designs

In contrast to the usual sequential nature of response surface methodology
(RSM), recent literature has proposed both screening and response surface
exploration using a single three-level design. This approach is known as
“one-step RSM”. We discuss and illustrate shortcomings of the current
one-step RSM designs and analysis. Subsequently, we propose a class of
three-level designs and an analysis that will address these shortcomings. We
illustrate the designs and analysis with simulated and real data.
20/01/2011 4:30 PM

M203

Salvador Gezan, Department of Statistics, Institute of Food and Agricultural Sciences, University of Florida

Optimal design and analysis of field genetic trials: using old and new statistical tools.

Agronomic and forestry breeding trials tend to be large, often using hundreds of plants
and showing considerable spatial variation. In this study, we present various
alternatives for the design and analysis of field trials to identify “optimal” or
“near optimal” experimental designs and statistical techniques for estimating genetic
parameters through the use of simulated data for single site analysis. These simulations
investigated the consequences of different plot types (single- or four-plant row),
experimental designs and patterns of environmental heterogeneity.

Also, spatial techniques such as nearest neighbor methods and modeling of the error
structure by specifying an autoregressive covariance were compared. Because spatial
variation cannot usually be accounted for in the trial design, another strategy is to
improve trial analysis by using post-hoc blocking. We studied several typical experimental
designs and compared their efficiency with post-hoc blocking of the same designs over a
randomized complete block.

Usually, in the early stages of a breeding program there is a large number of genotypes that
could be tested but limited resources. Here, unreplicated trials have been recommended as an
option to support early screening of genetic material that can be pre-selected and later
tested in more formal replicated trials for single or multiple sites. In this study, we provide
a better evaluation/understanding of the statistical and genetic advantages and disadvantages
of the use of unreplicated trials by using simulated data, particularly for clonal trials, under
different replication alternatives. We also measure the gain in precision of using spatial
analysis in unreplicated trials and evaluate the effects of different genetic structures
(additive, dominant and epistatic) on these analyses.
25/11/2010 4:30 PM

M203

Mohammad Lutfor Rahman, School of Mathematical Sciences, Queen Mary, University of London

Multi-stratum designs with categorical responses

It is not possible to completely randomize the order of runs in some multi-factor factorial experiments.
This often results in a generalization of the factorial designs called split-plot designs. Sometimes in
industrial experiments complete randomization is not feasible because of having some factors whose
levels are difficult to change. When properly taken into account at the design stage, hard-to-change
factors lead naturally to multi-stratum structures. Mixed models are used to analyze multi-stratum
designs as each stratum may have random effects on the responses. We intend to design
experiments and analyze categorical data with hard-to-set factors, motivated by the random
effects structure in the mixed models. The current study is motivated by a polypropylene experiment
by four Belgian companies where responses are continuous and categorical. We have analyzed the
data from the current experiment using mixed binary logit and mixed cumulative logit models in a
Bayesian approach. We also obtained outputs following the simplified models of Goos and Gilmour
(2010). When the simplified models were used, the outputs obtained by Bayesian methods were similar to
those obtained by likelihood methods, as non-informative priors were considered for the fixed effects.
25/11/2010 5:00 PM

M203

Benjamin Gaby, School of Mathematical Sciences, Queen Mary, University of London

Bayesian Tests For Outliers In Uniform Samples

In 1979 Barnett derived a series of classical tests that were based on simulations
to test whether extreme observations in a sample were outliers. He did this
for a variety of different probability distributions, including the Normal,
Exponential, Uniform and Pareto distributions. In 1988 Pettit considered
this problem for Exponential samples by using a Bayesian approach based on
deriving Bayes Factors to perform these tests. Then in 1990 he studied the
multivariate Normal distribution in some detail and approaches this problem
by deriving various results using the conditional predictive ordinate. Since
then this problem has been considered for both the Poisson and Binomial
distributions.

Recently I have been studying this problem for the Uniform and Pareto
distributions and this talk is based on the results that I have obtained for
the Uniform case. The talk will be in two parts. First we look at the
one-sided Uniform distribution, where I have shown that the largest observation
in the sample minimises the conditional predictive ordinate and then derived
the Bayes Factor to test whether it is an outlier. I then derived the Bayes
Factors for the cases when we have multiple outliers generated by the same
probability distribution and generated by different probability distributions.
For the one-sided Uniform distribution all the results that I obtained are
exact, in that I did not have to approximate any integrals.
The second part of the talk looks at the two-sided Uniform distribution,
where the structure of the problem is exactly the same as for the one-sided
Uniform distribution except that it is a lot more complicated because
it is a two-parameter problem. I dealt with this by using a transformation
that made this a one-parameter problem and then used an analytical approach to
approximate the Bayes Factors by an infinite series, where a full derivation
of the approximation and a proof that the series converges are given. Finally
in this section, I extend my ideas to solve the problem for the multivariate
Uniform distribution.
18/11/2010 4:30 PM

M203

Piotr Zwiernik, Department of Statistics, University of Warwick

Cumulants and L-cumulants spaces

In this talk I will first introduce cumulants which form a
convenient language to describe and approximate probability
distributions. A rich combinatorial structure of cumulants
helps to understand them better. The combinatorial version
of the definition of cumulants gives also a direct
generalization to L-cumulants. Without going too much into
technical details I will try to show how L-cumulants can be
used in the analysis of certain statistical models.

Our example focuses on phylogenetic tree models which are
graphical models with hidden data. I will also mention some
links with free probability.
11/11/2010 4:30 PM

M203

Alfonso Miranda, Department of Quantitative Social Science, Institute of Education, University of London

Missing ordinal covariates with informative selection

This paper considers the problem of parameter estimation in a model for a
continuous response variable y when an important ordinal explanatory
variable x is missing for a large proportion of the sample. Non-missingness
of x, or sample selection, is correlated with the response variable
and/or with the unobserved values the ordinal explanatory variable takes
when missing. We suggest solving the endogenous selection, or 'not missing
at random' (NMAR), problem by modelling the informative selection mechanism,
the ordinal explanatory variable, and the response variable together.

The use of the method is illustrated by re-examining the problem of the ethnic
gap in school achievement at age 16 in England using linked data from
the National Pupil Database (NPD), the Longitudinal Study of Young People
in England (LSYPE), and the Census 2001.
04/11/2010 4:30 PM

M203

Michal Komorowski Theoretical Systems Biology Group, Imperial College London

Inference, sensitivity and identifiability in stochastic chemical systems.

The aim of the presentation is to present a novel, integrated
theoretical framework for the analysis of stochastic biochemical
reaction models. The framework includes efficient methods for
statistical parameter estimation from experimental data, as well as
tools to study parameter identifiability, sensitivity and robustness.
The methods provide novel conclusions about functionality and
statistical properties of stochastic systems.

I will introduce a general model of chemical reactions described by
the Chemical Master Equation that I approximate using the linear noise
approximation. This allows us to write explicit expressions for the
likelihood of experimental data, which lead to an efficient inference
algorithm and a quick method for calculation of the Fisher Information
Matrices.

A number of experimental and theoretical examples will be presented to
show how the techniques can be used to extract information from the
noise structure inherent to experimental data. Examples include
inference of parameters of gene expression using fluorescent
reporter gene data, a Bayesian hierarchical model for estimation of
transcription rates and a study of the p53 system. Novel insights into
the causes and effects of stochasticity in biochemical systems are
obtained by the analysis of the Fisher Information Matrices.

References:

Komorowski, M., Finkenstädt, B., Rand, D. A. (2010); Using single
fluorescent reporter gene to infer half-life of extrinsic noise and
other parameters of gene expression, Biophysical Journal, Vol 98,
Issue 12, 2759-2769.

Komorowski, M., Finkenstädt, B., Harper, C. V., Rand, D. A. (2009);
Bayesian inference of biochemical kinetic parameters using the
linear noise approximation, BMC Bioinformatics, 10:343,
doi:10.1186/1471-2105-10-343.

Finkenstädt, B., Heron, E. A., Komorowski, M., Edwards, K., Tang, S.,
Harper, C. V., Davis, J. R. E., White, M. R. H., Millar, A. J.,
Rand, D. A. (2008); Reconstruction of transcriptional dynamics from
gene reporter data using differential equations, Bioinformatics,
15 December 2008; 24: 2901-2907.
28/10/2010 5:30 PM

M203

Guy Freeman University of Warwick

Learning, prediction and causation with graphical models

Graphical models provide a very promising avenue for making sense
of large, complex datasets. In this talk I review strategies for
learning Bayesian networks, the most popular graphical models currently
in use, and introduce a new graphical model, the chain event graph,
which is an improvement on using the Bayes net in many cases but
which introduces its own challenges for learning, prediction and
causation.
21/10/2010 5:30 PM

M203

Alexander Vikhavsky School of Engineering and Material Science, Queen Mary, University of London

Numerical analysis of the global identifiability of electrochemical systems

We discuss a numerical analysis of the parametric identifiability of electrochemical
systems. Firstly, we analyze global identifiability of the entire set of parameters
in a single ac voltammetry experiment and examine the effect of different waveforms
(square, sawtooth) on the accuracy of the identification procedure.

The analysis of global identifiability is equivalent to finding a global optimum of
a specially designed function. The optimization problem is solved by a random search
method and a statistical analysis of the obtained solution allows for selection of
a subset of the parameters (or their linear combinations), which can be identified.
Finally, we discuss optimization of the waveform for better identifiability.
14/10/2010 5:30 PM

M203

Ramon Rizvi School of Mathematical Sciences, Queen Mary, University of London

Beyond cluster analysis

Seminar series:

Statistics Seminar

Cluster analysis is a well established statistical technique which aims
to detect groups in data. Its main use is as an exploratory tool rather
than a conclusive technique.

Recently there has been growing interest in expansions of this technique
under the general umbrella name "Persistence of homology". This new topic
is at the crossroads between statistics and topology; and the main aim is
to describe other features than groups present in multivariate data and
thus it is a natural extension of clustering.

Betti numbers are used to describe data, and applying the first Betti number
coincides with cluster analysis, whereas subsequent Betti numbers enable
detection of "holes" or loops in data. For example, cluster analysis is
unable to detect whether data is gathered around a circle, but with persistent
homology this feature is immediately detected.
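
The circle example above can be sketched in a few lines. This is only an illustration under stated assumptions, not the speaker's method: it links points closer than a hypothetical `threshold`, counts connected components (the Betti number that corresponds to clusters) with a union-find, and counts independent cycles of the resulting graph as `edges - vertices + components`. A genuine persistent homology computation would also fill in triangles and track features across all thresholds; the function name and the sample data are invented for this sketch.

```python
import math
from itertools import combinations


def graph_betti_numbers(points, threshold):
    """Return (components, independent_cycles) of the graph that links
    every pair of points closer than `threshold`."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = 0
    for i, j in combinations(range(n), 2):
        if math.dist(points[i], points[j]) < threshold:
            edges += 1
            parent[find(i)] = find(j)

    components = len({find(i) for i in range(n)})
    # Cycle rank of a graph: edges - vertices + components.
    cycles = edges - n + components
    return components, cycles


# Hypothetical data: 12 evenly spaced points on the unit circle.
circle = [
    (math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
    for k in range(12)
]

# Neighbouring points are about 0.52 apart, so a threshold of 0.6
# links each point only to its two neighbours.
components, cycles = graph_betti_numbers(circle, threshold=0.6)
print(components, cycles)
```

Cluster analysis alone would report a single group here; the cycle count is what reveals that the group is arranged around a loop.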

The Seminar aims to survey both techniques and illustrate with some
examples.

This Seminar is the result of EPSRC Vacation Bursary Scheme 2010, won
by QMUL undergraduate Ramon Rizvi.
07/10/2010 5:30 PM

M203

Rosemary A. Bailey School of Mathematical Sciences, Queen Mary, University of London

The randomization model for two-phase experiments

Seminar series:

Statistics Seminar

For a single-phase experiment, we allocate treatments to
experimental units using a systematic plan, and then randomize by
permuting the experimental units by a permutation chosen at random from
a suitable group. This leads to the theory developed in J. A.
Nelder's 1965 Royal Society papers. Recently, C. J. Brien and I have
been extending this theory to experiments such as two-phase
experiments, where the produce, or outputs, from the first phase are
randomized to a new set of experimental units in the second phase.
This brings in new difficulties, especially with standard software.
27/05/2010 5:30 PM

M203

Dan Stowell Centre for Digital Music Queen Mary, University of London

Rating and ranking in a standardised audio listening test

Seminar series:

Statistics Seminar

We describe a standardised audio listening test known as MUSHRA, used to
evaluate the perceptual quality of intermediate-quality audio algorithms
(for example MP3 compression). The nature of the test involves aspects
of continuous rating as well as ranking of items. We discuss the statistics
used to analyse test data, in light of recent experiences conducting a
user group study.
29/04/2010 5:30 PM

M203

Silvia Liverani Department of Statistics Bristol University

Bayesian Clustering of Curves and Microarray Data

Seminar series:

Statistics Seminar

An increasing number of microarray experiments produce time
series of expression levels for many genes. Some recent clustering
algorithms respect the time ordering of the data and are, importantly,
extremely fast. The aim is to cluster and classify the expression
profiles in order to identify genes potentially involved in, and
regulated by, the circadian clock. In this presentation we report new
developments associated with this methodology. The partition space is
intelligently searched placing most effort in refining the partition
where genes are likely to be of most scientific interest.
11/03/2010 4:30 PM

M203

Muna Arephin Cancer Research UK Centre for Epidemiology, Mathematics and Statistics Wolfson Institute of Population Health

Order restricted inference for multi-arm trials

Seminar series:

Statistics Seminar

There is an increasing demand to test more than one new treatment in the hope of
finding at least one that is better than the control group in clinical trials. A likelihood
ratio test is developed using order restricted inference, a family of tests is defined and
it is shown that the LRT and Dunnett-type tests are members of this family. Tests are
compared, using power and a simple loss function which takes incorrect selection, and
its impact, into account. The optimal allocation of patients to treatments was sought
to maximize power and minimize expected loss.

For small samples, the LRT statistic for binary data based on order restricted inference
is derived and used to develop a conditional exact test. Two-stage adaptive designs for
comparing two experimental arms with a control are developed, in which the trial is
stopped early if the difference between the best treatment and the control is less than
C1; otherwise, it continues, with one arm if one experimental treatment is better than
the other by at least C2, or with both arms otherwise. Values of the constants C1 and
C2 are compared and the adaptive design is found to be more powerful than the fixed
design.
04/03/2010 4:30 PM

M203

Kei Kobayashi Department of Mathematical Analysis and Statistical Inference The Institute of Statistical Mathematics

Bayesian shrinkage prediction and its application to regression problems

Seminar series:

Statistics Seminar

In this talk, we consider Bayesian shrinkage predictions for the Normal regression problem under the frequentist Kullback-Leibler risk function. This result is an extension of Komaki (2001, Biometrika) and George (2006, Annals. Stat.).

Firstly, we consider the multivariate Normal model with an unknown mean and a known covariance. The covariance matrix can be changed after the first sampling. We assume rotation invariant priors of the covariance matrix and the future covariance matrix and show that the shrinkage predictive density with the rescaled rotation invariant superharmonic priors is minimax under the Kullback-Leibler risk. Moreover, if the prior is not constant, Bayesian predictive density based on the prior dominates the one with the uniform prior.
In this case, the rescaled priors are independent of the covariance matrix of future samples. Therefore, we can calculate the posterior distribution and the mean of the predictive distribution (i.e. the posterior mean and the Bayesian estimate for quadratic loss) based on some of the rescaled Stein priors without knowledge of future covariance. Since the predictive density with the uniform prior is minimax, the one with each rescaled Stein prior is also minimax.

Next we consider Bayesian predictions whose prior can depend on the future covariance. In this case, we prove that the Bayesian prediction based on a rescaled superharmonic prior dominates the one with the uniform prior without assuming the rotation invariance.
Applying these results to the prediction of response variables in the Normal regression model, we show that there exists a prior distribution such that the corresponding Bayesian predictive density dominates that based on the uniform prior. Since the prior distribution depends on the future explanatory variables, both the posterior distribution and the mean of the predictive distribution may depend on the future explanatory variables.

The Stein effect has robustness in the sense that it depends on the loss function rather than the true distribution of the observations. Our result shows that the Stein effect has
robustness with respect to the covariance of the true distribution of the future observations.
25/02/2010 4:30 PM

M203

Anthony C. Atkinson Department of Statistics London School of Economics

Optimum Experimental Designs for Enzyme Kinetic Models

Seminar series:

Statistics Seminar

Enzymes are biological catalysts that act on substrates. The speed of reaction as a function of substrate concentration typically follows
the nonlinear Michaelis-Menten model. The reactions can be modified by the presence of inhibitors, which can act by several different mechanisms, leading to a variety of models, all also nonlinear.

The paper describes the models and derives optimum experimental designs for model building. These include D-optimum designs for all the parameters and Ds-optimum designs for subsets of parameters. The Ds-optimum designs may be singular and so do not provide estimates of all parameters; designs are suggested which have both good D- and Ds-efficiencies. Also derived are designs for testing the equality of parameters.
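
As a rough illustration of what a locally D-optimum design looks like for the basic Michaelis-Menten model v = Vmax * s / (Km + s), the sketch below searches for the best lower support point of an equally weighted two-point design, with the upper point fixed at the largest allowed concentration. It is not taken from the paper: the values of `KM` and `S_MAX` are hypothetical, the function names are invented, and the closed form Km * s_max / (2 * Km + s_max) is the standard textbook result that the grid search is checked against.

```python
def d_criterion(s1, s2, km):
    """Absolute determinant of the 2x2 matrix of parameter sensitivities
    for design points s1 and s2, up to the constant factor Vmax.

    The sensitivities of v = Vmax * s / (Km + s) are
    dv/dVmax = s / (Km + s) and dv/dKm = -Vmax * s / (Km + s)**2.
    """
    return abs(s1 * s2 * (s1 - s2)) / ((km + s1) ** 2 * (km + s2) ** 2)


def best_lower_point(km, s_max, steps=10000):
    """Grid search for the lower design point, upper point fixed at s_max."""
    grid = [s_max * k / steps for k in range(1, steps)]
    return max(grid, key=lambda s: d_criterion(s, s_max, km))


# Hypothetical prior guess for Km and largest feasible concentration.
KM, S_MAX = 2.0, 10.0

numeric = best_lower_point(KM, S_MAX)
closed_form = KM * S_MAX / (2 * KM + S_MAX)
print(numeric, closed_form)
```

The design is only locally optimum because the best lower point depends on the assumed value of Km, which is the parameter the experiment is meant to estimate.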
11/02/2010 4:30 PM

M203

Janet Godolphin Department of Mathematics University of Surrey

Estimability and Connectivity in m-way Designs

Seminar series:

Statistics Seminar

The classical problem of ascertaining the connectivity status of an
m-way design has received much attention, particularly in the cases
where m=2 and m=3. In the general case, a new approach yields the
connectivity status for the overall design and for each of the individual
factors directly from the kernel space of the design matrix. Furthermore,
the set of estimable parametric functions in each factor is derived from
a segregated component of this kernel space.

The kernel space approach enables a simple derivation of some classical
results. Examples are given to illustrate the main results.
04/02/2010 4:30 PM

M203

Henry P. Wynn Department of Statistics London School of Economics

Information-based learning, with thoughts on optimal experimental design.

Seminar series:

Statistics Seminar

The information approach to optimal experimental design
is widened to include information-based learning more
generally, drawing on the classical work of Renyi, Lindley,
de Groot and others. Learning is considered as occurring
when the posterior distribution of the quantity of interest
is more peaked than the prior, in a certain sense.

A key theorem states when this is expected to occur. Some
special examples are considered which show the boundary
between when learning occurs and when it does not.
28/01/2010 4:30 PM

M203

Mahbub Latif School of Mathematical Sciences Queen Mary

Design and analysis of transform-both-sides nonlinear models

Seminar series:

Statistics Seminar

Transformation on both sides of a nonlinear regression model has been used in practice to achieve, for example, linearity in the parameters of the model, approximately normally distributed errors, and constant error variance. The method of maximum likelihood is the most common method for estimating the parameters of the nonlinear model and the transformation parameter. In this talk we will discuss a new method, which we call the Anova method, for estimating all the parameters of the transform-both-sides nonlinear model. The Anova method is computationally simpler than the maximum likelihood approach and allows a more natural separation of different sources of lack-of-fit.

Considering the Michaelis-Menten model as an example, we will show the results of a simulation study comparing the maximum likelihood and Anova methods, where the Box-Cox transformation is used for transforming both sides of the Michaelis-Menten model. We will also show the use of the Anova method in fitting more complex transform-both-sides nonlinear models, such as transform-both-sides nonlinear mixed effects models and transform-both-sides nonlinear models with random block effects. At the end of the talk, we will briefly present a new approach to designing experiments for the transform-both-sides nonlinear Michaelis-Menten model.
21/01/2010 5:00 PM

M203

Maria Roopa Queen Mary, Queen Mary graduate students seminar

Bayesian decision procedures for dose escalation-a reanalysis

Seminar series:

Statistics Seminar

Zhou et al. (2006) developed Bayesian dose-escalation procedures for
early phase I clinical trials in oncology. They are based on discrete
measures of undesirable events and continuous measures of therapeutic
benefit. The objective is to find the optimal dose associated with some
low probability of an adverse event.

To understand their methodology I tried to reproduce their results
using a hierarchical linear model (Lindley and Smith (1972)) with different
orderings of the data. Computations were done in R. I found my results
were consistent with one another but different to the published results.
I then also programmed the model using "WinBugs" and again found the
results to be consistent with mine. I concluded that the published results
were in error.

My main interests are in Bayesian approaches for the design and analysis
of dose escalation trials, which involves prior information concerning
parameters of the relationships between dose and the risk of an adverse
event, with the chance to update after every dosing period using Bayes
theorem. In this talk I will discuss some of these issues and also shall
report my current work.
05/11/2009 4:30 PM

M203

A. Giovagnoli Università di Bologna

Randomized group up-and-down (U&D) experiments

Seminar series:

Statistics Seminar

Dating back to Dixon and Mood (1948), an Up-and-Down procedure is a sequential experiment used in binary response trials for identifying the stress level (treatment) corresponding to a pre-specified probability of positive response. In Phase I clinical trials U&D rules can be seen as a development of the traditional dose-escalation procedure (Storer, 1998). Recently Baldi Antognini et al. (2008) have proposed a group version of U&D procedures whereby at each stage a group of m units is treated at the same level and the number of observed positive responses determines how to randomize the level assignment of the next group. This design generalizes a vast class of U&Ds previously considered (Derman, 1957; Durham and Flournoy 1994; Giovagnoli and Pintacuda, 1998; Gezmu and Flournoy, 2006). The properties of the design change as the randomization method varies: appropriate randomization schemes guarantee desirable results in terms of the asymptotic behaviour of the experiment (see also Bortot and Giovagnoli, 2005). Results can be extended to continuous responses (Ivanova and Kim, 2009).
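
For readers unfamiliar with the basic idea, here is a minimal simulation of the classical one-unit-at-a-time rule (step down after a positive response, step up after a negative one), which targets the level with a 50% response probability. It is a sketch under stated assumptions and not the randomized group design of the talk: the logistic response curve, the level range 0 to 10, the seed and all function names are hypothetical.

```python
import math
import random


def response_probability(level):
    """Hypothetical logistic response curve with 50% response at level 5."""
    return 1.0 / (1.0 + math.exp(-(level - 5.0)))


def up_and_down(n_trials, start, lowest, highest, rng):
    """Simulate the classical rule: move one level down after a positive
    response and one level up after a negative one, staying within
    [lowest, highest]."""
    levels = [start]
    for _ in range(n_trials - 1):
        current = levels[-1]
        positive = rng.random() < response_probability(current)
        step = -1 if positive else 1
        levels.append(min(highest, max(lowest, current + step)))
    return levels


rng = random.Random(0)  # fixed seed so the run is reproducible
levels = up_and_down(200, start=1, lowest=0, highest=10, rng=rng)

# The most frequently visited level is a crude estimate of the target.
mode = max(set(levels), key=levels.count)
print(mode)
```

The sequence of visited levels is a random walk that drifts towards the target level, which is why the visit frequencies can be used for estimation.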

Other approaches for identifying a target dose, alternative to the nonparametric U&D, are the parametric Continual Reassessment Method introduced by O'Quigley et al. (1990), and several recent modifications thereof. The debate on dose escalation procedures in the recent statistical literature continues to be very lively.
15/10/2009 5:30 PM

M203

W. Bergsma London School of Economics

Marginal models for dependent, clustered and longitudinal categorical data

Seminar series:

Statistics Seminar

In the social, behavioral, educational, economic, and biomedical sciences, data are often collected in ways that introduce dependencies in the observations to be compared. For example, the same respondents are interviewed at several occasions, several members of networks or groups are interviewed within the same survey, or, within families, both children and parents are investigated. Statistical methods that take the dependencies in the data into account must then be used, e.g., when observations at time one and time two are compared in longitudinal studies. At present, researchers almost automatically turn to multilevel models or to GEE estimation to deal with these dependencies. Despite the enormous potential and applicability of these recent developments, they require restrictive assumptions on the nature of the dependencies in the data.

The marginal models of this talk provide another way of dealing with these dependencies, without the need for such assumptions, and can be used to answer research questions directly at the intended marginal level. The maximum likelihood method, with its attractive statistical properties, is used for fitting the models. This talk is based on a recent book by the authors in the Springer series Statistics for the Social Sciences, see www.cmm.st.
20/03/2024 2:00 PM

MB-503

Prof. Ioanna Manolopoulou (UCL)

Combining observational data with non-representative randomised data in heterogeneous treatment effect modelling

Building statistical models using non-randomly sampled data is a well-known challenge in statistics, and is especially challenging when any part of the statistical model is not fully identifiable. In causal inference, and in particular in the estimation of heterogeneous treatment effects, this arises when observational data are used which may be affected by unobserved confounding. One approach to correct for such confounding is to combine observational data with randomised experiments. However, when these randomised experiments are not representative of the whole population, the effect of de-confounding will be poor for subsets of the population that fall outside the range of these experiments. Depending on the structure of the model and the nature of the prior distributions used within a Bayesian model, this will be addressed by borrowing information from other parts of the space. In this work, we highlight the importance of building models that can account for uncertainty due to unobserved confounding in regions where no de-confounding is possible. To this end, we embed a combination of randomised and observational data into Bayesian Causal Forests (BCF), and make use of adaptive modular inference to harness as much reliable information from the observational data as possible, without leading to over-confidence in regions of poor identifiability. We implement our methods on a set of simulated and real data examples.
13/03/2024 2:00 PM

MB-503

Dr Nicolo Colombo (Royal Holloway)

On training locally-adaptive Conformal Prediction

Conformal Prediction (CP) is a distribution-free and non-asymptotic uncertainty estimation method, i.e. it does not rely on assumptions on the underlying data distribution and provides finite-sample guarantees. Given any pre-trained prediction algorithm and a test sample, a CP algorithm produces a Prediction Set (PS), i.e. a subset of the label space, that is guaranteed to contain the test label with lower-bounded marginal probability. We address the problem of making the PS locally adaptive. The proposed new strategy produces PS that are marginally valid but have input-dependent sizes. The localization process is cast into a smooth minimization problem and can be solved through standard gradient methods.
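
To make the notion of a Prediction Set concrete, the following sketch implements plain split conformal prediction for regression with absolute-residual scores. It is the standard non-adaptive baseline, not the locally adaptive strategy proposed in the talk: every interval it returns has the same width. The toy model `predict`, the calibration data and the function names are all hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores:
    the ceil((n + 1) * (1 - alpha))-th smallest score, or infinity when
    the calibration set is too small for the requested level."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


def prediction_interval(predict, calibration, x_new, alpha=0.1):
    """Interval around predict(x_new) built from calibration residuals."""
    scores = [abs(y - predict(x)) for x, y in calibration]
    q = conformal_quantile(scores, alpha)
    centre = predict(x_new)
    return centre - q, centre + q


def predict(x):
    """Hypothetical pre-trained model."""
    return 2.0 * x


# Hypothetical calibration pairs (x, y) held out from training.
calibration = [(float(i), 2.0 * i + 0.05 * i) for i in range(1, 20)]

low, high = prediction_interval(predict, calibration, x_new=3.0, alpha=0.1)
print(low, high)
```

Under exchangeability of calibration and test points, an interval built this way contains the test label with probability at least 1 - alpha on average over inputs, which is the marginal guarantee the abstract refers to.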
28/02/2024 2:00 PM

MB-503

Dr Sandipan Roy (University of Bath)

Multi-Response Linear Regression Estimation Based on Low-Rank Pre-smoothing

Pre-smoothing is a technique aimed at increasing the signal-to-noise ratio in data to improve subsequent estimation and model selection in regression problems. However, pre-smoothing has thus far been limited to the univariate response regression setting. Motivated by the widespread interest in multi-response regression analysis in many scientific applications, this article proposes a technique for data pre-smoothing in this setting based on low rank approximation. We establish theoretical results on the performance of the proposed methodology, and quantify its benefit empirically in a number of simulated experiments. We also demonstrate our proposed low rank pre-smoothing technique on real data arising from the environmental sciences.
21/02/2024 2:00 PM

MB-503

Dr. Zeljko Kereta (UCL)

On improving unsupervised approaches for medical image reconstruction

Deep learning-based image reconstruction approaches have demonstrated considerable success in many imaging modalities. However, their reliance on abundant high-quality paired training data remains a significant hurdle in many problem domains where such datasets are not available, for example in medical imaging. Moreover, deep learning approaches in data-scarce scenarios often fail to generalise and are prone to reconstruction artefacts in case of distributional shifts. In this talk we present an unsupervised/self-supervised deep learning approach aimed at addressing these challenges through a two-stage methodology. In the first stage the network is pretrained on simulated training data of ground truth images and measurements. In the second stage the parameters are fine-tuned on the target image, adapting the model to the shift in distribution. Experimental results showcase the effectiveness of our approach, revealing accelerated deployment, improved stability, and competitive performance despite limited training data.
14/02/2024 2:00 PM

MB-503

Prof. Emmanuil Georgoulis (Heriot-Watt)

hp-Version Discontinuous Galerkin Methods on Essentially Arbitrarily-Shaped Elements

I will present a recent generalisation of the popular interior-penalty discontinuous Galerkin (dG) method discretizing general classes of linear and nonlinear advection-diffusion-reaction problems to meshes comprising extremely general, essentially arbitrarily-shaped elements. In particular, our analysis allows for curved element shapes, without the use of non-linear elemental maps. The feasibility of the method relies on the definition of a suitable choice of the discontinuity-penalisation, which turns out to be explicitly dependent on the particular element shape, but essentially independent of small shape variations. A priori error bounds for the resulting method will be given, under very mild structural assumptions restricting the magnitude of the local curvature of element boundaries. I also plan to discuss briefly computer implementation aspects of the framework. Numerical experiments will also be presented throughout the talk aiming to motivate and showcase the practicality and the potential advantages of the proposed numerical framework.
31/01/2024 2:00 PM

MB-503

Dr Alex Shestopaloff (QMUL)

Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models

In multivariate time series systems, key insights can be obtained by discovering lead-lag relationships inherent in the data, which refer to the dependence between two time series shifted in time relative to one another, and which can be leveraged for the purposes of control, forecasting or clustering. We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models. Within our framework, the envisioned pipeline takes as input a set of time series, and creates an enlarged universe of extracted subsequence time series from each input time series, via a sliding window approach. This is then followed by an application of various clustering techniques, (such as k-means++ and spectral clustering), employing a variety of pairwise similarity measures, including nonlinear ones. Once the clusters have been extracted, lead-lag estimates across clusters are robustly aggregated to enhance the identification of the consistent relationships in the original universe. We establish connections to the multireference alignment problem for both the homogeneous and heterogeneous settings. Since multivariate time series are ubiquitous in a wide range of domains, we demonstrate that our method is not only able to robustly detect lead-lag relationships in financial markets, but can also yield insightful results when applied to an environmental data set.
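
The basic notion of a lead-lag relationship can be illustrated with a single pair of series. The sketch below estimates the lag as the shift that maximises the sample correlation between the two series; it is a plain cross-correlation baseline and not the clustering-driven pipeline described in the talk. The synthetic data, the seed, the true lag of 3 and the function names are all hypothetical.

```python
import random


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def estimate_lag(leader, follower, max_lag):
    """Return the shift k in [-max_lag, max_lag] that maximises the
    correlation between leader[t] and follower[t + k].
    Both series are assumed to have the same length."""
    def score(k):
        if k >= 0:
            return correlation(leader[:len(leader) - k], follower[k:])
        return correlation(leader[-k:], follower[:len(follower) + k])

    return max(range(-max_lag, max_lag + 1), key=score)


# Hypothetical data: the follower repeats the leader three steps later.
rng = random.Random(42)
leader = [rng.gauss(0.0, 1.0) for _ in range(200)]
follower = [0.0] * 3 + leader[:-3]

lag = estimate_lag(leader, follower, max_lag=10)
print(lag)
```

A positive result means the first series leads the second; swapping the arguments flips the sign.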
30/11/2023 2:00 PM

MB-503

Dr Poulami Ganguly (SMS, QMUL)

Grid-free algorithms for tomographic imaging

The inverse problem of tomographic imaging is the reconstruction of a 3D sample from 2D projection images. An estimate for the 3D reconstruction of a sample is usually obtained by discretizing the reconstruction volume using a voxel grid. This discretization may not be ideal in scenarios where additional prior knowledge is available. In this talk, we look at two applications where grid-free alternatives are advantageous: first, we look at the problem of reconstructing a nanocrystal at atomic resolution from electron microscopy images taken at a few tilt angles. We propose a grid-free algorithm that allows for continuous deviations of the atom locations. We show that this allows for a meaningful incorporation of additional prior knowledge about the system, in particular the potential energy of the configuration, and is able to resolve lattice defects in simulated data. In addition, we show how augmenting such an approach with a model for deformation allows us to propose a grid-free algorithm for tilt-series alignment in cryo-electron tomography. We compare this second approach with existing approaches for tilt-series alignment and show that we can reliably estimate marker locations and deformations without labelling markers in projection data.
07/12/2023 2:00 PM

MB-503

Dr José A. Iglesias (University of Twente)

On extremal points for some convex regularizers

Due in part to a wider acceptance of advanced convex optimization methods, nonsmooth regularization terms are now a mainstay of variational approaches in inverse problems, optimal control, and beyond. A majority of those used in practice are positively one-homogeneous, which means that they can be seen as the Minkowski or gauge functional of an infinite-dimensional convex set, the generalized unit ball associated to the regularizer.

Under compactness assumptions which are in any case required for the regularization method to be well-posed, these balls can be described as the convex hull of their extremal points. Making such a description explicit has a multitude of applications mostly revolving around sparsity, which is usually the motivation for introducing such regularization functionals in the first place. These include results showing existence of solutions that can be expressed using finitely many of these extremal points, and optimization algorithms based on such iterates, which often admit fast convergence guarantees and grid-free implementations.

In this talk we will consider this description of extremal points in some specific cases. We provide a full characterization for two infimal convolution-type functionals, the total generalized variation in one dimension and Kantorovich-Rubinstein norms in spaces of signed measures in Euclidean space, as well as some results on different variants of the total (gradient) variation.

Based on joint works with Daniel Walter, Marcello Carioni, Giacomo Cristinelli and Kristian Bredies.
16/11/2023 2:00 PM

MB-503

Dr Natalia Efremova (SBM, QMUL)

Leveraging Machine Learning and Earth Observation Data for Sustainable Agriculture

Agriculture is both one of the sectors most susceptible to climate change and a significant contributor to it. Therefore, it is essential to consider both mitigation and adaptation strategies, as well as transforming agricultural practices to promote sustainability and resilience in the agricultural sector. A key objective of applying artificial intelligence (AI) and satellite imagery in agricultural settings is to develop more reliable and scalable methods for monitoring global crop conditions promptly and transparently, while also exploring how we can adapt agriculture to mitigate the effects of climate change. Agricultural monitoring with earth observation data provides a timely and reliable way to assess the state of the field or farm and the surrounding territories, used for gathering data and producing forecasts. Computer vision and signal processing techniques play a crucial role in extracting meaningful information from raw satellite data. Growing adoption of AI and machine learning (ML) tools has significantly influenced the expansion of Earth Observation (EO) and remote sensing to agricultural management. In this talk, I will discuss advanced techniques employed throughout the entire data processing cycle, encompassing tasks such as data compression, transmission, image recognition, and forecasting environmental factors like land cover, land use, biomass, organic soil carbon, soil nutrients, and more.
09/11/2023 2:00 PM

MB-503

Dr Swati Chandna (Birkbeck, University of London)

Nonparametric modeling and estimation for network data

Network data are commonly observed in a wide variety of applications. Such data may arise in the form of a single network observed at a given point in time, or as multiple networks on the same set of nodes, for example, social networks on the same set of individuals over time, or from different social platforms at a given point in time. A nonparametric approach to studying structure in unlabeled networks is offered by the graphon function. There has been a growing interest in the problem of graphon estimation as well as its application to important problems such as bootstrapping networks and estimation of missing links. In this talk, I will present results on graphon estimation from a single network observed with node covariates and a natural extension of the graphon model to the bivariate setting where a pair of possibly correlated networks on the same set of nodes are observed.
23/11/2023 2:00 PM

MB-503

Dr Angelica Aviles-Rivero (University of Cambridge)

Functionals, Neural Nets, and Beyond: On Multi-Modal Graph Learning and Implicit Neural Representations

In this talk, we delve into two pivotal subjects. The first topic revolves around the development of hybrid graph models tailored to the complexities of multi-modal data. We present a novel semi-supervised hypergraph learning framework, specifically designed for diagnostic purposes. Our approach adopts a hybrid perspective, where we introduce a new methodology centered on a dual embedding strategy and a semi-explicit flow. To illustrate the efficacy of our proposed model, we employ it within the realm of Alzheimer's disease diagnosis, demonstrating its capacity to uncover latent relationships within intricate multi-modal data.

Transitioning seamlessly to the second subject, we delve into implicit neural representations. We introduce an innovative function designed to harness the strengths of Strong Spatial and Frequency attributes, marking a departure from conventional methods. Remarkably, our novel technique showcases exceptional enhancements in performance across a diverse array of downstream tasks, notably encompassing CT reconstruction and denoising applications. Through rigorous experimentation, we elucidate the advantages enabled by our approach.
26/10/2023 2:00 PM

MB-503

Dr Martin Benning (QMUL)

A lifted Bregman formulation for the inversion of deep neural networks

We propose a novel framework for the regularized inversion of deep neural networks. The framework is based on recent work on training feed-forward neural networks without the differentiation of activation functions. The framework lifts the parameter space into a higher dimensional space by introducing auxiliary variables, and penalizes these variables with tailored Bregman distances. We propose a family of variational regularizations based on these Bregman distances, present theoretical results and support their practical application with numerical examples. In particular, we present the first convergence result (to the best of our knowledge) for the regularized inversion of a single-layer perceptron that only assumes that the solution of the inverse problem is in the range of the regularization operator, and that shows that the regularized inverse provably converges to the true inverse if measurement errors converge to zero. This is joint work with Xiaoyu Wang from Heriot-Watt University.
12/10/2023 2:00 PM

MB-503

Dr Yury Korolev (University of Bath)

Vector-valued Barron spaces

Approximation properties of infinitely wide neural networks have been studied by several authors in the last few years. New function spaces have been introduced that consist of functions that can be efficiently (i.e., with dimension-independent rates) approximated by neural networks of finite width, e.g. Barron spaces for networks with a single hidden layer. These functions usually act between Euclidean spaces, typically with a high-dimensional input space and a lower-dimensional output space. As neural networks gain popularity in inherently infinite-dimensional settings such as inverse problems and imaging, it becomes necessary to analyse the properties of neural networks as nonlinear operators acting between infinite-dimensional spaces. In this talk, I will discuss a generalisation of Barron spaces to functions that map between Banach spaces and present Monte-Carlo (1/sqrt(n)) approximation rates.
05/10/2023 2:00 PM

MB-503

Dr Kolyan Ray (Imperial College London)

A variational Bayes approach to debiased inference in high-dimensional linear regression

We consider statistical inference for a single coordinate of a high-dimensional parameter in sparse linear regression. It is well-known that high-dimensional procedures such as the LASSO can provide biased estimators for this problem and thus require debiasing. Motivated by recent theoretical advances on debiased Bayesian inference, we propose a scalable variational Bayes approach to this problem. We investigate the numerical performance of this algorithm and establish accompanying theoretical guarantees for estimation and uncertainty quantification. Joint work with Ismael Castillo, Alice L’Huillier and Luke Travis.
28/09/2023 2:00 PM

MB-503

Dr Adam Sykulski (Imperial College London)

Spatiotemporal Statistical Modelling of Ocean Data

This talk will study spatiotemporal data collected from “drifters”, which are instruments designed to freely float around our ocean, mimicking particles of water. While the focus of the talk is on oceanography, this form of data is ubiquitous, for example the spatiotemporal data collected from wearable devices (e.g. medical wristwatches), so much of the methodology presented can translate to other applications. The focus is on data-driven statistical solutions, the presentation will not be too technical, and no prerequisite knowledge of oceanography is expected from the audience!
MB-503

Pierre Miasnikof (University of Toronto)

Two statistical techniques for graph structure assessment (complex networks)

I will present two statistical techniques that were specifically designed to address problems in network analysis. The first is a statistical algorithm to determine if a network meets the prerequisite conditions to be meaningfully summarized through clusters. Clustering algorithms will always identify clusters. Unfortunately, if a network does not possess a clustered structure, the (node) clustering exercise will not only be a waste of time, it will inevitably result in misleading conclusions. The second technique is a statistical routine that seeks to answer the question "is network G1 similar to network G2?". To answer this question, we transform the graph into a probability distribution and use a standard Kolmogorov-Smirnov test.
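As a rough illustration of the second technique, the sketch below compares two small graphs through a two-sample Kolmogorov-Smirnov statistic. The abstract does not say which probability distribution the graph is mapped to; using the empirical degree distribution is an assumption made purely for illustration, as are the helper names `degrees` and `ks_statistic` and the toy graphs. Only the test statistic is computed; turning it into a p-value is left out.

```python
from bisect import bisect_right

def degrees(edges, n):
    """Degree sequence of an undirected graph on nodes 0..n-1."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

# Hypothetical toy graphs on 5 nodes: a path and a star.
path = [(0, 1), (1, 2), (2, 3), (3, 4)]
star = [(0, 1), (0, 2), (0, 3), (0, 4)]

d_same = ks_statistic(degrees(path, 5), degrees(path, 5))
d_diff = ks_statistic(degrees(path, 5), degrees(star, 5))
print(d_same, d_diff)
```

A graph compared with itself gives a statistic of zero, while the path and the star differ clearly in their degree distributions. Degree data are discrete and tied, so a standard KS p-value would only be approximate here.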
12/04/2023 11:00 AM

MB-503

Jim Griffin (UCL)

Bayesian vector autoregressions with tensor decompositions

Vector autoregressions (VARs) are popular in analyzing economic time series. However, VARs can be over-parameterized if the numbers of variables and lags are moderately large. Tensor VAR, a recent solution to over-parameterization, treats the coefficient matrix as a third-order tensor and estimates the corresponding tensor decomposition to achieve parsimony. In this paper, the inference of Tensor VARs is inspired by the literature on factor models. Firstly, we determine the rank by imposing the Multiplicative Gamma Prior on the margins, i.e. elements in the decomposition, and accelerate the computation with an adaptive inferential scheme. Secondly, to obtain interpretable margins, we propose an interweaving algorithm to improve the mixing of margins and introduce a post-processing procedure to solve column permutation and sign-switching issues. In an application to US macroeconomic data, our models outperform standard VARs in point and density forecasting and yield interpretable results consistent with US economic history.
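A back-of-the-envelope count may help show where the parsimony comes from. The sketch below assumes a CP (rank-R) decomposition of the N x N x P coefficient tensor, which is one common choice and not necessarily the one used in the talk; the function names and the numbers are illustrative only.

```python
def var_coefficients(n_vars, n_lags):
    """Autoregressive coefficients in an unrestricted VAR (intercepts ignored)."""
    return n_vars * n_vars * n_lags

def cp_parameters(n_vars, n_lags, rank):
    """Entries in the three factor matrices of a rank-`rank` CP decomposition."""
    return rank * (n_vars + n_vars + n_lags)

full = var_coefficients(20, 4)       # 20 variables, 4 lags
low_rank = cp_parameters(20, 4, 3)   # same model, rank 3
print(full, low_rank)
```

With 20 variables and 4 lags, the unrestricted model has 1600 coefficients, while the rank-3 factorisation has 132 entries.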
05/04/2023 11:00 AM

MB-503

Eftychia Solea (QMUL)

High-dimensional Nonparametric Functional Graphical Models via the Functional Additive Partial Correlation Operator

This article develops a novel approach for estimating a high-dimensional and nonparametric graphical model for functional data. Our approach is built on a new linear operator, the functional additive partial correlation operator, which extends the partial correlation matrix to both the nonparametric and functional settings. We show that its nonzero elements can be used to characterize the graph, and we employ sparse regression techniques for graph estimation. Moreover, the method does not rely on any distributional assumptions and does not require the computation of multi-dimensional kernels, thus avoiding the curse of dimensionality. We establish both estimation consistency and graph selection consistency of the proposed estimator, while allowing the number of nodes to grow with the increasing sample size. Through simulation studies, we demonstrate that our method performs better than existing methods in cases where the Gaussian or Gaussian copula assumption does not hold. We also demonstrate the performance of the proposed method by a study of an electroencephalography data set to construct a brain network.
22/03/2023 11:00 AM

MB-503

Eoghan O’Neill (Erasmus University Rotterdam)

Type 1 and Type 2 Tobit Bayesian Additive Regression Tree Models

This paper introduces Type I and Type II Tobit Bayesian Additive Regression Trees (TOBART-1 and TOBART-2). Simulation results and applications to real data sets demonstrate that TOBART-1 produces more accurate predictions than competing methods, and provides posterior intervals for the conditional expectation and other quantities of interest.

TOBART-2 extends the Type II Tobit model to account for nonlinearities and model uncertainty by including sums of trees in both the selection and outcome equations. A Dirichlet Process Mixture distribution for the error term allows for departure from the assumption of bivariate normally distributed errors. Simulation studies suggest that TOBART-2 can produce more accurate treatment effect estimates than competing approaches. We illustrate the method with an application to the RAND Health Insurance Experiment.
22/02/2023 11:00 AM

MB-503

Javier Rubio Alvarez (UCL)

Flexible Excess Hazard Modelling with Applications in Cancer Epidemiology

Excess hazard modelling is one of the main tools in population-based cancer survival research. This setting allows for direct modelling of the survival due to cancer in the absence of reliable information on the cause of death, which is common in population-based cancer epidemiology studies. We propose a unifying link-based additive modelling framework for the excess hazard that allows for the inclusion of many types of covariate effects, including spatial and time-dependent effects, using any type of smoother, such as thin plate, cubic splines, tensor products and Markov random fields. Three case studies that illustrate the type of applications of interest in practice will be presented. We will conclude with a discussion on available software tools (in R), as well as a general discussion on the use of the relative survival framework.
01/03/2023 11:00 AM

MB-503

Richard Hooper (QMUL)

Optimal design of stepped wedge cluster randomised trials

Stepped wedge trials are cluster randomised clinical trials in which each “cluster” of participants (e.g. all users of a local health service) is randomised, not to one treatment condition or another, but to a particular schedule for crossing from the control condition to the intervention condition. Some clusters might cross over before data collection even begins; some might cross over at some point during the prospective data collection interval, and some might not cross over at all during that interval. In a stepped wedge trial the cross-over is always unidirectional. You can cross from control to intervention, but never back again from intervention to control. In some stepped wedge trials participants are recruited from each cluster in one, long, consecutive stream; in others they are recruited once at the start of the trial and followed prospectively as a cohort; in still others they are sampled in a series of cross-sectional snapshots of the cluster through time. The unidirectional cross-over, the constraints on how many people you can recruit and when, and the way you model the correlation between health outcomes of individuals from the same cluster (the intra-cluster correlation), all lead to some fascinating problems in the design of experiments, with some equally fascinating solutions. These solutions are of great practical interest to applied health researchers trying to evaluate public health interventions and quality improvement programmes. In my own methods research programme I am particularly interested in two kinds of stepped wedge design: “incomplete” designs, where data collection effort is focused at particular times in particular clusters, and designs with continuous recruitment of participants. I will present some of the findings from this work.
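The schedule described above can be pictured as a cluster-by-period matrix of 0s (control) and 1s (intervention). The following sketch builds such a matrix from hypothetical cross-over periods; the function names and the five-cluster, four-period layout are assumptions for illustration, not a design from the talk.

```python
def stepped_wedge_schedule(crossover_periods, n_periods):
    """One row per cluster, one column per period; 0 = control, 1 = intervention."""
    return [[1 if t >= c else 0 for t in range(n_periods)]
            for c in crossover_periods]

def is_unidirectional(row):
    """True if the cluster never moves back from intervention to control."""
    return all(a <= b for a, b in zip(row, row[1:]))

# Cross-over period 0 means the cluster crossed before data collection began;
# a cross-over period equal to n_periods means it never crosses in the interval.
schedule = stepped_wedge_schedule([0, 1, 2, 3, 4], n_periods=4)
for row in schedule:
    print(row)
```

Each row is non-decreasing, which is the unidirectional cross-over constraint. An "incomplete" design would additionally mark some cluster-period cells as not observed.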
15/02/2023 11:00 AM

MB-503

Yanbo Tang (Imperial College London)

Adaptive Quadrature for Bayesian Inference

Adaptive numerical quadrature is used to normalize posterior distributions in many Bayesian models. We provide the first stochastic convergence rate for the error incurred when normalizing a posterior distribution under typical regularity conditions. We give approximations to moments, marginal densities, and quantiles, and provide convergence rates for several of these summaries. Low- and high-dimensional applications are presented, the latter using adaptive quadrature as one component of a more sophisticated approximation framework, for which limited theory is given. Extension of the theory to the high-dimensional framework for the Laplace approximation (a specific instance of an adaptive quadrature method) is considered and guarantees are provided under additional regularity assumptions.
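To make the idea concrete, here is a minimal sketch that normalizes a one-dimensional posterior with an adaptive Gauss-Hermite rule, assuming that is the kind of adaptive quadrature meant (the abstract only states that the Laplace approximation is one instance). The toy target, a Gamma(a, b) kernel written in theta = log(lambda), is chosen because its normalizing constant Gamma(a) / b^a is known exactly. The 5-point nodes and weights are hard-coded standard values, and all names are hypothetical.

```python
import math

# Physicists' 5-point Gauss-Hermite rule: (node, weight) pairs for the
# weight function exp(-x^2). Hard-coded to keep the example stdlib-only.
GH5 = [
    (-2.020182870456086, 0.019953242059046),
    (-0.958572464613819, 0.393619323152241),
    (0.0, 0.945308720482942),
    (0.958572464613819, 0.393619323152241),
    (2.020182870456086, 0.019953242059046),
]

a, b = 5.0, 2.0

def log_post(theta):
    """Unnormalized log posterior: Gamma(a, b) kernel in theta = log(lambda)."""
    return a * theta - b * math.exp(theta)

mode = math.log(a / b)        # solves d/dtheta log_post(theta) = 0
sigma = 1.0 / math.sqrt(a)    # (-second derivative at the mode) ** -0.5

# Laplace approximation: the one-point version of the adaptive rule.
laplace = math.exp(log_post(mode)) * math.sqrt(2.0 * math.pi) * sigma

# Adaptive Gauss-Hermite: centre the nodes at the mode and scale by sigma.
aghq = math.sqrt(2.0) * sigma * sum(
    w * math.exp(x * x) * math.exp(log_post(mode + math.sqrt(2.0) * sigma * x))
    for x, w in GH5
)

exact = math.gamma(a) / b ** a   # true normalizing constant
print(laplace, aghq, exact)
```

In this toy case the 5-point rule lands closer to the exact constant than the Laplace approximation does.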
09/02/2023 2:30 PM

MB-503

Michael Pitt (KCL)

On some properties of Markov chain Monte Carlo simulation methods when the likelihood is intractable

Markov chain Monte Carlo samplers still converge to the correct posterior distribution of the model parameters when an unbiased estimator is available for the likelihood. Whilst this allows inference for a very wide variety of intractable problems, a critical issue for performance is the choice of the number of particles (or samples).

We add the following contributions. We provide analytically derived, practical guidelines on the optimal number of particles to use in general scenarios. We show that the results in the article apply more generally to Markov chain Monte Carlo sampling schemes with the likelihood estimated in an unbiased manner. We introduce recent results on the asymptotic limits as T (the length of the time series) becomes large. Applications include Stochastic Volatility models for which the volatility follows a stochastic differential equation.
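The mechanism can be sketched with a toy pseudo-marginal Metropolis-Hastings sampler. The model below (latent x ~ N(theta, 1), observation y | x ~ N(x, 1), flat prior on theta) is an assumption chosen so that the exact posterior, N(y, 2), is known; the likelihood is replaced by a simple Monte Carlo average over `n_particles` draws and is not the particle filter a time-series model would need. All names and tuning values are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def likelihood_estimate(theta, y, n_particles, rng):
    """Unbiased Monte Carlo estimate of p(y | theta)."""
    total = 0.0
    for _ in range(n_particles):
        x = rng.gauss(theta, 1.0)        # latent draw x ~ N(theta, 1)
        total += normal_pdf(y, x, 1.0)   # density of y given x
    return total / n_particles

def pseudo_marginal_mh(y, n_iter, n_particles, step, seed):
    rng = random.Random(seed)
    theta = y
    lik = likelihood_estimate(theta, y, n_particles, rng)
    chain = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)
        lik_prop = likelihood_estimate(prop, y, n_particles, rng)
        # Flat prior and symmetric proposal: accept with probability
        # min(1, lik_prop / lik), using the estimates in place of the likelihood.
        if rng.random() * lik < lik_prop:
            theta, lik = prop, lik_prop  # the estimate is stored with the state
        chain.append(theta)
    return chain

chain = pseudo_marginal_mh(y=1.5, n_iter=5000, n_particles=20, step=2.0, seed=1)
posterior_mean = sum(chain) / len(chain)
print(posterior_mean)
```

The key detail is that the likelihood estimate for the current state is kept and reused rather than recomputed. Fewer particles make each iteration cheaper but the estimate noisier, which is the trade-off the guidelines in the talk address.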
26/01/2023 11:30 AM

MB-503

Nicola Perra (QMUL)

Modelling the spreading of SARS-CoV-2 across different spatio-temporal scales

In the talk, I will provide an overview of different approaches I have applied to model the unfolding of the COVID-19 pandemic and its effects. In doing so, I will discuss the insights obtained by studying the initial phases of the pandemic, the first wave, and the vaccine rollout in the USA, Europe as well as Latin America. I will also discuss the key role of non-pharmaceutical interventions.
02/02/2023 11:00 AM

MB-503

Dan Zhu (Monash University)

Distribution Vector Autoregression: Eliciting Macro and Financial Dependence

Vector autoregression is an essential tool in empirical macroeconomics and finance, providing simple yet insightful information, such as the impulse response function of different shocks. This paper extends the scope of vector autoregression under a multivariate distribution regression framework and proposes the distribution impulse response function, which provides a more comprehensive picture of the dynamic heterogeneity. As an empirical application, we apply the proposed method to study the conditional joint distribution of GDP growth rate and financial conditions in the U.S. The results from our new framework confirm some existing findings in the literature: 1) the tight financial condition creates multimodality in the conditional joint distribution, and 2) restricting the upper tail of financial condition has a noticeable impact on long-term GDP growth. Yet, the extracted information on the effect of restricting the lower tail of GDP during the global financial crisis suggests an alternative conclusion, i.e., negligible impact on financial condition.
22/11/2022 11:00 AM

MB-503

Uzu Lim (Oxford)

Tangent space and dimension estimation of data manifold

Consider a set of points sampled independently near a smooth compact submanifold of Euclidean space. We provide mathematically rigorous bounds on the number of sample points required to estimate both the dimension and the tangent spaces of that manifold with high confidence. The algorithm for this estimation is Local PCA, a local version of principal component analysis. Our results accommodate noisy, non-uniform data distributions with noise that may vary across the manifold, and allow simultaneous estimation at multiple points. Crucially, all of the constants appearing in our bound are explicitly described. The proof uses a matrix concentration inequality to estimate covariance matrices and a Wasserstein distance bound for quantifying nonlinearity of the underlying manifold and non-uniformity of the probability measure.
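A minimal sketch of Local PCA in the plane follows: points are sampled near the unit circle (a one-dimensional manifold), the covariance of the neighbours of a base point is diagonalised, and the dimension is read off from the eigenvalues. The neighbourhood radius, the eigenvalue-ratio threshold and the function names are illustrative assumptions and do not reflect the explicit constants derived in the talk.

```python
import math
import random

def local_pca_2d(points, center, radius):
    """Eigenvalues (descending) and top eigenvector of the local covariance."""
    nbrs = [p for p in points if math.dist(p, center) <= radius]
    mx = sum(p[0] for p in nbrs) / len(nbrs)
    my = sum(p[1] for p in nbrs) / len(nbrs)
    sxx = sum((p[0] - mx) ** 2 for p in nbrs) / len(nbrs)
    syy = sum((p[1] - my) ** 2 for p in nbrs) / len(nbrs)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in nbrs) / len(nbrs)
    # Closed-form eigen-decomposition of a symmetric 2x2 matrix.
    half_trace = 0.5 * (sxx + syy)
    disc = math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy ** 2)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # direction of largest variance
    return (half_trace + disc, half_trace - disc), (math.cos(angle), math.sin(angle))

def estimate_dimension(eigenvalues, ratio=0.1):
    """Count eigenvalues that are at least `ratio` times the largest one."""
    top = eigenvalues[0]
    return sum(1 for lam in eigenvalues if lam >= ratio * top)

rng = random.Random(0)
points = []
for _ in range(2000):
    t = rng.uniform(0.0, 2.0 * math.pi)
    r = 1.0 + rng.gauss(0.0, 0.01)   # small radial noise around the circle
    points.append((r * math.cos(t), r * math.sin(t)))

eigenvalues, tangent = local_pca_2d(points, center=(1.0, 0.0), radius=0.3)
dimension = estimate_dimension(eigenvalues)
print(dimension, tangent)
```

At the base point (1, 0) the true tangent direction is vertical, so the top eigenvector should be close to (0, 1) up to sign.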
06/12/2022 11:00 AM

MB-503

John Baez (UCR, CQT NUS and Topos Institute)

Information Theory in Population Dynamics

Information theory has interesting connections to the population dynamics of self-replicating entities. The relevant concept of information turns out to be the information of one probability distribution relative to another, also known as the Kullback–Leibler divergence. Using this we can get a new outlook on free energy, see evolution as a learning process, and give a clearer, more general formulation of Fisher's fundamental theorem of natural selection.
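For readers unfamiliar with relative information, the short sketch below computes it for two discrete distributions; the example distributions and the function name are illustrative and not taken from the talk.

```python
import math

def kl_divergence(p, q):
    """Information of distribution p relative to q, in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]

print(kl_divergence(p, p))  # zero: a distribution carries no information relative to itself
print(kl_divergence(p, q))
print(kl_divergence(q, p))  # differs from the previous line: the quantity is not symmetric
```

The quantity is non-negative and vanishes only when the two distributions agree, which is what lets it act as a measure of how far a population is from a reference state.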
29/11/2022 11:00 AM

MB-503

Kostas Papafitsoros (QMUL)

Automatic Distributed Parameter Selection of Regularisation Functionals in Imaging via Bilevel Optimisation

We will discuss a series of bilevel optimisation problems that use a suitable statistics-based upper level objective and lead to automatic selection of spatially dependent parameters for regularisation functionals used in image reconstruction. The spatial dependence of the parameters generally leads to a better recovery of high-detailed areas in the reconstructed image. We will introduce the framework by considering initially as a regulariser the weighted Total Variation, and subsequently discuss its artifact-free, higher order extension, weighted Total Generalised Variation. We will then present some recent results regarding extension of the framework to regularisation functionals that involve a more general class of differential operators. The applicability of this extension will be demonstrated with numerical results in image denoising for a Huber Total Variation functional where also the underlying Huber parameter is chosen to be spatially dependent. This provides further flexibility in the regularisation process and eventually results in an improved reconstruction quality.
15/11/2022 11:00 AM

MB-503

Kristian Strommen (University of Oxford)

A topological perspective on weather regimes

It has long been suggested that the mid-latitude atmospheric circulation possesses what has come to be known as "weather regimes", which can roughly be categorised as regions of phase space with above-average density. Their existence and behaviour have been extensively studied in meteorology and climate science, due to their potential for drastically simplifying the complex and chaotic mid-latitude dynamics. Several well-known, simple non-linear dynamical systems have been used as toy-models of the atmosphere in order to understand and exemplify such regime behaviour. Nevertheless, no agreed-upon and clear-cut definition of a "regime" exists in the literature, and unambiguously detecting their existence in the atmospheric circulation is often hindered by the high dimensionality of the system.

In this talk I will first give an overview of some of the approaches used to study and define weather regimes. I will then proceed to propose a definition of weather regime that equates the existence of regimes in a dynamical system with the existence of non-trivial topological structure of the system's attractor. I will discuss how this approach is computationally tractable, practically informative, and identifies the relevant regime structure across a range of examples. This talk is based on the paper https://doi.org/10.1007/s00382-022-06395-x
01/11/2022 11:00 AM

MB-503

Vitaliy Kurlin (University of Liverpool)

Geometric Data Science: old challenges and new solutions

Geometric Data Science develops continuous parameterizations
on moduli spaces of data objects up to important equivalences. The key
example is a finite or periodic set of unlabeled points considered up
to rigid motion or isometry preserving inter-point distances. Periodic
point sets model all solid crystalline materials (periodic crystals)
with zero-size points at all atomic centers. A periodic point set is
usually given by a finite motif of points (atoms or ions) in a unit
cell (parallelepiped) spanned by a linear basis. The underlying
lattice can be generated by infinitely many bases. Even worse, the set
of possible motifs for any periodic set is continuously infinite.

This typical ambiguity of data representation was recently resolved by
generically complete and continuous isometry invariants: Pointwise
Distance Distributions (PDD) of periodic point sets. The near-linear
time algorithm for PDD invariants was tested on more than 200 billion
pairwise comparisons of all 660K+ periodic crystals in the world's
largest collection of real materials: the Cambridge Structural
Database.

The huge experiment above took only two days on a modest desktop and
detected five pairs of isometric duplicates. In each pair, the
crystals are truly isometric to each other but one atom is replaced
with a different atom type, which seems physically impossible without
perturbing distances to atomic neighbors. Five journals are now
investigating the integrity of the underlying publications that
claimed these crystals.

The more important conclusion is the Crystal Isometry Principle
meaning that all real periodic crystals have unique geographic-style
locations in a common continuous Crystal Isometry Space (CRISP). This
space CRISP is parameterized by complete isometry invariants and
contains all known and not yet discovered periodic crystals.

The relevant publications are in NeurIPS 2022, MATCH 2022, SoCG 2021.

The latest paper in arXiv:2207.08502 defined complete isometry
invariants with continuous computable metrics on any finite sets of
unlabeled points in a Euclidean space. Many papers are co-authored
with colleagues at Liverpool Materials Innovation Factory and linked at
http://kurlin.org/research-papers.php#Geometric-Data-Science.
25/10/2022 11:00 AM

MB-503

Hugo Maruri-Aguilar (QMUL)

Betti penalisation of Lasso

This talk is concerned with an enhancement of Lasso for polynomial
regression models. Our polynomial regression uses squarefree
hierarchical models, and these models can be seen as a simplicial
complex. We propose a compound criterion that combines
validation error with a measure of model complexity, and the measure
of model complexity is a sum of Betti numbers of the model.

The compound criterion helps model selection in polynomial regression models
containing higher-order interactions. Simulation results and a real data
example show that the compound criterion produces sparser models with lower
prediction errors than other statistical methods.

As part of the talk, I will briefly mention the history of this project, which
I believe is worth looking at.

This is joint work with S. Hu (Alibaba) and Z. Ma (Huawei)
11/10/2022 11:00 AM

MB-503

Celeste Damiani (QMUL)

MammoAI: An AI System for Risk Assessment at Mammography Screening

At the moment, in the UK the majority of women go through the same breast cancer screening programme, but different women have different levels of risk of getting breast cancer.

We look at how we can assess risk using mammograms to enable new breast cancer screening programmes that are more suited to the level of risk faced by each woman. In particular:
- How can we tell when a woman might be at risk of getting a false negative during a standard mammogram, and should be offered a supplemental screening method?
- How do we assess the risk of future cancer after a negative screen?
- How can we use TDA tools for risk assessment on mammograms?
This work is part of the CRUK funded project (reference 49757/A28689) "An Artificial Intelligence System for Real-time Risk Assessment at Mammography Screening (Mammo AI)".
04/10/2022 11:00 AM

MB-503

Matteo Iacopini (QMUL)

Bayesian Additive Regression Trees for Rank-Order Data

Rank-ordered data are popular in many fields, including sports, marketing, finance, politics, and health economics. Most of the existing approaches rely on the restrictive assumption of a linear specification for the latent scores that drive the observed ranks. Besides, despite being provided over time by one or multiple rankers, the temporal dimension and properties of these orderings have been rarely investigated in the literature. To deal with these issues, we introduce two novel families of nonparametric order-statistics models that consider a static (ROBART) and an autoregressive (ARROBART) process for the latent scores and allow for a nonlinear impact of each covariate on the latent scores. This is achieved by modeling the regression function via a Bayesian additive regression tree (BART), which defines the overall fit as the sum of the fits of many small regression trees. As generalizations of the Thurstone family, the proposed ROBART and ARROBART models preserve interpretability and include several popular frameworks as special cases. Joint work with Eoghan O’Neill and Luca Rossini.
27/09/2022 11:00 AM

MB-503

Renata Turkes (University of Antwerp)

On the Effectiveness of Persistent Homology

Persistent homology (PH) is one of the most popular methods in Topological Data Analysis. Even though PH has been used in many different types of applications, the reasons behind its success remain elusive; in particular, it is not known for which classes of problems it is most effective, or to what extent it can detect geometric or topological features. The goal of this work is to identify some types of problems where PH performs well or even better than other state-of-the-art methods in data analysis. We consider three fundamental shape analysis tasks: the detection of the number of holes, curvature and convexity from 2D and 3D point clouds sampled from shapes. Experiments demonstrate that PH is successful in these tasks, outperforming several baselines, including PointNet, an architecture inspired precisely by the properties of point clouds. In addition, we observe that PH remains effective for limited computational resources and limited training data, as well as out-of-distribution test data, including various data transformations and noise. For convexity detection, we provide a theoretical guarantee that PH is effective for this task, and demonstrate the detection of a convexity measure on the FLAVIA dataset of plant leaf images.

This talk is based on joint work with Guido Montufar and Nina Otter (https://arxiv.org/abs/2206.10551)

The talk will be given in person in MB-503 and will also be available remotely via Zoom.
13/04/2022 1:00 PM

via Zoom

Daniel Paulin, Edinburgh University

Efficient MCMC sampling with dimension-free convergence rate using ADMM-type splitting

Performing exact Bayesian inference for complex models is computationally intractable. Markov chain Monte Carlo (MCMC) algorithms can provide reliable approximations of the posterior distribution but are expensive for large data sets and high-dimensional models. A standard approach to mitigate this complexity consists in using subsampling techniques or distributing the data across a cluster. However, these approaches are typically unreliable in high-dimensional scenarios. We focus here on a recent alternative class of MCMC schemes exploiting a splitting strategy akin to the one used by the celebrated alternating direction method of multipliers (ADMM) optimization algorithm. These methods appear to provide empirically state-of-the-art performance but their theoretical behavior in high dimension is currently unknown. In this paper, we propose a detailed theoretical study of one of these algorithms known as the split Gibbs sampler. Under regularity conditions, we establish explicit convergence rates for this scheme using Ricci curvature and coupling ideas. We support our theory with numerical illustrations. This is joint work with Maxime Vono (Criteo AI Lab) and Arnaud Doucet (Oxford).

06/04/2022 1:00 PM

via Zoom

Axel Finke

Conditional sequential Monte Carlo in high dimensions

We discuss Markov chain Monte Carlo methods called "iterated conditional sequential Monte Carlo" a.k.a. "particle Gibbs samplers". These methods can be used to approximate the joint distribution of all latent states in state-space models. We show that these methods suffer a curse of dimension. We then introduce a novel modification of this method which employs local, random-walk type moves to circumvent this curse of dimension.

08/03/2022 2:00 PM

MB204 in person + via Zoom

Peter Bubenik

Homotopy, Homology, and Persistent Homology using Cech’s Closure Spaces

We use Cech closure spaces, also known as pretopological spaces, to develop a uniform framework that encompasses the discrete homology of metric spaces, the singular homology of topological spaces, and the homology of (directed) clique complexes, along with their respective homotopy theories. We obtain nine homology and six homotopy theories of closure spaces. We show how metric spaces and more general structures such as weighted directed graphs produce filtered closure spaces. For filtered closure spaces, our homology theories produce persistence modules. We extend the definition of Gromov-Hausdorff distance to filtered closure spaces and use it to prove that our persistence modules and their persistence diagrams are stable. We also extend the definitions of Vietoris-Rips and Cech complexes to closure spaces and prove that their persistent homology is stable.

This is joint work with Nikola Milicevic.

23/02/2022 1:00 PM

via Zoom

Hugo Maruri-Aguilar

Echelon designs, Hilbert series and Smolyak grids

Echelon designs were first described in the monograph by Pistone et al. (2000). These designs are defined for continuous factors and include, amongst others, factorial designs. They have the appealing property that the saturated polynomial model associated with such a design mirrors the geometric configuration of the design. Perhaps surprisingly, the interpolators for such designs are based upon the Hilbert series of the monomial ideal associated with the polynomial model and thus the interpolators satisfy properties of inclusion-exclusion.

Echelon designs are quite flexible for modelling and include the recently developed designs known as Smolyak sparse grids. In our talk we present the designs, describe their properties and show examples of application.

This is joint work with H. Wynn (CATS, LSE).

Reference: Pistone et al. (2000) Algebraic Statistics. Chapman & Hall/CRC

Key words: Sparse grids, experimental design, algebraic statistics, polynomial models.

Zoom link
16/02/2022 1:00 PM

via Zoom

Quan Zhou (Texas A&M)

Informed MCMC sampling for high-dimensional model selection problems

Informed Markov chain Monte Carlo (MCMC) methods have been proposed as scalable solutions to Bayesian posterior computation on high-dimensional discrete state spaces, but theoretical results about their convergence behavior in general settings are lacking. In this talk, we first consider the variable selection problem. We propose a novel informed Metropolis-Hastings algorithm which can achieve a mixing rate that is independent of the number of covariates, under mild high-dimensional conditions. The mixing time proof relies on a novel method called "two-stage drift condition". This result shows that the mixing rate of locally informed MCMC methods can be fast enough to offset the computational cost of local posterior evaluation, and thus such methods scale well to high-dimensional data. Second, we consider MCMC sampling on general finite state spaces. We propose a class of methods called informed importance tempering (IIT) and develop generally applicable spectral gap bounds that characterize the convergence rate of IIT. Our theory provides important insights into how to choose the proposal weighting scheme for an informed MCMC method. If time permits, we will also briefly discuss the application of our theory to the high-dimensional structure learning problem. This talk is based on joint works with A. Smith, H. Chang, J. Yang, D. Vats, G. Roberts and J. Rosenthal.

Zoom link
15/03/2018 4:15 PM

W316, Queens' Building

E. Y. Wang, Wolfson Institute of Population Health

Design of multi-arm, multi-stage clinical trials with dynamic controls

Often, for a given patient population, there will be more than one treatment available for testing at the Phase III stage. Rather than conducting separate randomised controlled trials for each of these treatments (which could require prohibitively high numbers of patients), this study proposes a multi-arm trial assessing the performance of all the available treatments.

A surrogate biomarker/endpoint will be used to judge the performance of the treatments at interims, where, if a treatment under-performs with regard to the surrogate biomarker compared to the control, it will be removed and recruitment instead given to a new promising treatment.

I explored trials of this design which continue over a long period of time, comparing the power of a trial of this design with several consecutive parallel Phase III trials, to explore which design fared best in terms of type I error, power, survival of patients on the trials and long-term survival of patients with the given disease.
04/05/2017 4:30 PM

Queens' W316

J. M. Cuzick, Wolfson Institute of Population Health

Use of frailty models in medical statistics

Frailty models are typically used when there are unobserved covariates. Here we explore their use in two situations where they provide important insights into two epidemiologic questions.

The first involves the question of type replacement after vaccination against the human papilloma virus (HPV). At least 13 types of HPV are known to cause cancer, especially cervical cancer. Recently vaccines have been developed against some of the more important types, notably types 16 and 18. These vaccines have been shown to prevent infection by the types used with almost 100% efficacy. However, a concern has been raised that by eliminating these more common types, a niche will be created in which other types could now flourish and that the benefits of vaccination could be less than anticipated if this were to occur. It will be years before definitive data is available on this, but preliminary evidence could be obtained if it could be shown that there is a negative association between the occurrence of multiple infections in the same individual. The virus is transmitted by sexual contact and testing for it has become part of cervical screening. As infection increases with greater sexual activity, a woman with one type is more likely to also harbour another type, so the question can be phrased as whether there is a negative association between specific pairs of types in the context of an overall positive association. A frailty model is used for this in which the total number of infections a woman has is an unobserved covariate and the question can be rephrased to ask if specific types are negatively correlated conditional on the number of types present in a woman. This is modelled by assuming a multiplicative random variable τ having a log gamma distribution with unit mean and one additional parameter θ, so that the occurrence of type j in individual i is modified to be τ_i p_j, where the τ_i are iid copies of τ, and the joint probability of being infected by types 1,…,k is Θ_k p_1 ⋯ p_k with Θ_k = E(τ^k). A likelihood is obtained and moment based estimation procedures are developed and applied to a large data set.

A second example pertains to an extension of the widely used proportional hazards model for analysing time to event data with censoring. In practice hazards are often not proportional over time and converging hazards are observed, and the effect of a covariate is stronger in early follow up than it is subsequently. This can be modelled by assuming an unobserved multiplicative factor in the hazard function again having a log gamma distribution with unit mean and one additional parameter θ. Integrating out this term leads to a Pareto survival distribution. A (partial) likelihood is obtained and estimation procedures are developed and applied to a large data set.
08/12/2016 4:30 PM

BR 3.02

C. Wang, Wolfson Institute of Population Health

Spatial analysis and its application in modelling cancer screening coverage in England

One problem that arises from spatial data is that spatial correlation often exists among the observations, since spatial units close to each other are likely to share similar socio-economic, infrastructure or other characteristics. Statistical models that ignore spatial correlation may lead to biased parameter estimates. In the econometrics literature, there are several methods to measure and model such spatially correlated effects. We demonstrate some of these statistical tools using real-world data by exploring factors affecting cancer screening coverage in England. In this particular study, we are interested in the impact of car ownership and public transport usage on breast and cervical cancer screening coverage. District-level cancer screening coverage data (in proportions) and UK census data have been collected and linked.

A non-spatial model (using ordinary least squares, OLS) was first fitted, and Moran's I statistic showed that significant spatial correlation exists even after controlling for a range of predictors. Two alternative spatial models were then tested, namely: 1) the spatial autoregressive (SAR) model, and 2) the SAR error model, or simply the spatial error model (SEM).
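
As an illustration of the diagnostic mentioned above, here is a minimal sketch of Moran's I computed on model residuals. The function name `morans_i`, the toy residuals and the four-district adjacency matrix are assumptions made for this example only; they are not taken from the study.

```python
def morans_i(values, weights):
    """Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2,
    where z are the mean-centred values and S0 is the sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    total = sum(zi * zi for zi in z)
    return (n / s0) * cross / total


# Hypothetical example: four districts along a line, neighbours share a border.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth_residuals = [1.0, 2.0, 3.0, 4.0]          # similar values sit next to each other
alternating_residuals = [1.0, -1.0, 1.0, -1.0]   # neighbours have opposite signs

print(morans_i(smooth_residuals, adjacency))
print(morans_i(alternating_residuals, adjacency))
```

Positive values indicate that neighbouring units tend to have similar residuals, and negative values indicate the opposite. Assessing significance would additionally need a permutation or asymptotic test, which this sketch omits.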

Results from the spatial models are compared with those from the non-spatial model. Some coefficient estimates differ, and the spatial models outperform the non-spatial model in terms of goodness-of-fit. In particular, the SEM is the best model for both types of cancer.

Finally, we discuss some general issues in spatial analysis, such as the modifiable areal unit problem (MAUP), different spatial weighting schemes, and other spatial modelling strategies such as a gravity model and a spatially varying coefficient model.
19/05/2016 4:30 PM

M103

B. V. North, Wolfson Institute of Population Health

STARPAC and phase 1 studies with the continual reassessment method

Phase I clinical trials are an essential step in the development of anticancer drugs. The main goal of these studies is to establish the recommended dose and/or schedule of new drugs or drug combinations for phase II trials. The guiding principle for dose escalation in phase I trials is to avoid exposing too many patients to subtherapeutic doses while maintaining rapid accrual and preserving safety by limiting toxic side-effects. STARPAC is a phase 1 trial examining the use of ATRA, a vitamin A-like compound, in combination with established cancer drugs in combatting pancreatic cancer, a cancer with a dismal survival record which is the 4th highest cancer killer worldwide. A challenge for toxicity trials that prescribe doses for newly recruited patients based on the dose and toxicity data from previous patients is that patients are recruited before previous patients have reported toxicity data. In order to safely escalate doses we employ a two-stage process, with the first stage an accelerated rule-based procedure and the second stage a modified approach based on the Bayesian Continual Reassessment Method that combines a prior toxicity-dose curve with the accumulating patient dose/toxicity data.
04/02/2016 4:30 PM

M103

P. D. Sasieni, Wolfson Institute of Population Health

Alternatives to net and relative survival for comparison of survival between populations

Most cancer registries choose not to rely on cause of death when presenting survival statistics on cancer patients, but instead to look at overall mortality after diagnosis and adjust for the expected mortality in the cohort had they not been diagnosed with cancer. For many years the relative survival (observed survival divided by expected survival) was estimated by the Ederer-II method. More recently statisticians have used the theory of classical competing risks to estimate the net survival – that is the survival that would be observed in cancer patients if it were possible to remove all competing causes of death. Pohar-Perme showed that in general estimators of the relative survival are not consistent for the net survival, and proposed a new consistent estimator of the net survival. Pohar-Perme’s estimator can have much larger variance than Ederer-II (and may not be robust). Thus whereas some statisticians have argued that one must use the Pohar-Perme estimator because it is the only one that is consistent for the net survival, others have argued that there is a bias-variance trade-off and Ederer-II may still be preferred even though it is inconsistent.

We draw analogy from the literature regarding robust estimation of location. If one wants to estimate the mean of a distribution consistently, then it may be difficult to improve on the sample mean. But if one simply wants a measure of location then other estimators are possible and might be preferred to the sample mean. We define a measure of net survival to be a functional satisfying certain equivariance and order conditions. The limits of neither Ederer-II nor Pohar-Perme satisfy our definition of being an invariant measure of net survival. We introduce two families of functionals that do satisfy our definition. Consideration of minimum variance and robustness then allows us to select a single member of each family as the preferred measure of net survival.

Noting that in a homogeneous population the relative survival and the net survival are identical and correspond to the survival of the excess hazard, we can then view our functionals as weighted averages of stratum-specific relative-survival, net-survival or excess-hazards. These can be viewed as standardised estimators with standardising weights that are time-dependent. The preferred measures use weights that depend on the numbers at risk in each stratum from a standard population as a function of time. For example, when the strata are defined by age at diagnosis, the standardising weights will depend on the age-specific prevalence of the cancer in the standard population.

We show through simulation that, unlike both Ederer-II and Pohar-Perme, our estimators are invariant and robust under changing population structures, and also that they are consistent and reasonably efficient. Although our estimator does not (consistently) estimate the (marginal) net hazard, it performs as well as or better than both the crude and standardised versions of both Ederer-II and Pohar-Perme in all simulations.

Joint work with Adam Brentnall
29/01/2015 4:30 PM

M203

A. R. Brentnall, Wolfson Institute of Population Health

On use of the concordance index in epidemiology

Studies of risk factors in epidemiology often use a case-control design. The concordance index (or area under the receiver operating characteristic (ROC) curve (AUC)) may be used in unmatched case-control studies to measure how well a variable discriminates between cases and controls. The AUC is sometimes used in matched case-control studies by ignoring matching, but it lacks interpretation because it is not based on an estimate of the ROC for the population of interest. An alternative measure of the concordance of risk factors conditional on the matching factors will be introduced, and applied to data from breast and lung cancer case-control studies.

Another common design in epidemiology is the cohort study, where the aim might be to estimate the concordance index for predictors of censored survival data. A popular method only considers pairs of individuals when the smaller outcome is uncensored (Harrell's c-statistic). While this statistic can be useful for comparing different models on the same data set, it is dependent on the censoring distribution. Methods to address this issue will be considered and applied to data from a breast cancer trial.
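
The pairwise rule behind Harrell's c-statistic can be sketched as follows. This is an illustrative implementation under simple assumptions (pairs with tied times are skipped, tied risk scores count one half); the function name `harrell_c` and the toy data are hypothetical and not taken from the talk.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's c: among pairs whose smaller follow-up time is an observed
    event, the fraction in which the earlier failure has the higher risk score.
    Tied risk scores contribute 1/2; pairs with tied times are not used."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored time cannot be the 'smaller, uncensored' outcome
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical data: the third subject is censored at time 3.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
print(harrell_c(times, events, [4, 3, 2, 1]))  # risk falls as survival time rises
print(harrell_c(times, events, [4, 1, 2, 3]))  # second subject is ranked too low
```

Because a pair is usable only when its shorter time is an event, changing which subjects are censored changes the set of pairs being averaged, which is the dependence on the censoring distribution noted above.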
22/04/2013 3:00 PM

130 Wolfson Institute

Peter Sasieni, Wolfson Institute of Population Health, QMUL

Some Statistical issues arising from evaluating cancer screening

Ideally, cancer screening is initially evaluated through randomised controlled trials, the analysis of which should be straightforward. The statistical challenges arise when one is either trying to combine the results of several trials with different designs, or trying to evaluate routine service screening (which may use improved technologies compared to the original randomised controlled trials).

We will briefly discuss the following problems.
1. Estimation from interval censored data based on imperfect observations. When screening for asymptomatic pre-cancerous disease, one will only identify the disease if the individual with the disease is screened and if the screening test is positive (leading to further investigations and a definitive diagnosis). In the simple model one may have periodic screening with a fixed sensitivity. A more sophisticated analysis would take account of the possibility that as the precancerous lesion grows the sensitivity of the screening test increases.
2. Estimating over-diagnosis (defined as a screen-detected cancer that would not have been diagnosed (before the individual died) in the absence of screening) from a trial in which the control arm are all offered screening at the end of the trial. The idea is that with extended follow-up data one may be able to apply methods designed for non-compliance to estimate over-diagnosis.
3. Improving ecological studies and trend analyses to try to estimate the effects of screening on incidence (over-diagnosis or cancer prevention) and mortality taking into account secular trends in incidence and mortality.
4. Meta-analysis of randomised trials of screening that are heterogeneous in terms of screening interval, duration of follow-up after the last screen, and whether or not the control group were offered screening at the end of the trial. The idea that we explore is whether by modelling the expected behaviour of the incidence function over time, one can combine estimates of the same quantity in the meta analysis.
5. How should one quantify exposure to screening in an observational (case-control) study of cancer screening? The issue is whether one can use such studies to accurately estimate the benefit of screening at different intervals. We will discuss a few options and suggest that they may best be studied by applying them to simulated data.
08/12/2021 1:00 PM

MB204 + Zoom

Mihai Cucuringu

Spectral methods for clustering signed and directed networks

We consider the problem of clustering in two important families of
networks: signed and directed, both relatively less well explored
compared to their unsigned and undirected counterparts. Both problems
share an important common feature: they can be solved by exploiting the
spectrum of certain graph Laplacian matrices or derivations thereof. In
signed networks, the edge weights between the nodes may take either
positive or negative values, encoding a measure of similarity or
dissimilarity. We consider a generalized eigenvalue problem involving
graph Laplacians, with performance guarantees under the setting of a
signed stochastic block model. The second problem concerns directed
graphs. Imagine a (social) network in which you spot two subsets of
accounts, X and Y, for which the overwhelming majority of messages (or
friend requests, endorsements, etc) flow from X to Y, and very few flow
from Y to X; would you get suspicious? To this end, we also discuss a
spectral clustering algorithm for directed graphs based on a
complex-valued representation of the adjacency matrix, which is able to
capture the underlying cluster structures, for which the information
encoded in the direction of the edges is crucial. We evaluate the
proposed algorithm in terms of a cut flow imbalance-based objective
function, which, for a given pair of clusters, captures the
propensity of the edges to flow in a given direction. Experiments on a
directed stochastic block model and real-world networks showcase the
robustness and accuracy of the method, when compared to other
state-of-the-art methods. Time permitting, we briefly discuss potential
extensions to the sparse setting and regularization, applications to
lead-lag detection in time series and ranking from pairwise comparisons.

Zoom link
01/12/2021 1:00 PM

Zoom

Vukosi Marivate (University of Pretoria)

Coming to grips with the reality of data science - it's people all the way down

For practising Data Science researchers and practitioners, the COVID-19 pandemic has highlighted both the need for data driven decision making and the reality of what it really takes to get to that point. It is not only about throwing data and models at a problem. It is about understanding the environment that one is in and then strategising on what might best work for that environment. In this talk I look back at some of the work we have done in responding to different challenges within both Data Science and Natural Language Processing. I place people at the center, as they are the important piece in our practice.

Zoom link
24/11/2021 4:00 PM

via Zoom

Philippe Gagnon

An asymptotic Peskun ordering and its application to lifted samplers

Please note different time from usual seminar time.

A Peskun ordering between two samplers, implying a dominance of one over the other, is known among the Markov chain Monte Carlo community for being a remarkably strong result, but it is also known for being one that is notably difficult to establish. Indeed, one has to prove that the probability to reach a state, using a sampler, is greater than or equal to the probability using the other sampler, and this must hold for all states excepting the current state. We provide in this paper a weaker version that does not require an inequality between the probabilities for all these states: the dominance holds asymptotically, as a varying parameter grows without bound, as long as the states for which the probabilities are greater than or equal to belong to a mass-concentrating set. The weak ordering turns out to be useful to compare lifted samplers for partially-ordered discrete state-spaces with their Metropolis–Hastings counterparts. An analysis yields a qualitative conclusion: they asymptotically perform better in certain situations (and we are able to identify these situations), but not necessarily in others (and the reasons why are made clear). The difference in performance is evaluated quantitatively in important applications such as graphical-model simulation and variable selection.

Joint work with Florian Maire (Université de Montréal).

The pre-print is available at: https://arxiv.org/abs/2003.05492. In the talk, I will focus on the motivations of our work, which will allow me to motivate our theoretical result.

Zoom link
17/11/2021 1:00 PM

MB203 (please note different location) + Zoom

Arthur Guillaumin

Debiased Whittle likelihood for time series and spatial data

Time series and spatial data are ubiquitous in many application areas, such as environmental data, geosciences, astronomy, and finance. A key statistical modelling and estimation challenge for these data is that of dependence between points at different times or locations. While parametric models of covariance can be estimated via exact likelihood, this is ill-suited for many practical problems due to the heavy computational cost.

A standard approach to address this relies on approximate likelihood methods. The Whittle likelihood is one such approximation for gridded data, based on the Discrete Fourier Transform of the data. It is popular due to its O(n log n) computational cost, robustness to non-Gaussian data, and amenability to interpretation in the spectral domain. However, Whittle likelihood estimates can suffer from a strong bias due to the finite and discrete sampling. This is true in particular for spatial data, where bias dominates standard deviation in dimensions greater than or equal to two. Additionally, practical sampling patterns often diverge from theoretical requirements, due to non-square observational domains or missing data. In this presentation we present a recently proposed modification to the Whittle likelihood which addresses all these issues at once.
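
For readers unfamiliar with the approximation, the following is a minimal sketch of the standard (not the debiased) Whittle likelihood for a one-dimensional AR(1) series with known unit innovation variance. All names, the simulated data and the grid search are assumptions made for illustration. The periodogram is computed here with a direct O(n^2) sum to stay within the standard library; an FFT would give the n log n cost mentioned above.

```python
import cmath
import math
import random


def periodogram(x):
    """I(w_j) = |sum_t x_t exp(-i w_j t)|^2 / n at w_j = 2*pi*j/n, j = 1..n-1."""
    n = len(x)
    result = []
    for j in range(1, n):  # frequency zero is skipped (it only carries the mean)
        w = 2.0 * math.pi * j / n
        dft = sum(x[t] * cmath.exp(-1j * w * t) for t in range(n))
        result.append((w, abs(dft) ** 2 / n))
    return result


def ar1_spectral_density(w, phi, sigma2=1.0):
    """Spectral density of x_t = phi * x_{t-1} + e_t, scaled to match the periodogram."""
    return sigma2 / abs(1.0 - phi * cmath.exp(-1j * w)) ** 2


def neg_whittle_loglik(phi, pgram):
    """Negative Whittle log-likelihood: sum over frequencies of log f + I / f."""
    total = 0.0
    for w, i_w in pgram:
        f = ar1_spectral_density(w, phi)
        total += math.log(f) + i_w / f
    return total


def simulate_ar1(n, phi, seed=0, burn_in=100):
    """Simulate an AR(1) series with standard normal innovations."""
    rng = random.Random(seed)
    prev, out = 0.0, []
    for t in range(n + burn_in):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(prev)
    return out


x = simulate_ar1(400, 0.6)
pgram = periodogram(x)
grid = [k / 100 for k in range(-95, 96)]  # candidate phi values in (-1, 1)
phi_hat = min(grid, key=lambda p: neg_whittle_loglik(p, pgram))
print(phi_hat)
```

The estimate should land near the true value 0.6 for this sample size. The bias discussed in the abstract becomes more pronounced for short series and in higher dimensions, which this one-dimensional toy example does not show.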

We provide asymptotic results under a framework which we call Significant Correlation Contribution, which allows us to understand the interplay between the sampling pattern and the covariance model. We demonstrate that our modification renders our estimate asymptotically efficient and normal for a wide class of settings and present some practical use cases.

Zoom link
03/11/2021 1:00 PM

Zoom

Marzieh Eidi (Max Planck Institute for Mathematics in the Sciences)

Curvature-based Analysis of Directed (Hyper)Networks

Today we are confronted with huge and highly complex data and one main challenge is to determine the "structure" of complex networks or "shape" of data. In the past few years, geometric and topological methods, as powerful tools that originated from Riemannian geometry, have become popular for data analysis. In this seminar, after introducing Ollivier-Ricci curvature for (directed) hypergraphs, as one of the main recent applications, I will present the result of the implementation of this tool for the analysis of chemical reaction networks. We will see that this notion and Forman-Ricci curvature are complementary edge-based tools for detecting some important structures in the network.

Zoom link
27/10/2021 1:00 PM

MB204 and Zoom

Jun Yang (University of Oxford)

Stereographic Markov Chain Monte Carlo

High dimensional distributions, especially those with heavy tails, are notoriously difficult for off-the-shelf MCMC samplers: the combination of unbounded state spaces, diminishing gradient information and local moves results in empirically observed "stickiness" and poor theoretical mixing properties (a lack of geometric ergodicity). In this paper we introduce a new class of MCMC samplers that map the original high dimensional problem in Euclidean space onto a sphere and remedy these notorious mixing problems. In particular, we develop Random Walk Metropolis type algorithms as well as versions of the Bouncy Particle Sampler that are uniformly ergodic for a large class of light and heavy tailed distributions and also empirically exhibit rapid convergence.
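
The map from Euclidean space onto a sphere can be illustrated with the textbook stereographic projection onto the unit sphere. This sketch shows only the change of coordinates, not the samplers from the talk; the function names are hypothetical, and the unit radius is an assumption (the paper's parametrisation may differ).

```python
import math


def to_sphere(x):
    """Inverse stereographic projection: map x in R^d to a point on the unit
    sphere in R^(d+1), with the north pole (0, ..., 0, 1) as projection point."""
    r2 = sum(v * v for v in x)
    return [2.0 * v / (r2 + 1.0) for v in x] + [(r2 - 1.0) / (r2 + 1.0)]


def from_sphere(z):
    """Stereographic projection: map a point on the unit sphere back to R^d."""
    *head, last = z
    if math.isclose(last, 1.0):
        raise ValueError("the north pole has no image in Euclidean space")
    return [v / (1.0 - last) for v in head]


point = [3.0, -4.0]
on_sphere = to_sphere(point)
print(on_sphere, sum(v * v for v in on_sphere))
print(from_sphere(on_sphere))
```

Points far out in the tails are mapped close to the north pole, so the unbounded state space becomes a compact one.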

Joint work with Krzysztof Latuszynski and Gareth O. Roberts.

Zoom link
20/10/2021 1:00 PM

Zoom

Tom Leinster (University of Edinburgh)

What is the uniform distribution?

Everyone knows what the uniform probability distribution is on a real interval or on a finite set, but it is not so obvious what we should understand "uniform distribution" to mean on a completely arbitrary space. I will give a general definition, taking "space" to mean something slightly more general than compact metric space. The definition rests on a maximum entropy theorem for distributions on metric spaces, which in turn arose from questions about the measurement of biodiversity. This idea of seeking a systematic general notion of uniform distribution is similar in spirit to the quest for an objective prior, and indeed, is at least loosely related to it, as I will explain. (Joint work with Emily Roff.)

Zoom link
06/10/2021 1:00 PM

Room MB503 + Zoom streaming

Nina Otter (QMUL)

A topological perspective on weather regimes

In this talk I will discuss recent and ongoing work on using topology to define and study weather regimes. The talk is based on joint work with K. Strommen, M. Chantry and J. Dorrington, with preprint available at https://arxiv.org/abs/2104.03196.

Zoom link: https://qmul-ac-uk.zoom.us/j/82103051171?pwd=NjJRckR5Z3lJRzRRZlFlblhDNGFzZz09
29/09/2021 1:00 PM

MB-503 and Zoom

Philippa (Pip) Pattison (University of Sydney)

Realisation-dependent models for networks

Abstract: In this talk, I summarise progress in building models for social networks that capture many of their well-known structural features. I focus on a modelling approach which construes global network structure as the outcome of dynamic, potentially realisation-dependent, interactive processes occurring within local neighbourhoods of a network. I describe a hierarchy of models implied by the approach and their estimation from partial network data structures obtained through certain types of network sampling schemes. I illustrate how these models can be used to enrich our understanding of community network structures and hence of processes such as the transmission of infectious diseases.

About the speaker: Prof Pip Pattison is a quantitative psychologist by background and the primary focus of her research is the development and application of mathematical and statistical models for social networks and network processes. She is currently the Deputy Vice-Chancellor (Education) at the University of Sydney.

Zoom Link
26/05/2021 2:00 PM

Zoom

Concepcion Ausin (Universidad Carlos III de Madrid)

Variational inference for high dimensional structured factor copulas

Factor copula models have been recently proposed for describing the joint distribution of a large number of variables in terms of a few common latent factors. A Bayesian procedure is employed in order to make fast inferences for multi-factor and structured factor copulas. To deal with the high dimensional structure, a Variational Inference (VI) algorithm is applied to estimate different specifications of factor copula models. Compared to the Markov Chain Monte Carlo (MCMC) approach, the variational approximation is much faster and could handle a sizable problem in limited time. Another issue of factor copula models is that the bivariate copula functions connecting the variables are unknown in high dimensions. An automatic procedure is derived to recover the hidden dependence structure. By taking advantage of the posterior modes of the latent variables, the bivariate copula functions are selected by minimizing the Bayesian Information Criterion (BIC). Simulation studies in different contexts show that the procedure of bivariate copula selection could be very accurate in comparison to the true generated copula model. The proposed procedure is illustrated with two high dimensional real data sets.

Zoom link
19/05/2021 2:00 PM

Zoom

Victor Veitch (University of Chicago)

Counterfactual Invariance to Spurious Correlations

Informally, a ‘spurious correlation’ is the dependence of a model on some aspect of the input data that an analyst thinks shouldn’t matter. In machine learning, these have a know-it-when-you-see-it character, e.g., changing the gender of a sentence’s subject changes a sentiment predictor’s output. I'll talk about counterfactual invariance, a causal formalization of the requirement that changing irrelevant parts of the input shouldn’t change model predictions. We connect counterfactual invariance to out-of-domain model performance, and provide schemes for learning (approximately) counterfactual invariant predictors (without access to counterfactual examples). It turns out that both the means and meaning of counterfactual invariance depend fundamentally on the true underlying causal structure of the data. Distinct causal structures require distinct regularization schemes to induce counterfactual invariance. Similarly, counterfactual invariance implies different domain shift guarantees depending on the underlying causal structure. This theory is supported by empirical results on text classification.

Zoom link
12/05/2021 2:00 PM

Zoom

Sylvia Fruhwirth-Schnatter (University of Vienna)

Triple the gamma – Achieving Shrinkage and Variable Selection in TVP Models

Time-varying parameter (TVP) models are a popular tool for handling data with smoothly changing parameters. However, in situations with many parameters the flexibility underlying these models may lead to overfitting models and, as a consequence, to a severe loss of statistical efficiency. This occurs, in particular, if only a few parameters are indeed time-varying, while the remaining ones are constant or even insignificant. As a remedy, hierarchical shrinkage priors have been introduced for TVP models to allow shrinkage both of the initial parameters as well as their variances toward zero.

The talk reviews various approaches to introducing shrinkage priors for TVP models. Recently, Cadonna et al (2020) introduced the (hierarchical) triple Gamma prior which includes other popular shrinkage priors such as the double Gamma prior and the horseshoe prior as special cases. The talk also discusses efficient methods for MCMC inference and investigates the close resemblance of the triple Gamma prior with BMA. For illustration, hierarchical shrinkage priors are applied to TVP-VAR-SV models, a popular tool for modelling multivariate macroeconomic time series. The results clearly indicate that shrinkage priors reduce the risk of overfitting and increase statistical efficiency in a TVP modelling framework.

(based on joint work with Annalisa Cadonna and Peter Knaus, Vienna University of Economics and Business)

Full version of Cadonna et al (2020): https://doi.org/10.3390/econometrics8020020

Zoom link
05/05/2021 2:00 PM

Zoom

Celeste Damiani (QMUL)

An AI System for Assessing Breast Density

At the moment, in the UK all women go through the same breast cancer screening programme. But different women have different levels of risk of getting breast cancer. In our project we are looking at how we can adjust breast cancer screening programmes so that they are more suited to the level of risk faced by each woman - this is known as risk-adapted screening. In particular, breast density is the amount of white and bright regions seen on a mammogram. High breast density can make it harder for doctors to detect breast cancer on a screening mammogram and also increases the risk of developing breast cancer. I am going to talk about how we are planning to use AI algorithms to objectively measure breast density and answer the question: how can we tell when a woman might be at risk of getting a false negative during a standard mammogram, and should be offered an alternative screening method? This is part of the CRUK-funded project “An Artificial Intelligence System for Real-time Risk Assessment at Mammography Screening (Mammo AI)”.

Zoom link
28/04/2021 2:00 PM

Zoom

Maria Grith (Erasmus University of Rotterdam)

The Block-Autoregressive Model in Non-Standard Bases

We propose a new autoregressive model for the analysis of time-series with periodic interdependencies. The model is based on the application of a vector autoregressive model to univariate data that is partitioned into ‘blocks’ of observations. For this reason, we refer to it as the block-autoregressive (BAR) model. The untransformed BAR model nests several other autoregressive models such as the regular AR model, the periodic AR model, the (mixed) seasonal AR model, and the scale-specific AR model that was introduced by Bandi et al. (2019). In addition, the BAR model can be transformed using orthonormal bases to unveil dependencies between weighted averages of observations in subsequent blocks. This yields parsimonious model representations that enhance interpretability and improve predictive performance. The model is estimated using OLS and parametric bootstrapping methods in the case of large samples, which is complemented by a basis-specific LASSO step for smaller samples. Both simulated and empirical examples are used to illustrate the model. Joint with Dick van Dijk and Karel de Wit.

Zoom link
21/04/2021 2:00 PM

Zoom

Radu Craiu (University of Toronto)

Finding our Way in the Dark: Approximate MCMC for Approximate Bayesian Methods

With larger amounts of data at their disposal, scientists are emboldened to tackle complex questions that require sophisticated statistical models. It is not unusual for the latter to have likelihood functions that elude analytical formulations. Even under such adversity, when one can simulate from the sampling distribution, Bayesian analysis can be conducted using approximate methods such as Approximate Bayesian Computation (ABC) or Bayesian Synthetic Likelihood (BSL). A significant drawback of these methods is that the number of required simulations can be prohibitively large, thus severely limiting their scope. We propose perturbed MCMC samplers that can be used within the ABC and BSL paradigms to significantly accelerate computation while maintaining control on computational efficiency. The proposed strategy relies on recycling samples from the chain’s past. The algorithmic design is supported by a theoretical analysis while practical performance is examined via a series of simulation examples and data analyses. This is joint work with Dr. Evgeny Levi.

Zoom link
14/04/2021 2:00 PM

Zoom

Antonio Lijoi (Bocconi University, Milan)

Measuring dependence for Bayesian nonparametric models

The Bayesian approach to inference stands out for naturally allowing borrowing of information across heterogeneous populations or studies. Several popular classes of models in this setting induce a dependence structure on the observations that can be seen as a mixture between the two extreme cases of exchangeability and unconditional independence. As an illustrative example in this direction, a recent proposal based on the Dirichlet process will be described. Such a structure leads one to consider the problem of measuring dependence in terms of the distance of the actual prior specification from the two extremes. The talk will describe a novel approach that relies on the Wasserstein distance and is suitably tailored to random measure based models. An application to some noteworthy models in the literature provides some useful insights.

Zoom link
07/04/2021 3:00 PM

Zoom

Jingwei Liang (QMUL)

Screening for Sparse Online Learning

Sparsity promoting regularizers are widely used to impose low-complexity structure (e.g. l1-norm for sparsity) on the regression coefficients of supervised learning. In the realm of deterministic optimization, the sequences generated by iterative algorithms (such as proximal gradient descent) exhibit "finite activity identification", namely, they can identify the low-complexity structure in a finite number of iterations. However, most online algorithms (such as proximal stochastic gradient descent) do not have this property owing to the vanishing step-size and non-vanishing variance. In this talk, by combining with a screening rule, I will show how to eliminate useless features of the iterates generated by online algorithms, and thereby enforce finite activity identification. One consequence is that when combined with any convergent online algorithm, sparsity properties imposed by the regularizer can be exploited for computational gains. Numerically, significant acceleration can be obtained.
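The deterministic behaviour described in the abstract can be illustrated with a minimal sketch of proximal gradient descent for the lasso. This is not the screening rule from the talk; the function names, the toy design matrix and the parameter values below are assumptions chosen only so that the support can be seen to stop changing after finitely many iterations.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient_lasso(X, y, lam, step, n_iter=50):
    """Minimise (1/2n) * ||y - X b||^2 + lam * ||b||_1 by proximal gradient descent.

    Returns the final coefficients and the support (indices of the
    non-zero coefficients) recorded at every iteration.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    supports = []
    for _ in range(n_iter):
        # Gradient of the smooth least-squares part at the current iterate.
        residual = [sum(X[i][j] * b[j] for j in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * residual[i] for i in range(n)) / n for j in range(p)]
        # Gradient step followed by the l1 proximal step.
        b = [soft_threshold(b[j] - step * grad[j], step * lam) for j in range(p)]
        supports.append(tuple(j for j in range(p) if b[j] != 0.0))
    return b, supports


# Hypothetical toy data: orthogonal columns, one strong and one weak signal.
X = [[1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1]]
y = [2.05, 2.05, 1.95, 1.95]
coefficients, supports = proximal_gradient_lasso(X, y, lam=0.1, step=1.0)
```

In this toy problem the weak third coefficient is thresholded to exactly zero and the support settles on the first feature alone. A stochastic variant with a vanishing step size would not in general give such exact zeros, which is the gap the abstract says a screening rule can close.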

Zoom link
31/03/2021 2:00 PM

Zoom

Sebastian Schmon (Improbable, UK)

Generalized Posteriors for Approximate Bayesian Computation and Simulation-based Inference

Complex simulators have become a ubiquitous tool in many scientific disciplines, providing high-fidelity, implicit probabilistic models of natural and social phenomena. Unfortunately, they typically lack the tractability required for conventional statistical analysis. Approximate Bayesian computation (ABC) has emerged as a key method in simulation-based inference, wherein the true model likelihood and posterior are approximated using samples from the simulator. In this talk, we will first draw connections between ABC and generalized Bayesian inference (GBI) by re-interpreting the accept/reject step in ABC as an implicitly defined error model. Then we argue that these implicit error models will invariably be misspecified.

While ABC posteriors are often treated as a necessary evil for approximating the standard Bayesian posterior, this allows us to re-interpret ABC as a potential robustification strategy. In a second step, we will turn our attention to some recent machine learning approaches to simulation-based inference. While those methods are designed to be exact when the true data generating mechanism is known, we will show that neural density estimators can perform poorly when this assumption is violated. Using our findings on ABC we will argue for a combination of machine-learning and statistics approach to obtain a reliable, but highly efficient algorithm for posterior inference in intractable models.
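The accept/reject step mentioned in the abstract can be sketched as follows. This is a minimal rejection ABC example for the mean of a unit-variance Gaussian, not the speaker's method; the function name, the uniform prior, the tolerance and the observed value are all assumptions made for illustration.

```python
import random


def rejection_abc(observed_mean, n_obs, prior_sampler, tolerance, n_draws, rng):
    """Rejection ABC for the mean of a unit-variance Gaussian.

    A prior draw theta is kept when the simulated sample mean lies within
    `tolerance` of the observed one. One common reading of this rule is an
    implicit uniform error model on the summary statistic.
    """
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        simulated_mean = sum(simulated) / n_obs
        if abs(simulated_mean - observed_mean) <= tolerance:
            accepted.append((theta, simulated_mean))
    return accepted


# Hypothetical setting: observed sample mean 1.0 from 20 observations,
# uniform prior on [-5, 5] for the unknown mean.
rng = random.Random(0)
draws = rejection_abc(
    observed_mean=1.0,
    n_obs=20,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),
    tolerance=0.2,
    n_draws=5000,
    rng=rng,
)
posterior_sample = [theta for theta, _ in draws]
```

Shrinking `tolerance` moves the accepted draws closer to the exact posterior at the cost of more rejected simulations, which is the trade-off that makes the choice of implicit error model matter.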

Zoom link
17/03/2021 2:00 PM

Zoom

Anthony Constantinou (QMUL)

Bayesian network structure learning with noisy data

Numerous Bayesian network structure learning algorithms have been proposed in the literature over the past few decades. Each algorithm is based on a set of assumptions, such as complete data and causal sufficiency, and tends to be evaluated with synthetic data that conforms to these assumptions, however unrealistic these assumptions may be in the real world. As a result, it is widely accepted that synthetic performance overestimates real performance, although to what degree this may happen remains unknown. This presentation will provide a brief introduction to the two main classes of structure learning, called constraint-based and score-based, and illustrate how different assumptions of data noise influence structure learning performance.

Zoom Link
03/03/2021 2:00 PM

Zoom

Roberto Casarin (Ca' Foscari University of Venice, Italy)

Bayesian Dynamic Tensor Regression

Tensor-valued data are becoming increasingly available in economics and this calls for suitable econometric tools. We propose a new dynamic linear model for tensor-valued response variables and covariates that encompasses some well-known econometric models as special cases. Our contribution is manifold. First, we define a tensor autoregressive process (ART), study its properties, and derive the associated impulse response function. Second, we exploit the PARAFAC low-rank decomposition for providing a parsimonious parametrization and to incorporate sparsity effects. We also contribute to inference methods for tensors by developing a Bayesian framework which allows for including extra-sample information and for introducing shrinking effects. We apply the ART model to time-varying multilayer networks of international trade and capital stock and study the propagation of shocks across countries, over time and between layers.

Zoom Link
24/02/2021 3:00 PM

Zoom

Michele Guindani (University of California, Irvine, United States)

A Common Atom Model for the Bayesian Nonparametric Analysis of Nested Data

The use of large datasets for targeted therapeutic interventions requires new ways to characterize the heterogeneity observed across subgroups of a specific population. In particular, models for partially exchangeable data are needed for inference on nested datasets, where the observations are assumed to be organized in different units and some sharing of information is required to learn distinctive features of the units. In this talk, we propose a nested Common Atoms Model (CAM) that is particularly suited for the analysis of nested datasets where the distributions of the units are expected to differ only over a small fraction of the observations sampled from each unit. The proposed CAM allows a two-layered clustering at the distributional and observational level and is amenable to scalable posterior inference through the use of a computationally efficient nested slice sampler algorithm. We further discuss how to extend the proposed modeling framework to handle discrete measurements, and we conduct posterior inference on a real microbiome dataset from a diet swap study to investigate how the alterations in intestinal microbiota composition are associated with different eating habits. If time allows, we will also discuss an application to the analysis of time series calcium imaging experiments in awake behaving animals. We further investigate the performance of our model in capturing true distributional structures in the population by means of simulation studies.

Zoom Link
17/02/2021 2:00 PM

Zoom

Marcelo Pereyra (Heriot-Watt University)

Bayesian inference with data-driven image priors encoded by neural networks

This talk presents a mathematical and computational methodology for performing Bayesian inference in problems where prior knowledge is available in the form of a training dataset or set of training examples. This prior information is encoded into the model by using a deep neural network, which is combined with an explicit likelihood function by using Bayes' theorem to derive the posterior distribution for the quantities of interest given the available data. Bayesian computation is then performed by using appropriate Markov chain Monte Carlo stochastic algorithms. We study the properties of the proposed models and computation algorithms and illustrate performance on a range of inverse problems related to imaging sciences, where they are used to perform Bayesian point estimation, uncertainty quantification, hypothesis testing, and model misspecification diagnosis.

Based on a joint work with Matthew Holden and Kostas Zygalakis.

Zoom Link
12/02/2021 10:00 AM

Zoom

Clara Grazian (University of New South Wales, Australia)

The importance of being conservative: Bayesian analysis for mixture models

From a Bayesian perspective, mixture models have been characterised by a restrictive prior modelling since their ill-defined nature makes most of the improper priors not acceptable. In particular, recent results have shown the inconsistency of the posterior distribution on the number of components when using standard nonparametric prior processes.

We propose an analysis of prior choices based on their property of conservativeness in the number of components. Among the proposals, we derive a prior distribution on the number of clusters which considers the loss one would incur if the true value representing the number of components were not considered. The prior has an elegant and easy to implement structure, which allows one to naturally include any prior information one may have as well as to opt for a default solution in cases where this information is not available.

The methods are then applied to two real datasets. The first dataset consists of retrieval times for monitoring IP packets in computer network systems. The second dataset consists of measures registered in antimicrobial susceptibility tests for 14 compounds used in the treatment of M. Tuberculosis. In both situations, the number of clusters is uncertain and different solutions lead to different interpretations.

Zoom Link
03/02/2021 2:00 PM

Zoom

Davide Ferrari (Free University of Bozen-Bolzano, Italy)

Model selection by sparse composition of estimating equations

This talk introduces a method for selecting high-dimensional models based on a truncation mechanism to generate sparse estimating equations. Given a set of low-dimensional estimating equations for the model parameters, a high-dimensional model is selected by minimizing the distance between a composite estimating equation and the full likelihood scores subject to an L1-type penalty. The proposed strategy reduces the overall model complexity by dropping the noisy terms in the estimating equations. Differently from other approaches to model selection, our penalty involves the inclusion of low-dimensional equations rather than model parameters; this implies that consistency of the final parameter estimates is unaffected by the selection mechanism. Numerical and statistical efficiency of the new methodology is illustrated through examples on simulated and real data.

Zoom Link
25/11/2020 2:00 PM

Zoom

Yanbei Chen (Computer Vision Group, QMUL)

Image Search with Text Feedback by Visiolinguistic Attention Learning

Zoom link.

Abstract

Image search with text feedback has promising impacts in various real-world applications, such as e-commerce and internet search. Given a reference image and text feedback from the user, the goal is to retrieve images that not only resemble the input image, but also change certain aspects in accordance with the given text. This is a challenging task as it requires the synergistic understanding of both image and text. In this work, we tackle this task by a novel Visiolinguistic Attention Learning (VAL) framework. Specifically, we propose a composite transformer that can be seamlessly plugged into a CNN to selectively preserve and transform the visual features conditioned on language semantics. By inserting multiple composite transformers at varying depths, VAL is incentivised to encapsulate the multi-granular visiolinguistic information, thus yielding an expressive representation for effective image search. We conduct comprehensive evaluation on three datasets: Fashion200k, Shoes and FashionIQ. Extensive experiments show our model exceeds existing approaches on all datasets, demonstrating consistent superiority in coping with various forms of text feedback, including attribute-like and natural language descriptions.

This work was presented in CVPR 2020. Link to paper here.
09/12/2020 2:00 PM

Zoom

Jamie Griffin (Statistics and Data Science Group, QMUL)

Estimates of the severity of coronavirus disease 2019: a model-based analysis

Zoom link.

Abstract

In the face of rapidly changing data, a range of case fatality ratio estimates for coronavirus disease 2019 (COVID-19) have been produced that differ substantially in magnitude. We aimed to provide robust estimates, accounting for censoring and ascertainment biases. These early estimates give an indication of the fatality ratio across the spectrum of COVID-19 disease and show a strong age gradient in risk of death.

Lancet paper here.
11/11/2020 2:00 PM

Zoom

Luca Rossini (Statistics and Data Science Group, QMUL)

Proper Scoring rules for evaluating asymmetry in density forecasting

Zoom link.

Abstract

In this talk, we propose a novel asymmetric continuous probabilistic score (ACPS) for evaluating and comparing density forecasts. It extends the proposed score and defines a weighted version, which emphasizes regions of interest, such as the tails or the center of a variable's range. A test is also introduced to statistically compare the predictive ability of different forecasts. The ACPS is of general use in any situation where the decision maker has asymmetric preferences in the evaluation of the forecasts. In an artificial experiment, the implications of varying the level of asymmetry in the ACPS are illustrated. Then, the proposed score and test are applied to assess and compare density forecasts of macroeconomic relevant datasets (US employment growth) and of commodity prices (oil and electricity prices) with particular focus on the recent COVID-19 crisis period.

This is a joint work with Matteo Iacopini and Francesco Ravazzolo. Link to the paper here.
18/11/2020 2:00 PM

Zoom

Ayon Mukherjee (Principal Biostatistician, Clinipace Berlin)

Covariate-Adjusted Response-Adaptive Designs for Weibull distributed Survival Responses

Zoom link.

Abstract

Covariate-adjusted response-adaptive (CARA) designs use available responses to skew the treatment allocation in an ongoing clinical trial in favour of the treatment arm found at an interim stage to be best for a patient’s covariate profile.

There has recently been extensive research on CARA designs mainly involving binary responses. Though exponential survival responses have also been considered, the constant hazard property of the exponential model makes the mean residual life for patients constant, making it too restrictive for wide-ranging applicability. To overcome this limitation, designs are developed for Weibull distributed survival responses by deriving two variants of optimal designs based on an optimality criterion.

The optimal designs are based on the covariate-adjusted doubly-adaptive biased coin design (CADBCD) in one case, and the covariate-adjusted efficient randomised adaptive design (CAERADE) in the other. The observed treatment allocation proportions for these designs converge to the expected targeted values, which are derived based on constrained optimization problems. The existing large sample theory for CARA designs relies on Taylor expansion of the allocation probability function, which does not apply to the CAERADE, as it is a discrete and discontinuous function. To overcome this difficulty of discontinuity, and to establish the asymptotic properties of the CAERADE, a stopping time of a martingale process has been introduced. A comparative analysis of these two optimal designs is also discussed. Given the treatment allocation history, response histories, previous covariate information and the covariate profile of the incoming patient, an expression for the conditional probability of a patient being allocated to a particular treatment has been obtained. To apply such designs, the treatment allocation probabilities are sequentially modified based on the history of previous patients’ treatment assignments, responses, covariates and the covariates of the new patient.

For a Phase III clinical trial, the CAERADE is preferable to the CADBCD when the main objective is to minimise the asymptotic variance of the allocation procedure. However, the former procedure being discrete tends to be slower in converging towards the expected target allocation proportion. Since the CAERADE provides a design with minimum variance, it is better than the CADBCD as far as the power of the Wald test for testing treatment differences is concerned. An extensive simulation study of the operating characteristics of the proposed designs supports these findings. It is concluded that the proposed CARA procedures can be suitable alternatives to the traditional balanced randomization designs in survival trials, provided that response data are available during the recruitment phase to enable adaptations to the designs. The findings are illustrated extensively by redesigning an existing clinical trial for treating colorectal cancer.

Keywords: Censored Responses; Optimum allocation; Power; Variability; Covariate Profile
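The abstract's remark about the constant hazard of the exponential model can be checked numerically. The sketch below uses the standard Weibull hazard h(t) = (k/λ)(t/λ)^(k-1) with shape k and scale λ; the function name and the parameter values are assumptions for illustration and are not taken from the talk.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


times = [0.5, 1.0, 2.0, 4.0]

# Shape 1 is the exponential model: the hazard does not depend on t.
exponential = [weibull_hazard(t, shape=1.0, scale=2.0) for t in times]

# Shape above 1 gives an increasing hazard, shape below 1 a decreasing one.
increasing = [weibull_hazard(t, shape=1.5, scale=2.0) for t in times]
decreasing = [weibull_hazard(t, shape=0.7, scale=2.0) for t in times]
```

The extra shape parameter is what lets the Weibull model represent risks that rise or fall over follow-up time, which the exponential model cannot.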
21/10/2020 2:00 PM

Zoom

Chris Bamford (Game AI, QMUL)

Neural Game Engine: Accurate learning of generalizable forward models from pixels

Abstract:

Access to a fast and easily copied forward model of a game is essential for model-based reinforcement learning and for algorithms such as Monte Carlo tree search, and is also beneficial as a source of unlimited experience data for model-free algorithms. Learning forward models is an interesting and important challenge in order to address problems where a model is not available. Building upon previous work on the Neural GPU, this talk introduces the Neural Game Engine, as a way to learn models directly from pixels. The learned models are able to generalise to game levels of different sizes from the ones they were trained on without loss of accuracy. Results on deterministic General Video Game AI games demonstrate competitive performance, with many of the game models being learned perfectly both in terms of pixel predictions and reward predictions. The pre-trained models are available through the OpenAI Gym interface here: https://github.com/Bam4d/Neural-Game-Engine.

Zoom link.

arXiv link.
25/03/2020 12:00 PM

Mathematical Sciences Building, Room MB503

Dr. Maria Kalli (University of Kent)

Cancelled

Cancelled because of coronavirus.
11/03/2020 12:00 PM

Mathematical Sciences Building, Room MB-503

Dr. Kalliopi Mylona (King's College London)

Cancelled

Cancelled because of coronavirus.
26/02/2020 12:00 PM

Mathematical Sciences Building, Room MB503

Dr. Yunxiao Chen (London School of Economics)

Statistical Analysis of Item Preknowledge in Educational Tests: Latent Variable Modelling and Statistical Decision Theory

Tests are a building block of our modern education system. Many tests are high-stakes, such as admission, licensing, and certification tests, that can significantly change one’s life trajectory. For this reason, ensuring fairness in educational tests is becoming an increasingly important problem. This paper concerns the issue of item preknowledge in educational tests due to item leakage. That is, a proportion of test takers have access to leaked items before a test is administered, which leads to inflated performance on the set of leaked items. We develop methods for the simultaneous detection of cheating test takers and compromised items based on data from a single test administration, when both sets are completely unknown. Latent variable models are proposed for the modelling of (1) data consisting only of item-level binary scores and (2) data consisting of both item-level binary scores and response time, where the former is commonly available in paper-and-pencil tests and the latter is widely encountered in computer-based tests. The proposed model adds a latent class model component upon a factor model (also known as item response theory model) component, where the factor model component captures item response behaviour driven by test takers’ ability and the latent class model component captures item response behaviour due to item preknowledge. We further propose a statistical decision framework, under which compound decision rules are developed that control local false discovery/non-discovery rates. Statistical inference is carried out under a Bayesian framework. The proposed method is applied to data from a computer-based nonadaptive licensure assessment.

This is a joint work with Prof. Irini Moustaki and Ms. Yan Lu (PhD student).
19/02/2020 12:00 PM

Mathematical Sciences Building, Room MB503

Dr. Kolyan Ray (Imperial College)

Semiparametric Bayesian causal inference using Gaussian process priors

We investigate semiparametric Bayesian inference for average treatment effects based on observational data, which is a challenging problem due to the missing counterfactuals and selection bias. This model has applications in biostatistics and causal inference.

We show that standard Gaussian process priors satisfy a semiparametric Bernstein-von Mises theorem under sufficient smoothness conditions, thereby showing that the posterior can yield optimal inference. We further propose a novel propensity score-based prior modification that corrects for the first-order posterior bias. Numerical simulations confirm significant improvement in both estimation accuracy and uncertainty quantification compared to using an unmodified Gaussian process.
12/02/2020 12:00 PM

Mathematical Sciences Building, Room: MB-503

Dr Georgios Papageorgiou (Birkbeck)

Bayesian semiparametric analysis of multivariate continuous responses, with variable selection

We present an approach to Bayesian semiparametric inference for Gaussian multivariate response regression. We are motivated by various small and medium dimensional problems from the physical and social sciences. The statistical challenges revolve around dealing with the unknown mean and variance functions and, in particular, the correlation matrix. To tackle these problems, we have developed priors over the smooth functions and a Markov chain Monte Carlo algorithm for inference and model selection. Specifically: Dirichlet process mixtures of Gaussian distributions are used as the basis for a cluster-inducing prior over the elements of the correlation matrix. The smooth, multidimensional means and variances are represented using radial basis function expansions. The complexity of the model, in terms of variable selection and smoothness, is then controlled by spike-slab priors. A simulation study is presented, demonstrating performance as the response dimension increases. Finally, the model is fit to a number of real world datasets.
11/12/2019 12:00 PM

Mathematical Sciences Building, Room: MB-503

Dr Monica Pirani (Imperial College London)

A data integration approach to adjust for residual confounding in area-referenced environmental health studies

Study designs where data have been aggregated by geographical areas are popular in environmental epidemiology. These studies are commonly based on administrative databases and, providing a complete spatial coverage, are particularly appealing to make inference on the entire population. However, the resulting estimates are often biased and difficult to interpret due to unmeasured confounders, which typically are not available from routinely collected data. We propose a framework to improve inference drawn from such studies exploiting information derived from individual-level survey data. The latter are summarized in an area-level scalar score by mimicking at ecological-level the well-known propensity score methodology. The literature on propensity score for confounding adjustment is mainly based on individual-level studies and assumes a binary exposure variable. Here we generalize its use to cope with area-referenced studies characterized by a continuous exposure. Our approach is based upon Bayesian hierarchical structures specified into a two-stage design: (i) geolocated individual-level data from survey samples are up-scaled at ecological-level, then the latter are used to estimate a generalized ecological propensity score (EPS) in the in-sample areas; (ii) the generalized EPS is imputed in the out-of-sample areas under different assumptions about the missingness mechanisms, then it is included into the ecological regression, linking the exposure of interest to the health outcome. This delivers area-level risk estimates which allow a fuller adjustment for confounding than traditional areal studies. The methodology is illustrated by using simulations and a case study investigating the risk of lung cancer mortality associated with nitrogen dioxide in England (UK).
04/12/2019 12:00 PM

Mathematical Sciences Building, Room: MB-503

Dr Chris Fallaize (University of Nottingham)

Unlabelled Shape Analysis with Applications in Bioinformatics

In shape analysis, objects are often represented as configurations of points, known as landmarks. The case where the correspondence between landmarks on different objects is unknown is called unlabelled shape analysis. The alignment task is then to simultaneously identify the correspondence between landmarks and the transformation aligning the objects.

In this talk, I will discuss the alignment of unlabelled shapes, and discuss two applications to problems in structural bioinformatics. The first is a problem in drug discovery, where the main objective is to find the shape information common to all, or subsets of, a set of active compounds. The approach taken resembles a form of clustering, which also gives estimates of the mean shapes of each cluster. The second application is the alignment of protein structures, which will also serve to illustrate how the modelling framework can incorporate very general information regarding the properties we would like alignments to have; in this case, expressed through the sequence order of the points (amino acids) of the proteins.
15/01/2020 12:00 PM

Mathematical Sciences Building, Room: MB503

Dr Claudia Neves (University of Reading)

On trend estimation and testing with application to extreme rainfall

Extreme Value Theory provides a rigorous mathematical justification for being able to extrapolate outside the range of the sampled observations. The primary assumption is that the observations are independent and identically distributed. Although the celebrated extreme value theorem still holds under several forms of weak dependence, relaxing the stationarity assumption, for example by considering a trend in extremes, leads to a challenging problem of inference based around the frequency of extreme events. Some studies advocate that the climate crisis is not so much about startling magnitudes of extreme phenomena but rather how the frequency of extreme events can contribute to the worst case scenarios that could play out on the planet. For instance, the average rainfall may not be changing much, but heavy rainfall may become significantly more or less frequent, meaning that different observations must be endowed with different aspects in their underlying distributions. In this talk, I will present statistical tools for the semi-parametric modelling of the evolution of extreme values over time and/or space by considering a trend on the frequency of exceedances above a high (random) threshold. The methodology is illustrated with an application to daily rainfall data from several gauging stations across Germany and The Netherlands.

16/10/2019 12:00 PM

Mathematical Sciences Building, Room: MB503

Dr Hugo Maruri-Aguilar (QMUL)

Lasso for hierarchical polynomial models

In hierarchical polynomial regression, an interaction term such as x1x2 is included in the model only if both main effects x1 and x2 are also included in the model. We note that the divisibility conditions implicit in polynomial hierarchy give way to natural constraints for the model parameters. Our work uses this idea to derive versions of strong and weak hierarchy and to extend existing work in the literature, which at the moment is only concerned with models of degree two. We discuss how to estimate parameters in lasso using standard quadratic programming techniques and apply our proposal to some examples.

This is joint work with S. Lunagomez (Lancaster University).

20/03/2019 12:00 PM

Queens' Building, Room: W316

Maria Lomeli (Babylon Health)

Amortised inference using faithful inverses for importance sampling

Automated decision-making for medical diagnosis consists of producing differentials for various diseases based on evidence about the state of the patient. A particular way to encode the various relationships between symptoms, risk-factors, and diseases is by using a Bayesian network, where the edge structure reflects the underlying causal mechanisms between the nodes. Due to the combinatorial explosion of computing posterior distributions exactly, various approximate inference schemes have been proposed to tackle this problem, such as variational inference and importance sampling, among others. In addition, amortisation techniques allow us to reduce the cost of inference by carrying out and storing some computations offline. In the medical-diagnosis task, producing highly-accurate marginals is key to differential diagnosis. Importance sampling is particularly suited for this, as it is asymptotically exact and a good choice of proposal can provide a reduction in variance. In this talk, I will discuss how we can construct various data-driven proposals by using an inverse factorisation of the model’s joint distribution. The proposal distributions are based on a neural network that is trained with samples from the generative model before inference takes place, whereas the inverse factorisation provides the sampling schedule for the importance sampling scheme. We explored the impact of different inverse factorisations in terms of variance reduction. Our findings reveal that the new scheme produces competitive data-driven proposals for importance sampling.

This is joint work with Divya Gautam, Kostis Gourgoulias, Saurabh Johri and Maneesh Sahani.

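
As a rough illustration of importance sampling for a posterior marginal in a Bayesian network, the sketch below uses a hypothetical two-node network (one disease, one symptom) with made-up probabilities and a simple Bernoulli proposal. It is not the amortised, neural-network proposal described in the talk; it only shows how self-normalised weights turn proposal draws into a posterior estimate.

```python
import random

# Hypothetical two-node network: disease -> symptom (all numbers are made up).
P_DISEASE = 0.01
P_SYMPTOM_GIVEN = {True: 0.9, False: 0.05}


def exact_posterior():
    """P(disease | symptom present) by Bayes' rule."""
    joint_d = P_DISEASE * P_SYMPTOM_GIVEN[True]
    joint_not_d = (1 - P_DISEASE) * P_SYMPTOM_GIVEN[False]
    return joint_d / (joint_d + joint_not_d)


def importance_estimate(n_samples, proposal_p, seed=0):
    """Self-normalised importance sampling estimate of P(disease | symptom).

    The proposal draws disease ~ Bernoulli(proposal_p); each draw is weighted
    by prior * likelihood / proposal.
    """
    rng = random.Random(seed)
    weighted_hits = 0.0
    total_weight = 0.0
    for _ in range(n_samples):
        disease = rng.random() < proposal_p
        prior = P_DISEASE if disease else 1 - P_DISEASE
        proposal = proposal_p if disease else 1 - proposal_p
        weight = prior * P_SYMPTOM_GIVEN[disease] / proposal
        total_weight += weight
        if disease:
            weighted_hits += weight
    return weighted_hits / total_weight
```

Calling `importance_estimate(20000, 0.5)` and comparing it with `exact_posterior()` gives a quick check; a proposal that visits the rare disease state more often than the prior does is one simple way to reduce variance in this toy setting.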
Short bio:

Maria Lomeli is currently a research scientist at Babylon Health, UK. Previously, she was a research associate at the Machine Learning group, University of Cambridge, working with Zoubin Ghahramani. She obtained her PhD from the Gatsby Unit, UCL under the supervision of Yee Whye Teh.
27/03/2019 12:00 PM

Queens' Building, Room: W316

Shaoxiong Hu (Queen Mary University of London)

The topological criteria for statistical model selection

The LASSO has recently attracted attention in the context of models with hierarchy restrictions. In these models, an interaction term is allowed only if both main effects are active (strong hierarchy) or if at least one main effect is active (weak hierarchy). For example, under strong hierarchy the appearance of the term x1x2 in a model requires both x1 and x2, while under weak hierarchy at least one of x1, x2 is needed. Our work is motivated by possible higher-order interactions in linear regression models. We are concerned with enhancing the performance of LASSO for square-free hierarchical polynomial models when combining validation error with a measure of model complexity. The measure of complexity is the sum of the Betti numbers of the model, seen as a simplicial complex. We represent the polynomial regression model in terms of components and cycles, borrowing from recent developments in computational topology. We use LASSO as our model selection method combined with Betti numbers. We study and propose an algorithm which combines statistical and algebraic criteria. This compound criterion would allow us to deal with model selection problems of higher-order interactions in polynomial regression models.
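
The strong and weak hierarchy rules described above can be written as a small check. This is a minimal sketch restricted to pairwise interactions; the function name and the data layout (a set of main effects plus a list of variable pairs) are assumptions made for illustration and are not taken from the talk.

```python
def satisfies_hierarchy(main_effects, interactions, kind="strong"):
    """Check pairwise interactions against the active main effects.

    main_effects: set of variable names, e.g. {"x1", "x2"}
    interactions: iterable of 2-tuples, e.g. [("x1", "x2")]
    kind: "strong" (both parents needed) or "weak" (at least one parent).
    """
    if kind not in ("strong", "weak"):
        raise ValueError("kind must be 'strong' or 'weak'")
    needed = 2 if kind == "strong" else 1
    for a, b in interactions:
        present = (a in main_effects) + (b in main_effects)
        if present < needed:
            return False
    return True
```

For example, a model with main effect x1 and interaction x1x2 passes the weak check but fails the strong one, because x2 is missing.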
06/02/2019 12:00 PM

Queens' Building, Room: W316

Fengnan Gao (Fudan University and Shanghai Center for Mathematical Sciences)

Maximum likelihood estimation of Sublinear Preferential Attachment Models and its connection to urn models

The preferential attachment (PA) network is a popular way of modeling social networks, collaboration networks, etc. The PA network model is an evolving network model where new nodes keep coming in. When a new node comes in, it establishes only one connection with an existing node. The random choice of the existing node is via a multinomial distribution with probability weights based on a preferential function f on the degrees. f maps the natural numbers to the positive real line and is assumed a priori to be non-decreasing, which means the nodes with high degrees are more likely to get new connections, i.e. "the rich get richer". Under sublinear parametric assumptions on the PA function, we propose the maximum likelihood estimator of f. We show that the MLE yields optimal performance with the asymptotic normality results. Despite the optimal property of the MLE, it depends on the history of the network evolution, which is often difficult to obtain in practice. To avoid such shortcomings of the MLE, we propose the quasi maximum likelihood estimator (QMLE), a history-free remedy of the MLE. To prove the asymptotic normality of the QMLE, a connection between the PA model and Svante Janson's urn models is exploited.
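
The growth rule described above (each new node attaches to one existing node, chosen with probability proportional to f of its degree) can be simulated in a few lines. This is only a sketch of the data-generating process, not of the MLE or QMLE; the starting graph (two nodes joined by one edge) and the example function f(k) = k ** 0.5 are assumptions chosen for illustration.

```python
import random


def simulate_pa_tree(n_nodes, f, seed=0):
    """Grow a preferential attachment tree and return the list of degrees.

    Starts from two nodes joined by one edge (so n_nodes should be >= 2).
    Each new node attaches to one existing node chosen with probability
    proportional to f(degree).
    """
    rng = random.Random(seed)
    degrees = [1, 1]
    while len(degrees) < n_nodes:
        weights = [f(d) for d in degrees]
        target = rng.choices(range(len(degrees)), weights=weights, k=1)[0]
        degrees[target] += 1
        degrees.append(1)  # the new node arrives with a single edge
    return degrees


# Hypothetical sublinear choice f(k) = k ** 0.5, next to the linear case f(k) = k.
sublinear_degrees = simulate_pa_tree(500, lambda k: k ** 0.5)
linear_degrees = simulate_pa_tree(500, lambda k: float(k))
```

Because the network is a tree, the degrees always sum to twice the number of edges, which gives a simple sanity check on the simulation.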

This is (partially) joint work with Aad van der Vaart.
06/03/2019 12:00 PM

Queens' Building, Room: W316

Zahra Abdulla (King's College)

How to bring the fun back to Statistics teaching: Inclusive practices to combat statistical anxiety

One of the major challenges for teachers of Statistics to non-statisticians is the high level of statistical anxiety amongst students, students’ perceptions of their past experience of learning statistics or mathematics, and the potential negative impact of these attitudes or beliefs on how students learn statistics.

This talk will showcase how to use different types of inclusive practice activities and assessment methods, constructively aligned with the learning outcomes, to help develop students’ confidence in the classroom by providing a supportive learning environment that works through building trust, setting expectations and making statistics fun.
13/02/2019 12:00 PM

Queens' Building, Room: W316

Yi Yu (University of Bristol)

Univariate Mean Change Point Detection: Penalization, CUSUM and Optimality

The problem of univariate mean change point detection and localization based on a sequence of n independent observations with piecewise constant means has been intensively studied for more than half a century, and serves as a blueprint for change point problems in more complex settings. We provide a complete characterization of this classical problem in a general framework in which the upper bound on the noise variance sigma^2, the minimal spacing Delta between two consecutive change points and the minimal magnitude of the changes kappa, are allowed to vary with n. We first show that consistent localization of the change points is impossible when the signal-to-noise ratio kappa × sqrt(Delta) / sigma is uniformly bounded from above. In contrast, when kappa × sqrt(Delta) / sigma diverges with n at an arbitrarily slow rate, we demonstrate that two computationally efficient change point estimators, one based on the solution to an L0-penalized least squares problem and the other on the popular WBS algorithm, are both consistent and achieve a localization rate of the order log(n) × (sigma / kappa)^2. We further show that this rate is minimax optimal, up to a log(n) term.
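
For readers unfamiliar with the CUSUM statistic named in the title, the sketch below locates a single change in mean by maximising it over all split points. It covers only the one-change case on simulated data and is not the L0-penalized or WBS estimator analysed in the talk; the simulated values (kappa = 2, sigma = 0.5, change after observation 120) are illustrative assumptions.

```python
import math
import random


def cusum_change_point(x):
    """Return (t_hat, max_stat): the split t maximising the CUSUM statistic.

    For 1 <= t < n the statistic is
        sqrt(t * (n - t) / n) * |mean(x[:t]) - mean(x[t:])|.
    t_hat is the number of observations before the estimated change.
    """
    n = len(x)
    total = sum(x)
    best_t, best_stat = None, -1.0
    left_sum = 0.0
    for t in range(1, n):
        left_sum += x[t - 1]
        left_mean = left_sum / t
        right_mean = (total - left_sum) / (n - t)
        stat = math.sqrt(t * (n - t) / n) * abs(left_mean - right_mean)
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat


# Simulated example: the mean jumps by kappa = 2 after observation 120,
# with noise standard deviation sigma = 0.5.
rng = random.Random(1)
data = ([rng.gauss(0.0, 0.5) for _ in range(120)]
        + [rng.gauss(2.0, 0.5) for _ in range(80)])
t_hat, max_stat = cusum_change_point(data)
```

With a jump this large relative to the noise, `t_hat` should land at or very near 120; shrinking kappa or the spacing makes localisation harder, in line with the signal-to-noise condition in the abstract.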

Preprint arXiv
10/12/2018 12:00 PM

Queens' Building, Room: W316

Luciana Dalla Valle, University of Plymouth

Analysis of Twin Data via Bayesian Non-parametric Conditional Copula

Several studies on heritability in twins aim at understanding the different contribution of environmental and genetic factors to specific traits. Considering the national merit twin study, our purpose is to analyse correctly the influence of socio-economic status on the relationship between twins’ cognitive abilities. Our methodology is based on conditional copulas, which enable us to model the effect of a covariate driving the strength of dependence between the main variables. We propose a flexible Bayesian non-parametric approach for the estimation of conditional copulas, which can model any conditional copula density. Our methodology extends the work of Wu, Wang and Walker in 2015 by introducing dependence from a covariate in an infinite mixture model. Our results suggest that environmental factors are more influential in families with lower socio-economic position.
12/11/2018 12:00 PM

Queens' Building, Room: W316

Tom Berrett, University of Cambridge

Nonparametric independence testing via mutual information

In this talk I will discuss recent work on the problem of testing the independence of two multivariate random vectors, given a sample from the underlying population. Classical measures of dependence such as Pearson correlation or Kendall’s tau are often found not to capture the complex dependence between variables in modern datasets, and in recent years a large literature has developed on defining appropriate nonparametric measures of dependence and associated tests. We take the information-theoretic quantity mutual information as our starting point, and define a new test, which we call MINT, based on the estimation of this quantity, whose decomposition into joint and marginal entropies facilitates the use of recently-developed efficient entropy estimators derived from nearest neighbour distances.

The proposed critical values of our test, which may be obtained by simulation in the case where an approximation to one marginal is available or by permuting the data otherwise, facilitate size guarantees, and we provide local power analyses, uniformly over classes of densities whose mutual information satisfies a lower bound. Our ideas may be extended to provide new goodness-of-fit tests of normal linear models based on assessing the independence of our vector of covariates and an appropriately-defined notion of an error vector. The theory is supported by numerical studies on both simulated and real data.
29/10/2018 12:00 PM

Queens' Building, Room: W316

Luke Kelly, University of Oxford

Lateral trait transfer in phylogenetic inference

We are interested in inferring the phylogeny, or shared ancestry, of a set of taxa descended from a common ancestor. Lateral trait transfer is a form of reticulate evolutionary activity whereby species exchange evolutionary traits outside of ancestral relationships. The resulting trait histories are mosaics of the underlying species tree. To address this frequent source of model misspecification, we propose a novel model for species diversification which explicitly controls for the effect of lateral transfer.

The parameters of our likelihood are the solution of a sequence of differential equations over a phylogeny and the computational cost of this calculation is exponential in the number of taxa. We exploit symmetries in the differential systems and techniques from numerical analysis to build an efficient approximation scheme to reduce the computational cost of inference by an order of magnitude while remaining exact in an MCMC sense. We illustrate our method on a data set of lexical traits in Eastern Polynesian languages and demonstrate a significantly improved fit over the corresponding method which ignores lateral transfer. (This is joint work with Geoff Nicholls.)

26/11/2018 12:00 PM

Queens' Building, Room: W316

Emily Lines, QMUL School of Geography

Revealing Hidden Juvenile Tree Dynamics from Count Data Using Approximate Bayesian Computation

The juvenile life stage is a crucial determinant of forest dynamics and a first indicator of changes to species’ ranges under climate change. However, paucity of detailed re-measurement data of seedlings, saplings and small trees means that their demography is not well understood at large scales. In this study we quantify the effects of climate and density dependence on recruitment and juvenile growth and mortality rates of thirteen species measured in the Spanish Forest Inventory. Single-census sapling count data is used to constrain demographic parameters of a simple forest juvenile dynamics model using a likelihood-free parameterisation method, Approximate Bayesian Computation. Our results highlight marked differences between species, and the important role of climate and stand structure, in controlling juvenile dynamics. Recruitment had a hump-shaped relationship with conspecific density, and for most species conspecific competition had a stronger negative effect than heterospecific competition. Recruitment and mortality rates were positively correlated, and Mediterranean species showed on average higher mortality and lower growth rates than temperate species. Under climate change our model predicted declines in recruitment rates for almost all species. Defensible predictive models of forest dynamics should include realistic representation of critical early life-stage processes and our approach demonstrates that existing coarse count data can be used to parameterise such models. Approximate Bayesian Computation approaches have potentially wide ecological application, in particular to unlock information about past processes from current observations.
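
The likelihood-free idea behind Approximate Bayesian Computation can be shown with a rejection sampler on a toy problem. The sketch below is not the forest juvenile dynamics model from the talk: it assumes a single binomial survival probability with a Uniform(0, 1) prior, and the observed count (30 of 100) is made up.

```python
import random


def abc_rejection(observed_count, n_trials, n_draws, tolerance, seed=0):
    """ABC rejection sampler for a binomial success probability.

    Draw theta from a Uniform(0, 1) prior, simulate a count from
    Binomial(n_trials, theta), and keep theta when the simulated count is
    within `tolerance` of the observed count.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()
        simulated = sum(rng.random() < theta for _ in range(n_trials))
        if abs(simulated - observed_count) <= tolerance:
            accepted.append(theta)
    return accepted


# Made-up data: 30 surviving saplings out of 100.
posterior_draws = abc_rejection(observed_count=30, n_trials=100,
                                n_draws=10000, tolerance=2)
posterior_mean = sum(posterior_draws) / len(posterior_draws)
```

The accepted draws approximate the posterior without ever evaluating a likelihood; a smaller tolerance gives a closer approximation at the cost of fewer accepted draws.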

14/12/2017 4:00 PM

Queens' W316

H. Maruri-Aguilar, QMUL

Smoothing the logistic model

Smooth supersaturated polynomials have been used for building emulators in computer experiments. The response surfaces built with this method are simple to interpret and have spline-like properties (Bates et al., 2014). We extend the methodology to build smooth logistic regression models. The approach we follow is to regularize the likelihood with a penalization term that accounts for the roughness of the regression model.

The response surface follows data closely yet it is smooth and does not oscillate. We illustrate the method with simulated data and we also present a recent application to build a prediction rule for psychiatric hospital readmissions of patients with a diagnosis of psychosis. This application uses data from the OCTET clinical trial (Burns et al., 2013).
22/06/2018 12:00 PM

LG7, G O Jones Building

R. C. Weng, National Chengchi University

Online Bayesian inference for latent ability models

Latent ability models relate a set of observed variables to a set of latent ability variables. They include the paired and multiple comparison models, the item response theory models, etc. In this talk, first I will present an online Bayesian approximate method for online gaming analysis using paired and multiple comparison models. Experiments on game data show that the accuracy of the proposed online algorithm is competitive with state-of-the-art systems such as TrueSkill. Second, an efficient algorithm is proposed for Bayesian parameter estimation for item response theory models. Experiments show that the algorithm works well for real Internet ratings data. The proposed method is based on the Woodroofe-Stein identity.
24/05/2018 12:15 PM

W316, Queens' Building

V. Vinciotti, Brunel University London

Identifying overlapping terrorist cells from the Noordin Top actor-event network

Actor-event data are common in sociological settings, whereby one registers the pattern of attendance of a group of social actors at a number of events. We focus on 79 members of the Noordin Top terrorist network, who were monitored attending 45 events. The attendance or non-attendance of the terrorists at events defines the social fabric, such as group coherence and social communities. The aim of the analysis of such data is to learn about this social structure. Actor-event data are often transformed to actor-actor data in order to be further analysed by network models, such as stochastic block models. This transformation and such analyses lead to a natural loss of information, particularly when one is interested in identifying, possibly overlapping, subgroups or communities of actors on the basis of their attendance at events. In this paper we propose an actor-event model for overlapping communities of terrorists, which simplifies interpretation of the network. We propose a mixture model with overlapping clusters for the analysis of the binary actor-event network data, called manet, and develop a Bayesian procedure for inference. After a simulation study, we show how this analysis of the terrorist network has clear interpretative advantages over the more traditional approaches of network analysis.
30/03/2017 4:30 PM

Queens' W316

E. Saenz de Cabezon, University of La Rioja

Derivative talking while communicating maths

Derivatives can be presented in a nice, simple and intuitive way that everyone can relate to. In my talk I will not just concentrate on the specific topic of communicating derivatives to the general public but will give other examples, partly taken from the project "The Big Van Theory" that uses comedy as a vehicle to bring science to the general public. For further context, see the pages www.bigvanscience.com/index_en.html and www.youtube.com/channel/UCH-Z8ya93m7_RD02WsCSZYA.
06/04/2017 4:30 PM

Queens' W316

A. Steland, RWTH Aachen University

Large sample approximations and change-point procedures for quadratic forms of covariance matrices of high-dimensional t

New results about large sample approximations for statistical inference and change point analysis of high dimensional vector time series are presented. The results deal with related procedures that can be based on an increasing number of bilinear forms of the sample variance-covariance matrix as arising, for instance, when studying change-in-variance problems for projection statistics and shrinkage covariance matrix estimation.

Contrary to many known results, e.g. from random matrix theory, the results hold true without any constraint on the dimension, the sample size or their ratio, provided the weighting vectors are uniformly l1-bounded. Those results are in terms of (strong resp. weak) approximations by Gaussian processes for partial sum and CUSUM type processes, which imply (functional) central limit theorems under certain conditions. It turns out that the approximations by Gaussian processes hold not only without any constraint on the dimension, the sample size or their ratios, but even without any such constraint with respect to the number of bilinear forms. For the unknown variances and covariances of these bilinear forms nonparametric estimators are proposed and shown to be uniformly consistent.

We present related change-point procedures for the variance of projection statistics as naturally arising in principal component analyses and dictionary learning, amongst others. Further, we discuss how the theoretical results lead to novel distributional approximations and sequential methods for shrinkage covariance matrix estimators in the spirit of Ledoit and Wolf.

This is joint work with Rainer v. Sachs, UC Louvain, Belgium. The work of Ansgar Steland was supported by a grant from Deutsche Forschungsgemeinschaft (DFG), grant STE 1034/11-1.
16/03/2017 4:30 PM

Queens' W316

S. Liverani, Brunel University London

Modelling highly collinear spatial data

I will present a statistical approach to distinguish and interpret the complex relationship between several predictors and a response variable at the small area level, in the presence of i) high correlation between the predictors and ii) spatial correlation for the response. Covariates which are highly correlated create collinearity problems when used in a standard multiple regression model. Many methods have been proposed in the literature to address this issue. A very common approach is to create an index which aggregates all the highly correlated variables of interest. For example, it is well known that there is a relationship between social deprivation measured through the Index of Multiple Deprivation (IMD) and air pollution; this index is then used as a confounder in assessing the effect of air pollution on health outcomes (e.g. respiratory hospital admissions or mortality). However, it would be more informative to look specifically at each domain of the IMD and at its relationship with air pollution to better understand its role as a confounder in the epidemiological analyses. In this paper we illustrate how the complex relationships between the domains of IMD and air pollution can be deconstructed and analysed using profile regression, a Bayesian non-parametric model for clustering responses and covariates simultaneously. Moreover, we include an intrinsic spatial conditional autoregressive (ICAR) term to account for the spatial correlation of the response variable.
02/03/2017 4:30 PM

Queens' W316

G. Hughes, Freelance Data Scientist

Litics - an application of data visualisation

I will discuss the power of analytics in a political landscape. How can we revolutionise and communicate politics using analytics and data visualisation?

I will cover the shortcomings of our current interaction with politics and move on to discuss the power that analytics and visuals could hold if presented well.
07/12/2017 4:00 PM

Queens' W316

D. S. Robertson, University of Cambridge

Statistical inference in response-adaptive trials

Clinical trials typically randomise patients to the different treatment arms using a fixed randomisation scheme, such as equal randomisation. However, such schemes mean that a large number of patients will continue to be allocated to inferior treatments throughout the trial. To address this ethical issue, response-adaptive randomisation schemes have been proposed, which update the randomisation probabilities using the accumulating response data so that more patients are allocated to treatments that are performing well.

A long-standing barrier to using response-adaptive trials in practice, particularly from a regulatory viewpoint, is concern over bias and type I error inflation. In this talk, I will describe recent methodological advances that aim to address both of these concerns.

First I give a summary of a paper by Bowden and Trippa (2017) on unbiased estimation for response adaptive trials. The authors derive a simple expression for the bias of the usual maximum likelihood estimator, and propose three procedures for bias-adjusted estimation.

I then present recent work on adaptive testing procedures that ensure strong familywise error control. The approach can be used for both fully-sequential and block randomised trials, and for general adaptive randomisation rules. We show there can be a high price to pay in terms of power to achieve familywise error control for randomisation schemes with extreme allocation probabilities. However, for proposed Bayesian adaptive randomisation schemes in the literature, our adaptive tests maintain or increase the power of the trial.
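
To make the idea of response-adaptive randomisation concrete, the sketch below simulates one well-known Bayesian rule, Thompson sampling with Beta(1, 1) priors on binary responses. It is offered only as an example of how allocation probabilities can follow the accumulating data; it is not the bias-adjusted estimators or the adaptive testing procedures discussed in the talk, and the response rates are hypothetical.

```python
import random


def thompson_trial(success_probs, n_patients, seed=0):
    """Simulate a trial that randomises by Thompson sampling.

    Each arm has a Beta(1, 1) prior on its success probability. For every
    patient, one value is drawn from each arm's current Beta posterior and
    the patient is assigned to the arm with the largest draw.
    Returns (allocations, successes) per arm.
    """
    rng = random.Random(seed)
    n_arms = len(success_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(n_patients):
        draws = [rng.betavariate(1 + successes[a], 1 + failures[a])
                 for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: draws[a])
        if rng.random() < success_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    allocations = [successes[a] + failures[a] for a in range(n_arms)]
    return allocations, successes


# Hypothetical true response rates: 0.3 (control) and 0.7 (experimental).
allocations, successes = thompson_trial([0.3, 0.7], n_patients=200)
```

In this toy run most patients end up on the better arm, which is the ethical motivation; the resulting imbalance is also what makes the bias and error-control questions raised in the abstract non-trivial.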
26/10/2017 4:00 PM

Queens' W316

J. K. Rogers, University of Oxford

Analysis of recurrent events in the presence of dependent censoring

Heart failure is characterised by recurrent hospitalisations and yet often only the first is considered in clinical trial reports. In chronic diseases, such as heart failure, analysing all such hospitalisations gives a more complete picture of treatment benefit.

An increase in heart failure hospitalisations is associated with a worsening condition meaning that a comparison of heart failure hospitalisation rates, between treatment groups, can be confounded by the competing risk of death. Any analyses of recurrent events must take into consideration informative censoring that may be present. The Ghosh and Lin (2002) non-parametric analysis of heart failure hospitalisations takes mortality into account whilst also adjusting for different follow-up times and multiple hospitalisations per patient. Another option is to treat the incidence of cardiovascular death as an additional event in the recurrent event process and then adopt the usual analysis strategies. An alternative approach is the use of joint modelling techniques to obtain estimates of treatment effects on heart failure hospitalisation rates, whilst allowing for informative censoring.

This talk shall outline the different methods available for analysing recurrent events in the presence of dependent censoring and the relative merits of each method shall be discussed.
01/07/2017 4:30 PM

Queens' W316

K. Mukherjee, Lancaster University

Bootstrapping M-estimators in GARCH models

In this talk we discuss a class of M-estimators of parameters in GARCH models. The class of estimators contains least absolute deviation and Huber's estimator as well as the well-known quasi maximum likelihood estimator. For some estimators, the asymptotic normality results are obtained assuming only the existence of a fractional unconditional moment of the error distribution, together with some mild smoothness and moment assumptions on the score function. Next we analyse the bootstrap approximation of the distribution of M-estimators. It is seen that the bootstrap distribution (given the data) is a consistent estimate (in probability) of the distribution of the M-estimators. We propose an algorithm for computing the M-estimates that also makes it easy to compute the bootstrap replicates from the given data in software. We illustrate our algorithm through a simulation study and the analysis of recent financial data.
22/03/2018 4:00 PM

W316, Queens' Building

M. Leonelli, University of Glasgow

Flexible approaches for inference on extreme events

Precise knowledge of the tail behaviour of a distribution as well as predictive capabilities about the occurrence of extremes are fundamental in many areas of application, for instance environmental sciences and finance. Standard inferential routines for extremes require the imposition of arbitrary assumptions which may negatively affect the statistical estimates. The model class of extreme value mixture models, on the other hand, allows for the precise estimation of the tail of a distribution without requiring any arbitrary assumption. After reviewing these models, the talk will discuss two extensions of this approach I have been involved in. First, situations where different extreme structures may be useful to perform inference over the extremes of a time series will be discussed. These are dealt with using a novel changepoint approach for extremes, where the changepoints are estimated via Bayesian MCMC routines. Second, an extension of extreme value mixture models to investigate extreme dependence in multivariate applications is introduced and its usefulness is demonstrated using environmental data.
10/05/2018 12:00 PM

W316, Queens' Building

F. Ricciardi, UCL

Bandwidth selection for the regression discontinuity design: a clustering approach using a Dirichlet process mixture model

The regression discontinuity design (RDD) is a quasi-experimental design that estimates the causal effects of a treatment when its assignment is defined by a threshold value for a continuous assignment variable. The RDD assumes that subjects with measurements within a bandwidth around the threshold belong to a common population, so that the threshold can be seen as a randomising device assigning treatment to those falling just above the threshold and withholding it from those who fall just below.

Bandwidth selection represents a compelling decision for the RDD analysis, since there is a trade-off between its size and the bias and precision of the estimates: if the bandwidth is small, the bias is generally low but so is precision; if the bandwidth is large, the reverse is true. A number of methods to select the “optimal” bandwidth have been proposed in the literature, but their use in practice is limited.

We propose a methodology that, tackling the problem from an applied point of view, considers units’ exchangeability, i.e., their similarity with respect to measured covariates, as the main criterion to select subjects for the analysis, irrespective of their distance from the threshold. We use a clustering approach based on a Dirichlet process mixture model and then evaluate homogeneity within each cluster using the posterior distribution for the parameters defining the mixture, including in the final RDD analysis only clusters which show high homogeneity. We illustrate the validity of our methodology using a simulated experiment.
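
For context, the conventional bandwidth-based RDD estimate that the abstract contrasts with can be sketched as a local linear fit on each side of the threshold. This is not the Dirichlet process clustering approach proposed in the talk; the simulated data, the true effect of 2 and the bandwidth of 0.5 are all illustrative assumptions.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rdd_estimate(x, y, threshold, bandwidth):
    """Local linear RDD estimate: gap between the two fitted lines at the
    threshold, using only units within `bandwidth` of it."""
    below = [(xi - threshold, yi) for xi, yi in zip(x, y)
             if threshold - bandwidth <= xi < threshold]
    above = [(xi - threshold, yi) for xi, yi in zip(x, y)
             if threshold <= xi <= threshold + bandwidth]
    a_below, _ = fit_line(*zip(*below))
    a_above, _ = fit_line(*zip(*above))
    return a_above - a_below


# Simulated data: treatment is assigned when x >= 0 and the true effect is 2.
rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(2000)]
y = [1 + 0.5 * xi + (2.0 if xi >= 0 else 0.0) + rng.gauss(0, 0.5) for xi in x]
effect = rdd_estimate(x, y, threshold=0.0, bandwidth=0.5)
```

Rerunning `rdd_estimate` with smaller bandwidths uses fewer units and gives noisier estimates, which is the precision side of the trade-off described above.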
01/02/2018 4:00 PM

W316, Queens' Building

S. Conde, QMUL

Log-linear LASSO selection, observational causal inference and Markov-stability toxicological communities

This talk will have three differentiated parts. In the first one, I will present some results in which we compare LASSO model selection methods with classical ones in sparse multi-dimensional contingency tables formed with binary variables with a log-linear modelling parametrization. In the second one, I will talk about Mendelian randomization in the presence of multiple instruments and will present results of an application to a data set with multiple metabolites. In the third one, I will talk about a clustering method that uses high-dimensional network theory (Markov Stability), and an application of it to a data set that contains messenger ribonucleic acids (mRNAs) and micro ribonucleic acids (miRNAs) from a toxicological experiment.
18/01/2018 4:00 PM

W316, Queens' Building

R. A. Bailey, University of St Andrews

Hasse diagrams as a visual aid for linear models and analysis of variance

The expectation part of a linear model is often presented as an equation with unknown parameters, and the reader is supposed to know that this is shorthand for a whole family of expectation models (for example, is there interaction or not?). I find it helpful to show the family of models on a Hasse diagram. By changing the lengths of the edges in this diagram, we can go a stage further and use it as a visual display of the analysis of variance.
11/01/2018 4:00 PM

W316, Queens' Building

A. J. Mason, LSHTM

A Bayesian framework for addressing informative missingness in the analysis of clinical trials

The analyses of randomised controlled trials (RCTs) with missing data typically assume that, after conditioning on the observed data, the probability of missing data does not depend on the patient's outcome, and so the data are ‘missing at random’ (MAR). This assumption is often questionable, for example because patients in relatively poor health may be more likely to drop out. In these cases, methodological guidelines recommend sensitivity analyses to recognise data may be ‘missing not at random’ (MNAR), and call for the development of practical, accessible approaches for exploring the robustness of conclusions to MNAR assumptions.

We propose a Bayesian framework for this setting, which includes a practical, accessible approach to sensitivity analysis and allows the analyst to draw on expert opinion. To facilitate the implementation of this strategy, we are developing a new web-based tool for eliciting expert opinion about outcome differences between patients with missing versus complete data. The IMPROVE study, a multicentre trial which compares endovascular strategy (EVAR) with open repair for patients with ruptured abdominal aortic aneurysm, was used in the initial development work. In this seminar, we will discuss our proposed framework and demonstrate our elicitation tool, using the IMPROVE trial for illustration.
30/11/2017 4:00 PM

Queens' W316

S. F. Williamson, Lancaster University

A Bayesian adaptive design for clinical trials in rare diseases

Development of treatments for rare diseases is challenging due to the limited number of patients available for participation. Learning about treatment effectiveness with a view to treat patients in the larger outside population, as in the traditional fixed randomised design, may not be a plausible goal. An alternative goal is to treat the patients within the trial as effectively as possible. Using the framework of finite-horizon Markov decision processes and dynamic programming (DP), a novel randomised response-adaptive design is proposed which maximises the total number of patient successes in the trial. Several performance measures of the proposed design are evaluated and compared to alternative designs through extensive simulation studies. For simplicity, a two-armed trial with binary endpoints and immediate responses is considered. However, further evaluations illustrate how the design behaves when patient responses are delayed, and modifications are made to improve its performance in this more realistic setting.

Simulation results for the proposed design show that: (i) the percentage of patients allocated to the superior treatment is much higher than in the traditional fixed randomised design; (ii) relative to the optimal DP design, the power is largely improved upon and (iii) the corresponding treatment effect estimator exhibits only a very small bias and mean squared error. Furthermore, this design is fully randomised which is an advantage from a practical point of view because it protects the trial against various sources of bias.

Overall, the proposed design strikes a very good balance between the power and patient benefit trade-off which greatly increases the prospects of a Bayesian bandit-based design being implemented in practice, particularly for trials involving rare diseases and small populations.

Keywords: Clinical trials; Rare diseases; Bayesian adaptive designs; Sequential allocation; Bandit models; Dynamic programming; Delayed responses.
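
The finite-horizon dynamic programming benchmark mentioned in the abstract can be sketched for the simplest case: two arms, binary and immediate responses, and independent Beta(1, 1) priors (the priors are an assumption made here). The code computes the expected number of successes under the optimal deterministic allocation by backward induction; it does not implement the randomised modification that the talk proposes.

```python
from functools import lru_cache


def optimal_expected_successes(n_patients):
    """Expected successes under the optimal allocation for a two-armed
    Bernoulli bandit with independent Beta(1, 1) priors, computed by
    backward induction over the states (s1, f1, s2, f2)."""

    @lru_cache(maxsize=None)
    def value(s1, f1, s2, f2):
        if s1 + f1 + s2 + f2 == n_patients:
            return 0.0
        p1 = (s1 + 1) / (s1 + f1 + 2)  # posterior mean of arm 1
        p2 = (s2 + 1) / (s2 + f2 + 2)  # posterior mean of arm 2
        v1 = (p1 * (1 + value(s1 + 1, f1, s2, f2))
              + (1 - p1) * value(s1, f1 + 1, s2, f2))
        v2 = (p2 * (1 + value(s1, f1, s2 + 1, f2))
              + (1 - p2) * value(s1, f1, s2, f2 + 1))
        return max(v1, v2)

    return value(0, 0, 0, 0)
```

For two patients the value is 13/12: the first patient succeeds with probability 1/2, after which the second is allocated to whichever arm has the higher posterior mean. The number of states grows quickly with the trial size, so this direct recursion is only practical for small trials.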
23/11/2017 4:00 PM

Queens' W316

K. Diaz-Ordaz, LSHTM

Doubly robust instrumental variable methods for a trial with non-adherence

We consider estimation of the causal treatment effects in randomised trials with non-adherence, where there is an interest in treatment effects modification by baseline covariates.

Assuming randomised treatment is a valid instrument, we describe two doubly robust (DR) estimators of the parameters of a partially linear instrumental variable model for the average treatment effect on the treated, conditional on baseline covariates. The first method is a locally efficient g-estimator, while the second is a targeted minimum loss-based estimator (TMLE).

These two DR estimators can be viewed as a generalisation of the two-stage least squares (TSLS) method in the instrumental variable methodology to a semiparametric model with weaker assumptions. We exploit recent theoretical results to extend the use of data-adaptive machine learning to the g-estimator. A simulation study is used to compare the estimators' finite-sample performance (1) when fitted using parametric models, and (2) using Super Learner, with the TSLS.

Data-adaptive DR estimators have lower bias and improved precision when compared to incorrectly specified parametric DR estimators. Finally, we illustrate the methods by obtaining the causal effect on the treated of receiving cognitive behavioural therapy training on pain-related disability, with heterogeneous treatment effects by depression at baseline, using the COPERS (COping with persistent Pain, Effectiveness Research in Self-management) trial.
16/11/2017 4:00 PM

Queens' W316

A. J. Gibberd, Imperial College London

Squashing the Gaussian: regularised estimation of dynamic graphical models

Many modern-day datasets exhibit multivariate dependence structure that can be modelled using networks or graphs. For example, in social sciences, biomedical studies, financial applications etc., the association of datasets with latent network structures is ubiquitous. Many of these datasets are time-varying in nature, and that motivates the modelling of dynamic networks. In this talk I will present some of our recent research which looks at the challenging task of recovering such networks, even in high-dimensional settings.

Our approach studies the canonical Gaussian graphical model whereby patterns of variable dependence are encoded through partial correlation structure. I will demonstrate how regularisation ideas such as the graphical lasso may be implemented when data is drawn i.i.d. but how this may fail in non-stationary settings. I will then present an overview of our work (with Sandipan Roy, UCL) which extends such methods to dynamic settings. By furnishing appropriate convex M-estimators that enforce smoothness and sparsity assumptions on the Gaussian we demonstrate an ability to recover the true underlying network structure. I will present both synthetic experiments and theoretical analysis which shed light on the performance of these methods.
02/11/2017 4:00 PM

Queens' W316

A. Y. Vasilyev, QMUL

Optimal control of eye-movements during visual search

We study the problem of optimal oculomotor control during the execution of visual search tasks. We introduce a computational model of human eye movements, which takes into account various constraints of the human visual and oculomotor systems. In the model, the choice of the subsequent fixation location is posed as a problem of stochastic optimal control, which relies on reinforcement learning methods. We show that if biological constraints are taken into account, the trajectories simulated under the learned policy share both basic statistical properties and scaling behaviour with human eye movements. We validated our model simulations with human psychophysical eye-tracking experiments.
09/02/2017 4:30 PM

M203

J. M. S. Wason, University of Cambridge

Novel designs for trials with multiple treatments and biomarkers

Multi-arm trials are increasingly being recommended for use in diseases where multiple experimental treatments are awaiting testing. This is because they allow a shared control group, which considerably reduces the sample size required compared to separate randomised trials. Further gains in efficiency can be obtained by introducing interim analyses (multi-arm multi-stage, MAMS trials). At the interim analyses, a variety of modifications are possible, including changing the allocation to different treatments, dropping of ineffective treatments or stopping the trial early if sufficient evidence of a treatment being superior to control is found. These modifications allow focusing of resources on the most promising treatments, and thereby increase both the efficiency and ethical properties of the trial.

In this talk I will describe some different types of MAMS designs and how they may be useful in different situations. I will also discuss the design of trials that test efficacy of multiple treatments in different patient subgroups. I propose a design that incorporates biological hypotheses about links between biomarker subgroups and the effects of treatments, but allows alternative links to be formed during the trial. The statistical properties of this design compare well to alternative approaches available.
12/01/2017 4:30 PM

BR 3.02

N. Stallard, University of Warwick

Seamless phase II/III clinical trials incorporating early outcome data

Most statistical methodology for confirmatory phase III clinical trials focuses on the comparison of a control treatment with a single experimental treatment, with selection of this experimental treatment made in an earlier exploratory phase II trial. Recently, however, there has been increasing interest in methods for adaptive seamless phase II/III trials that combine the treatment selection element of a phase II clinical trial with the definitive analysis usually associated with phase III clinical trials. A number of methods have been proposed for the analysis of such trials to address the statistical challenge of ensuring control of the type I error rate. These methods rely on the independence of the test statistics used in the different stages of the trial.

In some settings the primary endpoint can be observed only after long-term follow-up, so that at the time of the first interim analysis primary endpoint data are available for only a relatively small proportion of the patients randomised. In this case if short-term endpoint data are also available, these could be used along with the long-term data to inform treatment selection. The use of such data breaks the assumption of independence underlying existing analysis methods. This talk presents new methods that allow for the use of short-term data. The new methods control the overall type I error rate, either when the treatment selection rule is pre-specified, or when it can be fully flexible. In both cases there is a gain in power from the use of the short-term endpoint data when the short and long-term endpoints are correlated.
15/12/2016 4:30 PM

BR 3.02

R. Silva, UCL

Some machine learning tools to aid causal inference

Causal inference from observational data requires untestable assumptions. As assumptions may fail, it is important to be able to understand how conclusions vary under different premises. Machine learning methods are particularly good at searching for hypotheses, but they do not always provide ways of expressing a continuum of assumptions from which causal estimands can be proposed. We introduce one family of assumptions and algorithms that can be used to provide alternative explanations for treatment effects. If we have time, I will also discuss some other developments on the integration of observational and interventional data using a nonparametric Bayesian approach.
01/12/2016 4:30 PM

BR 3.02

M. H. Davies, GSK

The use and abuse of statistics in industry

A personal perspective, gained from nearly 30 years of applying statistical methods in a variety of industries (FMCG, Defence, Paper, Pharmaceuticals and Vaccines).

The emphasis is on the application (not the theory) of statistics to support the manufacturing, quality control and R&D functions.

The objective of this session is to present real life examples/situations to raise awareness and stimulate discussion.

Buzz words: Experimental Design (DoE), Taguchi Methods, LeanSigma, Design for Manufacture (DfM), Process Capability, Statistical Process Control (SPC), Analytical Method Validation and Good Manufacturing Practice (GMP).
24/11/2016 4:30 PM

BR 3.02

J. Bowden, University of Bristol

Graphical tools to detect and adjust for invalid instruments in Mendelian randomization

The funnel plot is a graphical visualisation of summary data estimates from a meta-analysis, and is a useful tool for detecting departures from the standard modelling assumptions. Although perhaps not widely appreciated, a simple extension of the funnel plot can help to facilitate an intuitive interpretation of the mathematics underlying a meta-analysis at a more fundamental level, by equating it to determining the centre of mass of a physical system. We exploit this fact to forge new connections between statistical inference and bias adjustment in the evidence synthesis and causal inference literatures. An on-line web application (named the 'Meta-Analyzer') is introduced to further facilitate this physical analogy. Finally, we demonstrate the utility of the Meta-Analyzer as a tool for detecting and adjusting for invalid instruments within the context of Mendelian randomization.
03/11/2016 4:30 PM

BR 3.02

V. V. Anisimov, University of Glasgow

Modern trends in predictive modelling clinical trial operations

Statistical design and operation of clinical trials are affected by stochasticity in patient enrolment and various events' appearance. The complexity of large trials and multi-state hierarchic structure of various operational processes require developing modern predictive analytical techniques using stochastic processes with random parameters in the empirical Bayesian setting for efficient modelling and predicting trial operation.

Forecasting patient enrolment is one of the bottleneck problems, as uncertainties in enrolment substantially affect trial completion time, supply chain and associated costs. An analytic methodology for predictive patient enrolment modelling using a Poisson-gamma model was developed by Anisimov and Fedorov (2005–2007). This methodology is extended further to risk-based monitoring of interim trial performance for different metrics associated with enrolment, screen failures, various events and AE, and to detecting outliers.
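
As a rough illustration of why a Poisson-gamma model gives closed-form predictions, the sketch below assumes a generic formulation that may differ in detail from the one presented in the talk: each centre recruits as a Poisson process whose rate is drawn from a Gamma(alpha, rate=beta) distribution, and centres are independent. The function names and parameter values are hypothetical.

```python
def enrolment_prediction(n_centres, t, alpha, beta):
    """Mean and variance of total enrolment by time t.

    Assumes each centre's rate is Gamma(shape=alpha, rate=beta) and that,
    given its rate, a centre's count over time t is Poisson(rate * t).
    """
    mean_one = alpha * t / beta                      # E[count] = t * E[rate]
    var_one = mean_one + alpha * t ** 2 / beta ** 2  # Poisson part + rate uncertainty
    return n_centres * mean_one, n_centres * var_one


def updated_rate_params(alpha, beta, n_recruited, time_open):
    """Conjugate update of one centre's rate after interim data."""
    return alpha + n_recruited, beta + time_open


# Hypothetical example: 10 centres, horizon of 2 time units.
mean_total, var_total = enrolment_prediction(10, 2.0, alpha=2.0, beta=4.0)
```

The variance exceeds the mean because the uncertainty in each centre's rate is added to the Poisson variation, which is what makes the marginal count overdispersed relative to a plain Poisson model.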

As the next stage of generalization, to model the complicated hierarchic processes on top of enrolment a new methodology using evolving stochastic processes is proposed. This class of processes provides a rather general and unified framework to describe various operational processes including follow-up patients, patients' visits, various events and associated costs.

The technique for evaluating predictive distributions, means and credibility bounds for evolving processes is developed (Anisimov, 2016). Some applications to modelling operational characteristics in clinical trials are considered. For these models, predictive characteristics are derived in closed form, so Monte Carlo simulation is not required.

References
1. Anisimov V., Predictive hierarchic modelling of operational characteristics in clinical trials. Communications in Statistics - Simulation and Computation, 45, 05, 2016, 1477–1488.
27/10/2016 4:30 PM

BR 3.02

J. E. Griffin, University of Kent

Adaptive MCMC schemes for variable selection problems

Data sets with many variables (often in the hundreds, thousands, or more) are routinely collected in many disciplines. This has led to interest in variable selection in regression models with a large number of variables. A standard Bayesian approach defines a prior on the model space and uses Markov chain Monte Carlo methods to sample the posterior. Unfortunately, the size of the space (2^p if there are p variables) and the use of simple proposals in Metropolis-Hastings steps have led to samplers that mix poorly over models. In this talk, I will describe two adaptive Metropolis-Hastings schemes which adapt an independence proposal to the posterior distribution. This leads to substantial improvements in the mixing over standard algorithms in large data sets. The methods will be illustrated on simulated and real data with hundreds or thousands of possible variables.
09/06/2016 4:30 PM

M103

B. Zhang, QMUL

Functional mixed-effects analysis of variance for human movement patterns

By using advanced motion capture systems, human movement data can be collected densely over time. We construct a functional mixed-effects model to analyse this kind of data. This model is flexible enough to study functional data which are collected from orthogonal designs. Covariance structure plays a central role in functional data analysis. In this method, within-curve covariance is analysed from a stochastic process perspective and the between-curve covariance structure of functional responses is determined by the design. In particular, we are interested in the problem of hypothesis testing and generalize the functional F test to the mixed-effects analysis of variance.

We apply this method to analyse movement patterns in patients with cerebral palsy. Hasse diagrams are used to represent the structure of these gait data from an orthogonal block design. In order to assess the effects of ankle-foot orthoses, which are commonly prescribed to patients with abnormal gait patterns, pointwise F tests and functional F tests are used. To explore further how ankle-foot orthoses influence human movement, we are collecting more gait data in a split-plot design. Randomizations of this design are based on Bailey (2008).
02/06/2016 4:30 PM

M103

R. Killick, Lancaster University

Online changepoint detection: a new way of thinking

Online changepoint detection has its origins in statistical process control where once a changepoint is detected the process is stopped, the fault rectified and the process monitoring then begins in control again. In modern day applications such as network traffic and medical monitoring it is infeasible to adopt this strategy. In particular the out of control monitoring is often vital to diagnosis of the problem; instead of fault analysis monitoring continues throughout the period of change and a second change is indicated when the process returns to the control state.

Recent offline changepoint detection literature has demonstrated the importance of considering the changepoints globally and not focusing on detecting a single changepoint in the presence of several. In this talk we will argue that this is also the case for online changepoint detection and discuss what is meant by a "global" view in online detection. This presents several problems as the standard definitions of average run length and detection delay are not clearly applicable. Following consideration of this we show the increased accuracy in future (and past) changepoint detections when taking this viewpoint and demonstrate the method on real world applications.
28/04/2016 4:30 PM

M103

L. I. Pettit, QMUL

Measuring discordancy between prior and data by a mixture of conjugate priors

In Bayesian inference the choice of prior distribution is important. The prior represents beliefs and knowledge about the parameter(s). For data from an exponential family a convenient prior is a conjugate one. This can be updated to find the posterior distribution and experts can choose the parameters as equivalent to imaginary samples. This technique can also be used to combine results from different studies. A disadvantage is that we are unaware of any degree of incompatibility between the prior chosen and the data obtained. This could represent overconfidence by selecting too small a variance or indicate differences between studies.

We suggest employing a mixture of conjugate priors which have the same mean but different finite variances. We give a large weight to the component of the mixture with smaller variance. The posterior weight on the first component of the mixture will be a measure of how discordant the data and the expert's prior are, or how different the two studies are. We consider choosing the size of the larger variance by considering the difference in information between the two priors. We also investigate the effect of different parameterisations of the parameter of interest. We consider a number of distributions and compare this method for measuring the discordancy with previously suggested diagnostics. This is joint work with Mitra Noosha.
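
The posterior-weight idea can be sketched for one conjugate family. The example below assumes binomial data with a two-component Beta mixture prior; this choice of family, the function names and all parameter values are illustrative assumptions and are not taken from the talk. A posterior weight on the tight component that falls well below its prior weight signals discordancy between prior and data.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal(x, n, a, b):
    """Log marginal likelihood of x successes in n trials under a Beta(a, b) prior.

    The binomial coefficient is omitted because it cancels between components.
    """
    return log_beta(a + x, b + n - x) - log_beta(a, b)


def posterior_weight_tight(x, n, tight, diffuse, w_tight=0.9):
    """Posterior weight on the small-variance component of the mixture prior."""
    l1 = log(w_tight) + log_marginal(x, n, *tight)
    l2 = log(1.0 - w_tight) + log_marginal(x, n, *diffuse)
    m = max(l1, l2)  # subtract the maximum for numerical stability
    return exp(l1 - m) / (exp(l1 - m) + exp(l2 - m))


# Hypothetical priors with the same mean (0.5) and different variances.
tight, diffuse = (20.0, 20.0), (2.0, 2.0)
w_concordant = posterior_weight_tight(10, 20, tight, diffuse)  # data near prior mean
w_discordant = posterior_weight_tight(19, 20, tight, diffuse)  # data far from prior mean
```

With data close to the prior mean the weight on the tight component rises above its prior value, while strongly conflicting data shift most of the weight to the diffuse component.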
21/04/2016 4:30 PM

M103

T. Sharia, RHUL

Stochastic approximation and on-line estimation algorithms

Asymptotic behaviour of a wide class of stochastic approximation procedures will be discussed. This class of procedures has three main characteristics: truncations with random moving bounds, a matrix-valued random step-size sequence, and a dynamically changing random regression function. A number of examples will be presented to demonstrate the flexibility of this class, with the main emphasis on on-line procedures for parametric statistical estimation. The proposed method ensures an efficient use of auxiliary information in the estimation process, and is consistent and asymptotically efficient under certain regularity conditions.
31/03/2016 4:30 PM

M103

D. Stowell, QMUL

Characterising networks of calling birds from their timing

When you encounter a flock of birds, with individuals calling to each other, it is often clear that the birds are influencing one another through their calls. Can we infer the structure of their social network, simply by analysing the timing of calls? We introduce a model-based analysis for temporal patterns of animal call timing, originally developed for networks of firing neurons. This has advantages over previous methods in that it can correctly handle common-cause confounds and provides a generative model of call patterns with explicit parameters for the parallel influences between individuals. We illustrate with data recorded from songbirds, to make inferences about individual identity and about patterns of influence in communication networks.
17/03/2016 4:30 PM

M103

K. J. McConway, The Open University

Statistics and the media: a statistician’s view

How should statisticians interact with the media? What should statisticians know about how the media operate? For several years I have worked (occasionally) with journalists, and provided expert statistical comments on press releases and media stories. I will describe my experience of the many-sided relationship between researchers, press officers, journalists, and the public they are writing for, from the point of view of the statisticians who are also involved. I will discuss the complicated nature of numbers as facts. Using examples such as the question of whether mobile phones cause brain tumours, I will explain how none of the parties in this relationship makes things easy for the others. Finally I will present a few reasons for being optimistic about the position of statistics in the media.
03/03/2016 4:30 PM

M103

J. T. Griffin, QMUL

Estimating malaria vaccine efficacy and predicting population-level impact

The final results of a multi-centre clinical trial of a vaccine against malaria, RTS,S, were published in 2015. Along with three other groups, we had access to the trial data to use as inputs into mathematical models of malaria transmission. Public health funding bodies and policy makers would like to know how the trial results generalise to other settings. This talk describes how we made use of the data to predict the population-level impact across Africa that vaccination might have, and how the uncertainty from various sources was incorporated.
11/02/2016 4:30 PM

M103

K. Yu, Brunel University London

Tail-index regression for both small sample bias and massive data analysis

Tail-index is an important measure to gauge the heavy-tailed behavior of a distribution. Tail-index regression is introduced when covariate information is available. Existing models may face two challenges: extreme analysis or tail modelling with small to moderate size data usually results in small sample bias, and on the other hand, the issue of storage and computational efficiency with massive data sets also exists for Tail-index regression. In this talk we present new tail-index regression methods, which have unbiased estimates of both regression coefficients and tail-index under small data, and are able to support online analytical processing (OLAP) without accessing the raw data in massive data analysis.
28/01/2016 4:30 PM

M103

J. E. Barrett, UCL

Adaptive clinical trials: selective recruitment designs

In a selective recruitment design not every patient is recruited onto a clinical trial. Instead, we evaluate how much statistical information a patient is expected to provide (as a function of their covariates) and only recruit patients that will provide a sufficient level of expected information. Patients deemed statistically uninformative are rejected. Allocation to a treatment arm is also done in a manner that maximises the expected information gain.

The benefit of selective recruitment is that a successful trial can potentially be achieved with fewer recruits, thereby leading to economic and ethical advantages. We will explore various methods for quantifying how informative a patient is based on uncertainty sampling, the posterior entropy, the expected generalisation error and variance reduction. The protocol will be applied to both time-to-event outcomes and binary outcomes. Results from experimental data and numerical simulations will be presented.
21/01/2016 4:30 PM

M103

A. P. Mander, University of Cambridge

The Product of Independent Probability dose Escalation (PIPE) for dual agent dose escalation

Dual-agent trials are now increasingly common in oncology research, and many proposed dose-escalation designs are available in the statistical literature. Despite this, the translation from statistical design to practical application is slow, as has been highlighted in single-agent phase I trials, where a 3+3 rule-based design is often still used. To expedite this process, new dose-escalation designs need to be not only scientifically beneficial but also easy to understand and implement by clinicians. We propose a curve-free (nonparametric) design for a dual-agent trial in which the model parameters are the probabilities of toxicity at each of the dose combinations. We show that it is relatively trivial for a clinician's prior beliefs or historical information to be incorporated in the model and updating is fast and computationally simple through the use of conjugate Bayesian inference. Monotonicity is ensured by considering only a set of monotonic contours for the distribution of the maximum tolerated contour, which defines the dose-escalation decision process. Varied experimentation around the contour is achievable, and multiple dose combinations can be recommended to take forward to phase II. Code for R, Stata and Excel is available for implementation.
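
The conjugate updating step mentioned above can be sketched as follows, assuming an independent Beta prior on the toxicity probability at each dose combination. The grid, prior parameters and data are hypothetical, and the sketch does not implement the monotonic-contour step that drives the escalation decision in PIPE.

```python
def update_beta(prior, n_treated, n_toxic):
    """Conjugate Beta update for one dose combination."""
    a, b = prior
    return (a + n_toxic, b + n_treated - n_toxic)


def posterior_mean(params):
    """Posterior mean toxicity probability for Beta(a, b)."""
    a, b = params
    return a / (a + b)


# Hypothetical 2x2 grid keyed by (level of agent A, level of agent B).
priors = {(0, 0): (0.5, 4.5), (0, 1): (1.0, 4.0), (1, 0): (1.0, 4.0), (1, 1): (2.0, 3.0)}
# Hypothetical cohort data as (n_treated, n_toxic); untested combinations are absent.
data = {(0, 0): (3, 0), (0, 1): (3, 1)}

posteriors = {
    combo: update_beta(prior, *data.get(combo, (0, 0)))
    for combo, prior in priors.items()
}
```

Because each update only adds counts to the prior parameters, the posterior for the whole grid can be recomputed after every cohort without any numerical integration.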
14/01/2016 4:30 PM

M203

W. Y. Yeung, Lancaster University

Bayesian adaptive dose-escalation procedures utilizing a gain function with binary and continuous responses

The main purpose of dose-escalation trials is to identify the dose(s) that are safe and efficacious for further investigations in later studies. Therefore, dose-limiting events (DLEs) and indicative responses of efficacy should be considered in the dose-escalation procedure.

In this presentation, Bayesian adaptive approaches that incorporate both safety and efficacy will be introduced. A logistic regression model is used for modelling the probabilities of an occurrence of a DLE at their corresponding dose levels while a linear log-log or a non-parametric model is used for efficacy. Escalation decisions are based on the combination of both models through a gain function to balance efficacy utilities versus costs for safety risks. These dose-escalation procedures aim to achieve either one objective, estimating the optimal dose (calculated via the gain function and interpreted as the safe dose which gives maximum beneficial therapeutic effect), or two objectives, accurately estimating both the maximum tolerated dose (MTD), the highest dose that is considered safe, and the optimal dose at the end of a dose-escalation study. The recommended dose(s) obtained under these procedures provide information about the safety and efficacy profile of the novel drug to facilitate later studies. We evaluate the different strategies via simulations based on an example constructed from a real trial. To assess the robustness of the single-objective approach, scenarios where the efficacy responses of subjects are generated from an Emax model, but treated as coming from a linear log-log model, are considered. We also find that the non-parametric model estimates the efficacy responses well for a large range of different underlying true shapes. The dual-objective approaches give promising results in terms of having most of their recommendations made at the two real target doses.
17/12/2015 4:30 PM

M203

L. Giraitis, QMUL

Testing for stability of the mean of heteroskedastic time series

Time series models are often fitted to the data without preliminary checks for stability of the mean and variance, conditions that may not hold in much economic and financial data, particularly over long periods. Ignoring such shifts may result in fitting models with spurious dynamics that lead to unsupported and controversial conclusions about time dependence, causality, and the effects of unanticipated shocks. In spite of what may seem to be obvious differences between a time series of independent variates with changing variance and a stationary conditionally heteroskedastic (GARCH) process, such processes may be hard to distinguish in applied work using basic time series diagnostic tools. We develop and study some practical and easily implemented statistical procedures to test the mean and variance stability of uncorrelated and serially dependent time series. Application of the new methods to analyze the volatility properties of stock market returns leads to some unexpected findings concerning the advantages of modeling time-varying changes in unconditional variance.

Joint work with V. Dalla and P. C. B. Phillips
03/12/2015 4:30 PM

M203

N. E. Fenton, QMUL

Bayesian networks: why smart data is better than big data

Due to relatively recent algorithmic breakthroughs Bayesian networks have become an increasingly popular technique for risk assessment and decision analysis. This talk will provide an overview of successful applications (including transport safety, medical, law/forensics, operational risk, and football prediction). What is common to all of these applications is that the Bayesian network models are built using a combination of expert judgment and (often very limited) data. I will explain why Bayesian networks 'learnt' purely from data - even when 'big data' is available - generally do not work well, and will also explain the impediments to wider use of Bayesian networks.
26/11/2015 4:30 PM

M203

S. W. Hee, University of Warwick

Decision-theoretic designs for small clinical trials

Small clinical trials are sometimes unavoidable, for example, in the setting of rare diseases, specifically targeted subpopulations and vulnerable populations. The most common designs used in these trials are based on the frequentist paradigm with either a large hypothesized effect size or relaxed type I and/or II error rates. One of the novel designs that has been proposed is the Bayesian decision-theoretic approach, which is more intuitive for trials whose aim is to decide whether or not to conduct further clinical research with the experimental treatment. In this talk, I will start with a review of Bayesian decision-theoretic designs followed by a more detailed discussion on designing a series of trials using this framework.
12/11/2015 4:30 PM

M203

M. S. Massa, University of Oxford

Statistical modelling with graphical models

Graphical models have been studied and formalised across many communities of researchers (artificial intelligence, machine learning, statistics, to name just a few) and nowadays they represent a powerful tool for tackling many diverse applications. They still represent an exciting area of research and many new types of graphical models have been introduced to accommodate more complex situations arising from more challenging research questions and data available. Even the interpretation of graphical models can be quite different in different contexts. If we think for example of high-dimensional settings, the original notion of conditional independence between random variables encoded by the conditional dependence graph is generally lost and the interest is in finding the most important components of thousands of random variables.

In this talk we will present some of the challenges we face when using graphical models to address research questions coming from interdisciplinary collaborations. We will present two case studies arising from collaborations with researchers in Biology and Neuropsychology and will try to elucidate some of the new frameworks arising. In particular we will show how graphical models can be very powerful both for an explorative statistical analysis and for answering more advanced questions in statistical modelling and prediction.
05/11/2015 4:30 PM

M203

T. W. Waite, University of Manchester

Random designs for robustness to functional model misspecification

Statistical design of experiments allows empirical studies in science and engineering to be conducted more efficiently through careful choice of the settings of the controllable variables under investigation. Much conventional work in optimal design of experiments begins by assuming a particular structural form for the model generating the data, or perhaps a small set of possible parametric models. However, these parametric models will only ever be an approximation to the true relationship between the response and controllable variables, and the impact of this approximation step on the performance of the design is rarely quantified.

We consider response surface problems where it is explicitly acknowledged that a linear model approximation differs from the true mean response by the addition of a discrepancy function. The most realistic approaches to this problem develop optimal designs that are robust to discrepancy functions from an infinite-dimensional class of possible functions. Typically it is assumed that the class of possible discrepancies is defined by a bound on either (i) the maximum absolute value, or (ii) the squared integral, of all possible discrepancy functions.

Under assumption (ii), minimax prediction error criteria fail to select a finite design. This occurs because all finitely supported deterministic designs have the problem that the maximum, over all possible discrepancy functions, of the integrated mean squared error of prediction (IMSEP) is infinite.

We demonstrate a new approach in which finite designs are drawn at random from a highly structured distribution, called a designer, of possible designs. If we also average over the random choice of design, then the maximum IMSEP is finite. We develop a class of designers for which the maximum IMSEP is analytically and computationally tractable. Algorithms for the selection of minimax efficient designers are considered, and the inherent bias-variance trade-off is illustrated.

Joint work with Dave Woods, Southampton Statistical Sciences Research Institute, University of Southampton
22/10/2015 5:30 PM

M203

M. Hamada, Japan Broadcasting Corporation

Mathematical statistics among different categories of information

In this seminar, a novel concept in mathematical statistics is proposed. Ordinarily, some topological factors such as Gromov-Hausdorff distance, dilatation and distortion are defined inside one metric space. The proposed idea puts a probability operator with some topological factors over different metric spaces which can project from these to one common measure space. By assuming a compact Polish space, first the original information is projected to a metric space. Then the projected information from the different spaces is mapped to one common space where some topological factors are applied. The inference and estimation can be calculated.

The merit of the proposed idea is to be able to compare values and some qualities of different fields with those for one metric space. This novel concept can let Information of post Big Data become more natural for people.

In the seminar, the situation of the Great Earthquake in Japan on 11 March 2011 is also introduced.
11/06/2015 5:30 PM

M203

M. Z. Hossain, QMUL

Generalized linear mixed models for completely randomized design based on randomization

I will focus on the derivation of a generalized linear mixed model (GLMM) in the context of completely randomized design (CRD) based on randomization ideas for linear models. The randomization approach to derive linear models is adapted to the link-transformed mean responses including random effects with fixed effects.

Typically, the random effects in a GLMM are uncorrelated and assumed to follow a normal distribution mainly for computational simplicity. However, in our case, due to the randomization the random effects are correlated. We develop the likelihood function and an estimation algorithm where we do not assume that the random effects have a normal distribution.

I will present and compare the simulation results of a simple example with GLM (generalized linear model) and HGLM (hierarchical generalized linear model) which is suitable for normally distributed correlated random effects.
04/06/2015 2:35 PM

M203

H.-Y. Liu, QMUL

Group sequential monitoring of optimal response-adaptive randomised multi-armed clinical trials
28/05/2015 5:30 PM

M203

R. L. Hooper, Blizard Institute

The dog-leg design: giving clinical trials more power to their elbow

In 1948 the MRC streptomycin trial established the principles of the modern clinical trial, and for longer still the idea of a control or comparison group recruited concurrently to the intervention group has been recognised as essential to obtaining sound evidence for clinical effectiveness. But must a clinical trial proceed by running an intervention and comparator in parallel? In this seminar I will focus on trials where participants are randomised in clusters. This is common when evaluating health service interventions that are delivered within an organisational unit such as a school or general practice. I will look in particular at trials where the comparator is routine care: these trials effectively ask how individuals' outcomes would compare before and after introducing the new treatment in a cluster. I will discuss some surprisingly efficient alternatives to parallel group trial designs in this case, made possible by delaying introduction of the intervention in some clusters after randomisation, with these clusters continuing in the meantime to receive routine care.
21/05/2015 5:30 PM

M203

S. S. Villar, University of Cambridge

Bandit models for the design of Bayesian adaptive clinical trials for rare diseases

The multi-armed bandit problem describes a sequential experiment in which the goal is to achieve the largest possible mean reward by choosing from different reward distributions with unknown parameters. This problem has become a paradigmatic framework to describe the dilemma between exploration (learning about distributions' parameters) and exploitation (earning from distributions that look superior based on limited data), which characterises any data-based learning process.
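
The exploration-exploitation trade-off described above can be sketched in a few lines. The following is only an illustration under stated assumptions: rewards are Bernoulli, the success probabilities are made-up values, and the allocation rule is Thompson sampling with uniform Beta(1, 1) priors, used here as a simple stand-in. It is not the Gittins index policy discussed in the talk.

```python
import random


def thompson_bernoulli(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit.

    true_probs holds hypothetical success probabilities; the policy never
    reads them directly, it only sees the simulated rewards.
    Returns (number of pulls per arm, total reward).
    """
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    total_reward = 0
    for _ in range(n_rounds):
        # Exploration: draw one value per arm from its Beta posterior.
        draws = [rng.betavariate(1 + successes[a], 1 + failures[a])
                 for a in range(k)]
        # Exploitation: play the arm whose draw is largest.
        arm = max(range(k), key=lambda a: draws[a])
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward
    pulls = [successes[a] + failures[a] for a in range(k)]
    return pulls, total_reward


# Two hypothetical arms with success probabilities 0.3 and 0.7.
pulls, total = thompson_bernoulli([0.3, 0.7], n_rounds=2000)
```

As data accumulate, the posterior of the inferior arm concentrates below that of the better arm, so most pulls end up on the arm with the higher success probability.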

Over the past 40 years bandit-based solutions, and particularly the concept of index policy introduced by Gittins and Jones, have been fruitfully developed and deployed to address a wide variety of stochastic scheduling problems arising in practice. Across this literature, the use of bandit models to optimally design clinical trials became a typical motivating application, yet little of the resulting theory has ever been used in the actual design and analysis of clinical trials. In this talk I will illustrate, both theoretically and via simulations, the advantages and disadvantages of bandit-based allocation rules for clinical trials. Based on that, I will reflect on the reasons why these ideas have not been used in practice and describe a novel implementation of the Gittins index rule that overcomes these difficulties, trading off a small deviation from optimality for a fully randomized, adaptive group allocation procedure which offers substantial improvements in terms of patient benefit, especially relevant for small populations.

This talk is based on recent joint work with Jack Bowden and James Wason.
14/05/2015 5:30 PM

M203

V. V. Toropov, QMUL

Development of optimisation techniques for aerospace applications

Current aerospace applications exhibit several features that are not yet adequately addressed by the available optimisation tools:
- Large scale (~1000 design variables) optimisation problems with expensive (10+ hours) response function evaluations
- Discrete optimisation with even moderately expensive response functions
- Optimisation with non-deterministic responses
- Multidisciplinary optimisation in an industrial setting.

The presentation discusses recent progress towards addressing these issues identifying general trends and metamodel-based methods for solving large scale optimisation problems.

Issues that have to be addressed to obtain high quality metamodels of computationally expensive responses include establishing appropriate Designs of Experiments (DOE) focusing on the optimum Latin hypercube DOEs and including nested DOEs. Several metamodel types will be reviewed focusing on the ones obtained by the Moving Least Squares method due to its controlled noise-smoothing capability and by the Genetic Programming due to its ability to arrive at explicit functions of design variables. The use of variable fidelity responses for establishing high accuracy metamodels is also considered.
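
For readers unfamiliar with Latin hypercube designs, a minimal sketch of a plain random Latin hypercube on the unit cube follows. It is an assumption-laden illustration only: the function name and sizes are hypothetical, and no space-filling optimisation or nesting is attempted, so this is not the optimum Latin hypercube DOE of the talk.

```python
import random


def latin_hypercube(n_points, n_dims, seed=0):
    """Return n_points design points in [0, 1)^n_dims.

    In every dimension the range is cut into n_points equal strata and
    each stratum receives exactly one point.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_points))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        # Place one point uniformly at random inside each stratum.
        columns.append([(s + rng.random()) / n_points for s in strata])
    # Transpose the columns into a list of points.
    return [tuple(col[i] for col in columns) for i in range(n_points)]


design = latin_hypercube(8, 3)
```

An optimised design would typically search over such hypercubes for one that maximises a space-filling criterion; that step is omitted here.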

Examples of recent aerospace applications include
- Turbomachinery applications
- Optimisation of composite wing panels
- Topology optimisation and parametric optimisation in the preliminary design of a lattice composite fuselage
- Optimisation and stochastic analysis of a landing system for the ESA ExoMars mission
07/05/2015 5:30 PM

M203

J. Q. Shi, Newcastle University

Generalised Gaussian process regression model for non-Gaussian functional data

In this talk I will discuss a generalized Gaussian process concurrent regression model for functional data where the functional response variable has a binomial, Poisson or other non-Gaussian distribution from an exponential family while the covariates are mixed functional and scalar variables. The proposed model offers a nonparametric generalized concurrent regression method for functional data with multi-dimensional covariates, and provides a natural framework for modelling common mean structure and covariance structure simultaneously for repeatedly observed functional data. The mean structure provides overall information about the observations, while the covariance structure can be used to capture the characteristics of each individual batch. The prior specification of the covariance kernel enables us to accommodate a wide class of nonlinear models. The definition of the model, the inference and the implementation as well as its asymptotic properties will be discussed. I will also present several numerical examples with different types of non-Gaussian response variables.
30/04/2015 4:45 PM

Building 58 Room 4121 at the University of Southampton

D. S. Coad, QMUL

Bias calculations for adaptive generalised linear models

A generalised linear model is considered in which the design variables may be functions of previous responses. Interest lies in estimating the parameters of the model. Approximations are derived for the bias and variance of the maximum likelihood estimators of the parameters. The derivations involve differentiating the fundamental identity of sequential analysis. The normal linear regression model, the logistic regression model and the dilution-series model are used to illustrate the approximations.
30/04/2015 3:15 PM

Building 58 Room 4121 at the University of Southampton

S. G. Gilmour, University of Southampton

Future directions for design of experiments

Design and analysis of experiments is sometimes seen as an area of statistics in which there are few new problems. I will argue that modern biological and industrial experiments, often with automatic data collection systems, require advances in the methodology of designed experiments if they are to be applied successfully in practice. The basic philosophy of design will be reexamined in this context. Experiments can now be designed to maximise the information in the data without computational restrictions limiting either the data analysis that can be done or the search for a design. Very large amounts of data may be collected from each experimental unit and various empirical modelling techniques may be used to analyse these data. In order to ensure that the data contain the required information, it is vital that attention be paid to the experimental design, the sampling design and any mechanistic information that can be built into the model. The application of these ideas to some particular processes will be used to illustrate the kinds of method that can be developed.
23/04/2015 5:30 PM

M203

O. Sverdlov, EMD Serono

Bayesian design of proof-of-concept binary outcome trials

In this talk, I will present a Bayesian approach to the problem of comparing two independent binomial proportions and its application to the design and analysis of proof-of-concept clinical trials.

First, I will discuss numerical integration methods to compute exact posterior distribution functions, probability densities, and quantiles of the risk difference, relative risk, and odds ratio. These numerical methods are building blocks for applying exact Bayesian analysis in practice. Exact probability calculations provide improved accuracy compared to normal approximations and are computationally more efficient than simulation-based approaches, especially when these calculations have to be invoked repeatedly as part of another simulation study.
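
As a rough illustration of this kind of calculation, the sketch below computes the posterior probability that the risk difference exceeds a threshold by numerical integration. It assumes independent uniform Beta(1, 1) priors and a simple midpoint rule; the function names, grid size and trial counts are hypothetical, and this is not claimed to be the method of the reference below.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x)
                    + (b - 1) * math.log(1 - x))


def prob_risk_difference_exceeds(delta, x1, n1, x2, n2, grid=2000):
    """Approximate P(p1 - p2 > delta | data) under Beta(1, 1) priors.

    x1 of n1 and x2 of n2 are the observed responders in the two arms.
    """
    a1, b1 = 1 + x1, 1 + n1 - x1
    a2, b2 = 1 + x2, 1 + n2 - x2
    h = 1.0 / grid
    mids = [(i + 0.5) * h for i in range(grid)]
    f1 = [beta_pdf(t, a1, b1) for t in mids]
    f2 = [beta_pdf(t, a2, b2) for t in mids]
    # cdf2[i] approximates P(p2 <= i * h) for i = 0, ..., grid.
    cdf2 = [0.0]
    for v in f2:
        cdf2.append(cdf2[-1] + v * h)
    total = 0.0
    for t, v in zip(mids, f1):
        u = t - delta  # the event requires p2 < u
        if u <= 0.0:
            cdf_u = 0.0
        elif u >= 1.0:
            cdf_u = cdf2[-1]
        else:
            pos = u / h
            j = min(int(pos), grid - 1)
            # Linear interpolation of the tabulated CDF.
            cdf_u = cdf2[j] + (pos - j) * (cdf2[j + 1] - cdf2[j])
        total += v * cdf_u * h
    return total


# Hypothetical data: 15/20 responders on treatment, 5/20 on control.
p_better = prob_risk_difference_exceeds(0.0, 15, 20, 5, 20)
p_relevant = prob_risk_difference_exceeds(0.2, 15, 20, 5, 20)
```

Here `p_better` plays the role of a statistical-evidence criterion and `p_relevant` that of a clinical-relevance criterion with a made-up threshold of 0.2.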

Second, I will show applicability of exact Bayesian calculations in the context of a proof-of-concept clinical trial in ophthalmology. A single-stage design and a two-stage adaptive design based on posterior predictive probability of achieving proof-of-concept based on dual criteria of statistical significance and clinical relevance will be presented. A two-stage design allows early stopping for either futility or efficacy, thereby providing a higher level of cost-efficiency than a single-stage design. A take-home message is that exact Bayesian methods provide an elegant and efficient way to facilitate design and analysis of proof-of-concept studies.

Reference:

Sverdlov O, Ryeznik Y, Wu S. (2015). Exact Bayesian inference comparing binomial proportions, with application to proof-of-concept clinical trials. Therapeutic Innovation and Regulatory Science 49(1), 163-174.
26/03/2015 4:30 PM

M203

L. Zou, William Harvey Research Institute

Comparison of randomization methods for testing the interaction between treatments and stratification factor in logistic

This study was motivated by two ongoing clinical trials run by EMR, to see whether B cell pathotype would cause the response rate to differ by two biological therapies for Rheumatoid Arthritis patients. Both trials used B cell pathotype as a stratification factor in the randomizations, and the effect of interest was the interaction between treatments and B cell pathotype. The B cell pathotype was classified by a synovial biopsy that each patient received before the randomization. The categories were B cell rich, B cell poor and Unknown (if the biopsy result was delayed). The biopsy result of unknown patients would be revealed once it was ready during the trial.

Randomizations studied include complete randomization, covariate-adaptive randomization, hierarchical dynamic randomization, permuted block randomization and Begg-Iglewicz randomization. The comparison was based on simulations using the measures: selection bias, imbalance, power for testing treatment and interaction effects and inefficiency of the randomization. Because the outcome was a binary variable indicating whether a patient was a responder, logistic regression was the natural choice for the post analysis. Treatment and interaction effects as well as the power to detect their significance were estimated using the logistic model with independent variables: treatments, pathotype and their interaction.
19/03/2015 4:30 PM

M203

W. P. Bergsma, LSE

Regression modelling with I-priors

As is well-known, the maximum likelihood method overfits regression models when the dimension of the model is large relative to the sample size. To address this problem, a number of approaches have been used, such as dimension reduction (as in, e.g., multiple regression selection methods or the lasso method), subjective priors (which we interpret broadly to include random effects models or Gaussian process regression), or regularization. In addition to the model assumptions, these three approaches introduce, by their nature, further assumptions for the purpose of estimating the model.

The first main contribution of this talk is an alternative method which, like maximum likelihood, requires no assumptions other than those pertaining to the model of interest. Our proposal is based on a new information theoretic Gaussian proper prior for the regression function based on the Fisher information. We call it the I-prior, the 'I' referring to information. The method is no more difficult to implement than random effects models or Gaussian process regression models.

Our second main contribution is a modelling methodology made possible by the I-prior, which is applicable to classification, multilevel modelling, functional data analysis and longitudinal data analysis. For a number of data sets that have previously been analyzed in the literature, we show our methodology performs competitively with existing methods.
05/03/2015 4:30 PM

M203

J. L. Hutton, University of Warwick

Chain event graphs for informative missingness

Chain event graphs (CEGs) extend graphical models to address situations in which, after one variable takes a particular value, possible values of future variables differ from those following alternative values. These graphs are a useful framework for modelling discrete processes which exhibit strong asymmetric dependence structures, and are derived from probability trees by merging the vertices in the trees together whose associated conditional probabilities are the same.

We exploit this framework to develop new classes of models where missingness is influential and data are unlikely to be missing at random. Context-specific symmetries are captured by the CEG. As models can be scored efficiently and in closed form, standard Bayesian selection methods can be used to search over a range of models. The selected maximum a posteriori model can be easily read back to the client in a graphically transparent way.

The efficacy of our methods is illustrated using a longitudinal study from birth to age 25 of children in New Zealand, analysing their hospital admissions aged 18-25 years with respect to family functioning, education, and substance abuse aged 16-18 years. Of the initial 1265 people, 25% had missing data at age 16, and 20% had missing data on hospital admissions aged 18-25 years. More outcome data were missing for poorer scores on social factors. For example, 21% of outcome data were missing for mothers with no formal education, compared to 13% for mothers with tertiary qualifications.

This is joint work with Lorna Barclay and Jim Smith.
26/02/2015 4:30 PM

M203

I. Kosmidis, UCL

Model-based clustering using copulas with applications

The majority of model-based clustering techniques is based on multivariate Normal models and their variants. This talk introduces and studies the framework of copula-based finite mixture models for clustering applications. In particular, the use of copulas in model-based clustering offers two direct advantages over current methods:

i) the appropriate choice of copulas provides the ability to obtain a range of exotic shapes for the clusters, and
ii) the explicit choice of marginal distributions for the clusters allows the modelling of multivariate data of various modes (discrete, continuous, both discrete and continuous) in a natural way.

Estimation in the general case can be performed using standard EM, and, depending on the mode of the data, more efficient procedures can be used that can fully exploit the copula structure. The closure properties of the mixture models under marginalisation will be discussed, and for continuous, real-valued data parametric rotations in the sample space will be introduced, with a parallel discussion on parameter identifiability depending on the choice of copulas for the components. The exposition of the methodology will be accompanied by the analysis of real and artificial data.

This is joint work with Dimitris Karlis at the Athens University of Economics and Business.

Related preprint: http://arxiv.org/abs/1404.4077
19/02/2015 4:30 PM

M203

A. Koloydenko, RHUL

Positive definite matrices, Procrustes analysis, and other non-Euclidean approaches to statistical analysis of diffusion

Symmetric positive semi-definite (SPD) matrices have recently seen several new applications, including Diffusion Tensor Imaging (DTI) in MRI, covariance descriptors and structure tensors in computer vision, and kernels in machine learning.

Depending on the application, various geometries have been explored for statistical analysis of SPD-valued data. We will focus on DTI, where the naive Euclidean approach was generally criticised for its “swelling” effect in interpolation, and violations of positive definiteness in extrapolation and estimation. The affine invariant and log-Euclidean Riemannian metrics were subsequently proposed to remedy the above deficiencies. However, practitioners have recently argued that these geometric approaches are overkill in some relevant noise models.

We will examine a couple of related alternative approaches that in a sense reside in between the two aforementioned extremes. These alternatives are based on the square root Euclidean and Procrustes size-and-shape metrics. Unlike the Riemannian approach, our approaches, we think, operate more naturally with respect to the boundary of the cone of SPD matrices. In particular, we prove that the Procrustes metric, when used to compute weighted Frechet averages, preserves ranks. We also establish and prove a key relationship between these two metrics, as well as inequalities ranking traces (mean diffusivity) and determinants of the interpolants based on the Riemannian, Euclidean, and our alternative metrics. Remarkably, traces and determinants of our alternative interpolants compare differently. A general proof of the determinant inequality was just developed and may also be of value to the more general matrix analysis community.

Several experimental illustrations will be shown based on synthetic and real human brain DT MRI data.

No special background in statistical analysis on non-Euclidean manifolds is assumed.

This is joint work with Prof Ian Dryden (University of Nottingham) and Dr Diwei Zhou (Loughborough University), with a more recent contribution by Dr Koenraad Audenaert (RHUL).
05/02/2015 4:30 PM

M203

B. L. Sturm, QMUL

Out of the barn and into the yard, and other colourful results from my recent paroxysm about the practice of evaluation

I call attention to what I call the “crisis of evaluation” in music information retrieval (MIR) research. Among other things, MIR seeks to address the variety of needs for music information of listeners, music recording archives, and music companies. A large portion of MIR research has thus been devoted to the automated description of music in terms of genre, mood, and other meaningful terms. However, my recent work reveals four things: 1) many published results unknowingly use datasets with faults that render them meaningless; 2) state-of-the-art (“high classification accuracy”) systems are fooled by irrelevant factors; 3) most published results are based upon an invalid evaluation design; and 4) a lot of work has unknowingly built, tuned, tested, compared and advertised “horses” instead of solutions. (The true story of the horse Clever Hans provides the most appropriate illustration.) I argue why these problems have occurred, and how we can address them by adopting the formal design and evaluation of experiments, and other best practices.

Relevant publications:
[1] B. L. Sturm, “Classification accuracy is not enough: On the evaluation of music genre recognition systems,” J. Intell. Info. Systems, vol. 41, no. 3, pp. 371–406, 2013.
http://link.springer.com/article/10.1007%2Fs10844-013-0250-y

[2] B. L. Sturm, “A simple method to determine if a music information retrieval system is a “horse”,” IEEE Trans. Multimedia, vol. 16, no. 6, pp. 1636–1644, 2014.
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6847693

[3] B. L. Sturm, “The state of the art ten years after a state of the art: Future research in music information retrieval,” J. New Music Research, vol. 43, no. 2, pp. 147–172, 2014.
http://www.tandfonline.com/doi/abs/10.1080/09298215.2014.894533#.VMDT0KZ...
22/01/2015 4:30 PM

M203

O. Volkov, QMUL

Optimal relaxed designs of experiments

A relaxed design is a continuous design whose replications can be any nonnegative real number. The talk introduces the method of relaxed designs and identifies its applications to sample size determination, cost-efficient design, constrained design and multi-stage Bayesian design. The main focus is on applications that could be intractable with standard optimal design.
15/01/2015 4:30 PM

M203

S. Lunagomez, Harvard University

Valid inference from non-ignorable network sampling designs

Consider a population where subjects are susceptible to a disease (e.g. AIDS). The objective is to perform inferences on a population quantity (like the prevalence of HIV in a high-risk subpopulation, e.g. intravenous drug abusers) via sampling mechanisms based on a social network (link-tracing designs, RDS). We develop a general framework for making Bayesian inference on the population quantity that: models the uncertainty in the underlying social network using a random graph model, incorporates dependence among the individual responses according to the social network via a Markov Random Field, models the uncertainty regarding the sampling on the social network, and deals with the non-ignorability of the sampling design. The proposed framework is general in the sense that it allows a wide range of different specifications for the components of the model we just mentioned. Samples from the posterior distribution are obtained via Bayesian model averaging. Our model is compared with standard methods in simulation studies and it is applied to real data.
11/12/2014 4:30 PM

M203

M. Mauch, QMUL

Making sense and science out of musical data

I will give an overview of my work in music informatics research (MIR) with some applications to singing research and tracking the evolution of music. I first will give a very high-level overview of my work, starting with my Dynamic Bayesian Network approach to chord recognition, a system for lyrics-to-audio alignment (SongPrompter), and some other shiny applications of Music Informatics (Songle.jp, Last.fm Driver's Seat). Secondly, I will talk about some scientific applications of music informatics, including the study of singing intonation and intonation drift as well as the evolution of music both in the lab and in the real charts.
04/12/2014 4:30 PM

M203

B. Calderhead, Imperial College London

A general construction for parallelising Metropolis-Hastings algorithms

Markov chain Monte Carlo methods are essential tools for solving many modern day statistical and computational problems; however, a major limitation is the inherently sequential nature of these algorithms. In this talk I'll present some work I recently published in PNAS on a natural generalisation of the Metropolis-Hastings algorithm that allows for parallelising a single chain using existing MCMC methods. We can do so by proposing multiple points in parallel, then constructing and sampling from a finite state Markov chain on the proposed points such that the overall procedure has the correct target density as its stationary distribution. The approach is generally applicable and straightforward to implement. I'll demonstrate how this construction may be used to greatly increase the computational speed and statistical efficiency of a variety of existing MCMC methods, including Metropolis-Adjusted Langevin Algorithms and Adaptive MCMC. Furthermore, I'll discuss how it allows for a principled way of utilising every integration step within Hamiltonian Monte Carlo methods; our approach increases robustness to the choice of algorithmic parameters and results in increased accuracy of Monte Carlo estimates with little extra computational cost.
27/11/2014 4:30 PM

M203

T. Jaki, Lancaster University

Treatment selection in multi-arm, multi-stage clinical studies

Adaptive designs that are based on group-sequential approaches have the benefit of being efficient as stopping boundaries can be found that lead to good operating characteristics with test decisions based solely on sufficient statistics. The drawback of these so-called “pre-planned adaptive” designs is that unexpected design changes are not possible without impacting the error rates. “Flexible adaptive designs”, and in particular designs based on p-value combination, on the other hand can cope with a large number of contingencies at the cost of reduced efficiency.

In this presentation we focus on so-called multi-arm multi-stage trials which compare several active treatments against control at a series of interim analyses. We will focus on the methods by Stallard and Todd [1] and Magirr et al. [2], two different approaches which are based on group-sequential ideas, and discuss how these “pre-planned adaptive designs” can be modified to allow for flexibility. We then show how the added flexibility can be used for treatment selection and evaluate the impact on power in a simulation study. The results show that a combination of a well-chosen pre-planned design and an application of the conditional error principle to allow flexible treatment selection results in an impressive overall procedure.
______________________
[1] Stallard, N, & Todd, S. 2003. Sequential designs for phase III clinical trials incorporating treatment selection. Statistics in Medicine, 22, 689-703.
[2] Magirr, D, Jaki, T, & Whitehead, J. 2012. A generalised Dunnett test for multi-arm, multi-stage clinical studies with treatment selection. Biometrika, 99, 494-501.
20/11/2014 4:30 PM

M203

G. A. Young, Imperial College London

Inference in the presence of nuisance parameters

Two routes most commonly proposed for accurate inference on a scalar interest parameter in the presence of a (possibly high-dimensional) nuisance parameter are parametric simulation ('bootstrap') methods, and analytic procedures based on normal approximation to adjusted forms of the signed root likelihood ratio statistic. Both methods yield, under some null hypothesis of interest, p-values which are uniformly distributed to error of third-order in the available sample size. But, given a specific inference problem, what is the formal relationship between p-values calculated by the two approaches? We elucidate the extent to which the two methodologies actually just give the same inference.
06/11/2014 3:30 PM

M103

D. Woods, University of Southampton

Bayesian design of experiments and Gaussian process models

The design of many experiments can be considered as implicitly Bayesian, with prior knowledge being used informally to aid decisions such as which factors to vary and the choice of plausible causal relationships between the factors and measured responses. Bayesian methods allow uncertainty in such decisions to be incorporated into design selection through prior distributions that encapsulate information available from scientific knowledge or previous experimentation. Further, a design may be explicitly tailored to the aim of the experiment through a decision-theoretic approach with an appropriate loss function.

We will present novel methodology for two problems in this area, related through the application of Gaussian process (GP) regression models. Firstly, we consider Bayesian design for prediction from a GP model, as might be used for the collection of spatial data or for a computer experiment to interrogate a numerical model. Secondly, we address Bayesian design for parametric regression models, and demonstrate the application of GP emulators to mitigate the computational issues that have traditionally been a barrier to the application of these designs.
06/11/2014 3:00 PM

M103

W. Just, QMUL

Detecting phase synchronisation in time series data sets

Synchronisation phenomena in their various disguises are among the most prominent features in coupled dynamical structures. Within this talk we first introduce how the vague notion of a phase can be given a more precise meaning using what has been coined as analytic signal processing. This approach then allows us to distinguish different types of synchronisation phenomena, and in particular to detect synchronisation of the phase of signals where amplitudes remain uncorrelated. These ideas are finally applied to data sets to explore whether phase synchronisation plays a role in the interpretation of physiological movement data.
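
The analytic signal idea can be sketched as follows. This is a minimal illustration, assuming a short, evenly sampled series and using the phase-locking value as one common synchronisation index; the talk does not state which index was used, and the function names and test signals are hypothetical. A direct O(n^2) Fourier transform keeps the sketch free of external libraries.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real series via a direct discrete Fourier transform."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    # Keep the zero frequency (and Nyquist when n is even), double the
    # positive frequencies and discard the negative ones.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        positive = range(1, n // 2)
    else:
        positive = range(1, (n + 1) // 2)
    for k in positive:
        weights[k] = 2.0
    return [sum(weights[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]


def phase_locking_value(x, y):
    """Modulus of the mean unit phasor of the phase difference (0 to 1)."""
    zx, zy = analytic_signal(x), analytic_signal(y)
    diffs = [cmath.phase(a) - cmath.phase(b) for a, b in zip(zx, zy)]
    return abs(sum(cmath.exp(1j * d) for d in diffs)) / len(diffs)


n = 128
base = [math.cos(2 * math.pi * 5 * t / n) for t in range(n)]
# Same frequency, different amplitude and a constant phase lag.
locked = [3.0 * math.cos(2 * math.pi * 5 * t / n + 0.8) for t in range(n)]
# Different frequency, so the phase difference drifts steadily.
drifting = [math.cos(2 * math.pi * 8 * t / n) for t in range(n)]

plv_locked = phase_locking_value(base, locked)
plv_drifting = phase_locking_value(base, drifting)
```

Because the index depends only on the phases, the locked pair scores close to 1 despite the different amplitudes, while the drifting pair scores close to 0.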

30/10/2014 4:30 PM

M203

I. Andrianakis, London School of Hygiene and Tropical Medicine

Calibration of an individual based HIV computer model using emulation and history matching

Advances in scientific computing have allowed the development of complex models that are being routinely applied to problems in physics, engineering, biology and other disciplines. The utility of these models depends on how well they are calibrated to empirical data. Their calibration is hindered, however, both by large numbers of input and output parameters and by run times that increase with the model's complexity. In this talk we present a calibration method called History Matching, which is iterative and scales well with the dimensionality of the problem. History matching is based on the concept of an emulator, which is a Bayesian representation of our beliefs about the model, given the runs that are available to us. Capitalising on the efficiency of the emulator, History Matching iteratively discards regions of the input space that are unlikely to provide a good match to the empirical data, and is based on successive runs of the computer model in narrowing areas of the input space, which are known as waves. This calibration technique can be embedded in a comprehensive error modelling framework that takes into account various sources of uncertainty, due to the parameters, the model itself, the observations etc. A calibration example of a high dimensional HIV model will be used to illustrate the method.
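
One wave of this procedure can be sketched as below. The sketch assumes a single scalar output, a commonly used implausibility measure that standardises the distance between the observation and the emulator mean by the combined emulator, observation and model discrepancy variances, and a conventional cutoff of 3. The toy emulator and all numbers are hypothetical and are not taken from the HIV model of the talk.

```python
import math


def implausibility(z, em_mean, em_var, obs_var, disc_var):
    """Standardised distance between observation z and the emulator mean."""
    return abs(z - em_mean) / math.sqrt(em_var + obs_var + disc_var)


def one_wave(candidates, emulator, z, obs_var, disc_var, cutoff=3.0):
    """Keep the inputs that the emulator cannot yet rule out."""
    kept = []
    for x in candidates:
        mean, var = emulator(x)
        if implausibility(z, mean, var, obs_var, disc_var) < cutoff:
            kept.append(x)
    return kept


def toy_emulator(x):
    """Hypothetical emulator: returns (mean, variance) of the model output."""
    return x * x, 0.01


candidates = [i / 10 for i in range(31)]  # inputs 0.0, 0.1, ..., 3.0
not_ruled_out = one_wave(candidates, toy_emulator, z=4.0,
                         obs_var=0.02, disc_var=0.01)
```

In a full analysis the retained inputs would seed new runs of the computer model, the emulator would be refitted on the reduced region, and the wave would be repeated.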
23/10/2014 5:30 PM

M203

P. R. Curtis, QMUL

Emulation with smooth supersaturated models: solvability, stability, sensitivity and design

Smooth supersaturated models are a class of emulators with a supersaturated polynomial basis, that is, there are more model terms than design points. In this talk I will give some key results regarding the structure and solvability of these models as well as some insights regarding the numeric stability of fitting these large models. Sensitivity analysis using Sobol indices is often used to reduce the parameter space of expensive computer experiments and a simple formula is given for computing these indices for a smooth supersaturated model. Finally, I present the results of some simulation studies exploring ways to use the emulated response surface to generate new design points.
29/05/2014 5:30 PM

M103

Kabir Soeny, School of Mathematical Sciences, QMUL

Optimization of Dose Regimens under Pharmacokinetic and Pharmacodynamic Constraints

Following the correct selection of a therapy based on the indication, an optimal dose regimen is the most important determinant of therapeutic success of a medical therapy. After giving an introduction to the Efficient Dosing (ED) algorithm developed by us to compute dose regimens which ensure that the blood concentration of the drug in the body is kept close to the target level, I will show how the algorithm can be applied to the Pharmacodynamic models for infectious diseases. The optimized dose regimens satisfy three conditions: (1) minimize the concentration of the anti-infective drug lying outside the therapeutic window (if any), (2) ensure a target reduction in viral load, and (3) minimize drug exposure once the goal of viral load reduction has been achieved. The algorithm can also be used to compute the number of doses required for treatment.
15/05/2014 5:30 PM

M103

John Paul Gosling, School of Mathematics, University of Leeds

Subjective judgements in skin sensitisation hazard assessments

One key quantity of interest in skin sensitisation hazard assessment is the mean threshold for skin sensitisation for some defined population (called the sensitising potency). Before considering the sensitising potency of the chemical, hazard assessors consider whether the chemical has the potential to be a skin sensitiser in humans. Bayesian belief network approaches to this part of the assessment, which handles the disparate lines of evidence within a probabilistic framework, have been applied successfully. The greater challenge comes in the quantification of uncertainty about the sensitising potency.

To make inferences about sensitising potency, we used a Bayes linear framework to model hazard assessors' expectations and uncertainties and to update those beliefs in the light of some competing data sources. In producing a tool for synthesising multiple lines of evidence and estimating hazard, we developed a transparent mechanism to help defend and communicate risk management decisions. In this talk, I will attempt to describe the principles of this Bayesian modelling and formal processes for capturing expert knowledge. And, hopefully, I will be able to highlight their applicability where fast decisions are needed and data are sparse.
08/05/2014 5:30 PM

M103

Yoshifumi Ukita, Yokohama College of Commerce and Wolfson College, Cambridge (visitor)

Models based on orthonormal systems for experimental design

In this talk, models based on orthonormal systems for experimental design are presented. In such models, it is possible to use fast Fourier Transforms (FFT) to calculate the parameters, which are independent, and which are complex numbers expressed as Fourier coefficients.

Theorems for the relation between the Fourier coefficients and the effect of each factor are also given. Using these theorems, the effect of each factor can be easily obtained from the computed Fourier coefficients. The paper finally shows that the analysis of variance can be used on the proposed models without the need to calculate the degrees of freedom.
27/03/2014 4:30 PM

M103

Leon Danon, School of Mathematical Sciences, QMUL

Collective behaviour in social systems

Human social systems show unexpected patterns when studied from a collective point of view. In this talk I will present a few examples of collective behaviour in social systems: human movement patterns, social encounter networks and music collaboration networks, all of which are data driven. I'll try to make the talk short and aim to start a discussion.
13/03/2014 4:00 PM

M103

Altea Lorenzo-Arribas, Biomathematics & Statistics Scotland (BioSS), The James Hutton Institute, Aberdeen

Cumulative link mixed models and the partial proportional odds assumption

This talk will focus on the challenges faced in mixed modelling with ordinal response variables. Topics covered will include: the advantages of an ordinal approach over the more generic continuous approach that is widely used in practice; the implications of the proportional odds assumption and more flexible approaches such as the partial proportional odds assumption; and the implementation of mixed models in this context. Both simulations and applications to real data regarding perceptions of environmental matters will be shown.
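To make the proportional odds assumption concrete, the following sketch computes category probabilities under a cumulative logit model, P(Y <= j) = logistic(theta_j - eta), where the linear predictor eta is shared by every threshold. The thresholds and the slope are hypothetical values chosen for illustration; a mixed model would add a random effect to eta, and a partial proportional odds model would let part of eta vary with the threshold.

```python
import math

def logistic(t):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-t))

def category_probs(thresholds, eta):
    """Category probabilities under a cumulative logit model.

    P(Y <= j) = logistic(thresholds[j] - eta), with increasing thresholds.
    Under proportional odds, eta is the same for every threshold.
    """
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs

thresholds = [-1.0, 0.5, 2.0]   # hypothetical cut-points for a 4-level response
beta = 0.8                      # hypothetical slope for a single covariate
probs_x0 = category_probs(thresholds, beta * 0.0)
probs_x1 = category_probs(thresholds, beta * 1.0)
```

Whatever threshold is used to dichotomise the response, the cumulative log-odds at x = 0 and x = 1 differ by the same amount, beta; that common shift is the proportional odds assumption.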
06/03/2014 4:30 PM

M103

Stella Hadjantoni, School of Economics and Finance, QMUL

Methods for the re-estimation of large-scale linear models after adding and deleting observations

It is often computationally infeasible to re-estimate afresh a large-scale model when a small number of observations is sequentially modified. Furthermore, in some cases a dataset is too large to fit in a computer's memory, and in such cases out-of-core algorithms need to be developed. Similarly, data might not be available all at once, and recursive estimation strategies need to be applied. Within this context the aim is to design computationally efficient and numerically stable algorithms. Initially, the re-estimation of the generalized least squares (GLS) solution after observations are deleted, known as downdating, is examined. The new method to estimate the downdated general linear model (GLM) updates the original GLM with the imaginary deleted observations. This results in a non-positive definite dispersion matrix which comprises complex covariance values. This updated-GLM with imaginary values has been proven to derive the same GLS estimator as that obtained by solving afresh the original GLM after downdating.
The estimation of the downdated-GLM is formulated as a generalized linear least squares problem (GLLSP). The solution of the GLLSP derives the GLS estimator even when the dispersion matrix is singular. The main computational tool is the generalized QR decomposition, which is employed based on hyperbolic Householder transformations; however, no complex arithmetic is used in practice. The special case of computing the GLS estimator of the downdated-SUR (seemingly unrelated regressions) model is considered. The method is extended to the problem of concurrently adding and deleting observations from the model. The special structure of the matrices and properties of the SUR model are efficiently exploited in order to reduce the computational burden of the estimation algorithm. The proposed algorithms are applied to synthetic and real data. Their performance when compared with algorithms that estimate the same model afresh confirms their computational efficiency.
27/02/2014 4:30 PM

M103

Vassilios Stathopoulos, Centre for computational statistics and machine learning, UCL

Bat call identification with Gaussian process multinomial probit regression and a dynamic time warping kernel

We study the problem of identifying bat species from echolocation calls in order to build automated bioacoustic monitoring algorithms. We employ the Dynamic Time Warping algorithm, which has been successfully applied to the identification of bird flight calls, and show that classification performance is superior to that of the hand-crafted call shape parameters used in previous research. This highlights that generic bioacoustic software with good classification rates can be constructed with little domain knowledge. We conduct a study with field data of 21 bat species from north and central Mexico using a multinomial probit regression model with a Gaussian process prior and a full EP approximation of the posterior of latent function values. Results indicate high classification accuracy across almost all classes, while the misclassification rate across families of species is low, highlighting the common evolutionary path of echolocation in bats.
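For readers unfamiliar with Dynamic Time Warping, a minimal sketch of the classic dynamic programming recursion is given below. It aligns two numeric sequences of possibly different lengths and returns the cost of the cheapest alignment. The scalar sequences and the absolute-difference local cost are assumptions for illustration; how the talk builds a kernel from such a distance is not stated in the abstract.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the best of: step in a only, step in b only, step in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Two toy "calls" with the same shape but different durations align at zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Because the warping absorbs differences in timing, two calls with the same shape but different durations are treated as close, which is the property that makes the method attractive for call classification.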
13/02/2014 4:30 PM

M103

David Siegmund, Department of Statistics, Stanford University

Detection of Genomic Signals by Resequencing

Several problems of genomic analysis involve detection of local genomic signals. When the data are generated by sequence-based methods, the variability of read depth at different positions on the genome suggests point process models involving non-homogeneous Poisson processes, or perhaps negative binomial processes if there is excess variability. We discuss a number of examples, and consider in detail a model for detection of insertions and deletions (indels) based on paired end reads.

This is joint research with Nancy Zhang and Benjamin Yakir.
06/02/2014 4:30 PM

M103

Peter Congdon, The School of Geography, QMUL

Measuring spatial clustering in disease patterns

The talk considers a cluster detection methodology which describes the cluster status of each area, and provides alternative/complementary perspectives to spatial scan cluster detection. The focus is on spatial health risk patterns (area disease prevalence, area mortality, etc) when area relative risks are unknown parameters. The method provides additional insights with regard to cluster centre areas vs. cluster edge areas. The method also considers both low risk clustering and high risk clustering in an integrated perspective, and measures high/low risk outlier status. The application of the method is considered with simulated data (and known spatial clustering), and with real examples, both univariate and bivariate.
30/01/2014 4:30 PM

M103

Javier Rubio, Department of Statistics, The University of Warwick

Modelling of skewness and kurtosis with double two-piece distributions

In this talk, I will present a brief summary of several classes of univariate flexible distributions employed to model skewness and kurtosis. We will discuss a simple classification of these distributions in terms of their tail behaviour. This classification motivates the introduction of a new family of distributions (double two-piece distributions), which is obtained by using a transformation defined on the family of unimodal symmetric continuous distributions containing a shape parameter. The proposed distributions contain five interpretable parameters that control the mode, as well as the scale and shape in each direction. Four-parameter subfamilies of this class of transformations are also discussed. An interpretable scale and location invariant benchmark prior is also presented, together with conditions for the existence of the corresponding posterior distribution. Finally, the use of this sort of model is illustrated with a real data example.
17/12/2013 4:30 PM

M203

Alexandra Piryatinska, Department of Mathematics, San Francisco State University

Detection of changes in the generating mechanism of time series via the epsilon-complexity of continuous functions

A novel methodology for the detection of abrupt changes in the generating mechanisms (stochastic, deterministic or mixed) of a time series, without any prior knowledge about them, will be presented. This methodology has two components: the first is a novel concept of the epsilon-complexity, and the second is a method for change point detection. In the talk, we will give the definition of the epsilon-complexity of a continuous function defined on a compact segment. We will show that for the Hölder class of functions there exists an effective characterization of the epsilon-complexity. The results of simulations and applications to electroencephalogram data and financial time series will be presented. (The talk is based on joint work with Boris Darkhovsky at the Russian Academy of Sciences.)
05/12/2013 4:30 PM

M203

Peter Challenor, Exeter University

Climate, Models and Uncertainty

tba
21/11/2013 4:30 PM

M203

Erica Thompson, Centre for Climate Change Economics and Policy, LSE

Statistical challenges in climate change research

Climate change research methods, particularly those aspects involving projection of future climatic conditions, depend heavily upon statistical techniques but are still at an early stage of development. I will discuss what I see as the key statistical challenges for climate research, including the problems of too little and too much data, the principles of inference from model output, and the relationship of statistics with dynamics (physics). With reference to some specific examples from the latest IPCC report and beyond, I will show that there is a need for statisticians to become more involved with climate research and to do so in a manner that clarifies, rather than obscures, the role and influence of physical constraints, of necessary simplifying assumptions, and of subjective expert judgement.
07/11/2013 3:45 PM

Seminar to be held in Southampton University

Steven Gilmour (Southampton) and Luzia Trinca (UEP)

Multi-stratum Designs for Statistical Inference

It is increasingly realised that many industrial experiments involve some factors whose levels are harder to reset than others, leading to multi-stratum structures. Designs are usually chosen to optimise the point estimation of fixed effects parameters, such as polynomial terms in a response surface model, using criteria such as D- or A-optimality. Gilmour and Trinca (2012) introduced the DP- and AP-optimality criteria, which optimise interval estimation, or equivalently hypothesis testing, by ensuring that unbiased (pure) error estimates can be obtained. We now extend these ideas to multi-stratum structures, by adapting the stratum-by-stratum algorithm of Trinca and Gilmour (2014) to ensure optimal interval estimation in the lowest stratum. It turns out that, in most practical situations, this also ensures that adequate pure error estimates are available in the higher strata. Several examples show that good practical designs can be obtained, even with fairly small run sizes.
07/11/2013 2:15 PM

Seminar to be held in Southampton University

Hugo Maruri-Aguilar, School of Mathematical Sciences, QMUL

Optimal design for smooth supersaturated models (SSM)

Smooth supersaturated models (SSM) are interpolation models in which the underlying model size, and typically the degree, is higher than would normally be used in statistics, but where the extra degrees of freedom are used to make the model smooth.

I will describe the methodology, discuss briefly the role of orthogonal polynomials and then address two design problems. The first is selection of knots and the second a more traditional design problem using SSM to obtain the kernels of interest for D-optimality.

This is joint work with Ron Bates (Rolls-Royce), Peter Curtis (QMUL) and Henry Wynn (LSE).
31/10/2013 4:30 PM

M203

Haeran Cho, Department of Mathematics, University of Bristol

Modelling and forecasting daily electricity loads via curve linear regression

We study the problem of modelling and the short-term forecasting of electricity loads. Regarding the electricity load on each day as a curve, we propose to model the dependence between successive daily curves via curve linear regression. The key ingredient in curve linear regression modelling is the dimension reduction based on a singular value decomposition in a Hilbert space, which reduces the curve regression problem to several ordinary (i.e. scalar) linear regression problems. We illustrate the method by performing one-day ahead forecasting of the electricity loads consumed by the customers of EDF between 2011 and mid-2012, where we also compare our method with other available models.

This is joint work with Yannig Goude, Xavier Brossat and Qiwei Yao.
17/10/2013 4:30 PM

M203

Maria Vazquez, Department of Public Health, University of Oxford

Control charts applied to the management of bipolar disorder patients

Control charts are well known tools in industrial statistical process control. They are used to distinguish between random error and systematic variability. The use of these tools in medicine has only started in recent years.

In this Seminar we present a project in which we explore the ability of Shewhart's control rules to predict severe manic and depressive episodes in bipolar disorder patients. In our study, we consider three types of control charts and a variety of scenarios using real data.
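As a hedged illustration of the simplest Shewhart rule, the sketch below estimates a centre line and three-sigma limits from a baseline period and flags later observations that fall outside them. The scores are invented numbers, and the sample standard deviation is used as a simplification (individuals charts in practice often estimate sigma from moving ranges); neither the data nor the specific charts come from the project described above.

```python
import statistics

def shewhart_limits(baseline, k=3.0):
    """Centre line and k-sigma control limits estimated from in-control data."""
    centre = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return centre - k * sigma, centre, centre + k * sigma

def out_of_control(values, lower, upper):
    """Indices of observations falling outside the control limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical weekly mood scores: a stable baseline, then new observations.
baseline = [10, 12, 11, 9, 10, 11, 10, 9, 12, 11]
new_scores = [10, 11, 16, 9, 3]
lower, centre, upper = shewhart_limits(baseline)
signals = out_of_control(new_scores, lower, upper)
```

Here the third and fifth new observations (indices 2 and 4) fall outside the limits and would be flagged for review.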
10/10/2013 5:30 PM

M203

Roberto Fontana, Dipartimento di Scienze Matematiche Politecnico di Torino

Saturated designs: some applications

In the first part of the talk we study saturated fractions of factorial designs from the perspective of Algebraic Statistics. Exploiting the identification of a fraction with a binary contingency table, we define a criterion to check whether or not a fraction is saturated with respect to a given model. The proposed criterion is based on combinatorial algebraic objects, namely the circuit basis of the toric ideal associated with the design matrix of the model. This is joint work with Fabio Rapallo (Università del Piemonte Orientale, Italy) and Maria Piera Rogantin (Italy).

In the second part of the talk we study optimal saturated designs, mainly D-optimal designs. Efficient algorithms for searching for optimal saturated designs are widely available. They maximize a given efficiency measure (such as D-optimality) and provide an optimum design. Nevertheless, they do not guarantee a globally optimal design. Indeed, they start from an initial random design and find a locally optimal design. If the initial design is changed, the optimum found will, in general, be different. A natural question arises: should we stop at the design found, or should we run the algorithm again in search of a better design? This paper uses very recent methods and software for discovery probability to support the decision to continue or stop the sampling. A software tool written in SAS has been developed.
06/06/2013 5:30 PM

M203

Peter Kimani, Warwick Medical School

Conditionally unbiased estimation in adaptive seamless designs

In order to accelerate drug development, adaptive seamless designs (ASDs) have been proposed. In this talk, I will consider two-stage ASDs, where in stage 1, data are collected to perform treatment selection or sub-population selection. In stage 2, additional data are collected to perform confirmatory analysis for the selected treatments or sub-populations. Unlike the traditional testing procedures, for ASDs, stage 1 data are also used in the confirmatory analysis. Although ASDs are efficient, using stage 1 data both for selection and confirmatory analysis poses statistical challenges in making inference.

I will focus on point estimation at the end of trials that use ASDs. Estimation is challenging because multiple hypotheses are considered at stage 1, and the experimental treatment (or the sub-population) that appears to be the most effective is selected, which may lead to bias. Estimators derived need to account for this fact. In this talk, I will describe estimators we have developed.
23/05/2013 5:30 PM

M203

Angela Noufaily, Department of Mathematics And Statistics, The Open University

An improved algorithm for outbreak detection in multiple surveillance systems

In England and Wales, a large-scale multiple statistical surveillance system for infectious disease outbreaks has been in operation for nearly two decades. This system uses a robust quasi-Poisson regression algorithm to identify aberrances in weekly counts of isolates reported to the Health Protection Agency. In this paper, we review the performance of the system with a view to reducing the number of false reports, while retaining good power to detect genuine outbreaks. We undertook extensive simulations to evaluate the existing system in a range of contrasting scenarios. We suggest several improvements relating to the treatment of trends, seasonality, re-weighting of baselines and error structure. We validate these results by running the existing and proposed new systems in parallel on real data. We find that the new system greatly reduces the number of alarms while maintaining good overall performance and in some instances increasing the sensitivity.
02/05/2013 5:30 PM

M203

Rosemary Bailey, QMUL/University of St Andrews

Circular designs balanced for neighbours at distances one and two

We consider experiments where the experimental units are arranged in a circle or in a single line in space or time. If neighbouring treatments may affect the response on an experimental unit, then we need a model which includes the effects of direct treatments, left neighbours and right neighbours. It is desirable that each ordered pair of treatments occurs just once as neighbours and just once with a single unit in between. A circular design with this property is equivalent to a special type of quasigroup.

In one variant of this, self-neighbours are forbidden. In a further variant, it is assumed that the left-neighbour effect is the same as the right-neighbour effect, so all that is needed is that each unordered pair of treatments occurs just once as neighbours and just once with a single unit in between.

I shall report progress on finding methods of constructing the three types of design.
02/05/2013 4:30 PM

M203

Marion Chatfield / Simon Bate, University of Southampton / GlaxoSmithKline

Using the experimental design and its randomisation to construct a mixed model

In many areas of scientific research complex experimental designs are now routinely used. With the advent of mixed model algorithms, implemented in many statistical software packages, the analysis of data generated from such experiments has become more accessible. However, failing to correctly identify the experimental design used can lead to incorrect model selection and misleading inferences. A procedure is described that identifies the structure of the experimental design and, given the randomisation, generates a maximal mixed model. This model is determined before the experiment is conducted and provides a starting point for the final statistical analysis. The whole process can be illustrated using a generalisation of the Hasse diagram called the Terms Relationships diagram. Most parts of the algorithm have been implemented in a program written in R. It is shown that the model selection process can be simplified by placing experimental design (crossed/nested structure and randomisation) at the centre of a systematic procedure.
22/04/2013 4:30 PM

130 Wolfson Institute

Steve Coad, School of Mathematical Sciences, QMUL

Inference following adaptive biased coin designs

Suppose that two treatments are being compared in a clinical trial. Then, if complete randomisation is used, the next patient is equally likely to be assigned to one of the two treatments. So this randomisation rule does not take into account the previous treatment assignments, responses and covariate vectors, and the current patient's covariate vector. The use of an adaptive biased coin which takes some or all of this information into account can lead to a more powerful trial.

The different types of such designs which are available are reviewed and the consequences for inference discussed. Issues related to both point and interval estimation will be addressed.
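As one concrete member of this family, the sketch below simulates Efron's biased coin design, in which the next patient is assigned to the currently under-represented treatment with probability p (commonly 2/3) and by a fair coin when the arms are balanced. This rule uses only the previous assignments; designs that also use responses or covariates, as mentioned above, are not covered. The function names and the seed are illustrative choices.

```python
import random

def biased_coin_assign(n_a, n_b, rng, p=2 / 3):
    """Efron-style biased coin: favour the arm with fewer patients so far."""
    if n_a == n_b:
        prob_a = 0.5
    elif n_a < n_b:
        prob_a = p
    else:
        prob_a = 1 - p
    return "A" if rng.random() < prob_a else "B"

def simulate_trial(n_patients, seed=0, p=2 / 3):
    """Sequentially allocate patients and return the final arm sizes."""
    rng = random.Random(seed)
    n_a = n_b = 0
    for _ in range(n_patients):
        if biased_coin_assign(n_a, n_b, rng, p) == "A":
            n_a += 1
        else:
            n_b += 1
    return n_a, n_b

n_a, n_b = simulate_trial(100)
```

Setting p = 1/2 recovers complete randomisation, while p = 1 forces the arms back into balance after every second patient, so p controls the trade-off between balance and unpredictability.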
28/03/2013 4:30 PM

M203

Ivonne Solís, MRC Human Nutrition Research

Graphical models with latent variables and their application in developmental psychology

We present a novel strategy of statistical inference for graphical models with latent Gaussian variables, and observed variables that follow non-standard sampling distributions. We restrict our attention to those graphs in which the latent variables have a substantive interpretation. In addition, we adopt the assumption that the distribution of the observed variables may be meaningfully interpreted as arising after marginalising over the latent variables. We illustrate the method with two studies that investigate developmental changes in cognitive functions of young children in one case and of cognitive decline of Alzheimer’s patients in the other. These studies involve the assessment of competing causal models for several psychological constructs; and the observed measurements are gathered from the administration of batteries of tasks subject to complicated sampling protocols.
21/03/2013 4:30 PM

M203

Magdalena Chudy, EECS QMUL

On the relation between bowing gesture and tone production in classical cello performance. Searching for effective metho

In this presentation I would like to introduce a multimodal database which was created within the scope of my PhD study on “Cello Performer Modelling Using Timbre Features”.

The database consists of bowing gestures and music samples of six cello players recorded on two different cellos. The gesture and audio measurements were collected in order to identify performer-dependent sound features of the players performing on the same instrument and to investigate a potentially existing correlation between the individual sound features and specific bowing control parameters necessary for production of desired richness of tone. The current study goal is to find such combinations of respective bowing gestures and acoustical features which can be seen as patterns and are able to characterise each player in the database.

Following the data presentation I would like to state some other research questions that clearly emerge and open a discussion on analysis methods which could help to answer them.
14/03/2013 3:30 PM

M203

Karla Díaz-Ordaz, LSHTM

Handling missing values in hierarchical clinical trial data

Missing data are common in clinical trials but often analysis is based on “complete cases”. Complete-case analyses (which delete observations with missing information on any studied covariate) are inefficient and may be biased. Methodological guidelines recommend using multiple imputation (MI). However, for MI to provide valid inferences, the imputation model must recognise the study design. In this talk, we will survey current missing data practice in the clinical trials literature and describe current good practice methodology for hierarchical data.

Using real data from a cluster randomized trial as an example, we see how treatment effects can be sensitive to the choice of method to address the missing data problem. We finish by presenting a few results from a large simulation study, designed to compare the performance of: (a) multilevel MI that accounts for clustering through cluster random effects, (b) MI that includes a fixed effect for each cluster and (c) single-level MI that ignores clustering.
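Whichever imputation model is chosen, the analyses of the m imputed datasets are usually combined with Rubin's rules: the pooled estimate is the average of the m estimates, and its variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The sketch below applies these rules to made-up treatment effect estimates; the numbers are purely illustrative.

```python
import statistics

def pool_rubin(estimates, variances):
    """Combine m imputed-data analyses with Rubin's rules."""
    m = len(estimates)
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)        # average sampling variance
    between = statistics.variance(estimates)   # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical treatment effects and squared standard errors from m = 5 imputations.
effects = [1.0, 1.2, 0.8, 1.1, 0.9]
variances = [0.04, 0.05, 0.04, 0.05, 0.04]
pooled_effect, total_variance = pool_rubin(effects, variances)
```

The between-imputation term is where a misspecified imputation model shows up: if clustering is ignored when imputing, both the spread of the estimates and the resulting standard errors can be misleading.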
28/02/2013 4:30 PM

M203

Clifford Lam, Department of Statistics, LSE

Regularization of Spatial Panel Time Series

In this talk we introduce the need for the estimation of cross-sectional dependence, or the "network", of a panel of time series. In spatial econometrics and other disciplines, the so-called spatial weight matrix in a spatial lag model is always assumed known, although it is still debated whether estimation results are sensitive to such assumed weight matrices. Since these weight matrices are often sparse, we propose to estimate them from the data with regularization, using a by now well-known technique, the adaptive LASSO. The use of this technique for quantifying time dependence is relatively new in the statistics and time series literature. Non-asymptotic inequalities, as well as asymptotic sign consistency for the elements of the weight matrices, are presented with explicit rates of convergence spelt out. A block coordinate descent algorithm is presented together with results from simulation experiments and a real data analysis.
21/02/2013 4:30 PM

M203

Dr Mark Strong, Section of Public Health, School of Health and Related Research, The University of Sheffield

Health Economic Model Error and the Expected Value of Model Improvement

George Box famously said “All models are wrong, some are useful”. The challenges are to determine which models are useful and to quantify how wrong is “wrong”. In this talk I will explore the problem of determining model adequacy in the context of health economic decision making.

In health economics, models are used to predict the costs and health benefits under the competing treatment options (e.g. drug A versus drug B).

The decision problem is typically of the following form. An expensive new drug, A, has arrived on the market. Should the NHS use it? How much additional health will society gain if the NHS uses new drug A over existing drug B? What will the extra cost be? What healthcare activity will be displaced if we use drug A rather than drug B? Will this be a good use of scarce healthcare resources?

I will describe a general approach to determining model adequacy that is based on quantifying the “expected value of model improvement”. I will illustrate the method in a case study.
24/01/2013 4:30 PM

M203

Ruby Childs

Making the best out of things

I was once a QMUL Maths student, quite lost about what I wanted to do next. I strove to get into investment banking, but it wasn't all that it seemed, so I had to make a change. I now programme; I'm now creative.

Not everyone knows what to do after finishing university or how to make the best of themselves. Join me to talk about tips on how to do well at university and how to do well afterwards, drawn from my own mistakes.

Slides for the seminar are available following the link: http://prezi.com/_hpds1p9jrqx/making-the-best-out-of-things/
17/01/2013 4:30 PM

M203

Alexis Boukouvalas, Aston Research Centre for Healthy Ageing (ARCHA), Aston University

Optimal Design for Stochastic emulation with heteroscedastic Gaussian Process models

We examine optimal design for parameter estimation of Gaussian process regression models under input-dependent noise. Such a noise model leads to heteroscedastic models as opposed to homoscedastic models where the noise is assumed to be constant. Our motivation stems from the area of computer experiments, where computationally demanding simulators are approximated using Gaussian process emulators as statistical surrogates. In the case of stochastic simulators, the simulator may be evaluated repeatedly for a given parameter setting allowing for replicate observations in the experimental design. Our findings are applicable however in the wider context of design for Gaussian process regression and kriging where the parameter variance is sought to be minimised. Designs are proposed with the aim of minimising the variance of the Gaussian process parameter estimates, that is we seek designs that enable us to best learn about the Gaussian process model.

We construct heteroscedastic Gaussian process representations and propose an experimental design technique based on an extension of Fisher information to heteroscedastic models. We empirically show that although a strict ordering of the Fisher information to the maximum likelihood parameter variance is not exact, the approximation error is reduced as the ratio of replicated points is increased. Through a series of simulation experiments on both synthetic data and a systems biology model, the replicate-only optimal designs are shown to outperform both replicate-only and non-replicate space-filling designs as well as non-replicate optimal designs. We consider both local and Bayesian D-optimal designs in our experiments.
Nested row-column designs for near-factorial experiments with two treatment factors and one control treatment
15/11/2012 4:30 PM

M203

Fatima Jichi, Medical Statistician, UCL School of Life and Medical Sciences

Growth Mixture Modelling of Child Behaviour in a Study of Children Receiving Multidimensional Treatment Foster Care in

The study aims to evaluate the response of children to a new treatment, Multidimensional Treatment Foster Care in England (MTFCE). Trajectories of child behaviour were studied over time to identify subgroups of treatment response.
Growth Mixture Modelling (GMM) was used to find subgroups in the data. A GMM describes longitudinal measures of a single outcome measure as being driven by a set of subject-varying continuous unobserved or latent variables - the so-called growth factors. The growth factors define the individual trajectories. GMM estimates mean growth curves for each class, and individual variation around these growth curves. This allows us to find clusters in the data. Starting characteristics of children were included into the GMM to see if these predicted class membership. Class membership was also checked to see if it predicted outcomes of interest.
08/11/2012 4:30 PM

M203

Hugo Maruri-Aguilar, School of Mathematical Sciences, Queen Mary, University of London

Computer simulators

A computer experiment consists of simulation of a computer model
which is expected to mimic or represent some aspect of reality.
The analysis of computer simulations is a relatively recent newcomer
in the bag of tools available for the statistics practitioner.
Although simulations do not necessarily represent reality, it is
possible to gain knowledge about a certain phenomenon through the
analysis of such simulations, and the role of the statistician is
to design efficient experiments to explore the parameter region
and to model with a reasonable degree of accuracy the response.

I intend to guide the talk through a series of examples derived
from practice, ranging from analysis of airplane blades to the
sensitivity of parameters in a model for disease spread. The
main example will be based on the analysis for a model of the
evolution of rotavirus in a population.
31/05/2012 5:30 PM

M203

Richard Stevens, Senior Statistician, Department of Primary Health Care, University of Oxford

Statistical models for monitoring chronic disease

When setting a monitoring programme for conditions such as diabetes, hypertension,
high blood pressure, kidney disease or HIV, one aspect - the interval between
monitoring tests - is often made by consensus rather than from evidence. The
difficulty with randomized trials in this area is easily demonstrated. Oxford's
Monitoring and Diagnosis group, and collaborators, have used longitudinal modelling
to show that over-frequent monitoring leads to a kind of 'multiple testing' problem
and hence to over-treatment. This talk will discuss the methods we use and
illustrate them with a clinical example.
17/05/2012 4:45 PM

Seminar to be held in Southampton University, Building 54 Room 10037

Heiko Grossmann, School of Mathematical Sciences, QMUL

Analysis of variance for dummies with the AutomaticAnova package

The analysis of variance (Anova) is one of the most popular statistical methods for analysing data. It is most powerful when applied to data from designed experiments. Statistics courses for biologists and other scientists usually explain the underlying theory for simple designs such as the completely randomized design, the randomized complete block design or two-factor factorial designs. However, in applications usually much more advanced designs are used which involve complicated crossing and nesting structures as well as random and fixed effects. Although when designing an experiment scientists can often rely on their intuition, analysing the collected data frequently represents a major challenge to the non-expert.

This talk presents the AutomaticAnova package which has been designed as a user-friendly Mathematica package that enables researchers to analyse complicated Anova models without requiring much statistical background. It is based on RA Bailey's theory of orthogonal designs which covers a wide range of models and in particular all designs that can be obtained by iterative crossing and nesting of factors. The theory distinguishes between block factors, which have random effects, and treatment factors whose effects are fixed and so in general the models are mixed effects models. For these designs the Anova table can be derived in an elegant way by using Hasse diagrams.

The AutomaticAnova package provides a graphical user interface which has been implemented by using Mathematica's GUIKit. Input data are submitted in the form of a Microsoft Excel spreadsheet and essentially the user only has to specify which columns in the spreadsheet represent block and treatment factors respectively, and whether only main effects or main effects and interactions should be included in the analysis. Dynamic enabling/disabling of dialogs and controls minimizes the risk of providing incorrect input information. The most important feature of the AutomaticAnova package is then, however, that the model for the analysis of variance is automatically inferred from the structure of the design in the Excel file. In particular, no model formula needs to be specified for the analysis. It is believed that this aspect of the package's functionality will be highly attractive to practitioners. The output includes the Anova table, estimated variance components and Hasse diagrams and can be saved as a PDF file.

Another feature of the package is that it can be used at the planning stage of an experiment when no response data are yet available to see what the analysis would look like. That is, having only specified the design in the form of a spreadsheet the package provides the so-called skeleton analysis of variance which shows the breakdown of the sum of squares and corresponding degrees of freedom. Also, when no response data are available the package uses Mathematica's symbolic capabilities to derive analytical formulae for the estimators of the variance components.

The presentation will demonstrate the use of the package and describe the principles underlying its implementation. Several examples will be used to illustrate how the AutomaticAnova package can help the non-statistician to analyse complicated Anova designs without having to worry too much about statistics.
17/05/2012 3:15 PM

Seminar to be held in Southampton University, Building 54 Room 10037

Kalliopi Mylona University of Southampton

Analysing data from optimal mixed-level supersaturated designs using group screening

Supersaturated designs (SSDs) are used for screening out the important factors from a
large set of potentially active variables. The huge advantage of these designs is that
they reduce the experimental cost drastically, but their critical disadvantage is the
high degree of confounding among factorial effects. In this contribution, we focus on
mixed-level factorial designs which have different numbers of levels for the factors.
Such designs are often useful for experiments involving both qualitative and quantitative
factors. When analyzing data from SSDs, as in any decision problem, errors of various
types must be balanced against cost. In SSDs, there is a cost of declaring an inactive
factor to be active (i.e. making a Type I error), and a cost of declaring an active
effect to be inactive (i.e. making a Type II error). Type II errors are usually considered
much more serious than Type I errors. We present a group screening method for analysing
data from E(f_{NOD})-optimal mixed-level supersaturated designs possessing the equal
occurrence property. Based on the idea of the group screening methods, the f factors
are sub-divided into g "group-factors". The "group-factors" are then studied using the
penalized likelihood methods involving a factorial design with orthogonal or near-orthogonal
columns. The penalized likelihood methods indicate which "group-factors" have a large
effect and need to be studied in a follow-up experiment. We will compare various methods
in terms of Type I and Type II error rates using a simulation study.

Keywords and phrases: Group screening method, Data analysis, Penalized least squares,
Super-saturated design.
03/05/2012 5:30 PM

M203

Lynn R. LaMotte, Louisiana State University Health Sciences Center, New Orleans

Statistical questions in estimating postmortem interval from insect evidence

Insect evidence around a decomposing body can provide a biological clock
by which the time of exposure can be estimated. As decomposition progresses,
fly larvae grow and go through distinct developmental stages, and a
succession of insect species visits the scene.

Viewed broadly, the question, how long the body has been exposed, fits
into the framework of inverse prediction. However, insect evidence is
both quantitative and categorical. Size data are multivariate, and their
magnitudes, variances, and correlations change with age. Presence/absence
of important species manifests categorically, but the number of distinct
categories can number in the thousands.

The statistical challenge is to devise an approach that can provide a
credible, defensible estimate of postmortem interval based on such data.
In this talk I shall present the setting and describe joint work I have
undertaken with Jeffrey D. Wells, a forensic entomologist, to address
this question.
24/04/2012 5:30 PM

M203

Nicolas SAVY, Université Paul Sabatier - Toulouse 3

On the use of Fleming and Harrington's test to detect late effects in clinical trials

In this work, we deal with the question of detection of late effects in the setting of clinical trials. The most natural test for detecting this kind of effect was introduced by Fleming and Harrington. However, this test depends on a parameter that, in the context of clinical trials, must be chosen a priori.

We examine the reasons why this test is adapted to the detection of late effects by studying its optimality in terms of Pitman Asymptotic Relative Efficiency. We give an explicit form of the function describing alternatives for which the test is optimal. Moreover, we will observe, by means of a simulation study, that this test is not very sensitive to the value of the parameter, which is very reassuring for its use in clinical trials.
29/03/2012 5:30 PM

M203

Professor Byron Jones, Biometrical Fellow, Statistical Methodology Group, Novartis Pharm AG, Basel, Switzerland

Model-Based Bayesian Adaptive Dose Finding Designs for a Phase II Trial

After giving a brief overview of the different phases of drug development,
I will present a case study that describes the planning of a dose-finding study
for a compound that was in early clinical development at the time of the study.
Data from a previous trial with the same primary endpoint was available for a
marketed drug that had the same pharmacological mechanism, which provided
strong prior information for some characteristics of the new compound, including
the shape of the dose-response relationship. The design used for this trial included
an adaptive element where the allocation of doses to the patients was changed
after an interim analysis. In this talk I will compare the performance of different adaptive
designs and compare them to a corresponding non-adaptive design. I will also compare
the performance of Bayesian and model-based maximum likelihood estimation relative
to the use of simple pairwise comparisons of treatment means.
22/03/2012 4:30 PM

M203

Steve Coad, School of Mathematical Sciences, QMUL

Estimation following Adaptively Randomised Clinical Trials

Suppose that two treatments are being compared in a clinical trial
in which response-adaptive randomisation is used. Upon termination of
the trial, interest lies in estimating parameters of interest.
Although the usual estimators will be approximately unbiased for
trials with moderate to large numbers of patients, their biases may
be appreciable for small to moderate-sized trials and the corresponding
confidence intervals may also have coverage probabilities far from the
nominal values. An adaptive two-parameter model is studied in which
there is a parameter of interest and a nuisance parameter. Corrected
confidence intervals based on the signed root transformation are
constructed for the parameter of interest which have coverage probabilities
close to the nominal values for trials with a small number of patients.
The accuracy of the approximations is assessed by simulation for two examples.
An extension of the approach to higher dimensions is discussed.
15/03/2012 4:30 PM

M203

Mohammad Lutfor Rahman, School of Mathematical Sciences, QMUL

Multi-Stratum and Split-Plot Designs in Two Industrial Experiments

Hard-to-set factors lead to split-plot type designs and mixed models. Mixed models are used to
analyze multi-stratum designs as each stratum may have random effects on the responses. It is
usual to use residual maximum likelihood (REML) to estimate random effects and generalized
least squares (GLS) to estimate fixed effects. However, a typical property of REML-GLS estimation
is that it gives highly undesirable and misleading conclusions in non-orthogonal split-plot
designs with few main plots. More specifically, the variance components are often estimated
poorly using maximum likelihood (ML) methods when there are few main plots. To overcome the
problem a Bayesian method considering informative priors for variance components and using
Markov chain Monte Carlo (MCMC) sampling would be an alternative approach.

In the current study we have implemented MCMC techniques in two industrial experiments. During
binary data analysis, we have faced convergence problems frequently. Perhaps these are due to
separation problems in the data. In future, we will define a design criterion that will minimize
the problem of separation.
01/03/2012 4:30 PM

M203

Marco Geraci Institute of Child Health, University College London

Quantile inference for complex survey data with missing values

The estimation of population parameters using complex survey data requires careful statistical modelling to account for the design features. The analysis is further complicated by unit and item nonresponse for which a number of methods have been developed in order to reduce estimation bias.

In this talk I will address some issues that arise when the target of the inference is the conditional quantile of a continuous outcome. Survey design variables are duly included in the analysis and a bootstrap variance estimation approach is considered. A novel multiple imputation method based on sequential quantile regressions (QR) is developed. Such a method is able to preserve the distributional relationships in the data, including conditional skewness and kurtosis, and to successfully handle bounded outcomes. The motivating example concerns the analysis of birthweight determinants in a large cohort of British children.
23/02/2012 4:30 PM

M203

Miguel Juarez, University of Sheffield

From time course gene expression to gene regulatory networks

The accelerated development of high-throughput technologies has enabled understanding of how biological systems function at a molecular level, for instance by unraveling the interaction structure of genes responsible for carrying out a given process. Systems biology has the potential to enhance knowledge acquisition and facilitate the reverse engineering of global regulatory networks using gene expression time course experiments.

In this talk I will present some models we have developed for estimating a gene interaction network from time course experimental data. The basic structure of these models is governed by a dynamic Bayesian network, which allows us to include expert biological information as well. Given the complexity of model fit, we resort to numerical methods for model estimation.

I will exemplify gene network inference using experimental data from the metabolic change in Streptomyces coelicolor and the circadian clock in Arabidopsis thaliana.
16/02/2012 4:30 PM

M203

Mirela Domijan, Warwick Systems Biology Centre

An overview of several methods to analyse dynamics of chemical reaction networks

In order to make sense of many biological processes, it is crucial to understand
the dynamics of the underlying chemical reactions. Chemical reaction systems are
known to exhibit some interesting and complex dynamics, such as multistability
(a situation where two or more stable equilibria coexist) or oscillations.

Here we take the deterministic approach and assume that the reactions obey the
law of mass-action, so the systems are described by ODEs with specific polynomial
structure. For such systems, this polynomial structure allows us to gain
surprisingly deep insights into systems' dynamics.
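
As a small illustration of the polynomial structure mentioned above, consider a single hypothetical reaction A + B -> C with rate constant k (this reaction and the symbols a, b, c for the concentrations are assumptions made for the example, not taken from the talk). Under the law of mass-action the system is described by:

```latex
\begin{align}
  \frac{da}{dt} &= -k\,a\,b, \\
  \frac{db}{dt} &= -k\,a\,b, \\
  \frac{dc}{dt} &= \phantom{-}k\,a\,b .
\end{align}
```

Each right-hand side is a polynomial in the concentrations, with one monomial per reaction, which is the kind of structure the methods in the talk exploit.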

In my talk I will overview several methods for analysing these specific chemical
reaction networks, encompassing algebraic geometry, bifurcation theory and graph
theory.
03/03/2011 4:30 PM

M203

Serge Guillas Department of Statistical Science, University College London

Bayesian calibration and emulation of geophysical computer models

In this talk, we demonstrate a procedure for calibrating and emulating complex computer
simulation models having uncertain inputs and internal parameters, with application to
the NCAR Thermosphere-Ionosphere-Electrodynamics General Circulation Model (TIE-GCM),
and illustrate preliminary findings for Computational Fluid Dynamics and tsunami wave
modelling. In the case of TIE-GCM, we compare simulated magnetic perturbations with
observations at two ground locations for various combinations of calibration parameters.
These calibration parameters are: the amplitude of the semidiurnal tidal perturbation
in the height of a constant-pressure surface at the TIE-GCM lower boundary, the local
time at which this maximises and the minimum night-time electron density.

A fully Bayesian approach that describes correlations in time and in the calibration
input space is implemented. A Markov Chain Monte Carlo (MCMC) approach leads to potential
optimal values for the amplitude and phase (within the limitations of the selected
data and calibration parameters) but not for the minimum night-time electron density.
The procedure can be extended to include additional data types and calibration parameters.
03/06/2010 5:30 PM

M203

Teo Sharia Department of Mathematics Royal Holloway, University of London

On-line parameter estimation procedures with application to estimating autoregressive parameters

Seminar series:

Statistics Seminar

A wide class of on-line estimation procedures will be proposed for the general statistical model.
In particular, new procedures for estimating autoregressive parameters in $AR(m)$ models will be
considered. The proposed method allows for incorporation of auxiliary information into the estimation
process, and is consistent and asymptotically efficient under certain regularity conditions. Also,
these procedures are naturally on-line and do not require storing all the data.
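
To make the idea of an on-line procedure concrete, here is a minimal sketch of a standard recursive least-squares update for an AR(1) model. This is a textbook technique swapped in for illustration, not the speaker's proposed procedure; the function names and the simulation helper are hypothetical. It keeps only the previous observation and two running quantities, so the data never need to be stored.

```python
import random


def online_ar1_estimate(stream):
    """Recursive least-squares estimate of phi in x_t = phi * x_{t-1} + noise."""
    phi_hat = 0.0  # current estimate of the autoregressive parameter
    s = 0.0        # running sum of squared lagged observations
    prev = None    # only the previous observation is kept in memory
    for x in stream:
        if prev is not None:
            s += prev * prev
            if s > 0.0:
                # Correct the estimate by the scaled one-step prediction error.
                phi_hat += (prev / s) * (x - phi_hat * prev)
        prev = x
    return phi_hat


def simulate_ar1(phi, n, seed=0):
    """Yield n observations from an AR(1) process with standard normal noise."""
    rng = random.Random(seed)
    x = 0.0
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        yield x


if __name__ == "__main__":
    print(online_ar1_estimate(simulate_ar1(0.6, 20000)))
```

At every step the recursion reproduces the ordinary least-squares estimate based on the data seen so far, which is why nothing beyond `prev`, `s` and `phi_hat` has to be retained.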

Two important special cases will be considered in detail: linear procedures and likelihood procedures
with the LS truncations. A specific example will also be presented to briefly discuss some practical
aspects of applications of the procedures of this type.
20/05/2010 5:30 PM

Jouni Kuha, Department of Statistics, London School of Economics

The role of education in social mobility - Path analysis for discrete variables

Seminar series:

Statistics Seminar

Classical path analysis provides a simple way of expressing the observed
association of two variables as the sum of two terms which can with good
reason be described as the "direct effect" of one variable on the other
and the "indirect effect" via a third, intervening variable. This result
is used for linear models for continuous variables. It would often be of
interest to have a similar effect decomposition for cases where some of
the variables are discrete and modelled using non-linear models. One
such problem occurs in the study of social mobility, where the aim is to
decompose the association between a person's own and his/her parents'
social classes into an indirect effect attributable to associations
between education and class, and a direct effect not due to differences
in education.
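
For reference, the classical linear decomposition described above can be written out as follows. The notation is assumed for this illustration (x the explanatory variable, m the intervening variable, y the outcome, with error terms u and e uncorrelated with x) and is not taken from the talk.

```latex
\begin{align}
  m &= \alpha_0 + \alpha_1 x + u, \\
  y &= \beta_0 + \beta_1 x + \beta_2 m + e, \\
  y &= (\beta_0 + \beta_2 \alpha_0)
       + \underbrace{(\beta_1 + \alpha_1 \beta_2)}_{\text{total effect}}\, x
       + (\beta_2 u + e).
\end{align}
```

Substituting the first equation into the second gives the third, so the total effect of x on y is the sum of the direct effect \beta_1 and the indirect effect \alpha_1 \beta_2. It is this additive split that has no immediate analogue when some of the models are non-linear.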

Extending the idea of linear path analysis to non-linear models
requires, first, an extended definition of what is meant by total,
direct and indirect effects and, second, a way of calculating sample
estimates of these effects and their standard errors. One solution to
these questions is presented in this talk. The method is applied to data
from the UK General Household Survey, illustrating the magnitude of the
contribution of education to social mobility in Britain in recent
decades.

[This is joint work with John Goldthorpe (Nuffield College, Oxford)]
13/05/2010 5:00 PM

Joint meeting QMUL-S3RI, Statistics Department, Southampton University

Barbara Bogacka, School of Mathematical Sciences, Queen Mary, University of London

First in Human dose selection studies - a lesson from the TGN1412 trial

Seminar series:

Statistics Seminar

In 2006 the TGN1412 clinical trial was suddenly aborted due to a very strong cyto-toxic
reaction of the six volunteers who were treated with the drug candidate. An Expert
Scientific Group on Clinical Trials as well as the RSS Working Party wrote reports on
what happened, how this could have been avoided and what to recommend for future trials
of this kind. I will present some work related to designs of such trials, in particular
my work on adaptive design of experiments. I will also present some recommendations of
the RSS Working Group (Senn et al. 2007).

Senn, S., Amin, D., Bailey, R.A., Bird, S.M., Bogacka, B., Colman, P., Garrett, A., Grieve, A., Lachmann, P. (2007).
Statistical Issues in first-in-man studies. JRSS A.
06/05/2010 5:30 PM

M203

Ioannis Kosmidis, Department of Statistics, The University of Warwick

The reduction of bias in GLMs with emphasis on models with categorical responses

Seminar series:

Statistics Seminar

For estimation in exponential family models, Kosmidis & Firth (2009, Biometrika) show
how the bias of the maximum likelihood estimator may be reduced by appropriate adjustments
to the efficient score function. In this presentation the main results of that study are
discussed, complemented by recent work on the easy implementation and the beneficial
side-effects that bias reduction can have in the estimation of some well-used generalised
linear models for categorical responses. The construction of confidence intervals to
accompany the bias-reduced estimates is discussed.
25/03/2010 4:30 PM

M203

Stefano Conti Health Protection Agency

Dimensions of Design Space: A Decision-Theoretic Approach to Optimal Research Design

Seminar series:

Statistics Seminar

Bayesian decision theory can be used not only to establish the optimal sample
size and its allocation in a single clinical study, but also to identify an optimal
portfolio of research combining different types of study design. Within a single
study, the highest societal pay-off to proposed research is achieved when its
sample sizes, and allocation between available treatment options, are chosen to
maximise the Expected Net Benefit of Sampling (ENBS). Where a number of
different types of study informing different parameters in the decision problem
could be conducted, the simultaneous estimation of ENBS across all dimensions
of the design space is required to identify the optimal sample sizes and allocations
within such a research portfolio. This is illustrated through a simple
example of a decision model of zanamivir for the treatment of influenza. The
possible study designs include:
i) a single trial of all the parameters;
ii) a clinical trial providing evidence only on clinical endpoints;
iii) an epidemiological study of natural history of disease and
iv) a survey of quality of life.
The possible combinations, sample sizes and allocation between trial arms are
evaluated over a range of cost-effectiveness thresholds. The computational challenges
are addressed by implementing optimisation algorithms to search the
ENBS surface more efficiently over such large dimensions.
18/03/2011 4:30 PM

M203

Eleni Bakra MRC Biostatistics Unit, Cambridge

Tempered simplex sampler

Seminar series:

Statistics Seminar

Usual Markov chain Monte Carlo (MCMC) methods use a single Markov chain to sample
from the distribution of interest. If the target distribution is described by isolated
modes then it may be difficult for these methods to jump between the modes and for this
reason, the mixing is slow. Usually different starting positions are used to find
isolated modes but this is not always feasible, especially when the modes are difficult
to find or there is a large number of them. In this talk, I avoid these problems by
introducing a new population MCMC sampler, the tempered simplex sampler. The tempered
simplex sampler uses a tempering ladder to promote mixing while a population of Markov
chains is regarded under each temperature. The sampler proceeds by first updating the
Markov chains under each temperature using ideas from the Nelder-Mead simplex method
and then, by exchanging different populations of Markov chains under different
temperatures. The performance of the tempered simplex sampler is outlined on several
examples.
18/02/2010 4:30 PM

M203

Heiko Grossmann, School of Mathematical Sciences, Queen Mary University

Analysis of an experiment on bumblebee personality

Seminar series:

Statistics Seminar

This talk follows up on a presentation given by Helene Muller from QM's School
of Biological and Chemical Sciences at a statistics study group meeting in
January 2009.

The problem is to devise an appropriate analysis for investigating if bumblebees
behave in a consistent way. The dataset consists of N=729 observations which
represent repeated measurements on 81 bees under various experimental conditions.
A modelling strategy for these data is presented, which leads to fitting a nested
linear mixed model to the Box-Cox transformed responses. The results from the
corresponding analysis appear to be very satisfying and allow a classification
of bees into consistent and inconsistent ones. This is joint work with Helene
Muller and Lars Chittka.
21/01/2010 4:30 PM

M203

Mitra Noosha, Queen Mary, Queen Mary graduate students seminar

Discordance between prior and data using conjugate priors

Seminar series:

Statistics Seminar

In Bayesian Inference the choice of prior is very important to
indicate our beliefs and knowledge. However, if these initial beliefs
are not well elicited, then the data may not conform to our
expectations. The degree of discordancy between the observed data and
the proper prior is of interest. Pettit and Young (1996) suggested a
Bayes Factor to find the degree of discordancy. I have extended their
work to further examples.

I try to find explanations for Bayes Factor behaviour. As an
alternative I have looked at a mixture prior consisting of the
elicited prior and another with the same mean but a larger variance.
The posterior weight on the more diffuse prior can be used as a
measure of the prior and data discordancy and also gives an automatic
robust prior. I discuss various examples and show this new measure is
well correlated with the Bayes factor approach.
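
The mixture-prior measure can be sketched in the simplest conjugate setting. The example below assumes normal data with known variance and a normal prior on the mean; the inflation factor, the prior mixture weight of 0.5 and all function names are illustrative assumptions, not details from the talk.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def diffuse_weight(xbar, n, sigma2, prior_mean, prior_var,
                   inflation=10.0, prior_weight=0.5):
    """Posterior weight on the diffuse component of a two-part mixture prior.

    The elicited prior is N(prior_mean, prior_var); the diffuse component has
    the same mean and variance inflation * prior_var. The data enter through
    the sample mean xbar of n observations with known variance sigma2.
    """
    se2 = sigma2 / n
    # Marginal density of xbar under each prior component (normal-normal model).
    m_elicited = normal_pdf(xbar, prior_mean, prior_var + se2)
    m_diffuse = normal_pdf(xbar, prior_mean, inflation * prior_var + se2)
    numerator = prior_weight * m_diffuse
    return numerator / (numerator + (1.0 - prior_weight) * m_elicited)


if __name__ == "__main__":
    for xbar in (0.0, 2.0, 5.0):
        print(xbar, diffuse_weight(xbar, n=10, sigma2=1.0,
                                   prior_mean=0.0, prior_var=1.0))
```

When the sample mean sits close to the elicited prior mean the weight on the diffuse component falls below its prior value, and it approaches one as the data move away from what the elicited prior expects.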
10/12/2009 4:30 PM

M203

SEMINAR CANCELLED

Seminar series:

Statistics Seminar
12/12/2009 4:30 PM

M203

A.I. Bejan Cambridge University

Inference and Optimal Experimental Design for Random Graph Models
Seminar series:

Statistics Seminar
We consider inference and optimal design problems for finite clusters from bond percolation on the integer lattice Z^d or, equivalently, for SIR epidemics evolving on a bounded or unbounded subset of Z^d with constant life times. The bond percolation probability p is considered to be unknown, possibly depending, through the experimental design, on other parameters. We consider inference under each of the following two scenarios:

The observations consist of the set of sites which are ever infected, so that the routes by which infections travel are not observed (in terms of the bond percolation process, this corresponds to a knowledge of the connected component containing the initially infected site--the location of this site within the component not being relevant to inference for p).

All that is observed is the size of the set of sites which are ever infected.

We discuss practical aspects of Bayesian utility-based optimal designs for the former scenario and prove that the sequence of MLE's for p converges to the critical percolation probability pc under the latter scenario (when the size of the finite cluster grows).
This is joint work with Professor Gavin Gibson and Dr Stan Zachary, both with Heriot-Watt University, Edinburgh.
26/11/2009 4:30 PM

M203

M.J. Costa University of Warwick

tba

Seminar series:

Statistics Seminar
19/11/2009 4:30 PM

M203

S.G. Gilmour Queen Mary

Analysing Categorical Data from Multi-Stratum Designs

Seminar series:

Statistics Seminar
23/11/2011 12:00 PM

203

Jessica Enright

TBA

Seminar series:

Queen Mary Internal Postgraduate Seminar
12/11/2009 4:30 PM

M203

W. Yeung, Queen Mary, Queen Mary graduate students seminar

tba

Seminar series:

Statistics Seminar
29/10/2009 4:30 AM

M203

H. Maruri-Aguilar Queen Mary

Designs for computer experiments

Seminar series:

Statistics Seminar

When modelling a computer experiment, the deviation between model and simulation data is due only to the bias (discrepancy) between the model for the computer experiment and the deterministic (albeit complicated) computer simulation. For this reason, replications in computer experiments add no extra information and the experimenter is more interested in efficiently exploring the design region.

I'll present a survey of designs useful for exploring the design region and for modelling computer simulations.
22/10/2010 5:30 PM

M203

K. Anaya-Izquierdo Open University

Sensitivity analysis, cuts and geometry

Seminar series:

Statistics Seminar

Sensitivity analysis in statistical science studies how scientifically relevant changes in the way we formulate problems affect answers to our questions of interest. New advances in statistical geometry allow us to build a rigorous framework in which to investigate these problems and develop insightful computational tools, including new diagnostic measures and plots.

This talk will be about statistical model elaboration using sensitivity analysis aided with geometry. Throughout we assume there is a working parametric model. The key idea here is to explore discretisations of the data, at which point multinomial distributions become universal (all possible models are covered). The resulting structure is well-suited to discussing practically important statistical topics, such as exponential families and generalised linear models. The theory of cuts in exponential families allows clean inferential separation between interest and nuisance parameters and provides a basis for appropriate model elaboration. Examples are given where the resulting sensitivity analyses indicate the need for specific model elaboration or data re-examination.
08/10/2009 5:45 PM

M203

K.G. Russell University of Wollongong and S3RI Joint meeting with the Southampton Statistical Sciences Research Inst

D-Optimal designs for Poisson regression

Seminar series:

Statistics Seminar

08/10/2009 4:15 PM

M203

R.A. Bailey Queen Mary Joint meeting with the Southampton Statistical Sciences Research Institute

Conflicts between optimality criteria for block designs with low replication

Seminar series:

Statistics Seminar

The quality of incomplete-block designs is commonly assessed by the A-, D-, and E-optimality criteria. If there exists a balanced incomplete-block design for the given parameters, then it is optimal on all these criteria. It is therefore natural to use the proxy criteria of (almost) equal replication and (almost) equal concurrences when choosing a block design.

However, work over the last decade for block size 2 has shown that when the number of blocks is near the lower limit for estimability of all treatment contrasts then the D-criterion favours very different designs from the A- and E-criteria. In fact, the A- and E-optimal designs are far from equi-replicate and are amongst the worst on the D-criterion.

I shall report on current work which extends these results to all block sizes. Thus the problem is not blocks of size 2; it is low replication.

Attachment	Size
Slides for talk [PDF 1,147KB]	1.12 MB

For 2024, the talks are held on Wednesdays at 14:00-15:00 in room MB-503 on floor 5 of the School of Mathematical Sciences Building, Queen Mary University of London.

The seminar is organised in a hybrid fashion. Attendance can be either in-person or via Zoom using that link.

The current seminar organisers are Arthur Guillaumin and Kostas Papafitsoros
