Statistics and Data Science Seminar

Date | Room | Speaker | Title

03/04/2024 2:00 PM | MB503 | Prof. Maria Grith (Erasmus University Rotterdam) | Neural Tangent Kernel in Implied Volatility Forecasting: A Nonlinear Functional Autoregression Approach
Implied volatility (IV) forecasting is inherently challenging due to its high dimensionality across various moneyness and maturity, and nonlinearity in both spatial and temporal aspects. We utilize implied volatility surfaces (IVS) to represent comprehensive spatial dependence and model the nonlinear temporal dependencies within a series of IVS. Leveraging advanced kernel-based machine learning techniques, we introduce the functional Neural Tangent Kernel (fNTK) estimator within the Nonlinear Functional Autoregression framework, specifically tailored to capture intricate relationships within implied volatilities. We establish the connection between NTK and functional kernel regression, emphasizing its role in contemporary nonparametric statistical modeling. Empirically, we analyze S&P 500 Index options from January 2009 to December 2021, encompassing more than 6 million European calls and puts, thereby showcasing the superior forecast accuracy of fNTK. We demonstrate the significant economic value of having an accurate implied volatility forecaster within trading strategies. Notably, short delta-neutral straddle trading, supported by fNTK, achieves a Sharpe ratio ranging from 1.45 to 2.02, resulting in a relative enhancement in trading outcomes ranging from 77% to 583%.

10/04/2024 2:00 PM | MB503 | Dr Nicolas Hernandez (QMUL) | Simultaneous predictive confidence bands for functional time series models
Functional Time Series (FTS) are sequences of dependent random elements taking values on some functional space. Most of the research on this domain focuses on producing a predictor able to forecast the next function, having observed a part of the sequence. For this, the Autoregressive Hilbertian process is a suitable framework. Here, we address the problem of constructing simultaneous predictive confidence bands for a stationary FTS. The method is based on an entropy measure for stochastic processes. To construct predictive bands, we use a functional bootstrap procedure that allows us to estimate the prediction law, and a Reproducing Kernel Hilbert Space (RKHS) to represent the functions, considering then the basis associated with the reproducing kernel. We then classify the points on the projected space according to those that belong to the minimum entropy set (MES) and those that do not. We map the minimum entropy set back to the functional space and construct a band using the regularity property of the RKHS. The proposed methodology is illustrated through artificial and real data sets.

09/02/2012 4:30 PM | M203 | Steve Bush, School of Mathematical Sciences, University of Technology, Sydney | Optimal Designs for Stated Choice Experiments that Incorporate Position Effects
Choice experiments are widely used in transportation, marketing, health and environmental research to measure consumer preferences. From these consumer preferences, we can calculate willingness to pay for an improved product or state, and hence make policy decisions based on these preferences.
In a choice experiment, we present choice sets to the respondent sequentially. Each choice set consists of m options, each of which describes a product or state, which we generically call an item. Each item is described by a set of attributes, the features that we are interested in measuring. Respondents are asked to select the most preferred item in each choice set. We then use the multinomial logit model to determine the importance of each attribute.
In some situations we may be interested in whether an item's position within the choice set affects the probability that the item is selected. This problem is reminiscent of donkey voting in elections, and can also be seen in the design of tournaments, where the home team is expected to have an advantage.
In this presentation, we present a discussion of stated choice experiments, and then discuss a model that incorporates position effects for choice experiments with arbitrary m. This is an extension of the model proposed by Davidson and Beaver (1977) for m = 2. We give optimal designs for the estimation of attribute main effects plus the position effects under the null hypothesis of equal selection probabilities. We conclude with some simulations that compare how well optimal designs and near-optimal designs estimate the attribute main effects and position effects.
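
As a rough illustration of the multinomial logit model mentioned in this abstract, the sketch below computes selection probabilities for a single choice set of m items. It is not the speaker's model: the additive position effect, the function names `item_utility` and `mnl_choice_probabilities`, and all numerical values are assumptions made for this example.

```python
import math


def item_utility(attribute_levels, main_effects, position_effect=0.0):
    """Utility of one item: attribute main effects plus a position effect.

    The additive form is an assumption made for this sketch.
    """
    return sum(b * x for b, x in zip(main_effects, attribute_levels)) + position_effect


def mnl_choice_probabilities(utilities):
    """Multinomial logit selection probabilities for the items in one choice set."""
    # Subtracting the largest utility avoids overflow and leaves the probabilities unchanged.
    shift = max(utilities)
    weights = [math.exp(u - shift) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical choice set with m = 3 items and two attributes coded -1/+1.
main_effects = [0.8, -0.4]            # illustrative values, not estimates from the talk
position_effects = [0.3, 0.0, -0.3]   # illustrative: the first position is favoured
items = [[1, -1], [-1, 1], [1, 1]]

utilities = [item_utility(x, main_effects, p) for x, p in zip(items, position_effects)]
probabilities = mnl_choice_probabilities(utilities)
print([round(p, 3) for p in probabilities])
```

Setting all utilities equal reproduces the null hypothesis of equal selection probabilities, where each of the m items is chosen with probability 1/m.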

26/01/2012 10:56 AM | M203 | Helen Warren, London School of Hygiene and Tropical Medicine | Derivation and Assessment of Robustness Criteria for Vulnerability of Block Designs in the Event of Observation Loss
This presentation summarises the main content of my PhD, researching into
the robustness of incomplete block designs, which introduces a Vulnerability
Measure to determine the likelihood of a design becoming disconnected with
inestimable treatment contrasts, as a result of random observation loss. For
any general block design, formulae have been derived and a program has been
written to calculate and output the vulnerability measures.
Comparisons are made between the vulnerability and optimality of designs.
The vulnerability measures can aid in design construction, be used as a pilot
procedure to ensure the proposed design is sufficiently robust, or as a method
of design selection by ranking the vulnerability measures of a set of competing
designs in order to identify the least vulnerable design. In particular, this can
distinguish between non-isomorphic BIBDs. By observing combinatorial relationships
between concurrences and block intersections of designs, this ranking
method is compared with other approaches in the literature that consider the
effects on the efficiency of BIBDs, by either the loss of two complete blocks, or
the loss of up to three random observations.
The loss of whole blocks of observations is also considered, presenting improvements
on bounded conditions for the maximal robustness of designs.
Special cases of design classes are considered, e.g. complement BIBDs and repeated
BIBDs, as well as non-balanced designs such as Regular Graph Designs.

17/07/2011 3:00 PM | Mathematics Lecture Theater | Hugo Maruri-Aguilar, School of Mathematical Sciences, Queen Mary, University of London | Smooth polynomial methods for computer simulations
Smooth supersaturated polynomial interpolators (Bates et al. 2009)
are an alternative for modelling computer simulations. They have
the flexibility of polynomial modeling, while avoiding the
inconvenience of undesired polynomial oscillations (i.e. Runge's
phenomenon). Smooth polynomials have been observed to be most
effective for small sample sizes, although their use is not
restricted in this respect. The talk will survey the smooth
polynomial technique, comparing with traditional alternatives like
kriging or thin-plate splines. Extensions and examples will be
presented.
This is joint work with Henry Wynn and Ron Bates (LSE). 

02/06/2011 5:30 PM | M203 | Wai Yin Yeung, School of Mathematical Sciences, Queen Mary, University of London | The power of biased coin designs
The biased coin design introduced by Efron (1971, Biometrika) is a design for allocating patients in clinical
trials which helps to maintain the balance and randomness of the experiment. Its power is studied by Chen (2006,
Journal of Statistical Planning and Inference) and compared with that of repeated simple random sampling when
there are two treatment groups and patients’ responses are normally distributed. Another design similar to Efron’s
biased coin design called the adjustable biased coin design has been developed by Baldi Antognini and Giovagnoli
(2004, Journal of the Royal Statistical Society Series C) for patient allocation. Both designs aim to balance the
number of patients in two treatment groups. Baldi Antognini (2008, Journal of Statistical Planning
and Inference) shows theoretically that the adjustable biased coin design is uniformly more powerful than Efron’s biased coin
design. This means that the adjustable biased coin design gives a more balanced trial than Efron’s biased coin design.
Moreover, the biased coin design methods can also be applied to patients grouped by prognostic factors in order
to balance the number of patients in two treatments for each of the factors. This is called the covariate-adaptive
biased coin design by Shao, Yu and Zhong (2010, Biometrika). It is believed that the covariate-adaptive biased
coin design gains more power than Efron’s biased coin design and recently the covariate-adjustable biased coin
design is also under investigation. However, the case when there is an interaction between covariates has not been
looked at in detail for any of the above designs.
This talk will consist of three parts. First, numerical values for the simulated power of the adjustable
biased coin design, which has not been studied before, will be shown and compared with the simulated power of
Efron’s biased coin design. Then, the powers of repeated simple random sampling and the biased coin design will
be studied when responses are binary. The theoretical calculations and exact numerical results will then be given
for the unconditional powers of the two designs for binary responses. Finally, the expression for the power of
covariate-adaptive randomization by normal approximation will be introduced. Numerical values for the normal
approximation will also be given to compare with the exact value of the biased coin design. In addition, for
the covariate-adaptive biased coin design, the ideas of global and marginal balance will be introduced and
compared when there are interactions between the covariates.
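
To make the allocation rule in this abstract concrete, the following minimal sketch simulates Efron's biased coin design for two treatments and compares its average imbalance with simple randomization. The rule itself (assign to the under-represented arm with probability p, with p = 2/3 as a common choice) is standard; the function names, trial sizes and seeds are assumptions for this example, and the sketch does not reproduce the power calculations of the talk.

```python
import random


def efron_biased_coin(n_patients, p=2 / 3, seed=None):
    """Allocate patients to arms A and B using Efron's biased coin rule."""
    rng = random.Random(seed)
    n_a = n_b = 0
    allocations = []
    for _ in range(n_patients):
        if n_a == n_b:
            prob_a = 0.5          # balanced so far: toss a fair coin
        elif n_a < n_b:
            prob_a = p            # A is under-represented: favour A
        else:
            prob_a = 1 - p        # B is under-represented: favour B
        if rng.random() < prob_a:
            allocations.append("A")
            n_a += 1
        else:
            allocations.append("B")
            n_b += 1
    return allocations


def mean_abs_imbalance(n_patients, p, n_trials, seed=0):
    """Average |n_A - n_B| at the end of the trial over repeated simulated trials."""
    total = 0
    for t in range(n_trials):
        alloc = efron_biased_coin(n_patients, p=p, seed=seed + t)
        total += abs(alloc.count("A") - alloc.count("B"))
    return total / n_trials


# p = 0.5 reduces the rule to simple (complete) randomization.
print("biased coin:", mean_abs_imbalance(50, p=2 / 3, n_trials=2000))
print("simple randomization:", mean_abs_imbalance(50, p=0.5, n_trials=2000))
```

The adjustable and covariate-adaptive variants discussed in the abstract change how `prob_a` depends on the current imbalance; they are not implemented here.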

05/05/2011 5:30 PM | M103 | Prof. Dr. Vladimir V. Anisimov, Senior Director, Research Statistics Unit, Quantitative Sciences, GlaxoSmithKline | Statistical techniques for predictive patient recruitment, randomization and drug supply modelling in multicentre clinic
A large clinical trial for testing a new drug usually involves a large number of patients and is
carried out in different countries using multiple clinical centres. The patients are recruited in
different centres, after a screening period they are randomized to different treatments according
to some randomization scheme and then get a prescribed drug. A design of multicenter clinical trials
consists of several stages including statistical design (choosing a statistical model for the analysis
of patient responses, randomization scheme, sample size needed for testing hypothesis, etc.) and
predicting patient recruitment and the drug supply needed to cover patient demand.
The talk is devoted to the discussion of the advanced statistical techniques for modelling and
predicting stochastic processes describing the behaviour of a trial over time. For modelling patient
recruitment, an innovative predictive analytic statistical methodology is developed [1,2,3].
Patient flows are modelled by using Poisson processes with random delays and gamma distributed
rates. ML and Bayesian techniques for estimating parameters using recruitment data and asymptotic
approximations for creating predictive bounds in time for the number of patients in centres/regions
are developed. This also allows evaluation of the optimal number of clinical centres needed to complete
the trial before the deadline with a given probability, and prediction of trial performance.
This technique is extended further to predicting the number of different events in trials with
waiting time to response. Closed-form analytic expressions for the predictive distributions are
derived. Implementation in oncology trials is considered.
The technique for predicting the number of patients randomized to different treatments for the basic
randomization schemes – unstratified and centre/region-stratified – is developed, and the impact of
the randomization process on the statistical power and sample size of the trial is also investigated [4].
Using these results, an innovative risk-based statistical approach to predicting the amount of drug
supply required to cover patient demand with a given risk of stockout is developed [3]. Software
tools in R for patient recruitment, event modelling and drug supply modelling based on these
techniques have been developed. These tools are being implemented at GSK and have already led to
significant benefits and cost savings.
References
[1] Anisimov, V.V., Fedorov, V.V., Modeling, prediction and adaptive adjustment of recruitment in
multicentre trials. Statistics in Medicine, Vol. 26, No. 27, 2007, pp. 4958-4975.
[2] Anisimov, V.V., Recruitment modeling and predicting in clinical trials, Pharmaceutical
Outsourcing. Vol. 10, Issue 1, March/April 2009, pp. 44-48.
[3] Anisimov, V.V., Predictive modelling of recruitment and drug supply in multicenter clinical
trials. Proc. of the Joint Statistical Meeting, Washington, USA, August, 2009, pp. 1248-1259.
[4] Anisimov, V., Impact of stratified randomization in clinical trials, In: Giovagnoli A., Atkinson
AC., Torsney B. (Eds), MODA 9 - Advances in Model-Oriented Design and Analysis. Physica-Verlag/Springer,
Berlin, 2010, pp. 1-8.
[5] Anisimov, V., Drug supply modeling in clinical trials (statistical methodology), Pharmaceutical
Outsourcing, May/June, 2010, pp. 50-55.
[6] Anisimov, V.V., Effects of unstratified and centre-stratified randomization in multicentre clinical
trials. Pharmaceutical Statistics, v. 10, iss. 1, 2011, pp. 50-59.
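
The abstract above describes recruitment modelled by Poisson processes with gamma distributed rates. The sketch below is a simplified Monte Carlo version of that idea, assuming every centre starts recruiting at time zero (so the random delays are ignored) and using hypothetical parameter values; the function names are illustrative and this is not the R software mentioned in the abstract.

```python
import random


def poisson_count(rate, duration, rng):
    """Number of arrivals of a Poisson process with the given rate in [0, duration]."""
    elapsed, count = 0.0, 0
    while True:
        elapsed += rng.expovariate(rate)   # exponential gap between arrivals
        if elapsed > duration:
            return count
        count += 1


def simulate_total_recruitment(n_centres, shape, rate_param, duration, rng):
    """Total patients recruited when each centre's rate is drawn from Gamma(shape, rate_param)."""
    total = 0
    for _ in range(n_centres):
        # random.gammavariate takes (shape, scale), so the scale is 1 / rate_param.
        centre_rate = rng.gammavariate(shape, 1.0 / rate_param)
        total += poisson_count(centre_rate, duration, rng)
    return total


# Hypothetical trial: 20 centres, mean rate 2 patients per month, 6 months of recruitment.
rng = random.Random(2011)
totals = sorted(simulate_total_recruitment(20, 2.0, 1.0, 6.0, rng) for _ in range(2000))
lower = totals[int(0.05 * len(totals))]
upper = totals[int(0.95 * len(totals)) - 1]
print("mean:", sum(totals) / len(totals))
print("approximate 90% predictive interval:", (lower, upper))
```

The empirical quantiles give simulated predictive bounds; the abstract refers instead to analytic and asymptotic approximations, which this sketch does not attempt.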

31/03/2011 5:30 PM | M203 | Roger Sugden, School of Mathematical Sciences, QMUL | Prediction under unequal probability sampling
I consider a very simple prediction problem and contrast two
classical approaches with the Bayesian approach: firstly
in the case of no selection (or selection at random) and
secondly with limited design information in the form of
unequal probability weights for the sampled units.
I find the Bayesian approach much less ad hoc than the
alternatives! 

24/03/2011 4:30 PM | M203 | Muddakkir Manas Khadim, School of Mathematical Sciences, QMUL | An Algorithm for Generating a Response Surface Split-plot Design
The estimation of the variance components of a response surface model for a
split-plot design has been of much interest in recent years. Different techniques
are available for estimating these variance components, including REML, a
Bayesian approach, the replication of the center point runs and a
randomization-based approach. The available number of degrees of freedom is also an
important issue when estimating these variance components. In our talk, we
will present an algorithm for generating a D-optimal split-plot design such that
the generated design has a required number of degrees of freedom for estimating
the variance components using the randomization-based approach. One advantage
of using this approach is that it gives pure error estimates of the variance
components. 

17/03/2011 4:30 PM | M203 | M. Sofia Massa, Department of Statistics, University of Oxford | Graphical models combination
In some recent applications, the interest is in combining information about relationships between variables from independent studies performed under partially comparable circumstances. One possible way of formalising this problem is to consider combinations of families of distributions respecting conditional independence constraints with respect to a graph G, i.e., graphical models. In this talk I will start by giving a brief introduction to graphical models and by introducing some motivating examples of the research question. Then I will present some relevant types of combinations and associated properties. Finally I will discuss some issues related to the estimation of the parameters of the combination.

10/03/2011 4:30 PM | M203 | Shahrul Mt-Isa, Imperial College London | Improving Evidence-Based Risk-Benefit Decision-Making of Medicines for Children
Risk-benefit assessment for decision-making based on evidence is a subject of continuing interest. However, randomised clinical trial evidence of risks and benefits is not always available, especially for drugs used in children, mainly due to ethical concerns about children being subjects of clinical trials. This thesis appraises risk-benefit evidence from published trials in children for the case study, assesses the risk-benefit balance of drugs, proposes a framework for risk-benefit evidence synthesis, and demonstrates the extent of its contribution.
The review shows that trial designs lack safety planning, leading to inconsistent safety reporting, and a lack of efficacy evidence. The General Practice Research Database (GPRD) data was exploited to synthesise evidence of risks of cisapride and domperidone in children with gastro-oesophageal reflux as a case study. Efficacy data are only available through review evidence.
Analysis of prescribing trends does not identify further risk-benefit issues but suggests that the lack of evidence has led to inappropriate prescribing in children. Known adverse events are defined from the British National Formulary and quantified. The proportional reporting ratio technique is applied to other clinical events to generate potential safety signals. Signals are validated and analysed for confirmatory association through covariate adjustment in regressions. The degree of association between signals and drugs is assessed using Bradford Hill’s criteria for causation. Verified risks are known adverse events with 95% statistical significance, and signals in the abdominal pain group and the bronchitis and bronchiolitis group.
The drugs’ risk-benefit profiles are illustrated using the two verified signals and an efficacy outcome. Sensitivity of input parameters is studied via simulations. The findings are used to hypothetically advise risk-benefit aspects of trial designs. The value of information from this study varies between stakeholders and the keys to communicating risks and benefits lie in presentation and understanding. The generalisability and scope of the proposed methods are discussed.
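
The proportional reporting ratio mentioned in this abstract can be sketched from a 2x2 table of event counts. The formula PRR = [a/(a+b)] / [c/(c+d)] and the log-scale confidence interval are the usual textbook forms; the counts below are made up for illustration and are not GPRD results, and the function names are assumptions of this example.

```python
import math


def proportional_reporting_ratio(a, b, c, d):
    """PRR from a 2x2 table.

    a: reports of the event of interest for the drug of interest
    b: reports of all other events for the drug of interest
    c: reports of the event of interest for the comparator drugs
    d: reports of all other events for the comparator drugs
    """
    return (a / (a + b)) / (c / (c + d))


def prr_confidence_interval(a, b, c, d, z=1.96):
    """Approximate confidence interval for the PRR, computed on the log scale."""
    log_prr = math.log(proportional_reporting_ratio(a, b, c, d))
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return math.exp(log_prr - z * se), math.exp(log_prr + z * se)


# Hypothetical counts, for illustration only.
counts = (20, 80, 10, 190)
print("PRR:", proportional_reporting_ratio(*counts))
print("95% CI:", prr_confidence_interval(*counts))
```

A lower confidence limit above 1 is one commonly used screening criterion for a potential signal; the validation and regression steps described in the abstract go beyond this sketch.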

24/02/2011 4:30 PM | M203 | Gemma Stephenson, National Oceanography Centre, Southampton | Using derivative information in the statistical analysis of computer models
Complex deterministic dynamical models are an important tool for climate prediction. Often though, such models are computationally too expensive to perform the many runs required. In this case one option is to build a Gaussian process emulator which acts as a surrogate, enabling fast prediction of the model output at specified input configurations. Derivative information may be available, either through the running of an appropriate adjoint model or as a result of some analysis previously performed. An emulator would likely benefit from the inclusion of this derivative information. Whether further efficiency is achieved, however, depends on the computational cost of obtaining the derivatives. Results of the emulation of a radiation transport model, with and without derivatives, are presented.
The knowledge of the derivatives of complex models can add greatly to their utility, for example in the application of sensitivity analysis or data assimilation. One way of generating such derivatives, as suggested above, is by coding an adjoint model. In climate science in particular adjoint models are becoming increasingly popular, despite the initial overhead of coding the adjoint and the subsequent, additional computational expense required to run the model.
We suggest an alternative method for generating partial derivatives of complex model output, with respect to model inputs. We propose the use of a Gaussian process emulator which can be used to estimate derivatives even without any derivative information known a priori. We show how an emulator can be employed to provide derivative information about an intermediate complexity climate model, C-GOLDSTEIN, and compare the performance of such an emulator to the C-GOLDSTEIN adjoint model.
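
The idea of estimating derivatives from an emulator built without derivative information can be sketched in one dimension. The example below fits a zero-mean Gaussian process with a squared-exponential kernel to a few runs of a toy simulator and differentiates the posterior mean analytically. The toy simulator (`math.sin`), the kernel choice, the length scale, the jitter and all function names are assumptions for this example; it is not the emulator used for C-GOLDSTEIN.

```python
import math


def kernel(x1, x2, length_scale=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-((x1 - x2) ** 2) / (2.0 * length_scale ** 2))


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_emulator(xs, ys, length_scale=1.0, jitter=1e-6):
    """Return the weights alpha = (K + jitter * I)^{-1} y of the posterior mean."""
    k = [[kernel(a, b, length_scale) + (jitter if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    return solve(k, ys)


def emulator_mean(x_new, xs, alpha, length_scale=1.0):
    """Posterior mean of the emulator at x_new."""
    return sum(kernel(x_new, x, length_scale) * a for x, a in zip(xs, alpha))


def emulator_derivative(x_new, xs, alpha, length_scale=1.0):
    """Derivative of the posterior mean with respect to the input at x_new."""
    return sum(-(x_new - x) / length_scale ** 2 * kernel(x_new, x, length_scale) * a
               for x, a in zip(xs, alpha))


# Toy stand-in for an expensive simulator, run at nine design points.
design = [-2.0 + 0.5 * i for i in range(9)]
runs = [math.sin(x) for x in design]
weights = fit_emulator(design, runs)
print("emulated derivative at 0.25:", emulator_derivative(0.25, design, weights))
print("true derivative at 0.25:", math.cos(0.25))
```

Only the posterior mean is differentiated here; a fuller treatment would also propagate the emulator's uncertainty to the derivative.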

17/02/2011 3:30 PM | Mathematics Lecture Theater | Ben Parker, School of Mathematical Sciences, Queen Mary, University of London | Design of Experiments for Markov Chains, or how often should we open the box?
Suppose we have a system that we wish to make repeated measurements on,
but where measurement is expensive or disruptive. Motivated by an
example of probing data networks, we model this as a black box system:
we can either choose to open the box or not at any time period, and our
aim is to infer the parameters that govern how the system evolves over
time.
By regarding this system evolution as an experiment that is to be
optimised, we present a method for finding optimal time points at which
to measure, and discuss some numerical results.
We show how we can generalise this result to find optimal measurement
times for any system that evolves according to the Markov principle.
This is joint work with Steven Gilmour and John Schormans (Queen Mary). 

17/02/2011 4:30 PM | M203 | Stefanie Biedermann, University of Southampton | Optimal Designs for Indirect Regression
In many real life applications, it is impossible to observe the feature of interest
directly. For example, non-invasive medical imaging techniques rely on indirect
observations to reconstruct an image of the patient’s internal organs. In this paper,
we investigate optimal designs for such indirect regression problems.
We use the optimal designs as benchmarks to investigate the efficiency of designs
commonly used in applications. Several examples are discussed for illustration.
Our designs provide guidelines to scientists regarding the experimental conditions
at which the indirect observations should be taken in order to obtain an accurate
estimate for the object of interest.
This is joint work with Nicolai Bissantz and Holger Dette (Bochum) and Edmund
Jones (Bristol). 

10/02/2011 4:30 PM | M203 | Theodore Papamarkou, Non-Communicable Disease (NCD) Research Group, Strangeways Research Laboratory, University of Cambridge | Patterns of Ethno-Linguistic and Genomic Diversity in Sub-Saharan Africa
Sub-Saharan African populations are characterized by a relatively complex genetic
architecture. Their excessive allele frequency differentiation, linkage disequilibrium
patterns and haplotype sharing have been understudied. The aim of our newly launched
project on African diversity is to understand the genetic diversity among sub-Saharan
African populations and its correlation with ethnic, archaeological and linguistic
variation. Ultimately, the study is hoping to disentangle past population histories
and therefore detect the evolutionary history of sub-Saharan African populations,
who are the origin of anatomically modern humans. Additionally, subsequent
genome-wide association studies, mostly related to lipid metabolism, are expected
to identify previously unsuspected biological pathways involved in disease etiology.
The talk is meant to broadly address the scope of the study and to outline the
associated statistical challenges. 

03/02/2011 4:30 PM | M203 | Dave Bray, School of Mathematical Sciences (Queen Mary) and Department of Mechanical Engineering (Imperial College) | Nanoparticles Dispersion: A Quantitative Measurement
Nanoparticle clustering within composite materials is known to affect the performance of the material, such as its toughness, and can ultimately cause its mechanical failure.
The type of nanoparticle dispersion is often judged through micrographs of the material, obtained using an electron or atomic force microscope. However no standard quantitative method is in use for classifying these materials into good (homogeneous) and poor (heterogeneous) dispersion. For material scientists it is of pressing concern that a suitable method is found to measure particle dispersion to enable further progress to be made in understanding the effect of morphology on the material properties.
This talk aims to be of general interest, providing the engineering background, proposed method and measurement results of test cases. 

27/01/2011 4:30 PM | M203 | Ron Bates, Design Systems Engineering, Rolls-Royce plc | Stochastic analysis models in the development of turbomachinery
The main focus of this seminar is the industrial application of tools
for stochastic analysis within the standard engineering design process.
Various applications will be discussed and the use of statistical methods
in engineering will be highlighted.
The talk will also explore aspects of uncertainty management, and highlight
some of the challenges faced in delivering practical stochastic analysis
methods for the engineering community. 

21/01/2011 3:00 PM | M513 | Robert Mee, University of Tennessee, Knoxville | One-Step RSM Using Fractional Box-Behnken Designs
In contrast to the usual sequential nature of response surface methodology
(RSM), recent literature has proposed both screening and response surface
exploration using a single three-level design. This approach is known as
“one-step RSM”. We discuss and illustrate shortcomings of the current
one-step RSM designs and analysis. Subsequently, we propose a class of
three-level designs and an analysis that will address these shortcomings. We
illustrate the designs and analysis with simulated and real data. 

20/01/2011 4:30 PM | M203 | Salvador Gezan, Department of Statistics, Institute of Food and Agricultural Sciences, University of Florida | Optimal design and analysis of field genetic trials: using old and new statistical tools.
Agronomic and forestry breeding trials tend to be large, often using hundreds of plants
and showing considerable spatial variation. In this study, we present various
alternatives for the design and analysis of field trials to identify “optimal” or
“near optimal” experimental designs and statistical techniques for estimating genetic
parameters through the use of simulated data for single site analysis. These simulations
investigated the consequences of different plot types (single or four-plant row),
experimental designs and patterns of environmental heterogeneity.
Also, spatial techniques such as nearest neighbor methods and modeling of the error
structure by specifying an autoregressive covariance were compared. Because spatial
variation cannot usually be accounted for in the trial design another strategy is to
improve trial analysis by using post-hoc blocking. We studied several typical experimental
designs and compared their efficiency with post-hoc blocking of the same designs over a
randomized complete block.
Usually, in the early stages of a breeding program there is a large availability of genotypes that
could be tested but limited resources. Here, unreplicated trials have been recommended as an
option to support early screening of genetic material that can be preselected and later
tested in more formal replicated trials for single or multiple sites. In this study, we provide
a better evaluation and understanding of the statistical and genetic advantages and disadvantages
of the use of unreplicated trials by using simulated data, particularly for clonal trials, under
different replication alternatives. We also measure the gain in precision of using spatial
analysis in unreplicated trials and evaluate the effects of different genetic structures
(additive, dominant and epistasis) on these analyses. 

25/11/2010 4:30 PM | M203 | Mohammad Lutfor Rahman, School of Mathematical Sciences, Queen Mary, University of London | Multistratum designs with categorical responses
It is not possible to completely randomize the order of runs in some multifactor factorial experiments.
This often results in a generalization of the factorial designs called split-plot designs. Sometimes in
industrial experiments complete randomization is not feasible because of having some factors whose
levels are difficult to change. When properly taken into account at the design stage, hard-to-change
factors lead naturally to multistratum structures. Mixed models are used to analyze multistratum
designs as each stratum may have random effects on the responses. We intend to design
experiments and analyze categorical data with hard-to-set factors with the motivation of random
effects structure in the mixed models. The current study is motivated by a polypropylene experiment
by four Belgian companies where responses are continuous and categorical. We have analyzed the
data from the current experiment using mixed binary logit and mixed cumulative logit models in a
Bayesian approach. Also we obtained outputs following the simplified models by Goos and Gilmour
(2010). While simplified models were used, the outputs obtained by Bayesian methods were similar to
those obtained by likelihood methods as non-informative priors were considered for the fixed effects.

25/11/2010 5:00 PM | M203 | Benjamin Gaby, School of Mathematical Sciences, Queen Mary, University of London | Bayesian Tests For Outliers In Uniform Samples
In 1979 Barnett derived a series of classical tests that were based on simulations
to test whether extreme observations in a sample were outliers. He did this
for a variety of different probability distributions, including the Normal,
Exponential, Uniform and Pareto distributions. In 1988 Pettit considered
this problem for Exponential samples by using a Bayesian approach based on
deriving Bayes Factors to perform these tests. Then in 1990 he studied the
multivariate Normal distribution in some detail and approached this problem
by deriving various results using the conditional predictive ordinate. Since
then this problem has been considered for both the Poisson and Binomial
distributions.
Recently I have been studying this problem for the Uniform and Pareto
distributions and my talk is based on the results that I have obtained for
the Uniform case. The talk will be in two parts. First, we look at the
one-sided Uniform distribution, where I have shown that the largest observation
in the sample minimises the conditional predictive ordinate and then derived
the Bayes Factor to test whether it is an outlier. I then derived the Bayes
Factors for the cases when we have multiple outliers generated by the same
probability distribution and generated by different probability distributions.
For the one-sided Uniform distribution all the results that I obtained are
exact, in that I did not have to approximate any integrals.
The second part of the talk looks at the two-sided Uniform distribution,
where the structure of the problem is exactly the same as for the one-sided
Uniform distribution except that it is a lot more complicated because
it is a two-parameter problem. I dealt with this by using a transformation
that made this a one-parameter problem and then used an analytical approach to
approximate the Bayes Factors by an infinite series, where a full derivation
for the approximation and proof that the series converges are given. Finally
in this section, I extend my ideas to solve the problem for the multivariate
Uniform distribution.

18/11/2010 4:30 PM | M203 | Piotr Zwiernik, Department of Statistics, University of Warwick | Cumulants and L-cumulants spaces
In this talk I will first introduce cumulants which form a
convenient language to describe and approximate probability
distributions. A rich combinatorial structure of cumulants
helps to understand them better. The combinatorial version
of the definition of cumulants gives also a direct
generalization to L-cumulants. Without going too much into
technical details I will try to show how L-cumulants can be
used in the analysis of certain statistical models.
Our example focuses on phylogenetic tree models which are
graphical models with hidden data. I will also mention some
links with free probability. 

11/11/2010 4:30 PM | M203 | Alfonso Miranda, Department of Quantitative Social Science, Institute of Education, University of London | Missing ordinal covariates with informative selection
This paper considers the problem of parameter estimation in a model for a
continuous response variable y when an important ordinal explanatory
variable x is missing for a large proportion of the sample. Non-missingness
of x, or sample selection, is correlated with the response variable
and/or with the unobserved values the ordinal explanatory variable takes
when missing. We suggest solving the endogenous selection, or `not missing
at random' (NMAR), problem by modelling the informative selection mechanism,
the ordinal explanatory variable, and the response variable together.
The use of the method is illustrated by reexamining the problem of the ethnic
gap in school achievement at age 16 in England using linked data from
the National Pupil database (NPD), the Longitudinal Study of Young People
in England (LSYPE), and the Census 2001. 
04/11/2010 4:30 PM | M203 | Michal Komorowski, Theoretical Systems Biology Group, Imperial College London | Inference, sensitivity and identifiability in stochastic chemical systems.
The aim of the presentation is to introduce a novel, integrated
theoretical framework for the analysis of stochastic biochemical
reaction models. The framework includes efficient methods for
statistical parameter estimation from experimental data, as well as
tools to study parameter identifiability, sensitivity and robustness.
The methods provide novel conclusions about functionality and
statistical properties of stochastic systems.
I will introduce a general model of chemical reactions described by
the Chemical Master Equation that I approximate using the linear noise
approximation. This allows us to write explicit expressions for the
likelihood of experimental data, which lead to an efficient inference
algorithm and a quick method for calculation of the Fisher Information
Matrices.
A number of experimental and theoretical examples will be presented to
show how the techniques can be used to extract information from the
noise structure inherent to experimental data. Examples include
inference of parameters of gene expression using fluorescent
reporter gene data, a Bayesian hierarchical model for estimation of
transcription rates and a study of the p53 system. Novel insights into
the causes and effects of stochasticity in biochemical systems are
obtained by the analysis of the Fisher Information Matrices.
References:
Komorowski, M., Finkenstädt, B., Rand, D. A. (2010); Using single
fluorescent reporter gene to infer half-life of extrinsic noise and
other parameters of gene expression, Biophysical Journal, Vol 98,
Issue 12, 2759-2769.
Komorowski, M., Finkenstädt, B., Harper, C. V., Rand, D. A.
(2009); Bayesian inference of biochemical kinetic parameters using the
linear noise approximation, BMC Bioinformatics, 2009, 10:343,
doi:10.1186/1471-2105-10-343.
B. Finkenstadt; E. A. Heron; M. Komorowski; K. Edwards; S. Tang; C. V.
Harper; J. R. E. Davis; M. R. H. White; A. J. Millar; D. A. Rand
(2008);
Reconstruction of transcriptional dynamics from gene reporter data
using differential equations, Bioinformatics 15 December 2008; 24:
2901-2907.
28/10/2010 5:30 PM | M203 | Guy Freeman, University of Warwick | Learning, prediction and causation with graphical models
Graphical models provide a very promising avenue for making sense
of large, complex datasets. In this talk I review strategies for
learning Bayesian networks, the most popular graphical models currently
in use, and introduce a new graphical model, the chain event graph,
which is an improvement on using the Bayes net in many cases but
which introduces its own challenges for learning, prediction and
causation. 
21/10/2010 5:30 PM | M203 | Alexander Vikhavsky, School of Engineering and Material Science, Queen Mary, University of London | Numerical analysis of the global identifiability of electrochemical systems
We discuss a numerical analysis of the parametric identifiability of electrochemical
systems. Firstly, we analyze global identifiability of the entire set of parameters
in a single ac voltammetry experiment and examine the effect of different waveforms
(square, sawtooth) on the accuracy of the identification procedure.
The analysis of global identifiability is equivalent to finding a global optimum of
a specially designed function. The optimization problem is solved by a random search
method and a statistical analysis of the obtained solution allows for selection of
a subset of the parameters (or their linear combinations), which can be identified.
Finally, we discuss optimization of the waveform for better identifiability. 
14/10/2010 5:30 PM | M203 | Ramon Rizvi, School of Mathematical Sciences, Queen Mary, University of London | Beyond cluster analysis
Cluster analysis is a well established statistical technique which aims
to detect groups in data. Its main use is as an exploratory tool rather
than a conclusive technique.
Recently there has been growing interest in expansions of this technique
under the general umbrella name "Persistence of homology". This new topic
is at the crossroads between statistics and topology, and the main aim is
to describe other features than groups present in multivariate data and
thus it is a natural extension of clustering.
Betti numbers are used to describe data, and applying the first Betti number
coincides with cluster analysis, whereas subsequent Betti numbers enable
detection of "holes" or loops in data. For example, cluster analysis is
unable to detect whether data is gathered around a circle, but with persistent
homology this feature is immediately detected.
The Seminar aims to survey both techniques and illustrate with some
examples.
This Seminar is the result of EPSRC Vacation Bursary Scheme 2010, won
by QMUL undergraduate Ramon Rizvi. 
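The contrast drawn in this abstract between counting clusters and detecting loops can be sketched with a toy computation. The example below is a simplification and not the method of the seminar: it joins points closer than a chosen scale `eps` into a graph and reports the number of connected components (usually written b0) and the number of independent cycles of that graph (b1). A full persistent homology computation would also fill in higher-dimensional simplices and track features across all scales. The function name `graph_betti_numbers` and the sample data are hypothetical.

```python
from itertools import combinations
from math import cos, dist, pi, sin


def graph_betti_numbers(points, eps):
    """Return (b0, b1) of the graph joining points at distance < eps."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = 0
    for i, j in combinations(range(n), 2):
        if dist(points[i], points[j]) < eps:
            edges += 1
            parent[find(i)] = find(j)

    b0 = len({find(i) for i in range(n)})  # connected components (clusters)
    b1 = edges - n + b0                    # independent cycles of the graph
    return b0, b1


# Twelve equally spaced points on the unit circle
circle = [(cos(2 * pi * k / 12), sin(2 * pi * k / 12)) for k in range(12)]
# Two well-separated pairs of points
two_groups = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.0)]

print(graph_betti_numbers(circle, 0.6))      # (1, 1): one component, one loop
print(graph_betti_numbers(two_groups, 0.5))  # (2, 0): two clusters, no loop
```

At scale 0.6 only neighbouring points on the circle are joined, so the graph is a single ring; a clustering method would report one group in both the ring and a filled disc, whereas the cycle count distinguishes them.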
07/10/2010 5:30 PM | M203 | Rosemary A. Bailey, School of Mathematical Sciences, Queen Mary, University of London | The randomization model for two-phase experiments
For a single-phase experiment, we allocate treatments to
experimental units using a systematic plan, and then randomize by
permuting the experimental units by a permutation chosen at random from
a suitable group. This leads to the theory developed in J. A.
Nelder's 1965 Royal Society papers. Recently, C. J. Brien and I have
been extending this theory to experiments such as two-phase
experiments, where the produce, or outputs, from the first phase are
randomized to a new set of experimental units in the second phase.
This brings in new difficulties, especially with standard software. 
27/05/2010 5:30 PM | M203 | Dan Stowell, Centre for Digital Music, Queen Mary, University of London | Rating and ranking in a standardised audio listening test
We describe a standardised audio listening test known as MUSHRA, used to
evaluate the perceptual quality of intermediate-quality audio algorithms
(for example MP3 compression). The nature of the test involves aspects
of continuous rating as well as ranking of items. We discuss the statistics
used to analyse test data, in light of recent experiences conducting a
user group study. 
29/04/2010 5:30 PM | M203 | Silvia Liverani, Department of Statistics, Bristol University | Bayesian Clustering of Curves and Microarray Data
An increasing number of microarray experiments produce time
series of expression levels for many genes. Some recent clustering
algorithms respect the time ordering of the data and are, importantly,
extremely fast. The aim is to cluster and classify the expression
profiles in order to identify genes potentially involved in, and
regulated by, the circadian clock. In this presentation we report new
developments associated with this methodology. The partition space is
intelligently searched placing most effort in refining the partition
where genes are likely to be of most scientific interest. 
11/03/2010 4:30 PM | M203 | Muna Arephin, Cancer Research UK Centre for Epidemiology, Mathematics and Statistics, Wolfson Institute of Population Health | Order restricted inference for multi-arm trials
There is an increasing demand to test more than one new treatment in the hope of
finding at least one that is better than the control group in clinical trials. A likelihood
ratio test is developed using order restricted inference, a family of tests is defined and
it is shown that the LRT and Dunnett-type tests are members of this family. Tests are
compared, using power and a simple loss function which takes incorrect selection, and
its impact, into account. The optimal allocation of patients to treatments was sought
to maximize power and minimize expected loss.
For small samples, the LRT statistic for binary data based on order restricted inference
is derived and used to develop a conditional exact test. Two-stage adaptive designs for
comparing two experimental arms with a control are developed, in which the trial is
stopped early if the difference between the best treatment and the control is less than
C1; otherwise, it continues, with one arm if one experimental treatment is better than
the other by at least C2, or with both arms otherwise. Values of the constants C1 and
C2 are compared and the adaptive design is found to be more powerful than the fixed
design. 
04/03/2010 4:30 PM | M203 | Kei Kobayashi, Department of Mathematical Analysis and Statistical Inference, The Institute of Statistical Mathematics | Bayesian shrinkage prediction and its application to regression problems
In this talk, we consider Bayesian shrinkage predictions for the Normal regression problem under the frequentist Kullback-Leibler risk function. This result is an extension of Komaki (2001, Biometrika) and George (2006, Annals. Stat.).
Firstly, we consider the multivariate Normal model with an unknown mean and a known covariance. The covariance matrix can be changed after the first sampling. We assume rotation invariant priors of the covariance matrix and the future covariance matrix and show that the shrinkage predictive density with the rescaled rotation invariant superharmonic priors is minimax under the Kullback-Leibler risk. Moreover, if the prior is not constant, Bayesian predictive density based on the prior dominates the one with the uniform prior.
In this case, the rescaled priors are independent of the covariance matrix of future samples. Therefore, we can calculate the posterior distribution and the mean of the predictive distribution (i.e. the posterior mean and the Bayesian estimate for quadratic loss) based on some of the rescaled Stein priors without knowledge of future covariance. Since the predictive density with the uniform prior is minimax, the one with each rescaled Stein prior is also minimax.
Next we consider Bayesian predictions whose prior can depend on the future covariance. In this case, we prove that the Bayesian prediction based on a rescaled superharmonic prior dominates the one with the uniform prior without assuming the rotation invariance.
Applying these results to the prediction of response variables in the Normal regression model, we show that there exists the prior distribution such that the corresponding Bayesian predictive density dominates that based on the uniform prior. Since the prior distribution depends on the future explanatory variables, both the posterior distribution and the mean of the predictive distribution may depend on the future explanatory variables.
The Stein effect has robustness in the sense that it depends on the loss function rather than the true distribution of the observations. Our result shows that the Stein effect has robustness with respect to the covariance of the true distribution of the future observations.
25/02/2010 4:30 PM | 203 | Anthony C. Atkinson, Department of Statistics, London School of Economics | Optimum Experimental Designs for Enzyme Kinetic Models
Enzymes are biological catalysts that act on substrates. The speed of reaction as a function of substrate concentration typically follows the nonlinear Michaelis-Menten model. The reactions can be modified by the presence of inhibitors, which can act by several different mechanisms, leading to a variety of models, all also nonlinear.
The paper describes the models and derives optimum experimental designs for model building. These include D-optimum designs for all the parameters and Ds-optimum designs for subsets of parameters. The Ds-optimum designs may be singular and so do not provide estimates of all parameters; designs are suggested which have both good D- and Ds-efficiencies. Also derived are designs for testing the equality of parameters.
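To make the idea of a D-optimum design concrete, here is a minimal sketch for the basic two-parameter Michaelis-Menten model v = Vmax * x / (K + x), without inhibitors. It assumes a two-point, equally weighted design with one point fixed at the largest allowed concentration `x_max`, and a prior guess for `K` (the design is only locally optimum). The function names and the grid search are choices made for this illustration and are not taken from the paper; the closed form K * x_max / (2K + x_max) used for comparison is the textbook result for this simple case.

```python
def d_criterion(x1, x2, K):
    """Determinant of the information matrix, up to a constant factor,
    for an equally weighted two-point design {x1, x2}."""

    def sensitivities(x):
        # Derivatives of Vmax*x/(K+x) w.r.t. Vmax and K, with Vmax factored out
        return x / (K + x), x / (K + x) ** 2

    a1, b1 = sensitivities(x1)
    a2, b2 = sensitivities(x2)
    return (a1 * b2 - a2 * b1) ** 2


def best_lower_point(K, x_max, n_grid=2000):
    """Grid search for the second design point when one point sits at x_max."""
    grid = [x_max * i / n_grid for i in range(1, n_grid)]
    return max(grid, key=lambda x: d_criterion(x, x_max, K))


K, x_max = 1.0, 2.0
numeric = best_lower_point(K, x_max)
closed_form = K * x_max / (2 * K + x_max)
print(numeric, closed_form)
```

With K = 1 and x_max = 2 the closed form gives 0.5, and the grid search should land on or next to that value. Designs for the inhibition models discussed in the abstract involve more parameters and are not covered by this sketch.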
11/02/2010 4:30 PM | M203 | Janet Godolphin, Department of Mathematics, University of Surrey | Estimability and Connectivity in m-way Designs
The classical problem of ascertaining the connectivity status of an
m-way design has received much attention, particularly in the cases
where m=2 and m=3. In the general case, a new approach yields the
connectivity status for the overall design and for each of the individual
factors directly from the kernel space of the design matrix. Furthermore,
the set of estimable parametric functions in each factor is derived from
a segregated component of this kernel space.
The kernel space approach enables a simple derivation of some classical
results. Examples are given to illustrate the main results. 
04/02/2010 4:30 PM | M203 | Henry P. Wynn, Department of Statistics, London School of Economics | Information-based learning, with thoughts on optimal experimental design.
The information approach to optimal experimental design
is widened to include information-based learning more
generally, drawing on the classical work of Renyi, Lindley,
de Groot and others. Learning is considered as occurring
when the posterior distribution of the quantity of interest
is more peaked than the prior, in a certain sense.
A key theorem states when this is expected to occur. Some
special examples are considered which show the boundary
between when learning occurs and when it does not. 
28/01/2010 4:30 PM | M203 | Mahbub Latif, School of Mathematical Sciences, Queen Mary | Design and analysis of transform-both-sides nonlinear models
Transformation on both sides of a nonlinear regression model has been used in practice to achieve, for example, linearity in the parameters of the model, approximately normally distributed errors, and constant error variance. The method of maximum likelihood is the most common method for estimating the parameters of the nonlinear model and the transformation parameter. In this talk we will discuss a new method, which we call the Anova method, for estimating all the parameters of the transform-both-sides nonlinear model. The Anova method is computationally simpler than the maximum likelihood approach and allows a more natural separation of different sources of lack-of-fit.
Considering the Michaelis-Menten model as an example, we will show the results of a simulation study for comparing maximum likelihood and Anova methods, where the Box-Cox transformation is used for transforming both sides of the Michaelis-Menten model. We will also show the use of the Anova method in fitting more complex transform-both-sides nonlinear models, such as transform-both-sides nonlinear mixed effects models and transform-both-sides nonlinear models with random block effects. At the end of the talk, we will briefly present a new approach to design for the transform-both-sides nonlinear Michaelis-Menten model.
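For readers unfamiliar with the setup, the transform-both-sides model with a Box-Cox transformation applied to the Michaelis-Menten mean function can be written as below. The notation (V_max, K_m, lambda, epsilon_i) is the usual textbook one and is assumed here; the abstract does not fix any symbols.

```latex
\begin{align*}
f(x;\theta) &= \frac{V_{\max}\,x}{K_m + x}, \qquad \theta = (V_{\max}, K_m), \\
h(y;\lambda) &=
  \begin{cases}
    \dfrac{y^{\lambda}-1}{\lambda}, & \lambda \neq 0, \\[6pt]
    \log y, & \lambda = 0,
  \end{cases} \\
h(y_i;\lambda) &= h\bigl(f(x_i;\theta);\lambda\bigr) + \varepsilon_i .
\end{align*}
```

The same transformation h is applied to the response and to the mean function, so the parameters in theta keep their original interpretation while lambda is chosen to bring the errors closer to normality with constant variance.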
21/01/2010 5:00 PM | M203 | Maria Roopa, Queen Mary, Queen Mary graduate students seminar | Bayesian decision procedures for dose escalation - a reanalysis
Zhou et al. (2006) developed Bayesian dose-escalation procedures for
early phase I clinical trials in oncology. They are based on discrete
measures of undesirable events and continuous measures of therapeutic
benefit. The objective is to find the optimal dose associated with some
low probability of an adverse event.
To understand their methodology I tried to reproduce their results
using a hierarchical linear model (Lindley and Smith (1972)) with different
orderings of the data. Computations were done in R. I found my results
were consistent with one another but different to the published results.
I then also programmed the model using "WinBUGS" and again found the
results to be consistent with mine. I concluded that the published results
were in error.
My main interests are in Bayesian approaches for the design and analysis
of dose escalation trials, which involves prior information concerning
parameters of the relationships between dose and the risk of an adverse
event, with the chance to update after every dosing period using Bayes'
theorem. In this talk I will discuss some of these issues and shall also
report on my current work.
05/11/2009 4:30 PM | M203 | A. Giovagnoli, Università di Bologna | Randomized group up-and-down (U&D) experiments
Dating back to Dixon and Mood (1948), an Up-and-Down procedure is a sequential experiment used in binary response trials for identifying the stress level (treatment) corresponding to a prespecified probability of positive response. In Phase I clinical trials U&D rules can be seen as a development of the traditional dose-escalation procedure (Storer, 1998). Recently Baldi Antognini et al. (2008) have proposed a group version of U&D procedures whereby at each stage a group of m units is treated at the same level and the number of observed positive responses determines how to randomize the level assignment of the next group. This design generalizes a vast class of U&Ds previously considered (Derman, 1957; Durham and Flournoy 1994; Giovagnoli and Pintacuda, 1998; Gezmu and Flournoy, 2006). The properties of the design change as the randomization method varies: appropriate randomization schemes guarantee desirable results in terms of the asymptotic behaviour of the experiment (see also Bortot and Giovagnoli, 2005). Results can be extended to continuous responses (Ivanova and Kim, 2009).
Other approaches for identifying a target dose, alternative to the nonparametric U&D, are the parametric Continual Reassessment Method introduced by O'Quigley et al. (1990), and several recent modifications thereof. The debate on dose escalation procedures in the recent statistical literature continues to be very lively.
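The simplest member of the U&D family can be simulated in a few lines. The sketch below implements only the classical one-unit-at-a-time rule (move one level down after a positive response, one level up after a negative one), which concentrates allocations around the level with response probability 0.5. It does not implement the randomized group version described in the abstract. The function name `up_and_down` and the response probabilities are hypothetical.

```python
import random


def up_and_down(prob_response, start, n_trials, rng):
    """Simulate the classical up-and-down rule on a grid of levels.

    prob_response[k] is the probability of a positive response at level k.
    Returns the list of levels visited, one per treated unit.
    """
    top = len(prob_response) - 1
    level = start
    visited = []
    for _ in range(n_trials):
        visited.append(level)
        positive = rng.random() < prob_response[level]
        step = -1 if positive else 1             # down after a positive response
        level = min(max(level + step, 0), top)   # stay inside the design range
    return visited


# Hypothetical dose-response curve: level 2 has response probability 0.5
probs = [0.1, 0.3, 0.5, 0.7, 0.9]
path = up_and_down(probs, start=0, n_trials=200, rng=random.Random(2009))
counts = [path.count(k) for k in range(len(probs))]
print(counts)  # number of units treated at each level
```

Targeting a response probability other than 0.5 requires randomizing the up or down move, which is the kind of modification the abstract discusses.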

15/10/2009 5:30 PM | M203 | W. Bergsma, London School of Economics | Marginal models for dependent, clustered and longitudinal categorical data
In the social, behavioral, educational, economic, and biomedical sciences, data are often collected in ways that introduce dependencies in the observations to be compared. For example, the same respondents are interviewed on several occasions, several members of networks or groups are interviewed within the same survey, or, within families, both children and parents are investigated. Statistical methods that take the dependencies in the data into account must then be used, e.g., when observations at time one and time two are compared in longitudinal studies. At present, researchers almost automatically turn to multilevel models or to GEE estimation to deal with these dependencies. Despite the enormous potential and applicability of these recent developments, they require restrictive assumptions on the nature of the dependencies in the data.
The marginal models of this talk provide another way of dealing with these dependencies, without the need for such assumptions, and can be used to answer research questions directly at the intended marginal level. The maximum likelihood method, with its attractive statistical properties, is used for fitting the models. This talk is based on a recent book by the authors in the Springer series Statistics for the Social Sciences, see www.cmm.st.

20/03/2024 2:00 PM | MB503 | Prof. Ioanna Manolopoulou (UCL) | Combining observational data with non-representative randomised data in heterogeneous treatment effect modelling
Building statistical models using non-randomly sampled data is a well-known challenge in statistics, and is especially challenging when any part of the statistical model is not fully identifiable. In causal inference, and in particular in the estimation of heterogeneous treatment effects, this arises when observational data are used which may be affected by unobserved confounding. One approach to correct for such confounding is to combine observational data with randomised experiments. However, when these randomised experiments are not representative of the whole population, the effect of deconfounding will be poor for subsets of the population that fall outside the range of these experiments. Depending on the structure of the model and the nature of the prior distributions used within a Bayesian model, this will be addressed by borrowing information from other parts of the space. In this work, we highlight the importance of building models that can account for uncertainty due to unobserved confounding in regions where no deconfounding is possible. To this end, we embed a combination of randomised and observational data into Bayesian Causal Forests (BCF), and make use of adaptive modular inference to harness as much reliable information from the observational data as possible, without leading to overconfidence in regions of poor identifiability. We implement our methods on a set of simulated and real data examples.

13/03/2024 2:00 PM | MB503 | Dr Nicolo Colombo (Royal Holloway) | On training locally-adaptive Conformal Prediction
Conformal Prediction (CP) is a distribution-free and non-asymptotic uncertainty estimation method, i.e. it does not rely on assumptions on the underlying data distribution and provides finite-sample guarantees. Given any pre-trained prediction algorithm and a test sample, a CP algorithm produces a Prediction Set (PS), i.e. a subset of the label space, that is guaranteed to contain the test label with lower-bounded marginal probability. We address the problem of making the PS locally adaptive. The proposed new strategy produces PS that are marginally valid but have input-dependent sizes. The localization process is cast into a smooth minimization problem and can be solved through standard gradient methods.
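As background for this abstract, the basic (non-adaptive) split conformal construction for regression can be sketched as follows. This is the standard baseline, in which every prediction set has the same width, and not the locally adaptive strategy proposed in the talk. The names `conformal_quantile` and `split_conformal_interval`, the absolute-residual score and the toy predictor are assumptions made for the example.

```python
from math import ceil, inf


def conformal_quantile(scores, alpha):
    """Finite-sample corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = ceil((n + 1) * (1 - alpha))  # rank with the (n + 1) correction
    if k > n:
        return inf                   # too few calibration points for this alpha
    return sorted(scores)[k - 1]


def split_conformal_interval(predict, calib_x, calib_y, x_new, alpha=0.1):
    """Prediction interval around predict(x_new) using held-out calibration data.

    `predict` is any pretrained point predictor; the score is the absolute
    residual, so all intervals share the same half-width q.
    """
    scores = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    q = conformal_quantile(scores, alpha)
    centre = predict(x_new)
    return centre - q, centre + q


# Toy usage with a hypothetical pretrained predictor f(x) = 2x
interval = split_conformal_interval(
    lambda x: 2 * x, [0, 1, 2], [0.0, 3.0, 2.0], x_new=5, alpha=0.5
)
print(interval)  # (9.0, 11.0)
```

Because the half-width q does not depend on the input, the sets are marginally valid but not locally adaptive; making their size depend on the input is the problem the abstract addresses.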

28/02/2024 2:00 PM | MB503 | Dr Sandipan Roy (University of Bath) | Multi-Response Linear Regression Estimation Based on Low-Rank Presmoothing
Presmoothing is a technique aimed at increasing the signal-to-noise ratio in data to improve subsequent estimation and model selection in regression problems. However, presmoothing has thus far been limited to the univariate response regression setting. Motivated by the widespread interest in multi-response regression analysis in many scientific applications, this article proposes a technique for data presmoothing in this setting based on low rank approximation. We establish theoretical results on the performance of the proposed methodology, and quantify its benefit empirically in a number of simulated experiments. We also demonstrate our proposed low rank presmoothing technique on real data arising from the environmental sciences.

21/02/2024 2:00 PM | MB503 | Dr. Zeljko Kereta (UCL) | On improving unsupervised approaches for medical image reconstruction
Deep learning-based image reconstruction approaches have demonstrated considerable success in many imaging modalities. However, their reliance on abundant high-quality paired training data remains a significant hurdle in many problem domains where such datasets are not available, for example in medical imaging. Moreover, deep learning approaches in data-scarce scenarios often fail to generalise and are prone to reconstruction artefacts in case of distributional shifts. In this talk we present an unsupervised/self-supervised deep learning approach aimed to address these challenges through a two-stage methodology. In the first stage the network is pre-trained on simulated training data of ground truth images and measurements. In the second stage the parameters are fine-tuned on the target image, adapting the model to the shift in distribution. Experimental results showcase the effectiveness of our approach, revealing accelerated deployment, improved stability, and competitive performance despite limited training data.

14/02/2024 2:00 PM | MB503 | Prof. Emmanuil Georgoulis (Heriot-Watt) | hp-Version Discontinuous Galerkin Methods on Essentially Arbitrarily-Shaped Elements
I will present a recent generalisation of the popular interior-penalty discontinuous Galerkin (dG) method discretizing general classes of linear and nonlinear advection-diffusion-reaction problems to meshes comprising extremely general, essentially arbitrarily-shaped element shapes. In particular, our analysis allows for curved element shapes, without the use of nonlinear elemental maps. The feasibility of the method relies on the definition of a suitable choice of the discontinuity-penalisation, which turns out to be explicitly dependent on the particular element shape, but essentially independent of small shape variations. A priori error bounds for the resulting method will be given, under very mild structural assumptions restricting the magnitude of the local curvature of element boundaries. I also plan to discuss briefly computer implementation aspects of the framework. Numerical experiments will also be presented throughout the talk, aiming to motivate and showcase the practicality and the potential advantages of the proposed numerical framework.

31/01/2024 2:00 PM | MB503 | Dr Alex Shestopaloff (QMUL) | Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models
In multivariate time series systems, key insights can be obtained by discovering lead-lag relationships inherent in the data, which refer to the dependence between two time series shifted in time relative to one another, and which can be leveraged for the purposes of control, forecasting or clustering. We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models. Within our framework, the envisioned pipeline takes as input a set of time series, and creates an enlarged universe of extracted subsequence time series from each input time series, via a sliding window approach. This is then followed by an application of various clustering techniques (such as k-means++ and spectral clustering), employing a variety of pairwise similarity measures, including nonlinear ones. Once the clusters have been extracted, lead-lag estimates across clusters are robustly aggregated to enhance the identification of the consistent relationships in the original universe. We establish connections to the multi-reference alignment problem for both the homogeneous and heterogeneous settings. Since multivariate time series are ubiquitous in a wide range of domains, we demonstrate that our method is not only able to robustly detect lead-lag relationships in financial markets, but can also yield insightful results when applied to an environmental data set.
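To illustrate what a lead-lag relationship between two series means, the sketch below estimates the shift that maximises the sample cross-correlation. This is a plain pairwise baseline chosen for the example, not the clustering-driven pipeline described in the abstract; the function names, the Pearson similarity and the synthetic data are assumptions.

```python
import random
from math import sqrt


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def estimate_lag(x, y, max_lag):
    """Return the shift s in [-max_lag, max_lag] maximising corr(x[t], y[t + s]).

    A positive s means that y follows x with a delay of s steps.
    """
    n = len(x)
    best_lag, best_corr = 0, float("-inf")
    for s in range(-max_lag, max_lag + 1):
        if s >= 0:
            a, b = x[: n - s], y[s:]
        else:
            a, b = x[-s:], y[: n + s]
        c = pearson(a, b)
        if c > best_corr:
            best_lag, best_corr = s, c
    return best_lag


# Synthetic example: y is a copy of x delayed by three steps
rng = random.Random(0)
base = [rng.gauss(0.0, 1.0) for _ in range(203)]
x, y = base[3:], base[:200]
print(estimate_lag(x, y, max_lag=10))
```

With y built as a delayed copy of x, the function should return the planted shift of 3. Real data are noisy and may contain several factors with different lags, which is where the robust aggregation across clusters described in the abstract comes in.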

30/11/2023 2:00 PM | MB503 | Dr Poulami Ganguly (SMS, QMUL) | Grid-free algorithms for tomographic imaging
The inverse problem of tomographic imaging is the reconstruction of a 3D sample from 2D projection images. An estimate for the 3D reconstruction of a sample is usually obtained by discretizing the reconstruction volume using a voxel grid. This discretization may not be ideal in scenarios where additional prior knowledge is available. In this talk, we look at two applications where grid-free alternatives are advantageous: first, we look at the problem of reconstructing a nanocrystal at atomic resolution from electron microscopy images taken at a few tilt angles. We propose a grid-free algorithm that allows for continuous deviations of the atom locations. We show that this allows for a meaningful incorporation of additional prior knowledge about the system, in particular the potential energy of the configuration, and is able to resolve lattice defects in simulated data. In addition, we show how augmenting such an approach with a model for deformation allows us to propose a grid-free algorithm for tilt-series alignment in cryo-electron tomography. We compare this second approach with existing approaches for tilt-series alignment and show that we can reliably estimate marker locations and deformations without labelling markers in projection data.

07/12/2023 2:00 PM | MB503 | Dr José A. Iglesias (University of Twente) | On extremal points for some convex regularizers
Due in part to a wider acceptance of advanced convex optimization methods, nonsmooth regularization terms are now a mainstay of variational approaches in inverse problems, optimal control, and beyond. A majority of those used in practice are positively one-homogeneous, which means that they can be seen as the Minkowski or gauge functional of an infinite-dimensional convex set, the generalized unit ball associated to the regularizer.
Under compactness assumptions which are in any case required for the regularization method to be well-posed, these balls can be described as the convex hull of their extremal points. Making such a description explicit has a multitude of applications mostly revolving around sparsity, which is usually the motivation for introducing such regularization functionals in the first place. These include results showing existence of solutions that can be expressed using finitely many of these extremal points, and optimization algorithms based on such iterates, which often admit fast convergence guarantees and grid-free implementations.
In this talk we will consider this description of extremal points in some specific cases. We provide a full characterization for two infimal convolution-type functionals, the total generalized variation in one dimension and Kantorovich-Rubinstein norms in spaces of signed measures in Euclidean space, as well as some results on different variants of the total (gradient) variation.
Based on joint works with Daniel Walter, Marcello Carioni, Giacomo Cristinelli and Kristian Bredies.

16/11/2023 2:00 PM | MB503 | Dr Natalia Efremova (SBM, QMUL) | Leveraging Machine Learning and Earth Observation Data for Sustainable Agriculture
Agriculture is both one of the sectors most susceptible to climate change and a significant contributor to it. Therefore, it is essential to consider both mitigation and adaptation strategies, as well as transforming agricultural practices to promote sustainability and resilience in the agricultural sector. A key objective of applying artificial intelligence (AI) and satellite imagery in agricultural settings is to develop more reliable and scalable methods for monitoring global crop conditions promptly and transparently, while also exploring how we can adapt agriculture to mitigate the effects of climate change. Agricultural monitoring with earth observation data provides a timely and reliable way to assess the state of the field or farm and the surrounding territories, and is used for gathering data and producing forecasts. Computer vision and signal processing techniques play a crucial role in extracting meaningful information from raw satellite data. Growing adoption of AI and machine learning (ML) tools has significantly influenced the expansion of Earth Observation (EO) and remote sensing to agricultural management. In this talk, I will discuss advanced techniques employed throughout the entire data processing cycle, encompassing tasks such as data compression, transmission, image recognition, and forecasting environmental factors like land cover, land use, biomass, organic soil carbon, soil nutrients and more.

09/11/2023 2:00 PM | MB503 | Dr Swati Chandna (Birkbeck, University of London) | Nonparametric modeling and estimation for network data
Network data are commonly observed in a wide variety of applications. Such data may arise in the form of a single network observed at a given point in time, or as multiple networks on the same set of nodes, for example, social networks on the same set of individuals over time, or from different social platforms at a given point in time. A nonparametric approach to studying structure in unlabeled networks is offered by the graphon function. There has been a growing interest in the problem of graphon estimation as well as its application to important problems such as bootstrapping networks, estimation of missing links, etc. In this talk, I will present results on graphon estimation from a single network observed with node covariates and a natural extension of the graphon model to the bivariate setting where a pair of possibly correlated networks on the same set of nodes are observed.

23/11/2023 2:00 PMMB503Dr Angelica Aviles-Rivero (University of Cambridge)Functionals, Neural Nets, and Beyond: On Multi-Modal Graph Learning and Implicit Neural Representations
In this talk, we delve into two pivotal subjects. The first topic revolves around the development of hybrid graph models tailored to the complexities of multi-modal data. We present a novel semi-supervised hypergraph learning framework, specifically designed for diagnostic purposes. Our approach adopts a hybrid perspective, where we introduce a new methodology centered on a dual embedding strategy and a semi-explicit flow. To illustrate the efficacy of our proposed model, we employ it within the realm of Alzheimer's disease diagnosis, demonstrating its capacity to uncover latent relationships within intricate multi-modal data.
Transitioning seamlessly to the second subject, we delve into implicit neural representations. We introduce an innovative function designed to harness the strengths of Strong Spatial and Frequency attributes, marking a departure from conventional methods. Remarkably, our novel technique showcases exceptional enhancements in performance across a diverse array of downstream tasks, notably encompassing CT reconstruction and denoising applications. Through rigorous experimentation, we elucidate the advantages enabled by our approach.

26/10/2023 2:00 PMMB503Dr Martin Benning (QMUL)A lifted Bregman formulation for the inversion of deep neural networks
We propose a novel framework for the regularized inversion of deep neural networks. The framework is based on recent work on training feed-forward neural networks without the differentiation of activation functions. The framework lifts the parameter space into a higher dimensional space by introducing auxiliary variables, and penalizes these variables with tailored Bregman distances. We propose a family of variational regularizations based on these Bregman distances, present theoretical results and support their practical application with numerical examples. In particular, we present the first convergence result (to the best of our knowledge) for the regularized inversion of a single-layer perceptron that only assumes that the solution of the inverse problem is in the range of the regularization operator, and that shows that the regularized inverse provably converges to the true inverse if measurement errors converge to zero. This is joint work with Xiaoyu Wang from Heriot-Watt University.

12/10/2023 2:00 PMMB503Dr Yury Korolev (University of Bath)Vector-valued Barron spaces
Approximation properties of infinitely wide neural networks have been studied by several authors in the last few years. New function spaces have been introduced that consist of functions that can be efficiently (i.e., with dimension-independent rates) approximated by neural networks of finite width, e.g. Barron spaces for networks with a single hidden layer. Typically, these functions act between Euclidean spaces, with a high-dimensional input space and a lower-dimensional output space. As neural networks gain popularity in inherently infinite-dimensional settings such as inverse problems and imaging, it becomes necessary to analyse the properties of neural networks as nonlinear operators acting between infinite-dimensional spaces. In this talk, I will discuss a generalisation of Barron spaces to functions that map between Banach spaces and present Monte-Carlo (1/sqrt(n)) approximation rates.

05/10/2023 2:00 PMMB503Dr Kolyan Ray (Imperial College London)A variational Bayes approach to debiased inference in high-dimensional linear regression
We consider statistical inference for a single coordinate of a high-dimensional parameter in sparse linear regression. It is well-known that high-dimensional procedures such as the LASSO can provide biased estimators for this problem and thus require debiasing. Motivated by recent theoretical advances on debiased Bayesian inference, we propose a scalable variational Bayes approach to this problem. We investigate the numerical performance of this algorithm and establish accompanying theoretical guarantees for estimation and uncertainty quantification. Joint work with Ismael Castillo, Alice L’Huillier and Luke Travis.

28/09/2023 2:00 PMMB503Dr Adam Sykulski (Imperial College London)Spatiotemporal Statistical Modelling of Ocean Data
This talk will study spatiotemporal data collected from “drifters”, which are instruments designed to freely float around our ocean, mimicking particles of water. While the focus of the talk is on oceanography, this form of data is ubiquitous, for example the spatiotemporal data collected from wearable devices (e.g. medical wristwatches), so much of the methodology presented can translate to other applications. The focus is on data-driven statistical solutions, the presentation will not be too technical, and no prerequisite knowledge of oceanography is expected from the audience!

MB503Pierre Miasnikof (University of Toronto)Two statistical techniques for graph structure assessment (complex networks)
I will present two statistical techniques that were specifically designed to address problems in network analysis. The first is a statistical algorithm to determine if a network meets the prerequisite conditions to be meaningfully summarized through clusters. Clustering algorithms will always identify clusters. Unfortunately, if a network does not possess a clustered structure, the (node) clustering exercise will not only be a waste of time, it will inevitably result in misleading conclusions. The second technique is a statistical routine that seeks to answer the question "is network G1 similar to network G2?". To answer this question, we transform the graph into a probability distribution and use a standard Kolmogorov-Smirnov test.
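As a rough illustration of the second technique, the sketch below compares two graphs through a two-sample Kolmogorov-Smirnov statistic. The abstract does not say which probability distribution each graph is mapped to; using the degree distribution here is purely an assumption made for the example, and the helper names and toy graphs are hypothetical. Only the test statistic is computed, not a p-value.

```python
from bisect import bisect_right


def degree_sequence(adjacency):
    """Sorted degree sequence of an undirected graph given as a dict
    mapping each node to the set of its neighbours.
    Assumption: the graph is summarised by its degree distribution."""
    return sorted(len(neighbours) for neighbours in adjacency.values())


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute gap
    between the two empirical cumulative distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to evaluate the gap at every value seen in either sample.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical toy graphs: a 4-cycle (G1) and a 4-node star (G2).
g1 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
g2 = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}

d_stat = ks_statistic(degree_sequence(g1), degree_sequence(g2))
print(d_stat)
```

A larger statistic indicates that the two summary distributions differ more; turning it into a formal test would additionally require the Kolmogorov-Smirnov null distribution, which is omitted here.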

12/04/2023 11:00 AMMB503Jim Griffin (UCL)Bayesian vector autoregressions with tensor decompositions
Vector autoregressions (VARs) are popular in analyzing economic time series. However, VARs can be over-parameterized if the numbers of variables and lags are moderately large. Tensor VAR, a recent solution to over-parameterization, treats the coefficient matrix as a third-order tensor and estimates the corresponding tensor decomposition to achieve parsimony. In this paper, the inference of Tensor VARs is inspired by the literature on factor models. Firstly, we determine the rank by imposing the Multiplicative Gamma Prior to margins, i.e. elements in the decomposition, and accelerate the computation with an adaptive inferential scheme. Secondly, to obtain interpretable margins, we propose an interweaving algorithm to improve the mixing of margins and introduce a post-processing procedure to solve column permutations and sign-switching issues. In the application of the US macroeconomic data, our models outperform standard VARs in point and density forecasting and yield interpretable results consistent with the US economic history.

05/04/2023 11:00 AMMB503Eftychia Solea (QMUL)High-dimensional Nonparametric Functional Graphical Models via the Functional Additive Partial Correlation Operator
This article develops a novel approach for estimating a high-dimensional and nonparametric graphical model for functional data. Our approach is built on a new linear operator, the functional additive partial correlation operator, which extends the partial correlation matrix to both the nonparametric and functional settings. We show that its nonzero elements can be used to characterize the graph, and we employ sparse regression techniques for graph estimation. Moreover, the method does not rely on any distributional assumptions and does not require the computation of multi-dimensional kernels, thus avoiding the curse of dimensionality. We establish both estimation consistency and graph selection consistency of the proposed estimator, while allowing the number of nodes to grow with the increasing sample size. Through simulation studies, we demonstrate that our method performs better than existing methods in cases where the Gaussian or Gaussian copula assumption does not hold. We also demonstrate the performance of the proposed method by a study of an electroencephalography data set to construct a brain network.

22/03/2023 11:00 AMMB503Eoghan O’Neill (Erasmus University Rotterdam)Type 1 and Type 2 Tobit Bayesian Additive Regression Tree Models
This paper introduces Type I and Type II Tobit Bayesian Additive Regression Trees (TOBART-1 and TOBART-2). Simulation results and applications to real data sets demonstrate that TOBART-1 produces more accurate predictions than competing methods, and provides posterior intervals for the conditional expectation and other quantities of interest.
TOBART-2 extends the Type II Tobit model to account for nonlinearities and model uncertainty by including sums of trees in both the selection and outcome equations. A Dirichlet Process Mixture distribution for the error term allows for departure from the assumption of bivariate normally distributed errors. Simulation studies suggest that TOBART-2 can produce more accurate treatment effect estimates than competing approaches. We illustrate the method with an application to the RAND Health Insurance Experiment.

22/02/2023 11:00 AMMB503Javier Rubio Alvarez (UCL)Flexible Excess Hazard Modelling with Applications in Cancer Epidemiology
Excess hazard modelling is one of the main tools in population-based cancer survival research. This setting allows for direct modelling of the survival due to cancer in the absence of reliable information on the cause of death, which is common in population-based cancer epidemiology studies. We propose a unifying link-based additive modelling framework for the excess hazard that allows for the inclusion of many types of covariate effects, including spatial and time-dependent effects, using any type of smoother, such as thin plate, cubic splines, tensor products and Markov random fields. Three case studies that illustrate the type of applications of interest in practice will be presented. We will conclude with a discussion on available software tools (in R), as well as a general discussion on the use of the relative survival framework.

01/03/2023 11:00 AMMB503Richard Hooper (QMUL)Optimal design of stepped wedge cluster randomised trials
Stepped wedge trials are cluster randomised clinical trials in which each “cluster” of participants (e.g. all users of a local health service) is randomised, not to one treatment condition or another, but to a particular schedule for crossing from the control condition to the intervention condition. Some clusters might cross over before data collection even begins; some might cross over at some point during the prospective data collection interval, and some might not cross over at all during that interval. In a stepped wedge trial the crossover is always unidirectional. You can cross from control to intervention, but never back again from intervention to control. In some stepped wedge trials participants are recruited from each cluster in one, long, consecutive stream; in others they are recruited once at the start of the trial and followed prospectively as a cohort; in still others they are sampled in a series of cross-sectional snapshots of the cluster through time. The unidirectional crossover, the constraints on how many people you can recruit and when, and the way you model the correlation between health outcomes of individuals from the same cluster (the intracluster correlation), all lead to some fascinating problems in the design of experiments, with some equally fascinating solutions. These solutions are of great practical interest to applied health researchers trying to evaluate public health interventions and quality improvement programmes. In my own methods research programme I am particularly interested in two kinds of stepped wedge design: “incomplete” designs, where data collection effort is focused at particular times in particular clusters, and designs with continuous recruitment of participants. I will present some of the findings from this work.

15/02/2023 11:00 AMMB503Yanbo Tang (Imperial College London)Adaptive Quadrature for Bayesian Inference
Adaptive numerical quadrature is used to normalize posterior distributions in many Bayesian models. We provide the first stochastic convergence rate for the error incurred when normalizing a posterior distribution under typical regularity conditions. We give approximations to moments, marginal densities, and quantiles, and provide convergence rates for several of these summaries. Low- and high-dimensional applications are presented, the latter using adaptive quadrature as one component of a more sophisticated approximation framework, for which limited theory is given. Extension of the theory to the high-dimensional framework for the Laplace approximation (a specific instance of an adaptive quadrature method) is considered and guarantees are provided under additional regularity assumptions.

09/02/2023 2:30 PMMB503Michael Pitt (KCL)On some properties of Markov chain Monte Carlo simulation methods when the likelihood is intractable
Markov chain Monte Carlo samplers still converge to the correct posterior distribution of the model parameters when an unbiased estimator is available for the likelihood. Whilst this allows inference for a very wide variety of intractable problems, a critical issue for performance is the choice of the number of particles (or samples).
We add the following contributions. We provide analytically derived, practical guidelines on the optimal number of particles to use in general scenarios. We show that the results in the article apply more generally to Markov chain Monte Carlo sampling schemes with the likelihood estimated in an unbiased manner. We introduce recent results on the asymptotic limits as T (the length of the time series) becomes large. Applications include Stochastic Volatility models for which the volatility follows a stochastic differential equation.

26/01/2023 11:30 AMMB503Nicola Perra (QMUL)Modelling the spreading of SARS-CoV-2 across different spatiotemporal scales
In the talk, I will provide an overview of different approaches I have applied to model the unfolding of the COVID-19 pandemic and its effects. In doing so, I will discuss the insights obtained by studying the initial phases of the pandemic, the first wave, and the vaccine rollout in the USA, Europe as well as Latin America. I will also discuss the key role of non-pharmaceutical interventions.

02/02/2023 11:00 AMMB503Dan Zhu (Monash University)Distribution Vector Autoregression: Eliciting Macro and Financial Dependence
Vector autoregression is an essential tool in empirical macroeconomics and finance, providing simple yet insightful information, such as the impulse response function of different shocks. This paper extends the scope of vector autoregression under a multivariate distribution regression framework and proposes the distribution impulse response function, which provides a more comprehensive picture of the dynamic heterogeneity. As an empirical application, we apply the proposed method to study the conditional joint distribution of GDP growth rate and financial conditions in the U.S. The results from our new framework confirm some existing findings in the literature: 1) the tight financial condition creates multimodality in the conditional joint distribution, and 2) restricting the upper tail of financial condition has a noticeable impact on long-term GDP growth. Yet, the extracted information on the effect of restricting the lower tail of GDP during the global financial crisis suggests an alternative conclusion, i.e., negligible impact on financial condition.

22/11/2022 11:00 AMMB503Uzu Lim (Oxford)Tangent space and dimension estimation of data manifold
Consider a set of points sampled independently near a smooth compact submanifold of Euclidean space. We provide mathematically rigorous bounds on the number of sample points required to estimate both the dimension and the tangent spaces of that manifold with high confidence. The algorithm for this estimation is Local PCA, a local version of principal component analysis. Our results accommodate noisy, nonuniform data distributions with noise that may vary across the manifold, and allow simultaneous estimation at multiple points. Crucially, all of the constants appearing in our bound are explicitly described. The proof uses a matrix concentration inequality to estimate covariance matrices and a Wasserstein distance bound for quantifying nonlinearity of the underlying manifold and nonuniformity of the probability measure.

06/12/2022 11:00 AMMB503John Baez (UCR, CQT NUS and Topos Institute)Information Theory in Population Dynamics
Information theory has interesting connections to the population dynamics of self-replicating entities. The relevant concept of information turns out to be the information of one probability distribution relative to another, also known as the Kullback–Leibler divergence. Using this we can get a new outlook on free energy, see evolution as a learning process, and give a clearer, more general formulation of Fisher's fundamental theorem of natural selection.
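For readers unfamiliar with the term, the relative information mentioned above has a standard definition, given here for two discrete probability distributions $p$ and $q$ on the same finite set (the symbols are notation chosen for this note, not taken from the talk):

```latex
% Information of p relative to q (Kullback-Leibler divergence),
% for discrete distributions with q_i > 0 whenever p_i > 0.
D_{\mathrm{KL}}(p \,\|\, q) \;=\; \sum_{i} p_i \,\log\frac{p_i}{q_i},
\qquad
D_{\mathrm{KL}}(p \,\|\, q) \;\ge\; 0,
\quad\text{with equality if and only if } p = q .
```

The quantity is not symmetric in $p$ and $q$, which is why it is described as the information of one distribution relative to another rather than as a distance.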

29/11/2022 11:00 AMMB503Kostas Papafitsoros (QMUL)Automatic Distributed Parameter Selection of Regularisation Functionals in Imaging via Bilevel Optimisation
We will discuss a series of bilevel optimisation problems that use a suitable statistics-based upper level objective and lead to automatic selection of spatially dependent parameters for regularisation functionals used in image reconstruction. The spatial dependence of the parameters generally leads to a better recovery of highly detailed areas in the reconstructed image. We will introduce the framework by considering initially as a regulariser the weighted Total Variation, and subsequently discuss its artifact-free, higher-order extension, weighted Total Generalised Variation. We will then present some recent results regarding extension of the framework to regularisation functionals that involve a more general class of differential operators. The applicability of this extension will be demonstrated with numerical results in image denoising for a Huber Total Variation functional where the underlying Huber parameter is also chosen to be spatially dependent. This provides further flexibility in the regularisation process and eventually results in an improved reconstruction quality.

15/11/2022 11:00 AMMB503Kristian Strommen (University of Oxford)A topological perspective on weather regimes
It has long been suggested that the mid-latitude atmospheric circulation possesses what has come to be known as "weather regimes", which can roughly be categorised as regions of phase space with above-average density. Their existence and behaviour have been extensively studied in meteorology and climate science, due to their potential for drastically simplifying the complex and chaotic mid-latitude dynamics. Several well-known, simple non-linear dynamical systems have been used as toy models of the atmosphere in order to understand and exemplify such regime behaviour. Nevertheless, no agreed-upon and clear-cut definition of a "regime" exists in the literature, and unambiguously detecting their existence in the atmospheric circulation is often hindered by the high dimensionality of the system.
In this talk I will first give an overview of some of the approaches used to study and define weather regimes. I will then proceed to propose a definition of weather regime that equates the existence of regimes in a dynamical system with the existence of non-trivial topological structure of the system's attractor. I will discuss how this approach is computationally tractable, practically informative, and identifies the relevant regime structure across a range of examples. This talk is based on the paper https://doi.org/10.1007/s00382-022-06395-x

01/11/2022 11:00 AMMB503Vitaliy Kurlin (University of Liverpool)Geometric Data Science: old challenges and new solutions
Geometric Data Science develops continuous parameterizations on moduli spaces of data objects up to important equivalences. The key example is a finite or periodic set of unlabeled points considered up to rigid motion or isometry preserving inter-point distances. Periodic point sets model all solid crystalline materials (periodic crystals) with zero-size points at all atomic centers. A periodic point set is usually given by a finite motif of points (atoms or ions) in a unit cell (parallelepiped) spanned by a linear basis. The underlying lattice can be generated by infinitely many bases. Even worse, the set of possible motifs for any periodic set is continuously infinite.
This typical ambiguity of data representation was recently resolved by generically complete and continuous isometry invariants: Pointwise Distance Distributions (PDD) of periodic point sets. The near-linear time algorithm for PDD invariants was tested on more than 200 billion pairwise comparisons of all 660K+ periodic crystals in the world's largest collection of real materials: the Cambridge Structural Database.
The huge experiment above took only two days on a modest desktop and detected five pairs of isometric duplicates. In each pair, the crystals are truly isometric to each other but one atom is replaced with a different atom type, which seems physically impossible without perturbing distances to atomic neighbors. Five journals are now investigating the integrity of the underlying publications that claimed these crystals.
The more important conclusion is the Crystal Isometry Principle, meaning that all real periodic crystals have unique geographic-style locations in a common continuous Crystal Isometry Space (CRISP). This space CRISP is parameterized by complete isometry invariants and contains all known and not yet discovered periodic crystals.
The relevant publications are in NeurIPS 2022, MATCH 2022, SoCG 2021. The latest paper in arXiv:2207.08502 defined complete isometry invariants with continuous computable metrics on any finite sets of unlabeled points in a Euclidean space. Many papers are co-authored with colleagues at Liverpool Materials Innovation Factory.

25/10/2022 11:00 AMMB503Hugo Maruri-Aguilar (QMUL)Betti penalisation of Lasso
This talk is concerned with an enhancement of Lasso for polynomial regression models. Our polynomial regression uses square-free hierarchical models, and these models can be seen as a simplicial complex. We propose a compound criterion that combines validation error with a measure of model complexity, and the measure of model complexity is a sum of Betti numbers of the model.
The compound criterion helps model selection in polynomial regression models containing higher-order interactions. Simulation results and a real data example show that the compound criterion produces sparser models with lower prediction errors than other statistical methods.
As part of the talk, I will briefly mention the history of this project, which I believe is worth looking at.
This is joint work with S. Hu (Alibaba) and Z. Ma (Huawei).

11/10/2022 11:00 AMMB503Celeste Damiani (QMUL)MammoAI: An AI System for Risk Assessment at Mammography Screening
At the moment, in the UK the majority of women go through the same breast cancer screening programme, but different women have different levels of risk of getting breast cancer.
We look at how we can assess risk using mammograms to enable new breast cancer screening programmes that are more suited to the level of risk faced by each woman. In particular:
 How can we tell when a woman might be at risk of getting a false negative during a standard mammogram, and should be offered a supplemental screening method?
 How do we assess the risk of future cancer after a negative screen?
 How can we use TDA tools for risk assessment on mammograms?
This work is part of the CRUK funded project (reference 49757/A28689) "An Artificial Intelligence System for Real-time Risk Assessment at Mammography Screening (Mammo AI)".

04/10/2022 11:00 AMMB503Matteo Iacopini (QMUL)Bayesian Additive Regression Trees for Rank-Order Data
Rank-ordered data are popular in many fields, including sports, marketing, finance, politics, and health economics. Most of the existing approaches rely on the restrictive assumption of a linear specification for the latent scores that drive the observed ranks. Besides, despite being provided over time by one or multiple rankers, the temporal dimension and properties of these orderings have been rarely investigated in the literature. To deal with these issues, we introduce two novel families of nonparametric order-statistics models that consider a static (ROBART) and an autoregressive process (ARROBART) for the latent scores and allow for a nonlinear impact of each covariate on the latent scores. This is achieved by modeling the regression function via a Bayesian additive regression tree (BART), which defines the overall fit as the sum of the fits of many small regression trees. As generalizations of the Thurstone family, the proposed ROBART and ARROBART models preserve interpretability and include several popular frameworks as special cases. Joint work with Eoghan O’Neill, Luca Rossini.

27/09/2022 11:00 AMMB503Renata Turkes (University of Antwerp)On the Effectiveness of Persistent Homology
Persistent homology (PH) is one of the most popular methods in Topological Data Analysis. Even though PH has been used in many different types of applications, the reasons behind its success remain elusive; in particular, it is not known for which classes of problems it is most effective, or to what extent it can detect geometric or topological features. The goal of this work is to identify some types of problems where PH performs well or even better than other state-of-the-art methods in data analysis. We consider three fundamental shape analysis tasks: the detection of the number of holes, curvature and convexity from 2D and 3D point clouds sampled from shapes. Experiments demonstrate that PH is successful in these tasks, outperforming several baselines, including PointNet, an architecture inspired precisely by the properties of point clouds. In addition, we observe that PH remains effective for limited computational resources and limited training data, as well as out-of-distribution test data, including various data transformations and noise. For convexity detection, we provide a theoretical guarantee that PH is effective for this task, and demonstrate the detection of a convexity measure on the FLAVIA dataset of plant leaf images.
This talk is based on joint work with Guido Montufar and Nina Otter (https://arxiv.org/abs/2206.10551).
The talk will be given in person in MB503 and will also be available remotely via Zoom.

13/04/2022 1:00 PMvia ZoomDaniel Paulin, Edinburgh UniversityEfficient MCMC sampling with dimension-free convergence rate using ADMM-type splitting
Performing exact Bayesian inference for complex models is computationally intractable. Markov chain Monte Carlo (MCMC) algorithms can provide reliable approximations of the posterior distribution but are expensive for large data sets and high-dimensional models. A standard approach to mitigate this complexity consists in using subsampling techniques or distributing the data across a cluster. However, these approaches are typically unreliable in high-dimensional scenarios. We focus here on a recent alternative class of MCMC schemes exploiting a splitting strategy akin to the one used by the celebrated alternating direction method of multipliers (ADMM) optimization algorithm. These methods appear to provide empirically state-of-the-art performance but their theoretical behavior in high dimension is currently unknown. In this paper, we propose a detailed theoretical study of one of these algorithms known as the split Gibbs sampler. Under regularity conditions, we establish explicit convergence rates for this scheme using Ricci curvature and coupling ideas. We support our theory with numerical illustrations. This is joint work with Maxime Vono (Criteo AI Lab) and Arnaud Doucet (Oxford).

06/04/2022 1:00 PMvia ZoomAxel FinkeConditional sequential Monte Carlo in high dimensions
We discuss Markov chain Monte Carlo methods called "iterated conditional sequential Monte Carlo" a.k.a. "particle Gibbs samplers". These methods can be used to approximate the joint distribution of all latent states in state-space models. We show that these methods suffer a curse of dimension. We then introduce a novel modification of this method which employs local, random-walk type moves to circumvent this curse of dimension.

08/03/2022 2:00 PMMB204 in person + via ZoomPeter BubenikHomotopy, Homology, and Persistent Homology using Cech’s Closure Spaces
We use Cech closure spaces, also known as pretopological spaces, to develop a uniform framework that encompasses the discrete homology of metric spaces, the singular homology of topological spaces, and the homology of (directed) clique complexes, along with their respective homotopy theories. We obtain nine homology and six homotopy theories of closure spaces. We show how metric spaces and more general structures such as weighted directed graphs produce filtered closure spaces. For filtered closure spaces, our homology theories produce persistence modules. We extend the definition of Gromov-Hausdorff distance to filtered closure spaces and use it to prove that our persistence modules and their persistence diagrams are stable. We also extend the definitions of Vietoris-Rips and Cech complexes to closure spaces and prove that their persistent homology is stable.
This is joint work with Nikola Milicevic.

23/02/2022 1:00 PMvia ZoomHugo Maruri-AguilarEchelon designs, Hilbert series and Smolyak grids
Echelon designs were first described in the monograph by Pistone et al. (2000). These designs are defined for continuous factors and include, amongst others, factorial designs. They have the appealing property that the saturated polynomial model associated with a design mirrors the geometric configuration of the design. Perhaps surprisingly, the interpolators for such designs are based upon the Hilbert series of the monomial ideal associated with the polynomial model and thus the interpolators satisfy properties of inclusion-exclusion.
Echelon designs are quite flexible for modelling and include the recently developed designs known as Smolyak sparse grids. In our talk we present the designs, describe their properties and show examples of application.
This is joint work with H. Wynn (CATS, LSE).
Reference: Pistone et al. (2000) Algebraic Statistics. Chapman & Hall/CRC
Key words: Sparse grids, experimental design, algebraic statistics, polynomial models.

16/02/2022 1:00 PMvia ZoomQuan Zhou (Texas A&M)Informed MCMC sampling for high-dimensional model selection problems
Informed Markov chain Monte Carlo (MCMC) methods have been proposed as scalable solutions to Bayesian posterior computation on high-dimensional discrete state spaces, but theoretical results about their convergence behavior in general settings are lacking. In this talk, we first consider the variable selection problem. We propose a novel informed Metropolis-Hastings algorithm which can achieve a mixing rate that is independent of the number of covariates, under mild high-dimensional conditions. The mixing time proof relies on a novel method called "two-stage drift condition". This result shows that the mixing rate of locally informed MCMC methods can be fast enough to offset the computational cost of local posterior evaluation, and thus such methods scale well to high-dimensional data. Second, we consider MCMC sampling on general finite state spaces. We propose a class of methods called informed importance tempering (IIT) and develop generally applicable spectral gap bounds that characterize the convergence rate of IIT. Our theory provides important insights into how to choose the proposal weighting scheme for an informed MCMC method. If time permits, we will also briefly discuss the application of our theory to the high-dimensional structure learning problem. This talk is based on joint work with A. Smith, H. Chang, J. Yang, D. Vats, G. Roberts and J. Rosenthal.

15/03/2018 4:15 PMW316, Queens' BuildingE. Y. Wang, Wolfson Institute of Population HealthDesign of multi-arm, multi-stage clinical trials with dynamic controls
Often, for a given patient population, there will be more than one treatment available for testing at the Phase III stage. Rather than conducting separate randomised controlled trials for each of these treatments (which could require prohibitively high numbers of patients), this study proposes a multi-arm trial assessing the performance of all the available treatments.
A surrogate biomarker/endpoint will be used to judge the performance of the treatments at interims, where, if a treatment underperforms with regard to the surrogate biomarker compared to the control, it will be removed and recruitment instead given to a new promising treatment.
I explored trials of this design that continue over a long period of time, comparing them with several consecutive parallel Phase III trials to see which design fared best in terms of type I error, power, survival of patients on the trials and long-term survival of patients with the given disease.

04/05/2017 4:30 PMQueens' W316J. M. Cuzick, Wolfson Institute of Population HealthUse of frailty models in medical statistics
Frailty models are typically used when there are unobserved covariates. Here we explore their use in two situations where they provide important insights into two epidemiologic questions.
The first involves the question of type replacement after vaccination against the human papilloma virus (HPV). At least 13 types of HPV are known to cause cancer, especially cervical cancer. Recently vaccines have been developed against some of the more important types, notably types 16 and 18. These vaccines have been shown to prevent infection by the types used with almost 100% efficacy. However, a concern has been raised that by eliminating these more common types, a niche will be created in which other types could now flourish, and that the benefits of vaccination could be less than anticipated if this were to occur. It will be years before definitive data are available on this, but preliminary evidence could be obtained if it could be shown that there is a negative association between the occurrence of multiple infections in the same individual. The virus is transmitted by sexual contact and testing for it has become part of cervical screening. As infection increases with greater sexual activity, a woman with one type is more likely to also harbour another type, so the question can be phrased as whether there is a negative association between specific pairs of types in the context of an overall positive association. A frailty model is used for this in which the total number of infections a woman has is an unobserved covariate, and the question can be rephrased to ask if specific types are negatively correlated conditional on the number of types present in a woman. This is modelled by assuming a multiplicative random variable $\tau$ having a log gamma distribution with unit mean and one additional parameter $\theta$, so that the probability of occurrence of type $j$ in individual $i$ is modified to be $\tau_i p_j$, where the $\tau_i$ are iid copies of $\tau$, and the joint probability of being infected by types $1, \dots, k$ is $\Theta_k p_1 \cdots p_k$ with $\Theta_k = E(\tau^k)$. A likelihood is obtained and moment-based estimation procedures are developed and applied to a large data set.
A second example pertains to an extension of the widely used proportional hazards model for analysing time-to-event data with censoring. In practice hazards are often not proportional over time: converging hazards are observed, and the effect of a covariate is stronger in early follow-up than it is subsequently. This can be modelled by assuming an unobserved multiplicative factor in the hazard function, again having a log gamma distribution with unit mean and one additional parameter $\theta$. Integrating out this term leads to a Pareto survival distribution. A (partial) likelihood is obtained and estimation procedures are developed and applied to a large data set.
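As a worked illustration of the "integrating out" step, suppose for concreteness that the frailty $\tau$ is gamma distributed with mean 1 and variance $\theta$ (an assumption made here; the talk's exact parameterisation may differ), and write $H_0(t)$ for the cumulative baseline hazard (notation introduced only for this sketch). The marginal survival function and the moments that play the role of $\Theta_k$ in the first example then take the following form:

```latex
S(t) = \mathbb{E}\!\left[e^{-\tau H_0(t)}\right]
     = \bigl(1 + \theta H_0(t)\bigr)^{-1/\theta},
\qquad
\mathbb{E}(\tau^k) = \prod_{j=0}^{k-1} (1 + j\theta).
```

With a constant baseline hazard, $H_0(t) = \lambda t$, the first expression is a Pareto (type II) survival function, and the marginal hazard ratio between two covariate groups shrinks towards one as $t$ grows, which is the converging-hazards behaviour described above.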

08/12/2016 4:30 PMBR 3.02C. Wang, Wolfson Institute of Population HealthSpatial analysis and its application in modelling cancer screening coverage in England
One problem that arises from spatial data is that spatial correlation often exists among the observations, since spatial units close to each other are likely to share similar socioeconomic, infrastructure or other characteristics. Statistical models that ignore spatial correlation may lead to biased parameter estimates. In the econometrics literature, there are several methods to measure and model such spatially correlated effects. We demonstrate some of these statistical tools using real-world data by exploring factors affecting cancer screening coverage in England. In this particular study, we are interested in the impact of car ownership and public transport usage on breast and cervical cancer screening coverage. District-level cancer screening coverage data (in proportions) and UK census data have been collected and linked.
A non-spatial model (using ordinary least squares, OLS) was fitted first, and Moran's I statistic showed that significant spatial correlation remains even after controlling for a range of predictors. Two alternative spatial models were then tested, namely: 1) the spatial autoregressive (SAR) model, and 2) the SAR error model, also known simply as the spatial error model (SEM).
Results from the spatial models are compared with those from the non-spatial model. Some coefficient estimates differ, and the spatial models outperform the non-spatial model in terms of goodness-of-fit. In particular, the SEM is the best model for both types of cancer.
Finally, we discuss some general issues in spatial analysis, such as the modifiable areal unit problem (MAUP), different spatial weighting schemes, and other spatial modelling strategies such as a gravity model and a spatially varying coefficient model.
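The residual check mentioned above can be sketched as follows. This is a minimal illustration of Moran's I, not the analysis from the talk: the four "districts", the binary contiguity weights and the residual values are all hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for values y_i and a spatial weight matrix w_ij (with w_ii = 0)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                      # centred values
    s0 = sum(sum(row) for row in weights)               # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))    # weighted cross-products
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Hypothetical example: four districts in a row; neighbours share a border.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
residuals = [1.0, 2.0, 3.0, 4.0]   # made-up OLS residuals with a spatial trend
print(morans_i(residuals, W))
```

A value well above the null expectation of -1/(n-1) suggests positive spatial autocorrelation in the residuals; a formal test would also need the variance of I under the null, which is omitted here.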

19/05/2016 4:30 PMM103B. V. North, Wolfson Institute of Population HealthSTARPAC and phase 1 studies with the continual reassessment method
Phase I clinical trials are an essential step in the development of anticancer drugs. The main goal of these studies is to establish the recommended dose and/or schedule of new drugs or drug combinations for phase II trials. The guiding principle for dose escalation in phase I trials is to avoid exposing too many patients to subtherapeutic doses while maintaining rapid accrual and preserving safety by limiting toxic side effects. STARPAC is a phase I trial examining the use of ATRA, a vitamin A-like compound, in combination with established cancer drugs in combatting pancreatic cancer, a cancer with a dismal survival record which is the 4th highest cancer killer worldwide. A challenge for toxicity trials that prescribe doses for newly recruited patients based on the dose and toxicity data from previous patients is that patients are recruited before previous patients have reported toxicity data. In order to escalate doses safely we employ a two-stage process, with the first stage an accelerated rule-based procedure and the second stage a modified approach based on the Bayesian Continual Reassessment Method, which combines a prior toxicity-dose curve with the accumulating patient dose/toxicity data.
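To make the second stage more concrete, here is a minimal sketch of a textbook one-parameter continual reassessment method, not the modified procedure used in STARPAC. The power model, the prior variance of 1.34, the skeleton of prior toxicity guesses, the target rate and the patient data are all assumptions chosen for illustration.

```python
import math


def crm_recommend(skeleton, target, doses_given, toxicities, prior_var=1.34):
    """Return (index of recommended dose, posterior mean toxicity for each dose).

    Assumed model: P(toxicity at dose i) = skeleton[i] ** exp(a),
    with a ~ Normal(0, prior_var). The posterior over a is evaluated on a grid.
    """
    grid = [-4 + 8 * k / 800 for k in range(801)]        # grid of values for a
    post = []
    for a in grid:
        w = math.exp(-a * a / (2 * prior_var))           # unnormalised prior density
        for d, y in zip(doses_given, toxicities):
            p = skeleton[d] ** math.exp(a)
            w *= p if y else (1 - p)                     # Bernoulli likelihood term
        post.append(w)
    total = sum(post)
    means = [sum(w * skeleton[i] ** math.exp(a) for a, w in zip(grid, post)) / total
             for i in range(len(skeleton))]
    # Recommend the dose whose posterior mean toxicity is closest to the target.
    best = min(range(len(skeleton)), key=lambda i: abs(means[i] - target))
    return best, means


# Hypothetical skeleton and data: three patients at dose 0, three at dose 1.
skeleton = [0.05, 0.10, 0.20, 0.35, 0.50]
best, means = crm_recommend(skeleton, 0.25,
                            doses_given=[0, 0, 0, 1, 1, 1],
                            toxicities=[0, 0, 0, 0, 1, 0])
print(best, [round(m, 3) for m in means])
```

In practice a rule restricting how far the dose may jump between cohorts is usually added on top of this recommendation; that safeguard is left out of the sketch.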

04/02/2016 4:30 PMM103P. D. Sasieni, Wolfson Institute of Population HealthAlternatives to net and relative survival for comparison of survival between populations
Most cancer registries choose not to rely on cause of death when presenting survival statistics on cancer patients, but instead look at overall mortality after diagnosis and adjust for the expected mortality in the cohort had they not been diagnosed with cancer. For many years the relative survival (observed survival divided by expected survival) was estimated by the Ederer-II method. More recently statisticians have used the theory of classical competing risks to estimate the net survival – that is, the survival that would be observed in cancer patients if it were possible to remove all competing causes of death. Pohar-Perme showed that in general estimators of the relative survival are not consistent for the net survival, and proposed a new consistent estimator of the net survival. Pohar-Perme's estimator can have much larger variance than Ederer-II (and may not be robust). Thus, whereas some statisticians have argued that one must use the Pohar-Perme estimator because it is the only one that is consistent for the net survival, others have argued that there is a bias-variance trade-off and Ederer-II may still be preferred even though it is inconsistent.
We draw an analogy from the literature on robust estimation of location. If one wants to estimate the mean of a distribution consistently, then it may be difficult to improve on the sample mean. But if one simply wants a measure of location then other estimators are possible and might be preferred to the sample mean. We define a measure of net survival to be a functional satisfying certain equivariance and order conditions. The limits of neither Ederer-II nor Pohar-Perme satisfy our definition of being an invariant measure of net survival. We introduce two families of functionals that do satisfy our definition. Consideration of minimum variance and robustness then allows us to select a single member of each family as the preferred measure of net survival.
Noting that in a homogeneous population the relative survival and the net survival are identical and correspond to the survival of the excess hazard, we can then view our functionals as weighted averages of stratum-specific relative survival, net survival or excess hazards. These can be viewed as standardised estimators with standardising weights that are time-dependent. The preferred measures use weights that depend on the numbers at risk in each stratum from a standard population as a function of time. For example, when the strata are defined by age at diagnosis, the standardising weights will depend on the age-specific prevalence of the cancer in the standard population.
We show through simulation that, unlike both Ederer-II and Pohar-Perme, our estimators are invariant and robust under changing population structures, and also that they are consistent and reasonably efficient. Although our estimator does not (consistently) estimate the (marginal) net hazard, it performs as well as or better than both the crude and standardised versions of both Ederer-II and Pohar-Perme in all simulations.
Joint work with Adam Brentnall

29/01/2015 4:30 PMM203A. R. Brentnall, Wolfson Institute of Population HealthOn use of the concordance index in epidemiology
Studies of risk factors in epidemiology often use a case-control design. The concordance index (or area under the receiver operating characteristic (ROC) curve (AUC)) may be used in unmatched case-control studies to measure how well a variable discriminates between cases and controls. The AUC is sometimes used in matched case-control studies by ignoring matching, but it lacks interpretation because it is not based on an estimate of the ROC for the population of interest. An alternative measure of the concordance of risk factors conditional on the matching factors will be introduced, and applied to data from breast and lung cancer case-control studies.
Another common design in epidemiology is the cohort study, where the aim might be to estimate the concordance index for predictors of censored survival data. A popular method only considers pairs of individuals when the smaller outcome is uncensored (Harrell's c-statistic). While this statistic can be useful for comparing different models on the same data set, it is dependent on the censoring distribution. Methods to address this issue will be considered and applied to data from a breast cancer trial.
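The pair-selection rule behind Harrell's c-statistic, and its sensitivity to censoring, can be illustrated with a small sketch. The follow-up times, event indicators and risk scores below are made up, and tied times are simply skipped, which is a simplification.

```python
def harrell_c(times, events, scores):
    """Harrell's c: among usable pairs, the share in which the subject with the
    shorter observed time has the higher risk score (tied scores count 1/2).

    A pair is usable only when the smaller time is an observed event.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / usable


# Hypothetical data: event indicator 1 = event observed, 0 = censored.
times = [2, 4, 6, 8]
scores = [0.9, 0.3, 0.2, 0.5]
print(harrell_c(times, [1, 0, 1, 1], scores))   # third subject's event observed
print(harrell_c(times, [1, 0, 0, 1], scores))   # same subject censored instead
```

The two calls use identical times and scores and differ only in whether the third subject is censored, yet they return different values because a discordant pair drops out of the usable set. This is the dependence on the censoring distribution noted above.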

22/04/2013 3:00 PM130 Wolfson InstitutePeter Sasieni, Wolfson Institute of Population Health, QMULSome Statistical issues arising from evaluating cancer screening
Ideally, cancer screening is initially evaluated through randomised controlled trials, the analysis of which should be straightforward. The statistical challenges arise when one is either trying to combine the results of several trials with different designs, or trying to evaluate routine service screening (which may use improved technologies compared to the original randomised controlled trials).
We will briefly discuss the following problems.
1. Estimation from interval censored data based on imperfect observations. When screening for asymptomatic precancerous disease, one will only identify the disease if the individual with the disease is screened and if the screening test is positive (leading to further investigations and a definitive diagnosis). In the simple model one may have periodic screening with a fixed sensitivity. A more sophisticated analysis would take account of the possibility that as the precancerous lesion grows the sensitivity of the screening test increases.
2. Estimating overdiagnosis (defined as a screen-detected cancer that would not have been diagnosed (before the individual died) in the absence of screening) from a trial in which the control arm are all offered screening at the end of the trial. The idea is that with extended follow-up data one may be able to apply methods designed for non-compliance to estimate overdiagnosis.
3. Improving ecological studies and trend analyses to try to estimate the effects of screening on incidence (overdiagnosis or cancer prevention) and mortality taking into account secular trends in incidence and mortality.
4. Meta-analysis of randomised trials of screening that are heterogeneous in terms of screening interval, duration of follow-up after the last screen, and whether or not the control group were offered screening at the end of the trial. The idea that we explore is whether by modelling the expected behaviour of the incidence function over time, one can combine estimates of the same quantity in the meta-analysis.
5. How should one quantify exposure to screening in an observational (case-control) study of cancer screening? The issue is whether one can use such studies to accurately estimate the benefit of screening at different intervals. We will discuss a few options and suggest that they may best be studied by applying them to simulated data.

08/12/2021 1:00 PMMB204 + ZoomMihai CucuringuSpectral methods for clustering signed and directed networks
We consider the problem of clustering in two important families of networks: signed and directed, both relatively less well explored compared to their unsigned and undirected counterparts. Both problems share an important common feature: they can be solved by exploiting the spectrum of certain graph Laplacian matrices or derivations thereof. In signed networks, the edge weights between the nodes may take either positive or negative values, encoding a measure of similarity or dissimilarity. We consider a generalized eigenvalue problem involving graph Laplacians, with performance guarantees under the setting of a signed stochastic block model. The second problem concerns directed graphs. Imagine a (social) network in which you spot two subsets of accounts, X and Y, for which the overwhelming majority of messages (or friend requests, endorsements, etc.) flow from X to Y, and very few flow from Y to X; would you get suspicious? To this end, we also discuss a spectral clustering algorithm for directed graphs based on a complex-valued representation of the adjacency matrix, which is able to capture the underlying cluster structures for which the information encoded in the direction of the edges is crucial. We evaluate the proposed algorithm in terms of a cut flow imbalance-based objective function which, for a given pair of clusters, captures the propensity of the edges to flow in a given direction. Experiments on a directed stochastic block model and real-world networks showcase the robustness and accuracy of the method when compared to other state-of-the-art methods. Time permitting, we briefly discuss potential extensions to the sparse setting and regularization, and applications to lead-lag detection in time series and ranking from pairwise comparisons.
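The X-to-Y picture above can be made concrete with a short sketch. It computes a normalised flow imbalance between two node sets and builds one possible complex-valued (Hermitian) representation of a directed adjacency matrix. Both the normalisation and the matrix i(A - A^T) are assumptions made for illustration and may differ from the exact objective and representation used in the talk. The eigen-decomposition and clustering steps are not included.

```python
def flow_imbalance(adj, X, Y):
    """Normalised imbalance, in [0, 1/2], of edge weight flowing between X and Y."""
    w_xy = sum(adj[u][v] for u in X for v in Y)    # total weight from X to Y
    w_yx = sum(adj[v][u] for u in X for v in Y)    # total weight from Y to X
    total = w_xy + w_yx
    return 0.0 if total == 0 else 0.5 * abs(w_xy - w_yx) / total


def hermitian_adjacency(adj):
    """Complex-valued matrix H = i * (A - A^T), which equals its conjugate transpose."""
    n = len(adj)
    return [[1j * (adj[u][v] - adj[v][u]) for v in range(n)] for u in range(n)]


# Hypothetical network: nodes 0 and 1 send to nodes 2 and 3; one edge goes back.
A = [[0, 0, 1, 1],
     [0, 0, 1, 1],
     [1, 0, 0, 0],
     [0, 0, 0, 0]]
print(flow_imbalance(A, [0, 1], [2, 3]))
H = hermitian_adjacency(A)
```

Because H is Hermitian its eigenvalues are real, which is what makes a spectral method on a directed graph workable in this representation.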

01/12/2021 1:00 PMZoomVukosi Marivate (University of Pretoria)Coming to grips with the reality of data science - it's people all the way down
For practising data science researchers and practitioners, the COVID-19 pandemic has highlighted both the need for data-driven decision making and the reality of what it really takes to get to that point. It is not only about throwing data and models at a problem. It is about understanding the environment that one is in and then strategising on what might work best for that environment. In this talk I look back at some of the work we have done in responding to different challenges within both Data Science and Natural Language Processing. I place people at the centre and show how they are the important piece in our practice.

24/11/2021 4:00 PMvia ZoomPhilippe GagnonAn asymptotic Peskun ordering and its application to lifted samplers
Please note different time from usual seminar time.
A Peskun ordering between two samplers, implying a dominance of one over the other, is known among the Markov chain Monte Carlo community for being a remarkably strong result, but it is also known for being one that is notably difficult to establish. Indeed, one has to prove that the probability to reach a state, using a sampler, is greater than or equal to the probability using the other sampler, and this must hold for all states excepting the current state. We provide in this paper a weaker version that does not require an inequality between the probabilities for all these states: the dominance holds asymptotically, as a varying parameter grows without bound, as long as the states for which the probabilities are greater than or equal to belong to a mass-concentrating set. The weak ordering turns out to be useful to compare lifted samplers for partially-ordered discrete state-spaces with their Metropolis–Hastings counterparts. An analysis yields a qualitative conclusion: they asymptotically perform better in certain situations (and we are able to identify these situations), but not necessarily in others (and the reasons why are made clear). The difference in performance is evaluated quantitatively in important applications such as graphical-model simulation and variable selection.
Joint work with Florian Maire (Université de Montréal).
The preprint is available at: https://arxiv.org/abs/2003.05492. In the talk, I will focus on the motivations of our work, which will in turn allow me to motivate our theoretical result.
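For readers unfamiliar with the classical (non-asymptotic) Peskun ordering described in the abstract, the following sketch checks it on a toy example. It compares Metropolis-Hastings and Barker acceptance rules with a uniform proposal on a three-state space; the target distribution and the helper names are hypothetical and are not taken from the preprint.

```python
def transition_matrix(pi, accept):
    """Transition matrix for a uniform proposal over the other states,
    accepted with probability accept(pi[y] / pi[x])."""
    n = len(pi)
    P = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x != y:
                P[x][y] = accept(pi[y] / pi[x]) / (n - 1)
        P[x][x] = 1.0 - sum(P[x])        # remaining mass stays at the current state
    return P


def peskun_dominates(P1, P2, tol=1e-12):
    """True if P1(x, y) >= P2(x, y) for every pair of distinct states x != y."""
    n = len(P1)
    return all(P1[x][y] >= P2[x][y] - tol
               for x in range(n) for y in range(n) if x != y)


pi = [0.2, 0.3, 0.5]                                         # hypothetical target
P_mh = transition_matrix(pi, lambda r: min(1.0, r))          # Metropolis-Hastings
P_barker = transition_matrix(pi, lambda r: r / (1.0 + r))    # Barker
print(peskun_dominates(P_mh, P_barker), peskun_dominates(P_barker, P_mh))
```

The check has to hold for every off-diagonal entry, which is why the ordering is hard to establish for realistic samplers and why an asymptotic relaxation is of interest.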

17/11/2021 1:00 PMMB203 (please note different location) + ZoomArthur GuillauminDebiased Whittle likelihood for time series and spatial data
Time series and spatial data are ubiquitous in many application areas, such as environmental data, geosciences, astronomy, and finance. A key statistical modelling and estimation challenge for these data is that of dependence between points at different times or locations. While parametric models of covariance can be estimated via exact likelihood, this is ill-suited for many practical problems due to the heavy computational cost.
A standard approach to address this relies on approximate likelihood methods. The Whittle likelihood is one such approximation for gridded data, based on the Discrete Fourier Transform of the data. It is popular due to its n log n computational cost, robustness to non-Gaussian data, and amenability to interpretation in the spectral domain. However, Whittle likelihood estimates can suffer from a strong bias due to the finite and discrete sampling. This is true in particular for spatial data, where bias dominates standard deviation in dimension two or greater. Additionally, practical sampling patterns often diverge from theoretical requirements, due to non-square observational domains or missing data. In this talk we present a recently proposed modification to the Whittle likelihood which addresses all these issues at once.
We provide asymptotic results under a framework which we call Significant Correlation Contribution, which allows us to understand the interplay between the sampling pattern and the covariance model. We demonstrate that our modification renders our estimate asymptotically efficient and normal for a wide class of settings and present some practical use cases.
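As background, the standard Whittle approximation for a one-dimensional series can be sketched as below. This is not the debiased modification presented in the talk. The AR(1) model, the function names and the toy series are assumptions, and the direct O(n^2) transform is used only to keep the sketch dependency-free (an FFT gives the n log n cost mentioned above).

```python
import cmath
import math


def periodogram(x):
    """I(w_k) = |sum_t x_t exp(-i w_k t)|^2 / n at frequencies w_k = 2 pi k / n."""
    n = len(x)
    out = []
    for k in range(n):
        w = 2 * math.pi * k / n
        dft = sum(x[t] * cmath.exp(-1j * w * t) for t in range(n))
        out.append(abs(dft) ** 2 / n)
    return out


def ar1_spectrum(w, phi, sigma2):
    """AR(1) spectral density, in the normalisation that matches periodogram()."""
    return sigma2 / abs(1 - phi * cmath.exp(-1j * w)) ** 2


def whittle_neg_loglik(x, phi, sigma2):
    """Sum over non-zero Fourier frequencies of log f(w_k) + I(w_k) / f(w_k)."""
    n = len(x)
    pgram = periodogram(x)
    total = 0.0
    for k in range(1, n):                       # frequency zero is skipped
        f = ar1_spectrum(2 * math.pi * k / n, phi, sigma2)
        total += math.log(f) + pgram[k] / f
    return total


x = [1.0, -1.0, 1.0, -1.0]                      # toy series
print(whittle_neg_loglik(x, phi=-0.5, sigma2=1.0))
```

Minimising this quantity over (phi, sigma2) gives the Whittle estimate. The bias discussed in the abstract arises because, for finite samples, the expected periodogram differs from the spectral density used here.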

03/11/2021 1:00 PMZoomMarzieh Eidi (Max Planck Institute for Mathematics in the Sciences)Curvature-based Analysis of Directed (Hyper)Networks
Today we are confronted with huge and highly complex data, and one main challenge is to determine the "structure" of complex networks or the "shape" of data. In the past few years, geometric and topological methods, powerful tools that originated in Riemannian geometry, have become popular for data analysis. In this seminar, after introducing Ollivier-Ricci curvature for (directed) hypergraphs, I will present, as one of its main recent applications, the results of using this tool to analyse chemical reaction networks. We will see that this notion and Forman-Ricci curvature are complementary edge-based tools for detecting some important structures in the network.

27/10/2021 1:00 PMMB204 and ZoomJun Yang (University of Oxford)Stereographic Markov Chain Monte Carlo
High dimensional distributions, especially those with heavy tails, are notoriously difficult for off-the-shelf MCMC samplers: the combination of unbounded state spaces, diminishing gradient information and local moves results in empirically observed "stickiness" and poor theoretical mixing properties (lack of geometric ergodicity). In this paper we introduce a new class of MCMC samplers that map the original high dimensional problem in Euclidean space onto a sphere and remedy these notorious mixing problems. In particular, we develop Random Walk Metropolis type algorithms as well as versions of the Bouncy Particle Sampler that are uniformly ergodic for a large class of light and heavy tailed distributions and also empirically exhibit rapid convergence.
Joint work with Krzysztof Latuszynski and Gareth O. Roberts.

20/10/2021 1:00 PMZoomTom Leinster (University of Edinburgh)What is the uniform distribution?
Everyone knows what the uniform probability distribution is on a real interval or on a finite set, but it is not so obvious what we should understand "uniform distribution" to mean on a completely arbitrary space. I will give a general definition, taking "space" to mean something slightly more general than compact metric space. The definition rests on a maximum entropy theorem for distributions on metric spaces, which in turn arose from questions about the measurement of biodiversity. This idea of seeking a systematic general notion of uniform distribution is similar in spirit to the quest for an objective prior, and indeed, is at least loosely related to it, as I will explain. (Joint work with Emily Roff.)

06/10/2021 1:00 PMRoom MB503 + Zoom streamingNina Otter (QMUL)A topological perspective on weather regimes
In this talk I will discuss recent and ongoing work on using topology to define and study weather regimes. The talk is based on joint work with K. Strommen, M. Chantry and J. Dorrington, with preprint available at https://arxiv.org/abs/2104.03196.
Zoom link: https://qmulacuk.zoom.us/j/82103051171?pwd=NjJRckR5Z3lJRzRRZlFlblhDNGFzZz09

29/09/2021 1:00 PMMB503 and ZoomPhilippa (Pip) Pattison (University of Sydney)Realisation-dependent models for networks
Abstract: In this talk, I summarise progress in building models for social networks that capture many of their well-known structural features. I focus on a modelling approach which construes global network structure as the outcome of dynamic, potentially realisation-dependent, interactive processes occurring within local neighbourhoods of a network. I describe a hierarchy of models implied by the approach and their estimation from partial network data structures obtained through certain types of network sampling schemes. I illustrate how these models can be used to enrich our understanding of community network structures and hence of processes such as the transmission of infectious diseases.
About the speaker: Prof Pip Pattison is a quantitative psychologist by background and the primary focus of her research is the development and application of mathematical and statistical models for social networks and network processes. She is currently the Deputy Vice-Chancellor (Education) at the University of Sydney.

26/05/2021 2:00 PMZoomConcepcion Ausin (Universidad Carlos III de Madrid)Variational inference for high dimensional structured factor copulas
Factor copula models have been recently proposed for describing the joint distribution of a large number of variables in terms of a few common latent factors. A Bayesian procedure is employed in order to make fast inferences for multifactor and structured factor copulas. To deal with the high dimensional structure, a Variational Inference (VI) algorithm is applied to estimate different specifications of factor copula models. Compared to the Markov Chain Monte Carlo (MCMC) approach, the variational approximation is much faster and could handle a sizable problem in limited time. Another issue of factor copula models is that the bivariate copula functions connecting the variables are unknown in high dimensions. An automatic procedure is derived to recover the hidden dependence structure. By taking advantage of the posterior modes of the latent variables, the bivariate copula functions are selected by minimizing the Bayesian Information Criterion (BIC). Simulation studies in different contexts show that the procedure of bivariate copula selection could be very accurate in comparison to the true generated copula model. The proposed procedure is illustrated with two high dimensional real data sets.

19/05/2021 2:00 PMZoomVictor Veitch (University of Chicago)Counterfactual Invariance to Spurious Correlations
Informally, a ‘spurious correlation’ is the dependence of a model on some aspect of the input data that an analyst thinks shouldn’t matter. In machine learning, these have a know-it-when-you-see-it character, e.g., changing the gender of a sentence’s subject changes a sentiment predictor’s output. I'll talk about counterfactual invariance, a causal formalization of the requirement that changing irrelevant parts of the input shouldn’t change model predictions. We connect counterfactual invariance to out-of-domain model performance, and provide schemes for learning (approximately) counterfactual invariant predictors (without access to counterfactual examples). It turns out that both the means and meaning of counterfactual invariance depend fundamentally on the true underlying causal structure of the data. Distinct causal structures require distinct regularization schemes to induce counterfactual invariance. Similarly, counterfactual invariance implies different domain shift guarantees depending on the underlying causal structure. This theory is supported by empirical results on text classification.

12/05/2021 2:00 PMZoomSylvia Fruhwirth-Schnatter (University of Vienna)Triple the gamma – Achieving Shrinkage and Variable Selection in TVP Models
Time-varying parameter (TVP) models are a popular tool for handling data with smoothly changing parameters. However, in situations with many parameters the flexibility underlying these models may lead to overfitting and, as a consequence, to a severe loss of statistical efficiency. This occurs, in particular, if only a few parameters are indeed time-varying, while the remaining ones are constant or even insignificant. As a remedy, hierarchical shrinkage priors have been introduced for TVP models to allow shrinkage of both the initial parameters and their variances toward zero.
The talk reviews various approaches to introducing shrinkage priors for TVP models. Recently, Cadonna et al (2020) introduced the (hierarchical) triple Gamma prior, which includes other popular shrinkage priors such as the double Gamma prior and the horseshoe prior as special cases. The talk also discusses efficient methods for MCMC inference and investigates the close resemblance of the triple Gamma prior to BMA. For illustration, hierarchical shrinkage priors are applied to TVP-VAR-SV models, a popular tool for modelling multivariate macroeconomic time series. The results clearly indicate that shrinkage priors reduce the risk of overfitting and increase statistical efficiency in a TVP modelling framework.
(based on joint work with Annalisa Cadonna and Peter Knaus, Vienna University of Economics and Business)
Full version of Cadonna et al (2020): https://doi.org/10.3390/econometrics8020020

05/05/2021 2:00 PMZoomCeleste Damiani (QMUL)AN AI SYSTEM FOR ASSESSING BREAST DENSITY
At the moment, in the UK all women go through the same breast cancer screening programme. But different women have different levels of risk of getting breast cancer. In our project we are looking at how we can adjust breast cancer screening programmes so that they are more suited to the level of risk faced by each woman; this is known as risk-adapted screening. In particular, breast density is the amount of white and bright regions seen on a mammogram. High breast density can make it harder for doctors to detect breast cancer on a screening mammogram and also increases the risk of developing breast cancer. I am going to talk about how we are planning to use AI algorithms to objectively measure breast density and answer the question: how can we tell when a woman might be at risk of getting a false negative during a standard mammogram, and should be offered an alternative screening method? This is part of the CRUK-funded project “An Artificial Intelligence System for Real-time Risk Assessment at Mammography Screening (Mammo AI)”.

28/04/2021 2:00 PMZoomMaria Grith (Erasmus University of Rotterdam)The Block-Autoregressive Model in Non-Standard Bases
We propose a new autoregressive model for the analysis of time series with periodic interdependencies. The model is based on the application of a vector autoregressive model to univariate data that is partitioned into ‘blocks’ of observations. For this reason, we refer to it as the block-autoregressive (BAR) model. The untransformed BAR model nests several other autoregressive models such as the regular AR model, the periodic AR model, the (mixed) seasonal AR model, and the scale-specific AR model that was introduced by Bandi et al. (2019). In addition, the BAR model can be transformed using orthonormal bases to unveil dependencies between weighted averages of observations in subsequent blocks. This yields parsimonious model representations that enhance interpretability and improve predictive performance. The model is estimated using OLS and parametric bootstrapping methods in the case of large samples, which is complemented by a basis-specific LASSO step for smaller samples. Both simulated and empirical examples are used to illustrate the model. Joint with Dick van Dijk and Karel de Wit.

21/04/2021 2:00 PMZoomRadu Craiu (University of Toronto)Finding our Way in the Dark: Approximate MCMC for Approximate Bayesian Methods
With larger amounts of data at their disposal, scientists are emboldened to tackle complex questions that require sophisticated statistical models. It is not unusual for the latter to have likelihood functions that elude analytical formulations. Even under such adversity, when one can simulate from the sampling distribution, Bayesian analysis can be conducted using approximate methods such as Approximate Bayesian Computation (ABC) or Bayesian Synthetic Likelihood (BSL). A significant drawback of these methods is that the number of required simulations can be prohibitively large, thus severely limiting their scope. We propose perturbed MCMC samplers that can be used within the ABC and BSL paradigms to significantly accelerate computation while maintaining control on computational efficiency. The proposed strategy relies on recycling samples from the chain’s past. The algorithmic design is supported by a theoretical analysis while practical performance is examined via a series of simulation examples and data analyses. This is joint work with Dr. Evgeny Levi.

14/04/2021 2:00 PMZoomAntonio Lijoi (Bocconi University, Milan)Measuring dependence for Bayesian nonparametric models
The Bayesian approach to inference stands out for naturally allowing borrowing of information across heterogeneous populations or studies. Several popular classes of models in this setting induce a dependence structure on the observations that can be seen as a mixture between the two extreme cases of exchangeability and unconditional independence. As an illustrative example in this direction, a recent proposal based on the Dirichlet process will be described. Such a structure leads one to consider the problem of measuring dependence in terms of the distance of the actual prior specification from the two extremes. The talk will describe a novel approach that relies on the Wasserstein distance and is suitably tailored to random measure based models. An application to some noteworthy models in the literature provides some useful insights.

07/04/2021 3:00 PMZoomJingwei Liang (QMUL)Screening for Sparse Online Learning
Sparsity-promoting regularizers are widely used to impose low-complexity structure (e.g. the l1-norm for sparsity) on the regression coefficients of supervised learning. In the realm of deterministic optimization, the sequence generated by iterative algorithms (such as proximal gradient descent) exhibits "finite activity identification", namely, it can identify the low-complexity structure in a finite number of iterations. However, most online algorithms (such as proximal stochastic gradient descent) do not have this property owing to the vanishing step-size and non-vanishing variance. In this talk, by combining with a screening rule, I will show how to eliminate useless features of the iterates generated by online algorithms, and thereby enforce finite activity identification. One consequence is that when combined with any convergent online algorithm, sparsity properties imposed by the regularizer can be exploited for computational gains. Numerically, significant acceleration can be obtained.
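To illustrate finite activity identification in the deterministic setting mentioned above, here is a small sketch of proximal gradient descent for the lasso. It records the support (the set of non-zero coordinates) at every iteration. The design matrix `A`, response `y`, penalty `lam` and step size are toy values chosen for illustration, and the screening rule from the talk is not implemented.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * |.| (soft-thresholding)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r for a matrix stored as a list of rows."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def proximal_gradient_lasso(A, y, lam, step, n_iter):
    """Minimise 0.5 * ||A x - y||^2 + lam * ||x||_1 by proximal gradient descent."""
    x = [0.0] * len(A[0])
    supports = []
    for _ in range(n_iter):
        residual = [p - q for p, q in zip(matvec(A, x), y)]
        grad = rmatvec(A, residual)
        x = [soft_threshold(xj - step * gj, step * lam) for xj, gj in zip(x, grad)]
        supports.append(tuple(j for j, xj in enumerate(x) if xj != 0.0))
    return x, supports


# Toy problem; the step size 0.25 equals 1 / (largest eigenvalue of A^T A).
A = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [4.0, 0.3, -3.0]
x_hat, supports = proximal_gradient_lasso(A, y, lam=0.5, step=0.25, n_iter=200)
```

In this toy problem the second coordinate is set exactly to zero by the soft-thresholding step and stays there, while the third coordinate keeps moving towards its limit, so the support settles long before the iterates converge.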

31/03/2021 2:00 PMZoomSebastian Schmon (Improbable, UK)Generalized Posteriors for Approximate Bayesian Computation and Simulation-based Inference
Complex simulators have become a ubiquitous tool in many scientific disciplines, providing high-fidelity, implicit probabilistic models of natural and social phenomena. Unfortunately, they typically lack the tractability required for conventional statistical analysis. Approximate Bayesian computation (ABC) has emerged as a key method in simulation-based inference, wherein the true model likelihood and posterior are approximated using samples from the simulator. In this talk, we will first draw connections between ABC and generalized Bayesian inference (GBI) by reinterpreting the accept/reject step in ABC as an implicitly defined error model. Then we argue that these implicit error models will invariably be misspecified.
While ABC posteriors are often treated as a necessary evil for approximating the standard Bayesian posterior, this allows us to reinterpret ABC as a potential robustification strategy. In a second step, we will turn our attention to some recent machine learning approaches to simulation-based inference. While those methods are designed to be exact when the true data generating mechanism is known, we will show that neural density estimators can perform poorly when this assumption is violated. Using our findings on ABC we will argue for a combined machine-learning and statistics approach to obtain a reliable, but highly efficient algorithm for posterior inference in intractable models.

17/03/2021 2:00 PMZoomAnthony Constantinou (QMUL)Bayesian network structure learning with noisy data
Numerous Bayesian network structure learning algorithms have been proposed in the literature over the past few decades. Each algorithm is based on a set of assumptions, such as complete data and causal sufficiency, and tends to be evaluated with synthetic data that conforms to these assumptions, however unrealistic these assumptions may be in the real world. As a result, it is widely accepted that synthetic performance overestimates real performance, although to what degree this may happen remains unknown. This presentation will provide a brief introduction to the two main classes of structure learning, called constraint-based and score-based, and illustrate how different assumptions of data noise influence structure learning performance.

03/03/2021 2:00 PMZoomRoberto Casarin (Ca' Foscari University of Venice, Italy)Bayesian Dynamic Tensor Regression
Tensor-valued data are becoming increasingly available in economics and this calls for suitable econometric tools. We propose a new dynamic linear model for tensor-valued response variables and covariates that encompasses some well-known econometric models as special cases. Our contribution is manifold. First, we define a tensor autoregressive process (ART), study its properties, and derive the associated impulse response function. Second, we exploit the PARAFAC low-rank decomposition to provide a parsimonious parametrization and to incorporate sparsity effects. We also contribute to inference methods for tensors by developing a Bayesian framework which allows for including extra-sample information and for introducing shrinking effects. We apply the ART model to time-varying multilayer networks of international trade and capital stock and study the propagation of shocks across countries, over time and between layers.

24/02/2021 3:00 PMZoomMichele Guindani (University of California, Irvine, United States)A Common Atom Model for the Bayesian Nonparametric Analysis of Nested Data
The use of large datasets for targeted therapeutic interventions requires new ways to characterize the heterogeneity observed across subgroups of a specific population. In particular, models for partially exchangeable data are needed for inference on nested datasets, where the observations are assumed to be organized in different units and some sharing of information is required to learn distinctive features of the units. In this talk, we propose a nested Common Atoms Model (CAM) that is particularly suited for the analysis of nested datasets where the distributions of the units are expected to differ only over a small fraction of the observations sampled from each unit. The proposed CAM allows a two-layered clustering at the distributional and observational level and is amenable to scalable posterior inference through the use of a computationally efficient nested slice sampler algorithm. We further discuss how to extend the proposed modeling framework to handle discrete measurements, and we conduct posterior inference on a real microbiome dataset from a diet swap study to investigate how the alterations in intestinal microbiota composition are associated with different eating habits. If time allows, we will also discuss an application to the analysis of time series calcium imaging experiments in awake behaving animals. We further investigate the performance of our model in capturing true distributional structures in the population by means of simulation studies.

17/02/2021 2:00 PMZoomMarcelo Pereyra (Heriot-Watt University)Bayesian inference with data-driven image priors encoded by neural networks
This talk presents a mathematical and computational methodology for performing Bayesian inference in problems where prior knowledge is available in the form of a training dataset or set of training examples. This prior information is encoded into the model by using a deep neural network, which is combined with an explicit likelihood function by using Bayes' theorem to derive the posterior distribution for the quantities of interest given the available data. Bayesian computation is then performed by using appropriate Markov chain Monte Carlo stochastic algorithms. We study the properties of the proposed models and computation algorithms and illustrate performance on a range of inverse problems related to imaging sciences, where they are used to perform Bayesian point estimation, uncertainty quantification, hypothesis testing, and model misspecification diagnosis.
Based on joint work with Matthew Holden and Kostas Zygalakis.

12/02/2021 10:00 AMZoomClara Grazian (University of New South Wales, Australia)The importance of being conservative: Bayesian analysis for mixture models
From a Bayesian perspective, mixture models have been characterised by restrictive prior modelling, since their ill-defined nature makes most improper priors unacceptable. In particular, recent results have shown the inconsistency of the posterior distribution on the number of components when using standard nonparametric prior processes.
We propose an analysis of prior choices according to their conservativeness in the number of components. Among the proposals, we derive a prior distribution on the number of clusters which considers the loss one would incur if the true value representing the number of components were not considered. The prior has an elegant and easy-to-implement structure, which allows one to naturally include any prior information one may have as well as to opt for a default solution in cases where this information is not available.
The methods are then applied to two real datasets. The first dataset consists of retrieval times for monitoring IP packets in computer network systems. The second dataset consists of measures registered in antimicrobial susceptibility tests for 14 compounds used in the treatment of M. tuberculosis. In both situations, the number of clusters is uncertain and different solutions lead to different interpretations.

03/02/2021 2:00 PMZoomDavide Ferrari (Free University of Bozen-Bolzano, Italy)Model selection by sparse composition of estimating equations
This talk introduces a method for selecting high-dimensional models based on a truncation mechanism to generate sparse estimating equations. Given a set of low-dimensional estimating equations for the model parameters, a high-dimensional model is selected by minimizing the distance between a composite estimating equation and the full likelihood scores subject to an L1-type penalty. The proposed strategy reduces the overall model complexity by dropping the noisy terms in the estimating equations. Unlike other approaches to model selection, our penalty involves the inclusion of low-dimensional equations rather than model parameters; this implies that consistency of the final parameter estimates is unaffected by the selection mechanism. Numerical and statistical efficiency of the new methodology is illustrated through examples on simulated and real data.

25/11/2020 2:00 PMZoomYanbei Chen (Computer Vision Group, QMUL)Image Search with Text Feedback by Visiolinguistic Attention Learning
Image search with text feedback has promising impacts in various real-world applications, such as e-commerce and internet search. Given a reference image and text feedback from the user, the goal is to retrieve images that not only resemble the input image, but also change certain aspects in accordance with the given text. This is a challenging task as it requires the synergistic understanding of both image and text. In this work, we tackle this task by a novel Visiolinguistic Attention Learning (VAL) framework. Specifically, we propose a composite transformer that can be seamlessly plugged into a CNN to selectively preserve and transform the visual features conditioned on language semantics. By inserting multiple composite transformers at varying depths, VAL is incentivised to encapsulate the multi-granular visiolinguistic information, thus yielding an expressive representation for effective image search. We conduct comprehensive evaluation on three datasets: Fashion200k, Shoes and FashionIQ. Extensive experiments show our model exceeds existing approaches on all datasets, demonstrating consistent superiority in coping with various forms of text feedback, including attribute-like and natural language descriptions.
This work was presented at CVPR 2020.

09/12/2020 2:00 PMZoomJamie Griffin (Statistics and Data Science Group, QMUL)Estimates of the severity of coronavirus disease 2019: a model-based analysis
In the face of rapidly changing data, a range of case fatality ratio estimates for coronavirus disease 2019 (COVID-19) have been produced that differ substantially in magnitude. We aimed to provide robust estimates, accounting for censoring and ascertainment biases. These early estimates give an indication of the fatality ratio across the spectrum of COVID-19 disease and show a strong age gradient in risk of death.
Lancet paper here.

11/11/2020 2:00 PMZoomLuca Rossini (Statistics and Data Science Group, QMUL)Proper Scoring rules for evaluating asymmetry in density forecasting
In this talk, we propose a novel asymmetric continuous probabilistic score (ACPS) for evaluating and comparing density forecasts. It extends the proposed score and defines a weighted version, which emphasizes regions of interest, such as the tails or the center of a variable's range. A test is also introduced to statistically compare the predictive ability of different forecasts. The ACPS is of general use in any situation where the decision maker has asymmetric preferences in the evaluation of the forecasts. In an artificial experiment, the implications of varying the level of asymmetry in the ACPS are illustrated. Then, the proposed score and test are applied to assess and compare density forecasts of macroeconomically relevant datasets (US employment growth) and of commodity prices (oil and electricity prices) with particular focus on the recent COVID-19 crisis period.
This is joint work with Matteo Iacopini and Francesco Ravazzolo. Link to the paper here.

18/11/2020 2:00 PMZoomAyon Mukherjee (Principal Biostatistician, Clinipace Berlin)Covariate-Adjusted Response-Adaptive Designs for Weibull distributed Survival Responses
Covariate-adjusted response-adaptive (CARA) designs use available responses to skew the treatment allocation in an ongoing clinical trial in favour of the treatment arm found at an interim stage to be best for a patient’s covariate profile.
There has recently been extensive research on CARA designs mainly involving binary responses. Though exponential survival responses have also been considered, the constant hazard property of the exponential model makes the mean residual life for patients constant, making it too restrictive for wide-ranging applicability. To overcome this limitation, designs are developed for Weibull distributed survival responses by deriving two variants of optimal designs based on an optimality criterion.
The optimal designs are based on the covariate-adjusted doubly-adaptive biased coin design (CADBCD) in one case, and the covariate-adjusted efficient randomised adaptive design (CAERADE) in the other. The observed treatment allocation proportions for these designs converge to the expected targeted values, which are derived based on constrained optimization problems. The existing large sample theory for CARA designs relies on a Taylor expansion of the allocation probability function, which does not apply to the CAERADE, as it is a discrete and discontinuous function. To overcome this difficulty of discontinuity, and to establish the asymptotic properties of the CAERADE, a stopping time of a martingale process has been introduced. A comparative analysis of these two optimal designs is also discussed. Given the treatment allocation history, response histories, previous covariate information and the covariate profile of the incoming patient, an expression for the conditional probability of a patient being allocated to a particular treatment has been obtained. To apply such designs, the treatment allocation probabilities are sequentially modified based on the history of previous patients’ treatment assignments, responses, covariates and the covariates of the new patient.
For a Phase III clinical trial, the CAERADE is preferable to the CADBCD when the main objective is to minimise the asymptotic variance of the allocation procedure. However, the former procedure, being discrete, tends to be slower in converging towards the expected target allocation proportion. Since the CAERADE provides a design with minimum variance, it is better than the CADBCD as far as the power of the Wald test for testing treatment differences is concerned. An extensive simulation study of the operating characteristics of the proposed designs supports these findings. It is concluded that the proposed CARA procedures can be suitable alternatives to the traditional balanced randomization designs in survival trials, provided that response data are available during the recruitment phase to enable adaptations to the designs. The findings are illustrated extensively by redesigning an existing clinical trial for treating colorectal cancer.
Keywords: Censored Responses; Optimum allocation; Power; Variability; Covariate Profile

21/10/2020 2:00 PMZoomChris Bamford (Game AI, QMUL)Neural Game Engine: Accurate learning of generalizable forward models from pixels
Access to a fast and easily copied forward model of a game is essential for model-based reinforcement learning and for algorithms such as Monte Carlo tree search, and is also beneficial as a source of unlimited experience data for model-free algorithms. Learning forward models is an interesting and important challenge in order to address problems where a model is not available. Building upon previous work on the Neural GPU, this talk introduces the Neural Game Engine, as a way to learn models directly from pixels. The learned models are able to generalise to game levels of different sizes from the ones they were trained on without loss of accuracy. Results on deterministic General Video Game AI games demonstrate competitive performance, with many of the game models being learned perfectly both in terms of pixel predictions and reward predictions. The pre-trained models are available through the OpenAI Gym interface here: https://github.com/Bam4d/NeuralGameEngine.

25/03/2020 12:00 PMMathematical Sciences Building, Room MB503Dr. Maria Kalli (University of Kent)Cancelled
Cancelled because of coronavirus.

11/03/2020 12:00 PMMathematical Sciences Building, Room MB503Dr. Kalliopi Mylona (King's College London)Cancelled
Cancelled because of coronavirus.

26/02/2020 12:00 PMMathematical Sciences Building, Room MB503Dr. Yunxiao Chen (London School of Economics)Statistical Analysis of Item Preknowledge in Educational Tests: Latent Variable Modelling and Statistical Decision Theory
Tests are a building block of our modern education system. Many tests are high-stakes, such as admission, licensing, and certification tests, that can significantly change one’s life trajectory. For this reason, ensuring fairness in educational tests is becoming an increasingly important problem. This paper concerns the issue of item preknowledge in educational tests due to item leakage. That is, a proportion of test takers have access to leaked items before a test is administered, which leads to inflated performance on the set of leaked items. We develop methods for the simultaneous detection of cheating test takers and compromised items based on data from a single test administration, when both sets are completely unknown. Latent variable models are proposed for the modelling of (1) data consisting only of item-level binary scores and (2) data consisting of both item-level binary scores and response time, where the former is commonly available in paper-and-pencil tests and the latter is widely encountered in computer-based tests. The proposed model adds a latent class model component upon a factor model (also known as item response theory model) component, where the factor model component captures item response behaviour driven by test takers’ ability and the latent class model component captures item response behaviour due to item preknowledge. We further propose a statistical decision framework, under which compound decision rules are developed that control local false discovery/non-discovery rates. Statistical inference is carried out under a Bayesian framework. The proposed method is applied to data from a computer-based non-adaptive licensure assessment.
This is joint work with Prof. Irini Moustaki and Ms. Yan Lu (PhD student).

19/02/2020 12:00 PMMathematical Sciences Building, Room MB503Dr. Kolyan Ray (Imperial College)Semiparametric Bayesian causal inference using Gaussian process priors
We investigate semiparametric Bayesian inference for average treatment effects based on observational data, which is a challenging problem due to the missing counterfactuals and selection bias. This model has applications in biostatistics and causal inference.
We show that standard Gaussian process priors satisfy a semiparametric Bernstein-von Mises theorem under sufficient smoothness conditions, thereby showing that the posterior can yield optimal inference. We further propose a novel propensity score-based prior modification that corrects for the first-order posterior bias. Numerical simulations confirm significant improvement in both estimation accuracy and uncertainty quantification compared to using an unmodified Gaussian process.

12/02/2020 12:00 PMMathematical Sciences Building, Room: MB503Dr Georgios Papageorgiou (Birkbeck)Bayesian semiparametric analysis of multivariate continuous responses, with variable selection
We present an approach to Bayesian semiparametric inference for Gaussian multivariate response regression. We are motivated by various small and medium dimensional problems from the physical and social sciences. The statistical challenges revolve around dealing with the unknown mean and variance functions and, in particular, the correlation matrix. To tackle these problems, we have developed priors over the smooth functions and a Markov chain Monte Carlo algorithm for inference and model selection. Specifically, Dirichlet process mixtures of Gaussian distributions are used as the basis for a cluster-inducing prior over the elements of the correlation matrix. The smooth, multidimensional means and variances are represented using radial basis function expansions. The complexity of the model, in terms of variable selection and smoothness, is then controlled by spike-slab priors. A simulation study is presented, demonstrating performance as the response dimension increases. Finally, the model is fit to a number of real world datasets.

11/12/2019 12:00 PMMathematical Sciences Building, Room: MB503Dr Monica Pirani (Imperial College London)A data integration approach to adjust for residual confounding in area-referenced environmental health studies
Study designs where data have been aggregated by geographical areas are popular in environmental epidemiology. These studies are commonly based on administrative databases and, providing a complete spatial coverage, are particularly appealing to make inference on the entire population. However, the resulting estimates are often biased and difficult to interpret due to unmeasured confounders, which typically are not available from routinely collected data. We propose a framework to improve inference drawn from such studies exploiting information derived from individual-level survey data. The latter are summarized in an area-level scalar score by mimicking at ecological level the well-known propensity score methodology. The literature on propensity score for confounding adjustment is mainly based on individual-level studies and assumes a binary exposure variable. Here we generalize its use to cope with area-referenced studies characterized by a continuous exposure. Our approach is based upon Bayesian hierarchical structures specified into a two-stage design: (i) geolocated individual-level data from survey samples are up-scaled at ecological level, then the latter are used to estimate a generalized ecological propensity score (EPS) in the in-sample areas; (ii) the generalized EPS is imputed in the out-of-sample areas under different assumptions about the missingness mechanisms, then it is included into the ecological regression, linking the exposure of interest to the health outcome. This delivers area-level risk estimates which allow a fuller adjustment for confounding than traditional areal studies. The methodology is illustrated by using simulations and a case study investigating the risk of lung cancer mortality associated with nitrogen dioxide in England (UK).

04/12/2019 12:00 PMMathematical Sciences Building, Room: MB503Dr Chris Fallaize (University of Nottingham)Unlabelled Shape Analysis with Applications in Bioinformatics
In shape analysis, objects are often represented as configurations of points, known as landmarks. The case where the correspondence between landmarks on different objects is unknown is called unlabelled shape analysis. The alignment task is then to simultaneously identify the correspondence between landmarks and the transformation aligning the objects.
In this talk, I will discuss the alignment of unlabelled shapes, and discuss two applications to problems in structural bioinformatics. The first is a problem in drug discovery, where the main objective is to find the shape information common to all, or subsets of, a set of active compounds. The approach taken resembles a form of clustering, which also gives estimates of the mean shapes of each cluster. The second application is the alignment of protein structures, which will also serve to illustrate how the modelling framework can incorporate very general information regarding the properties we would like alignments to have; in this case, expressed through the sequence order of the points (amino acids) of the proteins. 

15/01/2020 12:00 PMMathematical Sciences Building, Room: MB503Dr Claudia Neves (University of Reading)On trend estimation and testing with application to extreme rainfall
Extreme Value Theory provides a rigorous mathematical justification for being able to extrapolate outside the range of the sampled observations. The primary assumption is that the observations are independent and identically distributed. Although the celebrated extreme value theorem still holds under several forms of weak dependence, relaxing the stationarity assumption, for example by considering a trend in extremes, leads to a challenging problem of inference based around the frequency of extreme events. Some studies advocate that the climate crisis is not so much about startling magnitudes of extreme phenomena but rather how the frequency of extreme events can contribute to the worst case scenarios that could play out on the planet. For instance, the average rainfall may not be changing much, but heavy rainfall may become significantly more or less frequent, meaning that different observations must be endowed with different aspects in their underlying distributions. In this talk, I will present statistical tools for the semiparametric modelling of the evolution of extreme values over time and/or space by considering a trend on the frequency of exceedances above a high (random) threshold. The methodology is illustrated with an application to daily rainfall data from several gauging stations across Germany and The Netherlands.

16/10/2019 12:00 PMMathematical Sciences Building, Room: MB503Dr Hugo Maruri-Aguilar (QMUL)Lasso for hierarchical polynomial models
In hierarchical polynomial regression, an interaction term such as x1x2 is included in the model only if both main effects x1 and x2 are also included in the model. We note that the divisibility conditions implicit in polynomial hierarchy give way to natural constraints for the model parameters. Our work uses this idea to derive versions of strong and weak hierarchy and to extend existing work in the literature, which at the moment is only concerned with models of degree two. We discuss how to estimate parameters in lasso using standard quadratic programming techniques and apply our proposal to some examples. This is joint work with S. Lunagomez (Lancaster University).
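The hierarchy condition described above can be checked mechanically. The sketch below encodes each monomial as a tuple of exponents and tests whether a set of terms respects hierarchy. Extending the x1x2 example to higher degrees through "immediate divisors" (lower one positive exponent by one) is an assumption made for illustration, as is the "weak" variant that requires only one immediate divisor; the function names are hypothetical and not taken from the talk.

```python
def immediate_divisors(monomial):
    """Monomials obtained by lowering exactly one positive exponent by one."""
    return [
        monomial[:i] + (e - 1,) + monomial[i + 1:]
        for i, e in enumerate(monomial)
        if e > 0
    ]


def satisfies_hierarchy(model, strong=True):
    """Check a set of exponent tuples against strong or weak hierarchy.

    Strong: every immediate divisor of each term of degree >= 2 is in the model.
    Weak: at least one immediate divisor of each such term is in the model.
    """
    terms = set(model)
    for monomial in terms:
        if sum(monomial) <= 1:
            continue  # the constant and main effects need no parent terms
        present = [d in terms for d in immediate_divisors(monomial)]
        if strong and not all(present):
            return False
        if not strong and not any(present):
            return False
    return True


x1, x2, x1x2 = (1, 0), (0, 1), (1, 1)
full_model = {x1, x2, x1x2}      # both main effects plus the interaction
partial_model = {x1, x1x2}       # interaction with only one main effect
```

With this encoding, `full_model` passes the strong check, while `partial_model` fails it and passes only the weak check.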

20/03/2019 12:00 PMQueens' Building, Room: W316Maria Lomeli (Babylon Health)Amortised inference using faithful inverses for importance sampling
Automated decision-making for medical diagnosis consists of producing differentials for various diseases based on evidence about the state of the patient. A particular way to encode the various relationships between symptoms, risk-factors, and diseases is by using a Bayesian network, where the edge structure reflects the underlying causal mechanisms between the nodes. Due to the combinatorial explosion of computing posterior distributions exactly, various approximate inference schemes have been proposed to tackle this problem, such as variational inference and importance sampling, among others. In addition, amortisation techniques allow us to reduce the cost of inference by carrying out and storing some computations offline. In the medical-diagnosis task, producing highly-accurate marginals is key to differential diagnosis. Importance sampling is particularly suited for this, as it is asymptotically exact and a good choice of proposal can provide a reduction in variance. In this talk, I will discuss how we can construct various data-driven proposals by using an inverse factorisation of the model’s joint distribution. The proposal distributions are based on a neural network that is trained with samples from the generative model before inference takes place, whereas the inverse factorisation provides the sampling schedule for the importance sampling scheme. We explored the impact of different inverse factorisations in terms of variance reduction. Our findings reveal that the new scheme produces competitive data-driven proposals for importance sampling.
This is joint work with Divya Gautam, Kostis Gourgoulias, Saurabh Johri and Maneesh Sahani.
Short bio:
Maria Lomeli is currently a research scientist at Babylon Health, UK. Previously, she was a research associate at the Machine Learning group, University of Cambridge, working with Zoubin Ghahramani. She obtained her PhD from the Gatsby Unit, UCL under the supervision of Yee Whye Teh.

27/03/2019 12:00 PMQueens' Building, Room: W316Shaoxiong Hu (Queen Mary University of London)The topological criteria for statistical model selection
The LASSO has recently attracted attention in the context of models with hierarchy restrictions. In these models, an interaction term is allowed only if both main effects are active (strong hierarchy) or if at least one main effect is active (weak hierarchy). For example, under strong hierarchy appearance of the term x1x2 in a model requires both x1 and x2, while under weak hierarchy at least one of x1, x2 is needed. Our work is motivated by possible higher-order interactions in linear regression models. We are concerned with enhancing the performance of LASSO for square-free hierarchical polynomial models when combining validation error with a measure of model complexity. The measure of the complexity is the sum of Betti numbers of the model, seen as a simplicial complex. We represent the polynomial regression model in terms of components and cycles, borrowing from recent developments in computational topology. We use LASSO as our model selection method combined with Betti numbers. We study and propose an algorithm which combines statistical and algebraic criteria. This compound criterion would allow us to deal with model selection problems of higher-order interactions in polynomial regression models.

06/02/2019 12:00 PMQueens' Building, Room: W316Fengnan Gao (Fudan University and Shanghai Center for Mathematical Sciences)Maximum likelihood estimation of Sublinear Preferential Attachment Models and its connection to urn models
The preferential attachment (PA) network is a popular way of modeling social networks, collaboration networks, etc. The PA network model is an evolving network model where new nodes keep coming in. When a new node comes in, it establishes only one connection with an existing node. The random choice of the existing node is via a multinomial distribution with probability weights based on a preferential function f on the degrees. f maps the natural numbers to the positive real line and is assumed a priori non-decreasing, which means the nodes with high degrees are more likely to get new connections, i.e. "the rich get richer". Under sublinear parametric assumptions on the PA function, we propose the maximum likelihood estimator of f. We show that the MLE yields optimal performance with the asymptotic normality results. Despite the optimal property of the MLE, it depends on the history of the network evolution, which is often difficult to obtain in practice. To avoid such shortcomings of the MLE, we propose the quasi maximum likelihood estimator (QMLE), a history-free remedy of the MLE. To prove the asymptotic normality of the QMLE, a connection between the PA model and Svante Janson's urn models is exploited.
This is (partially) joint work with Aad van der Vaart. 
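The network growth mechanism described in this abstract can be sketched as a short simulation. This is an illustration only: the power-law form f(k) = k^alpha with 0 < alpha < 1 is an assumed example of a sublinear preferential function, and the function name and starting graph (two nodes joined by one edge) are hypothetical choices.

```python
import random

def simulate_pa_tree(n_nodes, alpha=0.5, seed=0):
    """Grow a preferential attachment tree with f(k) = k ** alpha (assumed form)."""
    rng = random.Random(seed)
    degrees = [1, 1]          # start from two nodes joined by one edge
    edges = [(0, 1)]
    for new in range(2, n_nodes):
        # Attachment probabilities are proportional to f(degree).
        weights = [d ** alpha for d in degrees]
        target = rng.choices(range(len(degrees)), weights=weights, k=1)[0]
        degrees[target] += 1
        degrees.append(1)     # the new node arrives with a single connection
        edges.append((new, target))
    return degrees, edges
```

The list `edges` records the full history of the evolution, which is the information the MLE requires; the final `degrees` alone correspond to the history-free setting.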
06/03/2019 12:00 PMQueens' Building, Room: W316Zahra Abdulla (King's College)How to bring the fun back to Statistics teaching: Inclusive practices to combat statistical anxiety
One of the major challenges for teachers of Statistics to non-statisticians is the high level of statistical anxiety amongst students, together with students’ perceptions of their past experience of learning statistics or mathematics and the potential negative impact of these attitudes or beliefs on how they learn statistics.
This talk will aim to showcase how to use different types of inclusive practice activities and assessment methods, constructively aligned with the learning outcomes, to help develop students’ confidence in the classroom by providing a supportive learning environment that works through building trust, setting expectations and making statistics fun.

13/02/2019 12:00 PMQueens' Building, Room: W316Yi Yu (University of Bristol)Univariate Mean Change Point Detection: Penalization, CUSUM and Optimality
The problem of univariate mean change point detection and localization based on a sequence of n independent observations with piecewise constant means has been intensively studied for more than half a century, and serves as a blueprint for change point problems in more complex settings. We provide a complete characterization of this classical problem in a general framework in which the upper bound on the noise variance sigma^2, the minimal spacing Delta between two consecutive change points and the minimal magnitude of the changes kappa are allowed to vary with n. We first show that consistent localization of the change points is impossible when the signal-to-noise ratio kappa × sqrt(Delta) / sigma is uniformly bounded from above. In contrast, when kappa × sqrt(Delta) / sigma diverges in n at an arbitrarily slow rate, we demonstrate that two computationally efficient change point estimators, one based on the solution to an L0-penalized least squares problem and the other on the popular WBS algorithm, are both consistent and achieve a localization rate of the order log(n) × (sigma / kappa)^2. We further show that this rate is minimax optimal, up to a log(n) term.
Preprint arXiv
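As background to the CUSUM-based procedures mentioned in this abstract, the classical CUSUM statistic for a single mean change can be sketched as follows. This is the textbook single-change-point version, not the L0-penalized or WBS estimators studied in the talk, and the function names are hypothetical.

```python
import math

def cusum_statistics(x):
    """CUSUM statistic at each candidate split t = 1, ..., n-1.

    C(t) = sqrt(t * (n - t) / n) * |mean(x[:t]) - mean(x[t:])|
    """
    n = len(x)
    total = sum(x)
    left = 0.0
    stats = []
    for t in range(1, n):
        left += x[t - 1]
        mean_left = left / t
        mean_right = (total - left) / (n - t)
        stats.append(math.sqrt(t * (n - t) / n) * abs(mean_left - mean_right))
    return stats

def estimate_single_change_point(x):
    """Return the split t maximising the CUSUM statistic, and its value."""
    stats = cusum_statistics(x)
    t_hat = max(range(len(stats)), key=stats.__getitem__) + 1
    return t_hat, stats[t_hat - 1]
```

Here `t_hat` is the number of observations before the estimated change, so on a noiseless sequence of 50 zeros followed by 50 ones the maximiser is t = 50.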

10/12/2018 12:00 PMQueens' Building, Room: W316Luciana Dalla Valle, University of PlymouthAnalysis of Twin Data via Bayesian Nonparametric Conditional Copula
Several studies on heritability in twins aim at understanding the different contributions of environmental and genetic factors to specific traits. Considering the National Merit Twin Study, our purpose is to analyse correctly the influence of socioeconomic status on the relationship between twins’ cognitive abilities. Our methodology is based on conditional copulas, which enable us to model the effect of a covariate driving the strength of dependence between the main variables. We propose a flexible Bayesian nonparametric approach for the estimation of conditional copulas, which can model any conditional copula density. Our methodology extends the work of Wu, Wang and Walker in 2015 by introducing dependence on a covariate in an infinite mixture model. Our results suggest that environmental factors are more influential in families with lower socioeconomic position.

12/11/2018 12:00 PMQueens' Building, Room: W316Tom Berrett, University of CambridgeNonparametric independence testing via mutual information
In this talk I will discuss recent work on the problem of testing the independence of two multivariate random vectors, given a sample from the underlying population. Classical measures of dependence such as Pearson correlation or Kendall’s tau are often found to not capture the complex dependence between variables in modern datasets, and in recent years a large literature has developed on defining appropriate nonparametric measures of dependence and associated tests. We take the information-theoretic quantity mutual information as our starting point, and define a new test, which we call MINT, based on the estimation of this quantity, whose decomposition into joint and marginal entropies facilitates the use of recently developed efficient entropy estimators derived from nearest neighbour distances. The proposed critical values of our test, which may be obtained by simulation in the case where an approximation to one marginal is available or by permuting the data otherwise, facilitate size guarantees, and we provide local power analyses, uniformly over classes of densities whose mutual information satisfies a lower bound. Our ideas may be extended to provide new goodness-of-fit tests of normal linear models based on assessing the independence of our vector of covariates and an appropriately defined notion of an error vector. The theory is supported by numerical studies on both simulated and real data.
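For reference, the decomposition of mutual information into joint and marginal entropies mentioned in this abstract is the standard identity below, written for random vectors X and Y with joint density f and marginal densities f_X and f_Y (this notation is an assumption, not taken from the talk):

```latex
I(X;Y) = \int f(x,y)\,\log\frac{f(x,y)}{f_X(x)\,f_Y(y)}\,dx\,dy
       = H(X) + H(Y) - H(X,Y),
\qquad
H(X) = -\int f_X(x)\,\log f_X(x)\,dx .
```

Since I(X;Y) = 0 exactly when X and Y are independent, plugging an entropy estimator into each of the three terms gives a natural test statistic.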

29/10/2018 12:00 PMQueens' Building: Room W316Luke Kelly, University of OxfordLateral trait transfer in phylogenetic inference
We are interested in inferring the phylogeny, or shared ancestry, of a set of taxa descended from a common ancestor. Lateral trait transfer is a form of reticulate evolutionary activity whereby species exchange evolutionary traits outside of ancestral relationships. The resulting trait histories are mosaics of the underlying species tree. To address this frequent source of model misspecification, we propose a novel model for species diversification which explicitly controls for the effect of lateral transfer. The parameters of our likelihood are the solution of a sequence of differential equations over a phylogeny and the computational cost of this calculation is exponential in the number of taxa. We exploit symmetries in the differential systems and techniques from numerical analysis to build an efficient approximation scheme to reduce the computational cost of inference by an order of magnitude while remaining exact in an MCMC sense. We illustrate our method on a data set of lexical traits in Eastern Polynesian languages and demonstrate a significantly improved fit over the corresponding method which ignores lateral transfer. (This is joint work with Geoff Nicholls.)

26/11/2018 12:00 PMQueens' Building, Room: W316Emily Lines, QMUL School of GeographyRevealing Hidden Juvenile Tree Dynamics from Count Data Using Approximate Bayesian Computation
The juvenile life stage is a crucial determinant of forest dynamics and a first indicator of changes to species’ ranges under climate change. However, paucity of detailed remeasurement data of seedlings, saplings and small trees means that their demography is not well understood at large scales. In this study we quantify the effects of climate and density dependence on recruitment and juvenile growth and mortality rates of thirteen species measured in the Spanish Forest Inventory. Single-census sapling count data is used to constrain demographic parameters of a simple forest juvenile dynamics model using a likelihood-free parameterisation method, Approximate Bayesian Computation. Our results highlight marked differences between species, and the important role of climate and stand structure, in controlling juvenile dynamics. Recruitment had a hump-shaped relationship with conspecific density, and for most species conspecific competition had a stronger negative effect than heterospecific competition. Recruitment and mortality rates were positively correlated, and Mediterranean species showed on average higher mortality and lower growth rates than temperate species. Under climate change our model predicted declines in recruitment rates for almost all species. Defensible predictive models of forest dynamics should include realistic representation of critical early life-stage processes and our approach demonstrates that existing coarse count data can be used to parameterise such models. Approximate Bayesian Computation approaches have potentially wide ecological application, in particular to unlock information about past processes from current observations.

14/12/2017 4:00 PMQueens' W316H. Maruri-Aguilar, QMULSmoothing the logistic model
Smooth supersaturated polynomials have been used for building emulators in computer experiments. The response surfaces built with this method are simple to interpret and have spline-like properties (Bates et al., 2014). We extend the methodology to build smooth logistic regression models. The approach we follow is to regularize the likelihood with a penalization term that accounts for the roughness of the regression model.
The response surface follows data closely yet it is smooth and does not oscillate. We illustrate the method with simulated data and we also present a recent application to build a prediction rule for psychiatric hospital readmissions of patients with a diagnosis of psychosis. This application uses data from the OCTET clinical trial (Burns et al., 2013).

22/06/2018 12:00 PMLG7, G O Jones BuildingR. C. Weng, National Chengchi UniversityOnline Bayesian inference for latent ability models
Latent ability models relate a set of observed variables to a set of latent ability variables. They include the paired and multiple comparison models, the item response theory models, etc. In this talk, I will first present an online approximate Bayesian method for online gaming analysis using paired and multiple comparison models. Experiments on game data show that the accuracy of the proposed online algorithm is competitive with state-of-the-art systems such as TrueSkill. Second, an efficient algorithm is proposed for Bayesian parameter estimation for item response theory models. Experiments show that the algorithm works well for real Internet ratings data. The proposed method is based on the Woodroofe-Stein identity.

24/05/2018 12:15 PMW316, Queens' BuildingV. Vinciotti, Brunel University LondonIdentifying overlapping terrorist cells from the Noordin Top actor-event network
Actor-event data are common in sociological settings, whereby one registers the pattern of attendance of a group of social actors at a number of events. We focus on 79 members of the Noordin Top terrorist network, who were monitored attending 45 events. The attendance or non-attendance of the terrorists at events defines the social fabric, such as group coherence and social communities. The aim of the analysis of such data is to learn about this social structure. Actor-event data is often transformed to actor-actor data in order to be further analysed by network models, such as stochastic block models. This transformation and such analyses lead to a natural loss of information, particularly when one is interested in identifying, possibly overlapping, subgroups or communities of actors on the basis of their attendance at events. In this paper we propose an actor-event model for overlapping communities of terrorists, which simplifies interpretation of the network. We propose a mixture model with overlapping clusters for the analysis of the binary actor-event network data, called manet, and develop a Bayesian procedure for inference. After a simulation study, we show how this analysis of the terrorist network has clear interpretative advantages over the more traditional approaches of network analysis.

30/03/2017 4:30 PMQueens' W316E. Saenz de Cabezon, University of La RiojaDerivative talking while communicating maths
Derivatives can be presented in a nice, simple and intuitive way that everyone can relate to. In my talk I will not just concentrate on the specific topic of communicating derivatives to the general public but will give other examples, partly taken from the project "The Big Van Theory" that uses comedy as a vehicle to bring science to the general public. For further context, see the pages www.bigvanscience.com/index_en.html and www.youtube.com/channel/UCHZ8ya93m7_RD02WsCSZYA.

06/04/2017 4:30 PMQueens' W316A. Steland, RWTH Aachen UniversityLarge sample approximations and change-point procedures for quadratic forms of covariance matrices of high-dimensional t
New results about large sample approximations for statistical inference and change point analysis of high dimensional vector time series are presented. The results deal with related procedures that can be based on an increasing number of bilinear forms of the sample variance-covariance matrix as arising, for instance, when studying change-in-variance problems for projection statistics and shrinkage covariance matrix estimation.
Contrary to many known results, e.g. from random matrix theory, the results hold true without any constraint on the dimension, the sample size or their ratio, provided the weighting vectors are uniformly l1-bounded. Those results are in terms of (strong resp. weak) approximations by Gaussian processes for partial sum and CUSUM type processes, which imply (functional) central limit theorems under certain conditions. It turns out that the approximations by Gaussian processes hold not only without any constraint on the dimension, the sample size or their ratios, but even without any such constraint with respect to the number of bilinear forms. For the unknown variances and covariances of these bilinear forms nonparametric estimators are proposed and shown to be uniformly consistent.
We present related change-point procedures for the variance of projection statistics as naturally arising in principal component analyses and dictionary learning, amongst others. Further, we discuss how the theoretical results lead to novel distributional approximations and sequential methods for shrinkage covariance matrix estimators in the spirit of Ledoit and Wolf.
This is joint work with Rainer v. Sachs, UC Louvain, Belgium. The work of Ansgar Steland was supported by a grant from Deutsche Forschungsgemeinschaft (DFG), grant STE 1034/111.

16/03/2017 4:30 PMQueens' W316S. Liverani, Brunel University LondonModelling highly collinear spatial data
I will present a statistical approach to distinguish and interpret the complex relationship between several predictors and a response variable at the small area level, in the presence of i) high correlation between the predictors and ii) spatial correlation for the response. Covariates which are highly correlated create collinearity problems when used in a standard multiple regression model. Many methods have been proposed in the literature to address this issue. A very common approach is to create an index which aggregates all the highly correlated variables of interest. For example, it is well known that there is a relationship between social deprivation measured through the Index of Multiple Deprivation (IMD) and air pollution; this index is then used as a confounder in assessing the effect of air pollution on health outcomes (e.g. respiratory hospital admissions or mortality). However, it would be more informative to look specifically at each domain of the IMD and at its relationship with air pollution to better understand its role as a confounder in the epidemiological analyses. In this paper we illustrate how the complex relationships between the domains of IMD and air pollution can be deconstructed and analysed using profile regression, a Bayesian nonparametric model for clustering responses and covariates simultaneously. Moreover, we include an intrinsic spatial conditional autoregressive (ICAR) term to account for the spatial correlation of the response variable.

02/03/2017 4:30 PMQueens' W316G. Hughes, Freelance Data ScientistLitics - an application of data visualisation
I will discuss the power of analytics in a political landscape. How can we revolutionise and communicate politics using analytics and data visualisation?
I will cover the shortcomings of our current interaction with politics and move on to discuss the power that analytics and visuals could hold if presented well.

07/12/2017 4:00 PMQueens' W316D. S. Robertson, University of CambridgeStatistical inference in response-adaptive trials
Clinical trials typically randomise patients to the different treatment arms using a fixed randomisation scheme, such as equal randomisation. However, such schemes mean that a large number of patients will continue to be allocated to inferior treatments throughout the trial. To address this ethical issue, response-adaptive randomisation schemes have been proposed, which update the randomisation probabilities using the accumulating response data so that more patients are allocated to treatments that are performing well.
A long-standing barrier to using response-adaptive trials in practice, particularly from a regulatory viewpoint, is concern over bias and type I error inflation. In this talk, I will describe recent methodological advances that aim to address both of these concerns. First, I give a summary of a paper by Bowden and Trippa (2017) on unbiased estimation for response-adaptive trials. The authors derive a simple expression for the bias of the usual maximum likelihood estimator, and propose three procedures for bias-adjusted estimation.
I then present recent work on adaptive testing procedures that ensure strong familywise error control. The approach can be used for both fully sequential and block randomised trials, and for general adaptive randomisation rules. We show there can be a high price to pay in terms of power to achieve familywise error control for randomisation schemes with extreme allocation probabilities. However, for proposed Bayesian adaptive randomisation schemes in the literature, our adaptive tests maintain or increase the power of the trial.

26/10/2017 4:00 PMQueens' W316J. K. Rogers, University of OxfordAnalysis of recurrent events in the presence of dependent censoring
Heart failure is characterised by recurrent hospitalisations and yet often only the first is considered in clinical trial reports. In chronic diseases, such as heart failure, analysing all such hospitalisations gives a more complete picture of treatment benefit.
An increase in heart failure hospitalisations is associated with a worsening condition, meaning that a comparison of heart failure hospitalisation rates between treatment groups can be confounded by the competing risk of death. Any analyses of recurrent events must take into consideration informative censoring that may be present. The Ghosh and Lin (2002) nonparametric analysis of heart failure hospitalisations takes mortality into account whilst also adjusting for different follow-up times and multiple hospitalisations per patient. Another option is to treat the incidence of cardiovascular death as an additional event in the recurrent event process and then adopt the usual analysis strategies. An alternative approach is the use of joint modelling techniques to obtain estimates of treatment effects on heart failure hospitalisation rates, whilst allowing for informative censoring.
This talk shall outline the different methods available for analysing recurrent events in the presence of dependent censoring and the relative merits of each method shall be discussed.

01/07/2017 4:30 PMQueens' W316K. Mukherjee, Lancaster UniversityBootstrapping M-estimators in GARCH models
In this talk we discuss a class of M-estimators of parameters in GARCH models. The class of estimators contains least absolute deviation and Huber's estimator as well as the well-known quasi maximum likelihood estimator. For some estimators, the asymptotic normality results are obtained assuming only the existence of a fractional unconditional moment of the error distribution and some mild smoothness and moment assumptions on the score function. Next we analyse the bootstrap approximation of the distribution of M-estimators. It is seen that the bootstrap distribution (given the data) is a consistent estimate (in probability) of the distribution of the M-estimators. We propose an algorithm for the computation of M-estimates which at the same time is software-friendly for computing the bootstrap replicates from the given data. We illustrate our algorithm through a simulation study and the analysis of recent financial data.

22/03/2018 4:00 PMW316, Queens' BuildingM. Leonelli, University of GlasgowFlexible approaches for inference on extreme events
Precise knowledge of the tail behaviour of a distribution as well as predictive capabilities about the occurrence of extremes are fundamental in many areas of application, for instance environmental sciences and finance. Standard inferential routines for extremes require the imposition of arbitrary assumptions which may negatively affect the statistical estimates. The model class of extreme value mixture models, on the other hand, allows for the precise estimation of the tail of a distribution without requiring any arbitrary assumption. After reviewing these models, the talk will discuss two extensions of this approach I have been involved in. First, situations where different extreme structures may be useful to perform inference over the extremes of a time series will be discussed. These are dealt with using a novel change-point approach for extremes, where the change-points are estimated via Bayesian MCMC routines. Second, an extension of extreme value mixture models to investigate extreme dependence in multivariate applications is introduced and its usefulness is demonstrated using environmental data.

10/05/2018 12:00 PMW316, Queens' BuildingF. Ricciardi, UCLBandwidth selection for the regression discontinuity design: a clustering approach using a Dirichlet process mixture model
The regression discontinuity design (RDD) is a quasi-experimental design that estimates the causal effects of a treatment when its assignment is defined by a threshold value for a continuous assignment variable. The RDD assumes that subjects with measurements within a bandwidth around the threshold belong to a common population, so that the threshold can be seen as a randomising device assigning treatment to those falling just above the threshold and withholding it from those who fall just below.
Bandwidth selection represents a compelling decision for the RDD analysis, since there is a trade-off between its size and the bias and precision of the estimates: if the bandwidth is small, the bias is generally low but so is precision; if the bandwidth is large, the reverse is true. A number of methods to select the “optimal” bandwidth have been proposed in the literature, but their use in practice is limited.
We propose a methodology that, tackling the problem from an applied point of view, considers units’ exchangeability, i.e., their similarity with respect to measured covariates, as the main criterion to select subjects for the analysis, irrespective of their distance from the threshold. We use a clustering approach based on a Dirichlet process mixture model and then evaluate homogeneity within each cluster using the posterior distribution for the parameters defining the mixture, including in the final RDD analysis only clusters which show high homogeneity. We illustrate the validity of our methodology using a simulated experiment.

01/02/2018 4:00 PMW316, Queens' BuildingS. Conde, QMULLog-linear LASSO selection, observational causal inference and Markov-stability toxicological communities
This talk will have three differentiated parts. In the first one, I will present some results in which we compare LASSO model selection methods with classical ones in sparse multidimensional contingency tables formed with binary variables with a log-linear modelling parametrization. In the second one, I will talk about Mendelian randomization in the presence of multiple instruments and will present results of an application to a data set with multiple metabolites. In the third one, I will talk about a clustering method that uses high-dimensional network theory (Markov Stability), and an application of it to a data set that contains messenger ribonucleic acids (mRNAs) and micro ribonucleic acids (miRNAs) from a toxicological experiment.

18/01/2018 4:00 PMW316, Queens' BuildingR. A. Bailey, University of St AndrewsHasse diagrams as a visual aid for linear models and analysis of variance
The expectation part of a linear model is often presented as an equation with unknown parameters, and the reader is supposed to know that this is shorthand for a whole family of expectation models (for example, is there interaction or not?). I find it helpful to show the family of models on a Hasse diagram. By changing the lengths of the edges in this diagram, we can go a stage further and use it as a visual display of the analysis of variance.

11/01/2018 4:00 PMW316, Queens' BuildingA. J. Mason, LSHTMA Bayesian framework for addressing informative missingness in the analysis of clinical trials
The analyses of randomised controlled trials (RCTs) with missing data typically assume that, after conditioning on the observed data, the probability of missing data does not depend on the patient's outcome, and so the data are ‘missing at random’ (MAR). This assumption is often questionable, for example because patients in relatively poor health may be more likely to drop out. In these cases, methodological guidelines recommend sensitivity analyses to recognise that data may be ‘missing not at random’ (MNAR), and call for the development of practical, accessible approaches for exploring the robustness of conclusions to MNAR assumptions.
We propose a Bayesian framework for this setting, which includes a practical, accessible approach to sensitivity analysis and allows the analyst to draw on expert opinion. To facilitate the implementation of this strategy, we are developing a new web-based tool for eliciting expert opinion about outcome differences between patients with missing versus complete data. The IMPROVE study, a multicentre trial which compares endovascular strategy (EVAR) with open repair for patients with ruptured abdominal aortic aneurysm, was used in the initial development work. In this seminar, we will discuss our proposed framework and demonstrate our elicitation tool, using the IMPROVE trial for illustration.

30/11/2017 4:00 PMQueens' W316S. F. Williamson, Lancaster UniversityA Bayesian adaptive design for clinical trials in rare diseases
Development of treatments for rare diseases is challenging due to the limited number of patients available for participation. Learning about treatment effectiveness with a view to treat patients in the larger outside population, as in the traditional fixed randomised design, may not be a plausible goal. An alternative goal is to treat the patients within the trial as effectively as possible. Using the framework of finite-horizon Markov decision processes and dynamic programming (DP), a novel randomised response-adaptive design is proposed which maximises the total number of patient successes in the trial. Several performance measures of the proposed design are evaluated and compared to alternative designs through extensive simulation studies. For simplicity, a two-armed trial with binary endpoints and immediate responses is considered. However, further evaluations illustrate how the design behaves when patient responses are delayed, and modifications are made to improve its performance in this more realistic setting.
Simulation results for the proposed design show that: (i) the percentage of patients allocated to the superior treatment is much higher than in the traditional fixed randomised design; (ii) relative to the optimal DP design, the power is largely improved upon and (iii) the corresponding treatment effect estimator exhibits only a very small bias and mean squared error. Furthermore, this design is fully randomised which is an advantage from a practical point of view because it protects the trial against various sources of bias.
Overall, the proposed design strikes a very good balance between the power and patient benefit trade-off which greatly increases the prospects of a Bayesian bandit-based design being implemented in practice, particularly for trials involving rare diseases and small populations.
Keywords: Clinical trials; Rare diseases; Bayesian adaptive designs; Sequential allocation; Bandit models; Dynamic programming; Delayed responses.
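The finite-horizon dynamic programming idea behind the "optimal DP design" used as a comparator in this abstract can be sketched for a two-armed trial with binary endpoints and immediate responses. This is a minimal illustration of the plain deterministic DP benchmark, not the randomised design proposed in the talk; the independent Beta(1, 1) priors on each arm's success probability and the function name are assumptions.

```python
from functools import lru_cache

def optimal_expected_successes(horizon, prior=(1, 1)):
    """Maximal expected number of successes among `horizon` patients.

    State: successes and failures observed so far on each arm.
    Each arm's success probability has a Beta(a0, b0) prior (assumed).
    """
    a0, b0 = prior

    @lru_cache(maxsize=None)
    def value(s1, f1, s2, f2):
        remaining = horizon - (s1 + f1 + s2 + f2)
        if remaining == 0:
            return 0.0
        # Posterior mean success probability of each arm.
        p1 = (a0 + s1) / (a0 + b0 + s1 + f1)
        p2 = (a0 + s2) / (a0 + b0 + s2 + f2)
        # Expected total successes if the next patient gets arm 1 or arm 2.
        v1 = p1 * (1 + value(s1 + 1, f1, s2, f2)) + (1 - p1) * value(s1, f1 + 1, s2, f2)
        v2 = p2 * (1 + value(s1, f1, s2 + 1, f2)) + (1 - p2) * value(s1, f1, s2, f2 + 1)
        return max(v1, v2)

    return value(0, 0, 0, 0)
```

Because the allocation maximising `max(v1, v2)` is deterministic given the data, a design of this kind is not randomised, which is the practical drawback the abstract's fully randomised design is meant to address.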

23/11/2017 4:00 PMQueens' W316K. Diaz-Ordaz, LSHTMDoubly robust instrumental variable methods for a trial with non-adherence
We consider estimation of the causal treatment effects in randomised trials with non-adherence, where there is an interest in treatment effect modification by baseline covariates.
Assuming randomised treatment is a valid instrument, we describe two doubly robust (DR) estimators of the parameters of a partially linear instrumental variable model for the average treatment effect on the treated, conditional on baseline covariates. The first method is a locally efficient g-estimator, while the second is a targeted minimum loss-based estimator (TMLE).
These two DR estimators can be viewed as a generalisation of the two-stage least squares (TSLS) method in the instrumental variable methodology to a semiparametric model with weaker assumptions. We exploit recent theoretical results to extend the use of data-adaptive machine learning to the g-estimator. A simulation study is used to compare the estimators' finite-sample performance (1) when fitted using parametric models, and (2) using Super Learner, with the TSLS.
Data-adaptive DR estimators have lower bias and improved precision, when compared to incorrectly specified parametric DR estimators. Finally, we illustrate the methods by obtaining the causal effect on the treated of receiving cognitive behavioural therapy training on pain-related disability, with heterogeneous treatment by depression at baseline, using the COPERS (COping with persistent Pain, Effectiveness Research in Self-management) trial.

16/11/2017 4:00 PMQueens' W316A. J. Gibberd, Imperial College LondonSquashing the Gaussian: regularised estimation of dynamic graphical models
Many modern-day datasets exhibit multivariate dependence structure that can be modelled using networks or graphs. For example, in the social sciences, biomedical studies and financial applications, the association of datasets with latent network structures is ubiquitous. Many of these datasets are time-varying in nature and that motivates the modelling of dynamic networks. In this talk I will present some of our recent research which looks at the challenging task of recovering such networks, even in high-dimensional settings.
Our approach studies the canonical Gaussian graphical model whereby patterns of variable dependence are encoded through partial correlation structure. I will demonstrate how regularisation ideas such as the graphical lasso may be implemented when data is drawn i.i.d. but how this may fail in non-stationary settings. I will then present an overview of our work (with Sandipan Roy, UCL) which extends such methods to dynamic settings. By furnishing appropriate convex M-estimators that enforce smoothness and sparsity assumptions on the Gaussian we demonstrate an ability to recover the true underlying network structure. I will present both synthetic experiments and theoretical analysis which shed light on the performance of these methods.

02/11/2017 4:00 PMQueens' W316A. Y. Vasilyev, QMULOptimal control of eye movements during visual search
We study the problem of optimal oculomotor control during the execution of visual search tasks. We introduce a computational model of human eye movements, which takes into account various constraints of the human visual and oculomotor systems. In the model, the choice of the subsequent fixation location is posed as a problem of stochastic optimal control, which relies on reinforcement learning methods. We show that if biological constraints are taken into account, the trajectories simulated under the learned policy share both basic statistical properties and scaling behaviour with human eye movements. We validated our model simulations with human psychophysical eye-tracking experiments.

09/02/2017 4:30 PM | M203 | J. M. S. Wason, University of Cambridge | Novel designs for trials with multiple treatments and biomarkers
Multi-arm trials are increasingly being recommended for use in diseases where multiple experimental treatments are awaiting testing. This is because they allow a shared control group, which considerably reduces the sample size required compared to separate randomised trials. Further gains in efficiency can be obtained by introducing interim analyses (multi-arm multi-stage, or MAMS, trials). At the interim analyses, a variety of modifications are possible, including changing the allocation to different treatments, dropping ineffective treatments or stopping the trial early if sufficient evidence of a treatment being superior to control is found. These modifications allow resources to be focused on the most promising treatments, and thereby increase both the efficiency and the ethical properties of the trial.
In this talk I will describe some different types of MAMS designs and how they may be useful in different situations. I will also discuss the design of trials that test the efficacy of multiple treatments in different patient subgroups. I propose a design that incorporates biological hypotheses about links between biomarker subgroups and the effects of treatments, but allows alternative links to be formed during the trial. The statistical properties of this design compare well to the alternative approaches available.

12/01/2017 4:30 PM | BR 3.02 | N. Stallard, University of Warwick | Seamless phase II/III clinical trials incorporating early outcome data
Most statistical methodology for confirmatory phase III clinical trials focuses on the comparison of a control treatment with a single experimental treatment, with selection of this experimental treatment made in an earlier exploratory phase II trial. Recently, however, there has been increasing interest in methods for adaptive seamless phase II/III trials that combine the treatment selection element of a phase II clinical trial with the definitive analysis usually associated with phase III clinical trials. A number of methods have been proposed for the analysis of such trials to address the statistical challenge of ensuring control of the type I error rate. These methods rely on the independence of the test statistics used in the different stages of the trial.
In some settings the primary endpoint can be observed only after long-term follow-up, so that at the time of the first interim analysis primary endpoint data are available for only a relatively small proportion of the patients randomised. In this case, if short-term endpoint data are also available, these could be used along with the long-term data to inform treatment selection. The use of such data breaks the assumption of independence underlying existing analysis methods. This talk presents new methods that allow for the use of short-term data. The new methods control the overall type I error rate, either when the treatment selection rule is pre-specified, or when it can be fully flexible. In both cases there is a gain in power from the use of the short-term endpoint data when the short- and long-term endpoints are correlated.

15/12/2016 4:30 PM | BR 3.02 | R. Silva, UCL | Some machine learning tools to aid causal inference
Causal inference from observational data requires untestable assumptions. As assumptions may fail, it is important to be able to understand how conclusions vary under different premises. Machine learning methods are particularly good at searching for hypotheses, but they do not always provide ways of expressing a continuum of assumptions from which causal estimands can be proposed. We introduce one family of assumptions and algorithms that can be used to provide alternative explanations for treatment effects. If we have time, I will also discuss some other developments on the integration of observational and interventional data using a nonparametric Bayesian approach.

01/12/2016 4:30 PM | BR 3.02 | M. H. Davies, GSK | The use and abuse of statistics in industry
A personal perspective, gained from nearly 30 years of applying statistical methods in a variety of industries (FMCG, Defence, Paper, Pharmaceuticals and Vaccines).
The emphasis is on the application (not the theory) of statistics to support the manufacturing, quality control and R&D functions.
The objective of this session is to present real life examples/situations to raise awareness and stimulate discussion.
Buzz words: Experimental Design (DoE), Taguchi Methods, LeanSigma, Design for Manufacture (DfM), Process Capability, Statistical Process Control (SPC), Analytical Method Validation and Good Manufacturing Practice (GMP).

24/11/2016 4:30 PM | BR 3.02 | J. Bowden, University of Bristol | Graphical tools to detect and adjust for invalid instruments in Mendelian randomization
The funnel plot is a graphical visualisation of summary data estimates from a meta-analysis, and is a useful tool for detecting departures from the standard modelling assumptions. Although perhaps not widely appreciated, a simple extension of the funnel plot can help to facilitate an intuitive interpretation of the mathematics underlying a meta-analysis at a more fundamental level, by equating it to determining the centre of mass of a physical system. We exploit this fact to forge new connections between statistical inference and bias adjustment in the evidence synthesis and causal inference literatures. An online web application (named the 'MetaAnalyzer') is introduced to further facilitate this physical analogy. Finally, we demonstrate the utility of the MetaAnalyzer as a tool for detecting and adjusting for invalid instruments within the context of Mendelian randomization.

03/11/2016 4:30 PM | BR 3.02 | V. V. Anisimov, University of Glasgow | Modern trends in predictive modelling clinical trial operations
Statistical design and operation of clinical trials are affected by stochasticity in patient enrolment and various events' appearance. The complexity of large trials and multistate hierarchic structure of various operational processes require developing modern predictive analytical techniques using stochastic processes with random parameters in the empirical Bayesian setting for efficient modelling and predicting trial operation.
Forecasting patient enrolment is one of the bottleneck problems, as uncertainties in enrolment substantially affect trial completion time, supply chain and associated costs. An analytic methodology for predictive patient enrolment modelling using a Poisson-gamma model was developed by Anisimov and Fedorov (2005–2007). This methodology is extended further to risk-based monitoring of interim trial performance for different metrics associated with enrolment, screen failures, various events and AE, and to detecting outliers.
As the next stage of generalization, to model the complicated hierarchic processes on top of enrolment, a new methodology using evolving stochastic processes is proposed. This class of processes provides a rather general and unified framework to describe various operational processes including follow-up patients, patients' visits, various events and associated costs.
The technique for evaluating predictive distributions, means and credibility bounds for evolving processes is developed (Anisimov, 2016). Some applications to modelling operational characteristics in clinical trials are considered. For these models, predictive characteristics are derived in closed form, so Monte Carlo simulation is not required.
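As a rough illustration of why a Poisson-gamma model gives closed-form predictions, the sketch below treats a single centre in isolation. It assumes (these are illustrative assumptions, not details taken from the talk) that the centre's recruitment rate has a gamma distribution with shape `alpha` and rate `beta`, and that the number of patients recruited by time `t` is Poisson given that rate; the predictive count is then negative binomial. Centre initiation delays and the evolving processes mentioned above are omitted, and all numerical values are hypothetical.

```python
import math

def enrolment_pmf(k, alpha, beta, t):
    """P(a centre recruits exactly k patients by time t).

    Assumes the centre's rate is Gamma(shape=alpha, rate=beta) and that
    recruitment given the rate is Poisson(rate * t); the resulting
    mixture is negative binomial.
    """
    log_p = (math.lgamma(alpha + k) - math.lgamma(alpha) - math.lgamma(k + 1)
             + alpha * math.log(beta / (beta + t))
             + k * math.log(t / (beta + t)))
    return math.exp(log_p)

def enrolment_mean_var(alpha, beta, t):
    """Closed-form predictive mean and variance of the count above."""
    mean = alpha * t / beta
    return mean, mean * (1.0 + t / beta)

# Hypothetical values: shape 2, rate 1 (per month), looking 3 months ahead.
mean, var = enrolment_mean_var(2.0, 1.0, 3.0)
```

The variance exceeds the mean by the factor `1 + t / beta`, which is the extra spread induced by uncertainty about the centre's rate.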
References
1. Anisimov V., Predictive hierarchic modelling of operational characteristics in clinical trials. Communications in Statistics - Simulation and Computation, 45, 05, 2016, 1477–1488.

27/10/2016 4:30 PM | BR 3.02 | J. E. Griffin, University of Kent | Adaptive MCMC schemes for variable selection problems
Data sets with many variables (often in the hundreds, thousands, or more) are routinely collected in many disciplines. This has led to interest in variable selection in regression models with a large number of variables. A standard Bayesian approach defines a prior on the model space and uses Markov chain Monte Carlo methods to sample the posterior. Unfortunately, the size of the space (2^p if there are p variables) and the use of simple proposals in Metropolis-Hastings steps have led to samplers that mix poorly over models. In this talk, I will describe two adaptive Metropolis-Hastings schemes which adapt an independence proposal to the posterior distribution. This leads to substantial improvements in the mixing over standard algorithms in large data sets. The methods will be illustrated on simulated and real data with hundreds or thousands of possible variables.

09/06/2016 4:30 PM | M103 | B. Zhang, QMUL | Functional mixed-effects analysis of variance for human movement patterns
By using advanced motion capture systems, human movement data can be collected densely over time. We construct a functional mixed-effects model to analyse such data. This model is flexible enough to study functional data which are collected from orthogonal designs. Covariance structure plays a central role in functional data analysis. In this method, within-curve covariance is analysed from a stochastic process perspective and the between-curve covariance structure of functional responses is determined by the design. In particular, we are interested in the problem of hypothesis testing and generalize the functional F test to the mixed-effects analysis of variance.
We apply this method to analyse movement patterns in patients with cerebral palsy. Hasse diagrams are used to represent the structure of these gait data from an orthogonal block design. In order to assess the effects of ankle-foot orthoses, which are commonly prescribed to patients with abnormal gait patterns, pointwise F tests and functional F tests are used. To explore further how ankle-foot orthoses influence human movement, we are observing more gait data in a split-plot design. Randomizations of this design are based on Bailey (2008).

02/06/2016 4:30 PM | M103 | R. Killick, Lancaster University | Online changepoint detection: a new way of thinking
Online changepoint detection has its origins in statistical process control where, once a changepoint is detected, the process is stopped, the fault rectified and the process monitoring then begins in control again. In modern-day applications such as network traffic and medical monitoring it is infeasible to adopt this strategy. In particular, the out-of-control monitoring is often vital to diagnosis of the problem; instead of fault analysis, monitoring continues throughout the period of change and a second change is indicated when the process returns to the control state.
Recent offline changepoint detection literature has demonstrated the importance of considering the changepoints globally and not focusing on detecting a single changepoint in the presence of several. In this talk we will argue that this is also the case for online changepoint detection and discuss what is meant by a "global" view in online detection. This presents several problems as the standard definitions of average run length and detection delay are not clearly applicable. Following consideration of this we show the increased accuracy in future (and past) changepoint detections when taking this viewpoint and demonstrate the method on real world applications.

28/04/2016 4:30 PM | M103 | L. I. Pettit, QMUL | Measuring discordancy between prior and data by a mixture of conjugate priors
In Bayesian inference the choice of prior distribution is important. The prior represents beliefs and knowledge about the parameter(s). For data from an exponential family a convenient prior is a conjugate one. This can be updated to find the posterior distribution and experts can choose the parameters as equivalent to imaginary samples. This technique can also be used to combine results from different studies. A disadvantage is that we are unaware of any degree of incompatibility between the prior chosen and the data obtained. This could represent overconfidence by selecting too small a variance or indicate differences between studies.
We suggest employing a mixture of conjugate priors which have the same mean but different finite variances. We give a large weight to the component of the mixture with smaller variance. The posterior weight on the first component of the mixture will be a measure of how discordant the data and the expert's prior are or how different are the two studies. We consider choosing the size of the larger variance by considering the difference in information between the two priors. We also investigate the effect of different parameterisations of the parameter of interest. We consider a number of distributions and compare this method for measuring the discordancy with previously suggested diagnostics. This is joint work with Mitra Noosha.
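The posterior weight described above can be sketched for one simple case. The example assumes (as an illustration only; the talk considers a number of distributions) normal data with known variance `sigma2` and a two-component normal prior on the mean, both components centred at `mu0`, with a small variance `tau2_tight` and a larger variance `tau2_wide`. Under component j the sample mean is marginally normal with variance `sigma2 / n + tau2_j`, so the posterior weight follows from Bayes' rule. All names and numerical values are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def posterior_weight_tight(xbar, n, sigma2, mu0, tau2_tight, tau2_wide, w_tight):
    """Posterior weight on the small-variance prior component.

    Under component j the sample mean xbar is marginally
    N(mu0, sigma2 / n + tau2_j), so the weights update by Bayes' rule.
    """
    m_tight = normal_pdf(xbar, mu0, sigma2 / n + tau2_tight)
    m_wide = normal_pdf(xbar, mu0, sigma2 / n + tau2_wide)
    num = w_tight * m_tight
    return num / (num + (1.0 - w_tight) * m_wide)

# Prior weight 0.9 on the tight component; n = 10 observations, sigma2 = 1.
concordant = posterior_weight_tight(0.2, 10, 1.0, 0.0, 1.0, 25.0, 0.9)
discordant = posterior_weight_tight(5.0, 10, 1.0, 0.0, 1.0, 25.0, 0.9)
```

When the sample mean sits close to the prior mean, the weight on the tight component stays high; when the data are far from the prior mean, it collapses towards zero, which is the sense in which it measures discordancy.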

21/04/2016 4:30 PM | M103 | T. Sharia, RHUL | Stochastic approximation and online estimation algorithms
Asymptotic behaviour of a wide class of stochastic approximation procedures will be discussed. This class of procedures has three main characteristics: truncations with random moving bounds, a matrix-valued random step-size sequence, and a dynamically changing random regression function. A number of examples will be presented to demonstrate the flexibility of this class, with the main emphasis on online procedures for parametric statistical estimation. The proposed method ensures an efficient use of auxiliary information in the estimation process, and is consistent and asymptotically efficient under certain regularity conditions.

31/03/2016 4:30 PM | M103 | D. Stowell, QMUL | Characterising networks of calling birds from their timing
When you encounter a flock of birds, with individuals calling to each other, it is often clear that the birds are influencing one another through their calls. Can we infer the structure of their social network, simply by analysing the timing of calls? We introduce a model-based analysis for temporal patterns of animal call timing, originally developed for networks of firing neurons. This has advantages over previous methods in that it can correctly handle common-cause confounds and provides a generative model of call patterns with explicit parameters for the parallel influences between individuals. We illustrate with data recorded from songbirds, to make inferences about individual identity and about patterns of influence in communication networks.

17/03/2016 4:30 PM | M103 | K. J. McConway, The Open University | Statistics and the media: a statistician’s view
How should statisticians interact with the media? What should statisticians know about how the media operate? For several years I have worked (occasionally) with journalists, and provided expert statistical comments on press releases and media stories. I will describe my experience of the many-sided relationship between researchers, press officers, journalists, and the public they are writing for, from the point of view of the statisticians who are also involved. I will discuss the complicated nature of numbers as facts. Using examples such as the question of whether mobile phones cause brain tumours, I will explain how none of the parties in this relationship makes things easy for the others. Finally I will present a few reasons for being optimistic about the position of statistics in the media.

03/03/2016 4:30 PM | M103 | J. T. Griffin, QMUL | Estimating malaria vaccine efficacy and predicting population-level impact
The final results of a multicentre clinical trial of a vaccine against malaria, RTS,S, were published in 2015. Along with three other groups, we had access to the trial data to use as inputs into mathematical models of malaria transmission. Public health funding bodies and policy makers would like to know how the trial results generalise to other settings. This talk describes how we made use of the data to predict the population-level impact across Africa that vaccination might have, and how the uncertainty from various sources was incorporated.

11/02/2016 4:30 PM | M103 | K. Yu, Brunel University London | Tail-index regression for both small sample bias and massive data analysis
The tail-index is an important measure to gauge the heavy-tailed behavior of a distribution. Tail-index regression is introduced when covariate information is available. Existing models may face two challenges: extreme analysis or tail modelling with small to moderate size data usually results in small sample bias, and, on the other hand, the issue of storage and computational efficiency with massive data sets also exists for tail-index regression. In this talk we present new tail-index regression methods, which have unbiased estimates of both regression coefficients and tail-index under small data, and are able to support online analytical processing (OLAP) without accessing the raw data in massive data analysis.

28/01/2016 4:30 PM | M103 | J. E. Barrett, UCL | Adaptive clinical trials: selective recruitment designs
In a selective recruitment design not every patient is recruited onto a clinical trial. Instead, we evaluate how much statistical information a patient is expected to provide (as a function of their covariates) and only recruit patients that will provide a sufficient level of expected information. Patients deemed statistically uninformative are rejected. Allocation to a treatment arm is also done in a manner that maximises the expected information gain.
The benefit of selective recruitment is that a successful trial can potentially be achieved with fewer recruits, thereby leading to economic and ethical advantages. We will explore various methods for quantifying how informative a patient is based on uncertainty sampling, the posterior entropy, the expected generalisation error and variance reduction. The protocol will be applied to both time-to-event outcomes and binary outcomes. Results from experimental data and numerical simulations will be presented.

21/01/2016 4:30 PM | M103 | A. P. Mander, University of Cambridge | The Product of Independent Probability dose Escalation (PIPE) for dual agent dose escalation
Dual-agent trials are now increasingly common in oncology research, and many proposed dose-escalation designs are available in the statistical literature. Despite this, the translation from statistical design to practical application is slow, as has been highlighted in single-agent phase I trials, where a 3+3 rule-based design is often still used. To expedite this process, new dose-escalation designs need to be not only scientifically beneficial but also easy to understand and implement by clinicians. We proposed a curve-free (nonparametric) design for a dual-agent trial in which the model parameters are the probabilities of toxicity at each of the dose combinations. We show that it is relatively trivial for a clinician's prior beliefs or historical information to be incorporated in the model and updating is fast and computationally simple through the use of conjugate Bayesian inference. Monotonicity is ensured by considering only a set of monotonic contours for the distribution of the maximum tolerated contour, which defines the dose-escalation decision process. Varied experimentation around the contour is achievable, and multiple dose combinations can be recommended to take forward to phase II. Code for R, Stata and Excel is available for implementation.

14/01/2016 4:30 PM | M203 | W. Y. Yeung, Lancaster University | Bayesian adaptive dose-escalation procedures utilizing a gain function with binary and continuous responses
The main purpose of dose-escalation trials is to identify the dose(s) that are safe and efficacious for further investigation in later studies. Therefore, dose-limiting events (DLEs) and indicative responses of efficacy should be considered in the dose-escalation procedure.
In this presentation, Bayesian adaptive approaches that incorporate both safety and efficacy will be introduced. A logistic regression model is used for modelling the probabilities of an occurrence of a DLE at their corresponding dose levels, while a linear log-log or a nonparametric model is used for efficacy. Escalation decisions are based on the combination of both models through a gain function to balance efficacy utilities versus costs for safety risks. These dose-escalation procedures aim either to achieve a single objective, estimating the optimal dose (calculated via the gain function and interpreted as the safe dose which gives the maximum beneficial therapeutic effect), or to achieve two objectives, accurately estimating both the maximum tolerated dose (MTD), the highest dose that is considered safe, and the optimal dose at the end of a dose-escalation study. The recommended dose(s) obtained under these procedures provide information about the safety and efficacy profile of the novel drug to facilitate later studies. We evaluate the different strategies via simulations based on an example constructed from a real trial. To assess the robustness of the single-objective approach, we consider scenarios where the efficacy responses of subjects are generated from an Emax model but treated as coming from a linear log-log model. We also find that the nonparametric model estimates the efficacy responses well for a large range of different underlying true shapes. The dual-objective approaches give promising results in terms of having most of their recommendations made at the two real target doses.

17/12/2015 4:30 PM | M203 | L. Giraitis, QMUL | Testing for stability of the mean of heteroskedastic time series
Time series models are often fitted to the data without preliminary checks for stability of the mean and variance, conditions that may not hold in much economic and financial data, particularly over long periods. Ignoring such shifts may result in fitting models with spurious dynamics that lead to unsupported and controversial conclusions about time dependence, causality, and the effects of unanticipated shocks. In spite of what may seem obvious differences between a time series of independent variates with changing variance and a stationary conditionally heteroskedastic (GARCH) process, such processes may be hard to distinguish in applied work using basic time series diagnostic tools. We develop and study some practical and easily implemented statistical procedures to test the mean and variance stability of uncorrelated and serially dependent time series. Application of the new methods to analyze the volatility properties of stock market returns leads to some unexpected findings concerning the advantages of modeling time-varying changes in unconditional variance.
Joint work with V. Dalla and P. C. B. Phillips.

03/12/2015 4:30 PM | M203 | N. E. Fenton, QMUL | Bayesian networks: why smart data is better than big data
Due to relatively recent algorithmic breakthroughs, Bayesian networks have become an increasingly popular technique for risk assessment and decision analysis. This talk will provide an overview of successful applications (including transport safety, medical, law/forensics, operational risk, and football prediction). What is common to all of these applications is that the Bayesian network models are built using a combination of expert judgment and (often very limited) data. I will explain why Bayesian networks 'learnt' purely from data, even when 'big data' is available, generally do not work well, and will also explain the impediments to wider use of Bayesian networks.

26/11/2015 4:30 PM | M203 | S. W. Hee, University of Warwick | Decision-theoretic designs for small clinical trials
Small clinical trials are sometimes unavoidable, for example, in the setting of rare diseases, specifically targeted subpopulations and vulnerable populations. The most common designs used in these trials are based on the frequentist paradigm, with either a large hypothesized effect size or relaxed type I and/or II error rates. One of the novel designs that has been proposed is the Bayesian decision-theoretic approach, which is more intuitive for trials whose aim is to decide whether or not to conduct further clinical research with the experimental treatment. In this talk, I will start with a review of Bayesian decision-theoretic designs, followed by a more detailed discussion on designing a series of trials using this framework.

12/11/2015 4:30 PM | M203 | M. S. Massa, University of Oxford | Statistical modelling with graphical models
Graphical models have been studied and formalised across many communities of researchers (artificial intelligence, machine learning, statistics, to name just a few) and nowadays they represent a powerful tool for tackling many diverse applications. They still represent an exciting area of research, and many new types of graphical models have been introduced to accommodate more complex situations arising from more challenging research questions and data available. Even the interpretation of graphical models can be quite different in different contexts. If we think, for example, of high-dimensional settings, the original notion of conditional independence between random variables encoded by the conditional dependence graph is generally lost and the interest is in finding the most important components of thousands of random variables.
In this talk we will present some of the challenges we face when using graphical models to address research questions coming from interdisciplinary collaborations. We will present two case studies arising from collaborations with researchers in Biology and Neuropsychology and will try to elucidate some of the new frameworks arising. In particular, we will show how graphical models can be very powerful both for an explorative statistical analysis and for answering more advanced questions in statistical modelling and prediction.

05/11/2015 4:30 PM | M203 | T. W. Waite, University of Manchester | Random designs for robustness to functional model misspecification
Statistical design of experiments allows empirical studies in science and engineering to be conducted more efficiently through careful choice of the settings of the controllable variables under investigation. Much conventional work in optimal design of experiments begins by assuming a particular structural form for the model generating the data, or perhaps a small set of possible parametric models. However, these parametric models will only ever be an approximation to the true relationship between the response and controllable variables, and the impact of this approximation step on the performance of the design is rarely quantified.
We consider response surface problems where it is explicitly acknowledged that a linear model approximation differs from the true mean response by the addition of a discrepancy function. The most realistic approaches to this problem develop optimal designs that are robust to discrepancy functions from an infinite-dimensional class of possible functions. Typically it is assumed that the class of possible discrepancies is defined by a bound on either (i) the maximum absolute value, or (ii) the squared integral, of all possible discrepancy functions.
Under assumption (ii), minimax prediction error criteria fail to select a finite design. This occurs because all finitely supported deterministic designs have the problem that the maximum, over all possible discrepancy functions, of the integrated mean squared error of prediction (IMSEP) is infinite.
We demonstrate a new approach in which finite designs are drawn at random from a highly structured distribution, called a designer, of possible designs. If we also average over the random choice of design, then the maximum IMSEP is finite. We develop a class of designers for which the maximum IMSEP is analytically and computationally tractable. Algorithms for the selection of minimax efficient designers are considered, and the inherent bias-variance trade-off is illustrated.
Joint work with Dave Woods, Southampton Statistical Sciences Research Institute, University of Southampton

22/10/2015 5:30 PM | M203 | M. Hamada, Japan Broadcasting Corporation | Mathematical statistics among different categories of information
In this seminar, a novel concept in mathematical statistics is proposed. Ordinarily, some topological factors such as Gromov-Hausdorff distance, dilatation and distortion are defined inside one metric space. The proposed idea puts a probability operator with some topological factors over different metric spaces which can project from these to one common measure space. By assuming a compact Polish space, first the original information is projected to a metric space. Then the projected information from the different spaces is mapped to one common space where some topological factors are applied. The inference and estimation can then be calculated.
The merit of the proposed idea is to be able to compare values and some qualities of different fields with those for one metric space. This novel concept can let Information of post Big Data become more natural for people.
In the seminar, the situation of the Great Earthquake in Japan on 11 March 2011 is also introduced.

11/06/2015 5:30 PM | M203 | M. Z. Hossain, QMUL | Generalized linear mixed models for completely randomized design based on randomization
I will focus on the derivation of a generalized linear mixed model (GLMM) in the context of completely randomized design (CRD) based on randomization ideas for linear models. The randomization approach to derive linear models is adapted to the link-transformed mean responses including random effects with fixed effects.
Typically, the random effects in a GLMM are uncorrelated and assumed to follow a normal distribution mainly for computational simplicity. However, in our case, due to the randomization the random effects are correlated. We develop the likelihood function and an estimation algorithm where we do not assume that the random effects have a normal distribution.
I will present and compare the simulation results of a simple example with GLM (generalized linear model) and HGLM (hierarchical generalized linear model) which is suitable for normally distributed correlated random effects.

04/06/2015 2:35 PM | M203 | H.Y. Liu, QMUL | Group sequential monitoring of optimal response-adaptive randomised multi-armed clinical trials

28/05/2015 5:30 PM | M203 | R. L. Hooper, Blizard Institute | The dog-leg design: giving clinical trials more power to their elbow
In 1948 the MRC streptomycin trial established the principles of the modern clinical trial, and for longer still the idea of a control or comparison group recruited concurrently to the intervention group has been recognised as essential to obtaining sound evidence for clinical effectiveness. But must a clinical trial proceed by running an intervention and comparator in parallel? In this seminar I will focus on trials where participants are randomised in clusters. This is common when evaluating health service interventions that are delivered within an organisational unit such as a school or general practice. I will look in particular at trials where the comparator is routine care: these trials effectively ask how individuals' outcomes would compare before and after introducing the new treatment in a cluster. I will discuss some surprisingly efficient alternatives to parallel group trial designs in this case, made possible by delaying introduction of the intervention in some clusters after randomisation, with these clusters continuing in the meantime to receive routine care.

21/05/2015 5:30 PM | M203 | S. S. Villar, University of Cambridge | Bandit models for the design of Bayesian adaptive clinical trials for rare diseases
The multi-armed bandit problem describes a sequential experiment in which the goal is to achieve the largest possible mean reward by choosing from different reward distributions with unknown parameters. This problem has become a paradigmatic framework to describe the dilemma between exploration (learning about distributions' parameters) and exploitation (earning from distributions that look superior based on limited data), which characterises any data-based learning process.
Over the past 40 years bandit-based solutions, and particularly the concept of index policy introduced by Gittins and Jones, have been fruitfully developed and deployed to address a wide variety of stochastic scheduling problems arising in practice. Across this literature, the use of bandit models to optimally design clinical trials became a typical motivating application, yet little of the resulting theory has ever been used in the actual design and analysis of clinical trials. In this talk I will illustrate, both theoretically and via simulations, the advantages and disadvantages of bandit-based allocation rules for clinical trials. Based on that, I will reflect on the reasons why these ideas have not been used in practice and describe a novel implementation of the Gittins index rule that overcomes these difficulties, trading off a small deviation from optimality for a fully randomized, adaptive group allocation procedure which offers substantial improvements in terms of patient benefit, especially relevant for small populations.
This talk is based on recent joint work with Jack Bowden and James Wason.

14/05/2015 5:30 PMM203V. V. Toropov, QMULDevelopment of optimisation techniques for aerospace applications
Current aerospace applications exhibit several features that are not yet adequately addressed by the available optimisation tools:
 Large scale (~1000 design variables) optimisation problems with expensive (10+ hours) response function evaluations
 Discrete optimisation with even moderately expensive response functions
 Optimisation with non-deterministic responses
 Multidisciplinary optimisation in an industrial setting.
The presentation discusses recent progress towards addressing these issues, identifying general trends and metamodel-based methods for solving large scale optimisation problems.
Issues that have to be addressed to obtain high quality metamodels of computationally expensive responses include establishing appropriate Designs of Experiments (DOE), focusing on the optimum Latin hypercube DOEs and including nested DOEs. Several metamodel types will be reviewed, focusing on those obtained by the Moving Least Squares method, due to its controlled noise-smoothing capability, and by Genetic Programming, due to its ability to arrive at explicit functions of design variables. The use of variable fidelity responses for establishing high accuracy metamodels is also considered.
Examples of recent aerospace applications include
 Turbomachinery applications
 Optimisation of composite wing panels
 Topology optimisation and parametric optimisation in the preliminary design of a lattice composite fuselage
 Optimisation and stochastic analysis of a landing system for the ESA ExoMars mission

07/05/2015 5:30 PMM203J. Q. Shi, Newcastle UniversityGeneralised Gaussian process regression model for non-Gaussian functional data
In this talk I will discuss a generalized Gaussian process concurrent regression model for functional data where the functional response variable has a binomial, Poisson or other non-Gaussian distribution from an exponential family while the covariates are mixed functional and scalar variables. The proposed model offers a nonparametric generalized concurrent regression method for functional data with multidimensional covariates, and provides a natural framework for modelling common mean structure and covariance structure simultaneously for repeatedly observed functional data. The mean structure provides overall information about the observations, while the covariance structure can be used to capture the characteristics of each individual batch. The prior specification of the covariance kernel enables us to accommodate a wide class of nonlinear models. The definition of the model, the inference and the implementation as well as its asymptotic properties will be discussed. I will also present several numerical examples with different types of non-Gaussian response variables.

30/04/2015 4:45 PMBuilding 58 Room 4121 at the University of SouthamptonD. S. Coad, QMULBias calculations for adaptive generalised linear models
A generalised linear model is considered in which the design variables may be functions of previous responses. Interest lies in estimating the parameters of the model. Approximations are derived for the bias and variance of the maximum likelihood estimators of the parameters. The derivations involve differentiating the fundamental identity of sequential analysis. The normal linear regression model, the logistic regression model and the dilution-series model are used to illustrate the approximations.

30/04/2015 3:15 PMBuilding 58 Room 4121 at the University of SouthamptonS. G. Gilmour, University of SouthamptonFuture directions for design of experiments
Design and analysis of experiments is sometimes seen as an area of statistics in which there are few new problems. I will argue that modern biological and industrial experiments, often with automatic data collection systems, require advances in the methodology of designed experiments if they are to be applied successfully in practice. The basic philosophy of design will be reexamined in this context. Experiments can now be designed to maximise the information in the data without computational restrictions limiting either the data analysis that can be done or the search for a design. Very large amounts of data may be collected from each experimental unit and various empirical modelling techniques may be used to analyse these data. In order to ensure that the data contain the required information, it is vital that attention be paid to the experimental design, the sampling design and any mechanistic information that can be built into the model. The application of these ideas to some particular processes will be used to illustrate the kinds of method that can be developed.

23/04/2015 5:30 PMM203O. Sverdlov, EMD SeronoBayesian design of proof-of-concept binary outcome trials
In this talk, I will present a Bayesian approach to the problem of comparing two independent binomial proportions and its application to the design and analysis of proof-of-concept clinical trials.
First, I will discuss numerical integration methods to compute exact posterior distribution functions, probability densities, and quantiles of the risk difference, relative risk, and odds ratio. These numerical methods are building blocks for applying exact Bayesian analysis in practice. Exact probability calculations provide improved accuracy compared to normal approximations and are computationally more efficient than simulation-based approaches, especially when these calculations have to be invoked repeatedly as part of another simulation study.
Second, I will show applicability of exact Bayesian calculations in the context of a proof-of-concept clinical trial in ophthalmology. A single-stage design and a two-stage adaptive design based on posterior predictive probability of achieving proof-of-concept based on dual criteria of statistical significance and clinical relevance will be presented. A two-stage design allows early stopping for either futility or efficacy, thereby providing a higher level of cost-efficiency than a single-stage design. A take-home message is that exact Bayesian methods provide an elegant and efficient way to facilitate design and analysis of proof-of-concept studies.
Reference:
Sverdlov O, Ryeznik Y, Wu S. (2015). Exact Bayesian inference comparing binomial proportions, with application to proof-of-concept clinical trials. Therapeutic Innovation and Regulatory Science 49(1), 163-174.
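
As a rough illustration of the kind of numerical integration described above, the following sketch computes the posterior probability that one binomial proportion exceeds another. It is not the method of the cited paper: the independent Beta(1, 1) priors, the midpoint rule, the grid size and the function names are all assumptions made for this example.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at 0 < x < 1 (log scale for stability)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def prob_p1_greater(x1, n1, x2, n2, a0=1.0, b0=1.0, grid=20000):
    """Posterior P(p1 > p2 | data) under independent Beta(a0, b0) priors.

    Evaluates the integral of f1(p) * F2(p) over (0, 1) with a midpoint rule,
    where f1 is the posterior density of p1 and F2 the posterior CDF of p2.
    """
    a1, b1 = a0 + x1, b0 + n1 - x1  # posterior of p1 (assumed conjugate prior)
    a2, b2 = a0 + x2, b0 + n2 - x2  # posterior of p2
    h = 1.0 / grid
    cdf2 = 0.0   # posterior mass of p2 in the cells strictly below the current one
    total = 0.0
    for i in range(grid):
        p = (i + 0.5) * h
        mass1 = beta_pdf(p, a1, b1) * h
        mass2 = beta_pdf(p, a2, b2) * h
        # CDF of p2 at the cell midpoint: earlier cells plus half of this cell
        total += mass1 * (cdf2 + 0.5 * mass2)
        cdf2 += mass2
    return total


# Hypothetical data: 15/20 responders on treatment versus 5/20 on control.
print(prob_p1_greater(15, 20, 5, 20))
```

Because the calculation is deterministic and cheap, it can be called repeatedly inside a larger simulation study, which is the setting in which the abstract argues that exact calculations pay off.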

26/03/2015 4:30 PMM203L. Zou, William Harvey Research InstituteComparison of randomization methods for testing the interaction between treatments and stratification factor in logistic
This study was motivated by two ongoing clinical trials run by EMR, to see whether the response rates to two biological therapies for Rheumatoid Arthritis patients differ by B cell pathotype. Both trials used B cell pathotype as a stratification factor in the randomizations, and the effect of interest was the interaction between treatments and B cell pathotype. The B cell pathotype was classified by a synovial biopsy that each patient received before the randomization. The categories were B cell rich, B cell poor and Unknown (if the biopsy result was delayed). For patients classified as Unknown, the biopsy result would be revealed during the trial once it was ready.
Randomizations studied include complete randomization, covariate-adaptive randomization, hierarchical dynamic randomization, permuted block randomization and Begg-Iglewicz randomization. The comparison was based on simulations using the measures: selection bias, imbalance, power for testing treatment and interaction effects, and inefficiency of the randomization. Because the outcome was a binary variable indicating whether a patient was a responder, logistic regression was the natural choice for the post analysis. Treatment and interaction effects as well as the power to detect their significance were estimated using the logistic model with independent variables: treatments, pathotype and their interaction.

19/03/2015 4:30 PMM203W. P. Bergsma, LSERegression modelling with I-priors
As is well-known, the maximum likelihood method overfits regression models when the dimension of the model is large relative to the sample size. To address this problem, a number of approaches have been used, such as dimension reduction (as in, e.g., multiple regression selection methods or the lasso method), subjective priors (which we interpret broadly to include random effects models or Gaussian process regression), or regularization. In addition to the model assumptions, these three approaches introduce, by their nature, further assumptions for the purpose of estimating the model.
The first main contribution of this talk is an alternative method which, like maximum likelihood, requires no assumptions other than those pertaining to the model of interest. Our proposal is based on a new information-theoretic Gaussian proper prior for the regression function based on the Fisher information. We call it the I-prior, the 'I' referring to information. The method is no more difficult to implement than random effects models or Gaussian process regression models.
Our second main contribution is a modelling methodology made possible by the I-prior, which is applicable to classification, multilevel modelling, functional data analysis and longitudinal data analysis. For a number of data sets that have previously been analyzed in the literature, we show our methodology performs competitively with existing methods.

05/03/2015 4:30 PMM203J. L. Hutton, University of WarwickChain event graphs for informative missingness
Chain event graphs (CEGs) extend graphical models to address situations in which, after one variable takes a particular value, possible values of future variables differ from those following alternative values. These graphs are a useful framework for modelling discrete processes which exhibit strong asymmetric dependence structures, and are derived from probability trees by merging the vertices in the trees together whose associated conditional probabilities are the same.
We exploit this framework to develop new classes of models where missingness is influential and data are unlikely to be missing at random. Context-specific symmetries are captured by the CEG. As models can be scored efficiently and in closed form, standard Bayesian selection methods can be used to search over a range of models. The selected maximum a posteriori model can be easily read back to the client in a graphically transparent way.
The efficacy of our methods is illustrated using a longitudinal study from birth to age 25 of children in New Zealand, analysing their hospital admissions aged 18-25 years with respect to family functioning, education, and substance abuse aged 16-18 years. Of the initial 1265 people, 25% had missing data at age 16, and 20% had missing data on hospital admissions aged 18-25 years. More outcome data were missing for poorer scores on social factors: for example, 21% for mothers with no formal education compared to 13% for mothers with tertiary qualifications.
This is joint work with Lorna Barclay and Jim Smith.

26/02/2015 4:30 PMM203I. Kosmidis, UCLModel-based clustering using copulas with applications
The majority of model-based clustering techniques is based on multivariate Normal models and their variants. This talk introduces and studies the framework of copula-based finite mixture models for clustering applications. In particular, the use of copulas in model-based clustering offers two direct advantages over current methods:
i) the appropriate choice of copulas provides the ability to obtain a range of exotic shapes for the clusters, and
ii) the explicit choice of marginal distributions for the clusters allows the modelling of multivariate data of various modes (discrete, continuous, both discrete and continuous) in a natural way.
Estimation in the general case can be performed using standard EM, and, depending on the mode of the data, more efficient procedures can be used that can fully exploit the copula structure. The closure properties of the mixture models under marginalisation will be discussed, and for continuous, real-valued data parametric rotations in the sample space will be introduced, with a parallel discussion on parameter identifiability depending on the choice of copulas for the components. The exposition of the methodology will be accompanied by the analysis of real and artificial data.
This is joint work with Dimitris Karlis at the Athens University of Economics and Business.
Related preprint: http://arxiv.org/abs/1404.4077

19/02/2015 4:30 PMM203A. Koloydenko, RHULPositive definite matrices, Procrustes analysis, and other non-Euclidean approaches to statistical analysis of diffusion
Symmetric positive semidefinite (SPD) matrices have recently seen several new applications, including Diffusion Tensor Imaging (DTI) in MRI, covariance descriptors and structure tensors in computer vision, and kernels in machine learning.
Depending on the application, various geometries have been explored for statistical analysis of SPD-valued data. We will focus on DTI, where the naive Euclidean approach was generally criticised for its “swelling” effect in interpolation, and violations of positive definiteness in extrapolation and estimation. The affine invariant and log-Euclidean Riemannian metrics were subsequently proposed to remedy the above deficiencies. However, practitioners have recently argued that these geometric approaches are overkill in some relevant noise models.
We will examine a couple of related alternative approaches that in a sense reside in between the two aforementioned extremes. These alternatives are based on the square root Euclidean and Procrustes size-and-shape metrics. Unlike the Riemannian approach, our approaches, we think, operate more naturally with respect to the boundary of the cone of SPD matrices. In particular, we prove that the Procrustes metric, when used to compute weighted Fréchet averages, preserves ranks. We also establish and prove a key relationship between these two metrics, as well as inequalities ranking traces (mean diffusivity) and determinants of the interpolants based on the Riemannian, Euclidean, and our alternative metrics. Remarkably, traces and determinants of our alternative interpolants compare differently. A general proof of the determinant inequality was just developed and may also be of value to the more general matrix analysis community.
Several experimental illustrations will be shown based on synthetic and real human brain DT MRI data.
No special background in statistical analysis on non-Euclidean manifolds is assumed.
This is joint work with Prof Ian Dryden (University of Nottingham) and Dr Diwei Zhou (Loughborough University), with a more recent contribution by Dr Koenraad Audenaert (RHUL).

05/02/2015 4:30 PMM203B. L. Sturm, QMULOut of the barn and into the yard, and other colourful results from my recent paroxysm about the practice of evaluation
I call attention to what I call the “crisis of evaluation” in music information retrieval (MIR) research. Among other things, MIR seeks to address the variety of needs for music information of listeners, music recording archives, and music companies. A large portion of MIR research has thus been devoted to the automated description of music in terms of genre, mood, and other meaningful terms. However, my recent work reveals four things: 1) many published results unknowingly use datasets with faults that render them meaningless; 2) state-of-the-art (“high classification accuracy”) systems are fooled by irrelevant factors; 3) most published results are based upon an invalid evaluation design; and 4) a lot of work has unknowingly built, tuned, tested, compared and advertised “horses” instead of solutions. (The true story of the horse Clever Hans provides the most appropriate illustration.) I argue why these problems have occurred, and how we can address them by adopting the formal design and evaluation of experiments, and other best practices.
Relevant publications:
[1] B. L. Sturm, “Classification accuracy is not enough: On the evaluation of music genre recognition systems,” J. Intell. Info. Systems, vol. 41, no. 3, pp. 371–406, 2013.
http://link.springer.com/article/10.1007%2Fs108440130250y
[2] B. L. Sturm, “A simple method to determine if a music information retrieval system is a “horse”,” IEEE Trans. Multimedia, vol. 16, no. 6, pp. 1636–1644, 2014.
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6847693
[3] B. L. Sturm, “The state of the art ten years after a state of the art: Future research in music information retrieval,” J. New Music Research, vol. 43, no. 2, pp. 147–172, 2014.
http://www.tandfonline.com/doi/abs/10.1080/09298215.2014.894533#.VMDT0KZ...

22/01/2015 4:30 PMM203O. Volkov, QMULOptimal relaxed designs of experiments
A relaxed design is a continuous design whose replications can be any non-negative real number. The talk introduces the method of relaxed designs and identifies its applications to sample size determination, cost-efficient design, constrained design and multistage Bayesian design. The main focus is on applications that could be intractable with standard optimal design.

15/01/2015 4:30 PMM203S. Lunagomez, Harvard UniversityValid inference from nonignorable network sampling designs
Consider a population where subjects are susceptible to a disease (e.g. AIDS). The objective is to perform inferences on a population quantity (like the prevalence of HIV in a high-risk subpopulation, e.g. intravenous drug abusers) via sampling mechanisms based on a social network (link-tracing designs, RDS). We develop a general framework for making Bayesian inference on the population quantity that: models the uncertainty in the underlying social network using a random graph model, incorporates dependence among the individual responses according to the social network via a Markov Random Field, models the uncertainty regarding the sampling on the social network, and deals with the non-ignorability of the sampling design. The proposed framework is general in the sense that it allows a wide range of different specifications for the components of the model we just mentioned. Samples from the posterior distribution are obtained via Bayesian model averaging. Our model is compared with standard methods in simulation studies and it is applied to real data.

11/12/2014 4:30 PMM203M. Mauch, QMULMaking sense and science out of musical data
I will give an overview of my work in music informatics research (MIR) with some applications to singing research and tracking the evolution of music. I first will give a very high-level overview of my work, starting with my Dynamic Bayesian Network approach to chord recognition, a system for lyrics-to-audio alignment (SongPrompter), and some other shiny applications of Music Informatics (Songle.jp, Last.fm Driver's Seat). Secondly, I will talk about some scientific applications of music informatics, including the study of singing intonation and intonation drift as well as the evolution of music both in the lab and in the real charts.

04/12/2014 4:30 PMM203B. Calderhead, Imperial College LondonA general construction for parallelising Metropolis-Hastings algorithms
Markov chain Monte Carlo methods are essential tools for solving many modern day statistical and computational problems; however, a major limitation is the inherently sequential nature of these algorithms. In this talk I'll present some work I recently published in PNAS on a natural generalisation of the Metropolis-Hastings algorithm that allows for parallelising a single chain using existing MCMC methods. We can do so by proposing multiple points in parallel, then constructing and sampling from a finite state Markov chain on the proposed points such that the overall procedure has the correct target density as its stationary distribution. The approach is generally applicable and straightforward to implement. I'll demonstrate how this construction may be used to greatly increase the computational speed and statistical efficiency of a variety of existing MCMC methods, including Metropolis-Adjusted Langevin Algorithms and Adaptive MCMC. Furthermore, I'll discuss how it allows for a principled way of utilising every integration step within Hamiltonian Monte Carlo methods; our approach increases robustness to the choice of algorithmic parameters and results in increased accuracy of Monte Carlo estimates with little extra computational cost.

27/11/2014 4:30 PMM203T. Jaki, Lancaster UniversityTreatment selection in multi-arm, multi-stage clinical studies
Adaptive designs that are based on group-sequential approaches have the benefit of being efficient as stopping boundaries can be found that lead to good operating characteristics with test decisions based solely on sufficient statistics. The drawback of these so called “pre-planned adaptive” designs is that unexpected design changes are not possible without impacting the error rates. “Flexible adaptive designs”, and in particular designs based on p-value combination, on the other hand can cope with a large number of contingencies at the cost of reduced efficiency.
In this presentation we focus on so called multi-arm multi-stage trials which compare several active treatments against control at a series of interim analyses. We will focus on the methods by Stallard and Todd [1] and Magirr et al. [2], two different approaches which are based on group-sequential ideas, and discuss how these “pre-planned adaptive designs” can be modified to allow for flexibility. We then show how the added flexibility can be used for treatment selection and evaluate the impact on power in a simulation study. The results show that a combination of a well chosen pre-planned design and an application of the conditional error principle to allow flexible treatment selection results in an impressive overall procedure.
______________________
[1] Stallard, N, & Todd, S. 2003. Sequential designs for phase III clinical trials incorporating treatment selection. Statistics in Medicine, 22, 689-703.
[2] Magirr, D, Jaki, T, & Whitehead, J. 2012. A generalised Dunnett test for multi-arm, multi-stage clinical studies with treatment selection. Biometrika, 99, 494-501.

20/11/2014 4:30 PMM203G. A. Young, Imperial College LondonInference in the presence of nuisance parameters
Two routes most commonly proposed for accurate inference on a scalar interest parameter in the presence of a (possibly high-dimensional) nuisance parameter are parametric simulation ('bootstrap') methods, and analytic procedures based on normal approximation to adjusted forms of the signed root likelihood ratio statistic. Both methods yield, under some null hypothesis of interest, p-values which are uniformly distributed to error of third-order in the available sample size. But, given a specific inference problem, what is the formal relationship between p-values calculated by the two approaches? We elucidate the extent to which the two methodologies actually just give the same inference.

06/11/2014 3:30 PMM103D. Woods, University of SouthamptonBayesian design of experiments and Gaussian process models
The design of many experiments can be considered as implicitly Bayesian, with prior knowledge being used informally to aid decisions such as which factors to vary and the choice of plausible causal relationships between the factors and measured responses. Bayesian methods allow uncertainty in such decisions to be incorporated into design selection through prior distributions that encapsulate information available from scientific knowledge or previous experimentation. Further, a design may be explicitly tailored to the aim of the experiment through a decision-theoretic approach with an appropriate loss function.
We will present novel methodology for two problems in this area, related through the application of Gaussian process (GP) regression models. Firstly, we consider Bayesian design for prediction from a GP model, as might be used for the collection of spatial data or for a computer experiment to interrogate a numerical model. Secondly, we address Bayesian design for parametric regression models, and demonstrate the application of GP emulators to mitigate the computational issues that have traditionally been a barrier to the application of these designs.

06/11/2014 3:00 PMM103W. Just, QMULDetecting phase synchronisation in time series data sets
Synchronisation phenomena in their various disguises are among the most prominent features in coupled dynamical structures. Within this talk we first introduce how the vague notion of a phase can be given a more precise meaning using what has been coined as analytic signal processing. This approach then allows us to distinguish different types of synchronisation phenomena, and in particular to detect synchronisation of the phase of signals where amplitudes remain uncorrelated. These ideas are finally applied to data sets to explore whether phase synchronisation plays a role in the interpretation of physiological movement data.
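
A minimal sketch of the analytic-signal idea, under assumptions that are not taken from the talk: the analytic signal is built with a plain discrete Fourier transform (a naive O(n^2) one, to keep the example dependency-free), and synchronisation is summarised by the phase-locking value, a commonly used index. The function names and the test signals are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform; adequate for short illustrative series."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def analytic_signal(x):
    """Analytic signal of a real series: double positive frequencies, drop negative ones."""
    n = len(x)
    spec = dft(x)
    for k in range(1, n):
        if n % 2 == 0 and k == n // 2:
            continue              # Nyquist term is kept unchanged
        if k < (n + 1) // 2:
            spec[k] *= 2          # positive frequencies
        else:
            spec[k] = 0           # negative frequencies
    return dft(spec, inverse=True)


def phase_locking_value(x, y):
    """|mean of exp(i * (phase_x - phase_y))|: near 1 if phases are locked, near 0 if not."""
    zx, zy = analytic_signal(x), analytic_signal(y)
    diffs = [cmath.phase(a) - cmath.phase(b) for a, b in zip(zx, zy)]
    return abs(sum(cmath.exp(1j * d) for d in diffs)) / len(diffs)


# Two signals with the same frequency but different amplitude and a phase offset.
n = 128
x = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
y = [3.0 * math.sin(2 * math.pi * 5 * t / n + 1.0) for t in range(n)]
print(phase_locking_value(x, y))
```

The index depends only on the instantaneous phases, so it is insensitive to the amplitude difference between the two example signals.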

30/10/2014 4:30 PMM203I. Andrianakis, London School of Hygiene and Tropical MedicineCalibration of an individual based HIV computer model using emulation and history matching
Advances in scientific computing have allowed the development of complex models that are being routinely applied to problems in physics, engineering, biology and other disciplines. The utility of these models depends on how well they are calibrated to empirical data. Their calibration is hindered, however, both by large numbers of input and output parameters and by run times that increase with the model's complexity. In this talk we present a calibration method called History Matching, which is iterative and scales well with the dimensionality of the problem. History Matching is based on the concept of an emulator, which is a Bayesian representation of our beliefs about the model, given the runs that are available to us. Capitalising on the efficiency of the emulator, History Matching iteratively discards regions of the input space that are unlikely to provide a good match to the empirical data, and is based on successive runs of the computer model in narrowing areas of the input space, which are known as waves. This calibration technique can be embedded in a comprehensive error modelling framework that takes into account various sources of uncertainty, due to the parameters, the model itself, the observations etc. A calibration example of a high dimensional HIV model will be used to illustrate the method.
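
The discarding step of a single wave can be sketched as follows. This is an illustration under stated assumptions rather than the speaker's implementation: it uses the standard implausibility measure from the history matching literature with the conventional cutoff of 3, and the toy emulator, the variances and the function names are hypothetical.

```python
import math


def implausibility(z, em_mean, em_var, obs_var, disc_var=0.0):
    """Standardised distance between the observation z and the emulator prediction."""
    return abs(z - em_mean) / math.sqrt(em_var + obs_var + disc_var)


def non_implausible(candidates, emulator, z, obs_var, disc_var=0.0, cutoff=3.0):
    """Keep the inputs whose implausibility does not exceed the cutoff (one wave)."""
    kept = []
    for x in candidates:
        mean, var = emulator(x)  # emulator returns (expected output, variance)
        if implausibility(z, mean, var, obs_var, disc_var) <= cutoff:
            kept.append(x)
    return kept


def toy_emulator(x):
    """Hypothetical stand-in for a fitted emulator of an expensive model f(x) = x^2."""
    return x * x, 0.01


# One wave on a coarse grid: observed output 4.0 with observation variance 0.04.
grid = [0.5 * i for i in range(-6, 7)]
print(non_implausible(grid, toy_emulator, z=4.0, obs_var=0.04))
```

In a full analysis the retained region would be resampled, the computer model rerun there, and the emulator refitted before the next wave.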

23/10/2014 5:30 PMM203P. R. Curtis, QMULEmulation with smooth supersaturated models: solvability, stability, sensitivity and design
Smooth supersaturated models are a class of emulators with a supersaturated polynomial basis, that is, there are more model terms than design points. In this talk I will give some key results regarding the structure and solvability of these models as well as some insights regarding the numerical stability of fitting these large models. Sensitivity analysis using Sobol indices is often used to reduce the parameter space of expensive computer experiments and a simple formula is given for computing these indices for a smooth supersaturated model. Finally, I present the results of some simulation studies exploring ways to use the emulated response surface to generate new design points.

29/05/2014 5:30 PMM103Kabir Soeny, School of Mathematical Sciences, QMULOptimization of Dose Regimens under Pharmacokinetic and Pharmacodynamic Constraints
Following the correct selection of a therapy based on the indication, an optimal dose regimen is the most important determinant of therapeutic success of a medical therapy. After giving an introduction to the Efficient Dosing (ED) algorithm developed by us to compute dose regimens which ensure that the blood concentration of the drug in the body is kept close to the target level, I will show how the algorithm can be applied to pharmacodynamic models for infectious diseases. The optimized dose regimens satisfy three conditions: (1) minimize the concentration of the anti-infective drug lying outside the therapeutic window (if any), (2) ensure a target reduction in viral load, and (3) minimize drug exposure once the goal of viral load reduction has been achieved. The algorithm can also be used to compute the number of doses required for treatment.

15/05/2014 5:30 PMM103John Paul Gosling, School of Mathematics, University of LeedsSubjective judgements in skin sensitisation hazard assessments
One key quantity of interest in skin sensitisation hazard assessment is the mean threshold for skin sensitisation for some defined population (called the sensitising potency). Before considering the sensitising potency of the chemical, hazard assessors consider whether the chemical has the potential to be a skin sensitiser in humans. Bayesian belief network approaches to this part of the assessment, which handle the disparate lines of evidence within a probabilistic framework, have been applied successfully. The greater challenge comes in the quantification of uncertainty about the sensitising potency.
To make inferences about sensitising potency, we used a Bayes linear framework to model hazard assessors' expectations and uncertainties and to update those beliefs in the light of some competing data sources. In producing a tool for synthesising multiple lines of evidence and estimating hazard, we developed a transparent mechanism to help defend and communicate risk management decisions. In this talk, I will attempt to describe the principles of this Bayesian modelling and formal processes for capturing expert knowledge. And, hopefully, I will be able to highlight their applicability where fast decisions are needed and data are sparse.

08/05/2014 5:30 PMM103Yoshifumi Ukita, Yokohama College of Commerce and Wolfson College, Cambridge (visitor)Models based on orthonormal systems for experimental design
In this talk, models based on orthonormal systems for experimental design are presented. In such models, it is possible to use fast Fourier Transforms (FFT) to calculate the parameters, which are independent, and which are complex numbers expressed as Fourier coefficients.
Theorems for the relation between the Fourier coefficients and the effect of each factor are also given. Using these theorems, the effect of each factor can be easily obtained from the computed Fourier coefficients. The paper finally shows that the analysis of variance can be used on the proposed models without the need to calculate the degrees of freedom.
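
To make the link between Fourier coefficients and factor effects concrete, here is a small sketch for the simplest case only, a two-level full factorial design. In that case the Fourier transform is over the group Z_2^k (the Walsh-Hadamard transform) and the coefficients are real; the complex coefficients mentioned in the abstract arise with more than two levels. The index convention, the sign rule and the function names are assumptions of this example, not results quoted from the talk.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform (the FFT on Z_2^k); length must be a power of 2."""
    a = list(values)
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for i in range(start, start + h):
                u, v = a[i], a[i + h]
                a[i], a[i + h] = u + v, u - v  # butterfly step
        h *= 2
    return a


def factorial_effects(y):
    """Grand mean and effects of a 2^k full factorial from its Fourier coefficients.

    y[x] is the response at run x, where bit j of x is 1 when factor j is at its
    high level. The result maps each bitmask m to the effect of that factor
    subset (high-minus-low contrast); mask 0 maps to the grand mean.
    """
    n = len(y)
    coeffs = [c / n for c in fwht(y)]
    effects = {0: coeffs[0]}
    for m in range(1, n):
        sign = -1 if bin(m).count("1") % 2 else 1
        effects[m] = 2 * sign * coeffs[m]
    return effects


def level(x, j):
    """Coded level (+1 or -1) of factor j in run x."""
    return 1 if (x >> j) & 1 else -1


# Hypothetical 2^3 experiment generated from y = 10 + 3A - 2B + AB (factor C inert).
responses = [10 + 3 * level(x, 0) - 2 * level(x, 1) + level(x, 0) * level(x, 1)
             for x in range(8)]
print(factorial_effects(responses))
```

All effects come out of one transform costing O(N log N) operations for N runs, with no separate contrast computed per factor.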

27/03/2014 4:30 PMM103Leon Danon, School of Mathematical Sciences, QMULCollective behaviour in social systems
Human social systems show unexpected patterns when studied from a collective point of view. In this talk I will present a few examples of collective behaviour in social systems: human movement patterns, social encounter networks and music collaboration networks, all of which are data driven. I'll try to make the talk short and aim to start a discussion.

13/03/2014 4:00 PMM103Altea Lorenzo-Arribas, Biomathematics & Statistics Scotland (BioSS), The James Hutton Institute, AberdeenCumulative link mixed models and the partial proportional odds assumption
This talk will focus on the challenges faced in mixed modelling with ordinal response variables. Topics covered will include: the advantages of an ordinal approach versus the more generic continuous approach widely used in practice; the implications of the proportional odds assumption and more flexible approaches such as the partial proportional odds assumption; and the implementation of mixed models in this context. Both simulations and applications to real data regarding perceptions on environmental matters will be shown.

06/03/2014 4:30 PMM103Stella Hadjantoni, School of Economics and Finance, QMULMethods for the re-estimation of large-scale linear models after adding and deleting observations
It is often computationally infeasible to re-estimate a large-scale model afresh when a small number of observations is sequentially modified. Furthermore, in some cases a dataset is too large to fit in a computer's memory, and out-of-core algorithms need to be developed. Similarly, data might not be available all at once, and recursive estimation strategies need to be applied. Within this context the aim is to design computationally efficient and numerically stable algorithms. Initially, the re-estimation of the generalized least squares (GLS) solution after observations are deleted, known as downdating, is examined. The new method for estimating the downdated general linear model (GLM) updates the original GLM with the imaginary deleted observations. This results in a non-positive definite dispersion matrix which comprises complex covariance values. This updated GLM with imaginary values has been proven to yield the same GLS estimator as solving the original GLM afresh after downdating.
The estimation of the downdated GLM is formulated as a generalized linear least squares problem (GLLSP). The solution of the GLLSP yields the GLS estimator even when the dispersion matrix is singular. The main computational tool is the generalized QR decomposition, which is employed based on hyperbolic Householder transformations; however, no complex arithmetic is used in practice. The special case of computing the GLS estimator of the downdated SUR (seemingly unrelated regressions) model is considered. The method is extended to the problem of concurrently adding and deleting observations from the model. The special structure of the matrices and properties of the SUR model are efficiently exploited in order to reduce the computational burden of the estimation algorithm. The proposed algorithms are applied to synthetic and real data. Their performance when compared with algorithms that estimate the same model afresh confirms their computational efficiency.

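The idea of deleting an observation by "adding" an imaginary copy of it can be illustrated in a much simpler setting than the one in the abstract. The sketch below assumes ordinary least squares through the origin with a single coefficient; it is not the GLS or generalized QR method described above, and it uses complex arithmetic only to make the cancellation visible. The data and the index `k` are made up for illustration.

```python
# Sketch: deleting an observation by "adding" an imaginary copy of it.
# Assumption: ordinary least squares through the origin (one coefficient),
# not the GLS / generalized QR machinery described in the abstract.

def ols_slope(xs, ys):
    """Least-squares slope for y = beta * x, using plain (non-conjugate) products."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx

xs = [1.0, 2.0, 3.0, 4.0]   # hypothetical regressor values
ys = [1.1, 1.9, 3.2, 3.9]   # hypothetical responses
k = 2                       # index of the observation to delete (hypothetical)

# Refit from scratch without observation k.
beta_refit = ols_slope(xs[:k] + xs[k + 1:], ys[:k] + ys[k + 1:])

# "Update" the full data set with the imaginary observation (i*x_k, i*y_k).
# Since (i*x_k) * (i*y_k) = -x_k * y_k, its contribution cancels the real one.
beta_downdated = ols_slope(xs + [1j * xs[k]], ys + [1j * ys[k]])

print(beta_refit, beta_downdated.real)
```

Both routes give the same slope up to rounding, because the imaginary row subtracts exactly the terms that the deleted row contributed to the normal equations.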
27/02/2014 4:30 PMM103Vassilios Stathopoulos, Centre for computational statistics and machine learning, UCLBat call identification with Gaussian process multinomial probit regression and a dynamic time warping kernel
We study the problem of identifying bat species from echolocation calls in order to build automated bioacoustic monitoring algorithms. We employ the Dynamic Time Warping algorithm, which has been successfully applied to the identification of bird flight calls, and show that classification performance is superior to that of the hand-crafted call shape parameters used in previous research. This highlights that generic bioacoustic software with good classification rates can be constructed with little domain knowledge. We conduct a study with field data of 21 bat species from north and central Mexico using a multinomial probit regression model with a Gaussian process prior and a full EP approximation of the posterior of latent function values. Results indicate high classification accuracy across almost all classes, while the misclassification rate across families of species is low, highlighting the common evolutionary path of echolocation in bats.

13/02/2014 4:30 PMM103David Siegmund, Department of Statistics, Stanford UniversityDetection of Genomic Signals by Resequencing
Several problems of genomic analysis involve detection of local genomic signals. When the data are generated by sequence-based methods, the variability of read depth at different positions on the genome suggests point process models involving nonhomogeneous Poisson processes, or perhaps negative binomial processes if there is excess variability. We discuss a number of examples, and consider in detail a model for detection of insertions and deletions (indels) based on paired-end reads.
This is joint research with Nancy Zhang and Benjamin Yakir.

06/02/2014 4:30 PMM103Peter Congdon, The School of Geography, QMULMeasuring spatial clustering in disease patterns
The talk considers a cluster detection methodology which describes the cluster status of each area, and provides alternative/complementary perspectives to spatial scan cluster detection. The focus is on spatial health risk patterns (area disease prevalence, area mortality, etc) when area relative risks are unknown parameters. The method provides additional insights with regard to cluster centre areas vs. cluster edge areas. The method also considers both low risk clustering and high risk clustering in an integrated perspective, and measures high/low risk outlier status. The application of the method is considered with simulated data (and known spatial clustering), and with real examples, both univariate and bivariate.

30/01/2014 4:30 PMM103Javier Rubio, Department of Statistics, The University of WarwickModelling of skewness and kurtosis with double two-piece distributions
In this talk, I will present a brief summary of several classes of univariate flexible distributions employed to model skewness and kurtosis. We will discuss a simple classification of these distributions in terms of their tail behaviour. This classification motivates the introduction of a new family of distributions (double two-piece distributions), which is obtained by using a transformation defined on the family of unimodal symmetric continuous distributions containing a shape parameter. The proposed distributions contain five interpretable parameters that control the mode, as well as the scale and shape in each direction. Four-parameter subfamilies of this class of transformations are also discussed. An interpretable scale- and location-invariant benchmark prior is also presented, together with conditions for the existence of the corresponding posterior distribution. Finally, the use of this sort of model is illustrated with a real data example.

17/12/2013 4:30 PMM203Alexandra Piryatinska, Department of Mathematics, San Francisco State UniversityDetection of changes in the generating mechanism of time series via the epsilon-complexity of continuous functions
A novel methodology for the detection of abrupt changes in the generating mechanisms (stochastic, deterministic or mixed) of a time series, without any prior knowledge about them, will be presented. This methodology has two components: the first is a novel concept of epsilon-complexity, and the second is a method for change point detection. In the talk, we will give the definition of the epsilon-complexity of a continuous function defined on a compact segment. We will show that for the Hölder class of functions there exists an effective characterization of the epsilon-complexity. The results of simulations and applications to electroencephalogram data and financial time series will be presented. (The talk is based on joint work with Boris Darkhovsky at the Russian Academy of Sciences.)

05/12/2013 4:30 PMM203Peter Challenor, Exeter UniversityClimate, Models and Uncertainty
tba

21/11/2013 4:30 PMM203Erica Thompson, Centre for Climate Change Economics and Policy, LSEStatistical challenges in climate change research
Climate change research methods, particularly those aspects involving projection of future climatic conditions, depend heavily upon statistical techniques but are still at an early stage of development. I will discuss what I see as the key statistical challenges for climate research, including the problems of too little and too much data, the principles of inference from model output, and the relationship of statistics with dynamics (physics). With reference to some specific examples from the latest IPCC report and beyond, I will show that there is a need for statisticians to become more involved with climate research and to do so in a manner that clarifies, rather than obscures, the role and influence of physical constraints, of necessary simplifying assumptions, and of subjective expert judgement.

07/11/2013 3:45 PMSeminar to be held in Southampton UniversitySteven Gilmour (Southampton) and Luzia Trinca (UEP)Multistratum Designs for Statistical Inference
It is increasingly realised that many industrial experiments involve some factors whose levels are harder to reset than others, leading to multistratum structures. Designs are usually chosen to optimise the point estimation of fixed effects parameters, such as polynomial terms in a response surface model, using criteria such as D- or A-optimality. Gilmour and Trinca (2012) introduced the DP- and AP-optimality criteria, which optimise interval estimation, or equivalently hypothesis testing, by ensuring that unbiased (pure) error estimates can be obtained. We now extend these ideas to multistratum structures, by adapting the stratum-by-stratum algorithm of Trinca and Gilmour (2014) to ensure optimal interval estimation in the lowest stratum. It turns out that, in most practical situations, this also ensures that adequate pure error estimates are available in the higher strata. Several examples show that good practical designs can be obtained, even with fairly small run sizes.

07/11/2013 2:15 PMSeminar to be held in Southampton UniversityHugo Maruri-Aguilar, School of Mathematical Sciences, QMULOptimal design for smooth supersaturated models (SSM)
Smooth supersaturated models (SSM) are interpolation models in which the underlying model size, and typically the degree, is higher than would normally be used in statistics, but where the extra degrees of freedom are used to make the model smooth.
I will describe the methodology, discuss briefly the role of orthogonal polynomials and then address two design problems. The first is the selection of knots and the second is a more traditional design problem using SSM to obtain the kernels of interest for D-optimality.
This is joint work with Ron Bates (Rolls-Royce), Peter Curtis (QMUL) and Henry Wynn (LSE).

31/10/2013 4:30 PMM203Haeran Cho, Department of Mathematics, University of BristolModelling and forecasting daily electricity loads via curve linear regression
We study the problem of modelling and short-term forecasting of electricity loads. Regarding the electricity load on each day as a curve, we propose to model the dependence between successive daily curves via curve linear regression. The key ingredient in curve linear regression modelling is the dimension reduction based on a singular value decomposition in a Hilbert space, which reduces the curve regression problem to several ordinary (i.e. scalar) linear regression problems. We illustrate the method by performing one-day-ahead forecasting of the electricity loads consumed by the customers of EDF between 2011 and mid-2012, where we also compare our method with other available models.
This is joint work with Yannig Goude, Xavier Brossat and Qiwei Yao.

17/10/2013 4:30 PMM203Maria Vazquez, Department of Public Health, University of OxfordControl charts applied to the management of bipolar disorder patients
Control charts are well known tools in industrial statistical process control. They are used to distinguish between random error and systematic variability. The use of these tools in medicine has only started in recent years.
In this seminar we present a project in which we explore the ability of Shewhart's control rules to predict severe manic and depressive episodes in bipolar disorder patients. In our study, we consider three types of control charts and a variety of scenarios using real data.
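As a rough illustration of the basic Shewhart rule (flag any point more than three standard deviations from the centre line), the following minimal sketch may help. The mood scores are invented, the function names are hypothetical, and estimating the spread with the sample standard deviation of a baseline period is an assumption; the three chart types and the scenarios studied in the project are not reproduced here.

```python
from statistics import mean, stdev

def shewhart_limits(baseline, k=3.0):
    """Return (lower, centre, upper) control limits from a baseline period."""
    centre = mean(baseline)
    spread = stdev(baseline)  # assumption: sample SD of the baseline
    return centre - k * spread, centre, centre + k * spread

def out_of_control(values, lower, upper):
    """Indices of observations that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical weekly self-reported mood scores for one patient.
baseline = [5, 6, 4, 5, 7, 5, 6, 4, 5, 6]
new_scores = [5, 6, 12, 5, 4]

lower, centre, upper = shewhart_limits(baseline)
flags = out_of_control(new_scores, lower, upper)
print(f"limits: ({lower:.2f}, {upper:.2f}), flagged weeks: {flags}")
```

Points inside the limits are treated as random variation; a flagged point is a candidate signal of systematic change that would prompt a closer look.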

10/10/2013 5:30 PMM203Roberto Fontana, Dipartimento di Scienze Matematiche Politecnico di TorinoSaturated designs: some applications
In the first part of the talk we study saturated fractions of factorial designs from the perspective of Algebraic Statistics. Exploiting the identification of a fraction with a binary contingency table, we define a criterion to check whether or not a fraction is saturated with respect to a given model. The proposed criterion is based on combinatorial algebraic objects, namely the circuit basis of the toric ideal associated with the design matrix of the model. This is joint work with Fabio Rapallo (Università del Piemonte Orientale, Italy) and Maria Piera Rogantin (Italy).
In the second part of the talk we study optimal saturated designs, mainly D-optimal designs. Efficient algorithms for searching for optimal saturated designs are widely available. They maximize a given efficiency measure (such as D-optimality) and provide an optimum design. Nevertheless, they do not guarantee a globally optimal design. Indeed, they start from an initial random design and find a locally optimal design. If the initial design is changed, the optimum found will, in general, be different. A natural question arises: should we stop at the design found, or should we run the algorithm again in search of a better design? This paper uses very recent methods and software for discovery probability to support the decision to continue or stop the sampling. A software tool written in SAS has been developed.

06/06/2013 5:30 PMM203Peter Kimani, Warwick Medical SchoolConditionally unbiased estimation in adaptive seamless designs
In order to accelerate drug development, adaptive seamless designs (ASDs) have been proposed. In this talk, I will consider two-stage ASDs, where in stage 1, data are collected to perform treatment selection or subpopulation selection. In stage 2, additional data are collected to perform confirmatory analysis for the selected treatments or subpopulations. Unlike the traditional testing procedures, for ASDs, stage 1 data are also used in the confirmatory analysis. Although ASDs are efficient, using stage 1 data both for selection and confirmatory analysis poses statistical challenges in making inference.
I will focus on point estimation at the end of trials that use ASDs. Estimation is challenging because multiple hypotheses are considered at stage 1, and the experimental treatment (or the subpopulation) that appears to be the most effective is selected, which may lead to bias. Estimators derived need to account for this fact. In this talk, I will describe estimators we have developed.

23/05/2013 5:30 PMM203Angela Noufaily, Department of Mathematics And Statistics, The Open UniversityAn improved algorithm for outbreak detection in multiple surveillance systems
In England and Wales, a large-scale multiple statistical surveillance system for infectious disease outbreaks has been in operation for nearly two decades. This system uses a robust quasi-Poisson regression algorithm to identify aberrances in weekly counts of isolates reported to the Health Protection Agency. In this paper, we review the performance of the system with a view to reducing the number of false reports, while retaining good power to detect genuine outbreaks. We undertook extensive simulations to evaluate the existing system in a range of contrasting scenarios. We suggest several improvements relating to the treatment of trends, seasonality, reweighting of baselines and error structure. We validate these results by running the existing and proposed new systems in parallel on real data. We find that the new system greatly reduces the number of alarms while maintaining good overall performance and in some instances increasing the sensitivity.

02/05/2013 5:30 PMM203Rosemary Bailey, QMUL/University of St AndrewsCircular designs balanced for neighbours at distances one and two
We consider experiments where the experimental units are arranged in a circle or in a single line in space or time. If neighbouring treatments may affect the response on an experimental unit, then we need a model which includes the effects of direct treatments, left neighbours and right neighbours. It is desirable that each ordered pair of treatments occurs just once as neighbours and just once with a single unit in between. A circular design with this property is equivalent to a special type of quasigroup.
In one variant of this, self-neighbours are forbidden. In a further variant, it is assumed that the left-neighbour effect is the same as the right-neighbour effect, so all that is needed is that each unordered pair of treatments occurs just once as neighbours and just once with a single unit in between.
I shall report progress on finding methods of constructing the three types of design.

02/05/2013 4:30 PMM203Marion Chatfield/Simon Bate University of Southampton/ Glaxo Smith KlineUsing the experimental design and its randomisation to construct a mixed model
In many areas of scientific research complex experimental designs are now routinely used. With the advent of mixed model algorithms, implemented in many statistical software packages, the analysis of data generated from such experiments has become more accessible. However, failing to correctly identify the experimental design used can lead to incorrect model selection and misleading inferences. A procedure is described that identifies the structure of the experimental design and, given the randomisation, generates a maximal mixed model. This model is determined before the experiment is conducted and provides a starting point for the final statistical analysis. The whole process can be illustrated using a generalisation of the Hasse diagram called the Terms Relationships diagram. Most parts of the algorithm have been implemented in a program written in R. It is shown that the model selection process can be simplified by placing experimental design (crossed/nested structure and randomisation) at the centre of a systematic procedure.

22/04/2013 4:30 PM130 Wolfson InstituteSteve Coad, School of Mathematical Sciences, QMULInference following adaptive biased coin designs
Suppose that two treatments are being compared in a clinical trial. Then, if complete randomisation is used, the next patient is equally likely to be assigned to one of the two treatments. So this randomisation rule does not take into account the previous treatment assignments, responses and covariate vectors, and the current patient's covariate vector. The use of an adaptive biased coin which takes some or all of this information into account can lead to a more powerful trial.
The different types of such designs which are available are reviewed and the consequences for inference discussed. Issues related to both point and interval estimation will be addressed.
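One of the simplest members of this family is Efron's biased coin, which uses only the previous treatment assignments: the treatment that is currently under-represented is assigned with probability p > 1/2. The sketch below is an illustration under that assumption, with p = 2/3 and hypothetical function names; it ignores responses and covariates, which the more general adaptive designs in the talk can also take into account.

```python
import random

def prob_assign_a(n_a, n_b, p=2/3):
    """Probability that the next patient receives treatment A."""
    if n_a < n_b:
        return p        # A is under-represented: favour A
    if n_a > n_b:
        return 1 - p    # A is over-represented: favour B
    return 0.5          # balanced so far: toss a fair coin

def allocate(n_patients, p=2/3, seed=0):
    """Simulate a sequence of treatment assignments with the biased coin."""
    rng = random.Random(seed)
    n_a = n_b = 0
    sequence = []
    for _ in range(n_patients):
        if rng.random() < prob_assign_a(n_a, n_b, p):
            sequence.append("A")
            n_a += 1
        else:
            sequence.append("B")
            n_b += 1
    return sequence

seq = allocate(20)
print("".join(seq), "imbalance:", abs(seq.count("A") - seq.count("B")))
```

Setting `p=0.5` recovers complete randomisation, which makes it easy to compare the typical imbalance under the two rules by simulation.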

28/03/2013 4:30 PMM203Ivonne Solís, MRC Human Nutrition ResearchGraphical models with latent variables and their application in developmental psychology
We present a novel strategy of statistical inference for graphical models with latent Gaussian variables, and observed variables that follow nonstandard sampling distributions. We restrict our attention to those graphs in which the latent variables have a substantive interpretation. In addition, we adopt the assumption that the distribution of the observed variables may be meaningfully interpreted as arising after marginalising over the latent variables. We illustrate the method with two studies that investigate developmental changes in cognitive functions of young children in one case and of cognitive decline of Alzheimer’s patients in the other. These studies involve the assessment of competing causal models for several psychological constructs; and the observed measurements are gathered from the administration of batteries of tasks subject to complicated sampling protocols.

21/03/2013 4:30 PMM203Magdalena Chudy, EECS QMULOn the relation between bowing gesture and tone production in classical cello performance. Searching for effective metho
In this presentation I would like to introduce a multimodal database which was created within the scope of my PhD study on “Cello Performer Modelling Using Timbre Features”.
The database consists of bowing gestures and music samples of six cello players recorded on two different cellos. The gesture and audio measurements were collected in order to identify performer-dependent sound features of the players performing on the same instrument and to investigate a potentially existing correlation between the individual sound features and specific bowing control parameters necessary for production of the desired richness of tone. The current study goal is to find such combinations of respective bowing gestures and acoustical features which can be seen as patterns and are able to characterise each player in the database.
Following the data presentation I would like to state some other research questions that clearly emerge and open a discussion on analysis methods which could help to answer them.

14/03/2013 3:30 PMM203Karla Díaz-Ordaz, LSHTMHandling missing values in hierarchical clinical trial data
Missing data are common in clinical trials but often analysis is based on “complete cases”. Complete-case analyses (which delete observations with missing information on any studied covariate) are inefficient and may be biased. Methodological guidelines recommend using multiple imputation (MI). However, for MI to provide valid inferences, the imputation model must recognise the study design. In this talk, we will survey current missing data practice in the clinical trials literature and describe current good practice methodology for hierarchical data.
Using real data from a cluster randomized trial as an example, we see how treatment effects can be sensitive to the choice of method to address the missing data problem. We finish by presenting a few results from a large simulation study, designed to compare the performance of: (a) multilevel MI that accounts for clustering through cluster random effects, (b) MI that includes a fixed effect for each cluster and (c) single-level MI that ignores clustering.

28/02/2013 4:30 PMM203Clifford Lam, Department of Statistics, LSERegularization of Spatial Panel Time Series
In this talk we introduce the need for the estimation of cross-sectional dependence, or "network", of a panel of time series. In spatial econometrics and other disciplines, the so-called spatial weight matrix in a spatial lag model is always assumed known, although it is still under debate whether the results of estimation can be sensitive to such assumed known weight matrices. Since these weight matrices are often sparse, we propose to regularize them from the data using a by now well-known technique, the adaptive LASSO. The technique used to quantify time dependence is relatively new to the statistics and time series literature. Non-asymptotic inequalities, as well as asymptotic sign consistency for the elements of the weight matrices, are presented with explicit rates of convergence spelt out. A block coordinate descent algorithm is presented together with results from simulation experiments and a real data analysis.

21/02/2013 4:30 PMM203Dr Mark Strong, Section of Public Health, School of Health and Related Research, The University of SheffieldHealth Economic Model Error and the Expected Value of Model Improvement
George Box famously said “All models are wrong, some are useful”. The challenges are to determine which models are useful and to quantify how wrong is “wrong”. In this talk I will explore the problem of determining model adequacy in the context of health economic decision making.
In health economics, models are used to predict the costs and health benefits under the competing treatment options (e.g. drug A versus drug B).
The decision problem is typically of the following form. An expensive new drug, A, has arrived on the market. Should the NHS use it? How much additional health will society gain if the NHS uses new drug A over existing drug B? What will the extra cost be? What healthcare activity will be displaced if we use drug A rather than drug B? Will this be a good use of scarce healthcare resources?
I will describe a general approach to determining model adequacy that is based on quantifying the “expected value of model improvement”. I will illustrate the method in a case study.

24/01/2013 4:30 PMM203Ruby ChildsMaking the best out of things
I was once a QMUL Maths student, quite lost on what I wanted to do next. I strived to get into investment banking, but it wasn't all that it seemed, so I had to make a change. I now programme; I'm now creative.
Not everyone knows what to do after finishing university or how to make the best of themselves. Join me to talk about tips on how to do well at university and how to do well afterwards, drawn from my own mistakes.
Slides for the seminar are available following the link: http://prezi.com/_hpds1p9jrqx/makingthebestoutofthings/

17/01/2013 4:30 PMM203Alexis Boukouvalas Aston Research Centre for Healthy Ageing (ARCHA) Aston UniversityOptimal Design for Stochastic emulation with heteroscedastic Gaussian Process models
We examine optimal design for parameter estimation of Gaussian process regression models under input-dependent noise. Such a noise model leads to heteroscedastic models as opposed to homoscedastic models where the noise is assumed to be constant. Our motivation stems from the area of computer experiments, where computationally demanding simulators are approximated using Gaussian process emulators as statistical surrogates. In the case of stochastic simulators, the simulator may be evaluated repeatedly for a given parameter setting, allowing for replicate observations in the experimental design. Our findings are applicable, however, in the wider context of design for Gaussian process regression and kriging where the parameter variance is sought to be minimised. Designs are proposed with the aim of minimising the variance of the Gaussian process parameter estimates; that is, we seek designs that enable us to best learn about the Gaussian process model.
We construct heteroscedastic Gaussian process representations and propose an experimental design technique based on an extension of Fisher information to heteroscedastic models. We show empirically that although a strict ordering of the Fisher information to the maximum likelihood parameter variance is not exact, the approximation error is reduced as the ratio of replicated points is increased. Through a series of simulation experiments on both synthetic data and a systems biology model, the replicate-only optimal designs are shown to outperform both replicate-only and non-replicate space-filling designs as well as non-replicate optimal designs. We consider both local and Bayesian D-optimal designs in our experiments.

Nested row-column designs for near-factorial experiments with two treatment factors and one control treatment

15/11/2012 4:30 PMM203Fatima Jichi Medical Statistician, UCL School of Life and Medical SciencesGrowth Mixture Modelling of Child Behaviour in a Study of Children Receiving Multidimensional Treatment Foster Care in
The study aims to evaluate the response of children to a new treatment, Multidimensional Treatment Foster Care in England (MTFCE). Trajectories of child behaviour were studied over time to identify subgroups of treatment response.
Growth Mixture Modelling (GMM) was used to find subgroups in the data. A GMM describes longitudinal measures of a single outcome measure as being driven by a set of subject-varying continuous unobserved or latent variables, the so-called growth factors. The growth factors define the individual trajectories. GMM estimates mean growth curves for each class, and individual variation around these growth curves. This allows us to find clusters in the data. Starting characteristics of children were included in the GMM to see if these predicted class membership. Class membership was also checked to see if it predicted outcomes of interest.

08/11/2012 4:30 PMM203Hugo MaruriAguilar, School of Mathematical Sciences, Queen Mary, University of LondonComputer simulators
A computer experiment consists of simulation of a computer model which is expected to mimic or represent some aspect of reality. The analysis of computer simulations is a relatively recent newcomer in the bag of tools available for the statistics practitioner. Although simulations do not necessarily represent reality, it is possible to gain knowledge about a certain phenomenon through the analysis of such simulations, and the role of the statistician is to design efficient experiments to explore the parameter region and to model the response with a reasonable degree of accuracy.
I intend to guide the talk through a series of examples derived from practice, ranging from analysis of airplane blades to the sensitivity of parameters in a model for disease spread. The main example will be based on the analysis for a model of the evolution of rotavirus in a population.

31/05/2012 5:30 PMM203Richard Stevens, Senior Statistician, Department of Primary Health Care, University of OxfordStatistical models for monitoring chronic disease
When setting a monitoring programme for conditions such as diabetes, hypertension, high blood pressure, kidney disease or HIV, one aspect (the interval between monitoring tests) is often set by consensus rather than from evidence. The difficulty with randomized trials in this area is easily demonstrated. Oxford's Monitoring and Diagnosis group, and collaborators, have used longitudinal modelling to show that over-frequent monitoring leads to a kind of 'multiple testing' problem and hence to overtreatment. This talk will discuss the methods we use and illustrate them with a clinical example.

17/05/2012 4:45 PMSeminar to be held in Southampton University, Building 54 Room 10037Heiko Grossmann, School of Mathematical Sciences, QMULAnalysis of variance for dummies with the AutomaticAnova package
The analysis of variance (Anova) is one of the most popular statistical methods for analysing data. It is most powerful when applied to data from designed experiments. Statistics courses for biologists and other scientists usually explain the underlying theory for simple designs such as the completely randomized design, the randomized complete block design or two-factor factorial designs. However, in applications usually much more advanced designs are used which involve complicated crossing and nesting structures as well as random and fixed effects. Although when designing an experiment scientists can often rely on their intuition, analysing the collected data frequently represents a major challenge to the non-expert.
This talk presents the AutomaticAnova package which has been designed as a user-friendly Mathematica package that enables researchers to analyse complicated Anova models without requiring much statistical background. It is based on RA Bailey's theory of orthogonal designs which covers a wide range of models and in particular all designs that can be obtained by iterative crossing and nesting of factors. The theory distinguishes between block factors, which have random effects, and treatment factors whose effects are fixed and so in general the models are mixed effects models. For these designs the Anova table can be derived in an elegant way by using Hasse diagrams.
The AutomaticAnova package provides a graphical user interface which has been implemented by using Mathematica's GUIKit. Input data are submitted in the form of a Microsoft Excel spreadsheet and essentially the user only has to specify which columns in the spreadsheet represent block and treatment factors respectively, and whether only main effects or main effects and interactions should be included in the analysis. Dynamic enabling/disabling of dialogs and controls minimizes the risk of providing incorrect input information. The most important feature of the AutomaticAnova package is then, however, that the model for the analysis of variance is automatically inferred from the structure of the design in the Excel file. In particular, no model formula needs to be specified for the analysis. It is believed that this aspect of the package's functionality will be highly attractive to practitioners. The output includes the Anova table, estimated variance components and Hasse diagrams and can be saved as a PDF file.
Another feature of the package is that it can be used at the planning stage of an experiment, when no response data are yet available, to see what the analysis would look like. That is, having only specified the design in the form of a spreadsheet, the package provides the so-called skeleton analysis of variance, which shows the breakdown of the sum of squares and corresponding degrees of freedom. Also, when no response data are available the package uses Mathematica's symbolic capabilities to derive analytical formulae for the estimators of the variance components.
The presentation will demonstrate the use of the package and describe the principles underlying its implementation. Several examples will be used to illustrate how the AutomaticAnova package can help the non-statistician to analyse complicated Anova designs without having to worry too much about statistics.
17/05/2012 3:15 PMSeminar to be held in Southampton University, Building 54 Room 10037Kalliopi Mylona University of SouthamptonAnalysing data from optimal mixed-level supersaturated designs using group screening
Supersaturated designs (SSDs) are used for screening out the important factors from a
large set of potentially active variables. The huge advantage of these designs is that
they reduce the experimental cost drastically, but their critical disadvantage is the
high degree of confounding among factorial effects. In this contribution, we focus on
mixed-level factorial designs which have different numbers of levels for the factors.
Such designs are often useful for experiments involving both qualitative and quantitative
factors. When analyzing data from SSDs, as in any decision problem, errors of various
types must be balanced against cost. In SSDs, there is a cost of declaring an inactive
factor to be active (i.e. making a Type I error), and a cost of declaring an active
effect to be inactive (i.e. making a Type II error). Type II errors are usually considered
much more serious than Type I errors. We present a group screening method for analysing
data from E(f_{NOD})-optimal mixed-level supersaturated designs possessing the equal
occurrence property. Based on the idea of the group screening methods, the f factors
are subdivided into g "group-factors". The "group-factors" are then studied using the
penalized likelihood methods involving a factorial design with orthogonal or near-orthogonal
columns. The penalized likelihood methods indicate which "group-factors" have a large
effect and need to be studied in a follow-up experiment. We will compare various methods
in terms of Type I and Type II error rates using a simulation study.
Keywords and phrases: Group screening method, Data analysis, Penalized least squares,
Supersaturated design. 
03/05/2012 5:30 PMM203Lynn R. LaMotte, Louisiana State University Health Sciences Center, New OrleansStatistical questions in estimating postmortem interval from insect evidence
Insect evidence around a decomposing body can provide a biological clock
by which the time of exposure can be estimated. As decomposition progresses,
fly larvae grow and go through distinct developmental stages, and a
succession of insect species visits the scene.
Viewed broadly, the question, how long the body has been exposed, fits
into the framework of inverse prediction. However, insect evidence is
both quantitative and categorical. Size data are multivariate, and their
magnitudes, variances, and correlations change with age. Presence/absence
of important species manifests categorically, but the number of distinct
categories can run into the thousands.
The statistical challenge is to devise an approach that can provide a
credible, defensible estimate of postmortem interval based on such data.
In this talk I shall present the setting and describe joint work I have
undertaken with Jeffrey D. Wells, a forensic entomologist, to address
this question. 
24/04/2012 5:30 PMM203Nicolas SAVYUniversité Paul Sabatier - Toulouse 3On the use of Fleming and Harrington's test to detect late effects in clinical trials
In this work, we deal with the question of detecting late effects in the setting of clinical trials. The most natural test for detecting this kind of effect was introduced by Fleming and Harrington. However, this test depends on a parameter that, in the context of clinical trials, must be chosen a priori.
We examine the reasons why this test is suited to the detection of late effects by studying its optimality in terms of Pitman asymptotic relative efficiency. We give an explicit form of the function describing alternatives for which the test is optimal. Moreover, we will observe, by means of a simulation study, that this test is not very sensitive to the value of the parameter, which is very reassuring for its use in clinical trials.
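For reference, the Fleming and Harrington family is usually written with weight S(t-)^rho (1 - S(t-))^gamma on the log-rank increments, where S is the pooled Kaplan-Meier estimate; the abstract does not say which of these parameters is the one that must be fixed a priori. A minimal Python sketch, with illustrative function and argument names and made-up counts, shows why choosing rho = 0 and gamma > 0 emphasises late differences:

```python
def fleming_harrington_weights(at_risk, events, rho=0.0, gamma=1.0):
    """Return the weight S(t-)**rho * (1 - S(t-))**gamma at each event time.

    at_risk[i] and events[i] are the pooled numbers at risk and of events
    at the i-th ordered event time (hypothetical inputs for illustration).
    """
    weights = []
    surv = 1.0  # pooled Kaplan-Meier estimate just before the current time
    for n_i, d_i in zip(at_risk, events):
        weights.append(surv ** rho * (1.0 - surv) ** gamma)
        surv *= 1.0 - d_i / n_i  # Kaplan-Meier update after this event time
    return weights


# Made-up pooled counts at three ordered event times.
late = fleming_harrington_weights([10, 8, 5], [2, 2, 1], rho=0.0, gamma=1.0)
logrank = fleming_harrington_weights([10, 8, 5], [2, 2, 1], rho=0.0, gamma=0.0)
```

With rho = gamma = 0 every weight equals 1, which is the ordinary log-rank test; with rho = 0 and gamma = 1 the weights grow as the pooled survival estimate falls, so later event times count more.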

29/03/2012 5:30 PMM203Professor Byron Jones, Biometrical FellowStatistical Methodology GroupNovartis Pharm AGBaselSwitzerlandModel-Based Bayesian Adaptive Dose Finding Designs for a Phase II Trial
After giving a brief overview of the different phases of drug development,
I will present a case study that describes the planning of a dose-finding study
for a compound that was in early clinical development at the time of the study.
Data from a previous trial with the same primary endpoint was available for a
marketed drug that had the same pharmacological mechanism, which provided
strong prior information for some characteristics of the new compound, including
the shape of the dose-response relationship. The design used for this trial included
an adaptive element where the allocation of doses to the patients was changed
after an interim analysis. In this talk I will compare the performance of different adaptive
designs and compare them to a corresponding non-adaptive design. I will also compare
the performance of Bayesian and model-based maximum likelihood estimation relative
to the use of simple pairwise comparisons of treatment means. 
22/03/2012 4:30 PMM203Steve CoadSchool of Mathematical Sciences, QMULEstimation following Adaptively Randomised Clinical Trials
Suppose that two treatments are being compared in a clinical trial
in which response-adaptive randomisation is used. Upon termination of
the trial, the aim is to estimate the parameters of interest.
Although the usual estimators will be approximately unbiased for
trials with moderate to large numbers of patients, their biases may
be appreciable for small to moderate-sized trials and the corresponding
confidence intervals may also have coverage probabilities far from the
nominal values. An adaptive two-parameter model is studied in which
there is a parameter of interest and a nuisance parameter. Corrected
confidence intervals based on the signed root transformation are
constructed for the parameter of interest which have coverage probabilities
close to the nominal values for trials with a small number of patients.
The accuracy of the approximations is assessed by simulation for two examples.
An extension of the approach to higher dimensions is discussed. 
15/03/2012 4:30 PMM203Mohammad Lutfor RahmanSchool of Mathematical Sciences, QMULMulti-Stratum and Split-Plot Designs in Two Industrial Experiments
Hard-to-set factors lead to split-plot type designs and mixed models. Mixed models are used to
analyze multi-stratum designs as each stratum may have random effects on the responses. It is
usual to use residual maximum likelihood (REML) to estimate random effects and generalized
least squares (GLS) to estimate fixed effects. However, a typical property of REML-GLS estimation
is that it gives highly undesirable and misleading conclusions in non-orthogonal split-plot
designs with few main plots. More specifically, the variance components are often estimated
poorly using maximum likelihood (ML) methods when there are few main plots. To overcome the
problem a Bayesian method considering informative priors for variance components and using
Markov chain Monte Carlo (MCMC) sampling would be an alternative approach.
In the current study we have implemented MCMC techniques in two industrial experiments. During
binary data analysis, we have faced convergence problems frequently. Perhaps these are due to
separation problems in the data. In future, we will define a design criterion that will minimize
the problem of separation. 
01/03/2012 4:30 PMM203Marco Geraci Institute of Child Health, University College LondonQuantile inference for complex survey data with missing values
The estimation of population parameters using complex survey data requires careful statistical modelling to account for the design features. The analysis is further complicated by unit and item nonresponse for which a number of methods have been developed in order to reduce estimation bias.
In this talk I will address some issues that arise when the target of the inference is the conditional quantile of a continuous outcome. Survey design variables are duly included in the analysis and a bootstrap variance estimation approach is considered. A novel multiple imputation method based on sequential quantile regressions (QR) is developed. This method is able to preserve the distributional relationships in the data, including conditional skewness and kurtosis, and to successfully handle bounded outcomes. The motivating example concerns the analysis of birthweight determinants in a large cohort of British children.
23/02/2012 4:30 PMM203Miguel JuarezUniversity of SheffieldFrom time course gene expression to gene regulatory networks
The accelerated development of high-throughput technologies has enabled understanding of how biological systems function at a molecular level, for instance by unraveling the interaction structure of genes responsible for carrying out a given process. Systems biology has the potential to enhance knowledge acquisition and facilitate the reverse engineering of global regulatory networks using gene expression time course experiments.
In this talk I will present some models we have developed for estimating a gene interaction network from time course experimental data. The basic structure of these models is governed by a dynamic Bayesian network, which allows us to include expert biological information as well. Given the complexity of model fit, we resort to numerical methods for model estimation.
I will exemplify gene network inference using experimental data from the metabolic change in Streptomyces coelicolor and the circadian clock in Arabidopsis thaliana.
16/02/2012 4:30 PMM203Mirela DomijanWarwick Systems Biology CentreAn overview of several methods to analyse dynamics of chemical reaction networks
In order to make sense of many biological processes, it is crucial to understand
the dynamics of the underlying chemical reactions. Chemical reaction systems are
known to exhibit some interesting and complex dynamics, such as multistability
(a situation where two or more stable equilibria coexist) or oscillations.
Here we take the deterministic approach and assume that the reactions obey the
law of mass action, so the systems are described by ODEs with specific polynomial
structure. For such systems, this polynomial structure allows us to gain
surprisingly deep insights into the systems' dynamics.
In my talk I will give an overview of several methods for analysing these specific chemical
reaction networks, encompassing algebraic geometry, bifurcation theory and graph
theory. 
03/03/2011 4:30 PMM203Serge Guillas Department of Statistical Science, University College LondonBayesian calibration and emulation of geophysical computer models
In this talk, we demonstrate a procedure for calibrating and emulating complex computer
simulation models having uncertain inputs and internal parameters, with application to
the NCAR Thermosphere-Ionosphere-Electrodynamics General Circulation Model (TIE-GCM),
and illustrate preliminary findings for Computational Fluid Dynamics and tsunami wave
modelling. In the case of TIE-GCM, we compare simulated magnetic perturbations with
observations at two ground locations for various combinations of calibration parameters.
These calibration parameters are: the amplitude of the semidiurnal tidal perturbation
in the height of a constant-pressure surface at the TIE-GCM lower boundary, the local
time at which this maximises and the minimum nighttime electron density.
A fully Bayesian approach that describes correlations in time and in the calibration
input space is implemented. A Markov Chain Monte Carlo (MCMC) approach leads to potential
optimal values for the amplitude and phase (within the limitations of the selected
data and calibration parameters) but not for the minimum nighttime electron density.
The procedure can be extended to include additional data types and calibration parameters. 
03/06/2010 5:30 PMM203Teo Sharia Department of Mathematics Royal Holloway, University of LondonOnline parameter estimation procedures with application to estimating autoregressive parameters
A wide class of online estimation procedures will be proposed for the general statistical model.
In particular, new procedures for estimating autoregressive parameters in $AR(m)$ models will be
considered. The proposed method allows for incorporation of auxiliary information into the estimation
process, and is consistent and asymptotically efficient under certain regularity conditions. Also,
these procedures are naturally online and do not require storing all the data.
Two important special cases will be considered in detail: linear procedures and likelihood procedures
with the LS truncations. A specific example will also be presented to briefly discuss some practical
aspects of applications of the procedures of this type. 
20/05/2010 5:30 PMJouni Kuha Department of StatisticsLondon School of EconomicsThe role of education in social mobility - Path analysis for discrete variables
Classical path analysis provides a simple way of expressing the observed
association of two variables as the sum of two terms which can with good
reason be described as the "direct effect" of one variable on the other
and the "indirect effect" via a third, intervening variable. This result
is used for linear models for continuous variables. It would often be of
interest to have a similar effect decomposition for cases where some of
the variables are discrete and modelled using nonlinear models. One
such problem occurs in the study of social mobility, where the aim is to
decompose the association between a person's own and his/her parents'
social classes into an indirect effect attributable to associations
between education and class, and a direct effect not due to differences
in education.
Extending the idea of linear path analysis to nonlinear models
requires, first, an extended definition of what is meant by total,
direct and indirect effects and, second, a way of calculating sample
estimates of these effects and their standard errors. One solution to
these questions is presented in this talk. The method is applied to data
from the UK General Household Survey, illustrating the magnitude of the
contribution of education to social mobility in Britain in recent
decades. [This is joint work with John Goldthorpe (Nuffield College, Oxford).]

13/05/2010 5:00 PMJoint meeting QMUL-S3RI, Statistics Department, Southampton UniversityBarbara Bogacka School of Mathematical SciencesQueen Mary, University of LondonFirst in Human dose selection studies - a lesson from the TGN1412 trial
In 2006 the TGN1412 clinical trial was suddenly aborted due to a very strong cytotoxic
reaction of the six volunteers who were treated with the drug candidate. An Expert
Scientific Group on Clinical Trials as well as the RSS Working Party wrote reports on
what happened, how this could have been avoided and what to recommend for future trials
of this kind. I will present some work related to designs of such trials, in particular
my work on adaptive design of experiments. I will also present some recommendations of
the RSS Working Group (Senn et al. 2007).
Senn, S., Amin, D., Bailey, R.A., Bird, S.M., Bogacka, B., Colman, P., Garrett, A., Grieve, A., Lachmann, P. (2007). Statistical Issues in first-in-man studies. JRSS A.
06/05/2010 5:30 PMM203Ioannis Kosmidis Department of StatisticsThe University of WarwickThe reduction of bias in GLMs with emphasis on models with categorical responses
For estimation in exponential family models, Kosmidis & Firth (2009, Biometrika) show
how the bias of the maximum likelihood estimator may be reduced by appropriate adjustments
to the efficient score function. In this presentation the main results of that study are
discussed, complemented by recent work on the easy implementation and the beneficial
side-effects that bias reduction can have in the estimation of some well-used generalised
linear models for categorical responses. The construction of confidence intervals to
accompany the bias-reduced estimates is discussed.
25/03/2010 4:30 PMM203Stefano Conti Health Protection AgencyDimensions of Design Space: A Decision-Theoretic Approach to Optimal Research Design
Bayesian decision theory can be used not only to establish the optimal sample
size and its allocation in a single clinical study, but also to identify an optimal
portfolio of research combining different types of study design. Within a single
study, the highest societal payoff to proposed research is achieved when its
sample sizes, and allocation between available treatment options, are chosen to
maximise the Expected Net Benefit of Sampling (ENBS). Where a number of
different types of study informing different parameters in the decision problem
could be conducted, the simultaneous estimation of ENBS across all dimensions
of the design space is required to identify the optimal sample sizes and allocations
within such a research portfolio. This is illustrated through a simple
example of a decision model of zanamivir for the treatment of influenza. The
possible study designs include:
i) a single trial of all the parameters;
ii) a clinical trial providing evidence only on clinical endpoints;
iii) an epidemiological study of natural history of disease and
iv) a survey of quality of life.
The possible combinations, sample sizes and allocation between trial arms are
evaluated over a range of cost-effectiveness thresholds. The computational challenges
are addressed by implementing optimisation algorithms to search the
ENBS surface more efficiently over such large dimensions. 
18/03/2011 4:30 PMM203Eleni Bakra MRC Biostatistics Unit, CambridgeTempered simplex sampler
Usual Markov chain Monte Carlo (MCMC) methods use a single Markov chain to sample
from the distribution of interest. If the target distribution is described by isolated
modes then it may be difficult for these methods to jump between the modes and for this
reason, the mixing is slow. Usually different starting positions are used to find
isolated modes, but this is not always feasible, especially when the modes are difficult
to find or there are many of them. In this talk, I avoid these problems by
introducing a new population MCMC sampler, the tempered simplex sampler. The tempered
simplex sampler uses a tempering ladder to promote mixing while a population of Markov
chains is regarded under each temperature. The sampler proceeds by first updating the
Markov chains under each temperature using ideas from the Nelder-Mead simplex method
and then, by exchanging different populations of Markov chains under different
temperatures. The performance of the tempered simplex sampler is outlined on several
examples. 
18/02/2010 4:30 PMM203Heiko Grossmann School of Mathematical SciencesQueen Mary UniversityAnalysis of an experiment on bumblebee personality
This talk follows up on a presentation given by Helene Muller from QM's School
of Biological and Chemical Sciences at a statistics study group meeting in
January 2009.
The problem is to devise an appropriate analysis for investigating if bumblebees
behave in a consistent way. The dataset consists of N=729 observations which
represent repeated measurements on 81 bees under various experimental conditions.
A modelling strategy for these data is presented, which leads to fitting a nested
linear mixed model to the Box-Cox transformed responses. The results from the
corresponding analysis appear to be very satisfying and allow a classification
of bees into consistent and inconsistent ones. This is joint work with Helene
Muller and Lars Chittka. 
21/01/2010 4:30 PMM203Mitra Noosha Queen MaryQueen Mary graduate students seminarDiscordance between prior and data using conjugate priors
In Bayesian Inference the choice of prior is very important to
indicate our beliefs and knowledge. However, if these initial beliefs
are not well elicited, then the data may not conform to our
expectations. The degree of discordancy between the observed data and
the proper prior is of interest. Pettit and Young (1996) suggested a
Bayes Factor to find the degree of discordancy. I have extended their
work to further examples.
I try to find explanations for Bayes Factor behaviour. As an
alternative I have looked at a mixture prior consisting of the
elicited prior and another with the same mean but a larger variance.
The posterior weight on the more diffuse prior can be used as a
measure of the prior and data discordancy and also gives an automatic
robust prior. I discuss various examples and show this new measure is
well correlated with the Bayes factor approach. 
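The mixture-prior measure can be illustrated in the simplest conjugate setting. The sketch below assumes a normal mean with known variance and a two-component normal mixture prior; the function names, the default inflation factor and the equal prior weights are illustrative assumptions, not details from the talk.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def diffuse_posterior_weight(xbar, n, sigma2, prior_mean, prior_var,
                             inflation=9.0, eps=0.5):
    """Posterior weight on the diffuse component of a mixture prior.

    Data: sample mean xbar of n observations from N(theta, sigma2), sigma2 known.
    Prior: (1 - eps) * N(prior_mean, prior_var)
           + eps * N(prior_mean, inflation * prior_var).
    """
    se2 = sigma2 / n  # sampling variance of the sample mean
    # Marginal density of xbar under each prior component.
    m_elicited = normal_pdf(xbar, prior_mean, se2 + prior_var)
    m_diffuse = normal_pdf(xbar, prior_mean, se2 + inflation * prior_var)
    return eps * m_diffuse / (eps * m_diffuse + (1.0 - eps) * m_elicited)


# Hypothetical data: one sample mean that agrees with the prior, one that conflicts.
w_agree = diffuse_posterior_weight(0.0, 10, 1.0, 0.0, 1.0)
w_conflict = diffuse_posterior_weight(5.0, 10, 1.0, 0.0, 1.0)
```

Under these assumptions the weight stays below one half when the sample mean sits at the prior mean and approaches one as the data move away from it, which is the behaviour a discordancy measure needs.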
10/12/2009 4:30 PMM203SEMINAR CANCELLED

12/12/2009 4:30 PMM203A.I. Bejan Cambridge UniversityInference and Optimal Experimental Design for Random Graph Models
We consider inference and optimal design problems for finite clusters from bond percolation on the integer lattice Z^{d} or, equivalently, for SIR epidemics evolving on a bounded or unbounded subset of Z^{d} with constant life times. The bond percolation probability p is considered to be unknown, possibly depending, through the experimental design, on other parameters. We consider inference under each of the following two scenarios:
i) The observations consist of the set of sites which are ever infected, so that the routes by which infections travel are not observed (in terms of the bond percolation process, this corresponds to a knowledge of the connected component containing the initially infected site, the location of this site within the component not being relevant to inference for p).
ii) All that is observed is the size of the set of sites which are ever infected.
This is joint work with Professor Gavin Gibson and Dr Stan Zachary, both at Heriot-Watt University, Edinburgh.

26/11/2009 4:30 PMM203M.J. Costa University of Warwicktba

19/11/2009 4:30 PMM203S.G. Gilmour Queen MaryAnalysing Categorical Data from Multi-Stratum Designs

23/11/2011 12:00 PM203Jessica EnrightTBA

12/11/2009 4:30 PMM203W. Yeung Queen Mary Queen Mary graduate students seminartba

29/10/2009 4:30 AMM203H. Maruri-Aguilar Queen MaryDesigns for computer experiments
When modelling a computer experiment, the deviation between model and simulation data is due only to the bias (discrepancy) between the model for the computer experiment and the deterministic (albeit complicated) computer simulation. For this reason, replications in computer experiments add no extra information and the experimenter is more interested in efficiently exploring the design region.
I'll present a survey of designs useful for exploring the design region and for modelling computer simulations.

22/10/2010 5:30 PMM203K. Anaya-Izquierdo Open UniversitySensitivity analysis, cuts and geometry
Sensitivity analysis in statistical science studies how scientifically relevant changes in the way we formulate problems affect answers to our questions of interest. New advances in statistical geometry allow us to build a rigorous framework in which to investigate these problems and develop insightful computational tools, including new diagnostic measures and plots.
This talk will be about statistical model elaboration using sensitivity analysis aided with geometry. Throughout we assume there is a working parametric model. The key idea here is to explore discretisations of the data, at which point multinomial distributions become universal (all possible models are covered). The resulting structure is well-suited to discussing practically important statistical topics, such as exponential families and generalised linear models. The theory of cuts in exponential families allows clean inferential separation between interest and nuisance parameters and provides a basis for appropriate model elaboration. Examples are given where the resulting sensitivity analyses indicate the need for specific model elaboration or data re-examination.

08/10/2009 5:45 PMM203K.G. Russell University of Wollongong and S3RI Joint meeting with the Southampton Statistical Sciences Research InstituteD-Optimal designs for Poisson regression

08/10/2009 4:15 PMM203R.A. Bailey Queen Mary Joint meeting with the Southampton Statistical Sciences Research InstituteConflicts between optimality criteria for block designs with low replication
The quality of incomplete-block designs is commonly assessed by the A-, D- and E-optimality criteria. If there exists a balanced incomplete-block design for the given parameters, then it is optimal on all these criteria. It is therefore natural to use the proxy criteria of (almost) equal replication and (almost) equal concurrences when choosing a block design.
However, work over the last decade for block size 2 has shown that when the number of blocks is near the lower limit for estimability of all treatment contrasts then the D-criterion favours very different designs from the A- and E-criteria. In fact, the A- and E-optimal designs are far from equireplicate and are amongst the worst on the D-criterion.
I shall report on current work which extends these results to all block sizes. Thus the problem is not blocks of size 2; it is low replication.
For 2024, the talks are held on Wednesdays at 14:00-15:00 in room MB503 on floor 5 of the School of Mathematical Sciences Building, Queen Mary University of London.
The seminar is organised in a hybrid fashion. Attendance can be either in-person or via Zoom using that link.
The current seminar organisers are Arthur Guillaumin and Kostas Papafitsoros.