The slides from my NIPS 2013 tutorial on Approximate Bayesian Computation are now available here.
Biographical Information
I am currently a
lecturer of statistics in the School of Mathematical Sciences at the University of
Nottingham. Previously I was a RCUK funded research associate on the
Managing Uncertainty in Complex Models project (MUCM) based at Sheffield University, working under the supervision of
Jeremy Oakley and Tony O'Hagan, and before that I was a PhD student of Simon Tavaré in the Department of Applied Maths and Theoretical Physics (DAMTP) at the
University of Cambridge. My PhD thesis was on the Bayesian inference of primate divergence times. As an undergraduate,
I studied maths at Downing College, Cambridge, for three years before taking the part III maths tripos.
Between my undergraduate and PhD studies I spent a year as an
A-Level teacher at the Cambridge Centre for Sixth Form Studies (CCSS), teaching A-Level physics and maths.
I also spent six months as a door-to-door salesman with the Southwestern Company, a remarkable experience that I wouldn't recommend to anyone.
Should you want one, you can find a copy of my CV here.
Research Interests
My main research interests lie in the field of Bayesian statistics and its application. In particular, I am interested in the statistical challenges posed by computer experiments.
The main areas I am interested in are listed below. I would be happy to supervise PhD students in any of these areas; interested students should contact me for details of potential projects. You can also see a more specific list of potential projects here (although this list is only for guidance about the types of problems I'm interested in).
 Approximate Bayesian Computation (ABC)
 Computer experiments
 Statistical challenges in climate science
 Bayesian approaches to palaeontology
 Branching Processes
A list of publications can be found here.
Approximate Bayesian Computation
Over the past two decades, the increasing availability of inexpensive,
high-speed computers has reshaped many
approaches to statistics. Markov chain Monte Carlo has made inference
possible in previously intractable problems, but it is not without its
difficulties.
I am interested in studying how to approximate posterior distributions
when explicit calculation of the likelihood is either impossible or
computationally prohibitive, but where we are able to simulate
observations from the model.
A simple way of doing this is called Approximate Bayesian Computation,
or ABC, and is based
on the rejection method.
If we let θ represent
the model parameters, x the observed data, and S(x) some summary statistic
of the data, then
ABC can be summarized as follows:
 Choose θ from the prior
 Simulate data x' from the model with parameter θ, and calculate
S'=S(x').
 Accept θ if the simulated statistic S' is close to the summary of the
real data, S = S(x);
i.e. accept if ρ(S, S') < ε, where ρ is a carefully chosen
metric.
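The three steps above can be sketched in a few lines of code. This is a minimal illustration under my own assumptions: the toy model draws 100 observations from a Normal(θ, 1), the summary S is the sample mean, and the metric ρ is absolute difference; none of these choices come from the text.

```python
# Rejection-ABC sketch. All modelling choices here (Normal model, sample
# mean summary, absolute-difference metric) are illustrative assumptions.
import random

def abc_rejection(s_obs, prior_sample, simulate, summary, rho, eps, n_accept):
    """Return n_accept parameter draws whose simulated summaries
    fall within eps of the observed summary s_obs."""
    accepted = []
    while len(accepted) < n_accept:
        theta = prior_sample()                 # 1. choose theta from the prior
        s_sim = summary(simulate(theta))       # 2. simulate data, summarise it
        if rho(s_obs, s_sim) < eps:            # 3. accept if close enough
            accepted.append(theta)
    return accepted

# Toy usage: infer the mean of a Normal(theta, 1) from its sample mean.
random.seed(1)
data = [random.gauss(2.0, 1.0) for _ in range(100)]
mean = lambda xs: sum(xs) / len(xs)
post = abc_rejection(
    s_obs=mean(data),
    prior_sample=lambda: random.uniform(-10, 10),     # flat prior on theta
    simulate=lambda th: [random.gauss(th, 1.0) for _ in range(100)],
    summary=mean,
    rho=lambda a, b: abs(a - b),
    eps=0.1,
    n_accept=200,
)
# The accepted thetas approximate the posterior; their mean should sit near 2.
```

Note how the accepted draws are independent of one another, which is one of the advantages over MCMC discussed below.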
The advantages of ABC are that it is extremely easy to code and that it avoids
some of the problems faced when using MCMC: for example, there are no tuning
parameters needed to ensure good mixing, accepted observations are independent, and we can easily write
code that runs in parallel on a cluster.
However, as suggested by its name, ABC is only approximate, in the sense
that we do not compute
samples
from the true posterior distribution f(θ | x), but from the
distribution f(θ | ρ(S, S') < ε).
Some of the many questions still to be answered about ABC are: how
close is this approximate distribution to the true posterior? What are the
effects of using a summary statistic S that is not sufficient for θ?
And how should we
choose the metric ρ?
 R.D. Wilkinson.
Approximate Bayesian computation (ABC) gives exact results under the assumption of model error. In submission.
Answers the question: what distribution are we really sampling from when we do ABC?
 R.D. Wilkinson and S. Tavaré.
Estimating the primate divergence time using conditioned birth-and-death processes. Theoretical Population Biology, 75 (2009), pp. 278-285. doi:10.1016/j.tpb.2009.02.003.
A nice application of ABC to a complex stochastic model.
 R.D. Wilkinson.
Bayesian Inference of Primate Divergence Times. PhD Thesis. Department of Applied Mathematics and Theoretical Physics, University of Cambridge, (2007).
Chapters 3, 4, 5 and 6 all give details and extensions of the ABC method, and chapter 3 serves as an introduction to the subject.
 The slides and a video of an introductory talk on ABC are available here.
Computer experiments
Computer experiments are now commonplace in nearly all areas of science. From a statistical point of view, the main question is: how can we learn about a physical system from a simulation of it?
For deterministic simulators, running the model multiple times with the same input gives the same output each time. Because there is no natural variation, we must introduce and account for uncertainty ourselves in order to make predictions with sensible error bounds. An excellent source of information on methods for dealing with computer experiments is the MUCM toolkit.
I am interested in two main problems:
 Dealing with code uncertainty: if the simulator is expensive to run, we will have to conduct all inference using a small ensemble of model runs. One approach to dealing with this is to build an emulator of the simulator; in other words, we build a cheap statistical model of the expensive computer model.
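To make the emulator idea concrete, here is a minimal sketch of a Gaussian process emulator. This is my own illustrative code, not taken from the MUCM toolkit: the squared-exponential kernel, its lengthscale, and the toy "simulator" (the sine function) are all assumptions made for the example.

```python
# Minimal Gaussian-process emulator sketch: fit a zero-mean GP with a
# squared-exponential kernel to a handful of runs of an "expensive"
# simulator, then predict its output cheaply at a new input.
import numpy as np

def sq_exp_kernel(a, b, length=0.5, var=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return var * np.exp(-0.5 * d2 / length**2)

def gp_predict(x_train, y_train, x_new, noise=1e-8):
    """Posterior mean and variance of a zero-mean GP at the points x_new."""
    K = sq_exp_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = sq_exp_kernel(x_new, x_train)
    alpha = np.linalg.solve(K, y_train)
    mean = Ks @ alpha
    var = sq_exp_kernel(x_new, x_new).diagonal() - np.einsum(
        "ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, var

# Treat sin as the expensive simulator: six runs build the emulator.
x_run = np.linspace(0, np.pi, 6)
y_run = np.sin(x_run)
m, v = gp_predict(x_run, y_run, np.array([np.pi / 2]))
# m[0] should be close to sin(pi/2) = 1, with a small predictive variance v[0].
```

The predictive variance is the key extra ingredient: it quantifies the code uncertainty introduced by only having a small ensemble of runs.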
Some work in this area:
 L. Bastos and R.D. Wilkinson.
Análise Estatística de Simuladores (Statistical Analysis of Computer Experiments).
Simpósio Nacional de Probabilidade e Estatística 19o (SINAPE), (2010).
This is a short book Leonardo Bastos and I wrote as an introduction to computer experiments. It is currently only available in Portuguese, but we hope to produce an English version soon. There are also lecture slides that accompany the notes:
 Introduction to computer experiments (in English)
 Gaussian process emulators (Portuguese)
 Design of experiments and multi-output emulators (Portuguese)
 Calibration (English)
 Validation and sensitivity analysis (Portuguese)
 Approximate Bayesian computation (English)
 R.D. Wilkinson.
Bayesian calibration of expensive multivariate computer experiments. To appear in Large scale inverse problems and quantification of uncertainty, to be published September 2010, Wiley Series in Computational Statistics. Edited by L. T. Biegler, G. Biros, O. Ghattas, M. Heinkenschloss, D. Keyes, B. K. Mallick, L. Tenorio, B. Van Bloemen Waanders and K. Wilcox.
We introduce the principal component emulator, which can be used to emulate models with high dimensional output. We also show how to calibrate the model using the PCA emulator.
 D.M. Ricciuto, R. Tonkonojenkov, N. Urban, R.D. Wilkinson, D. Matthews, K.J. Davis, and K. Keller.
Assimilation of global carbon cycle observations into an Earth system model to estimate uncertain terrestrial carbon cycle parameters. In submission.
We calibrate the UVic climate model using the PCA emulator described in the paper above.
 Model error: if we accept that a model is imperfect, then it is natural to ask whether we can learn in what way the model goes wrong, and whether we can correct for the error. Related to this is the question of whether we can learn the appropriate degree of uncertainty to add to our inferences in order to make sensible predictions.
This work has close links to some of my ABC work.
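The principal component emulator mentioned in the publications above deals with high-dimensional simulator output by emulating a few principal-component scores rather than every output coordinate. The following toy sketch is my own illustration of that idea, not the paper's code: the simulator, the number of retained components, and the use of a polynomial fit in place of a Gaussian process are all assumptions made to keep the example short.

```python
# Sketch of the principal-component emulation idea: reduce the
# high-dimensional output of an ensemble of runs with an SVD, emulate
# each retained score as a function of the input, and reconstruct
# predictions in the original output space.
import numpy as np

grid = np.linspace(0, 1, 50)

def simulator(theta):
    # Toy "expensive" simulator: 50-dimensional output whose variation
    # happens to live in a two-dimensional subspace.
    return theta * np.sin(2 * np.pi * grid) + theta**2 * np.cos(np.pi * grid)

thetas = np.linspace(0.5, 1.5, 20)               # small ensemble of runs
Y = np.stack([simulator(t) for t in thetas])     # 20 x 50 output matrix

# Principal components of the centred ensemble via the SVD.
mu = Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Y - mu, full_matrices=False)
k = 2                                            # components to retain
scores = (Y - mu) @ Vt[:k].T                     # 20 x k PC scores

# Emulate each score as a cheap function of theta (a Gaussian process
# emulator would normally be used; a polynomial fit keeps this short).
fits = [np.polyfit(thetas, scores[:, j], deg=3) for j in range(k)]

def emulate(theta):
    c = np.array([np.polyval(f, theta) for f in fits])
    return mu + c @ Vt[:k]

# For this low-rank toy the emulator reproduces held-out runs almost exactly.
err = np.abs(emulate(1.05) - simulator(1.05)).max()
```

Only k score emulators are needed however long the output vector is, which is what makes the approach attractive for calibrating models with high-dimensional output.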
I have funding to employ a research associate for a short six-month period to investigate methodology for learning the model discrepancy (model error) in a simple rainfall-runoff model, building on the methodology developed by Jeremy Oakley and me. If you are interested, please contact me. I hope to appoint someone to start between September 2010 and January 2011.
Statistical challenges in climate science
Climate science is a source of important computer experiments. Climate models tend to be expensive and deterministic, typically depending on many unknown parameters. The models are also imperfect, in the sense that there are physical processes missing from the models and approximating assumptions need to be made in order to solve the equations. I am interested in work which incorporates statistical modelling into the analysis in order to produce predictions that account for the various uncertainties we know to exist (code uncertainty, parametric uncertainty, model error). It is important to produce probabilistic predictions, as deterministic predictions (which scientists understand to be approximate) are often interpreted (by skeptics) as undermining all of climate science if they do not occur exactly as promised. Some papers:
 P.B. Holden, N.R. Edwards, K.I.C. Oliver, T.M. Lenton and R.D. Wilkinson.
A probabilistic calibration of climate sensitivity in GENIE1. To appear, Climate Dynamics.
 D.M. Ricciuto, R. Tonkonojenkov, N. Urban, R.D. Wilkinson, D. Matthews, K.J. Davis, and K. Keller.
Assimilation of global carbon cycle observations into an Earth system model to estimate uncertain terrestrial carbon cycle parameters. In submission.
Bayesian approaches to palaeontology
I am interested in stochastic modelling in palaeontological applications. These problems are often characterised by only having a limited amount of noisy data that we must exploit as best we can.
Some papers:
 R.D. Wilkinson, M. Steiper, C. Soligo, R.D. Martin, Z. Yang, and S. Tavaré.
Dating primate divergences through an integrated analysis of palaeontological and molecular data. In press, Systematic Biology.
We combine the posteriors found in chapter 4 of my thesis with genetic data from extant primates to find estimates of the primate divergence time that incorporate both genetic and fossil data. The main findings are illustrated in the figure above.
 R.D. Wilkinson and S. Tavaré.
Estimating the primate divergence time using conditioned birth-and-death processes. Theoretical Population Biology, 75 (2009), pp. 278-285. doi:10.1016/j.tpb.2009.02.003.
A paper based on chapters 7 and 8 of my PhD thesis.
 R.D. Wilkinson.
Bayesian Inference of Primate Divergence Times. PhD Thesis. Department of Applied Mathematics and Theoretical Physics, University of Cambridge, (2007).
My thesis contains most of the results in the above two papers, plus other cases that haven't been published. Chapters of particular interest might be chapters 4 and 6, where the basic primate model is extended in various directions.
Branching processes
Details coming soon.
Last updated July 2010