Richard Wilkinson



News:

  • Slides from my talk at the PANACEA meeting in Trondheim, on Emulating computer simulators with high dimensional input and output.
  • The slides from my NIPS 2013 tutorial on Approximate Bayesian Computation are now available here.


    Biographical Information

      I am currently a lecturer in statistics in the School of Mathematical Sciences at the University of Nottingham. Previously I was an RCUK-funded research associate on the Managing Uncertainty in Complex Models (MUCM) project based at the University of Sheffield, working under the supervision of Jeremy Oakley and Tony O'Hagan, and before that I was a PhD student of Simon Tavaré in the Department of Applied Mathematics and Theoretical Physics (DAMTP) at the University of Cambridge. My PhD thesis was on the Bayesian inference of primate divergence times. As an undergraduate, I studied maths at Downing College, Cambridge, for three years before taking Part III of the Mathematical Tripos. Between my undergraduate and PhD studies I spent a year teaching A-Level physics and maths at the Cambridge Centre for Sixth Form Studies (CCSS). I also spent six months as a door-to-door salesman with the Southwestern Company, a remarkable experience that I wouldn't recommend to anyone.

      Should you want one, you can find a copy of my CV here.


    Research Interests

      My main research interests lie in the field of Bayesian statistics and its application. In particular, I am interested in the statistical challenges posed by computer experiments. The main areas I am interested in are listed below. I would be happy to supervise PhD students in any of these areas; interested students should contact me for details of potential projects. You can also see a more specific list of potential projects here (although this list is only for guidance about the types of problems I'm interested in).

      1. Approximate Bayesian Computation (ABC)
      2. Computer experiments
      3. Statistical challenges in climate science
      4. Bayesian approaches to palaeontology
      5. Branching processes

      A list of publications can be found here.


    Approximate Bayesian Computation

      Over the past two decades, the increased availability of inexpensive, high-speed computers has reshaped many approaches to statistics. Markov chain Monte Carlo has made inference possible in previously intractable problems; however, it is not without its own difficulties. I am interested in how to approximate posterior distributions when explicit calculation of the likelihood is either impossible or computationally prohibitive, but where we are able to simulate observations from the model. A simple way of doing this, based on the rejection method, is called Approximate Bayesian Computation (ABC).

      If we let θ represent the model parameters, x the observed data, and S(x) some summary statistic of the data, then ABC can be summarised as follows:

      1. Choose θ from the prior
      2. Simulate data x' from the model with parameter θ, and calculate S'=S(x').
      3. Accept θ if the simulated statistic S' is close to the summary of the real data, S = S(x); i.e. accept if ρ(S, S') < ε, where ρ is a carefully chosen metric.

      The advantages of ABC are that it is extremely easy to code and that it removes some of the problems faced when using MCMC: there are no proposal distributions to tune to ensure good mixing, accepted observations are independent, and it is easy to write code that runs in parallel on a cluster. However, as its name suggests, ABC is only approximate, in the sense that we do not obtain samples from the true posterior distribution f(θ | x), but from the distribution f(θ | ρ(S, S') < ε). Some of the many questions still to be answered about ABC are: how close is this approximate distribution to the true posterior? What are the effects of using a summary statistic S that is not sufficient for θ? And how should we choose the metric ρ?
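
      As an illustration, here is a minimal sketch of the rejection algorithm above, assuming a toy model (normal observations with unknown mean θ), the sample mean as the summary statistic S, and absolute difference as the metric ρ; the prior, the tolerance ε, and all other specifics are illustrative choices, not taken from the papers below.

          import numpy as np

          rng = np.random.default_rng(1)

          x_obs = rng.normal(loc=2.0, scale=1.0, size=100)  # "observed" data x
          S_obs = x_obs.mean()                              # summary statistic S(x)
          epsilon = 0.1                                     # tolerance (illustrative)

          accepted = []
          while len(accepted) < 500:
              theta = rng.normal(0.0, 5.0)              # 1. draw theta from the prior
              x_sim = rng.normal(theta, 1.0, size=100)  # 2. simulate x' from the model
              S_sim = x_sim.mean()                      #    and compute S' = S(x')
              if abs(S_obs - S_sim) < epsilon:          # 3. accept if rho(S, S') < epsilon
                  accepted.append(theta)

          # `accepted` now holds draws from f(theta | rho(S, S') < epsilon),
          # an approximation to the true posterior f(theta | x).

      In this toy example the sample mean is sufficient for θ, so the only approximation comes from taking ε > 0; with a non-sufficient summary statistic a second layer of approximation is introduced.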

      1. R.D. Wilkinson. Approximate Bayesian computation (ABC) gives exact results under the assumption of model error. In submission.

        Answers the question: what distribution are we really sampling from when we do ABC?

      2. R.D. Wilkinson and S. Tavaré. Estimating the primate divergence time using conditioned birth-and-death processes. Theoretical Population Biology, 75 (2009), pp. 278-285. doi:10.1016/j.tpb.2009.02.003.

        A nice application of ABC to a complex stochastic model.

      3. R.D. Wilkinson. Bayesian Inference of Primate Divergence Times. PhD Thesis. Department of Applied Mathematics and Theoretical Physics, University of Cambridge, (2007).

        Chapters 3, 4, 5 and 6 all give details and extensions of the ABC method, and chapter 3 serves as an introduction to the subject.

      4. The slides and a video of an introductory talk on ABC are available here.



    Computer experiments

      Computer experiments are now commonplace in nearly all areas of science. From a statistical point of view, the main question is: how can we learn about a physical system from a simulation of it? For deterministic simulators, running the model multiple times with the same input gives the same output each time. Because there is no natural variation, we must introduce and account for uncertainty ourselves in order to make predictions with sensible error bounds. An excellent source of information on methods for dealing with computer experiments is the MUCM toolkit.

      I am interested in two main problems:

      1. Code uncertainty - if the simulator is expensive to run, we will have to conduct all inference using a small ensemble of model runs. One approach to dealing with this is to build an emulator of the simulator; in other words, we build a cheap statistical model of the expensive computer model (see the sketch at the end of this section). Some work in this area:
        1. L. Bastos and R.D. Wilkinson. Análise Estatística de Simuladores (Statistical Analysis of Computer Experiments). Simpósio Nacional de Probabilidade e Estatística 19o (SINAPE), (2010).

          This is a short book Leonardo Bastos and I wrote as an introduction to computer experiments. It is currently only available in Portuguese, but we hope to produce an English version soon. There are also lecture slides that accompany the notes:

          1. Introduction to computer experiments (in English)
          2. Gaussian process emulators (Portuguese)
          3. Design of experiments and multi-output emulators (Portuguese)
          4. Calibration (English)
          5. Validation and sensitivity analysis (Portuguese)
          6. Approximate Bayesian computation (English)
        2. R.D. Wilkinson. Bayesian calibration of expensive multivariate computer experiments. To appear in Large scale inverse problems and quantification of uncertainty, to be published September 2010, Wiley Series in Computational Statistics. Edited by L. T. Biegler, G. Biros, O. Ghattas, M. Heinkenschloss, D. Keyes, B. K. Mallick, L. Tenorio, B. Van Bloemen Waanders and K. Wilcox.

          We introduce the principal component emulator, which can be used to emulate models with high dimensional output. We also show how to calibrate the model using the PCA emulator.

        3. D.M. Ricciuto, R. Tonkonojenkov, N. Urban, R.D. Wilkinson, D. Matthews, K.J. Davis, and K. Keller. Assimilation of global carbon cycle observations into an Earth system model to estimate uncertain terrestrial carbon cycle parameters. In submission.

          We calibrate the UVic climate model using the PCA emulator described in the paper above.

      2. Model error - if we accept that a model is imperfect, then it is natural to ask whether we can learn in what way the model goes wrong, and whether we can correct for the error. Related to this is the question of whether we can learn the appropriate degree of uncertainty to add to our inferences in order to make sensible predictions. This work has close links to some of my ABC work.

      I have funding to employ a research associate for a six-month period to work on methodology for learning the model discrepancy (model error) in a simple rainfall-runoff model, building on methodology developed by Jeremy Oakley and me. If you are interested, please contact me. I hope to appoint someone to start between Sept 2010 and Jan 2011.
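
      As a concrete illustration of the emulation idea in point 1 above, here is a minimal sketch of a Gaussian process emulator, assuming a squared exponential covariance function with fixed (rather than estimated) hyperparameters and a hypothetical one-dimensional stand-in for the expensive simulator; all names and settings are illustrative only.

          import numpy as np

          def simulator(x):
              # Hypothetical cheap stand-in; in practice each run is expensive.
              return np.sin(3.0 * x) + x ** 2

          def cov(x1, x2, length_scale=0.3):
              # Squared exponential covariance with unit variance.
              d = x1[:, None] - x2[None, :]
              return np.exp(-0.5 * (d / length_scale) ** 2)

          # A small ensemble (the "design") of simulator runs.
          x_design = np.linspace(0.0, 1.0, 8)
          y_design = simulator(x_design)

          # Emulator (GP posterior) mean and variance at new inputs, conditioned
          # on the design runs; the jitter term keeps the solve numerically stable.
          x_new = np.linspace(0.0, 1.0, 100)
          K = cov(x_design, x_design) + 1e-8 * np.eye(len(x_design))
          K_star = cov(x_new, x_design)
          post_mean = K_star @ np.linalg.solve(K, y_design)
          post_var = 1.0 - np.einsum("ij,ji->i", K_star, np.linalg.solve(K, K_star.T))

      The posterior variance quantifies the code uncertainty: it collapses to (essentially) zero at the design points, where the deterministic simulator has been run, and grows between them. For simulators with high-dimensional output, the same construction can be applied after projecting the output ensemble onto its leading principal components, which is the idea behind the principal component emulator described above.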



    Statistical challenges in climate science

      Climate science is a source of important computer experiments. Climate models tend to be expensive and deterministic, and typically depend on many unknown parameters. The models are also imperfect, in the sense that physical processes are missing from them and approximating assumptions must be made in order to solve the equations. I am interested in work that incorporates statistical modelling into the analysis in order to produce predictions that account for the various uncertainties we know to exist (code uncertainty, parametric uncertainty, model error). It is important to produce probabilistic predictions, as deterministic predictions (which scientists understand to be approximate) are often interpreted by skeptics as undermining all of climate science if they do not occur exactly as promised. Some papers:

      1. P.B. Holden, N.R. Edwards, K.I.C. Oliver, T.M. Lenton and R.D. Wilkinson. A probabilistic calibration of climate sensitivity in GENIE-1. To appear, Climate Dynamics.
      2. D.M. Ricciuto, R. Tonkonojenkov, N. Urban, R.D. Wilkinson, D. Matthews, K.J. Davis, and K. Keller. Assimilation of global carbon cycle observations into an Earth system model to estimate uncertain terrestrial carbon cycle parameters. In submission.



    Bayesian approaches to palaeontology

      I am interested in stochastic modelling in palaeontological applications. These problems are often characterised by having only a limited amount of noisy data, which we must exploit as best we can.

      [Figure: primate genetic and fossil posteriors]

      Some papers:

      1. R.D. Wilkinson, M. Steiper, C. Soligo, R.D. Martin, Z. Yang, and S. Tavaré. Dating primate divergences through an integrated analysis of palaeontological and molecular data. In press, Systematic Biology.

        We combine the posteriors found in chapter 4 of my thesis with genetic data from extant primates to find estimates of the primate divergence time that incorporate both genetic and fossil data. The main findings are illustrated in the figure above.

      2. R.D. Wilkinson and S. Tavaré. Estimating the primate divergence time using conditioned birth-and-death processes. Theoretical Population Biology, 75 (2009), pp. 278-285. doi:10.1016/j.tpb.2009.02.003.

        A paper based on chapters 7 and 8 of my PhD thesis.

      3. R.D. Wilkinson. Bayesian Inference of Primate Divergence Times. PhD Thesis. Department of Applied Mathematics and Theoretical Physics, University of Cambridge, (2007).

        My thesis contains most of the results in the above two papers, plus other cases that haven't been published. Chapters of particular interest might be chapters 4 and 6, where the basic primate model is extended in various directions.



    Branching processes

      Coming soon.



    Last updated July 2010