## Archive for Bayesian data analysis

## souvenirs de Luminy

Posted in Books, Kids, pictures, Statistics, Travel, University life with tags amazon associates, applied Bayesian analysis, Bayesian data analysis, case studies, CIRM, Jean Morlet Chair, Kerrie Mengersen, Lecture Notes in Mathematics, Luminy, Marseille, Pierre Pudlo, Société Mathématique de France, Springer-Verlag, Université Aix Marseille on July 6, 2020 by xi'an

## five postdoc positions in top UK universities & Bayesian health data science

Posted in Statistics with tags academic position, Bayesian data analysis, Bayesian statistics, Cambridge University, data science, EPSRC, health sciences, Lancaster University, postdoctoral position, research associate, University of Bristol, University of Oxford, University of Warwick, Warwick Data Science Institute on March 30, 2018 by xi'an

The EPSRC programme New Approaches to Bayesian Data Science: Tackling Challenges from the Health Sciences, directed by Paul Fearnhead, is offering five 3- or 4-year PDRA positions at the Universities of Bristol, Cambridge, Lancaster, Oxford, and Warwick. Here is the complete call:

Salary: £29,799 to £38,833

Closing Date: Thursday 26 April 2018

Interview Date: Friday 11 May 2018

We invite applications for Post-Doctoral Research Associates to join the New Approaches to Bayesian Data Science: Tackling Challenges from the Health Sciences programme. This is an exciting, cross-disciplinary research project that will develop new methods for Bayesian statistics that are fit for purpose to tackle contemporary Health Science challenges, such as real-time inference and prediction for large-scale epidemics, or synthesizing information from distinct data sources for large-scale studies such as the UK Biobank. Methodological challenges will centre on making Bayesian methods scalable to big data and robust to (unavoidable) model errors.

This £3M programme is funded by EPSRC, and brings together research groups from the Universities of Lancaster, Bristol, Cambridge, Oxford and Warwick. There is either a 4-year or a 3-year position available at each of these five partner institutions.

You should have, or be close to completing, a PhD in Statistics or a related discipline. You will be experienced in one or more of the following areas: Bayesian statistics, computational statistics, statistical machine learning, statistical genetics, inference for epidemics. You will have demonstrated the ability to develop new statistical methodology. We are particularly keen to encourage applicants with strong computational skills, and are looking to put together a team of researchers with skills that cover theoretical, methodological and applied statistics. A demonstrable ability to produce academic writing of the highest publishable quality is essential.

Applicants must apply through Lancaster University’s website for the Lancaster, Oxford, Bristol and Warwick posts. Please ensure you state clearly which position or positions you wish to be considered for when applying. For applications to the MRC Biostatistics Unit, University of Cambridge vacancy please go to their website.

Candidates who are considering making an application are strongly encouraged to contact Professor Paul Fearnhead (p.fearnhead@lancaster.ac.uk), Sylvia Richardson (sylvia.richardson@mrc-bsu.cam.ac.uk), Christophe Andrieu (c.andrieu@bristol.ac.uk), Chris Holmes (c.holmes@stats.ox.ac.uk) or Gareth Roberts (Gareth.O.Roberts@warwick.ac.uk) to discuss the programme in greater detail.

We welcome applications from people in all diversity groups.

## latest issue of Significance

Posted in Statistics with tags Bayesian data analysis, birthrate, Karl Popper, Royal Statistical Society, Significance on March 20, 2017 by xi'an

The latest issue of Significance is bursting with exciting articles and it is a shame I no longer receive it (not that I stopped subscribing to the RSS or the ASA, but it simply does not get delivered to my address!). For instance, a tribune by Tom Nichols (from whom I borrowed this issue for the weekend!) on his recent assessment of false positives in brain imaging [which I covered in a blog entry a few months ago], checking cluster inference and the returned p-values. And the British equivalent of the cover of Gelman et al.'s book, on the seasonality of births in England and Wales, albeit without any processing of the raw data and without mention of the Gelmanesque analysis: the only major gap in the frequency is around Christmas and New Year, while there is a big jump around September (also present in the New York data).

A neat graph on the visits to four feeders by five species of birds. A strange figure in Perils of Perception showing that French people believe 31% of the population is Muslim and that they lag behind many other countries in terms of statistical literacy. And a rather shallow call to Popper for running decision-making in business statistics.

## Statistical rethinking [book review]

Posted in Books, Kids, R, Statistics, University life with tags Amazon, Bayes theorem, Bayesian data analysis, Bayesian Essentials with R, book review, CHANCE, code, convergence diagnostics, E.T. Jaynes, generalised linear models, golem, maths, matrix algebra, MCMC algorithms, mixtures of distributions, Monte Carlo Statistical Methods, Prague, R, robots, STAN, statistical modelling, Statistical rethinking on April 6, 2016 by xi'an

Statistical Rethinking: A Bayesian Course with Examples in R and Stan is a new book by Richard McElreath that CRC Press sent me for review in CHANCE. While the book was already discussed on Andrew's blog three months ago, and [rightly so!] enthusiastically recommended by Rasmus Bååth on Amazon, here are the reasons why I am quite impressed by Statistical Rethinking!

“Make no mistake: you will wreck Prague eventually.” (p.10)

While the book has a lot in common with Bayesian Data Analysis, from being in the same CRC series to adopting a pragmatic and weakly informative approach to Bayesian analysis, to supporting the use of Stan, it also nicely develops its own ecosystem and idiosyncrasies, with a noticeable Jaynesian bent. To start with, I like the highly personal style, with clear attempts to make the concepts memorable for students by resorting to external concepts. The best example is the call to the myth of the golem in the first chapter, which McElreath uses as a warning about the use of statistical models (which are almost anagrams of golems!). Golems and models [and robots, another concept invented in Prague!] are man-made devices that strive to accomplish the goal set for them without heeding the consequences of their actions. This first chapter of Statistical Rethinking lays the ground for the rest of the book and gets quite philosophical (albeit in a readable way!) as a result. In particular, there is a most coherent call against hypothesis testing, which by itself justifies the title of the book. Continue reading

## Bayesian Data Analysis [BDA3 – part #2]

Posted in Books, Kids, R, Statistics, University life with tags Andrew Gelman, Bayesian data analysis, Bayesian model choice, Bayesian predictive, finite mixtures, graduate course, hierarchical Bayesian modelling, rats, STAN on March 31, 2014 by xi'an

Here is the second part of my review of Gelman et al.'s *Bayesian Data Analysis* (third edition):

“When an iterative simulation algorithm is “tuned” (…) the iterations will not in general converge to the target distribution.” (p.297)

Part III covers advanced computation, obviously including MCMC but also model approximations like variational Bayes and expectation propagation (EP), with even a few words on ABC. The novelties in this part are centred on Stan, the language Andrew is developing around Hamiltonian Monte Carlo techniques, a sort of BUGS for the 2010's! (And of course on Hamiltonian Monte Carlo techniques themselves.) A few (nit)picks: the book advises importance resampling without replacement (p.266), which makes some sense when using a poor importance function but ruins the fundamentals of importance sampling. Plus, no trace of infinite-variance importance sampling? Of harmonic means and their dangers? In the Metropolis-Hastings algorithm, the proposal is called the jumping rule and denoted by J_{t}, which, besides giving the impression of a Jacobian, seems to allow for time-varying proposals and hence time-inhomogeneous Markov chains, whose convergence properties are much hairier. (The warning comes much later, as exemplified in the above quote.) Moving from "burn-in" to "warm-up" to describe the beginning of an MCMC simulation. Being somewhat 90's about convergence diagnostics (as shown by the references in Section 11.7), although the book also proposes new diagnostics and relies much more on effective sample sizes. Particle filters are dispatched in barely half a page, maybe because Stan does not handle particle filters. A lack of intuition about Hamiltonian Monte Carlo algorithms, as the book plunges immediately into a two-page pseudo-code description, still using physics vocabulary that puts *me* (and maybe only *me*) off. Although I appreciated the advice to check analytical gradients against their numerical counterparts.
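To make the importance-resampling quibble concrete, here is a minimal sketch (mine, not the book's; the standard normal target and over-dispersed normal proposal are arbitrary choices) of sampling importance resampling done *with* replacement, together with the effective sample size diagnostic the book leans on:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: N(0, 1); importance proposal: the over-dispersed N(0, 3^2).
n = 10_000
x = rng.normal(0.0, 3.0, size=n)  # draws from the proposal q
# log p(x) - log q(x), up to an additive constant
log_w = -0.5 * x**2 - (-0.5 * (x / 3.0) ** 2 - np.log(3.0))
w = np.exp(log_w - log_w.max())
w /= w.sum()

# Effective sample size, 1 / sum of squared normalised weights
ess = 1.0 / np.sum(w**2)

# Sampling importance resampling: resample WITH replacement, so that
# high-weight points can (and should) appear several times.
idx = rng.choice(n, size=n, replace=True, p=w)
resampled = x[idx]
```

Resampling without replacement, as advised on p.266, caps each point at one copy and thus biases the resampled set away from the weighted target whenever a few weights dominate.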

“In principle there is no limit to the number of levels of variation that can be handled in this way. Bayesian methods provide ready guidance in handling the estimation of the unknown parameters.” (p.381)

I also enjoyed reading the part about modes that stand at the boundary of the parameter space (Section 13.2), even though I do not think modes are great summaries in Bayesian frameworks, and while I do not see how picking the prior to avoid modes at the boundary avoids the data impacting the prior, *in fine*. The variational Bayes section (13.7) is equally enjoyable, with a properly spelled-out illustration, an unusual feature in Bayesian textbooks. (Except that sampling without replacement is back!) The same goes for the expectation propagation (EP) section (13.8), which covers brand new notions. (Will they stand the test of time?!)
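For readers who have not met such a spelled-out illustration, here is a minimal sketch (not the book's example; model, priors, and simulated data are all of my own choosing) of mean-field variational Bayes for a normal sample with unknown mean and precision, iterating the classical coordinate-ascent updates:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(2.0, 1.0, size=500)  # data simulated from N(2, 1)
n, xbar = x.size, x.mean()

# Priors: mu | tau ~ N(mu0, (lam0 * tau)^-1), tau ~ Gamma(a0, b0)
mu0, lam0, a0, b0 = 0.0, 1.0, 1.0, 1.0

# Mean-field factorisation q(mu, tau) = q(mu) q(tau), updated in turn
E_tau = 1.0
for _ in range(50):
    # q(mu) = N(mu_n, 1 / lam_n)
    mu_n = (lam0 * mu0 + n * xbar) / (lam0 + n)
    lam_n = (lam0 + n) * E_tau
    E_mu, E_mu2 = mu_n, mu_n**2 + 1.0 / lam_n
    # q(tau) = Gamma(a_n, b_n), using the current moments of q(mu)
    a_n = a0 + (n + 1) / 2.0
    b_n = b0 + 0.5 * (np.sum(x**2) - 2.0 * E_mu * x.sum() + n * E_mu2
                      + lam0 * (E_mu2 - 2.0 * mu0 * E_mu + mu0**2))
    E_tau = a_n / b_n
```

Each factor is refreshed given the current moments of the other, and the pair converges in a handful of iterations, with E_tau settling near the sample precision.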

“Geometrically, if β-space is thought of as a room, the model implied by classical model selection claims that the true β has certain prior probabilities of being in the room, on the floor, on the walls, in the edge of the room, or in a corner.” (p.368)

Part IV is a series of five chapters about regression(s). This is somewhat of a classic; nonetheless, Chapter 14 surprised me with an elaborate election example that dabbles in advanced topics like causality and counterfactuals. I did not spot any reference to the *g*-prior or to its intuitive justifications, and the chapter mentions the lasso as a regularisation technique, but without any proper definition of this "popular non-Bayesian form of regularisation" (p.368). In French: with not a single equation! Additional novelty may lie in the numerical prior information about the correlations. What is rather crucially (cruelly?) missing, though, is a clearer processing of variable selection in regression models. I know Andrew opposes any notion of a coefficient being exactly equal to zero, as ridiculed through the above quote, but the book does not reject model selection, so why not in this context?! Chapter 15 on hierarchical extensions stresses the link with exchangeability, once again, with another neat election example justifying the progressive complexification of the model and the cranks and toggles of model building. (I am not certain the reparameterisation advice on p.394 is easily ingested by a newcomer.) The chapters on robustness (Chap. 17) and missing data (Chap. 18) sound slightly less convincing to me, especially the one on robustness, as I never got how to make robustness agree with my Bayesian perspective. The book states "we do not have to abandon Bayesian principles to handle outliers" (p.436), but I would object that the Bayesian paradigm compels us to define an alternative model for those outliers and the way they are produced. One can always resort to a painstaking exploration of which subsample of the dataset is at odds with the model, but this may be unrealistic for large datasets and tells us nothing further about how to handle those datapoints.
The missing data chapter is certainly relevant to such a comprehensive textbook and I liked the survey illustration where the missing data was in fact made of missing questions. However, I felt the multiple imputation part was not well presented, fearing that readers would not understand how to handle it…
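Since the definition is missing from the book, for the record the lasso estimate is the (non-Bayesian) penalised least-squares solution

```latex
\hat\beta^{\text{lasso}} \;=\; \arg\min_{\beta}\; \tfrac{1}{2}\,\lVert y - X\beta\rVert_2^2 \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \,,
```

which, as a bridge back to the Bayesian perspective, coincides with the MAP estimate under independent Laplace (double-exponential) priors on the coefficients.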

“You can use MCMC, normal approximation, variational Bayes, expectation propagation, Stan, or any other method. But your fit must be Bayesian.” (p.517)

Part V gathers the most advanced material: Chapter 19 is mostly an illustration of a few complex models, slightly superfluous in my opinion; Chapter 20 a very short introduction to functional bases, including a basis selection section (20.2) that implements the "zero coefficient" variable selection principle refuted in the regression chapter(s), and does not go beyond splines (what about wavelets?); Chapter 21 a (quick) coverage of Gaussian processes, with the motivating birth-date example (and two mixture datasets I used eons ago…); Chapter 22 a more (too much?) detailed study of finite mixture models, with no coverage of reversible-jump MCMC; and Chapter 23 an entry on Bayesian non-parametrics through Dirichlet processes.

“In practice, for well separated components, it is common to remain stuck in one labelling across all the samples that are collected. One could argue that the Gibbs sampler has failed in such a case.” (p.535)

To get back to mixtures, I liked the above quote about the label switching issue, as I was "one" who argued that the Gibbs sampler fails to converge! The corresponding section seems to favour providing a density estimate for mixture models, rather than component-wise evaluations, but it nonetheless mentions the relabelling-by-permutation approach (though it misses our 2000 JASA paper). The section about inferring the unknown number of components suggests running a regular Gibbs sampler on a model with an upper bound on the number of components and then checking for empty components, an idea I (briefly) considered in the mid-1990's before the advent of RJMCMC. Of course, the prior on the components matters, and the book suggests a Dirichlet prior on the weights whose concentration parameters have a fixed sum, such as 1, for all numbers of components.
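The empty-component idea can be sketched in a few lines (my own toy version, with unit component variances assumed known and all tuning choices arbitrary): fit a Gaussian mixture with an over-generous number of components under a sparse symmetric Dirichlet prior, run the Gibbs sampler, and count how many components remain occupied along the iterations:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two well-separated Gaussian clusters, deliberately over-fitted with K = 5.
x = np.concatenate([rng.normal(-3, 1, 150), rng.normal(3, 1, 150)])
n, K = x.size, 5

# Symmetric Dirichlet prior on the weights with total mass fixed at 1.
alpha = np.full(K, 1.0 / K)
# Conjugate N(0, 10^2) prior on the means; component variances fixed at 1.
m0, s0 = 0.0, 10.0

mu = rng.normal(0.0, 3.0, K)
w = np.full(K, 1.0 / K)
occupied = []

for it in range(500):
    # 1. allocations: sample each z_i from its discrete full conditional
    logp = np.log(w + 1e-300)[None, :] - 0.5 * (x[:, None] - mu[None, :]) ** 2
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    z = np.array([rng.choice(K, p=pi) for pi in p])
    counts = np.bincount(z, minlength=K)
    # 2. weights: conjugate Dirichlet update
    w = rng.dirichlet(alpha + counts)
    # 3. means: conjugate normal full conditionals (empty components
    #    are redrawn from the prior, so they wander off and tend to stay empty)
    for k in range(K):
        prec = counts[k] + 1.0 / s0**2
        mu[k] = rng.normal((x[z == k].sum() + m0 / s0**2) / prec,
                           1.0 / np.sqrt(prec))
    if it >= 100:  # discard warm-up
        occupied.append(int((counts > 0).sum()))

typical = int(np.median(occupied))  # components the data actually supports
```

With the total Dirichlet mass held at 1, surplus components tend to empty out, and the number of occupied components concentrates near the number actually supported by the data.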

“14. Objectivity and subjectivity: discuss the statement ‘People tend to believe results that support their preconceptions and disbelieve results that surprise them. Bayesian methods tend to encourage this undisciplined mode of thinking.’” (p.100)

Obviously, this being a third edition begets the question, *what's up, doc?!*, i.e., what's new [when compared with the second edition]? Quite a lot, even though I am not enough of a Gelmanian exegete to produce a comparison table. Well, for starters, David Dunson and Aki Vehtari joined the authorship, mostly contributing to the advanced sections on non-parametrics, Gaussian processes, and EP algorithms. Then the Hamiltonian Monte Carlo methodology and Stan, of course, which is now central to Andrew's interests. The book does include a short appendix on running computations in R and in Stan. Further novelties were mentioned above, like the vision of weakly informative priors taking over noninformative priors, but I think this edition of *Bayesian Data Analysis* puts more stress on clever and critical model construction, and on the fact that it can be done in a Bayesian manner. Hence the insistence on predictive and cross-validation tools. The book may be deemed somewhat short on exercises, providing between 3 and 20 mostly well-developed problems per chapter, often associated with datasets, rather than the less exciting counter-example above. Even though Andrew disagrees, and his students at ENSAE this year certainly did not complain, I personally feel a total of 220 exercises is not enough for instructors and self-study readers. (At least, this reduces the number of email requests for solutions! Especially when 50 of those are solved on the book website.) But this aspect is a minor quip: overall, this is truly the reference book for a graduate course on Bayesian statistics, and not only on Bayesian data analysis.