Archive for the Mountains Category

Bayes Rules! [book review]

Posted in Books, Kids, Mountains, pictures, R, Running, Statistics, University life on July 5, 2022 by xi'an

Bayes Rules! is a new introductory textbook on Applied Bayesian Model(l)ing, written by Alicia Johnson (Macalester College), Miles Ott (Johnson & Johnson), and Mine Dogucu (University of California Irvine). The textbook was sent to me by CRC Press for review. It is available (free) online as a website and has a GitHub site, as well as a bayesrules R package. (Which reminds me that both our own book R packages, bayess and mcsm, have gone obsolete on CRAN! And that I should find time to figure out the issue for an upgrade…)

As far as I can tell [from abroad, and from only teaching students with a math background], Bayes Rules! seems to be catering to early (US) undergraduate students with very little exposure to mathematical statistics or probability, as it introduces basic probability notions like pmfs, joint distributions, and Bayes’ theorem (as well as Greek letters!) and shies away from integration or algebra (a covariance matrix only occurs on page 437). For instance, the Normal-Normal conjugacy derivation is considered a “mouthful” (page 113). The exposition is somewhat stretched along the 500⁺ pages as a result, imho, which is presumably a feature shared with most textbooks at this level, and, accordingly, the exercises and quizzes are more about intuition and reproducing the contents of the chapter than about technique. In fact, I did not spot a mention of sufficiency, consistency, posterior concentration (almost made on page 113), improper priors, ergodicity, irreducibility, &tc., while other notions are not precisely defined, like ESS, weakly informative (page 234) or vague priors (page 77), prior information [which makes the negative answer to the quiz “All priors are informative” (page 90) rather confusing], R-hat, density plot, scaled likelihood, and more.

As an alternative to “technical derivations”, Bayes Rules! centres on intuition and simulation (yay!) via its bayesrules R package, itself relying on rstan. Learning from example (as R code is always provided), the book proceeds through conjugate priors, MCMC (Metropolis-Hastings) methods, regression models, and hierarchical regression models. Quite impressive given the limited prerequisites set by the authors. (I appreciated the representations of the prior-likelihood-posterior triplet, especially in the sequential case.)
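In that simulation-first spirit, here is a minimal random-walk Metropolis-Hastings sketch in base R, targeting a Beta(3, 7) density. This is my own toy illustration, not code from the book or its package:

```r
# Random-walk Metropolis-Hastings for a Beta(3, 7) target.
set.seed(84735)
n_iter <- 5000
theta <- numeric(n_iter)
theta[1] <- 0.5
for (i in 2:n_iter) {
  prop <- theta[i - 1] + runif(1, -0.1, 0.1)   # symmetric uniform proposal
  # dbeta(..., log = TRUE) returns -Inf outside (0, 1), so such proposals are rejected:
  log_ratio <- dbeta(prop, 3, 7, log = TRUE) - dbeta(theta[i - 1], 3, 7, log = TRUE)
  theta[i] <- if (log(runif(1)) < log_ratio) prop else theta[i - 1]
}
mean(theta)   # should approach the target mean 3/10
```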

Regarding the “hot tip” (page 108) that the posterior mean always stands between the prior mean and the data mean, this should be made conditional on a conjugate setting and a mean parameterisation. Defining MCMC as a method that produces a sequence of realisations that are not from the target makes a point, except of course that there are settings where the realisations are from the target, for instance after a renewal event. Tuning MCMC should remain a partial mystery to readers after reading Chapter 7, as the Goldilocks principle is quite vague. Similarly, deriving the hyperparameters in a novel setting (not covered by the book) should prove a challenge, even though the readers are encouraged to “go forth and do some Bayes things” (page 509).
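To make the conjugate caveat concrete, here is a minimal R check of the Beta-Binomial case (my own illustration, with arbitrary hyperparameters), where the posterior mean is provably a convex combination of the prior mean and the data mean:

```r
# Beta(a, b) prior, y successes out of n Bernoulli trials.
a <- 2; b <- 8        # illustrative prior hyperparameters
n <- 20; y <- 14      # illustrative data
prior_mean <- a / (a + b)
data_mean  <- y / n
post_mean  <- (a + y) / (a + b + n)   # posterior is Beta(a + y, b + n - y)
w <- n / (a + b + n)                  # weight given to the data mean
stopifnot(all.equal(post_mean, (1 - w) * prior_mean + w * data_mean))
```

Outside such conjugate, mean-parameterised settings, the sandwiching property may fail.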

While Bayes factors are supported for some hypothesis testing (with no point null), model comparison follows more exploratory methods like cross-validation and expected log-predictive comparison.
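In practice, such expected log-predictive comparisons can be run along the following lines; this is a generic sketch with a stock dataset, not an example from the book, assuming the rstanarm and loo packages are installed:

```r
library(rstanarm)   # Bayesian regressions via Stan
library(loo)        # PSIS-LOO estimates of the expected log predictive density

fit1 <- stan_glm(mpg ~ wt,      data = mtcars, refresh = 0, seed = 1)
fit2 <- stan_glm(mpg ~ wt + hp, data = mtcars, refresh = 0, seed = 1)
# Higher elpd_loo means better estimated out-of-sample predictive fit:
loo_compare(loo(fit1), loo(fit2))
```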

The examples and exercises are diverse (if mostly US-centric), modern (including cultural references that completely escape me), and often reflect the authors’ societal concerns. In particular, their concern for a fair use of the inferred models is prominent, even though a quantitative assessment of the degree of fairness would require a much more advanced perspective than the book allows… (In that respect, Exercise 18.2 and the following ones are about book banning in the US. Given the progressive tone of the book, and the recent ban of math textbooks in the US, I wonder if some conservative boards would consider banning it!) Concerning the Himalaya summiting running example (Chapters 18 & 19), where the probability of summiting is conditional on the age of the climber and the use of additional oxygen, I am somewhat surprised that the altitude of the targeted peak is not included as a covariate. For instance, Ama Dablam (6848 m) is compared with Annapurna I (8091 m), which has the highest fatality-to-summit ratio (38%) of all peaks. This should matter more than age: the Aosta guide Abele Blanc climbed Annapurna without oxygen at age 57! More to the point, the (practical) detailed examples do not bring unexpected conclusions, as for instance the fact that runners [thrice alas!] tend to slow down with age.
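Adding such a covariate would be a one-line change in the book’s framework. Here is a hypothetical sketch: the data frame, column names, and effect sizes are all my inventions, not the book’s actual code or data:

```r
library(rstanarm)

# Simulated stand-in for climber-expedition data (purely illustrative):
set.seed(84735)
n <- 400
climbers <- data.frame(
  age         = round(rnorm(n, 45, 10)),
  oxygen_used = runif(n) < 0.5,
  peak_height = sample(c(6848, 7126, 8091, 8848), n, replace = TRUE)
)
climbers$peak_name <- factor(climbers$peak_height)   # one peak per height here
p <- plogis(8 - 0.03 * climbers$age + 1.5 * climbers$oxygen_used -
            0.001 * climbers$peak_height)
climbers$success <- rbinom(n, 1, p)

# Hierarchical logistic regression in the spirit of Chapters 18 & 19, plus peak height:
fit <- stan_glmer(
  success ~ age + oxygen_used + peak_height + (1 | peak_name),
  data = climbers, family = binomial, chains = 4, iter = 2000, seed = 84735
)
```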

A geographical comment: Uluru (page 267) is not a city, but an impressive sandstone monolith at the heart of Australia, a five-hour drive away from Alice Springs. And historical mentions: Alan Turing (page 10) and the team at Bletchley Park indeed used Bayes factors (and sequential analysis) in cracking the Enigma, but this remained classified information for quite a while. Arianna Rosenbluth (page 10, but missing on page 165) was indeed a major contributor to Metropolis et al. (1953, not cited), but would not qualify as a Bayesian statistician, as the goal of their algorithm was a characterisation of the Boltzmann (or Gibbs) distribution, not statistical inference. And David Blackwell’s (page 10) Basic Statistics is possibly the earliest instance of an introductory Bayesian and decision-theory textbook, but it never mentions Bayes or Bayesianism.

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Book Review section in CHANCE.]

day five at ISBA 22

Posted in Mountains, pictures, Running, Statistics, Travel, University life on July 4, 2022 by xi'an

Woke up even earlier today! Which left me time to work on switching to Leonard Cohen song titles for my slide frametitles this afternoon (last talk of the whole conference!), and to run once again to Mon(t) Royal as all pools are closed (Happy Canada Day! except to “freedom convoy” antivaxxers). Which led to my meeting a raccoon by the side of the path (and morons feeding wildlife).

Had an exciting time at the morning session, where Giacomo Zanella (formerly Warwick) talked about a mixture approach to leave-one-out predictives, with a pseudo-harmonic-mean representation averaging inverse densities across all observations. Better than harmonic? Some assumptions allow for finite variance, although I am missing the deep argument (in part due to Giacomo’s machine-gun delivery pace!). Then Alicia Corbella (Warwick) presented a promising entry into PDMPs by proposing an automated zig-zag sampler, pointing on the side to Joris Bierkens’ webpage on the state-of-the-art PDMP methodology. In this approach, joint with my other Warwick colleagues Simon Spencer and Gareth Roberts, the zig-zag sampler relies on automatic differentiation, sub-sampling, and bound derivation, with “no further information on the target needed”. And finally Chris Carmona presented a joint work with Geoff Nicholls on merging cut posteriors and variational inference to create a meta posterior. Work and talk were motivated by a nice medieval linguistic problem where the latent variables impact the (convergence of the) MCMC algorithm [as in our k-nearest-neighbour experience], interestingly using normalising [neural spline] flows. The pseudo-posterior seems to depend very much on the modularization rate η, which penalises how much one module influences the next.
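For context, the harmonic-mean identity behind such leave-one-out predictive estimates, 1/p(y_i|y_{-i}) = E[1/p(y_i|θ)] under the full posterior, can be sketched on a toy Gaussian model; this is my own illustration of the vanilla estimator, not Giacomo’s mixture version:

```r
# Toy model: y_i ~ N(theta, 1), flat prior, so theta | y ~ N(mean(y), 1/n).
set.seed(1)
n <- 50
y <- rnorm(n, mean = 2)
theta <- rnorm(1e4, mean = mean(y), sd = 1 / sqrt(n))   # exact posterior draws
# Harmonic-mean estimate of each leave-one-out predictive density
# (notoriously exposed to infinite variance, hence the talk's fix):
loo_pred <- sapply(y, function(yi) 1 / mean(1 / dnorm(yi, mean = theta, sd = 1)))
head(loo_pred)
```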

In the aft, I attended, sort of by chance [due to a missing speaker in the copula session], the end of a session on migration modelling, with a talk by Jason Hilton and Martin Hinsch focussing on the 2015 mass exodus of Syrians through the Mediterranean, away from the joint evils of al-Assad and ISIS. As this was a tragedy whose modelling I had vainly tried to contribute to, I was obviously captivated and frustrated (learning of the IOM missing migrants project!). Fitting the agent-based model actually used ABC, and most particularly our ABC-PMC!!!

My own and final session had Gareth (Warwick) presenting his recent work with Jun Yang and Krzys Łatuszyński (Warwick) on the stereographic projection improvement over regular MCMC, which involves turning the target into a distribution supported by a hypersphere, hence considering a distribution with compact support and higher efficiency. Krzys had explained the principle to me while driving back from Gregynog two months ago. The idea is somewhat similar to our origaMCMC, which I presented at MCqMC 2016 in Stanford (and never completed), except that our projection was inside a ball. Looking forward to the adaptive version, in the making!
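For the curious, the inverse stereographic projection at the core of this construction sends x in R^d to a point on the unit sphere S^d (minus its north pole). A minimal sketch of the map itself, not of the authors’ sampler:

```r
# Map x in R^d to the unit sphere S^d and back.
to_sphere <- function(x) {
  s <- sum(x^2)
  c(2 * x, s - 1) / (s + 1)        # lands on the unit sphere in R^(d+1)
}
from_sphere <- function(z) {
  d <- length(z) - 1
  z[1:d] / (1 - z[d + 1])          # undefined at the north pole z = (0, ..., 0, 1)
}
x <- rnorm(3)
stopifnot(all.equal(sum(to_sphere(x)^2), 1),        # point is on the sphere
          all.equal(x, from_sphere(to_sphere(x))))  # projection is invertible
```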

And to conclude this subjective journal from the ISBA conference by borrowing a title from (Westmount-born) Leonard Cohen: “Hey, that’s no way to say goodbye”… To paraphrase Bilbo Baggins, I have not interacted with at least half the participants half as much as I would have liked. But this was still a reunion, albeit in the new Normal. Hopefully, the conference will not have induced a massive COVID cluster on top of its numerous scientific and social exchanges! The following days will tell. Congrats to the ISBA 2022 organisers for achieving a most successful event in these times of uncertainty. And looking forward to the next edition in 2024 at Ca' Foscari, Venezia!!!


day four at ISBA 22

Posted in Mountains, pictures, Running, Statistics, Travel, University life on July 3, 2022 by xi'an

Woke up an hour later today! Which left me time to work on [shortening] my slides for tomorrow, run to Mon(t) Royal, and bike to St-Viateur Bagels for freshly baked bagels. (Which seemed to be missing salt, despite my low tolerance for salt in general.)

Terrific plenary lecture by Pierre Jacob, as this year’s Susie Bayarri Lecture, about cut models! It offered a very complete picture of the reasons for seeking modularisation, of the theoretical and practical difficulties with the approach, and of some asymptotics as well. It was followed by a great discussion by Judith on cut posteriors separating parameters of interest from nuisance parameters, especially in semi-parametric models, even introducing two priors on the same parameters! And by Jim Berger, who coauthored with Susie the major cut paper inspiring this work, and who illustrated the concept on computer experiments (without falling into the fallacy pointed out by Martyn Plummer at MCMski IV in Chamonix!).
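As a reminder of the object under discussion, the cut posterior replaces the joint posterior with π_cut(θ₁, θ₂) = π(θ₁|y₁) π(θ₂|θ₁, y₂), blocking feedback from the second module to θ₁. A toy two-stage simulation of this, with artificial Gaussian modules of my own choosing:

```r
# Module 1: y1 ~ N(theta1, 1); module 2: y2 ~ N(theta1 + theta2, 1).
# The cut prevents y2 from informing theta1.
set.seed(1)
n <- 30
y1 <- rnorm(n, mean = 1)
y2 <- rnorm(n, mean = 1.5)                           # truth: theta1 = 1, theta2 = 0.5
M <- 1e4
theta1 <- rnorm(M, mean(y1), 1 / sqrt(n))            # stage 1: module 1 alone
theta2 <- rnorm(M, mean(y2) - theta1, 1 / sqrt(n))   # stage 2: given each theta1 draw
c(mean(theta1), mean(theta2))                        # close to (1, 0.5)
```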

Speaking of which, the Scientific Committee for the upcoming BayesComp²³ in Levi, Finland, held a working meeting, in which I participated, towards building the programme, as the event is getting near. Those interested in proposing a session should start preparing and take advantage of being together in Mon(t)réal, as the call is coming out pretty soon!

Attended a session on divide-and-conquer methods for dependent data, with Sanvesh Srivastava considering the case of hidden Markov models and block-processing the observed sequence. This is sort of justified by the forgetting of long-past observations. I wonder if better performance could be achieved otherwise, as the data on a given time interval essentially brings information on the hidden chain at other time periods as well.

I was informed this morn that Jackie Wong, one of the speakers in our session tomorrow, could not make it to Mon(t)réal for visa reasons. This is unfortunate for him, for the audience, and for everyone involved in the organisation. It reinforces my call for all-time hybrid conferences that avoid penalising (or even discriminating against) participants who cannot physically attend for ethical, political (visa), travel, health, financial, parental, or any other reasons… The objections I most often hear are lower attendance, the risk of a deficit, and the dilution of the community, but there are answers to those, existing or to be invented, and the huge audience at ISBA demonstrates a need for “real” meetings that could be made more inclusive by mirror (low-key, low-cost) meetings.

Finished the day at Isle de Garde with a Pu-erh-flavoured beer, in a particularly lively (if not jazzy) part of the city…

day three at ISBA 22

Posted in Mountains, pictures, Running, Statistics, Travel, University life on July 1, 2022 by xi'an

Still woke up too early [to remain operational for the poster session], finalised the selection of our MASH 2022/3 students, then returned to the Jean-Drapeau pool, which was even more enjoyable on a crisp bright-blue morning (and with hardly anyone in my lane).

Attended a talk by Li Ma, who reviewed complexifying stick-breaking priors on the weights and introduced a balanced-tree stick mechanism (why the same depth?), with links to Jara & Hanson (2010) and Stefanucci & Canale (2021). Then I listened to Giovanni Rebaudo creating clustering Gibbs-type processes along graphs; I sort of dozed and missed the point, as it felt as if the graph turned from a conceptual connection into a physical one! Catherine Forbes talked about a sequential version of stochastic variational approximation (published in Statistics & Computing) exploiting the update-one-at-a-time feature of Bayesian construction, except that each step relies on the previous approximation, meaning that the final (if fin there is!) approximation can end up far away from the optimal stochastic variational approximation. Assessing the divergence away from the target (in real time and on a tight budget) would be nice.
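As a reminder of the vanilla construction such priors elaborate upon, the stick-breaking representation of Dirichlet-process weights takes a few lines (a basic truncated sketch, not Li Ma’s balanced-tree version):

```r
# Stick-breaking: v_k ~ Beta(1, alpha), w_k = v_k * prod_{j < k} (1 - v_j).
stick_break <- function(K, alpha) {
  v <- rbeta(K, 1, alpha)
  v * cumprod(c(1, (1 - v)[-K]))
}
set.seed(1)
w <- stick_break(50, alpha = 1)
sum(w)   # approaches 1 as the truncation level K grows
```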

After a quick lunch where I tasted seaweed-shell gyozas (!), I went to the generalised Bayesian inference session on Gibbs posteriors, [sort of] making up for the missed SAVI workshop! With Alice Kirichenko (Warwick) deriving information-complexity bounds under misspecification, plus an optimal value for the [vexing] coefficient η [in the Gibbs posterior], and Jack Jewson (ex-Warwick) raising the issue of improper models within Gibbs posteriors, although the reference or dominating measure is a priori arbitrary in these settings. But I missed the third talk, about Gibbs posteriors again, and Chris Holmes’ discussion, to attend part of the Savage (thesis) Award session, with finalists Marta Catalano (Warwick faculty), Aditi Shenvi (Warwick student), and John O’Leary (an academic grandchild of mine, as Pierre Jacob was his advisor). What a disappointment to have to wait for Friday night to hear the outcome!
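For readers unfamiliar with the object, a Gibbs posterior tempers a loss rather than a likelihood, π_η(θ|y) ∝ π(θ) exp{-η ℓ(θ, y)}. A minimal grid evaluation, with an arbitrary η and a squared-error loss of my own choosing:

```r
# Gibbs posterior on a grid with squared-error loss and learning rate eta.
set.seed(1)
y <- rnorm(100, mean = 1)
theta <- seq(-2, 4, length.out = 501)
loss  <- sapply(theta, function(t) sum((y - t)^2))
eta   <- 0.5                                  # the vexing coefficient in question
log_post <- dnorm(theta, 0, 10, log = TRUE) - eta * loss
post <- exp(log_post - max(log_post))
post <- post / sum(post)                      # normalised over the grid
theta[which.max(post)]                        # close to mean(y)
```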

I must confess to some (French-speaker) énervement at hearing Mon(t)-réal massacred as Mon-t-real..! A very minor hindrance though, when put in perspective with my friend and Warwick colleague Gareth Roberts being forced to evacuate his hotel last night due to a fire in the basement: he was fortunately unscathed, but it ruined Day 3 for him… (Making me realise the conference hotel itself underwent a similar event 14 years ago.)

day two at ISBA 22

Posted in Mountains, pictures, Running, Statistics, Travel on June 30, 2022 by xi'an

Still woke up too early, which let me go for a long run on Mont Royal (which felt almost immediately familiar from earlier runs at MCM 2017!) at dawn and at a pleasant temperature (but I missed the top bagel bakery on the way back!). Skipped the morning plenary lectures to complete recommendation letters and finish a paper submission. But had a terrific lunch with a good friend I had not seen since before Covid times, at a local branch of Kinton Ramen, which I had already enjoyed in Vancouver, as my Airbnb was located on top of it.

I chaired the afternoon Bayesian computations session, with Onur Teymur presenting the general spirit of his NeurIPS 21 paper on black-box probabilistic numerics, and mentioning that a new textbook on the topic by Philipp Hennig, Michael Osborne, and Hans Kersting had appeared today! The second talk was by Laura Bondi, who discussed an ABC model-choice approach to assess breast cancer screening, with enough missing data (out of 78,051 women followed over 12 years) to lead to an intractable likelihood. Starting with vanilla ABC using 32 summaries and moving to our random forest approach. Unsurprisingly, the two approaches concluded with different top models, without characterising the identifiability provided by the choice of the summaries. The third talk was by Ryan Chan (fresh Warwick PhD recipient), about a Fusion divide-and-conquer approach that avoids the approximation of earlier approaches. In particular, he uses a clever accept-reject algorithm to generate a product of densities using the component densities, a nice trick that Murray explained to me while visiting Paris last month. (The approach appears to be parameterisation dependent.) The final talk was by Umberto Picchini, in a sense the synthetic-likelihood mirror of Massi’s talk yesterday, constructing a guided proposal relying on observed summaries, though without comparing both approaches on a given toy like the g-and-k distribution.
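As a reference point for the vanilla version mentioned above, rejection ABC with summary statistics reduces to a few lines; a generic toy sketch of mine, unrelated to the actual screening data:

```r
# Rejection ABC: keep prior draws whose simulated summaries land near the observed ones.
set.seed(1)
y_obs <- rnorm(100, mean = 2, sd = 1)
s_obs <- c(mean(y_obs), sd(y_obs))                  # two toy summary statistics
N <- 1e5
theta <- runif(N, -5, 5)                            # draws from a flat prior
sims <- t(vapply(theta, function(t) {
  z <- rnorm(100, mean = t, sd = 1)                 # simulate data given theta
  c(mean(z), sd(z))
}, numeric(2)))
dist <- sqrt(rowSums(sweep(sims, 2, s_obs)^2))      # distance between summaries
abc_sample <- theta[dist <= quantile(dist, 0.001)]  # keep the closest 0.1%
mean(abc_sample)                                    # should be near 2
```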
