the T-shirts I love [book/closet review]

When I first heard of Haruki Murakami’s book on tee-shirts, I found the concept sufficiently intriguing to start looking for the book, and I eventually found a cheap used copy on Amazon that got delivered to a friend in the US (who was most perplexed by my choice!). Having gone through the book and its 110 photos of tee-shirts, I feel like I had a light late-evening conversation with the author and a window into the reasons why he keeps and seeks so many tees. This is a translation from Japanese, so I cannot say how colloquial Murakami was in the original, but it is most enjoyable (in a very light sense!). Having worn tee-shirts for all of my adult life (and none during my childhood), albeit without any comparable collection, by far!, I can relate to some categories like

  1. race tees (which have now almost completely vanished, being replaced with synthetic running tops), of which my favourite is the 1988 Skunk Cabbage Classic tee celebrating the 5k race organised every year by the Finger Lakes Runners Club
  2. beer tees, like my favourites advertising Yellowstone’s Moose Drool brown ale [and supposedly dyed in the beer?!] and Salt Lake City Full Suspension [with the fantastically ironic motto Beers you can believe in!]
  3. bars/pubs tees, like the one I bought at the Clachaig Inn, Glencoe
  4. institution tees, with my favourite being the iconic U of T Austin ochre shirt with a longhorn skull
  5. and, to diverge from Murakami’s surfing section, mountaineering places/brand tees, of which the homemade þe Norse Farce is the obvious selection!

And I spotted neither a shared tee among the 110 selected and published ones, nor any one I would desperately seek.

Poisson-Belgium 0-0

“Statistical match predictions are more accurate than many people realize (…) For the upcoming Qatar World Cup, Penn’s model suggests that Belgium (…) has the highest chances of raising the famous trophy, followed by Brazil”

Even Nature had to get entries on the current football World Cup, with a paper on data analytics reaching football coaches and teams. This is not exactly prime news, as I remember visiting the Department of Statistics of the University of Glasgow in the mid-1990s and chatting with a very friendly doctoral student who was consulting for the Glasgow Rangers (or Celtic?!) on the side at the time, and who then went back to Ireland to continue with a local team (Galway?!).

The paper reports on different models, including one double-Poisson model by (PhD) Matthew Penn from Oxford and (maths undergraduate) Joanna Marks from Warwick, which presumably resembles the double-Poisson version set by Leonardo Egidi et al. and posted on Andrew’s blog a few days ago, following an earlier model by my friends Karlis & Ntzoufras in 2003. While predictive models can obviously fail, this attempt missed the early elimination of Belgium, Germany, Switzerland, Mexico, Uruguay, and Denmark from the cup. One possible reason imho is that national teams do not play that often, when players are employed by different clubs in many countries, hence are hard to assess, but I cannot claim any expertise or interest in the game.
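For readers unfamiliar with the term, here is a minimal sketch of what a double-Poisson model amounts to: the two teams' goal counts are treated as independent Poisson variables, each with its own scoring rate. This is not the model fitted by Penn and Marks or by Egidi et al.; the function names, the log-linear attack/defence parametrisation, and the numerical values below are all assumptions made purely for illustration.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals under a Poisson(rate) distribution."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def scoring_rate(attack, defence, baseline=0.0):
    """Hypothetical log-linear rate: own attack strength minus opponent defence."""
    return math.exp(baseline + attack - defence)


def score_probability(x, y, rate_a, rate_b):
    """P(A scores x, B scores y), assuming the two counts are independent."""
    return poisson_pmf(x, rate_a) * poisson_pmf(y, rate_b)


def match_probabilities(rate_a, rate_b, max_goals=15):
    """Return (P(A wins), P(draw), P(B wins)) by summing over a truncated score grid."""
    win = draw = loss = 0.0
    for x in range(max_goals + 1):
        for y in range(max_goals + 1):
            p = score_probability(x, y, rate_a, rate_b)
            if x > y:
                win += p
            elif x == y:
                draw += p
            else:
                loss += p
    return win, draw, loss


# Illustrative strengths only, not estimates for any real team.
rate_a = scoring_rate(attack=0.4, defence=0.1)
rate_b = scoring_rate(attack=0.2, defence=0.3)
win, draw, loss = match_probabilities(rate_a, rate_b)
nil_nil = score_probability(0, 0, rate_a, rate_b)
```

Under this independence assumption, the probability of a 0-0 draw is simply exp(-(rate_a + rate_b)), so it is never zero, however strong the attacks are.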

Finite mixture models do not reliably learn the number of components

When preparing my talk for Padova, I found that Diana Cai, Trevor Campbell, and Tamara Broderick wrote this ICML / PMLR paper last year on the impossible estimation of the number of components in a mixture.

“A natural check on a Bayesian mixture analysis is to establish that the Bayesian posterior on the number of components increasingly concentrates near the truth as the number of data points becomes arbitrarily large.” Cai, Campbell & Broderick (2021)

This seems to contradict [my formerly-Glaswegian friend] Agostino Nobile, who showed in his thesis that the posterior on the number of components does concentrate at the true number of components, provided the prior contains that number in its support, as well as numerous papers on the consistency of the Bayes factor, including the one against an infinite mixture alternative, as we discussed in our recent paper with Adrien and Judith. It also reminded me of the rebuke I got in 2001 from the late David MacKay when mentioning that I did not believe in estimating the number of components, both because of the impact of the prior modelling and of the tendency of the data to push for more clusters as the sample size increased. (This was a most lively workshop Mike Titterington and I organised at ICMS in Edinburgh, where Radford Neal also delivered an impromptu talk to argue against using the Galaxy dataset as a benchmark!)

“In principle, the Bayes factor for the MFM versus the DPM could be used as an empirical criterion for choosing between the two models, and in fact, it is quite easy to compute an approximation to the Bayes factor using importance sampling” Miller & Harrison (2018)

It is however a point made in Miller & Harrison (2018) that the estimation of k logically goes south if the data is not from the assumed mixture model. In this paper, Cai et al. demonstrate that the posterior diverges, even when the prior depends on the sample size, or even on the sample, as in empirical Bayes solutions.

Ring of Steall 2022

statistical aspects of climate change [discuss]

As part of its annual conference in Aberdeen, Scotland, the RSS is organising a discussion meeting on Wednesday 14 September 2022, 5.00PM – 7.00PM (GMT+1), with free on-line registration.

Two papers will be presented:

‘Assessing present and future risk of water damage using building attributes, meteorology, and topography’ by Heinrich-Mertsching et al.
‘The importance of context in extreme value analysis with application to extreme temperatures in the USA and Greenland’ by Clarkson et al.

“The Discussion Meeting at this year’s RSS conference in Aberdeen will feature two papers on the Statistical Aspects of Climate Change. The Discussion Meetings Committee chose this topic area motivated by the UN Climate Change Conference (COP26) held in Glasgow last year and because climate changes and the environment is one of the RSS’s six current campaigning priorities for 2022.

You are welcome to listen to the speakers and join in the discussion of the papers which follows the presentations. All the proceedings will be published in a forthcoming issue of Journal of the Royal Statistical Society, Series C (Applied Statistics).”

Dr Shirley Coleman, Chair and Honorary Officer for Discussion Meetings

