Archive for Bayesian statistics

I’m getting the point

Posted in Statistics with tags , , , , , , on February 14, 2019 by xi'an

A long-winded X validated discussion on the [textbook] mean-variance conjugate posterior for the Normal model left me [mildly] depressed at the point and use of answering questions on this forum. Especially as it came at the same time as a catastrophic outcome for my mathematical statistics exam.  Possibly an incentive to quit X validated as one quits smoking, although this is not the first attempt

AIQ [book review]

Posted in Books, Statistics with tags , , , , , , , , , , , , , , , , , , on January 11, 2019 by xi'an

AIQ was my Christmas day read, which I mostly read while the rest of the household was still sleeping. The book, written by two Bayesians, Nick Polson and James Scott, was published before the ISBA meeting last year, but I only bought it on my last trip to Warwick [as a Xmas present]. This is a pleasant book to read, especially while drinking tea by the fire!, well-written and full of facts and anecdotes I did not know or had forgotten (more below). Intended for a general audience, it is also quite light, from a technical side, rather obviously, but also from a philosophical side. While strongly positivist about the potential of AIs for the general good, it cannot be seen as an antidote to the doomlike Superintelligence by Nick Bostrom or the more factual Weapons of Maths Destruction by Cathy O’Neal. (Both commented on the ‘Og.)

Indeed, I find the book quite benevolent and maybe a wee bit too rosy in its assessment of AIs and the discussion on how Facebook and Russian intervention may have significantly to turn the White House Orange is missing [imho] the viral nature of the game, when endless loops of highly targeted posts can cut people from the most basic common sense. While the authors are “optimistic that, given the chance, people can be smart enough”, I do reflect on the sheer fact that the hoax that Hillary Clinton was involved in a child sex ring was ever considered seriously by people. To the point of someone shooting at the pizza restaurant. And I hence am much less optimistic at the ability for a large enough portion of the population, not even the majority, to keep a critical distance from the message carried by AI driven media. Similarly, while Nick and James point out (rather late in the book) that big data (meaning large data) is not necessarily good data for being unrepresentative at the population at large, they do not propose (in the book) highly convincing solutions to battle bias in existing and incoming AIs. Leading to a global worry that AIs may do well for a majority of the population and discriminate against a minority by the same reasoning. As described in Cathy O’Neal‘s book, and elsewhere, proprietary software does not even have to explain why it discriminates. More globally, the business school environment of the authors may have prevented them from stating a worry on the massive power grab by the AI-based companies, which genetically grow with little interest in democracy and states, as shown (again) by the recent election or their systematic fiscal optimisation. Or by the massive recourse to machine learning by Chinese authorities towards a social credit system grade for all citizens.

“La rage de vouloir conclure est une des manies les plus funestes et les plus stériles qui appartiennent à l’humanité. Chaque religion et chaque philosophie a prétendu avoir Dieu à elle, toiser l’infini et connaître la recette du bonheur.” Gustave Flaubert

I did not know about Henrietta Leavitt’s prediction rule for pulsating stars, behind Hubble’s discovery, which sounds like an astronomy dual to Rosalind Franklin’s DNA contribution. The use of Bayes’ rule for locating lost vessels is also found in The Theorem that would not die. Although I would have also mentioned its failure in locating Malaysia Airlines Flight 370. I had also never heard the great expression of “model rust. Nor the above quote from Flaubert. It seems I have recently spotted the story on how a 180⁰ switch in perspective on language understanding by machines brought the massive improvement that we witness today. But I cannot remember where. And I have also read about Newton missing the boat on the precision of the coinage accuracy (was it in Bryson’s book on the Royal Society?!), but with less neutral views on the role of Newton in the matter, as the Laplace of England would have benefited from keeping the lax measures of assessment.

Great to see friendly figures like Luke Bornn and Katherine Heller appearing in the pages. Luke for his work on the statistical analysis of basketball games, Katherine  for her work on predictive analytics in medicine. Reflecting on the missed opportunities represented by the accumulation of data on any patient throughout their life that is as grossly ignored nowadays as it was at Nightingale‘s time. The message of the chapter [on “The Lady with the Lamp”] may again be somewhat over-optimistic: while AI and health companies see clear incentives in developing more encompassing prediction and diagnostic techniques, this will only benefit patients who can afford the ensuing care. Which, given the state of health care systems in the most developed countries, is an decreasing proportion. Not to mention the less developed countries.

Overall, a nice read for the general public, de-dramatising the rise of the machines!, and mixing statistics and machine learning to explain the (human) intelligence behind the AIs. Nothing on the technical side, to be sure, but this was not the intention of the authors.

Binomial vs Bernoulli

Posted in Books, Statistics with tags , , , , on December 25, 2018 by xi'an

An interesting confusion on X validated where someone was convinced that using the Bernoulli representation of a sequence of Bernoulli experiments led to different posterior probabilities of two possible models than when using their Binomial representation. The confusion actually stemmed from using different conditionals, namely N¹=4,N²=1 in the first case (for a model M¹ with two probabilities p¹ and p²) and N¹+N²=5 in the second case (for a model M² with a single probability p⁰). While (N¹,N²) is sufficient for the first model and N¹+N² is sufficient for the second model, P(M¹|N¹,N²) is not commensurable to P(M²|N¹+N²)! Another illustration of the fickleness of the notion of sufficiency when comparing models.

at CIRM [jatp]

Posted in Mountains, pictures, Running, Travel with tags , , , , , , , , , , , , , , , , , on October 21, 2018 by xi'an

a jump back in time

Posted in Books, Kids, Statistics, Travel, University life with tags , , , , , , , , , , , on October 1, 2018 by xi'an

As the Department of Statistics in Warwick is slowly emptying its shelves and offices for the big migration to the new building that is almost completed, books and documents are abandoned in the corridors and the work spaces. On this occasion, I thus happened to spot a vintage edition of the Valencia 3 proceedings. I had missed this meeting and hence the volume for, during the last year of my PhD, I was drafted in the French Navy and as a result prohibited to travel abroad. (Although on reflection I could have safely done it with no one in the military the wiser!) Reading through the papers thirty years later is a weird experience, as I do not remember most of the papers, the exception being the mixture modelling paper by José Bernardo and Javier Giròn which I studied a few years later when writing the mixture estimation and simulation paper with Jean Diebolt. And then again in our much more recent non-informative paper with Clara Grazian.  And Prem Goel’s survey of Bayesian software. That is, 1987 state of the art software. Covering an amazing eighteen list. Including versions by Zellner, Tierney, Schervish, Smith [but no MCMC], Jaynes, Goldstein, Geweke, van Dijk, Bauwens, which apparently did not survive the ages till now. Most were in Fortran but S was also mentioned. And another version of Tierney, Kass and Kadane on Laplace approximations. And the reference paper of Dennis Lindley [who was already retired from UCL at that time!] on the Hardy-Weinberg equilibrium. And another paper by Don Rubin on using SIR (Rubin, 1983) for simulating from posterior distributions with missing data. Ten years before the particle filter paper, and apparently missing the possibility of weights with infinite variance.

There already were some illustrations of Bayesian analysis in action, including one by Jay Kadane reproduced in his book. And several papers by Jim Berger, Tony O’Hagan, Luis Pericchi and others on imprecise Bayesian modelling, which was in tune with the era, the imprecise probability book by Peter Walley about to appear. And a paper by Shaw on numerical integration that mentioned quasi-random methods. Applied to a 12 component Normal mixture.Overall, a much less theoretical content than I would have expected. And nothing about shrinkage estimators, although a fraction of the speakers had worked on this topic most recently.

At a less fundamental level, this was a time when LaTeX was becoming a standard, as shown by a few papers in the volume (and as I was to find when visiting Purdue the year after), even though most were still typed on a typewriter, including a manuscript addition by Dennis Lindley. And Warwick appeared as a Bayesian hotpot!, with at least five papers written by people there permanently or on a long term visit. (In case a local is interested in it, I have kept the volume, to be found in my new office!)