Archive for Henri Poincaré

probabilistic numerics [book review]

Posted in Books, pictures, Statistics, Travel on July 28, 2023 by xi'an

Probabilistic Numerics: Computation as Machine Learning is a 2022 book by Philipp Hennig, Michael Osborne, and Hans Kersting that was sent to me by CUP (upon my request and almost free of charge, as I had to pay customs charges, thanks to Brexit!). It carries the important message of bringing statistical tools to numerics. I remember Persi Diaconis calling for (such) actions in the 1980's (and even reading a paper of his on the topic along with George Casella in Ithaca while waiting for his car to get serviced!).

From a purely aesthetic viewpoint, the book reads well, offers a beautiful cover, and sells for quite a reasonable price for an academic book. Plus, it is associated with a website containing a draft version of the book, along with code, links to courses, research, and conferences. As a side remark, it enjoys very wide margins that may have encouraged an inflation of footnotes (but also of exercises), except when formulas get in the way (as, e.g., on p.40).

The figure below is an excerpt from the introduction that sets the scene of probabilistic numerics, involving algorithms as agents that gather data and make decisions, with an obvious analogy with standard Bayesian decision theory. Modelling uncertainty is missing from the picture (if not from the book, as explained later by the authors as an argument against attaching the label Bayesian to the field). The introduction also refers to Henri Poincaré for originating the notion of prior versus posterior uncertainty about a mathematical quantity, followed by early works from the Russian school of probability, somewhat ignored until the machine-learning revolution and a 2012 NIPS workshop organised by the authors. (I participated in a follow-up workshop at NIPS 2015.)

In this nicely written section, I object to the authors' argument that a frequentist, as opposed to a Bayesian, “has the loss function in mind from the outset” (p.9), since the loss function is logically inseparable from the prior and considered from the outset. I also like very much the conclusion to that introduction, namely that the main messages (of the book) are that (verbatim)

  • classical methods are probabilist (p.10)
  • numerical methods are autonomous agents (p.11)
  • numerics should not be random (if not a rejection of the concept of Monte Carlo methods, p.1, but probabilistic numerics being opposed to stochastic numerics, p.67)
  • numerics must report calibrated uncertainty (p.12)
  • imprecise computation is to be embraced (p.12)
  • probabilistic numerics consolidates numerical computation and statistical inference (p.13)
  • probabilistic numerical algorithms are already adding value (p.13)
  • pipelines of computation demand harmonisation

“Is it still reasonable to labour under computational constraints conceived in the 1940s?” (p.113)

“rather than being equally good for any number of dimensions, Monte Carlo is perhaps better thought of as being equally bad” (p.110)

Chapter I is a 40p infodump (!) on mathematical concepts needed for the following parts. Chapter II is about integration, opposing again PN and Monte Carlo (with the strange remark that MCMC does not achieve the √N convergence rate, p.72), in the sense that the latter is frequentist in that it does not use a prior on the object of interest [unless considering a limiting improper version as in Section 12.2, an intriguing concept in this setup, as I wonder whether or not improper priors can at all be contemplated] and hence that the stochasticity does not reflect uncertainty but rather the impact of the simulated sample. The chapter advocates Bayesian quadrature (with some weird convergence graphs exhibiting a high variability with the number of iterations that is apparently not discussed) and brings in the fascinating perspective of model choice in that framework (leading to the computation of a posterior probability for each model!). Being evidently biased towards Monte Carlo, I find the opposition in Chapter 12 unnecessarily antagonistic, even though the book presents Monte Carlo methods as a form of minimax solution, all the more because quasi-Monte Carlo methods are hardly discussed (or are dismissed), as illustrated by the following picture (p.115) and the above quotes. (And I won't even go into the absurdity of §12.3 trashing pseudo-random generators as “painfully dumb”.)
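For readers unfamiliar with Bayesian quadrature, here is a minimal one-dimensional sketch of my own (a toy illustration, not code from the book): a GP prior with an RBF kernel on the integrand, integrated against a standard normal measure, for which the kernel mean embedding is available in closed form. The length-scale, node grid, and jitter below are arbitrary choices of mine.

```python
import numpy as np

def bayesian_quadrature(f, nodes, ell=0.8, jitter=1e-8):
    """Posterior mean of ∫ f(x) N(x; 0, 1) dx under a GP(0, RBF) prior on f."""
    x = np.asarray(nodes, dtype=float)
    # RBF Gram matrix of the evaluation nodes
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ell**2)
    # closed-form kernel mean embedding of the RBF kernel against N(0,1)
    z = np.sqrt(ell**2 / (ell**2 + 1)) * np.exp(-0.5 * x**2 / (ell**2 + 1))
    # quadrature weights: z' K⁻¹, applied to the function evaluations
    weights = np.linalg.solve(K + jitter * np.eye(len(x)), z)
    return weights @ f(x)

# toy check: ∫ x² dN(0,1) = 1
est = bayesian_quadrature(lambda x: x**2, np.linspace(-3, 3, 15))
```

Unlike Monte Carlo, the nodes are deterministic and the reported uncertainty comes from the GP posterior, not from sampling variability, which is precisely the stochastic-versus-probabilistic opposition drawn by the book.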

Chapter III is a sort of dual of Chapter II for linear algebra numerics, primarily solving linear equations by Gaussian solvers, which introduces new concepts like Krylov sequences, although it sounds quite specific (to an outsider like me). Chapters IV and V deal with the more ambitious prospect of optimisation, reconsidering classics and expanding into Bayesian optimisation, using Gaussian process priors and defining specific loss functions, thus bringing in a strong link with machine-learning tools and goals. [Citation typo on p.277.] Chapter VII addresses the resolution of ODEs by a Bayesian state-space model representation and (again!) Gaussian processes, reaching as far as mentioning inverse problems, and offering a short finale on prospective steps for interested readers.

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Books Review section in CHANCE.]

in defense of subjectivity [sound the gong]

Posted in Books, Statistics, Travel, University life on October 13, 2022 by xi'an

When browsing the IMS Bulletin [01 October] a few days ago, I saw that Ruobin Gong (from Rutgers) had written an opinion column about subjectivism, in response to [IMS President] Krzys Burdzy's presidential address at the IMS Meeting in London a few months earlier. An address that I had missed, and where he was calling for the end of the term subjective in statistics… (while ironically attending the Bayesian conference in Montréal!). Given the tone of his Search for Certainty book, which Andrew, Larry, and I discussed a while ago, I am not at all surprised by another go at Bayesian statistics, but I will not indulge in another response, since Krzys found my earlier review “venomous”! Especially since Ruobin has produced a deeply argued and academically grounded criticism of the presidential address (which, if I may mention it, sounds to ramble rather far from statistics). In particular, Ruobin introduces Objectivity³ as “an interpreted characterization of the scientific object”, which reminds me of Nietzsche's aphorism about physics, and for which personal and collegial inputs are pluses, even though they could be qualified as “subjective”. This was also Poincaré's argument for Bayesian reasoning. In conclusion, I think that the London call to cease using the term in statistics was neither timely (as the subjective-versus-objective debate has sort of dried out) nor appropriate (in that it clashed with the views of part of the IMS community).

a journal of the plague year² [not there yet]

Posted in Books, Kids, pictures, Travel, University life, Wines on November 27, 2021 by xi'an

Returned to Warwick once more, with “traffic-as-usual” at Charles de Gaulle airport, including a single border officer for the entire terminal, a short-lived fright that I actually needed a PCR test on top of my vaccine certificate to embark (due to wrong signage), a one-hour delay at departure due to foggy conditions in B'ham, and another ½-hour delay at arrival due to a shortage of staff and hence no exit stairs available! And I got a tense return to B'ham as the taxi line in Warwick had vanished!

Read the first novel of P. Djèlí Clark, A Master of Djinn, after reading a series of his short stories and novellas taking place in the same fantastic Cairo of the early 1900s. This was enjoyable, mostly, again thanks to well-constructed characters (apart from the arch-villain) and the appeal of the magical Cairo imagined by the author. I did not feel the appearances of Raymond Poincaré or von Bismarck were really needed, though. Also kindled A History of What Comes Next, by Sylvain Neuvel, which I got as a free (Tor) book. It is an interesting take on the space race, with a pair of (super-)women behind the entire thing, and a lot of connections to the actual history. I somehow got tired in the middle, even though I finished the book during my commute to and from Warwick.

Watched within a week My Name, a dark Korean TV drama, as I found it very good and rather original (albeit with some similarities with the excellent Jeju-based Night in Paradise). The storyline is that of a young woman, Ji Woo, seeking revenge on her father's killer by joining the criminal gang her father was part of and infiltrating the police (not really a spoiler!). At the beginning, this sounded like gang glorification, hence rather unappealing, but soon things proved to be quite different from how they first appeared. The scenario is of course most unrealistic, especially the (brutal and gory) fights where the heroine takes down endless rows of gang members and where the participants almost always recover from knife injuries that should have been fatal, or at least permanently damaging, not to mention the ineffectiveness of the police in stopping the drug dealers. However, when watched as a theatrical performance, the main characters in My Name, most especially Ji Woo, are well-constructed and ambiguous enough to make this descent into darkness worth watching. (Given the conclusion of the series, I cannot imagine a second season being made.) Also had a short go at Night Teeth, which proved a complete waste of time!

conditioning on insufficient statistics in Bayesian regression

Posted in Books, Statistics, University life on October 23, 2021 by xi'an

“…the prior distribution, the loss function, and the likelihood or sampling density (…) a healthy skepticism encourages us to question each of them”

A paper by John Lewis, Steven MacEachern, and Yoonkyung Lee has recently appeared in Bayesian Analysis. It starts with the great motivation of a misspecified model requiring the use of a (thus necessarily) insufficient statistic, and moves to its central concern of simulating the posterior based on that statistic.

Model misspecification remains understudied from a Bayesian perspective and this paper is thus most welcome in addressing the issue. However, when reading through, one of my criticisms is in defining misspecification as equivalent to outliers in the sample. An outlier model is an easy case of misspecification, in the end, since the original model remains meaningful. (Why should there be “good” versus “bad” data?) Furthermore, adding a non-parametric component for the unspecified part of the data would sound like a “more Bayesian” alternative. Unrelatedly, I also idly wondered whether or not normalising flows could be used in this instance…

The problem of selecting a T (Darjeeling, of course!) is not really discussed there, while each choice of a statistic T leads to a different meaning of misspecification, and suggests a comparison with Bayesian empirical likelihood.

“Acceptance rates of this [ABC] algorithm can be intolerably low”

Erm, this is not really the issue with ABC, is it?! Especially when the tolerance is induced by the simulations themselves.

When I reached the MCMC (Gibbs?) part of the paper, I first wondered at its relevance for the misspecification issue before realising it had become the focus of the paper. Now, simulating the observations conditional on a value of the summary statistic T is a true challenge. I remember for instance George Casella mentioning it in association with a Student's t sample in the 1990's, and Kerrie and I having an unsuccessful attempt at it in the same period. Persi Diaconis has written several papers on the problem and I am thus surprised at the dearth of references here, like the rather recent Byrne and Girolami (2013), Florens and Simoni (2015), or Bornn et al. (2019). In the present case, the linear model assumed as the true model has the exceptional feature that it leads to a feasible transform of an unconstrained simulation into a simulation with fixed statistics, with no measure-theoretic worries, if not free from considerable efforts to establish that the operation is truly valid… And, while simulating (θ,y) makes perfect sense in an insufficient setting, the cost is then precisely the same as when running a vanilla ABC, which brings us to the natural comparison with ABC. While taking ε=0 may sound optimal for being “exact”, it is not from an ABC perspective, since the convergence rate of the (summary) statistic should be roughly the one of the tolerance (Fearnhead and Liu, Frazier et al., 2018).
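To make the comparison concrete, here is a vanilla rejection-ABC sketch of my own (a toy, not the paper's algorithm): a normal mean problem with the sample mean as summary statistic (sufficient here, unlike the paper's setting), a flat prior, and a hard tolerance ε. The prior range, sample size, and ε are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def abc_rejection(y_obs, n_sims=100_000, eps=0.05):
    """Rejection ABC for the mean θ of a N(θ, 1) sample,
    with the sample mean as summary statistic."""
    s_obs = y_obs.mean()
    n = len(y_obs)
    theta = rng.uniform(-5, 5, n_sims)              # flat prior on [-5, 5]
    # summary of a simulated sample: mean of n i.i.d. N(θ, 1) draws
    s_sim = theta + rng.normal(0, 1 / np.sqrt(n), n_sims)
    keep = np.abs(s_sim - s_obs) < eps              # tolerance zone
    return theta[keep]

y = rng.normal(1.0, 1.0, 50)
post = abc_rejection(y)                             # approximate posterior draws
```

With ε > 0 the accepted θ's target a smoothed posterior; letting ε → 0 recovers the exact conditional on the summary, at the price of a vanishing acceptance rate, which is the trade-off behind the tolerance-rate results cited above.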

“[The Borel Paradox] shows that the concept of a conditional probability with regard to an isolated given hypothesis whose probability equals 0 is inadmissible.” A. Колмого́ров (1933)

As a side note for measure-theoretic purists, the derivation of the conditional of y given T(y)=T⁰ is arbitrary since the event has probability zero (i.e., the conditioning set is of measure zero); see the Borel-Kolmogorov paradox. The computations in the paper are undoubtedly correct, but they stand for only one arbitrary choice of a transform (or of a conditioning σ-algebra).
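Kolmogorov's own sphere example makes the arbitrariness concrete (standard textbook material, in my notation): for a point uniform on the unit sphere with latitude φ and longitude λ, the limiting conditional on a great circle depends on which coordinate is sent to zero, even though the two circles are geometrically identical.

```latex
% uniform distribution on the sphere, in (latitude, longitude) coordinates
f(\varphi,\lambda) = \frac{\cos\varphi}{4\pi},
\qquad \varphi\in[-\tfrac{\pi}{2},\tfrac{\pi}{2}],\ \lambda\in[-\pi,\pi)
% conditioning on the equator (φ = 0) yields a uniform distribution
f(\lambda \mid \varphi = 0) = \frac{1}{2\pi}
% while conditioning on a meridian (λ = 0) does not
f(\varphi \mid \lambda = 0) = \frac{\cos\varphi}{2}
```

Both conditionals live on a great circle of the same sphere, yet one is uniform and the other is not: the conditional is defined only relative to the family of events (the σ-algebra) used to take the limit.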

maison Poincaré

Posted in Travel, University life on November 15, 2020 by xi'an