What should have been the last puzzle in the Le Monde competition turned out to be an anticlimactic fizzle on how many yes-no questions are needed to identify an integer between 1 and 1025=2¹⁰+1, with an extension to replies possibly being lies…
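For the version without lies, the count can be checked with a minimal Python sketch, assuming the questioner simply halves the set of remaining candidates at each question. The function name `worst_case_questions` is mine, not the puzzle's, and the extension to lying replies is not covered here.

```python
def worst_case_questions(n):
    """Worst-case number of yes-no questions needed to identify an integer
    in 1..n when each question splits the remaining candidates in two."""
    questions = 0
    remaining = n
    while remaining > 1:
        # The unlucky answer always leaves the larger half of the candidates.
        remaining = (remaining + 1) // 2
        questions += 1
    return questions


print(worst_case_questions(1024))  # 10
print(worst_case_questions(1025))  # 11
```

The extra integer beyond 2¹⁰ is what costs the eleventh question.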

What is much more exciting is that voting puzzle #1021 got cancelled because the authors of this puzzle thought the cascading majority rule would produce the optimal solution and it does not! (As exhibited by my R code.) So here is an open problem to ponder! (And another puzzle in the pipeline to complete the competition.)

As our son is doing an internship in Luxembourg City this semester, we visited him last weekend and took the opportunity to visit the Museum of Modern Art (or MUDAM) there. The building itself is quite impressive, inserted in the walls of the 18th Century Fort Thüngen designed by Vauban, with a very luminous and airy structure designed by Ieoh Ming Pei. The main exhibit at the MUDAM is a coverage of the work of Su-Mei Tse, an artist from Luxembourg I did not know but whose vision I find both original and highly impressive, playing on scales and space, from atoms to planets… With connections to Monet’s Nymphéas. And an almost raw rendering of rock forms that I appreciate most particularly!

The bottom floor also contains an extensive display of the political drawings of Ad Reinhardt, who is more (?) famous for his black-on-black series…

Arnak Dalalyan and Avetik Karagulyan (CREST) arXived a paper the other week on a focussed study of the Langevin algorithm [not MALA] when the gradient of the target is incorrect. With the following improvements [quoting non-verbatim from the paper]:

an extension of convergence results for error-prone evaluations of the gradient of the target (i.e., the gradient is replaced with a noisy version, under some moment assumptions that do not include unbiasedness);

a new second-order sampling algorithm termed LMCO’, with improved convergence properties.

What is particularly interesting to me in this setting is the use in all these papers of a discretised Langevin diffusion (a.k.a., random walk with a drift induced by the gradient of the log-target) without the original Metropolis correction. The results rely on an assumption of [strong?] log-concavity of the target, with “user-friendly” bounds on the Wasserstein distance depending on the constants appearing in this log-concavity constraint. And so does the adaptive step. (In the case of the noisy version, the bias and variance of the noise also matter. As pointed out by the authors, there is still applicability to scaling MCMC for large samples. Beyond pseudo-marginal situations.)
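To make the setting concrete, here is a minimal Python sketch of a discretised Langevin diffusion without Metropolis correction, run on a one-dimensional target proportional to exp(-f). It is my own toy illustration under stated assumptions, not the authors' code: the function name `langevin_monte_carlo`, the `bias` and `noise_sd` parameters (a constant bias plus Gaussian noise added to the gradient), and the standard Normal target are all choices made for the example.

```python
import math
import random


def langevin_monte_carlo(grad_f, x0, step, n_iter, bias=0.0, noise_sd=0.0, seed=0):
    """Unadjusted (no Metropolis step) discretised Langevin diffusion in 1-D
    for a target proportional to exp(-f), with an optionally inaccurate gradient."""
    rng = random.Random(seed)
    x = x0
    chain = []
    for _ in range(n_iter):
        # Inaccurate gradient: exact gradient plus a constant bias and
        # zero-mean Gaussian noise (hypothetical choices for illustration).
        g = grad_f(x) + bias + noise_sd * rng.gauss(0.0, 1.0)
        # Euler step: drift along -grad f, then a Gaussian move of variance 2*step.
        x = x - step * g + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


def mean_and_variance(values):
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / len(values)


# Standard Normal target: f(x) = x^2 / 2, so grad f(x) = x (strongly log-concave).
exact = langevin_monte_carlo(lambda x: x, x0=5.0, step=0.1, n_iter=50_000)
mean_exact, var_exact = mean_and_variance(exact[1_000:])  # drop a short burn-in

# Same target with a gradient that is systematically off by +1.
biased = langevin_monte_carlo(lambda x: x, x0=5.0, step=0.1, n_iter=50_000, bias=1.0)
mean_biased, var_biased = mean_and_variance(biased[1_000:])

print(mean_exact, var_exact)
print(mean_biased, var_biased)
```

On this toy target the chain is an AR(1) process, so the lack of Metropolis correction is visible in closed form: with step h the stationary variance is 1/(1-h/2), about 1.05 for h=0.1 rather than 1, and a constant gradient bias b shifts the stationary mean to -b.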

“…this, at first sight very disappointing behavior of the LMC algorithm is, in fact, continuously connected to the exponential convergence of the gradient descent.”

The paper concludes with an interesting mise en parallèle of Langevin algorithms and of gradient descent algorithms, since the convergence rates are the same.
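A toy case where the parallel is easy to see, sketched in Python under my own assumptions (quadratic f(x)=x²/2, hypothetical function names, and not the argument of the paper): the Langevin update is the gradient descent update plus Gaussian noise, so for this f the average of many Langevin chains after k steps should sit close to the k-th gradient descent iterate, both contracting geometrically at rate 1-h.

```python
import math
import random


def gradient_descent(grad_f, x0, step, n_iter):
    """Plain gradient descent on f, returning the final iterate."""
    x = x0
    for _ in range(n_iter):
        x = x - step * grad_f(x)
    return x


def langevin_final_state(grad_f, x0, step, n_iter, rng):
    """Final state of one unadjusted Langevin chain targeting exp(-f)."""
    x = x0
    for _ in range(n_iter):
        # Same drift as gradient descent, plus Gaussian noise of variance 2*step.
        x = x - step * grad_f(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
    return x


def grad_f(x):
    # f(x) = x^2 / 2, i.e. a standard Normal target for the Langevin chain.
    return x


step, n_iter, x0 = 0.1, 20, 5.0
rng = random.Random(1)

gd_iterate = gradient_descent(grad_f, x0, step, n_iter)
finals = [langevin_final_state(grad_f, x0, step, n_iter, rng) for _ in range(2_000)]
langevin_mean = sum(finals) / len(finals)

print(gd_iterate)     # equals x0 * (1 - step) ** n_iter up to rounding
print(langevin_mean)  # Monte Carlo estimate, close to the value above
```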

I am off to Venezia this afternoon for a Franco-Italian workshop organised by my friends Monica Billio, Roberto Casarin, and Matteo Iacopini, from the Department of Economics of Ca’ Foscari, almost exactly a year after my previous trip there for ESOBE 2016. (Except that this was before!) Tomorrow, I will give both a tutorial [for the second time in two weeks!] and a talk on ABC, hopefully with some portion of the audience still there for the second part!

Another chance occurrence led me to read that not so recent book by Kazuo Ishiguro, taking advantage of my short nights while in Warwick. [I wrote this post before the unexpected Nobelisation of the author.] As in earlier novels of his, the strongest feeling is one of melancholia, of things that had been or were supposed to have been and are no longer. Especially the incomparable The Remains of the Day… In the great tradition of the English [teen] novel, this ideal universe is a boarding school, where a group of students bond and grow up, until they face the real world. The story is told with a lot of flashbacks and personal impressions of the single narrator, which made me uncertain of the reality behind her perception and recasting. And of her role and actions within that group, since they always appear more mature and sensible than the others’. The sinister features of this boarding school and the reasons why these children are treated differently emerge very, very slowly through the book and the description of their treatment remains unclear till the end of the book. Purposely so. However, once one understands the very reason for their existence, the novel loses its tension, as the perpetual rotation of their interactions gets inconsequential when faced with their short destinies. While one can get attached to the main characters, the doom awaiting them blurs the relevance of their affairs and disputes. Maybe what got me so quickly distanced from the story is the complacency of these characters and the lack of rebellion against their treatment, unless of course it was the ultimate goal of Ishiguro to show that readers, as the “normal” characters in the story, would come to treat the other ones as not completely human… While the final scene about souvenirs and memories sounding like plastic trash trapped on barbed wires seems an easy line, I appreciated the slow construction of the art pieces of Tommy and the maybe too obvious link with their own destiny.

When searching for reviews about this book, I discovered a movie had been made out of this book, in 2011, with the same title. And of which I had never heard either..! [Which made me realise the characters were all very young when they died.]

Last week a colleague from Warwick forwarded us a short argumentation by Donald Macnaughton (a “Toronto-based statistician”) about switching the name of our field from Statistics to Data Science. This is not the first time I have heard of this proposal and this is not the first time I express my strong disagreement with it! Here are the naughtonian arguments:

Statistics is (at least in the English language) endowed with several meanings, from the compilation of numbers out of a series of observations, to the field, to the procedures proposed by the field. This is argued to be confusing for laypeople. And missing the connection with data at the core of our field. As well as the indication that statistics gathers information from the data. Data science seems to convey both ideas… But it is equally vague in that most scientific fields if not all rely on data and observations and the structured exploitation of such data. Actually a lot of so-called “data-scientists” have specialised in the analysis of data from their original field, without voluntarily embarking upon a career of data-scientist. And not necessarily acquiring the proper tools for incorporating uncertainty quantification (aka statistics!).

Statistics sounds old-fashioned and “old-guard” and “inward-looking” and unattractive to young talents, while they flock to Data Science programs. Which is true [that they flock] but does not mean we [as a field] must flock there as well. In five or ten years, who can tell whether this attraction of data science(s) will still be that strong? We already had to switch our Master names to Data Science or the like, which is surely more than enough.

Data science is encompassing other areas of science, like computer science and operations research, but this is not an issue both in terms of potential collaborations and gaining the upper ground as a “key part” in the field. Which is more wishful thinking than a certainty, given the existing difficulties in being recognised as a major actor in data analysis. (As for instance in a recent grant evaluation in “Big Data” where the evaluation committee involved no statistician. And where we got rejected.)

A very rich issue of Nature I received [late] just before leaving for Warwick with a series of reviews on quantum computing, presenting machine learning as the most likely immediate application of this new type of computing. Also including irate letters and an embarrassed correction of an editorial published the week before reflecting on the need (or lack thereof) to remove or augment statues of scientists whose methods were unethical, even when eventually producing long lasting advances. (Like the 19th Century gynecologist J. Marion Sims experimenting on female slaves.) And a review of a book on the fascinating topic of Chinese typewriters. And this picture above of a flooded playground that looks like a piece of abstract art thanks to the muddy background.

“Quantum mechanics is well known to produce atypical patterns in data. Classical machine learning methods such as deep neural networks frequently have the feature that they can both recognize statistical patterns in data and produce data that possess the same statistical patterns: they recognize the patterns that they produce. This observation suggests the following hope. If small quantum information processors can produce statistical patterns that are computationally difficult for a classical computer to produce, then perhaps they can also recognize patterns that are equally difficult to recognize classically.” Jacob Biamonte et al., Nature, 14 Sept 2017

One of the review papers on quantum computing is about quantum machine learning. Although like Jon Snow I know nothing about this, I find it rather dull as it spends most of its space on explaining existing methods like PCA and support vector machines. Rather than exploring potential paradigm shifts offered by the exotic nature of quantum computing. Like moving to Bayesian logic that mimics a whole posterior rather than produces estimates or model probabilities. And away from linear representations. (The paper mentions an O(√N) speedup for Bayesian inference in a table, but does not say more, which may thus be only about MAP estimators for all I know.) I also disagree with the Brave New World tone of the above quote or misunderstand its meaning. Since atypical and statistical cannot but clash, “universal deep quantum learners may recognize and classify patterns that classical computers cannot” does not have a proper meaning. The paper contains a vignette about quantum Boltzmann machines that finds a minimum entropy approximation to a four-state distribution, with comments that seem to indicate an ability to simulate from this system.