Archive for the Books Category

MUDAM

Posted in Books, Kids, pictures, Travel with tags , , , , , , , on October 22, 2017 by xi'an

As our son is doing an internship in Luxembourg City this semester, we visited him last weekend and took the opportunity to visit the Museum of Modern Art (or MUDAM) there. The building itself is quite impressive, inserted in the walls of the 18th Century Fort Thüngen designed by Vauban, with a very luminous and airy structure designed by I.M. Pei. The main exhibit at the MUDAM is a coverage of the work of Su-Mei Tse, an artist from Luxembourg I did not know but whose vision I find both original and highly impressive, playing on scales and space, from atoms to planets… With connections to Monet’s nymphéas. And an almost raw rendering of rock forms that I appreciate most particularly!

The bottom floor also contains an extensive display of the political drawings of Ad Reinhardt, who is more (?) famous for his black-on-black series…

Langevin on a wrong bend

Posted in Books, Statistics with tags , , , , , , , on October 19, 2017 by xi'an

Arnak Dalalyan and Avetik Karagulyan (CREST) arXived a paper the other week on a focussed study of the Langevin algorithm [not MALA] when the gradient of the target is incorrect. With the following improvements [quoting non-verbatim from the paper]:

  1. a varying-step Langevin that reduces the number of iterations for a given Wasserstein precision, compared with recent results by e.g. Alain Durmus and Éric Moulines;
  2. an extension of convergence results for error-prone evaluations of the gradient of the target (i.e., the gradient is replaced with a noisy version, under some moment assumptions that do not include unbiasedness);
  3. a new second-order sampling algorithm termed LMCO’, with improved convergence properties.

What is particularly interesting to me in this setting is the use in all these papers of a discretised Langevin diffusion (a.k.a., a random walk with a drift induced by the gradient of the log-target) without the original Metropolis correction. The results rely on an assumption of [strong?] log-concavity of the target, with “user-friendly” bounds on the Wasserstein distance depending on the constants appearing in this log-concavity constraint. So does the adaptive step. (In the case of the noisy version, the bias and variance of the noise also matter. As pointed out by the authors, the results still apply to scaling MCMC for large samples. Beyond pseudo-marginal situations.)
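For readers unfamiliar with it, the discretised Langevin diffusion without Metropolis correction (LMC, or unadjusted Langevin) is only a few lines of code. Here is a minimal sketch, with a toy Gaussian target of my own choosing rather than anything from the paper; function names and the step size are mine:

```python
import numpy as np

def lmc(grad_log_target, x0, step, n_iter, rng=None):
    """Unadjusted Langevin: x' = x + h * grad log pi(x) + sqrt(2h) * N(0, I).

    No Metropolis correction, so the chain targets a biased version of pi,
    with a bias controlled by the step size h (as bounded in the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    chain = np.empty((n_iter + 1,) + x.shape)
    chain[0] = x
    for t in range(n_iter):
        x = x + step * grad_log_target(x) \
              + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
        chain[t + 1] = x
    return chain

# toy target: standard bivariate Gaussian, so grad log pi(x) = -x
chain = lmc(lambda x: -x, np.zeros(2), step=0.1,
            n_iter=5000, rng=np.random.default_rng(0))
```

The noisy-gradient setting of the paper amounts to replacing `grad_log_target` with a perturbed version; the convergence bounds then pick up the bias and variance of that perturbation.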

“…this, at first sight very disappointing behavior of the LMC algorithm is, in fact, continuously connected to the exponential convergence of the gradient descent.”

The paper concludes with an interesting mise en parallèle of Langevin algorithms and of gradient descent algorithms, since the convergence rates are the same.

back to ca’ Foscari, Venezia

Posted in Books, pictures, Statistics, Travel, University life, Wines with tags , , , , , , on October 16, 2017 by xi'an

I am off to Venezia this afternoon for a Franco-Italian workshop organised by my friends Monica Billio, Roberto Casarin, and Matteo Iacopini, from the Department of Economics of Ca’ Foscari, almost exactly a year after my previous trip there for ESOBE 2016. (Except that this was before!) Tomorrow, I will give both a tutorial [for the second time in two weeks!] and a talk on ABC, hopefully with some portion of the audience still there for the second part!

never let me go [book review]

Posted in Books, Kids, pictures, Travel with tags , , , , , , , , , on October 15, 2017 by xi'an

Another chance occurrence led me to read that not so recent book by Kazuo Ishiguro, taking advantage of my short nights while in Warwick. [I wrote this post before the unexpected Nobelisation of the author.] As in earlier novels of his, the strongest feeling is one of melancholia, of things that had been or were supposed to have been and are no longer. Especially in the incomparable The Remains of the Day… In the great tradition of the English [teen] novel, this ideal universe is a boarding school, where a group of students bond and grow up, until they face the real world. The story is told through a lot of flashbacks and personal impressions of the single narrator, which made me uncertain of the reality behind her perception and recasting. And of her role and actions within that group, since they always appear more mature and sensible than the others’. The sinister features of this boarding school and the reasons why these children are treated differently emerge very very slowly through the book, and the description of their treatment remains unclear till the end. Purposely so. However, once one understands the very reason for their existence, the novel loses its tension, as the perpetual rotation of their interactions becomes inconsequential when faced with their short destinies. While one can get attached to the main characters, the doom awaiting them blurs the relevance of their affairs and disputes. Maybe what got me so quickly distanced from the story is the complacency of these characters and their lack of rebellion against their treatment, unless of course it was the ultimate goal of Ishiguro to show that readers, like the “normal” characters in the story, would come to treat the other ones as not completely human… While the final scene, about souvenirs and memories sounding like plastic trash trapped on barbed wire, seems an easy line, I appreciated the slow construction of Tommy’s art pieces and the maybe too obvious link with their own destiny.

When searching for reviews of this book, I discovered a movie had been made out of it, in 2011, with the same title. And of which I had never heard either..! [Which made me realise the characters were all very young when they died.]

Statistics versus Data Science [or not]

Posted in Books, Kids, Statistics, University life with tags , , , , , , , , on October 13, 2017 by xi'an

Last week a colleague from Warwick forwarded us a short argument by Donald Macnaughton (a “Toronto-based statistician”) about switching the name of our field from Statistics to Data Science. This is not the first time I have heard of this proposal and this is not the first time I have expressed my strong disagreement with it! Here are the naughtonian arguments:

  1. Statistics is (at least in the English language) endowed with several meanings, from the compilation of numbers out of a series of observations, to the field itself, to the procedures proposed by the field. This is argued to be confusing for laypeople. And to miss the connection with data at the core of our field. As well as the indication that statistics gathers information from the data. Data science seems to convey both ideas… But it is equally vague in that most scientific fields, if not all, rely on data and observations and the structured exploitation of such data. Actually, a lot of so-called “data scientists” have specialised in the analysis of data from their original field, without deliberately embarking upon a career of data scientist. And not necessarily acquiring the proper tools for incorporating uncertainty quantification (aka statistics!).
  2. Statistics sounds old-fashioned and “old-guard” and “inward-looking” and unattractive to young talents, while they flock to Data Science programs. Which is true [that they flock] but does not mean we [as a field] must flock there as well. In five or ten years, who can tell whether this attraction of data science(s) will still be that strong? We already had to switch our Masters’ names to Data Science or the like, which is surely more than enough.
  3. Data science encompasses other areas of science, like computer science and operations research, but this is argued not to be an issue, both in terms of potential collaborations and of gaining the upper ground as a “key part” in the field. Which is more wishful thinking than a certainty, given the existing difficulties in being recognised as a major actor in data analysis. (As for instance in a recent grant evaluation in “Big Data” where the evaluation committee involved no statistician. And where we got rejected.)

Nature snapshots [and snide shots]

Posted in Books, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , on October 12, 2017 by xi'an

A very rich issue of Nature that I received [late] just before leaving for Warwick, with a series of reviews on quantum computing, presenting machine learning as the most likely immediate application of this new type of computing. Also including irate letters and an embarrassed correction of an editorial published the week before, reflecting on the need (or lack thereof) to remove or augment statues of scientists whose methods were unethical, even when eventually producing long-lasting advances. (Like the 19th Century gynecologist J. Marion Sims experimenting on female slaves.) And a review of a book on the fascinating topic of Chinese typewriters. And the picture above of a flooded playground that looks like a piece of abstract art thanks to the muddy background.

“Quantum mechanics is well known to produce atypical patterns in data. Classical machine learning methods such as deep neural networks frequently have the feature that they can both recognize statistical patterns in data and produce data that possess the same statistical patterns: they recognize the patterns that they produce. This observation suggests the following hope. If small quantum information processors can produce statistical patterns that are computationally difficult for a classical computer to produce, then perhaps they can also recognize patterns that are equally difficult to recognize classically.” Jacob Biamonte et al., Nature, 14 Sept 2017

One of the review papers on quantum computing is about quantum machine learning. Although, like Jon Snow, I know nothing about this, I find it rather dull as it spends most of its space on explaining existing methods like PCA and support vector machines, rather than exploring potential paradigm shifts offered by the exotic nature of quantum computing. Like moving to a Bayesian logic that mimics a whole posterior rather than produces estimates or model probabilities. And away from linear representations. (The paper mentions a O(√N) speedup for Bayesian inference in a table, but does not tell more, which may thus be only about MAP estimators for all I know.) I also disagree with the brave new World tone of the above quote, or misunderstand its meaning. Since atypical and statistical cannot but clash, “universal deep quantum learners may recognize and classify patterns that classical computers cannot” does not have a proper meaning. The paper contains a vignette about quantum Boltzmann machines that finds a minimum entropy approximation to a four-state distribution, with comments that seem to indicate an ability to simulate from this system.

Monte Carlo calculations of the radial distribution functions for a proton-electron plasma

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , on October 11, 2017 by xi'an

“In conclusion, the Monte Carlo method of calculating radial distribution functions in a plasma is a feasible approach if significant computing time is available (…) The results indicate that at least 10000 iterations must be completed before the system can be considered near to its equilibrium state, and for a badly chosen starting configuration, the run would need to be considerably longer (…) for more conclusive results a longer run is needed so that the energy of the system can settle into an equilibrium pattern and steady-state radial distribution functions can be obtained.” A.A. Barker

Looking for the history behind Barker’s formula the other day made me look for the original 1965 paper. Which got published in the Australian Journal of Physics at the beginning of Barker’s PhD at the University of Adelaide.

As shown in the above screenshot, the basis of Barker’s algorithm is indeed Barker’s acceptance probability, albeit written in a somewhat confusing way since the current value of the chain is kept if a Uniform variate is smaller than what is actually the rejection probability. No mistake there! And more interestingly, Barker refers to Wood and Parker (1957) for the “complete and rigorous theory” behind the method. (Both Wood and Parker were affiliated with the Los Alamos Scientific Laboratory, while Barker acknowledges support from both the Australian Institute of Nuclear Science and Engineering and the Weapons Research Establishment, Salisbury… These were times when nuclear weapon research was driving MCMC. Hopefully we will not come back to such times. Or, on the pessimistic side, we will not have time to come back to such times!)
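Written the modern way, Barker’s rule accepts a proposed value y with probability π(y)/(π(x)+π(y)), rather than the Metropolis min{1, π(y)/π(x)}. A minimal Python sketch, with a random-walk proposal and a toy Gaussian target of my own choosing (not Barker’s plasma setting):

```python
import numpy as np

def barker_step(log_target, x, scale, rng):
    """One step of a random-walk sampler with Barker's acceptance rule."""
    y = x + scale * rng.standard_normal(np.shape(x))
    # Barker's probability pi(y) / (pi(x) + pi(y)) is the logistic (sigmoid)
    # function of the log target ratio; computed stably on the log scale
    log_ratio = log_target(y) - log_target(x)
    accept_prob = np.exp(log_ratio - np.logaddexp(0.0, log_ratio))
    return y if rng.uniform() < accept_prob else x

# toy run on a standard Gaussian target
rng = np.random.default_rng(1)
x, samples = 0.0, []
for _ in range(20000):
    x = barker_step(lambda t: -0.5 * t * t, x, 2.0, rng)
    samples.append(x)
```

Note that Barker’s phrasing in the paper, keeping the current value when a Uniform variate falls below π(x)/(π(x)+π(y)), is exactly equivalent to the acceptance test above.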

As in Metropolis et al. (1953), the analysis is made on a discretised (finite) space, building the Markov transition matrix and stating the detailed balance equation (called microscopic reversibility). Interestingly, while Barker acknowledges that there are other ways of assigning the transition probability, his is the “most rapid” in terms of mixing. And equally interestingly, he discusses the scale of the random walk in the [not-yet-called] Metropolis-within-Gibbs move as a major issue, targeting 0.5 as the right acceptance rate and suggesting to adapt this scale on the go. There is also a side issue that is apparently not processed with all due rigour, namely the fact that the particles in the system cannot get arbitrarily close to one another. It is unclear how a proposal falling below this distance is processed by Barker’s algorithm. When implemented on 32 particles, this algorithm took five hours to execute 6100 iterations. With a plot of the target energy function that does not shout convergence, far from it! As acknowledged by Barker himself (p.131).

The above quote is from the conclusion, and its acceptance of the need for increased computing times comes in sharp contrast with this week, when one of our papers was rejected based on this very feature..!