Archive for quarantine

a cartoon that could have been made for lockdown

Posted in Books, pictures on June 27, 2020 by xi'an

scalable Langevin exact algorithm [armchair Read Paper]

Posted in Books, pictures, Statistics, Travel, University life on June 26, 2020 by xi'an

So, Murray Pollock, Paul Fearnhead, Adam M. Johansen and Gareth O. Roberts presented their Read Paper with discussions on the Wednesday aft! With a well-sized if virtual audience of nearly a hundred people. Here are a few notes scribbled during the Readings. And attempts at keeping the traditional structure of the meeting alive.

In their introduction, they gave the intuition of a quasi-stationary chain through the probability of being in A at time t while still alive, which behaves as π(A) × exp(-λt) for a fixed killing rate λ. The concept is quite fascinating if less straightforward than stationarity! The presentation put the stress on the available recourse to an unbiased estimator of the κ rate whose initialisation scaled as O(n) but allowed a subsampling cost reduction afterwards. With a subsampling rate connected with Bayesian asymptotics, namely on how quickly the posterior concentrates. Unfortunately, this makes the practical construction harder, since n is finite and the concentration rate is unknown (although a default guess should be √n). I wondered if the link with self-avoiding random walks was more than historical.

The initialisation of the method remains a challenge in complex environments. And hence one may wonder whether, and by how much, it does better than SMC. Furthermore, while the motivation for using a Brownian motion stems from the practical side, this simulation does not account for the target π. This completely blind excursion sounds worse than simulating from the prior in other settings.

One early illustration for quasi stationarity was based on a hypothetical distribution of lions and wandering (Brownian) antelopes. I found that the associated concept of soft killing was not necessarily well received by… the antelopes!

As it happens, my friend and coauthor Natesh Pillai was the first discussant! I did not get the details of his first bimodal example. But he addressed my earlier question about how large the running time T should be. Since the computational cost should be exploding with T. He also drew an analogy with improper posteriors as to wonder about the availability of convergence assessment.

And my friend and coauthor Nicolas Chopin was the second discussant! Starting with a request to… leave the Pima Indians (model) alone!! But also getting into a deeper assessment of the alternative use of SMCs.

scalable Langevin exact algorithm [Read Paper]

Posted in Books, pictures, Statistics, University life on June 23, 2020 by xi'an

Murray Pollock, Paul Fearnhead, Adam M. Johansen and Gareth O. Roberts (CoI: all with whom I have strong professional and personal connections!) have a Read Paper discussion happening tomorrow [under relaxed lockdown conditions in the UK, except for the absurd quatorzine on all travelers, but still in a virtual format] that we discussed together [from our respective homes] at Paris Dauphine. And which I already discussed on this blog when it first came out.

Here are quotes I spotted during this virtual Dauphine discussion but we did not come up with enough material to build a significant discussion, although wondering at the potential for solving the O(n) bottleneck, handling doubly intractable cases like the Ising model. And noticing the nice features of the log target being estimable by unbiased estimators. And of using control variates, for once well-justified in a non-trivial environment.

“However, in practice this simple idea is unlikely to work. We can see this most clearly with the rejection sampler, as the probability of survival will decrease exponentially with t—and thus the rejection probability will often be prohibitively large.”

“This can be viewed as a rejection sampler to simulate from μ(x,t), the distribution of the Brownian motion at time t conditional on its surviving to time t. Any realization that has been killed is ‘rejected’ and a realization that is not killed is a draw from μ(x,t). It is easy to construct an importance sampling version of this rejection sampler.”
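To make the two quotes above concrete, here is a minimal sketch of both the rejection sampler and its importance sampling version for a killed Brownian motion. It is not the authors' algorithm (which is exact and avoids any time discretisation): the killing integral is approximated by a Riemann sum on a grid, and the killing rate kappa(x) = x²/2, the step size and the number of paths are all illustrative assumptions of mine. With this particular rate the quasi-stationary distribution should be the standard Normal.

```python
import numpy as np

def kappa(x):
    # Hypothetical killing rate (an assumption for this toy example);
    # with this choice the quasi-stationary law is N(0, 1).
    return 0.5 * x**2

def simulate_paths(n_paths, t_end, dt=0.01, x0=0.0, seed=0):
    """Brownian paths on a time grid, with each path's survival probability.

    The Brownian increments are exact on the grid; only the integral of
    kappa along the path is approximated (left-point Riemann sum).
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(t_end / dt))
    x = np.full(n_paths, x0, dtype=float)
    integral = np.zeros(n_paths)
    for _ in range(n_steps):
        integral += kappa(x) * dt
        x += np.sqrt(dt) * rng.standard_normal(n_paths)
    survival = np.exp(-integral)  # P(path not killed before t_end | path)
    return x, survival, rng

x_end, survival, rng = simulate_paths(n_paths=20_000, t_end=3.0)

# Rejection sampler: a path is kept only if it survives the killing.
alive = rng.random(x_end.size) < survival
rejection_draws = x_end[alive]

# Importance sampling version: keep every path, weighted by its survival probability.
weights = survival / survival.sum()
is_second_moment = np.sum(weights * x_end**2)

print("fraction surviving:", alive.mean())
print("second moment (rejection):", np.mean(rejection_draws**2))
print("second moment (importance sampling):", is_second_moment)
```

Increasing t_end in this sketch shrinks the surviving fraction quickly, which is the exponential decay the first quote warns about, while the importance sampling version keeps all paths at the price of increasingly uneven weights.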

scalable Metropolis-Hastings, nested Monte Carlo, and normalising flows

Posted in Books, pictures, Statistics, University life on June 16, 2020 by xi'an

Over a sunny if quarantined Sunday, I started reading the PhD dissertation of Rob Cornish, Oxford University, as I am the external member of his viva committee. Ending up in a highly pleasant afternoon discussing this thesis over a (remote) viva yesterday. (If bemoaning a lost opportunity to visit Oxford!) The introduction to the viva was most helpful and set the results within the different time and geographical zones of the Ph.D since Rob had to switch from one group of advisors in Engineering to another group in Statistics. Plus an encompassing prospective discussion, expressing pessimism at exact MCMC for complex models and looking forward to further advances in probabilistic programming.

Made of three papers, the thesis includes this ICML 2019 [remember the era when there were conferences?!] paper on scalable Metropolis-Hastings, by Rob Cornish, Paul Vanetti, Alexandre Bouchard-Côté, George Deligiannidis, and Arnaud Doucet, which I commented on last year. Which achieves a remarkable and paradoxical O(1/√n) cost per iteration, provided (global) lower bounds are found on the (local) Metropolis-Hastings acceptance probabilities since they allow for Poisson thinning à la Devroye (1986) and second order Taylor expansions constructed for all components of the target, with the third order derivatives providing bounds. However, the variability of the acceptance probability gets higher, which induces a longer but still manageable if the concentration of the posterior is in tune with the Bernstein von Mises asymptotics. I had not paid enough attention on my first read to the strong theoretical justification for the method, relying on the convergence of MAP estimates in well- and (some) mis-specified settings. Now, I would have liked to see the paper dealing with a more complex problem than logistic regression.
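As a rough illustration of why lower bounds on the acceptance factors help, here is a toy sketch of Poisson thinning, not the authors' implementation. The assumption is that the acceptance probability factorises as a product of n terms exp(-rate_i), with each rate_i dominated by a known bound_i. The function name thinned_bernoulli and all numerical values are hypothetical. The accept/reject decision is then the event "no point" in a Poisson process with rates rate_i, simulated by thinning a process with rates bound_i, so that only the few proposed indices ever need their rate evaluated.

```python
import numpy as np

def thinned_bernoulli(rates, bounds, rng):
    """Simulate a Bernoulli with success probability exp(-sum(rates)).

    Only the entries of `rates` picked by the dominating Poisson process
    are looked at; requires 0 <= rates[i] <= bounds[i] and bounds[i] > 0.
    Returns (success, number_of_rates_evaluated).
    """
    total = bounds.sum()
    n_candidates = rng.poisson(total)  # points of the dominating process
    if n_candidates == 0:
        return True, 0
    idx = rng.choice(len(bounds), size=n_candidates, p=bounds / total)
    # Each candidate is a genuine point with probability rate / bound.
    is_point = rng.random(n_candidates) < rates[idx] / bounds[idx]
    return (not is_point.any()), n_candidates

rng = np.random.default_rng(1)
n = 500
rates = rng.uniform(0.0, 1e-3, size=n)  # hypothetical -log acceptance factors
bounds = np.full(n, 2e-3)               # hypothetical upper bounds on the rates

draws = [thinned_bernoulli(rates, bounds, rng) for _ in range(20_000)]
freq = np.mean([success for success, _ in draws])
avg_evals = np.mean([n_eval for _, n_eval in draws])

print("empirical success frequency:", freq)
print("target probability:", np.exp(-rates.sum()))
print("average number of factors evaluated (out of", n, "):", avg_evals)
```

The average number of evaluated factors is driven by the sum of the bounds rather than by n, which is where the tightness of the Taylor-based bounds matters.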

The second paper in the thesis is an ICML 2018 proceeding by Tom Rainforth, Robert Cornish, Hongseok Yang, Andrew Warrington, and Frank Wood, which considers Monte Carlo problems involving several nested expectations in a non-linear manner, meaning that (a) several levels of Monte Carlo approximations are required, with associated asymptotics, and (b) the resulting overall estimator is biased. This includes common doubly intractable posteriors, obviously, as well as (Bayesian) design and control problems. [And it has nothing to do with nested sampling.] The resolution chosen by the authors is strictly plug-in, in that they replace each level in the nesting with a Monte Carlo substitute and do not attempt to reduce the bias. Which means a wide range of solutions (other than the plug-in one) could have been investigated, including bootstrap maybe. For instance, Bayesian design is presented as an application of the approach, but since it relies on the log-evidence, there exist several versions for estimating (unbiasedly) this log-evidence. Similarly, the Forsythe-von Neumann technique applies to arbitrary transforms of a primary integral. The central discussion dwells on the optimal choice of the volume of simulations at each level, optimal in terms of asymptotic MSE. Or rather asymptotic bound on the MSE. The interesting result being that the outer expectation requires the square of the number of simulations for the other expectations. Which all need to converge to infinity. A trick in finding an estimator for a polynomial transform reminded me of the SAME algorithm in that it duplicated the simulations as many times as the highest power of the polynomial. (The ‘Og briefly reported on this paper… four years ago.)
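A toy version of the plug-in estimator makes the bias visible. The example below is my own and not taken from the paper: the target is E[f(E[y+z|y])] with f(u) = u² and y, z independent standard Normals, so the true value is 1, while the plug-in estimator with M inner simulations has expectation 1 + 1/M. The squared bias is thus of order 1/M² against a variance of order 1/N, which is one way of seeing why the outer sample size N should grow like M².

```python
import numpy as np

def nested_plugin_estimate(n_outer, n_inner, rng):
    """Plug-in nested Monte Carlo estimate of E_y[(E_z[y + z | y])^2].

    y and z are independent N(0, 1) draws (a toy model chosen for
    illustration); the inner expectation equals y, so the truth is 1.
    """
    y = rng.standard_normal(n_outer)
    z = rng.standard_normal((n_outer, n_inner))
    inner_means = (y[:, None] + z).mean(axis=1)  # Monte Carlo inner level
    return np.mean(inner_means**2)               # non-linear outer level

rng = np.random.default_rng(2)
for n_inner in (4, 16, 64):
    n_outer = n_inner**2 * 50  # outer size growing as the square of the inner size
    estimate = nested_plugin_estimate(n_outer, n_inner, rng)
    print(n_inner, n_outer, estimate, "expected value of the estimator:", 1 + 1 / n_inner)
```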

The third and last part of the thesis is a proposal [to appear in ICML 20] on relaxing bijectivity constraints in normalising flows with continuously indexed flows. (Or CIF. As Rob made a joke about this cleaning brand, let me add (?) to that joke by mentioning that looking at CIF and bijections is less dangerous in a Trump cum COVID era than looking at CIF and injections!) With Anthony Caterini, George Deligiannidis and Arnaud Doucet as co-authors. I am much less familiar with this area and hence a wee bit puzzled at the purpose of removing what I understand to be an appealing side of normalising flows, namely to produce a manageable representation of density functions as a combination of bijective and differentiable functions of a baseline random vector, like a standard Normal vector. The argument made in the paper is that imposing this representation of the density imposes a constraint on the topology of its support since said support is homeomorphic to the support of the baseline random vector. While the supporting theoretical argument is a mathematical theorem that shows the Lipschitz bound on the transform should be infinity in the case the supports are topologically different, these arguments may be overly theoretical when faced with the practical implications of the replacement strategy. I somewhat miss its overall strength given that the whole point seems to be in approximating a density function, based on a finite sample.
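For readers as unfamiliar with the area as I am, the "manageable representation" amounts to the change-of-variables formula. Here is a one-dimensional sketch with a single bijection x = sinh(z) applied to a standard Normal z, where the map is my own arbitrary choice and has nothing to do with the flows used in the paper. The density of x is the baseline density at the inverse image times the absolute derivative of the inverse map.

```python
import numpy as np

def base_log_density(z):
    # Standard Normal baseline
    return -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi)

def flow_forward(z):
    # Hypothetical bijective, differentiable transform of the baseline variable
    return np.sinh(z)

def flow_log_density(x):
    """Log-density of x = sinh(z), z ~ N(0, 1), by change of variables."""
    z = np.arcsinh(x)                         # inverse transform
    log_abs_jacobian = -0.5 * np.log1p(x**2)  # d arcsinh(x)/dx = 1/sqrt(1 + x^2)
    return base_log_density(z) + log_abs_jacobian

# Sanity check: the transformed density should integrate to (about) one.
grid = np.linspace(-500.0, 500.0, 1_000_001)
dx = grid[1] - grid[0]
total_mass = np.sum(np.exp(flow_log_density(grid))) * dx
print("numerical total mass:", total_mass)
```

Since sinh is a homeomorphism of the real line, the support of x is again the whole line, which is a (trivial) instance of the topological constraint discussed in the paper.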

a journal of the plague year [deconfined reviews]

Posted in Books, Kids, pictures, Travel, Wines on June 13, 2020 by xi'an

Watched two Korean films, Train to Busan and then Psychokinesis, both by Yeon Sang-Ho. The first one is a mostly traditional zombie action movie, losing one character after another to the disease with a few funny moments. The second one also involves supernatural features, rather poorly done, but it offers some political satire on a corrupt real estate project that makes it somewhat tolerable. If barely.

Read (an old, cellar-relegated, somewhat mouldy) Henning Mankell’s Sidetracked (book #5 in the Kurt Wallander series), which was good enough to be enjoyable, albeit with a serial killer plot (always a lazy plot idea!), but it definitely made me regret the Martin Beck books of Sjöwall and Wahlöö which had a stronger social and political perspective (to the point of this book sounding like a pale replica). All the more because Maj Sjöwall passed away in early May. (Having survived Wahlöö by 45 years and never revisiting the series.)

Baked more breads, including rye bread, and experimented with new dishes, like a jollof rice attempt with wild garlic, which tasted a wee bit too mild and took more time at the cleaning stage of the cocotte (French oven) than the cooking one. As it is one of these rice dishes like tahdig that call for a slightly burned bottom! Cooked several clafoutis with garden cherries and strawberries. Started making weekly rhubarb compote since available at the farmers’ market.

And now growing tomatoes and beans and peppers and onions and potatoes and butternut… We also found a woodcock most unusually and inexplicably stranded in the garden, feeling the worse for a cat attack as shown by a few feathers in the grass, but it was impossible to catch and hence protect from all the stray cats in the neighbourhood. After a day or two, we did not find any remains of the bird, so it presumably escaped.

Also watched A Sun (陽光普照), a psychological Taiwanese film about an “ordinary” family unraveling when the youngest son goes to jail. With an astounding Mutter Courage as the central character. And a surprising sequence of characterial twists in the story that makes the movie less bleak than the first 30 minutes could suggest, revealing layers in most characters that were carefully left hidden in the beginning of the film. (Except for the unfortunate girlfriend of A-Ho, who hardly utters a word and never seems to join the family.) With a beautiful final shot relating to the early years of A-Ho. (As one character is named A-Ho and another one A-Hao, it took me a while to spot the difference and stop thinking there were two parallel time-lines in the story!) A really strong film!

Succumbed (!) to ordering and reading The True Bastards, by Jonathan [pardon my] French. Which is the second volume in the Lot Lands trilogy and about as fun as the first volume, although the role of cursed magics is somehow over-done. But the change in this book from a male to a female viewpoint is definitely a worthwhile rarity in fantasy novels, showing how much harder the main character, Fetch, has to work to lead her troop of ½ orcs. And a constant threat of being belittled as such by other characters. (Any resemblance to real life problems being obviously coincidental!) Don’t expect real depth though, from the plot which keeps running in hogs’ circles to the point where everyone seems to be the hidden relative of everyone else (still alive!) to the underlying message, if any!

Got tricked by a Guardian article into watching The Vast of Night. Frankly, I do not understand the praise heaped upon this academic-oh-so-academic exercise in film making! Yes, it does sound like the Twilight Zone, as the story is the usual trope on aliens being here with only the military being aware of it, trying to bank on their advanced technology, &tc. The hapless characters confuse agitation and action, while speaking, speaking, speaking all the time… It should have been a radio show. In the 1950’s. Not in the current times, already awash in conspiracy theories (although, far from it!, the film is clearly seen as an exercice de style recreating the mood of a 1960’s rural town in the South of the USA, with an endless sequence in a vintage car park).