Archive for Berlin

Die Mauer ist weg! [The Wall is gone!]

Posted in Statistics on November 9, 2019 by xi'an

Prague fatale [book review]

Posted in Statistics on November 9, 2019 by xi'an

Another of Philip Kerr’s Bernie Gunther novels, which I ordered after reading [and taking place after] Prussian Blue, again with a double entendre title and plenty of smart lines representative of Berliner Witz and Schnauze. But much darker than Prussian Blue, as the main character, Bernie Gunther, is getting more morally ambivalent, as a member of the SS, having participated in the mass murders of the Einsatzgruppen on the Eastern Front before his return to Berlin as a member of the intelligence branch of the SS. Under direct orders of Reinhard Heydrich, whose role in the novel is almost as central as Gunther’s. It is thus harder to relate to this anti-hero, and his constant disparagement of Nazis, when he is at the same time a significant if minor part of the Nazi State. It is also unpleasant that most characters in the novel are mass murderers, who end up being executed after the war, as described in a post-note. Still, the story has strength in both the murder inquiry itself (until it fizzles out) and the immersion in 1942 Germany and Czechoslovakia, a strength served by the historical assassination of Heydrich in May 1942. An immersion I do not wish to repeat in the near future, though…

As a side story, I bought this used book for £0.05 on Amazon and received a copy that looked as if it had been stolen from a library in East Renfrewshire, south of Glasgow, as it still had a plastic cover, the barcodes and the list of dates it had been borrowed. I thus called the central offices of the East Renfrewshire libraries to enquire whether or not the book had been stolen, and was told this was not the case, the book being part of a bulk sale of used books by the library to second-hand sellers. And that I could enjoy reading the book at my own pace! (As a second order side story, East Renfrewshire is the place in Scotland where Rudolf Hess landed when trying to negotiate on his own a peace treaty with Great Britain in 1941.)

Prussian blue [book review]

Posted in Books, Travel on September 28, 2019 by xi'an

This is the one-before-last volume in Philip Kerr’s Bernie Gunther series (one-before-last since the author passed away last year). Which I picked up in a local bookstore because it takes place in Berchtesgaden, which stands a few kilometers west of Salzburg and which I passed on my way there (and back) last week. Very good title, full of double meanings!

“When you’re working for people who are mostly thieves and murderers, a little of it comes off on your hands now and then.”

Two time-lines run in parallel in Prussian Blue, from 1939 Nazi Germany to 1956 France, from (mostly) hunter to hunted. Plenty of wisecracks worth quoting throughout the book, mostly à la Marlowe, but also singling out Berlin(ers) from the rest of Germany. An anti-hero if any in that Bernie Gunther is working there as a policeman for the Nazi State, aiming at making the law respected in a lawless era and at catching murderers at a time when the highest were all murderers and about to upscale this qualification to levels never envisioned before. Still working under Heydrich’s orders to solve a murder, despite the attempts of other arch-evils like Martin Bormann and Ernst Kaltenbrunner, and with the assistance of a helpful (if Hitler-supporting!) Gerdy Troost. Among the Gunther novels I have read so far, this is the one where he gets closest to the ultimate evil, Hitler himself, who considered the Berghof in Berchtesgaden his favourite place, albeit without ever meeting him. The gratuitous violence and bottomless corruption inherent to the fascist regime are most realistically rendered in the thriller, to the point of making the possibility of a Bernie Gunther debatable!

‘Making a nuisance of yourself is what being a policeman is all about and suspecting people who were completely above suspicion was about the only thing that made doing the job such fun in Nazi Germany.’

As I kept reading the book I could not but draw a connection with the imperfect but nonetheless impressive pre-war novel Rogue Male, where an English “sport” hunter travels to Berchtesgaden to shoot (or aim at) Hitler, only to get spotted by soldiers before committing the act and to become hunted in his turn throughout Europe, ending up [spoiler!] in a burrow, trapped by Nazi secret services [well this is not exactly the end!]. This connection has been pointed out in some reviews, but the role of the burrows and oppressive underground and the complicity of the local police forces are strongly present in both books and somewhat decrease the appeal of this novel. Especially since the 1956 thread therein is a much less convincing plot than the 1939 one, despite involving conveniently forgotten old colleagues, the East German Stasi, hopeless French policemen and clergymen, the Saar referendum, [much maligned!] andouillettes and oignons.

Wow!

Posted in pictures, Running on September 16, 2018 by xi'an

Markov chain importance sampling

Posted in Books, pictures, Running, Statistics, Travel, University life on May 31, 2018 by xi'an

Ingmar Schuster (formerly a postdoc at Dauphine and now at Freie Universität Berlin) and Ilja Klebanov (from Berlin) have recently arXived a paper on recycling proposed values in [a rather large class of] Metropolis-Hastings and unadjusted Langevin algorithms. This means using the proposed variates of one of these algorithms as in an importance sampler, with an importance weight going from the target over the (fully conditional) proposal to the target over the marginal stationary distribution of the proposals. In the Metropolis-Hastings case, since the latter is not available in most setups, the authors suggest using a Rao-Blackwellised nonparametric estimate based on the entire MCMC chain. Or a subset.
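To make the recycling idea concrete, here is a minimal sketch and not the authors' code. It assumes a one-dimensional unnormalised standard normal target, a Gaussian random-walk proposal, and a plain average of the proposal densities q(y|x_j) over the chain states as the estimate of the marginal proposal distribution. The names `log_target`, `mh_importance_sampling`, `step` and the tuning values are all hypothetical choices made for illustration.

```python
import numpy as np

def log_target(x):
    # Unnormalised N(0, 1) log-density (assumed toy target); true Z = sqrt(2*pi).
    return -0.5 * x**2

def mh_importance_sampling(log_target, n_iter=2000, step=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = 0.0
    current = np.empty(n_iter)    # state each proposal was drawn from
    proposals = np.empty(n_iter)  # proposed values, kept whether accepted or not
    for i in range(n_iter):
        current[i] = x
        y = x + step * rng.standard_normal()
        proposals[i] = y
        # Standard random-walk Metropolis accept/reject step.
        if np.log(rng.uniform()) < log_target(y) - log_target(x):
            x = y
    # Estimate the marginal proposal density at each proposal by averaging
    # the Gaussian kernel q(y | x_j) over all chain states x_j.
    diff = proposals[:, None] - current[None, :]
    q = np.exp(-0.5 * (diff / step) ** 2) / (step * np.sqrt(2 * np.pi))
    marginal_hat = q.mean(axis=1)
    # Importance weight: unnormalised target over estimated marginal proposal.
    weights = np.exp(log_target(proposals)) / marginal_hat
    return proposals, weights

proposals, weights = mh_importance_sampling(log_target)
z_hat = weights.mean()  # estimate of the normalising constant
second_moment = np.sum(weights * proposals**2) / np.sum(weights)  # self-normalised
```

Under these assumptions `z_hat` should land near sqrt(2π) and `second_moment` near 1. The kernel average costs O(N²) evaluations, which is presumably why using only a subset of the chain is an option.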

“Our estimator refutes the folk theorem that it is hard to estimate [the normalising constant] with mainstream Monte Carlo methods such as Metropolis-Hastings.”

The paper thus brings an interesting focus on the proposed values, rather than on the original Markov chain, which naturally brings back to mind the derivation of the joint distribution of these proposed values we made in our (1996) Rao-Blackwellisation paper with George Casella. Where we considered a parametric and non-asymptotic version of this distribution, which brings a guaranteed improvement to MCMC (Metropolis-Hastings) estimates of integrals. In subsequent papers with George, we tried to quantify this improvement and to compare different importance samplers based on some importance sampling corrections, but as far as I remember, we only got partial results along this way, and did not cover the special case of the normalising constant… Normalising constants did not seem such a pressing issue at that time, I figure. (A Monte Carlo 101 question: how can we be certain the importance sampler offers a finite variance?)

Ingmar’s views about this:

I think this is interesting future work. My intuition is that for Metropolis-Hastings importance sampling with random walk proposals, the variance is guaranteed to be finite because the importance distribution ρ_θ is a convolution of your target ρ with the random walk kernel q. This guarantees that the tails of ρ_θ are no lighter than those of ρ. What other forms of q mean for the tails of ρ_θ I have less intuition about.
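The tail argument in this reply can be checked by hand in a Gaussian case. The worked example below is an illustration under assumptions that are not in the paper: a normalised target ρ = N(0,1) and a random-walk kernel q = N(0,s²), so that the convolution ρ_θ is N(0,1+s²).

```latex
\[
\rho_\theta(y) = \int q(y - x)\,\rho(x)\,\mathrm{d}x, \qquad
w(y) = \frac{\rho(y)}{\rho_\theta(y)}, \qquad
\mathbb{E}_{\rho_\theta}\!\left[w^2\right] = \int \frac{\rho(y)^2}{\rho_\theta(y)}\,\mathrm{d}y
\]
% Gaussian case: \rho = N(0,1), q = N(0,s^2), hence \rho_\theta = N(0,1+s^2)
\[
w(y) = \sqrt{1+s^2}\,\exp\!\left(-\frac{s^2 y^2}{2(1+s^2)}\right) \le \sqrt{1+s^2},
\qquad
\mathbb{E}_{\rho_\theta}\!\left[w^2\right] = \frac{1+s^2}{\sqrt{1+2s^2}} < \infty
\]
```

The weight is bounded because ρ_θ has heavier tails than ρ, so the variance is finite for every s > 0 in this example. For s = 1 the second moment is 2/√3 ≈ 1.155.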

When considering the Langevin alternative with transition (4), I was first confused and thought it was incorrect for moving from one value of Y (proposal) to the next. But that’s what unadjusted means in “unadjusted Langevin”! As pointed out in the early Langevin literature, e.g., by Gareth Roberts and Richard Tweedie, using a discretised Langevin diffusion in an MCMC framework means there is a risk of non-stationarity & non-ergodicity. Obviously, the corrected (MALA) version is more delicate to approximate (?) but at the very least it ensures the Markov chain does not diverge. Even when the unadjusted Langevin has a stationary regime, its joint distribution is likely quite far from the joint distribution of a proper discretisation. Now this also made me think about a parameterised version in the 1996 paper spirit, but there is nothing specific about MALA that would prevent the implementation of the general principle. As for the unadjusted version, the joint distribution is directly available. (But not necessarily the marginals.)
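The difference between the unadjusted and the Metropolis-adjusted versions can be seen on a toy case. The sketch below assumes a standard normal target and a step size h = 1, both chosen purely for illustration, and the function names are hypothetical. For this target the unadjusted chain is an AR(1) process with stationary variance 1/(1 - h/4), hence 4/3 rather than 1, while the adjusted chain keeps the correct variance.

```python
import numpy as np

def log_pi(x):
    # Unnormalised N(0, 1) log-density (assumed toy target).
    return -0.5 * x**2

def grad_log_pi(x):
    return -x

def langevin(n, h, adjust, seed=0):
    """Discretised Langevin chain; adjust=True adds the Metropolis-Hastings step (MALA)."""
    rng = np.random.default_rng(seed)
    x = 0.0
    out = np.empty(n)
    for i in range(n):
        mean_x = x + 0.5 * h * grad_log_pi(x)      # drift from the current state
        y = mean_x + np.sqrt(h) * rng.standard_normal()
        if adjust:
            mean_y = y + 0.5 * h * grad_log_pi(y)  # drift of the reverse move
            log_alpha = (log_pi(y) - log_pi(x)
                         - (x - mean_y) ** 2 / (2 * h)
                         + (y - mean_x) ** 2 / (2 * h))
            if np.log(rng.uniform()) < log_alpha:
                x = y
        else:
            x = y                                  # unadjusted: always move
        out[i] = x
    return out

var_ula = langevin(20000, h=1.0, adjust=False).var()
var_mala = langevin(20000, h=1.0, adjust=True).var()
```

With these settings `var_ula` should sit near 4/3 and `var_mala` near 1. In this Gaussian example the unadjusted recursion stops being stable once h exceeds 4, a mild instance of the divergence risk mentioned above.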

Here is an answer from Ingmar about that point:

Personally, I think the most interesting part is the practical performance gain in terms of estimation accuracy for fixed CPU time, combined with the convergence guarantee from the CLT. ULA was particularly important to us because of the papers of Arnak Dalalyan, Alain Durmus & Eric Moulines and recently from Mike Jordan’s group, which all look at an unadjusted Langevin diffusion (and unimodal target distributions). But MALA admits a Metropolis-Hastings importance sampling estimator, just as Random Walk Metropolis does – we didn’t include MALA in the experiments to not get people confused with MALA and ULA. But there is no delicacy involved whatsoever in approximating the marginal MALA proposal distribution. The beauty of our approach is that it works for almost all Metropolis-Hastings algorithms where you can evaluate the proposal density q, there is no constraint to use random walks at all (we will emphasize this more in the paper).

Berlin [and Vienna] noir [book review]

Posted in Statistics on August 17, 2017 by xi'an

While in Cambridge last month, I picked a few books from a local bookstore as fodder for my upcoming vacation. Including this omnibus volume made of the first three books by Philip Kerr featuring Bernie Gunther, a private and Reich detective in Nazi Germany, namely, March Violets (1989), The Pale Criminal (1990), and A German Requiem (1991). (Books that I actually read before the vacation!) The stories take place before the war, in 1938, and right after, in 1946, in Berlin and Vienna. The books centre on a German version of Philip Marlowe, wisecracks included, with various degrees of success. (There actually is a silly comparison with Chandler on the back of the book! And I found somewhere else a similarly inappropriate comparison with Graham Greene’s The Third Man…) Although I read all three books in a single week, which clearly shows some undeniable addictive quality in the plots, I find those plots somewhat shallow and contrived, especially the second one, revolving around a serial killer of young girls that aims at blaming Jews for those crimes and at justifying further Nazi persecutions. Or the time spent in Dachau by Bernie Gunther as an undercover agent for Heydrich. If anything, the third volume, taking place in post-war Berlin and Vienna, is much better at recreating the murky atmosphere of those cities under Allied occupation. But overall there are far too many info-dump passages in those novels to make them a good read. The author has clearly done his documentation job correctly, from the early homosexual persecutions to Kristallnacht, to the fights for control between the occupying forces, but the information about the historical context is not always delivered in the most fluent way. And having the main character working under Heydrich, then joining the SS, does make relating to him rather unlikely, to say the least.
It is hence unclear to me why those books are so popular, apart from the easy marketing line that stories involving Nazis are more likely to sell… Nothing to be compared with the fantastic Alone in Berlin, depicting the somewhat senseless resistance of a Berliner during the Nazi years, dropping hand-written messages against the regime under strangers’ doors.

seeking the error in nested sampling

Posted in pictures, Statistics, Travel on April 13, 2017 by xi'an

A newly arXived paper on the error in nested sampling, written by Higson and co-authors, and read in Berlin, looks at the difficult task of evaluating the sampling error of nested sampling. The conclusion is essentially negative in that the authors recommend multiple runs of the method to assess the magnitude of the variability of the output by bootstrap, i.e. to call for the most empirical approach…

The core of this difficulty lies in the half-plug-in, half-quadrature, half-Monte Carlo (!) feature of the nested sampling algorithm, in that (i) the truncation of the unit interval is based on an expectation of the mass of each shell (i.e., the zone between two consecutive isoclines of the likelihood), (ii) the evidence estimator is a quadrature formula, and (iii) the level of the likelihood at the truncation is replaced with a simulated value that is not even unbiased (and correlated with the previous value in the case of an MCMC implementation). As discussed in our paper with Nicolas, the error in the evidence approximation is of the same order as other Monte Carlo methods in that it goes down like the inverse square root of the number of terms at each iteration. Contrary to earlier intuitions that focussed on the error due to the quadrature.
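The three ingredients (i) to (iii) show up directly in a bare-bones implementation. The sketch below is a toy illustration, not the code from the paper. It assumes a uniform prior on (-5, 5) and a standard normal likelihood in one dimension, so that the evidence is about 0.1, and it draws exactly from the constrained prior, which is only possible here because the likelihood decreases in |θ|. The function names and settings are hypothetical.

```python
import numpy as np

def likelihood(theta):
    # Standard normal likelihood in theta (assumed toy model).
    return np.exp(-0.5 * theta**2) / np.sqrt(2 * np.pi)

def nested_sampling(n_live=400, n_iter=4000, half_width=5.0, seed=0):
    rng = np.random.default_rng(seed)
    live = rng.uniform(-half_width, half_width, n_live)  # draws from the prior
    live_like = likelihood(live)
    evidence, x_prev = 0.0, 1.0
    for i in range(1, n_iter + 1):
        worst = np.argmin(live_like)
        # (i) plug-in: remaining prior mass replaced by its deterministic
        #     approximation exp(-i / n_live)
        x_i = np.exp(-i / n_live)
        # (ii) quadrature, with (iii) a simulated likelihood level for the shell
        evidence += live_like[worst] * (x_prev - x_i)
        # Replace the worst point by a prior draw with a higher likelihood,
        # i.e. a uniform draw on (-|theta_worst|, |theta_worst|).
        bound = abs(live[worst])
        live[worst] = rng.uniform(-bound, bound)
        live_like[worst] = likelihood(live[worst])
        x_prev = x_i
    # Contribution of the final live points over the remaining prior mass.
    evidence += x_prev * live_like.mean()
    return evidence

z_hat = nested_sampling()
```

Rerunning `nested_sampling` with different seeds gives a crude picture of the run-to-run variability of the output.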

But the situation is much less understood when the resulting sample is used for estimation of quantities related to the posterior distribution. With no clear approach to assess, and even less to correct, the resulting error, since it is not solely a Monte Carlo error. As noted by the authors, the quadrature approximation to the univariate integral replaces the unknown prior weight of a shell with its Beta order statistic expectation and the average of the likelihood over the shell with a single (uniform???) realisation. Or the mean value of a transform of the parameter with a single (biased) realisation. Since most posterior expectations can be represented as integrals over likelihood levels of the average value over an iso-likelihood contour. The approach advocated in the paper involves multiple threads of an “unwoven nested sampling run”, which means launching n nested sampling runs with one live point each, taken from the n current live points in the current nested sample. (Those threads may then later be recombined into a single nested sample.) This is the starting point to a nested flavour of bootstrapping, where threads are sampled with replacement, from which confidence intervals and error estimates can be constructed. (The original notion appears in Skilling’s 2006 paper, but I missed it.)

The above graphic is an attempt within the paper at representing the (marginal) posterior of a transform f(θ). That I do not fully understand… The notations are rather horrendous as X is not the data but the prior probability for the likelihood to be above a given bound which is actually the corresponding quantile. (There is no symbol for data and £ is used for the likelihood function as well as realisations of the likelihood function…) A vertical slice on the central panel gives the posterior distribution of f(θ) given the event that the likelihood is in the corresponding upper tail. Or given the corresponding shell (?).