model misspecification in ABC

With David Frazier and Judith Rousseau, we just arXived a paper studying the impact of a misspecified model on the outcome of an ABC run. This is a question that naturally arises when using ABC, but that has not been directly covered in the literature apart from a recently arXived paper by James Ridgway [that was commented on the ‘Og earlier this month]. On the one hand, ABC can be seen as a robust method in that it focuses on the aspects of the assumed model that are translated by the [insufficient] summary statistics and their expectation. And nothing else. It is thus tolerant of departures from the hypothetical model that [almost] preserve those moments. On the other hand, ABC involves a degree of non-parametric estimation of the intractable likelihood, which may sound even more robust, except that the likelihood is estimated from pseudo-data simulated from the “wrong” model in case of misspecification.

In the paper, we examine how the pseudo-true value of the parameter [that is, the value of the parameter of the misspecified model that comes closest to the generating model in terms of Kullback-Leibler divergence] is asymptotically reached by some ABC algorithms like the ABC accept/reject approach and not by others like the popular linear regression [post-simulation] adjustment. Which surprisingly concentrates posterior mass on a completely different pseudo-true value. Exploiting our recent assessment of ABC convergence for well-specified models, we show the above convergence result for a tolerance sequence that decreases to the minimum possible distance [between the true expectation and the misspecified expectation] at a slow enough rate. Or, equivalently, for a sequence of acceptance probabilities that goes to zero at the proper speed. In the case of the regression correction, the pseudo-true value is shifted by a quantity that does not converge to zero, because of the misspecification in the expectation of the summary statistics. This is not immensely surprising but we hence get a very different picture when compared with the well-specified case, where regression corrections bring improvement to the asymptotic behaviour of the ABC estimators. This discrepancy between two versions of ABC can be exploited to build misspecification diagnostics, e.g. through the acceptance rate versus the tolerance level, or via a comparison of the ABC approximations to the posterior expectations of quantities of interest, which should diverge at rate √n. In both cases, ABC reference tables/learning bases can be exploited to draw and calibrate a comparison with the well-specified case.
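To make the acceptance-rate diagnostic concrete, here is a minimal sketch of plain accept/reject ABC on a toy setup chosen purely for illustration [it is not code from the paper]. The assumed model is N(θ,1), the data is actually generated with standard deviation 2, and the summaries are the sample mean and variance. The flat prior on (-5,5), the sample size, the number of simulations, and the function names `summaries`, `reference_table`, and `accept` are all assumptions of this sketch.

```python
import math
import random
import statistics


def summaries(sample):
    """Summary statistics used by ABC: sample mean and sample variance."""
    return (statistics.fmean(sample), statistics.variance(sample))


def reference_table(observed, n_sims, rng):
    """Simulate (theta, distance) pairs under the ASSUMED model N(theta, 1)."""
    s_obs = summaries(observed)
    n = len(observed)
    table = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)  # flat prior, an assumption of this sketch
        pseudo = [rng.gauss(theta, 1.0) for _ in range(n)]
        # Euclidean distance between observed and simulated summaries
        table.append((theta, math.dist(s_obs, summaries(pseudo))))
    return table


def accept(table, tolerance):
    """ABC accept/reject step: keep the thetas whose distance is within tolerance."""
    return [theta for theta, dist in table if dist <= tolerance]


rng = random.Random(42)
# Data from N(1, 2^2): the assumed unit variance is wrong, so the model is misspecified.
observed = [rng.gauss(1.0, 2.0) for _ in range(200)]
table = reference_table(observed, 2000, rng)

# Under misspecification the distances stay bounded away from zero.
min_dist = min(dist for _, dist in table)
print("smallest simulated distance:", round(min_dist, 3))

for tolerance in (5.0, 4.0, min_dist + 0.3, 0.5):
    kept = accept(table, tolerance)
    print("tolerance", round(tolerance, 3), "acceptance rate", len(kept) / len(table))
```

Since the simulated variances stay near one while the observed variance is near four, the distances cannot get close to zero, and the acceptance rate collapses once the tolerance falls below the smallest achievable distance. In a well-specified run with the same summaries, the acceptance rate would instead decay smoothly as the tolerance shrinks. The same table of (θ, distance) pairs is reused for every tolerance, in the spirit of the reference tables mentioned above.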


Il cemeterio de alpinismo

In the cemetery around Chiesa Vecchia in Macugnaga, at the bottom of Monte Rosa (and on the other side from Zermatt), there are so many alpinists and guides buried that the cemetery is called the cemetery of the alpinists. A memorial recalls deaths of local guides and climbers on the different routes of the Monte Rosa group [there is no Monte Rosa peak per se but a collection of 15 tops above 4000m]. Plus crosses and plaques on the church wall for those whose bodies were not recovered, according to a local guide. (Which sounds strange given that these are not the Himalayas! Unless these are glacier-related deaths…)

the DeepMind debacle

“I hope for a world where data is at the heart of understanding and decision making. To achieve this we need better public dialogue.” Hetan Shah

As I was reading one of the Nature issues I brought on vacation, while the rain was falling on an aborted hiking day on the fringes of Monte Rosa, I came across a 20 July tribune by Hetan Shah, executive director of the RSS. A rare occurrence of a statistician’s perspective in Nature. The event prompting this column is the ruling against the Royal Free London hospital group providing patient data to DeepMind for predicting kidney injury. Without the patients’ agreement. And with enough information to identify the patients. The issues raised by Hetan Shah are that data transfers should become open, and that they should be commensurate in volume and details to the intended goals. And that public approval should be sought. While I know nothing about this specific case, I find the article overly critical of DeepMind, whose interest in health related problems is certainly not pure and disinterested but nonetheless can contribute to advances in (personalised) care and prevention through its expertise in machine learning. (Disclaimer: I have neither connection nor conflict with the company!) And I do not see exactly how public approval or dialogue can help in making progress in handling data, unless I am mistaken in my understanding of “the public”. The article mentions the launch of a UK project on data ethics, involving several [public] institutions like the RSS: this is certainly commendable and may improve how personal data is handled by companies, but I would not call this conglomerate representative of the public, which most likely does not really trust these institutions either…

Das Kapital [not a book review]

A rather bland article by Gareth Stedman Jones in Nature reminded me that the first volume of Karl Marx’s Das Kapital is 150 years old this year. Which makes it appear quite close in historical terms [just before the Franco-German war of 1870] and rather remote in scientific terms. I remember going painstakingly through the books in 1982 and 1983, mostly during weekly train trips between Paris and Caen, and not getting much out of it! Even with the help of a cartoon introduction I had received as a 1982 Xmas gift! I had no difficulty in reading the text per se, as opposed to my attempt at Kant’s Critique of Pure Reason the previous summer [along with the other attempt to windsurf!], as the discourse was definitely grounded in economics and not in philosophy. But the heavy prose did not deliver a convincing theory of the evolution of capitalism [and of its ineluctable demise]. While the fundamental argument of workers’ labour being an essential balance to investors’ capital for profitable production was clearly if extensively stated, the extrapolations on diminishing profits associated with decreasing labour input [and the resulting collapse] were murkier and sounded more ideological than scientific. Not that I claim any competence in the matter: my attempts at getting the concepts behind Marxist economics stopped at this point and I have not been seriously thinking about it since! But it still seems to me that the theory did not age very well, missing the increasing power of financial agents in running companies. And of course [unsurprisingly] the digital revolution and its impact on the (dis)organisation of work and the disintegration of the proletariat as Marx envisioned it. For instance turning former workers into forced and poor entrepreneurs (Uber, anyone?!).
Not that the working conditions are particularly rosy for many, from a scarcity of low-skill jobs, to a nurtured competition between workers for existing jobs (leading to extremes like the scandalous zero hour contracts!), to minimum wages turned useless by the fragmentation of the working space and the explosion of housing costs in major cities, to the helplessness of social democracies in getting back some leverage on international companies…

maggiore alba [jatp]

Berlin [and Vienna] noir [book review]

While in Cambridge last month, I picked a few books from a local bookstore as fodder for my upcoming vacation. Including this omnibus volume made of the first three books by Philip Kerr featuring Bernie Gunther, a private and Reich detective in Nazi Germany, namely, March Violets (1989), The Pale Criminal (1990), and A German Requiem (1991). (Books that I actually read before the vacation!) The stories take place before the war, in 1938, and right after, in 1946, in Berlin and Vienna. The books centre on a German version of Philip Marlowe, wisecracks included, with various degrees of success. (There actually is a silly comparison with Chandler on the back of the book! And I found somewhere else a similarly inappropriate comparison with Graham Greene’s The Third Man…) Although I read all three books in a single week, which clearly shows some undeniable addictive quality in the plots, I find those plots somewhat shallow and contrived, especially the second one revolving around a serial killer of young girls that aims at blaming Jews for those crimes and at justifying further Nazi persecutions. Or the time spent in Dachau by Bernie Gunther as undercover agent for Heydrich. If anything, the third volume taking place in post-war Berlin and Vienna is much better at recreating the murky atmosphere of those cities under Allied occupation. But overall there are far too many info-dump passages in those novels to make them a good read. The author has clearly done his documentation job correctly, from the early homosexual persecutions to Kristallnacht, to the fights for control between the occupying forces, but the information about the historical context is not always delivered in the most fluent way. And having the main character working under Heydrich, then joining the SS, does make relating to him rather unlikely, to say the least.
It is hence unclear to me why those books are so popular, apart from the easy marketing line that stories involving Nazis are more likely to sell… Nothing that compares with the fantastic Alone in Berlin, depicting the somewhat senseless resistance of a Berliner during the Nazi years, dropping hand-written messages against the regime under strangers’ doors.