Archive for the Books Category

light and widely applicable MCMC: approximate Bayesian inference for large datasets

Posted in Books, Statistics, University life, Wines on March 24, 2015 by xi'an

Florian Maire (whose thesis was discussed in this post), Nial Friel, and Pierre Alquier (all in Dublin at some point) have arXived today a paper with the above title, aimed at quickly analysing large datasets. As reviewed in the early pages of the paper, this proposal follows a growing number of techniques advanced in the past years, like pseudo-marginals, Russian roulette, unbiased likelihood estimators, firefly Monte Carlo, adaptive subsampling, sub-likelihoods, telescoping debiased likelihood version, and even our very own delayed acceptance algorithm. (Which is incorrectly described as restricted to iid data, by the way!)

The lightweight approach is based on an ABC idea of working through a summary statistic that plays the role of a pseudo-sufficient statistic. The main theoretical result in the paper is indeed that, when subsampling in an exponential family, subsamples preserving the sufficient statistics (modulo a rescaling) are optimal in terms of distance to the true posterior. Subsamples are thus weighted in terms of the (transformed) difference between the full data statistic and the subsample statistic, assuming they are both normalised to be comparable. I am quite (positively) intrigued by this idea in that it allows one to somewhat compare inference based on two different samples. The weights of the subsets are then used in a pseudo-posterior that treats the subset as an auxiliary variable (and the weight as a substitute for the “missing” likelihood). This may sound a wee bit convoluted (!) but the algorithm description is not yet complete: simulating jointly from this pseudo-target is impossible because of the huge number of possible subsets. The authors thus suggest running an MCMC scheme targeting this joint distribution, with a proposed move on the set of subsets and a proposed move on the parameter set conditional on whether or not the proposed subset has been accepted.
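To make the weighting step concrete, here is a minimal Python sketch under loud assumptions: a Gaussian model whose normalised sufficient statistic is (mean, mean of squares), and a weight of the form exp(-eps * squared distance between the two statistics). The names summary and subset_weight, the value of eps, and the simulated data are hypothetical choices of mine for illustration, not the transform or the tuning used in the paper.

```python
import math
import random

def summary(xs):
    """Normalised Gaussian sufficient statistics: (mean, mean of squares)."""
    n = len(xs)
    return (sum(xs) / n, sum(x * x for x in xs) / n)

def subset_weight(full_stat, sub_stat, eps):
    """Hypothetical weight: exp(-eps * squared Euclidean distance)."""
    d2 = sum((a - b) ** 2 for a, b in zip(full_stat, sub_stat))
    return math.exp(-eps * d2)

rng = random.Random(0)
data = [rng.gauss(1.0, 2.0) for _ in range(10_000)]   # simulated "large" dataset
full_stat = summary(data)

representative = rng.sample(data, 100)   # a uniformly drawn subset of size 100
extreme = sorted(data)[:100]             # a deliberately unrepresentative subset

w_rep = subset_weight(full_stat, summary(representative), eps=1.0)
w_ext = subset_weight(full_stat, summary(extreme), eps=1.0)
print(w_rep, w_ext)
```

A subset whose statistics sit close to the full-data statistics gets a weight near one, while the extreme subset gets a weight that is numerically negligible, which is the sense in which the weight can stand in for the missing likelihood.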

From an ABC perspective, the difficulty in calibrating the tolerance ε sounds more acute than usual, as the size of the subset comes as an additional computing parameter. Bootstrapping options seem impossible to implement in a large size setting.

An MCMC issue with this proposal is that designing the move across the subset space is both paramount for its convergence properties and lacking in geometric intuition. Indeed, two subsets with similar summary statistics may be very far apart… Funny enough, in the representation of the joint Markov chain, the parameter subchain is secondary if crucial to avoid intractable normalising constants. It is also unclear to me, from reading the paper maybe too quickly, whether or not the separate moves when switching and when not switching subsets retain the proper balance condition for the pseudo-joint to still be the stationary distribution. The stationarity for the subset Markov chain is straightforward by design, but it is not so for the parameter. In the case of a switched subset, simulating from the true full conditional given the subset would work, but simulating by a fixed number L of MCMC steps would not.
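As an illustration of why stationarity on the subset side is the easy part, here is a minimal Python sketch of a Metropolis move on subsets of fixed size. It is a construction of mine, not the move proposed in the paper: it swaps one index inside the subset with one index outside (a symmetric proposal) and accepts with the ratio of hypothetical weights exp(-eps * squared distance between normalised Gaussian statistics). The parameter update, which is where the balance question arises, is deliberately left out.

```python
import math
import random

def summary(xs):
    """Normalised Gaussian sufficient statistics: (mean, mean of squares)."""
    n = len(xs)
    return (sum(xs) / n, sum(x * x for x in xs) / n)

def log_weight(full_stat, sub_stat, eps):
    """Log of the hypothetical subset weight."""
    return -eps * sum((a - b) ** 2 for a, b in zip(full_stat, sub_stat))

def swap_move(data, inside, outside, full_stat, eps, rng):
    """One Metropolis step: swap an inside index with an outside index.

    The proposal is symmetric, so the acceptance probability is the plain
    weight ratio. Returns True when the swap is accepted.
    """
    i = rng.randrange(len(inside))
    j = rng.randrange(len(outside))
    current = log_weight(full_stat, summary([data[k] for k in inside]), eps)
    inside[i], outside[j] = outside[j], inside[i]
    proposed = log_weight(full_stat, summary([data[k] for k in inside]), eps)
    if rng.random() < math.exp(min(0.0, proposed - current)):
        return True
    inside[i], outside[j] = outside[j], inside[i]   # rejected: undo the swap
    return False

rng = random.Random(0)
eps = 50.0
data = [rng.gauss(1.0, 2.0) for _ in range(2000)]
full_stat = summary(data)

# Start from a deliberately poor subset: the 50 smallest observations.
order = sorted(range(len(data)), key=lambda k: data[k])
inside, outside = order[:50], order[50:]

start = log_weight(full_stat, summary([data[k] for k in inside]), eps)
accepted = sum(swap_move(data, inside, outside, full_stat, eps, rng)
               for _ in range(2000))
end = log_weight(full_stat, summary([data[k] for k in inside]), eps)
print(start, end, accepted)
```

The lack of geometric intuition shows up even in this toy version: the swap proposal is blind to the statistics, so two subsets with nearly identical weights may share no observation at all.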

The lightweight technology therein shows its muscles on a handwritten digit recognition example where it beats regular MCMC by a factor of 10 to 20, using only 100 datapoints instead of the 10⁴ original datapoints. While very nice and realistic, this example may be misleading in that 100 digit realisations may be enough to find a tolerable approximation to the true MAP. I was also intrigued by the processing of the probit example, until I realised the authors had integrated the covariate out and inferred about the mean of that covariate, which means it is not a genuine probit model.

ABC for copula estimation

Posted in Books, Kids, pictures, Statistics, Travel, University life on March 23, 2015 by xi'an

Clara Grazian and Brunero Liseo (di Roma) have just arXived a note on a method merging copulas, ABC, and empirical likelihood. The approach is rather hybrid and thus not completely Bayesian, but this must be seen as a consequence of an ill-posed problem. Indeed, as in many econometric models, the model there is not fully defined: the marginals of iid observations are represented as being from well-known parametric families (and are thus well-estimated by Bayesian tools), while the joint distribution remains uncertain and hence so does the associated copula. The approach in the paper is to proceed stepwise, i.e., to estimate correctly each marginal, well, correctly enough to transform the data by an estimated cdf, and then only to estimate the copula or some aspect of it, like Spearman’s ρ, based on this transformed data. For that aspect, an empirical likelihood is computed and combined with a prior to make a BCel weight. (If this sounds unclear, each BCel evaluation is based on a random draw from the posterior samples, which transfers some uncertainty in the parameter evaluation into the copula domain. Thanks to Brunero and Clara for clarifying this point for me!)
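The two-step logic can be sketched in a few lines of Python, with heavy simplifications that are mine and not the authors': the marginals are Gaussian and fitted by plug-in moments (instead of posterior draws), there is no empirical likelihood or BCel weighting, and Spearman’s ρ is estimated through the standard moment identity ρ = 12 E[UV] − 3, where U and V are the cdf transforms. All variable names and simulation settings are hypothetical.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(1)
r = 0.7      # correlation of the latent Gaussian pair (simulation setting)
n = 5000

# Simulate a bivariate sample with Gaussian marginals and a Gaussian copula.
xs, ys = [], []
for _ in range(n):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    xs.append(2.0 + 3.0 * z1)
    ys.append(-1.0 + 0.5 * (r * z1 + math.sqrt(1 - r * r) * z2))

# Step 1: estimate each marginal separately (plug-in Gaussian fit).
fx = NormalDist.from_samples(xs)
fy = NormalDist.from_samples(ys)

# Step 2: transform the data by the estimated cdfs and work on the copula scale.
u = [fx.cdf(x) for x in xs]
v = [fy.cdf(y) for y in ys]

# Moment estimate of Spearman's rho from the transformed data.
rho_hat = 12 * sum(a * b for a, b in zip(u, v)) / n - 3

# Known value for a Gaussian copula with correlation r.
rho_true = 6 / math.pi * math.asin(r / 2)
print(rho_hat, rho_true)
```

Because the moment estimate is computed on the estimated transforms rather than on ranks, any misfit in the marginals feeds directly into the copula summary, which is where the arbitrariness of the two-step construction enters.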

At this stage of the note, there are two illustrations revolving around Spearman’s ρ. One on simulated data, with better performance than a nonparametric frequentist solution. And another one on a GARCH(1,1) model for two financial time-series.

I am quite glad to see an application of our BCel approach in another domain although I feel a tiny bit uncertain about the degree of arbitrariness in the approach, from the estimated cdf transforms of the marginals to the choice of the moment equations identifying the parameter of interest like Spearman’s ρ. Especially if one uses a parametric copula whose moments are equally well-known. While I see the practical gain in analysing each component separately, the object created by the estimated cdf transforms may have a very different correlation structure from the true cdf transforms. Maybe there exist consistency conditions on the estimated cdfs… Maybe other notions of orthogonality or independence could be brought into the picture to validate further the two-step solution…

Dom Juan’s opening

Posted in Books, Kids on March 22, 2015 by xi'an

The opening lines of the Dom Juan play by Molière, a play with highly subversive undertones about free will and religion. And this ode to tobacco that may get it banned in Australia, if the recent deprogramming of Bizet’s Carmen is setting a trend! [Personal note to Andrew: neither Molière’s nor my research are or were supported by a tobacco company! Although I am not 100% sure about Molière…]

“Quoi que puisse dire Aristote et toute la philosophie, il n’est rien d’égal au tabac: c’est la passion des honnêtes gens, et qui vit sans tabac n’est pas digne de vivre. Non seulement il réjouit et purge les cerveaux humains, mais encore il instruit les âmes à la vertu, et l’on apprend avec lui à devenir honnête homme.”

Dom Juan, Molière, 1665

[Whatever may be argued by Aristotle and the entire philosophy, there is nothing equal to tobacco; it is the passion of upright people, and whoever lives without tobacco does not deserve living. Not only it rejoices and purges human brains, but it also brings souls towards virtue, and teaches about becoming a gentleman.]

Gray matters [not much, truly]

Posted in University life, Books on March 21, 2015 by xi'an

Through the blog of Andrew Jaffe, Leaves on the Lines, I became aware of John Gray’s tribune in The Guardian, “What scares the new atheists”. Gray’s central points against “campaigning” or “evangelical” atheists are that their claim to scientific backup is baseless, that they mostly express a fear about the diminishing influence of the liberal West, and that they cannot produce an alternative form of morality. The title already put me off and the beginning of the tribune just got worse, as it goes on and on about the eugenics tendencies of some 1930’s atheists and on how they influenced Nazi ideology. It is never a good sign in a debate when the speaker strives to link the opposite side with National Socialist ideas and deeds. Even less so in a supposedly philosophical tribune! (To add injury to insult, Gray also brings Karl Marx into the picture with a similar blame for ethnocentrism…)

The synoptic problem and statistics [book review]

Posted in Books, R, Statistics, University life, Wines on March 20, 2015 by xi'an

A book that came to me for review in CHANCE and that came completely unannounced is Andris Abakuks’ The Synoptic Problem and Statistics. “Unannounced” in that I had not heard so far of the synoptic problem. This problem is one of ordering and connecting the gospels in the New Testament, more precisely the “synoptic” gospels attributed to Mark, Matthew and Luke, since the fourth canonical gospel of John is considered by experts to be posterior to those three. By considering overlaps between those texts, some statistical inference can be conducted and the book covers (some of?) those statistical analyses for different orderings of ancestry in authorship. My overall reaction after a quick perusal of the book over breakfast (sharing bread and fish, of course!) was to wonder why there was no mention made of a more global if potentially impossible approach via a phylogeny tree considering the three (or more) gospels as current observations and tracing their unknown ancestry back just as in population genetics. Not because ABC could then be brought into the picture. Rather because it sounds to me (and to my complete lack of expertise in this field!) more realistic to postulate that those gospels were not written by a single person. Or at a single period in time. But rather that they evolved like genetic mutations across copies and transmission until they got a sort of official status.

“Given the notorious intractability of the synoptic problem and the number of different models that are still being advocated, none of them without its deficiencies in explaining the relationships between the synoptic gospels, it should not be surprising that we are unable to come up with more definitive conclusions.” (p.181)

The book by Abakuks goes instead through several modelling directions, from logistic regression using variable length Markov chains [to predict agreement between two of the three texts by regressing on earlier agreement] to hidden Markov models [representing, e.g., Matthew’s use of Mark], to various independence tests on contingency tables, sometimes bringing into the model an extra source denoted by Q. Including some R code for hidden Markov models. Once again, from my outsider viewpoint, this fragmented approach to the problem sounds problematic and inconclusive. And rather verbose in extensive discussions of descriptive statistics. Not that I was expecting a sudden Monty Python-like ray of light and booming voice to disclose the truth! Or that I crave more p-values (some may be found hiding within the book). But I still wonder about the phylogeny… Especially since phylogenies are used in text authentication as pointed out to me by Robin Ryder for Chaucer’s Canterbury Tales.

Significance and artificial intelligence

Posted in Books, Kids, pictures, Statistics, University life on March 19, 2015 by xi'an

As my sorry excuse for an Internet provider has been unable to fix my broken connection for several days, I had more time to read and enjoy the latest Significance I received last week. Plenty of interesting entries, once again! Even though, faithful to my idiosyncrasies, I must definitely criticise the cover (but you may also skip till the end of the paragraph!): It shows a pile of exams higher than the page frame on a student table in a classroom and a vague silhouette sitting behind the exams. I do not know whether or not this is intentional but the silhouette has definitely been added to the original picture (and presumably the exams as well!), because the seat and blackboard behind this silhouette show through it. If this is intentional, does that mean that the poor soul grading this endless pile of exams has long turned into a wraith?! If not intentional, that’s poor workmanship for a magazine usually adept at making the most of the graphical side. (And then I could go on and on about the clearly independent choice of illustrations by the managing editor rather than the author(s) of the article…) End of the digression! Or maybe not because there also was an ugly graph from Knowledge is Beautiful about the causes of plane crashes that made pie-charts look great… Not that all the graphs in the book are bad, far from it!

“The development of full artificial intelligence could spell the end of the human race.” S. Hawking

The central theme of the magazine is artificial intelligence (and machine learning). A point I wanted to mention in a post following the recent doom-like messages of Gates and Hawking about AIs taking over humanity à la Blade Runner… or in Turing’s test. As if they had not already impacted our life so much and in so many ways. And not all positive or for the common good. Witness the ultra-fast codes on the stock market. Witness the self-replicating and modifying computer viruses. Witness the increasingly autonomous military drones. Or witness my silly Internet issue, where I cannot get hold of a person who can tell me what the problem is and what the company is doing to solve it (if anything!), but instead have to listen to endless phone automata that tell me to press “1 if…” and “3 else”, and that my incident ticket was last updated three days ago… But at the same time the tone of The Independent tribune by Hawking, Russell, Tegmark, and Wilczek is somewhat misguided, if I may object to such luminaries!, and playing on science fiction themes that have been repeated so many times that they are now ingrained, rather than strong scientific arguments. Military robots that could improve themselves to the point of evading their designers are surely frightening but much less realistic than a nuclear reaction that could not be stopped in a Fukushima plant. Or than the long-term impacts of genetically modified crops and animals. Or than the current proposals of climate engineering. Or than the emerging nano-particles.

“If we build systems that are game-theoretic or utility maximisers, we won’t get what we’re hoping for.” P. Norvig

The discussion of this scare in Significance does not contribute much in my opinion. It starts with the concept of a perfect Bayesian agent, supposedly the state of an AI creating paperclips, which (who?) ends up using the entire Earth’s resources to make more paperclips. The other articles in this cover story are more relevant, as for instance how AI moved from pure logic to statistical or probabilistic intelligence. With Yee Whye Teh discussing Bayesian networks and the example of Google translation (including a perfect translation into French of an English sentence).

solution manual for Bayesian Essentials with R

Posted in Books, Kids, Statistics, University life on March 18, 2015 by xi'an

The solution manual to our Bayesian Essentials with R has just been arXived. If I link this completion with the publication date of the book itself, it sure took an unreasonable time to come out and sadly with no obvious reason or even less justification for the delay… Given the large overlap with the solution manual of the previous edition, Bayesian Core, this version should have been completed much much earlier but, paradoxically if in line with the lengthy completion of the book itself, this previous manual is one of the causes for the delay, as we thought the overlap allowed for self-study readers to check some of the exercises. Prodded by Hannah Bracken from Springer-Verlag, and unable to hire an assistant towards this task, I eventually decided to spend the few days required to clean up this solution manual, with the unintentional help from my sorry excuse for an Internet provider who accidentally cut my home connection for a whole week so far…!

In the course of writing solutions, I stumbled upon one inexplicably worded exercise about the Lehmer-Schur algorithm for testing stationarity, an exercise that I had to rewrite from scratch. Apologies to any reader of Bayesian Essentials with R getting stuck on that exercise!!!

