As CHANCE book editor, I received the other day from Oxford University Press the proceedings of an École de Physique des Houches session on Statistical Physics, Optimisation, Inference, and Message-Passing Algorithms that took place there from September 30 to October 11, 2013. While it is mostly unrelated to Statistics, and since Igor Caron already reviewed the book more than a year ago, I skimmed through the few chapters connected to my interests, from Devavrat Shah’s chapter on graphical models and belief propagation, to Andrea Montanari’s denoising and sparse regression, including LASSO, and only read in some detail Manfred Opper’s expectation propagation chapter. This chapter made me realise (or re-realise as I had presumably forgotten an earlier explanation!) that expectation propagation can be seen as a sort of variational approximation that produces by a sequence of iterations the distribution within a certain parametric (exponential) family that is the closest to the distribution of interest. By writing the Kullback-Leibler divergence the opposite way from the usual variational approximation, the solution equates the expectation of the natural sufficient statistic under both models… Another interesting aspect of this chapter is the connection with estimating normalising constants. (I noticed a slight typo on p.269 in the final form of the Kullback approximation q() to p().)
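The moment-matching property can be recovered in two lines. This is only a sketch in notation of my own choosing (not the chapter's), assuming the approximating family is written in natural form, $q_\theta(x) = h(x)\exp\{\theta^{\mathsf T}T(x) - A(\theta)\}$, and that the divergence is taken from the target $p$ to $q_\theta$, rather than from $q_\theta$ to $p$ as in the usual variational approximation:

```latex
\begin{align*}
\mathrm{KL}(p\,\|\,q_\theta)
  &= \int p(x)\,\log\frac{p(x)}{q_\theta(x)}\,\mathrm{d}x \\
  &= \mathbb{E}_p[\log p(X)] - \mathbb{E}_p[\log h(X)]
     - \theta^{\mathsf T}\,\mathbb{E}_p[T(X)] + A(\theta), \\
\nabla_\theta\,\mathrm{KL}(p\,\|\,q_\theta)
  &= -\,\mathbb{E}_p[T(X)] + \nabla A(\theta) = 0
  \quad\Longleftrightarrow\quad
  \mathbb{E}_{q_\theta}[T(X)] = \mathbb{E}_p[T(X)],
\end{align*}
```

where the last equivalence uses the standard exponential-family identity $\nabla A(\theta) = \mathbb{E}_{q_\theta}[T(X)]$. Since $A$ is convex, this stationary point is the minimiser.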
Great news! Mark Huber (whom I’ve known for many years, so this review may not be completely objective!) has just written a book on perfect simulation! I remember (and still share) the excitement of the MCMC community when the first perfect simulation papers of Propp and Wilson (1995) came up on the (now defunct) MCMC preprint server, as it seemed then the ideal (perfect!) answer to critics of the MCMC methodology, plugging MCMC algorithms into a generic algorithm that eliminated burn-in, warm-up, and convergence issues… It seemed both magical, with the simplest argument: “start at T=-∞ to reach stationarity at T=0”, and esoteric (“why forward fails while backward works?!”), requiring simple random walk examples (and a java app by Jeff Rosenthal) to understand the difference (between backward and forward), as well as Wilfrid Kendall’s kids’ coloured wood cubes and his layer of leaves falling on the ground and seen from below… These were exciting years, with MCMC still in its infancy, and no goal seemed too far away! Now that years have gone by, and that the excitement has clearly died away, perfect sampling can be considered in a more sedate manner, with pros and cons well-understood. This is why Mark Huber’s book is coming at a perfect time if any! It covers the evolution of the perfect sampling techniques, from the early coupling from the past to the monotone versions, to the coalescence principles, with applications to spatial processes, to the variations on nested sampling and their use in doubly intractable distributions, with forays into the (fabulous) Bernoulli factory problem (a surprise for me, as Bernoulli factories are connected with unbiasedness, not stationarity, even though my only fieldwork [with Randal Douc] in such factories was addressing a way to turn MCMC into importance sampling! The key is in the notion of approximate densities, introduced in Section 2.6.).
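For readers who never played with those simple random walk examples, here is a minimal sketch of the backward versus forward puzzle. It is my own toy illustration, not code from the book: the reflecting walk on {0,…,n}, the function names (`update`, `cftp`, `forward_coupling`) and the choice p=0.5 are all assumptions made for the example. With p=0.5 the stationary distribution of this walk is uniform, which makes the comparison easy to check.

```python
import random


def update(x, u, n, p=0.5):
    """Monotone update: move up if u < p, else down, holding at the ends 0 and n."""
    return min(x + 1, n) if u < p else max(x - 1, 0)


def cftp(n, p=0.5, rng=None):
    """Monotone coupling from the past for the reflecting walk on {0,...,n}."""
    rng = rng if rng is not None else random.Random()
    us = []  # us[k] drives the transition from time -(k+1) to time -k
    horizon = 1
    while True:
        # Go further into the past, REUSING the uniforms already drawn.
        while len(us) < horizon:
            us.append(rng.random())
        lo, hi = 0, n  # minimal and maximal chains started at time -horizon
        for k in range(horizon - 1, -1, -1):
            lo = update(lo, us[k], n, p)
            hi = update(hi, us[k], n, p)
        if lo == hi:  # coalescence at time 0: the common value is the output
            return lo
        horizon *= 2


def forward_coupling(n, p=0.5, rng=None):
    """Run both chains FORWARD and return their value when they first meet."""
    rng = rng if rng is not None else random.Random()
    lo, hi = 0, n
    while lo != hi:
        u = rng.random()
        lo, hi = update(lo, u, n, p), update(hi, u, n, p)
    return lo
```

In this toy case the forward version can only ever return 0 or n, since the two chains keep their distance until one of them is stuck against a wall, while the backward version returns values spread over the whole of {0,…,n}.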
The book is quite thorough with the probabilistic foundations of the different principles, with even “a [tiny weeny] little bit of measure theory”.
Any imperfection?! Rather, only a (short too short!) reflection on the limitations of perfect sampling, namely that it cannot cover the simulation of posterior distributions in the Bayesian processing of most statistical models. Which makes the quote
“Distributions where the label of a node only depends on immediate neighbors, and where there is a chance of being able to ignore the neighbors are the most easily handled by perfect simulation protocols (…) Statistical models in particular tend to fall into this category, as they often do not wish to restrict the outcome too severely, instead giving the data a chance to show where the model is incomplete or incorrect.” (p.223)
just surprising, given the very small percentage of statistical models which can be handled by perfect sampling. And the downsizing of perfect sampling related papers in the early 2000’s. Which also makes the final and short section on the future of perfect sampling somewhat restricted in its scope.
So, great indeed!, a close to perfect entry to a decade of work on perfect sampling. If you have not heard of the concept before, consider yourself lucky to be offered such gentle guidance into it. If you have dabbled with perfect sampling before, reading the book will be like meeting old friends and hearing about their latest deeds. More formally, Mark Huber’s book should bring you a new perspective on the topic. (As for me, I had never thought of connecting perfect sampling with accept-reject algorithms.)
During my week in Warwick, I bought a book called Ghost Town, by Catriona Troth, from the campus bookstore, somewhat randomly, mostly because its back cover was mentioning Coventry in the early 1980’s, racial riots, and anti-skinhead demonstrations, as well as the University of Warwick. And Ska, this musical style from the 1980’s, inspired by an earlier Jamaican rhythm, which emerged in Coventry with a group called The Specials. (And the more mainstream Madness from Camden Town.) While this was some of the music I was listening to at that time, I was completely unaware it had started in Coventry! And Ghost Town is a popular song from The Specials. Which thus inspired the title of the book.
Enough with preliminaries!, the book is quite a good read, although more for the very realistic rendering of the atmosphere of the early 1980’s than for the story itself, even though both are quite intermingled. Most of the book action takes place in a homeless shelter where students just out of the University (or simply jobless) run the shelter and its flow of unemployed workers moving or drifting from the closed factories of the North towards London… This is Margaret Thatcher’s era, no doubt about this!, and the massive upheaval of industrial Britain at that time is translated into the gloomy feeling of an impoverished Midlands city like Coventry. This is also the end of the 1970’s, with (more) politically active students, almost indiscriminately active against every perceived oppression, from racism, to repression, to the war in Ireland (with the death of Bobby Sands in Maze prison, for which I remember marching in Caen…), but mostly calling for a more open society. Given the atmosphere at that time, and especially given this was the time I was a student, there is enough material to make the book quite enjoyable [for me] to read! Even though I find the personal stories of both main protagonists somewhat caricatural and rather predictable. And, maybe paradoxically, the overall tone of the (plot) relationship between those two is somewhat patronising and conservative, considering that they both can afford to retreat to safe havens when need be. But this does not make the bigger picture any less compelling a read, as the description of the (easy) manipulation of the local skinheads towards more violent racism by unnamed political forces is scary, with a very sad ending.
One side comment [of no relevance] is that reading the book made me realise I had no idea what Coventry looks like: none of the parts of town mentioned there evokes anything for me as I have never ventured farther than the train station! Which actually stands outside the ring road, hence not within the city limits. I hope I can find time during one of my next trips to have a proper look at downtown Coventry!
I have now read through Salman Rushdie’s version of the tales of 1001 nights (which amount to two years, eight months, and twenty-eight nights; this would make exactly two years and nine months if the last month was a month of February!, not that it particularly matters). It is a fantastic tale, with supernatural jinns playing an obviously supernatural role, a tale whose plot does not matter very much as it is the (Pandora) box for more tales and deeper philosophical reflections about religion and rationality. It is not a novel and even less a science-fiction novel, as I read in some reviews.
“It was the ungodly who had been specified as the targets but (…) this place was not at all ungodly. In point of fact it was excessively godly.”
What I liked very much, besides the literary style and the almost overwhelming culture (or cultures) of the author, of which I certainly missed a large chunk!, in two years, eight months, and twenty-eight nights is the mille-feuille structure of the story and the associated distancing imposed upon the reader against a natural reader’s tendency to believe or want to believe despite all inconsistencies. An induced agnosticism of sorts, most appropriate to mock the irrationality of religious believers, jinns and humans alike, in a godless universe: while jinn magic abounds in the book, there is no god or at least no acting god that we can detect. But gods and religious beliefs are exploited in the war of the jinns against the hapless humans. There are just as many levels of irony therein, which further contribute to skepticism and disbelief.
“Many, including the present author, trace the beginnings of the so-called “death of the gods”, back to this period.”
The book is also very much embedded in today’s world, for all its connections with medieval philosophy and the historical character Ibn Rushd (whose name was borrowed by Rushdie’s father to become their family name), also known as Averroes. The War on Terror, the Afghan and Syrian rise of religious fundamentalists, the Wall Street excesses, even the shooting down of the Malaysian airliner MH17 by Ukrainian rebels, all take place in the background of the so-called war of the jinns. Which makes the conclusion of the book highly pessimistic, if in tune with the overall philosophical cynicism of the author: if it really takes magical forces and super-heroes to bring rationality to the world, there is little hope for our own world…
“He passed a woman with astonishing face makeup, a zipper running down the middle of her face, ‘unzipped’ around her mouth to reveal bloody skinless flesh all the way down her chin.”
A last remark is that the above description of a Halloween disguise reminded me of the disguise my friend Julien Cornebise opted for a few years ago! No surprise as this is exactly the same. Which shows that Rushdie and he share some common background in popular culture.
By some piece of luck, I came upon the book Think Bayes: Bayesian Statistics Made Simple, written by Allen B. Downey and published by Green Tea Press [which I could relate to No Starch Press, focussing on coffee!, which published Statistics Done Wrong that I reviewed a while ago] which usually publishes programming books with fun covers. The book is available on-line for free in pdf and html formats, and I went through it during a particularly exciting administrative meeting…
“Most books on Bayesian statistics use mathematical notation and present ideas in terms of mathematical concepts like calculus. This book uses Python code instead of math, and discrete approximations instead of continuous mathematics. As a result, what would be an integral in a math book becomes a summation, and most operations on probability distributions are simple loops.”
The book is most appropriately published in this collection as most of it concentrates on Python programming, with hardly any maths formula. In some sense similar to Jim Albert’s R book. Obviously, coming from maths, and having never programmed in Python, I find the approach puzzling. But just as obviously, I am aware, both from the comments on my books and from my experience on X validated, that a large group (majority?) of newcomers to the Bayesian realm find the mathematical approach to the topic a major hindrance. Hence I am quite open to this editorial choice as it is bound to bring more people to think Bayes, or to think they can think Bayes.
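To give a flavour of what “a summation instead of an integral” means in practice, here is a minimal sketch in plain Python. It is my own illustration and does not use the book’s classes: the function name `grid_posterior`, the grid size, the uniform prior, and the made-up data (7 heads and 3 tails for a coin of unknown bias) are all assumptions.

```python
def grid_posterior(heads, tails, n_grid=1001):
    """Posterior of a coin bias on a regular grid, under a uniform prior."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    prior = [1.0 / n_grid] * n_grid
    # Prior times binomial likelihood, one grid point at a time.
    unnorm = [pr * th**heads * (1 - th)**tails for th, pr in zip(grid, prior)]
    total = sum(unnorm)  # this sum stands in for the normalising integral
    post = [w / total for w in unnorm]
    return grid, post


grid, post = grid_posterior(heads=7, tails=3)
# Posterior mean, again a sum rather than an integral.
post_mean = sum(th * w for th, w in zip(grid, post))
```

With a uniform prior the exact posterior is a Beta(8, 4) distribution, whose mean is 8/12, so the grid answer can be checked against the maths the book leaves out.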
“…in fewer than 200 pages we have made it from the basics of probability to the research frontier. I’m very happy about that.”
The choice made of operating almost exclusively through motivating examples is rather traditional in US textbooks. See e.g. Albert’s book. While it goes against my French inclination to start from theory and concepts and end up with illustrations, I can see how it operates in a programming book. But as always I fear it makes generalisations uncertain and understanding more shaky… The examples are perforce simple and far from realistic statistics issues. Hence they illustrate more the use of Bayesian thinking for decision making than for data analysis. To wit, those examples are about the Monty Hall problem and other TV games, some urn, dice, and coin models, blood testing, sport predictions, subway waiting times, height variability between men and women, SAT scores, cancer causality, a Geiger counter hierarchical model inspired by Jaynes, …, the exception being the Belly Button Biodiversity dataset in the final chapter, dealing with the (exciting) unseen species problem in an equally exciting way. This may explain why the book does not cover MCMC algorithms. And why ABC is covered through a rather artificial normal example. Which also sweeps some of the maths computations under the carpet.
“The underlying idea of ABC is that two datasets are alike if they yield the same summary statistics. But in some cases, like the example in this chapter, it is not obvious which summary statistics to choose.”
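As a reminder of what this idea looks like once coded, here is a minimal ABC rejection sketch on a normal toy problem. This is my own illustration and not the book’s example: the uniform prior on (-5, 5), the known unit variance, the sample mean as summary statistic, the tolerance `eps`, and the simulated “observed” data are all assumptions.

```python
import random


def abc_rejection(observed, n_proposals=20000, eps=0.1, rng=None):
    """Keep prior draws of mu whose simulated sample mean is within eps of the observed one."""
    rng = rng if rng is not None else random.Random()
    n = len(observed)
    target = sum(observed) / n  # summary statistic of the observed data
    accepted = []
    for _ in range(n_proposals):
        mu = rng.uniform(-5.0, 5.0)  # draw from the (assumed) prior
        fake = [rng.gauss(mu, 1.0) for _ in range(n)]  # simulate a dataset
        if abs(sum(fake) / n - target) < eps:  # "alike" = close summaries
            accepted.append(mu)
    return accepted


rng = random.Random(1)
observed = [rng.gauss(2.0, 1.0) for _ in range(20)]  # made-up data
sample = abc_rejection(observed, rng=rng)
```

Here the sample mean is a sufficient statistic, so the choice of summary is obvious and the only approximation comes from the tolerance. That is precisely what makes a normal example rather artificial for ABC.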
In conclusion, this is a very original introduction to Bayesian analysis, which I welcome for the reasons above. Of course, it is only an introduction, which should be followed by a deeper entry into the topic, and with [more] maths. In order to handle more realistic models and datasets.
“Today, a week or two spent reading Jaynes’ book can be a life-changing experience.” (p.8)
I received this book by Peter Grindrod, Mathematical underpinnings of Analytics (theory and applications), from Oxford University Press, quite a while ago. (Not that long ago since the book got published in 2015.) As a book for review for CHANCE. And I let it sit on my desk and in my travel bag for the same while, as it was unclear to me that it was connected with Statistics and CHANCE. What is [are?!] analytics?! I did not find much of a definition of analytics when I at last opened the book, and even fewer mentions of statistics or machine-learning, but Wikipedia told me the following:
“Analytics is a multidimensional discipline. There is extensive use of mathematics and statistics, the use of descriptive techniques and predictive models to gain valuable knowledge from data—data analysis. The insights from data are used to recommend action or to guide decision making rooted in business context. Thus, analytics is not so much concerned with individual analyses or analysis steps, but with the entire methodology.”
Barring the absurdity of speaking of a “multidimensional discipline” [and even worse of linking with the mathematical notion of dimension!], this tells me analytics is a mix of data analysis and decision making. Hence relying on (some) statistics. Fine.
“Perhaps in ten years’ time, the mathematics of behavioural analytics will be common place: every mathematics department will be doing some of it.” (p.10)
First, and to start with some positive words (!), a book that quotes both Friedrich Nietzsche and Patti Smith cannot get everything wrong! (Of course, including a most likely apocryphal quote from the now late Yogi Berra does not belong to this category!) Second, from a general perspective, I feel the book meanders its way through chapters towards a higher level of statistical consciousness, from graphs to clustering, to hidden Markov models, without precisely mentioning statistics or statistical models, while insisting very much upon Bayesian procedures and Bayesian thinking. Overall, I can relate to most items mentioned in Peter Grindrod’s book, but mostly by first reconstructing the notions behind them. While I personally appreciate the distanced and often ironic tone of the book, reflecting upon the author’s experience in retail modelling, I am thus wondering at which audience Mathematical underpinnings of Analytics aims, for a practitioner would have a hard time jumping the gap between the concepts exposed therein and one’s practice, while a theoretician would require more formal and deeper entries on the topics broached by the book. I just doubt this entry will be enough to lead maths departments to adopt behavioural analytics as part of their curriculum…
Mýrin (“The Bog”) is the third novel in the Inspector Erlendur series written by Arnaldur Indridason. It contains the major themes of the series, from the fascination for unexplained disappearances in Iceland to Erlendur’s inability to deal with his family responsibilities, to domestic violence, to exhumations. The death that starts the novel takes place in the district of Norðurmýri, “the northern marsh”, not far from the iconic Hallgrimskirkja, and not far either from DeCODE, the genetics company I visited last June and which stores genetic information about close to a million Icelanders, the Íslendingabók. And which plays an important and nefarious role in the current novel. While this episode takes place mostly between Reykjavik and Keflavik, hence does not offer any foray into Icelandic landscapes, it reflects quite vividly on the cultural pressure still present in recent years to keep rapes and sexual violence a private matter, hidden from an indifferent or worse police force. It also shows how the police miss (in 2001) the important genetic clues, being as yet unaware of the immense and frightening possibilities of handling the genetic code of an entire population. (The English and French titles refer to the unauthorised private collections of body parts accumulated [in jars] by doctors after autopsies, families being unaware of the fact.) As usual, solving the case is the least important part of the story, which tells about broken lives and survivors against all odds.