“They went. They dug. Found nothing and came back, mostly in the rain.”

*[Warning: some spoilers in the following!]* The most striking imbalance in the story is the rather mundane pursuits of the three major heroes, from finding an old sword to avenging fallen friends here and there, set against the threat of an unravelling of the entire Universe and of the disappearance of the current cosmology. In addition, the absolute separation maintained by Morgan between Archeth and Ringil kills some of the alchemy of the previous books and increases the tendency toward tedious inner monologues. The volume is much, much more borderline science-fiction than the previous ones, which obviously kills some of the magic, given that the highest powers that be sound like a sort of meta computer code that eventually hands Ringil *the* ultimate decision. As is often the case, this mix between fantasy and science-fiction is not much to my taste, since it gives too much power to the foreign machines, *the Helmsmen*, which sound like they are driving the main human players towards very long term goals. And they too often play *deus ex machina* to save the “heroes” from unsolvable situations. Overall a wee bit of a lengthy book, with a story coming to an unexpected end in the very final pages, leaving some threads unexplained and some feeling that style prevailed over story. But nonetheless a page turner in its second half.

Filed under: Books Tagged: A Land Fit for Heroes, book reviews, heroic fantasy, Richard K. Morgan, science fiction, The Dark Defiles, trilogy

Filed under: pictures Tagged: Iran, Monir Shahroudy Farmanfarmaian, Tehran, The Guardian

Filed under: Kids, pictures, University life Tagged: entrance exam, medical school

to make sure they are exactly identical. (Where x denotes the part of the parameter being simulated and z anything else.) The paper also mentions an older paper by John Geweke—of which I was curiously unaware!—leading to another test: consider iterating the following two steps:

- update the parameter θ given the current data x by an MCMC step that preserves the posterior p(θ|x);
- update the data x given the current parameter value θ from the sampling distribution p(x|θ).

Since both steps preserve the joint distribution p(x,θ), values simulated by iterating them should exhibit the same properties as a forward production of (x,θ), i.e., simulating from p(θ) and then from p(x|θ). So with enough simulations, comparison tests can be run. (Andrew made a very similar proposal at about the same time.) There are potential limitations to the first approach, obviously, from being unable to write the full conditionals [an ABC version anyone?!] to making a programming mistake that keeps both ratios equal [as would occur if a Metropolis-within-Gibbs sampler were run using the ratio of the joints in the acceptance probability]. Further, as noted by the authors, it only addresses the mathematical correctness of the code, rather than the issue of whether the MCMC algorithm mixes well enough to provide a pseudo-iid sample from p(θ|x). (A lack of mixing that could be spotted by Geweke’s test.) But it is so immediately available that it can indeed be added to any and all simulations involving a conditional step, while Geweke’s test requires re-running the MCMC algorithm altogether. Although clear divergence between an iid sampling from p(x,θ) and the Gibbs version above could appear fast enough for a stopping rule to be used. In fine, a worthwhile addition to the collection of checks and tests built over the years for MCMC algorithms! (Of which the trick proposed by my friend Tobias Rydén to run *first* the MCMC code with n=0 observations in order to recover *the prior* p(θ) remains my favourite!)
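As a toy illustration of the comparison, here is a sketch with a conjugate normal model of my own choosing, in which the exact posterior draw stands in for a genuine MCMC step preserving p(θ|x):

```python
import numpy as np

rng = np.random.default_rng(0)
n_iter = 100_000

# Marginal-conditional ("forward") simulator: theta ~ p(theta), x ~ p(x|theta)
# with theta ~ N(0,1) and x | theta ~ N(theta,1)
theta_f = rng.normal(0.0, 1.0, n_iter)
x_f = rng.normal(theta_f, 1.0)

# Successive-conditional (Geweke) simulator: alternate an exact posterior draw
# theta ~ p(theta|x) = N(x/2, 1/2) and a data draw x ~ p(x|theta)
theta_g = np.empty(n_iter)
x_g = np.empty(n_iter)
theta, x = 0.0, 0.0
for t in range(n_iter):
    theta = rng.normal(x / 2.0, np.sqrt(0.5))  # posterior step
    x = rng.normal(theta, 1.0)                 # sampling distribution step
    theta_g[t], x_g[t] = theta, x

# Both schemes target the same joint p(x,theta): moments should agree
print(theta_f.mean(), theta_g.mean())  # both near 0
print(x_f.var(), x_g.var())            # both near 2 (prior variance + noise)
```

Any systematic discrepancy between the two sets of moments would flag a mistake in one of the conditional updates.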

Filed under: Books, Statistics, University life Tagged: ABC, convergence assessment, Geweke's test, Gibbs sampling, John Geweke, MCMC, Monte Carlo Statistical Methods, prior distributions, simulation

*“Will you follow me, one last time?”*

My daughter and I completed our Xmas Tolkien cycle by going together to see *The battle of the five armies*. As several have noted before me, the best thing I can say about this Hobbit series is that it is now… over! Just like the previous two instalments, watching Peter Jackson’s grand finale was mostly enjoyable, but mainly for the same reasons one enjoys visiting a venerable great-aunt once a year around Christmas, namely for bringing back memories of good times and shared laughs. Indeed, Jackson managed to link both sagas through his central character of Gandalf who, while overly fond of raised eyebrows and mischievous eyes, is certainly the most compelling character of all. The plot stretched too thinly to keep me enthralled, as I could not remember why the orcs and goblins were converging on Erebor at the same time as the elves and dwarves and men of Dale (unless it was to justify the future name of the battle?!). I soon grew battle-weary of the repeated clashes between the various armies, which looked like straight copies from on-line war games, and even more so of the half-dozen duels, while the rescue of Gandalf from Dol Guldur is unbearably clumsy, with an apocryphal appearance of the Nazgûl. As too often in the story, the giant eagles were so instrumental to victory that one could only wonder why they had not been around from the start.

The comical parts are much sparser here than in the previous movies: hardly any screen time for Radagast’s rabbits, thank Sauron!, or for the jovial Dain with his great Scottish brogue and his war[t]hog opening, or yet for Thranduil’s moose to show its major advantage in battle, a few steps before being shot down, or for the war mountain goats who appeared then vanished at the moment of direst need, or for Bard to find a pre-historical skateboard. I also noted that the [dumb] orcs managed to invent a precursor of Chappe’s telegraph that alas could only transmit one symbol [since it always took the same shape!], that Legolas recreated the Matrix by walking on a disintegrating bridge, and that Thorin turned gravity back on for a few crucial seconds in a movie where most characters seem to have no issue with falling, jumping or fighting without the slightest consideration for mechanics, with a strong tendency for characters to head-butt into walls…

“What this adaptation of ‘The Hobbit’ can’t avoid by its final instalment is its predictability and hollow foundations.” NYT, Dec. 16, 2014

Other features I did not enjoy much: Thorin sulked way too long, Alfrid outlasted his welcome on screen by about 144 minutes, only to vanish unexpectedly, Bilbo seemed lost at the margins for most of the movie, while the love story between Kili and Tauriel was really one addition too many to Tolkien’s book. The search for variety in the steeds of the various armies made me almost wish for more races on the battle-field, as we could then have seen fighters on giant moles or on battle-hens… And everyone could have done without the “Dune moment”, with giant earth-worms breaking tunnels only to return to oblivion. Anyway, we have now been “*There and Back Again*” and can settle in our own hobbit-hole to re-read the books and enjoy a certain nostalgia about the days when we could imagine on our own what Bilbo, Gandalf or Thorin would look like, while humming the “Song of the Misty Mountains”…

Filed under: Books, Kids, pictures Tagged: movie review, New Zealand, Peter Jackson, Song of the Misty Mountains, The battle of the five armies, The Hobbit, The Lord of the Rings

The major argument for using EP in a large data setting is that the approximation to the true posterior can be built using one part of the data at a time, thus avoiding handling the entire likelihood function. Nonetheless, I still remain mostly agnostic about using EP, and a seminar this morning at CREST by Guillaume Dehaene and Simon Barthelmé (re)generated questions about the method that can hopefully be exploited towards the future version of the paper.

One of the major difficulties I have with EP is the nature of the resulting approximation. Since it is chosen from a “nice” family of distributions, presumably restricted to an exponential family, the optimal approximation will remain within this family, which makes EP sound like a specific variational Bayes method, since the goal is to find the family member closest to the posterior in terms of Kullback-Leibler divergence. (Except that the divergence is the opposite one.) I remain uncertain about what to do with the resulting solution, as the algorithm does not tell me how close this solution will be to the true posterior. Unless one can use it as a pseudo-distribution for indirect inference (a.k.a. ABC)…?

Another thing that became clear during this seminar is that the decomposition of the target as a product is completely arbitrary, i.e., it does not correspond to a feature of the target other than the latter being the product of those components. Hence, the EP partition could be adapted or even optimised within the algorithm. Similarly, the parametrisation could be optimised towards a “more Gaussian” posterior. This is something that makes EP both exciting, as it opens many avenues for experimentation, and fuzzy, as its perceived lack of a well-defined goal makes comparing approaches delicate. For instance, using MCMC or HMC steps to estimate the parameters of the tilted distribution is quite natural in complex settings, but the impact of the additional approximation must be gauged against the overall purpose of the approach.
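The opposite direction of the divergence can be illustrated numerically on a toy Gaussian mixture target of my own choosing, with both fits obtained by brute-force optimisation rather than by an actual EP or VB algorithm: minimising KL(p||q) over Gaussians amounts to moment matching and covers the whole target, while minimising KL(q||p) locks onto a single mode.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

# Toy non-Gaussian target: a symmetric two-component Gaussian mixture
def p(x):
    return 0.5 * norm.pdf(x, -2, 0.7) + 0.5 * norm.pdf(x, 2, 0.7)

xs = np.linspace(-8, 8, 4001)
px = p(xs)
dx = xs[1] - xs[0]

# EP direction: argmin over Gaussians of KL(p||q) = moment matching
m_ep = np.sum(xs * px) * dx
v_ep = np.sum((xs - m_ep) ** 2 * px) * dx

# VB direction: argmin over Gaussians of KL(q||p), found numerically
def kl_qp(params):
    m, log_s = params
    q = norm.pdf(xs, m, np.exp(log_s))
    mask = q > 1e-12
    return np.sum(q[mask] * (np.log(q[mask]) - np.log(px[mask]))) * dx

res = minimize(kl_qp, x0=[1.5, 0.0], method="Nelder-Mead")
m_vb, s_vb = res.x[0], np.exp(res.x[1])

print(m_ep, np.sqrt(v_ep))  # mass-covering: mean 0, spread over both modes
print(m_vb, s_vb)           # mode-seeking: locks onto the component at +2
```

The mass-covering solution is exactly the exponential-family moment match, which is what each EP site update enforces on its tilted distribution.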

Filed under: Books, Statistics, University life Tagged: cavity distribution, CREST, data partitioning, EP, expectation-propagation, Kullback-Leibler divergence, large data problems, parallel processing

At each iteration of Ben’s algorithm, N proposed values are generated conditional on the “current” value of the Markov chain, which actually consists of (N+1) components and from which one component is drawn at random to serve as a seed for the next proposal distribution and the simulation of N other values. In short, this is a data-augmentation scheme with the index I on the one side and the N modified components on the other side. The neat trick in the proposal [and the reason for the jump in efficiency] is that the stationary distribution of the auxiliary variable can be determined and hence used (N+1) times in updating the vector of (N+1) components. (Note that picking the index at random means computing *all* (N+1) possible transitions from one component to the N others. Or even all (N+1)! if the proposals differ. Hence a potential *increase* in the computing cost, even though what costs the most is usually the likelihood computation, dispatched on the parallel processors.) While there are (N+1) terms involved at each step, the genuine Markov chain truly remains a *single* chain and the N other proposed values are not recycled. Even though they could be [for Monte Carlo integration purposes], as shown e.g. in our paper with Pierre Jacob and Murray Smith. Something that took a few iterations for me to understand is why Ben rephrases the original Metropolis-Hastings algorithm as a finite state space Markov chain on the set of indices {1,…,N+1} (Proposition 1). Conditionally on the values of the (N+1) vector, the stationary distribution of that sub-chain is no longer uniform. Hence, picking (N+1) indices from the stationary distribution helps in selecting the most appropriate images, which explains why the rejection rate decreases.

The paper indeed evaluates the impact of increasing the number of proposals in terms of effective sample size (ESS), acceptance rate, and mean squared jump distance, based on two examples. As often in parallel implementations, the paper suggests an “N-fold increase in computational speed”, even though this is simply the effect of running the *same* algorithm on a single processor versus on N parallel processors. If the comparison is between a single-proposal Metropolis-Hastings algorithm on a single processor and an N-fold proposal on N processors, I would say the latter is *slower* because the selection of the index I forces the computation of all pairs of reverse moves. Nonetheless, since this is an almost free bonus resulting from using N processors, it sounds worth investigating and comparing with more complex parallel schemes such as coupled chains.

Filed under: Books, Statistics, University life Tagged: parallel MCMC, PNAS, proceedings, vanilla Rao-Blackwellisation

which acts like a Bayesian p-value of sorts. I have discussed several times on this blog the reservations I hold about this notion, including running an experiment on the uniformity of the ppp while at Duke last year. One of those reservations is that it evaluates the posterior probability of an event that does not exist a priori, which is somewhat connected to the issue of using the data “twice”.

“A posterior predictive p-value has a transparent Bayesian interpretation.”

Another item that was suggested [to me] in the current paper is the difficulty in defining the posterior predictive (pp), for instance by including latent variables

which reminds me of the multiple possible avatars of the BIC criterion. The question addressed by Rubin-Delanchy and Lawson is how far from the uniform distribution this pp stands when the model is correct. The main result of their paper is that any sub-uniform distribution can be expressed as a particular posterior predictive. The authors also exhibit the distribution that achieves the bound produced by Xiao-Li Meng, namely that Pr(P ≤ u) ≤ 2u for all u in (0,1), where *P* is the above (top) probability. (Hence it is uniform up to a factor of 2!) Obviously, the proximity with the upper bound only occurs in a limited number of cases that do not validate the overall use of the ppp. But this is certainly a nice piece of theoretical work.
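For a quick numerical feel of this sub-uniformity, here is a toy conjugate-normal simulation of my own choosing, with the sample mean as test statistic so that the ppp is available in closed form:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n, R = 10, 20_000

# Conjugate setup: theta ~ N(0,1), x_i | theta ~ N(theta,1), T(x) = sample mean
theta = rng.normal(0.0, 1.0, R)
xbar = rng.normal(theta, 1.0 / np.sqrt(n))   # sample mean of each dataset

# Posterior: N(n*xbar/(n+1), 1/(n+1)); posterior predictive distribution of
# the replicated sample mean: N(n*xbar/(n+1), 1/(n+1) + 1/n)
m_post = n * xbar / (n + 1)
v_pred = 1.0 / (n + 1) + 1.0 / n
ppp = 1.0 - norm.cdf((xbar - m_post) / np.sqrt(v_pred))

print(ppp.mean())           # about 1/2, as for a uniform p-value
print((ppp <= 0.05).mean()) # far below 0.05: the ppp is very conservative
```

The ppp piles up around ½ instead of spreading uniformly, so Meng’s bound Pr(P ≤ u) ≤ 2u is satisfied with plenty of room to spare, and frequentist calibration is lost.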

Filed under: Books, Statistics, University life Tagged: Andrew Gelman, arXiv, Bayesian p-values, DIC, posterior predictive, uniformity, University of Bristol, using the data twice, warhammer, Xiao-Li Meng

Filed under: Kids, pictures, Travel Tagged: celebrations, clouds, France, holidays, Montpellier, plane picture, sunrise, Yule

Filed under: Kids, pictures, University life Tagged: All of Statistics, central limit theorem, introductory textbooks, t-test, Université Paris Dauphine