Archive for the Books Category

non-reversible Langevin samplers

Posted in Books, pictures, Statistics, Travel, University life on February 6, 2017 by xi'an

On the train to Oxford last night, I read through the recently arXived Duncan et al.’s Nonreversible Langevin Samplers: Splitting Schemes, Analysis and Implementation. Standing up the whole trip, in the great tradition of British trains.

The paper is fairly theoretical and full of Foster-Lyapunov assumptions but aims at defending an approach based on a non-reversible diffusion. One idea is that the diffusion based on the drift {∇ log π(x) + γ(x)} is associated with the target π provided

∇ · {π(x)γ(x)} = 0

which holds for the Langevin diffusion when γ(x)=0, but produces a non-reversible process otherwise. The Langevin choice γ(x)=0 happens to be the worst possible when considering the asymptotic variance. In practice, however, the diffusion needs to be discretised, which induces an approximation that may be catastrophic for convergence if left uncorrected, and a relapse into reversibility if corrected by a Metropolis step. The proposal in the paper is to use a Lie-Trotter splitting, which I had never heard of before, to separate the reversible [∇ log π(x)] and non-reversible [γ(x)] parts of the process. The deterministic part is chosen as γ(x)=∇ log π(x) [but then what is the point, since this is Langevin?] or as the gradient of a power of π(x). Although I was mostly lost by that stage, the paper then considers the error induced by a numerical integrator related to this deterministic part, towards deriving the asymptotic mean and variance of the splitting scheme. On the unit hypercube. Although the paper includes a numerical example for the warped normal target, I find it hard to visualise the implementation of this scheme. Having obviously not heeded Nicolas’ and James’ advice, the authors also analyse the Pima Indian dataset by a logistic regression!
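For a concrete if simplistic illustration, here is a minimal sketch of a non-reversible unadjusted Langevin sampler, taking γ(x) = J ∇ log π(x) for a constant skew-symmetric matrix J, one easy way to satisfy the divergence condition above. This is my own toy version with an Euler-Maruyama discretisation, not the authors' splitting scheme, and the target, step size, and chain length are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: standard bivariate Gaussian, so grad log pi(x) = -x.
def grad_log_pi(x):
    return -x

# Constant skew-symmetric J: with gamma(x) = J grad log pi(x), the
# divergence condition div{pi(x) gamma(x)} = 0 holds automatically,
# so pi stays invariant while the diffusion becomes non-reversible.
J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

def nonreversible_langevin(n_steps=50_000, h=0.05):
    """Unadjusted Euler-Maruyama discretisation of the diffusion
    dX = {grad log pi(X) + gamma(X)} dt + sqrt(2) dW."""
    x = np.zeros(2)
    samples = np.empty((n_steps, 2))
    for t in range(n_steps):
        drift = grad_log_pi(x) + J @ grad_log_pi(x)
        x = x + h * drift + np.sqrt(2 * h) * rng.standard_normal(2)
        samples[t] = x
    return samples

samples = nonreversible_langevin()
print(samples[10_000:].mean(axis=0))  # near (0, 0), up to O(h) bias
```

The skew-symmetric perturbation rotates the dynamics around the contours of the target without changing its invariant distribution, which is where the asymptotic-variance gains of non-reversibility come from; the discretisation bias discussed above is left uncorrected here.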

the three-body problem [book review]

Posted in Books on February 5, 2017 by xi'an

“Back then, I thought of one thing: Have you heard of the Monte Carlo method? Ah, it’s a computer algorithm often used for calculating the area of irregular shapes. Specifically, the software puts the figure of interest in a figure of known area, such as a circle, and randomly strikes it with many tiny balls, never targeting the same spot twice. After a large number of balls, the proportion of balls that fall within the irregular shape compared to the total number of balls used to hit the circle will yield the area of the shape. Of course, the smaller the balls used, the more accurate the result.

Although the method is simple, it shows how, mathematically, random brute force can overcome precise logic. It’s a numerical approach that uses quantity to derive quality. This is my strategy for solving the three-body problem. I study the system moment by moment. At each moment, the spheres’ motion vectors can combine in infinite ways. I treat each combination like a life form. The key is to set up some rules: which combinations of motion vectors are “healthy” and “beneficial,” and which combinations are “detrimental” and “harmful.” The former receive a survival advantage while the latter are disfavored. The computation proceeds by eliminating the disadvantaged and preserving the advantaged. The final combination that survives is the correct prediction for the system’s next configuration, the next moment in time.”
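For what it is worth, the hit-or-miss scheme the character describes can be sketched in a few lines (a toy illustration of mine, not anything from the novel):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hit-or-miss Monte Carlo: the area of an irregular shape equals the
# fraction of uniform points falling inside it, times the area of the
# enclosing figure (here the square [-1, 1]^2).
def mc_area(inside, n=1_000_000, box=(-1.0, 1.0)):
    lo, hi = box
    pts = rng.uniform(lo, hi, size=(n, 2))
    return inside(pts).mean() * (hi - lo) ** 2

# Example: the unit disk, whose true area is pi.
est = mc_area(lambda p: (p ** 2).sum(axis=1) <= 1.0)
print(est)  # around 3.14
```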

While I had read rather negative reviews of The Three-Body Problem, I still decided to buy the book from an Oxford bookstore and give it a try. If only because this was Chinese science fiction and I had never read any Chinese science fiction. (Of course the same motivation would apply to most other countries!) While the historical (or pseudo-historical) part of the novel is most interesting, about the daughter of a university physicist killed by Red Guards during the Cultural Revolution, and hence forever suspect, even after decades of exile, the science-fiction part, about contacting another inhabited planet and embracing its alien values based on its sole existence, is quite deficient and/or very old-fashioned. As is [old-fashioned] the call to more than three dimensions to manage anything, from space travel to instantaneous transfer of information, to ultimate weapons. And an alien civilization that is not dramatically alien. As for the three-body problem itself, there is very little of interest in the book, and the above quote on using Monte Carlo to “solve” it carries no novelty, since such use of Monte Carlo dates back to the early 1940s.

I am thus very much surprised at the book getting a Hugo Award, for a style more reminiscent of early Weird Tales than of current science fiction… In addition, the characters are rather flat and often act in unnatural ways. (Some critics blame the translation, but I think the issue runs deeper than that.)

a well-hidden E step

Posted in Books, Kids, pictures, R, Statistics on February 3, 2017 by xi'an

[Grand Palais from Esplanade des Invalides, Paris, Dec. 07, 2012]

A recent question on X validated ended up being quite interesting! The model under consideration is made of parallel Markov chains on a finite state space, all with the same Markov transition matrix, M, which turns into a hidden Markov model when the only summary available is the number of chains in a given state at a given time. When writing down the EM algorithm, the E step involves the expected number of moves from one given state to another at a given time. The conditional distribution of those numbers of moves is a product of multinomials across times and starting states, with no Markov structure, since the number of chains starting from a given state is known at each instant. Except that those multinomials are constrained by the number of “arrivals” in each state at the next instant, and this makes the computation of the expectation intractable, as far as I can see.

A solution by Monte Carlo EM means running the moves for each instant under the above constraints, which is thus a sort of multinomial distribution with fixed margins, enjoying a closed-form expression but for the normalising constant. The direct simulation soon gets too costly as the number of states increases and I thus considered a basic Metropolis move, using one margin (row or column) or the other as proposal, with the correction taken on another margin. This is very basic but apparently enough for the purpose of the exercise. If I find time in the coming days, I will try to look at the ABC resolution of this problem, a logical move when starting from non-sufficient statistics!
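In sketch form, a Metropolis move on such a table with fixed margins looks as follows. This is my own toy version, using the classical ±1 swap on a random 2×2 sub-table, a standard move that preserves every row and column sum, rather than the margin-based proposal described above; the transition matrix M and starting table are arbitrary:

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(1)

# Unnormalised log-probability of a table n of moves, given transition
# matrix M: product over cells of M_ij^n_ij / n_ij! (the multinomial
# weights, the margins being fixed by the observed counts).
def log_weight(n, M):
    return float((n * np.log(M)).sum()
                 - sum(lgamma(v + 1) for v in n.ravel()))

# One Metropolis step: add +1/-1 on a random 2x2 sub-table, which
# leaves all row and column sums unchanged.
def metropolis_step(n, M):
    i, k = rng.choice(n.shape[0], 2, replace=False)
    j, l = rng.choice(n.shape[1], 2, replace=False)
    prop = n.copy()
    prop[i, j] += 1
    prop[k, l] += 1
    prop[i, l] -= 1
    prop[k, j] -= 1
    if prop.min() < 0:          # proposal leaves the support: reject
        return n
    if np.log(rng.uniform()) < log_weight(prop, M) - log_weight(n, M):
        return prop
    return n

M = np.array([[0.4, 0.1],       # hypothetical transition probabilities
              [0.2, 0.3]])
n = np.array([[3, 1],           # hypothetical table of moves
              [2, 4]])
for _ in range(1_000):
    n = metropolis_step(n, M)
print(n.sum(axis=1), n.sum(axis=0))  # margins preserved: [4 6] [5 5]
```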

estimating constants [survey]

Posted in Books, pictures, Statistics, University life on February 2, 2017 by xi'an

A new survey on Bayesian inference with intractable normalising constants was posted on arXiv yesterday by Jaewoo Park and Murali Haran. A rather massive work of 58 pages, almost handy for a short course on the topic! In particular, it goes through the most common MCMC methods with a detailed description, followed by comments on components to be calibrated and the potential theoretical backup. This includes for instance the method of Liang et al. (2016) that I reviewed a few months ago. As well as the Wang-Landau technique we proposed with Yves Atchadé and Nicolas Lartillot. And the noisy MCMC of Alquier et al. (2016), also reviewed a few months ago. (The Russian Roulette solution is only mentioned very briefly as “computationally very expensive”. But still used in some illustrations. The whole area of pseudo-marginal MCMC is also missing from the picture.)

“…auxiliary variable approaches tend to be more efficient than likelihood approximation approaches, though efficiencies vary quite a bit…”

The authors distinguish between MCMC methods where the normalising constant is approximated and those where it is circumvented via an auxiliary-variable representation. The survey also distinguishes between asymptotically exact and asymptotically inexact solutions. For instance, using a finite number of inner MCMC steps instead of an exact draw from the associated target results in an asymptotically inexact method. The question that remains open is what to do with the output, i.e., whether or not there is a way to correct for this error. In the illustration for the Ising model, the double Metropolis-Hastings version of Liang et al. (2010) achieves for instance massive computational gains, but also exhibits a persistent bias that would go undetected were it the sole method implemented. This aspect of approximate inference is not really explored in the paper, but constitutes a major issue for modern statistics (and for machine learning as well, when inference is taken into account).
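To make the auxiliary-variable idea concrete, here is a toy sketch of an exchange-type (double Metropolis-Hastings) step, where the auxiliary draw cancels the intractable normalising constants in the acceptance ratio. This is my own illustration on a one-parameter exponential family with an exact auxiliary sampler, not the Ising implementation of Liang et al. (2010); there, the auxiliary draw is itself a finite inner MCMC run, which is precisely the source of the persistent bias mentioned above:

```python
import numpy as np

rng = np.random.default_rng(3)

# One exchange-type (double MH) step for the doubly-intractable target
# pi(y | theta) = exp{theta s(y)} h(y) / Z(theta): the auxiliary draw
# y' ~ pi(. | theta') makes both unknown Z's cancel in the ratio.
def double_mh_step(theta, y_obs, s, sample_aux, log_prior, step=0.5):
    theta_prop = theta + step * rng.standard_normal()
    y_aux = sample_aux(theta_prop)   # a finite inner MCMC run, in general
    log_ratio = (log_prior(theta_prop) - log_prior(theta)
                 + (theta_prop - theta) * (s(y_obs) - s(y_aux)))
    return theta_prop if np.log(rng.uniform()) < log_ratio else theta

# Toy check: y ~ N(theta, 1) is such a family [s(y) = y, and Z(theta)
# depends on theta]; with exact auxiliary draws the chain targets the
# actual posterior, here under a N(0, 100) prior and y_obs = 1.
theta, chain = 0.0, []
for _ in range(5_000):
    theta = double_mh_step(theta, y_obs=1.0, s=lambda y: y,
                           sample_aux=lambda t: t + rng.standard_normal(),
                           log_prior=lambda t: -t * t / 200.0)
    chain.append(theta)
print(np.mean(chain[1_000:]))  # near the posterior mean 100/101
```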

In conclusion, this survey provides a serious exploration of recent MCMC methods. It begs for a second part involving particle filters, which have often proven to be faster and more efficient than MCMC methods, at least in state space models. In that regard, Nicolas Chopin and James Ridgway examined further techniques when calling to leave the Pima Indians [dataset] alone.

relativity is the keyword

Posted in Books, Statistics, University life on February 1, 2017 by xi'an

[St John's College, Oxford, Feb. 23, 2012]

As I was teaching my introduction to Bayesian Statistics this morning, ending up with the chapter on tests of hypotheses, I found myself reflecting [out loud] on the relative nature of posterior quantities. Just as when I introduced the role of priors in Bayesian analysis the day before, I stressed the relativity of quantities coming out of the BBB [Big Bayesian Black Box], namely that whatever comes out of a Bayesian procedure is to be understood, scaled, and relativised against its prior equivalent, i.e., that the reference measure or gauge is the prior. This is sort of obvious, clearly, but bringing the argument forward from the start avoids all sorts of misunderstanding and disagreement, in that it excludes claims of absoluteness and certainty that may come with the production of a posterior distribution. It also removes the endless debate about the determination of the prior, by making each prior a reference on its own. With an additional possibility of calibration by simulation under the assumed model. Or an alternative one. Again nothing new there, but I got rather excited by this presentation choice, as it seems to clarify the path to Bayesian modelling and avoid misapprehensions.

Further, the curious case of the Bayes factor (or of the posterior probability) could possibly be resolved most satisfactorily in this framework, as the [dreaded] dependence on the model prior probabilities then becomes a matter of relativity! Those posterior probabilities depend directly and almost linearly on the prior probabilities, but they should not be interpreted in an absolute sense, as the ultimate and unique probability of the hypothesis (which anyway does not mean anything in terms of the observed experiment). In other words, this posterior probability does not need to be scaled against a U(0,1) distribution. Or against the p-value, if anyone wishes to do so. By the end of the lecture, I was even wondering [not so loudly] whether or not this perspective allowed for a resolution of the Lindley-Jeffreys paradox, as the resulting number could be set relative to the choice of the [arbitrary] normalising constant.
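The near-linear dependence mentioned above is immediate from the formula for the posterior probability of H₀ given a Bayes factor B in its favour, namely Bp/{Bp+(1-p)} for prior probability p (a trivial numerical illustration of mine, with an arbitrary B):

```python
# Posterior probability of H0 as a function of its prior probability p,
# for a fixed Bayes factor B in favour of H0 (toy value B = 3): close
# to linear in p when Bp is small, and always increasing in p.
def posterior_prob(p, B):
    return B * p / (B * p + (1.0 - p))

for p in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(p, round(posterior_prob(p, B=3.0), 3))
```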

3rd conference on geometric science[s] of information

Posted in Books, pictures, Statistics, Travel, University life on January 31, 2017 by xi'an

A call for contributions to the 3rd Conference on Geometric Science of Information, which I was asked to advertise. (I would have used Sciences instead of Science.) With a nice background picture related to Adelard of Bath, who among other things in natural philosophy introduced the Hindu-Arabic numerals into Europe [and later to America, even though the use of Arabic numerals there may soon come to an end]. And whose Latin translation of Euclid's Elements includes the above picture. The conference takes place on November 7-9, 2017, in the centre of Paris (École des Mines, next to the Luxembourg gardens). (As I cannot spot the registration rates of the conference on its website, I cannot at this stage bring full support to it!)

Elsevier in the frontline

Posted in Books, Statistics, University life on January 27, 2017 by xi'an

“Viewed this way, the logo represents, in classical symbolism, the symbiotic relationship between publisher and scholar. The addition of the Non Solus inscription reinforces the message that publishers, like the elm tree, are needed to provide sturdy support for scholars, just as surely as scholars, the vine, are needed to produce fruit. Publishers and scholars cannot do it alone. They need each other. This remains as apt a representation of the relationship between Elsevier and its authors today – neither dependent, nor independent, but interdependent.”

There were two items of news related to the publishark Elsevier in the latest issue of Nature I read. One was that Germany, Peru, and Taiwan no longer had access to Elsevier journals, after negotiations or funding stopped. Meaning the scientists there have to find alternative ways to procure the papers, from the authors' webpages [I do not get why authors fail to provide their papers through their publication webpage!] to peer-to-peer platforms like Sci-Hub. Beyond this short-term solution, I hope this pushes for the development of arXiv-based journals, like Gowers' Discrete Analysis. Actually, we [statisticians] should start planning a Statistics version of it!

The second item is about Elsevier developing its own impact factor index, CiteScore. While I do not deem the competition any more relevant for assessing research “worth”, seeing a publishark develop its own metrics sounds about as appropriate as Breitbart News starting an ethical index for fake news. I checked the assessment of Series B on that platform, which returns the journal as ranking third, with the surprising inclusion of the Annual Review of Statistics and its Application [sic], a review journal that only started two years ago, of the Annals of Mathematics, which does not seem to pertain to the category of Statistics, Probability, and Uncertainty, and of Statistics Surveys, an IMS review journal that started in 2009 (of which I was blissfully unaware). And the article in Nature points out that “scientists at the Eigenfactor project, a research group at the University of Washington, published a preliminary calculation finding that Elsevier's portfolio of journals gains a 25% boost relative to others if CiteScore is used instead of the JIF”. Not particularly surprising, eh?!

When looking for an illustration of this post, I came upon the hilarious quote given at the top: I particularly enjoy the newspeak reversal between the tree and the vine, the parasite publishark becoming the support and the academics the (invasive) vine… Just brilliant! (As a last note, the same issue of Nature mentions New Zealand aiming at getting rid of all invasive predators: I wonder if publishing predators are also included!)