## MAP or mean?!

Posted in Statistics, Travel, University life on March 5, 2014 by xi'an

“A frequent matter of debate in Bayesian inversion is the question, which of the two principle point-estimators, the maximum-a-posteriori (MAP) or the conditional mean (CM) estimate is to be preferred.”

An interesting topic for this arXived paper by Burger and Lucka that I (also) read in the plane to Montréal, even though I do not share the premise that we should pick between those two estimators (only or at all), since what matters is the posterior distribution and the use one makes of it. I thus disagree that there is any kind of “debate concerning the choice of point estimates”. If Bayesian inference reduces to producing a point estimate, it is a regularisation technique and the Bayesian interpretation is both incidental and superfluous.

Maybe the most interesting result in the paper is that the MAP is expressed as a proper Bayes estimator! I was under the opposite impression, mostly because the folklore (and even The Bayesian Core) that it corresponds to a 0-1 loss function does not hold for continuous parameter spaces, and also because it seems to conflict with the results of Druilhet and Marin (BA, 2007), who point out that the MAP ultimately depends on the choice of the dominating measure. (Even though the Lebesgue measure is implicitly chosen as the default.) The authors of this arXived paper start with a distance based on the prior, called the Bregman distance, which may be the quadratic or the entropy distance depending on the prior. Defining a loss function that is a mix of this Bregman distance and of the quadratic distance

$||K(\hat u-u)||^2+2D_\pi(\hat u,u)$

produces the MAP as the Bayes estimator. So where did the dominating measure go? In fact, nowhere: both the loss function and the resulting estimator are clearly dependent on the choice of the dominating measure… (The loss depends on the prior but this is not a drawback per se!)
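The dominating-measure dependence is easy to check numerically. In the toy example below (a Gamma(3,2) “posterior” of my own choosing, not taken from the paper), the MAP under the Lebesgue measure in θ is (a−1)/b = 1, but maximising the density of φ = log θ instead, i.e. changing the reference measure, moves the estimate to a/b = 1.5:

```python
import math

a, b = 3.0, 2.0  # illustrative Gamma(shape, rate) "posterior"

def log_post_theta(t):
    # log density with respect to the Lebesgue measure in theta
    return (a - 1) * math.log(t) - b * t

def log_post_phi(phi):
    # log density with respect to the Lebesgue measure in phi = log(theta),
    # which picks up the Jacobian term e^phi
    return a * phi - b * math.exp(phi)

grid = [0.001 + 5 * i / 100_000 for i in range(100_000)]
map_theta = max(grid, key=log_post_theta)                      # mode in theta
map_phi = max((math.log(t) for t in grid), key=log_post_phi)   # mode in log(theta)

print(map_theta, math.exp(map_phi))  # (a-1)/b = 1.0 versus a/b = 1.5
```

Same posterior, two different “MAPs”, simply because the dominating measure changed.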

## Advances in scalable Bayesian computation [day #1]

Posted in Books, Mountains, pictures, R, Statistics, University life on March 4, 2014 by xi'an

This was the first day of our workshop Advances in Scalable Bayesian Computation and it sounded like the “main” theme was probabilistic programming, in tune with my book review posted this morning. Indeed, both Vikash Mansinghka and Frank Wood gave talks about this concept, Vikash detailing the specifics of a new programming language called Venture and Frank focussing on his state-space version of the above called Anglican. This is a version of the language Church, developed to handle probabilistic models and inference (hence the joke about Anglican, “a Church of England Venture”! But they could also have added that Frank Wood was also the name of a former archbishop of Melbourne…) I alas had an involuntary doze during Vikash’s talk, which made it harder for me to assess the fundamentals of those ventures, of how they extend beyond “mere” new software (and of why I would invest in learning a Lisp-based language!).
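Not being fluent in Venture or Anglican, I can only hedge, but the core idea of probabilistic programming — write the generative model as an ordinary program, then let a generic engine condition on the observations — can be sketched in a few lines of Python (importance sampling standing in for the engine, and the beta-binomial model being my own toy choice, not an example from the talks):

```python
import random

# the "probabilistic program": theta ~ Uniform(0,1) is the prior bias of a
# coin, and we condition on having observed 9 heads in 10 flips
def prior_sample(rng):
    return rng.random()

def likelihood(theta, heads=9, n=10):
    # binomial likelihood, up to a constant
    return theta**heads * (1 - theta)**(n - heads)

# generic, model-agnostic inference step: importance sampling from the prior
rng = random.Random(2014)
draws = [prior_sample(rng) for _ in range(200_000)]
weights = [likelihood(t) for t in draws]
post_mean = sum(t * w for t, w in zip(draws, weights)) / sum(weights)

print(post_mean)  # conjugacy gives E[theta | data] = (9+1)/(10+2), about 0.833
```

The selling point of the languages discussed at the workshop is precisely that the model code and the inference engine are decoupled, so the second half of this sketch comes for free.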

The other talks of Day #1 were of a more “classical” nature, with Pierre Jacob explaining why non-negative unbiased estimators are impossible to provide in general, a paper I posted about a little while ago, and including an objective Bayes example that I found quite interesting. Then Sumeet Singh (no video) presented a joint work with Nicolas Chopin on the uniform ergodicity of the particle Gibbs sampler, a paper that I should have commented on here (except that it appeared just prior to The Accident!), with a nice coupling proof. And Maria Lomeli gave us an introduction to the highly general Poisson-Kingman mixture models as random measures, which encompass all of the previously studied non-parametric random measures, with an MCMC implementation that includes a latent variable representation for the alpha-stable process behind the scenes, a representation that could be (and maybe is) also useful in parametric analyses of alpha-stable processes.
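The phenomenon behind Pierre's impossibility result can be illustrated with a randomly truncated series estimator (my own illustration, not taken from the talk): the estimator below is exactly unbiased for exp(−λ), yet some realisations are necessarily negative.

```python
import math
import random

lam, p = 1.0, 0.3  # target is exp(-lam); N ~ Geometric(p) truncates the series

def series_estimate(rng):
    # include term k of sum_k (-lam)^k / k! with probability P(N >= k) = (1-p)^k
    # and reweight by that probability, so the estimator is unbiased
    n = 0
    while rng.random() >= p:  # geometric number of extra terms
        n += 1
    return sum((-lam)**k / (math.factorial(k) * (1 - p)**k)
               for k in range(n + 1))

rng = random.Random(1234)
draws = [series_estimate(rng) for _ in range(100_000)]
mean = sum(draws) / len(draws)

print(mean, min(draws))  # mean close to exp(-1), yet some draws are negative
```

Truncating the alternating series at N = 1, for instance, already produces a negative value, and no reweighting scheme can avoid this while preserving unbiasedness, which is the crux of the impossibility result.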

We also had an open discussion in the afternoon that ended up being quite exciting, with a few of us voicing problems or questions about existing methods and others making suggestions or contradictions. We are still a wee bit short of considering a collective paper on “MCMC under constraints with coherent cross-validated variational Bayes and loss-based pseudo priors, with applications to basketball data” to appear by the end of the week!

Add to this two visits to the Sally Borden Recreation Centre for morning swimming and evening climbing, and it is no wonder I woke up a bit late this morning! Looking forward to Day #2!

## cascade on Cascade Mountain

Posted in Mountains, pictures, Travel on March 3, 2014 by xi'an

I started my stay in Banff with an interesting ice-climb on Cascade Mountain, just next to the Icefields Parkway exit to the town. (So we climbed the redundant Cascade Fall!) While the difficulty of the climb was much lower [grade III] than for my earlier ice-climb in Banff, it was incredibly cold (when we started, the temperature was -27°C… and rose to -19°C by mid-afternoon, freezing the water in my thermos bottle) and I was a little worried about getting numb fingers, which would definitely not help with the climbing. And about the ice getting too brittle. As it happened, the cold did not bother us at all during the climb, which ended up being highly enjoyable. (The missing thumb did not bother me either, except when clipping gear in and out, where I was rather clumsy.) The mountain guide who took us there was Joe McKay, who was hilarious and highly laid-back. He is also involved in filming climbing tricks and advice, so I may see him again at the Banff Centre this week… (In a video, Joe states that one should not climb if it’s 25 below!)

## Advances in Scalable Bayesian Computation [14w5125]

Posted in Mountains, Statistics, University life on March 3, 2014 by xi'an

Here is the Press release for the workshop:

The Banff International Research Station will host the “Advances in Scalable Bayesian Computation” workshop from March 2nd to March 7th, 2014.

Computational advances are always accompanied by new challenges, due both to the growth in data processing and to the possible exploration of new models. While highly innovative statistical computing methods from the early 1990s still are at the core of today’s statistical practice, new models, especially in population genetics and statistical signal processing, cannot be readily handled by such methods. Solutions at the interface between improved computing algorithms and controlled model approximations are now appearing in several fields, and this workshop aims at bringing together experts from those different fields.

The Banff International Research Station for Mathematical Innovation and Discovery (BIRS) is a collaborative Canada-US-Mexico venture that provides an environment for creative interaction as well as the exchange of ideas, knowledge, and methods within the Mathematical Sciences, with related disciplines and with industry. The research station is located at The Banff Centre in Alberta and is supported by Canada’s Natural Science and Engineering Research Council (NSERC), the U.S. National Science Foundation (NSF), Alberta’s Advanced Education and Technology, and Mexico’s Consejo Nacional de Ciencia y Tecnología (CONACYT).

The talks at BIRS are available on video as well.

## Bayesian programming [book review]

Posted in Books, Kids, pictures, Statistics, University life on March 3, 2014 by xi'an

“We now think the Bayesian Programming methodology and tools are reaching maturity. The goal of this book is to present them so that anyone is able to use them. We will, of course, continue to improve tools and develop new models. However, pursuing the idea that probability is an alternative to Boolean logic, we now have a new important research objective, which is to design specific hardware, inspired from biology, to build a Bayesian computer.” (p.xviii)

On the plane to and from Montpellier, I took an extended look at Bayesian Programming, a CRC Press book recently written by Pierre Bessière, Emmanuel Mazer, Juan-Manuel Ahuactzin, and Kamel Mekhnacha. (Very nice picture of a fishing net on the cover, by the way!) Despite my initial excitement at seeing a book whose final goal is to achieve a Bayesian computer, as demonstrated by the above quote, I soon found the book too arid to read due to its highly formalised presentation… The contents clearly indicate that the approach is useful, as they illustrate the use of Bayesian programming in different decision-making settings, including a collection of Python codes. The book thus brings an answer to the what, but it somehow misses the how, in that the construction of the priors and the derivation of the posteriors are not explained in a way one could replicate.

“A modeling methodology is not sufficient to run Bayesian programs. We also require an efficient Bayesian inference engine to automate the probabilistic calculus. This assumes we have a collection of inference algorithms adapted and tuned to more or less specific models and a software architecture to combine them in a coherent and unique tool.” (p.9)

For instance, all models therein are described via the curly brace formalism summarised by a nested curly-brace specification [the book’s figure is not reproduced here], which quickly turns into an unpalatable object, as in an example taken from the online PhD thesis of Gabriel Synnaeve (where he applied Bayesian programming principles to the real-time strategy game StarCraft and developed an AI, or bot, called BroodwarBotQ), a thesis that I found most interesting!

“Consequently, we have 21 × 16 = 336 bell-shaped distributions and we have 2 × 21 × 16 = 672 free parameters: 336 means and 336 standard deviations.”(p.51)

Now, getting back to the topic of the book, I can see connections with statistical problems and models, and not only via the application of Bayes’ theorem, when the purpose (or Question) is to take a decision, for instance in a robotic action. I still remain puzzled by the purpose of the book, since it starts with very low expectations on the reader, but hurries past notions like Kalman filters and Metropolis-Hastings algorithms in a few paragraphs. I do not get some of the details, like this notion of a discretised Gaussian distribution. (I eventually found the place where the 672 prior parameters are “learned”, in a phase called “identification”.)
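My best guess, hedged, at what the book means by a discretised Gaussian: evaluate the Gaussian density at each value the (discrete) variable can take and renormalise, as in this sketch (the grid of 21 values is my own arbitrary choice):

```python
import math

def discretised_gaussian(mu, sigma, values):
    # evaluate the Gaussian density at each admissible value, then
    # renormalise so the weights form a proper probability distribution
    dens = [math.exp(-0.5 * ((v - mu) / sigma) ** 2) for v in values]
    total = sum(dens)
    return {v: d / total for v, d in zip(values, dens)}

pmf = discretised_gaussian(mu=3.2, sigma=1.0, values=range(21))
print(max(pmf, key=pmf.get))  # the mode lands on the grid point nearest mu: 3
```

If this reading is correct, the 336 means and standard deviations of the quote above would each parameterise one such renormalised table.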

“Thanks to conditional independence the curse of dimensionality has been broken! What has been shown to be true here for the required memory space is also true for the complexity of inferences. Conditional independence is the principal tool to keep the calculation tractable. Tractability of Bayesian inference computation is of course a major concern as it has been proved NP-hard (Cooper, 1990).”(p.74)
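The memory claim in this quote is easy to quantify: for n binary variables, the full joint table carries 2^n − 1 free parameters, while a Markov-chain factorisation exploiting conditional independence needs only 1 + 2(n−1). A back-of-the-envelope check (my own count, not a computation from the book):

```python
n = 20  # number of binary variables

# full joint table: one probability per configuration, minus normalisation
full_joint = 2**n - 1

# chain factorisation P(x1) * prod_k P(x_k | x_{k-1}):
# 1 parameter for P(x1), plus 2 per conditional table (one per parent value)
chain = 1 + 2 * (n - 1)

print(full_joint, chain)  # 1048575 versus 39
```

Going from about a million parameters to a few dozen is indeed the curse of dimensionality being “broken”, at the price of the conditional independence assumptions.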

The final chapters (Chap. 14 on “Bayesian inference algorithms revisited”, Chap. 15 on “Bayesian learning revisited” and Chap. 16 on “Frequently asked questions and frequently argued matters” [!]) are definitely those I found easiest to read and relate to, with mentions made of conjugate priors and of the EM algorithm as a (Bayes) classifier. The final chapter mentions BUGS, Hugin and… Stan! Plus a sequence of 23 PhD theses defended on Bayesian programming for robotics in the past 20 years. And it explains the authors’ views on the difference between Bayesian programming and Bayesian networks (“any Bayesian network can be represented in the Bayesian programming formalism, but the opposite is not true”, p.316), between Bayesian programming and probabilistic programming (“we do not search to extend classical languages but rather to replace them by a new programming approach based on probability”, p.319), and between Bayesian programming and Bayesian modelling (“Bayesian programming goes one step further”, p.317), with a further (self-)justification of why the book sticks to discrete variables, and more philosophical sections referring to Jaynes and the principle of maximum entropy.

“The “objectivity” of the subjectivist approach then lies in the fact that two different subjects with same preliminary knowledge and same observations will inevitably reach the same conclusions.”(p.327)

Bayesian Programming thus provides a good snapshot of (or window on) what one can achieve in decision-making under uncertainty with Bayesian techniques, and it shows a long-term reflection on those notions by Pierre Bessière, his colleagues and students. The topic is most likely too remote from my own interests for the above review to be complete. Therefore, if anyone is interested in reviewing this book any further for CHANCE, before I send the above to the journal, please contact me. (Usual provisions apply.)

## is atheism irrational?

Posted in Books, Kids on March 2, 2014 by xi'an

“If a belief is as likely to be false as to be true, we’d have to say the probability that any particular belief is true is about 50 percent. Now suppose we had a total of 100 independent beliefs (of course, we have many more). Remember that the probability that all of a group of beliefs are true is the multiplication of all their individual probabilities. Even if we set a fairly low bar for reliability — say, that at least two-thirds (67 percent) of our beliefs are true — our overall reliability, given materialism and evolution, is exceedingly low: something like .0004. So if you accept both materialism and evolution, you have good reason to believe that your belief-producing faculties are not reliable.”

On the (New York Times) philosophy blog The Stone, I spotted this entry and first wondered if I had misread the title, as atheism sounds (to me) like a most rational position. I then read the piece and found it mostly missing the point, even though a few points rang true(r). First, theism is never properly defined. (Even though the author Alvin Plantinga seems to stick to monotheist religions.) This is a not-so-subtle trick, as it makes atheism appear as the extreme position, since it is rejecting any form of theism! Then, the interviewee mostly uses a sequence of sophisms as arguments that atheists are irrational, see e.g. the even-star-ism and a-moonism and a-teapotism entries. Further, some of his entries very strongly resemble intelligent design arguments, e.g. the “fine-tuning” line that the universe is too perfectly suited to human life to be due to randomness. Even though Plantinga also resorts to evolution when needed, as in the above quote. (The interviewer is not doing a great job either, by referring to evil, or to the need (or lack thereof) of God versus science to explain the world, rather than resorting to rational arguments. And without mentioning the fundamental point in favour of atheism that the existence of a sentient being driving the whole universe while remaining hidden to us humans requires an infinitely stronger leap than arguing that such a being is incompatible with the laws of Physics and the accumulated corpus of experience since the dawn of humanity.) The whole strategy of Plantinga is actually to turn atheism into another kind of belief, “that materialism and evolution are true”, and then to rank it equal with the theisms. A very poor philosophical performance. As also (and better) pointed out in this other post. (And as my daughter remarked, fresh from writing a philosophy essay, Plantinga is missing the best argument of all, namely Pascal’s wager, an early instance of decision theory applied to religion.)
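For what it is worth, the “.0004” in the quoted passage does check out as a binomial tail probability, i.e. the chance that at least 67 of 100 independent fair-coin beliefs turn out true:

```python
from math import comb

# P(X >= 67) for X ~ Binomial(100, 1/2): the probability that at least
# two-thirds of 100 independent 50-50 beliefs are true
tail = sum(comb(100, k) for k in range(67, 101)) / 2**100

print(tail)  # of the order of 10^-4, in line with the quoted .0004
```

So the arithmetic is fine; it is the premise that each belief is an independent fair coin that does all the rhetorical work.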

## snapshot from Québec (#2)

Posted in pictures, Travel on March 1, 2014 by xi'an