**T**oday, I read a newly arXived paper by Stephen Gratton on a method called GLASS, for *General Likelihood Approximate Solution Scheme*… The starting point is the same as with ABC or synthetic likelihood, namely a collection of summary statistics and an intractable likelihood. The author proposes to use as a substitute a maximum entropy solution based on these summary statistics and their assumed moments under the theoretical model. What is quite unclear in the paper is whether these assumed moments are available in closed form. If they are not, the method would appear as a variant of the synthetic likelihood [aka simulated moments] approach, in that the expectations of the summary statistics under the theoretical model, for a given value of the parameter, would have to be obtained through Monte Carlo approximations. (All the examples therein allow for closed-form expressions.)
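For intuition on what such a moment-constrained maximum entropy substitute could look like, here is a generic sketch (my own illustration, not Gratton's actual GLASS construction): given summary statistics s_k and their assumed moments m_k = E[s_k(X)], the maximum entropy density is the exponential family p(x) ∝ exp(Σ_k λ_k s_k(x)), with λ obtained by minimising the convex dual log Z(λ) − λ·m, here on a one-dimensional grid.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

def maxent_density(grid, stats, moments):
    """Max-ent density on a uniform grid, matching E[s_k(X)] = m_k.
    grid: (n,) support points; stats: list of callables s_k; moments: m_k."""
    S = np.column_stack([s(grid) for s in stats])      # (n, K) statistic values
    m = np.asarray(moments, dtype=float)
    dx = grid[1] - grid[0]

    def dual(lam):
        # log Z(lam) - lam.m, convex in lam; minimiser matches the moments
        return logsumexp(S @ lam) + np.log(dx) - lam @ m

    lam = minimize(dual, np.zeros(len(stats)), method="BFGS").x
    logw = S @ lam
    w = np.exp(logw - logw.max())                      # stable exponentiation
    return w / (w.sum() * dx)                          # normalised density

# Constraining the first two moments to (0, 1) recovers a standard normal:
grid = np.linspace(-6, 6, 2001)
p = maxent_density(grid, [lambda x: x, lambda x: x**2], [0.0, 1.0])
```

With only first- and second-moment constraints the solution is Gaussian, which is why a sanity check against N(0,1) is possible; richer summary statistics would give richer exponential families.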

## Archive for maximum entropy

## approximate likelihood

Posted in Books, Statistics with tags ABC, arXiv, likelihood-free methods, maximum entropy, synthetic likelihood, untractable normalizing constant on September 6, 2017 by xi'an

## Bayesian programming [book review]

Posted in Books, Kids, pictures, Statistics, University life with tags artificial intelligence, Bayesian inference, Bayesian programming, CHANCE, conjugate priors, E.T. Jaynes, graphical models, maximum entropy, Python, robots on March 3, 2014 by xi'an

“We now think the Bayesian Programming methodology and tools are reaching maturity. The goal of this book is to present them so that anyone is able to use them. We will, of course, continue to improve tools and develop new models. However, pursuing the idea that probability is an alternative to Boolean logic, we now have a new important research objective, which is to design specific hardware, inspired from biology, to build a Bayesian computer.” (p.xviii)

**O**n the plane to and from Montpellier, I took an extended look at Bayesian Programming, a CRC Press book recently written by Pierre Bessière, Emmanuel Mazer, Juan-Manuel Ahuactzin, and Kamel Mekhnacha. *(Very nice picture of a fishing net on the cover, by the way!)* Despite the initial excitement at seeing a book whose final goal is to achieve a Bayesian computer, as demonstrated by the above quote, I soon found the book too arid to read due to its highly formalised presentation… The contents are clear indications that the approach is useful, as they illustrate the use of Bayesian programming in different decision-making settings, including a collection of Python codes. So the book brings an answer to the *what*, but it somehow misses the *how*, in that the construction of the priors and the derivation of the posteriors are not explained in a way one could replicate.

“A modeling methodology is not sufficient to run Bayesian programs. We also require an efficient Bayesian inference engine to automate the probabilistic calculus. This assumes we have a collection of inference algorithms adapted and tuned to more or less specific models and a software architecture to combine them in a coherent and unique tool.” (p.9)

**F**or instance, all models therein are described via the curly brace formalism summarised by

which quickly turns into an unpalatable object, as in this example taken from the online PhD thesis of Gabriel Synnaeve (where he applied Bayesian programming principles to the real-time strategy game StarCraft and developed an AI (or bot), BroodwarBotQ, able to play it)

a thesis that I found most interesting!

“Consequently, we have 21 × 16 = 336 bell-shaped distributions and we have 2 × 21 × 16 = 672 free parameters: 336 means and 336 standard deviations.” (p.51)

**N**ow, getting back to the topic of the book, I can see connections with statistical problems and models, and not only via the application of Bayes’ theorem, when the purpose (or *Question*) is to take a decision, for instance in a robotic action. I still remain puzzled by the purpose of the book, since it starts with very low expectations on the reader, but then hurries past notions like Kalman filters and Metropolis-Hastings algorithms in a few paragraphs. I do not get some of the details, like this notion of a discretised Gaussian distribution. (I eventually found the place where the 672 prior parameters are “learned”, in a phase called “identification”.)
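For what it may be worth, here is one plausible reading of a “discretised Gaussian” (my guess, not necessarily the book's construction): evaluate the normal density at the cell centres of the discrete grid and renormalise into a proper probability table, one such table per mean/standard-deviation pair.

```python
import numpy as np

def discretised_gaussian(centres, mu, sigma):
    """Probability table over discrete cells, proportional to the
    normal density evaluated at the cell centres."""
    w = np.exp(-0.5 * ((centres - mu) / sigma) ** 2)
    return w / w.sum()   # renormalise: the table sums to one

# e.g. 16 discrete cells, echoing the 21 x 16 grid quoted above
centres = np.arange(16)
p = discretised_gaussian(centres, mu=7.5, sigma=2.0)
```

Each such table costs only a mean and a standard deviation, which is presumably why the parameter count in the quote above is twice the number of distributions.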

“Thanks to conditional independence the curse of dimensionality has been broken! What has been shown to be true here for the required memory space is also true for the complexity of inferences. Conditional independence is the principal tool to keep the calculation tractable. Tractability of Bayesian inference computation is of course a major concern as it has been proved NP-hard (Cooper, 1990).” (p.74)

**T**he final chapters (Chap. 14 on “Bayesian inference algorithms revisited”, Chap. 15 on “Bayesian learning revisited” and Chap. 16 on “Frequently asked questions and frequently argued matters” [!]) are definitely those I found easiest to read and relate to, with mentions made of conjugate priors and of the EM algorithm as a (Bayes) classifier. The final chapter mentions BUGS, Hugin and… Stan! Plus a sequence of 23 PhD theses defended on Bayesian programming for robotics in the past 20 years. And it explains the authors’ views on the difference between Bayesian programming and Bayesian networks (“any Bayesian network can be represented in the Bayesian programming formalism, but the opposite is not true”, p.316), between Bayesian programming and probabilistic programming (“we do not search to extend classical languages but rather to replace them by a new programming approach based on probability”, p.319), and between Bayesian programming and Bayesian modelling (“Bayesian programming goes one step further”, p.317), with a further (self-)justification of why the book sticks to discrete variables, and more philosophical sections referring to Jaynes and the principle of maximum entropy.

“The “objectivity” of the subjectivist approach then lies in the fact that two different subjects with same preliminary knowledge and same observations will inevitably reach the same conclusions.” (p.327)

Bayesian Programming thus provides a good snapshot of (or window on) what one can achieve in decision-making under uncertainty with Bayesian techniques. It shows a long-term reflection on those notions by Pierre Bessière, his colleagues and students. The topic is most likely too remote from my own interests for the above review to be complete. Therefore, if anyone is interested in reviewing this book any further for CHANCE, before I send the above to the journal, please contact me. (Usual provisions apply.)

## bioinformatics workshop at Pasteur

Posted in Books, Statistics, University life with tags bioinformatics, John Burdon Sanderson Haldane, MaxEnt, maximum entropy, protein folding on September 23, 2013 by xi'an

**O**nce again, I found myself attending lectures on a Monday! This time, it was at the Institut Pasteur (where I did not spot any mention of Alexandre Yersin), in the bioinformatics unit, around Bayesian methods in computational biology. The workshop was organised by Michael Nilges and the program started as follows:

9:10 AM Michael Habeck (MPI Göttingen) Bayesian methods for cryo-EM

9:50 AM John Chodera (Sloan-Kettering research institute) Toward Bayesian inference of conformational distributions, analysis of isothermal titration calorimetry experiments, and forcefield parameters

11:00 AM Jeff Hoch (University of Connecticut Health Center) Haldane, Bayes, and Reproducible Research: Bedrock Principles for the Era of Big Data

11:40 AM Martin Weigt (UPMC Paris) Direct-Coupling Analysis: From residue co-evolution to structure prediction

12:20 PM Riccardo Pellarin (UCSF) Modeling the structure of macromolecules using cross-linking data

2:20 PM Frederic Cazals (INRIA Sophia-Antipolis) Coarse-grain Modeling of Large Macro-Molecular Assemblies: Selected Challenges

3:00 PM Yannick Spill (Institut Pasteur) Bayesian Treatment of SAXS Data

3:30 PM Guillaume Bouvier (Institut Pasteur) Clustering protein conformations using Self-Organizing Maps

This is a highly interesting community, from which stemmed many of the MC and MCMC ideas, but I must admit I got lost (in translation) most of the time (and did not attend the workshop till its end), just like when I attended this workshop at the German synchrotron in Hamburg last Spring: some terms and concepts were familiar like Gibbs sampling, Hamiltonian MCMC, HMM modelling, EM steps, maximum entropy priors, reversible jump MCMC, &tc., but the talks were going too fast (for me) and focussed instead on the bio-chemical aspects, like protein folding, entropy-enthalpy, free energy, &tc. So the following comments mostly reflect my being alien to this community…

**F**or instance, I found the talk by John Chodera quite interesting (in a fast-forward high-energy/content manner), but the probabilistic modelling was mostly absent from his slides (and seemed to reduce to a Gaussian likelihood) and the defence of Bayesian statistics sounded a bit like a mantra at times (something like *“put a prior on everything you do not know and everything will end up fine with enough simulations”*), a feature I once observed in the past with Bayesian ideas coming to a new field (although this hardly seems to be the case here).

**A**ll talks I attended mentioned maximum entropy as a way of modelling, apparently a common tool in this domain (although there were too few details for me to tell). For instance, Jeff Hoch’s talk remained at a very general level, referring to a large literature (incl. David Donoho’s) for the advantages of using MaxEnt deconvolution to preserve sensitivity. (The “Haldane” part of his talk was about Haldane —who moved from UCL to the ISI in Calcutta— writing a parody on how to fake genetic data in a convincing manner. And showing the above picture.) Although he linked them with MaxEnt principles, Martin Weigt’s talk was about Markov random fields modelling contacts between amino acids in the protein, but I could not get how the selection among the huge number of possible models was handled: to me it seemed to amount to estimating a graphical model on the protein, as it also did for my neighbour. (No sign of any ABC processing in the picture.)

## MaxEnt 2013, Canberra, Dec. 15-20

Posted in Mountains, pictures, Statistics, Travel, University life with tags Australia, Canberra, conference, E.T. Jaynes, MaxEnt, maximum entropy, O'Bayes, Oxford (Mississipi) on July 3, 2013 by xi'an

**J**ust got this announcement that MaxEnt 2013, the 33rd of its kind, is taking place in Canberra, Australia, next December. (Which is winter here but summer there!) See the website for details, although they are not yet aplenty! I took part in MaxEnt 2009, in Oxford, Mississippi, but will not attend MaxEnt 2013 as it is (far away and) during O-Bayes 2013 at Duke…

## MaxEnt2009 impressions

Posted in Statistics, University life with tags Bayesian statistics, MaxEnt2009, maximum entropy, nested sampling, Valencia meeting on July 9, 2009 by xi'an

**A**s I am getting ready to leave Oxford and the **MaxEnt2009** meeting, a few quick impressions. First, this is a meeting like no other I have attended in that the mix of disciplines there is much wider and that I find myself at the very (statistical) end of the spectrum. There are researchers from astronomy, physics, chemistry, computer science, engineering, and hardly any from statistics. Second, the audience being of a decent (meaning small enough) size, the debates are numerous and often focus on the foundations of Bayesian statistics, a feature that has almost disappeared from the Valencia meetings. Some of the talks were actually mostly philosophical on the nature of deduction and inference, and I could not always see the point, but this is also enjoyable (once in a while). For instance, during the poster session, I had a long and lively discussion with David Blower on the construction of Jeffreys versus Haldane priors, as well as another long and lively discussion with Subhadeep Mukhopadhyay on fast exploration stochastic approximation algorithms. It was also an opportunity to reflect on nested sampling, given the surroundings and the free time, and I now perceive this technique primarily as a particular importance sampling method. So, overall, an enjoyable time! (Since **MaxEnt2010** will take place in Grenoble, I may even attend the next conference.)

## MaxEnt2009

Posted in Statistics with tags MaxEnt2009, maximum entropy, Monte Carlo, nested sampling, simulation on July 6, 2009 by xi'an

**T**he revision of my talk for the MaxEnt2009 conference is now available on slideshare.

**T**he major difference with the talks I gave in Montréal, Edinburgh, Warwick or Rimini is the additional experiment we ran with Darren Wraith on the banana targets already used in the “PMC for cosmologist” paper. As for the mixture benchmark used in the paper with Nicolas Chopin, we found that implementing nested sampling by the book, i.e. based on the lighthouse code provided in the original papers, led to the right value on average but with a lot more variability than for importance sampling solutions (with a comparable number of iterations).

**W**hen running this morning on the campus (i.e. wading through water particles without a snorkel), I thought—helped by comments from Olivier Cappé—that the best explanation for nested sampling is one of importance sampling, the weight of points being the prior mass of the current upper likelihood region. Obviously, this is not the whole story since those constrained priors have a *smaller* support than the posterior, but this may help in a better evaluation of nested sampling.
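To make this importance sampling reading concrete, here is a minimal toy nested sampler (my own sketch in the spirit of the lighthouse code, not that code itself): uniform prior on [-5, 5], Gaussian likelihood, and new points under the likelihood constraint drawn by naive rejection from the prior. The evidence accumulates as Z ≈ Σ L_i w_i, with w_i the prior mass shed between successive likelihood levels.

```python
import numpy as np

rng = np.random.default_rng(0)
logL = lambda x: -0.5 * x**2 - 0.5 * np.log(2 * np.pi)  # N(0,1) likelihood

N, steps = 400, 2000
live = rng.uniform(-5, 5, N)       # N live points drawn from the prior
X_prev, Z = 1.0, 0.0
for i in range(1, steps + 1):
    worst = np.argmin(logL(live))
    Lstar = np.exp(logL(live[worst]))
    X = np.exp(-i / N)             # deterministic prior-mass shrinkage
    Z += Lstar * (X_prev - X)      # weight = prior mass of the shed shell
    X_prev = X
    while True:                    # rejection: replace worst point above L*
        x = rng.uniform(-5, 5)
        if logL(x) > logL(live[worst]):
            live[worst] = x
            break
Z += np.exp(logL(live)).mean() * X_prev   # remaining live-point contribution
# True evidence here: the N(0,1) mass on [-5,5] times the prior density 1/10,
# i.e. very close to 0.1.
```

The key line is the weight update: each discarded point contributes its likelihood times the prior mass of the shell between consecutive constrained-prior supports, which is exactly the importance-sampling-style decomposition of the evidence mentioned above.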

## La partenza per MaxEnt 2009

Posted in Books, Statistics, Travel with tags Biometrika, Faulkner, MaxEnt2009, maximum entropy, nested sampling on July 4, 2009 by xi'an

**J**ust to let this indefinable perfume of Italy linger for a few more posts, but there is no objective reason to switch to Italian… I am actually off to Oxford, Mississippi, for the MaxEnt 2009 meeting, which is the 29th workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering. Despite the strong connections with my research interests, I have never attended one of those conferences and this sounds exciting. Especially given my talk there on computational methods, which covers in particular nested sampling with the critical views summarised in this post. This promises interesting exchanges, since the inventor of nested sampling, John Skilling, is one of the founding fathers of the maximum entropy community.

**T**his will also give me the opportunity to go over the new revision of the paper before sending it back to *Biometrika*. The reviews were so lukewarm that I was ready to pack up and go. Fortunately, Nicolas Chopin overrode my depressive tendencies and launched into the revision! All I have to do in the coming days is thus to compile once again the seven pages of the *Biometrika* style manual to make sure we comply with every item of it. (The editorial requests of *Biometrika* go beyond most other journals’, even *PNAS*, which is a pain when writing a paper, because the time spent complying with the stylistic restrictions will be wasted if the paper is rejected!, but which is also most enjoyable at the reader’s level.)

**O**xford is the home town of William Faulkner and, as such, enjoys a literary atmosphere (whatever that means) with several genuine bookstores, including Square Books. Given that I have never been in the South (Florida being a separate geographical entity!), this is also an interesting opportunity. Even though I am not looking forward to the hot, clammy, humid weather (humidity is currently 78% at 11pm…). And considering that MaxEnt 2003 was in Jackson Hole, Wyoming…