bioinformatics workshop at Pasteur

Once again, I found myself attending lectures on a Monday! This time, it was at the Institut Pasteur (where I did not spot any mention of Alexandre Yersin), in the bioinformatics unit, for a workshop on Bayesian methods in computational biology. The workshop was organised by Michael Nilges and the program started as follows:

9:10 AM Michael Habeck (MPI Göttingen) Bayesian methods for cryo-EM
9:50 AM John Chodera (Sloan-Kettering research institute) Toward Bayesian inference of conformational distributions, analysis of isothermal titration calorimetry experiments, and forcefield parameters
11:00 AM Jeff Hoch (University of Connecticut Health Center) Haldane, Bayes, and Reproducible Research: Bedrock Principles for the Era of Big Data
11:40 AM Martin Weigt (UPMC Paris) Direct-Coupling Analysis: From residue co-evolution to structure prediction
12:20 PM Riccardo Pellarin (UCSF) Modeling the structure of macromolecules using cross-linking data
2:20 PM Frederic Cazals (INRIA Sophia-Antipolis) Coarse-grain Modeling of Large Macro-Molecular Assemblies: Selected Challenges
3:00 PM Yannick Spill (Institut Pasteur) Bayesian Treatment of SAXS Data
3:30 PM Guillaume Bouvier (Institut Pasteur) Clustering protein conformations using Self-Organizing Maps

This is a highly interesting community, from which many of the MC and MCMC ideas stemmed, but I must admit I got lost (in translation) most of the time (and did not stay until the end of the workshop), just as I did when attending a workshop at the German synchrotron in Hamburg last Spring. Some terms and concepts were familiar, like Gibbs sampling, Hamiltonian MCMC, HMM modelling, EM steps, maximum entropy priors, reversible jump MCMC, etc., but the talks were going too fast (for me) and focussed instead on the bio-chemical aspects, like protein folding, entropy-enthalpy, free energy, etc. So the following comments mostly reflect my being alien to this community…

For instance, I found the talk by John Chodera quite interesting (in a fast-forward, high-energy/content manner), but the probabilistic modelling was mostly absent from his slides (and seemed to reduce to a Gaussian likelihood), and the defence of Bayesian statistics sounded a bit like a mantra at times (something like “put a prior on everything you do not know and everything will end up fine with enough simulations”), a feature I have observed in the past when Bayesian ideas reach a new field (although this hardly seems to be the case here).

All the talks I attended mentioned maximum entropy as a way of modelling, apparently a common tool in this domain (as there were too few details for me to tell). For instance, Jeff Hoch’s talk remained at a very general level, referring to a large literature (incl. David Donoho’s) for the advantages of using MaxEnt deconvolution to preserve sensitivity. (The “Haldane” part of his talk was about Haldane, who moved from UCL to the ISI in Calcutta, writing a parody on how to fake genetic data in a convincing manner. And showing the above picture.) Martin Weigt’s talk, although he linked it with MaxEnt principles, was about Markov random fields modelling contacts between amino acids in the protein, but I could not get how the selection among the huge number of possible models was handled: to me, as to my neighbour, it seemed to amount to estimating a graphical model on the protein. (No sign of any ABC processing in the picture.)
