## reactionaries behind wheels

Posted in pictures, Running on November 18, 2018 by xi'an

France was hit by hundreds of blockades yesterday, sometimes with dramatic consequences, as a reaction to the planned ecological tax on gas announced by the French government. As on every occasion when French drivers are impacted by new laws or taxes, from reducing the legal speed limit, to installing new radars, to tolls for trucks, they react like clockwork, blocking streets and highways, often with success in the end. As in the previous “bonnets rouges” movement (making me wonder why these actions are always connected with clothes!). While being highly privileged in being able to bike to work (or to use the local trains, when they run) and to shop locally, I am struck by the double myopia of the protesters: the myopia of not seeing the larger picture, namely the urgent need to cut the addiction to cars despite its obvious negative consequences in the short term, and the myopia of seeing these protests as “spontaneous” and “politically neutral” despite their immediate recuperation by the fringe political parties. I thus hope the French government will hold on to that measure (despite its poor record so far in terms of ecological policy).

## Nature snapshots

Posted in Books, pictures, University life on April 2, 2018 by xi'an

In the 15 March issue of Nature, a rather puzzling article on altruism that seems to rely on Hamilton's rule:

linking fitness, number of offspring, and benefits through kth-order moments, in a stochastic environment represented by a distribution π. Hard to fathom what comes from the data and what follows from this (hypothetical) model. Plus a proposal for geoengineering to delay the meltdown of some Greenland and Antarctica glaciers. Scary. And a film review of Iceman (Der Mann aus dem Eis), retracing Ötzi’s life in an archaeologically acceptable rendering. Including a reconstituted local language, Rhaetic.
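For reference (my recap of the textbook version, not the paper's notation), the classical deterministic Hamilton rule states that an altruistic trait is favoured when $r\,b > c$, where $r$ is the relatedness between actor and recipient, $b$ the fitness benefit to the recipient, and $c$ the fitness cost to the actor; the article apparently generalises these mean-fitness terms to kth-order moments of the offspring distribution under the environment distribution π.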

## can we trust computer simulations? [day #2]

Posted in Books, pictures, Statistics, Travel, University life on July 13, 2015 by xi'an

“Sometimes the models are better than the data.” G. Krinner

Second day at the conference on building trust in computer simulations. Starting with a highly debated issue, climate change projections, since so many criticisms address climate models as being not only wrong but also unverifiable. And uncheckable. As explained by Gerhard Krinner, the IPCC has developed methodologies to compare models and evaluate predictions. However, from what I understood, this validation does not say anything about the future, which is the part of the predictions that matters. And that is attacked by critics and feeds climate skeptics, because it is so easy to argue against the homogeneity of the climate evolution and for “what you’ve seen is not what you’ll get”! (Even though climate skeptics are the least likely to use this time-heterogeneity argument, being convinced as they are of the lack of human impact on the climate.)

The second talk was by Viktoria Radchuk, about validation in ecology, defined here as a test of predictions against independent data (and designs). She mentioned Simon Wood’s synthetic likelihood as the Bayesian reference for conducting model choice (as a synthetic likelihood ratio). I had never thought of this use (found in Wood’s original paper) for synthetic likelihood, and I feel a bit queasy about using a synthetic likelihood ratio as a genuine likelihood ratio. Which led to a lively discussion at the end of her talk.

The next talk was about validation in economics by Matteo Richiardi, who discussed state-space models where the hidden state is observed through a summary statistic, a perfect playground for ABC! But Matteo opted instead for a non-parametric approach that seems to increase imprecision and that I have never seen used in state-space models. The last part of the talk was about non-ergodic models, for which checking for validity becomes much more problematic, in my opinion. Unless one manages multiple observations of the non-ergodic path.
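As a minimal sketch of Wood's synthetic-likelihood idea (my own Python illustration, not Radchuk's or Wood's code, with a toy Gaussian model): simulate summary statistics under each model, fit a multivariate Gaussian to them, and compare the resulting synthetic log-likelihoods at the observed summary.

```python
import numpy as np

def synthetic_log_lik(theta, simulate, s_obs, n_sims=500, seed=None):
    """Wood-style synthetic log-likelihood: simulate summary statistics
    under theta, fit a multivariate Gaussian to them, and evaluate its
    log-density at the observed summary s_obs."""
    rng = np.random.default_rng(seed)
    sims = np.array([simulate(theta, rng) for _ in range(n_sims)])
    mu = sims.mean(axis=0)
    cov = np.cov(sims, rowvar=False) + 1e-9 * np.eye(sims.shape[1])
    diff = s_obs - mu
    _, logdet = np.linalg.slogdet(cov)
    d = sims.shape[1]
    return -0.5 * (d * np.log(2 * np.pi) + logdet
                   + diff @ np.linalg.solve(cov, diff))

# toy model choice: which mean fits the observed summaries better?
rng = np.random.default_rng(0)
obs = rng.normal(0.0, 1.0, size=200)
s_obs = np.array([obs.mean(), obs.std()])

def simulate(theta, rng):
    # summaries of a sample simulated under theta = (mean, sd)
    y = rng.normal(theta[0], theta[1], size=200)
    return np.array([y.mean(), y.std()])

ll_a = synthetic_log_lik((0.0, 1.0), simulate, s_obs, seed=1)
ll_b = synthetic_log_lik((2.0, 1.0), simulate, s_obs, seed=1)
log_ratio = ll_a - ll_b  # favours the model centred at 0
```

Treating `log_ratio` as a genuine log-likelihood ratio is precisely the step that makes me queasy, since the Gaussian approximation to the summaries is itself a modelling choice.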
Nicole Saam concluded this “Validation in…” morning with validation in sociology. With a more pessimistic take on the possibility of finding a falsifying strategy, because of the vague nature of sociological models, for which data can never be fully informative. She illustrated the issue with an EU negotiation analysis, where most hypotheses could hardly be tested.

“Bayesians persist with poor examples of randomness.” L. Smith

“Bayesians can be extremely reasonable.” L. Smith

The afternoon session was dedicated to methodology, mostly statistics! Andrew Robinson started with a talk on (frequentist) model validation, dubbed “splitters and lumpers”, illustrated by a forest growth model. He went through traditional hypothesis tests, like Neyman-Pearson’s, that try to split between samples, and (bio)equivalence tests that take the difference as the null, using his equivalence R package.

Then Leonard Smith took over [in a literal way!] from a sort-of-Bayesian perspective, in joint work with Jim Berger and Gary Rosner on pragmatic Bayes, which was mostly negative about Bayesian modelling. Introducing (to me) the compelling notion of structural model error as a representation of the inadequacy of the model. With illustrations from weather and climate models. His criticism of the Bayesian approach is that it cannot be holistic while pretending to be [my wording], and that it is inadequate for measuring model inadequacy, to the point of making prior choice meaningless. Funny enough, he went back to the ball-dropping experiment David Higdon discussed at one JSM I attended a while ago, with the unexpected outcome that one ball did not make it to the bottom of the shaft. A more positive side was that posteriors are useful models but should not be interpreted from a probabilistic perspective. Move beyond probability was his final message. (For most of the talk, I misunderstood P(BS), the probability of a big surprise, for something else…) This was certainly the most provocative talk of the conference and the discussion could have gone on for the rest of the day! Somehow, Lenny was deliberately provocative in piling the responsibility upon the Bayesian’s head for being overconfident and not accounting for the physicists’ limitations in modelling the phenomenon of interest.

The next talk was by Edward Dougherty, on methods used in biology. He separated within-model uncertainty from outside-model inadequacy. The within-model part is mostly easy to agree upon, even though difficulties in estimating parameters create uncertainty classes of models, especially in such a small-data discipline. He argued that machine learning techniques like classification are useless without prior knowledge, and in favour of the Bayesian minimum mean square error estimator, which can also lead to a classifier, and of experimental design. (Using MSE seems rather reductive when facing large-dimensional parameters.)

Last talk of the day was by Nicolas Becu, a geographer, with a surprising approach to validation via stakeholders. A priori not too enticing a name! The discussion was of a more philosophical nature, going back to (re)defining validation against reality and imperfect models, and including social aspects of validation, e.g., reality being socially constructed. This led to the stakeholders, because a model is then a shared representation. Nicolas illustrated the construction by simulation “games” of a collective model in a community of Thai farmers and in a group of water users.
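The (bio)equivalence tests from Andrew Robinson's talk, which take a meaningful difference as the null, can be sketched as a two-one-sided-tests (TOST) procedure. This is my own large-sample z approximation in Python, not his equivalence R package:

```python
import numpy as np
from statistics import NormalDist

def tost_equivalence(x, y, margin, alpha=0.05):
    """Two one-sided tests (TOST): declare the means of x and y equivalent
    if their difference is significantly inside (-margin, +margin).
    Large-sample z approximation; a t-based version would use Student
    quantiles instead."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    diff = x.mean() - y.mean()
    se = np.sqrt(x.var(ddof=1) / len(x) + y.var(ddof=1) / len(y))
    z_lower = (diff + margin) / se   # tests H0: diff <= -margin
    z_upper = (diff - margin) / se   # tests H0: diff >= +margin
    p_lower = 1 - NormalDist().cdf(z_lower)
    p_upper = NormalDist().cdf(z_upper)
    p = max(p_lower, p_upper)        # equivalence requires both rejections
    return p, p < alpha

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 500)
y = rng.normal(0.0, 1.0, 500)
p_same, eq_same = tost_equivalence(x, y, margin=0.5)
p_far, eq_far = tost_equivalence(x, y + 2.0, margin=0.5)
```

Note the reversal from Neyman-Pearson testing: here failing to reject means one cannot claim equivalence, which is the point of taking the difference as the null.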

In a rather unique fashion, we also had an evening discussion on points we share and points we disagree upon. After dinner (and wine), which did not help, I fear! Bill Oberkampf mentioned the use of manufactured solutions to check code, which seemed very much related to physics. But then we got mired in the necessity of dividing between verification and validation. Which sounded overly engineering-like to me. Maybe because I do not usually integrate coding errors and algorithmic errors into my reasoning (verification)… Although sharing code and making it available makes a big difference. Or maybe because considering that all models are wrong is not part of my methodology either (validation). This part ended in a fairly pessimistic conclusion on the lack of trust in most published articles. At least in the biological sciences.

## 10w2170, Banff

Posted in Books, Mountains, R, Statistics on September 11, 2010 by xi'an

Yesterday night, we started the Hierarchical Bayesian Methods in Ecology workshop by trading stories. Everyone involved in the programme discussed his/her favourite dataset and corresponding expectations from the course. I found the exchange most interesting, like the one we had two years ago in Gran Paradiso, because of the diversity of approaches to Statistics reflected by the exposition. However, a constant theme is the desire to compare and rank models (this term having different meanings for different students) and the understanding that hierarchical models are a superior way to handle heterogeneity and to gather strength from the whole dataset. A two-day workshop is certainly too short to meet students’ expectations and I hope I will manage to focus on the concepts rather than on the maths and computations…

As every time I come here, the efficiency of BIRS in handling the workshop and making everything run smoothly amazes me. Except for the library, I think it really compares with Oberwolfach in terms of environment and working facilities. (Oberwolfach offers the appeal of seclusion and the Black Forest, while BIRS provides summits all around, plus the range of facilities of the Banff Centre and the occasional excitement of a bear crossing the campus or a cougar killing a deer on its outskirts…)

## Off to Banff!!

Posted in Books, Mountains, R, Statistics, Travel, University life on September 10, 2010 by xi'an

Today I am travelling from Paris to Banff, via Amsterdam and Calgary, to take part in the Hierarchical Bayesian Methods in Ecology two-day workshop organised at BIRS by Devin Goodsman (University of Alberta), François Teste (University of Alberta), and myself. I am very excited both by the opportunity to meet young researchers in ecology and forestry, and by the prospect of spending a few days in the Rockies, hopefully with an opportunity to go hiking, scrambling and even climbing. (Plus the purely coincidental crossing with Julien‘s trip in this area!) The slides will mostly follow those of the course I gave in Aosta, while using Introducing Monte Carlo Methods with R for the R practicals.

Posted in Books, Statistics on September 10, 2009 by xi'an

I really like the models derived from capture-recapture experiments, because they encompass latent variables, hidden Markov processes, Gibbs simulation, EM estimation, and hierarchical models in a simple setup with a nice side story to motivate it (at least in ecology; in the social sciences, those models are rather associated with sad stories, like counting the homeless, heroin addicts, or prostitutes…) I was thus quite surprised to hear from many that the capture-recapture chapter in Bayesian Core was hard to understand. In a sense, I find it easier than the mixture chapter because the data is discrete and everything can [almost!] be done by hand…

Today I received an email from Cristiano about a typo in The Bayesian Choice concerning capture-recapture models:

“I’ve read the paragraph (4.3.3) in your book and I have some doubts about the proposed formula in example 4.3.3. My guess is that a typo is here, where (n-n_1) instead of n_2 should appear in the hypergeometric distribution.”

It is indeed the case! This mistake has survived the many revisions and reprints of the book and is also found in the French translation Le Choix Bayésien, in Example 4.19… In both cases, ${n_2 \choose n_2-n_{11}}$ should be ${n-n_1 \choose n_2-n_{11}}$, shame on me! (The mistake does not appear in Bayesian Core.)
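To make the correction concrete, here is a hedged Python sketch (function name and numbers mine) of the two-stage capture-recapture likelihood with the corrected ${n-n_1 \choose n_2-n_{11}}$ term:

```python
from math import comb

def capture_recapture_lik(N, n1, n2, n11):
    """Hypergeometric likelihood of recapturing n11 marked individuals:
    n1 marked in the first stage, n2 caught in the second stage, out of
    a closed population of size N. The second binomial coefficient is
    the corrected choose(N - n1, n2 - n11), not choose(n2, n2 - n11)."""
    if n11 > min(n1, n2) or n2 - n11 > N - n1:
        return 0.0
    return comb(n1, n11) * comb(N - n1, n2 - n11) / comb(N, n2)

# the likelihood peaks near the Lincoln-Petersen estimate n1 * n2 / n11
liks = {N: capture_recapture_lik(N, 30, 20, 6) for N in range(44, 301)}
N_hat = max(liks, key=liks.get)  # close to 600 / 6 = 100
```

With the typo'd ${n_2 \choose n_2 - n_{11}}$ coefficient in place of the second factor, the expression no longer sums to one over $n_{11}$, so it is not a proper distribution.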

to which I can only suggest incorporating the error-in-variable structure, i.e., the possible confusion in identifying individuals, within the model and running a Gibbs sampler that simulates iteratively the latent variables “true numbers of individuals in captures 1 and 2” and the parameters given those latent variables. This problem of counting the same individual twice or more has obvious applications in ecology, when animals are only identified by watchers, as in whale sightings, and in the social sciences, when individuals lack identification. [To answer specifically the overestimation question, this is clearly the case, since $n_1$ and $n_2$ are larger than in truth, while $n_{11}$ presumably remains the same….]
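A minimal sketch of that suggestion, under an entirely hypothetical misidentification model of my own (each truly captured animal is double-counted at most once, with unknown probability eps, so that the observed $n_i = m_i + \text{Binomial}(m_i,\epsilon)$, with flat priors on the true counts and on $N$):

```python
import numpy as np
from math import lgamma, log, log1p

def logcomb(n, k):
    """log of the binomial coefficient (n choose k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def gibbs_double_counts(n1, n2, n11, n_iter=500, a=1.0, b=9.0,
                        N_max=800, seed=0):
    """Toy Gibbs sampler: alternately sample the numbers of double-counts
    e_i (so the true capture sizes are m_i = n_i - e_i), the double-count
    probability eps (Beta(a, b) prior), and the population size N (uniform
    prior on a grid, hypergeometric likelihood)."""
    rng = np.random.default_rng(seed)
    eps = a / (a + b)
    draws = []
    for _ in range(n_iter):
        es = []
        for n in (n1, n2):
            # enumerate e | n, eps under a flat prior on the true count m,
            # keeping m = n - e >= n11 so the recaptures stay feasible
            support = np.arange(0, min(n // 2, n - n11) + 1)
            logw = np.array([logcomb(n - e, e) + e * log(eps)
                             + (n - 2 * e) * log1p(-eps) for e in support])
            w = np.exp(logw - logw.max())
            es.append(rng.choice(support, p=w / w.sum()))
        e1, e2 = es
        m1, m2 = n1 - e1, n2 - e2
        # conjugate Beta update for the double-counting probability
        eps = rng.beta(a + e1 + e2, b + (m1 - e1) + (m2 - e2))
        # sample N from its grid posterior given the corrected counts
        grid = np.arange(m1 + m2 - n11, N_max + 1)
        loglik = np.array([logcomb(m1, n11) + logcomb(N - m1, m2 - n11)
                           - logcomb(N, m2) for N in grid])
        lw = np.exp(loglik - loglik.max())
        draws.append(rng.choice(grid, p=lw / lw.sum()))
    return np.array(draws), eps
```

Since the corrected counts satisfy $m_i \le n_i$ while $n_{11}$ is unchanged, the resulting posterior on $N$ is shifted down relative to one based on the raw counts, which is precisely the overestimation point above.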