MCMSki [day 2]

Posted in Mountains, pictures, Statistics, University life on January 8, 2014 by xi'an

I was still feeling poorly this morning, with my brain in a kind of flu-induced haze, so I could not concentrate for a whole talk, which is a shame as I missed most of the contents of the astrostatistics session put together by David van Dyk… Especially the talk by Roberto Trotta I was definitely looking forward to. And the defence of nested sampling strategies for marginal likelihood approximations. Even though I spotted posterior distributions for WMAP and Planck data on the ΛCDM that reminded me of our own work in this area… Apologies thus to all speakers for dozing in and out, it was certainly not due to a lack of interest!

Sebastian Seehars mentioned emcee (for ensemble Monte Carlo), with a corresponding software nicknamed “the MCMC hammer”, and their own CosmoHammer software. I read the paper by Goodman and Weare (2010) this afternoon during the ski break (if not on a ski lift!). Actually, I do not understand why an MCMC sampler should be affine invariant: a good adaptive MCMC sampler should pick up the right scale of the target distribution anyway. Other than that, the ensemble sampler reminds me very much of the pinball sampler we developed with Kerrie Mengersen (1995 Valencia meeting), where the target is the product of L targets,

$\pi(x_1)\cdots\pi(x_L)$

and a Gibbs-like sampler can be constructed, moving one component (with index k, say) of the L-sample at a time. (Just as in the pinball sampler.) Rather than avoiding all other components (as in the pinball sampler), Goodman and Weare draw a single other component at random (with index j, say) and make a proposal away from it:

$\eta=x_j(t) + \zeta \{x_k(t)-x_j(t)\}$

where ζ is a scale random variable with (log-) symmetry around 1. The authors claim an improvement over a single-track Metropolis algorithm, but this of course depends on the type of Metropolis algorithm that is chosen… Overall, I think the criticism of the pinball sampler also applies here: using a product of targets can only slow down the convergence. Further, the affine structure of the target support is not a given. Highly constrained settings should not cope well with linear transforms, and non-linear reparameterisations would be more efficient…
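To make the move concrete, here is a minimal Python sketch of one sweep of the stretch move, written from the description above rather than taken from emcee or CosmoHammer. The function names, the standard normal `log_target` and the tuning constant `a=2.0` are illustrative assumptions. I also assume the usual choice of a density proportional to 1/√ζ on [1/a, a] for ζ, together with the factor ζ^(d-1) in the acceptance ratio, d being the dimension of each component.

```python
import numpy as np

def log_target(x):
    # Hypothetical target for illustration: standard normal in any dimension.
    return -0.5 * np.sum(x ** 2)

def draw_zeta(rng, a=2.0):
    # Inverse-cdf draw from g(z) proportional to 1/sqrt(z) on [1/a, a],
    # which satisfies g(1/z) = z g(z) (the "log-symmetry" around 1).
    return ((a - 1.0) * rng.random() + 1.0) ** 2 / a

def stretch_sweep(walkers, log_p, rng, a=2.0):
    # One Gibbs-like sweep: each component x_k is updated in turn,
    # given the current values of the other components.
    walkers = walkers.copy()
    L, d = walkers.shape
    logp = np.array([log_p(w) for w in walkers])
    for k in range(L):
        # Pick one other component j != k uniformly at random.
        j = rng.integers(L - 1)
        if j >= k:
            j += 1
        zeta = draw_zeta(rng, a)
        # Proposal on the line through x_j and x_k.
        eta = walkers[j] + zeta * (walkers[k] - walkers[j])
        logp_eta = log_p(eta)
        # Acceptance ratio zeta^(d-1) * pi(eta) / pi(x_k), on the log scale.
        log_ratio = (d - 1) * np.log(zeta) + logp_eta - logp[k]
        if np.log(rng.random()) < log_ratio:
            walkers[k] = eta
            logp[k] = logp_eta
    return walkers
```

Calling `stretch_sweep(walkers, log_target, np.random.default_rng(0))` repeatedly on an array of shape (L, d) produces the ensemble chain. Note that nothing in the move involves a scale parameter for the target, which is where the affine invariance comes from.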

big bang/data/computers

Posted in Running, Statistics, University life on September 21, 2012 by xi'an

I missed this astrostatistics conference announcement (and the conference itself, obviously!), occurring next door… Actually, I would have had (wee) trouble getting there as I was (and am) mostly stuck at home with a bruised knee and a doctor's ban on any exercise in the coming days, thanks to a bike fall last Monday! (One of my 1991 bike pedals broke as I was climbing a steep slope and I did not react fast enough… Just at the right time to ruin my training for the Argentan half-marathon. Again.) Too bad because there were a lot of talks that were of interest to me!

Kant, Platon, Bayes, & Le Monde…

Posted in Books, Statistics, University life on July 2, 2012 by xi'an

In the weekend edition of Le Monde I bought when getting out of my plane back from Osaka, and ISBA 2012!, the science leaflet has a (weekly) tribune by a physicist called Marco Zito that discussed this time the differences between frequentist and Bayesian confidence intervals. While it is nice to see this opposition debated in a general audience daily like Le Monde, I am not sure the tribune will bring enough light to help the newcomer reach an opinion about the difference! (The previous tribune considering Bayesian statistics was certainly more to my taste!)

Since I cannot find a link to the paper, let me sum up: the core of the tribune is to wonder what the 90% in a 90% confidence interval means. The Bayesian version sounds ridiculous since “there is a single true value of [the parameter] M and it is either in the interval or not” [my translation]. The physicist then goes on to state that the probability is in fact “subjective. It measures the degree of conviction of the scientists, given the data, for M to be in the interval. If those scientists were aware of another measure, they would use another interval” [my translation]. Darn… so many misrepresentations in so few words! First, as a Bayesian, I most often consider there is a true value for the parameter associated with a dataset but I still use a prior and a posterior that are not point masses, without being incoherent, simply because the posterior only summarizes what I know about the parameter, but is obviously not a property of the true parameter. Second, the fact that the interval changes with the measure has nothing to do with being Bayesian. A frequentist would also change her/his interval with other measures… Third, the Bayesian “confidence” interval is but a tiny (and reductive) part of the inference one can draw from the posterior distribution.

From this delicate start, things do not improve in the tribune: the frequentist approach is objective and not contested by Marco Zito, as it sounds eminently logical. Kant is associated with Bayes and Platon with the frequentist approach, “religious wars” are mentioned about both perspectives debating endlessly about the validity of their interpretation (is this truly the case? In the few cosmology papers I modestly contributed to, referees’ reports never objected to the Bayesian approach…) The conclusion makes one wonder what the overall point of this tribune is: superficial philosophy (“the debate keeps going on and this makes sense since it deals with the very nature of research: can we know and speak of the world per se or is it forever hidden to us? (…) This is why doubt and even distrust apply about every scientific result and also in other settings.”) or criticism of statistics (“science (or art) of interpreting results from an experiment”)? (And to preempt a foreseeable question: no, I am not writing to the journal this time!)
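As a toy illustration of the second objection above (that an interval changing with a new measurement is in no way specific to the Bayesian approach), here is a small Python sketch. Everything in it is an assumption made for the sake of the example and not taken from the tribune: a normal model with known standard deviation `sigma`, a conjugate normal prior on M, made-up measurements, and the two hypothetical helper functions.

```python
from math import sqrt
from statistics import NormalDist, fmean

def frequentist_interval(data, sigma, level=0.90):
    # Classical interval for a normal mean with known sigma:
    # sample mean +/- z * sigma / sqrt(n).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * sigma / sqrt(len(data))
    m = fmean(data)
    return (m - half, m + half)

def bayesian_interval(data, sigma, prior_mean, prior_sd, level=0.90):
    # Equal-tailed credible interval under a conjugate normal prior on M.
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = len(data) / sigma ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * fmean(data))
    post = NormalDist(post_mean, sqrt(post_var))
    return (post.inv_cdf(0.5 - level / 2), post.inv_cdf(0.5 + level / 2))

# Made-up measurements, then the same ones plus an extra measure.
data = [10.2, 9.7, 10.5, 9.9]
more = data + [11.0]
for label, sample in (("4 measures", data), ("5 measures", more)):
    print(label,
          frequentist_interval(sample, sigma=0.5),
          bayesian_interval(sample, sigma=0.5, prior_mean=0.0, prior_sd=1e3))
```

Both intervals move once the fifth measure is added, and with such a diffuse prior the two nearly coincide numerically, even though their interpretations differ.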

550 billion particles

Posted in Books, Kids, pictures, Statistics, Travel, University life on April 22, 2012 by xi'an

“Space,” it says, “is big. Really big. You just won’t believe how vastly, hugely, mindbogglingly big it is. I mean, you may think it’s a long way down the road to the chemist’s, but that’s just peanuts to space, listen…” The Hitchhiker’s Guide to the Galaxy, Douglas Adams

“There is a theory which states that if ever anyone discovers exactly what the Universe is for and why it is here, it will instantly disappear and be replaced by something even more bizarre and inexplicable. There is another theory which states that this has already happened.” The Hitchhiker’s Guide to the Galaxy, Douglas Adams

Following a link on Science Daily when looking at this 64 kcal mystery, I found an interesting announcement about the most complete simulation of the evolution of the Universe from the Big Bang till now. The cosmology research unit in charge of the project is furthermore called DEUS (for Dark Energy Universe Simulation!), mostly located at Université Paris-Diderot, and its “goal is to investigate the imprints of dark energy on cosmic structure formation through high-performance numerical simulations”. It just announced the “simulation of the full observable universe for the concordance ΛCDM model”, which allows for the comparison of several cosmological models. (Data is freely available.) Besides the sheer scientific appeal of the project, the simulation side is also fascinating, although quite remote from Monte Carlo principles, in that the approach relies on very few repetitions of the simulation. The statistics are based on a single simulation, for a completely observed (simulated) Universe.

“If life is going to exist in a Universe of this size, then the one thing it cannot afford to have is a sense of proportion…” The Hitchhiker’s Guide to the Galaxy, Douglas Adams

The amounts involved in this simulation are simply mindboggling: 92 000 CPUs, 150 PBytes of data, 2 (U.S.) quadrillion flops (2 PFlop/s), the equivalent of 30 million computing hours, each particle has the size of the Milky Way, and so on… Here is a videoed description of the project (make sure to turn the sound off if, like me, you simply and definitely hate Strauss’ music, and even if you like it, since the pictures do not move at the same pace as the music!):

Cosmology meets machine learning

Posted in Statistics, Travel, University life on April 6, 2011 by xi'an

There is a workshop on cosmology and machine learning at UCL on May 3-4, i.e. just before the ABC in London workshop at Imperial! I wish I had heard about it earlier so as to plan my trip accordingly… There is even a talk on ABC in cosmology by Manfred Opper! The next day (during the ABC in London workshop), there is an update about the GREAT10 data challenge.