## absurdum technicae

Posted in Kids, Wines on February 14, 2015 by xi'an

In what could have been the most expensive raclette ever, I almost got rid of my oven! Last weekend, to fight the ongoing cold wave, we decided to have a raclette with mountain cheese and potatoes, but the raclette machine (essentially a heating element to melt the cheese) had an electrical issue and kept tripping the circuit breaker. We then decided to use the oven to melt the cheese but, while giving every sign of working, it would not heat. Rather than eat a cold raclette, we managed with the microwave (!), but I thought the oven had blown as well. The next morning, I nonetheless checked on the web for similar accidents and found the explanation: by pressing the proper combination of buttons, we had managed to switch the oven into demo mode, used by shops to run the oven without heating. The insane part of this little [very little] story is that nowhere in the manual was there any indication that a demo mode existed, or of a way of getting back to normal! After pushing combinations of buttons at random, I eventually found the solution and the oven is working again, instead of standing in the recycling bin.

## brief stop in Edinburgh

Posted in Mountains, pictures, Statistics, Travel, University life, Wines on January 24, 2015 by xi'an

Yesterday, I was all too briefly in Edinburgh for a few hours, to give a seminar in the School of Mathematics on the random forests approach to ABC model choice (that was earlier rejected). (The slides are almost surely identical to those used at the NIPS workshop.) One interesting question at the end of the talk was on a potential bias in the posterior predictive expected loss, a bias against some models in the collection being evaluated for selection, in the sense that the array of summaries used by the random forest could fail to capture features of a particular model and hence discriminate against it. While this is correct, there is no fundamental difference from implementing a posterior probability based on the same summaries. And the posterior predictive expected loss offers the advantage of testing: that is, for representative simulations from each model, it returns the corresponding model prediction error, highlighting poor performance on some models. A further discussion over tea led me to ponder whether or not we could expand the use of random forests to Bayesian quantile regression. However, this would imply a monotonicity structure on a collection of random forests, which sounds daunting…

My stay in Edinburgh was quite brief as I drove to the Highlands after the seminar, heading to Fort William. Although the weather was rather ghastly, the traffic was fairly light and I managed to get there unscathed, without hitting any of the deer of Rannoch Moor (I saw one dead by the side of the road though…) or the snow banks of the narrow roads along Loch Lubnaig. And, as usual, it was still a pleasant feeling to drive through those places associated with climbs and hikes: Crianlarich, Tyndrum, Bridge of Orchy, and Glencoe. And to get into town early enough to enjoy a quick dinner at The Grog & Gruel, reflecting that I must have had half a dozen dinners there with friends (or not) over the years. And drinking a great heather ale to them!

## Sequential Monte Carlo 2015 workshop

Posted in pictures, R, Statistics, Travel, University life, Wines on January 22, 2015 by xi'an

An announcement for the SMC 2015 workshop:
Sequential Monte Carlo methods (also known as particle filters) have revolutionized the on-line and off-line analysis of data in fields as diverse as target tracking, computer vision, financial modelling, brain imagery, or population ecology. Their popularity stems from the fact that they have made it possible to solve numerically many complex problems that were previously intractable.
The aim of the SMC 2015 workshop, in the spirit of SMC2006 and SMC2012, is to gather scientists from all areas of science interested in the theory, methodology or application of Sequential Monte Carlo methods.
SMC 2015 will take place at ENSAE, Paris, on August 26-28 2015.
The organising committee:
- Nicolas Chopin, ENSAE, Paris
- Adam Johansen, Warwick University
- Thomas Schön, Uppsala University

## cadillac [of wines]

Posted in Wines on January 20, 2015 by xi'an

## foie gras fois trois

Posted in Statistics, Wines on December 31, 2014 by xi'an

As New Year’s Eve celebrations are getting quite near, newspapers once again focus on related issues, from the shortage of truffles, to the size of champagne bubbles, to the prohibition of foie gras. Today, I noticed a headline in Le Monde about a “huge increase in French people against force-fed geese and ducks: 3% more than last year are opposed to this practice”. Now, looking at the figures, the claim is based on a survey of 1,032 adults, of whom 47% were against. From a purely statistical perspective, this is not highly significant, since

$\dfrac{\hat{p}_1-\hat{p}_2}{\sqrt{2\hat{p}(1-\hat{p})/1032}}=1.36$

is compatible with the N(0,1) distribution of the test statistic under the null hypothesis of no change.
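For the record, the statistic is easy to reproduce in a few lines. A minimal sketch, where the 44% figure for last year follows from the reported 3% increase, and where I assume (since only this year’s figure is given) the same sample size of 1,032 for both surveys:

```python
from math import sqrt

n = 1032             # sample size (assumed identical for both surveys)
p1, p2 = 0.47, 0.44  # proportion against foie gras: this year, last year
p = (p1 + p2) / 2    # pooled estimate of the common proportion under the null

# two-sample z-statistic for a difference of proportions
z = (p1 - p2) / sqrt(2 * p * (1 - p) / n)
print(round(z, 2))
```

Since z stays well below the usual 1.96 threshold, the 3% shift is indeed compatible with pure sampling variability.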

## Frescobaldi Castello di Nipozzano Montesodi

Posted in Wines on December 20, 2014 by xi'an

## Quasi-Monte Carlo sampling

Posted in Books, Kids, Statistics, Travel, University life, Wines on December 10, 2014 by xi'an

“The QMC algorithm forces us to write any simulation as an explicit function of uniform samples.” (p.8)

As posted a few days ago, Mathieu Gerber and Nicolas Chopin will read this afternoon a Paper to the Royal Statistical Society on their sequential quasi-Monte Carlo sampling paper. Here are some comments on the paper that are preliminary to my written discussion (to be sent before the slightly awkward deadline of Jan 2, 2015).

Quasi-Monte Carlo methods are definitely not popular within the (mainstream) statistical community, despite regular attempts by respected researchers like Art Owen and Pierre L’Écuyer to induce more use of those methods. It is thus to be hoped that the current attempt will be more successful, as being Read to the Royal Statistical Society is a major step towards wide diffusion. I am looking forward to the collection of discussions that will result from this afternoon (and bemoan once again having to miss it!).

“It is also the resampling step that makes the introduction of QMC into SMC sampling non-trivial.” (p.3)

At a mathematical level, the fact that randomised low discrepancy sequences produce both unbiased estimators and error rates of order

$\mathfrak{O}(N^{-1}\log(N)^{d-1}) \text{ at cost } \mathfrak{O}(N\log(N))$

means that randomised quasi-Monte Carlo methods should always be used, instead of regular Monte Carlo methods! So why are they not always used?! The difficulty lies [I think] in expressing the Monte Carlo estimators in terms of a deterministic function of a fixed number of uniforms (and possibly of past simulated values). At least this is why I never attempted crossing the Rubicon into the quasi-Monte Carlo realm… And maybe also why the step had to appear in connection with particle filters, which can be seen as dynamic importance sampling methods and hence enjoy a local iid-ness that relates better to quasi-Monte Carlo integrators than single-chain MCMC algorithms. For instance, each resampling step in a particle filter consists of a repeated multinomial generation, hence should have been turned into quasi-Monte Carlo ages ago. (However, rather than the basic solution drafted in Table 2, lower-variance solutions like systematic and residual sampling have been proposed in the particle literature and I wonder if any of these is a special form of quasi-Monte Carlo.) In the present setting, the authors move further and apply quasi-Monte Carlo to the particles themselves. However, they still assume the deterministic transform
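To illustrate the appeal of those rates in the simplest possible case (dimension one, where the van der Corput sequence is the canonical low-discrepancy sequence), here is a toy comparison on the integral of u² over [0,1]; the test function and sample size are mine, purely for illustration:

```python
import random
from math import fsum

def van_der_corput(n, base=2):
    """First n points of the base-b van der Corput low-discrepancy sequence
    (the radical-inverse of i: digits of i in base b, mirrored about the point)."""
    pts = []
    for i in range(n):
        q, denom, x = i, 1, 0.0
        while q:
            q, r = divmod(q, base)
            denom *= base
            x += r / denom
        pts.append(x)
    return pts

def estimate(f, us):
    """Average f over a list of points in [0,1)."""
    return fsum(f(u) for u in us) / len(us)

f = lambda u: u * u  # exact integral over [0,1] is 1/3
n = 2 ** 12
qmc_err = abs(estimate(f, van_der_corput(n)) - 1 / 3)
random.seed(42)
mc_err = abs(estimate(f, [random.random() for _ in range(n)]) - 1 / 3)
print(qmc_err, mc_err)
```

With n=2¹², the quasi-Monte Carlo error is of order 1/2n here (the first 2¹² van der Corput points are a permutation of the regular grid j/2¹²), while the plain Monte Carlo error fluctuates at the scale of sd/√n ≈ 5·10⁻³.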

$\mathbf{x}_t^n = \Gamma_t(\mathbf{x}_{t-1}^n,\mathbf{u}_{t}^n)$

which is the q-block on which I stumbled each time I contemplated quasi-Monte Carlo… So the fundamental difficulty with the whole proposal is that the generation from the Markov proposal

$m_t(\tilde{\mathbf{x}}_{t-1}^n,\cdot)$

has to be of the above form. Is the strength of this assumption discussed anywhere in the paper? All baseline distributions there are normal. And in cases where it does not easily apply, what would the gain be in only using the second step (i.e., quasi-Monte Carlo-ing the multinomial simulation from the empirical cdf)? In a sequential setting with unknown parameters θ, the transform is modified each time θ is modified, and I wonder at the impact on computing cost if the inverse cdf is not available analytically. And I presume simulating the θ’s cannot benefit from quasi-Monte Carlo improvements.
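As a point of comparison for that second step, here is a minimal sketch of systematic resampling, which can be read as inverting the empirical cdf of the weights at N stratified points driven by a single uniform (the weight vector is a toy example of mine, and whether this qualifies as quasi-Monte Carlo is precisely the question raised above):

```python
import random
from bisect import bisect_right
from itertools import accumulate

def systematic_resample(weights, u=None):
    """Systematic resampling: invert the empirical cdf of the normalised
    weights at the N stratified points (u + k)/N, k = 0, ..., N-1,
    all driven by a single uniform u."""
    n = len(weights)
    total = sum(weights)
    cdf = list(accumulate(w / total for w in weights))
    if u is None:
        u = random.random()
    return [bisect_right(cdf, (u + k) / n) for k in range(n)]

# toy weights: the dominant particle (index 2) should be copied most often
print(systematic_resample([0.1, 0.1, 0.7, 0.1], u=0.5))  # → [1, 2, 2, 2]
```

A single uniform replaces N independent multinomial draws, which is exactly why this scheme achieves lower variance than the basic multinomial resampler.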

The paper obviously cannot get into every detail, but I would also welcome indications on the cost of deriving the Hilbert curve, in particular in connection with the dimension $d$, as the curve has to separate all of the $N$ particles, and on the stopping rule on $m$ that means only $H_m$ is used.

Another question concerns the multiplicity of low-discrepancy sequences and their impact on the overall convergence. If Art Owen’s (1997) nested scrambling leads to the best rate, as implied by Theorem 7, why should we ever consider another choice?

In connection with Lemma 1 and the sequential quasi-Monte Carlo approximation of the evidence, I wonder at any possible Rao-Blackwellisation using all proposed moves rather than only those accepted. I mean, from a quasi-Monte Carlo viewpoint, is Rao-Blackwellisation easier and is it of any significant interest?

What are the computing costs and gains for forward and backward sampling? They are not discussed in the paper. I also fail to understand the trick at the end of 4.2.1, using SQMC on a single vector instead of (t+1) of them. Again assuming inverse cdfs are available? Any connection with Polson et al.’s particle learning literature?

Last questions: what is the (learning) effort for lazy me to move to SQMC? Any hope of stepping outside particle filtering?