Archive for the Statistics Category

missing slide

Posted in pictures, Statistics, Travel, University life on July 31, 2014 by xi'an

I realised too late I should have added this slide to my talk in Bangalore, to thank the Indian participants and organisers of the IFCAM workshop:

நன்றி    ಧನ್ಯವಾದ   ਤੁਹਾਡਾ ਧੰਨਵਾਦ

धन्यवाद  આભાર  আপনাকে   ধন্যবাদ

अनुगुरिहीतोसुमि ధన్యవాదాలు  آپ کا شکریہ

Bangalore workshop [ಬೆಂಗಳೂರು ಕಾರ್ಯಾಗಾರ]

Posted in pictures, R, Running, Statistics, Travel, University life, Wines on July 31, 2014 by xi'an

Second day at the Indo-French Centre for Applied Mathematics and the workshop. Maybe not the most exciting day in terms of talks (as I missed the first two plenary sessions by (a) oversleeping and (b) running across the campus!). However, I had a neat talk with another conference participant that led to [what I think are] interesting questions… (And a very good meal in a local restaurant as the guest house had not booked me for dinner!)

To wit: given a target like

\lambda \exp(-\lambda) \prod_{i=1}^n \dfrac{1-\exp(-\lambda y_i)}{\lambda}\quad (*)

the simulation of λ can be demarginalised into the simulation of

\pi (\lambda,\mathbf{z})\propto \lambda \exp(-\lambda) \prod_{i=1}^n \exp(-\lambda z_i) \mathbb{I}(z_i\le y_i)

where z = (z_1,…,z_n) is a latent (and artificial) variable: integrating exp(-λ z_i) over (0,y_i) returns the factor (1-exp(-λ y_i))/λ of the target. This means a Gibbs sampler simulating λ given z and z given λ can produce an outcome from the target (*). Interestingly, another completion is to consider that the z_i’s are U(0,y_i) and to see the quantity

\pi(\lambda,\mathbf{z}) \propto \lambda \exp(-\lambda) \prod_{i=1}^n \exp(-\lambda z_i) \mathbb{I}(z_i\le y_i)

as an unbiased estimator of the target (up to the constant ∏ y_i). What’s quite intriguing is that the quantity remains the same but with different motivations: (a) demarginalisation versus unbiasedness and (b) z_i ∼ Exp(λ) [truncated at y_i] versus z_i ∼ U(0,y_i). The stationary distribution is the same, as shown by the graph below, the core distributions are [formally] the same, … but the reasoning deeply differs.
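As a rough illustration of version (a), here is a minimal Python sketch of the Gibbs sampler, under my own assumptions: the observations `y` are made up, the function names are hypothetical, and the two conditionals are read off the completed density, namely z_i given λ is an Exp(λ) truncated to (0,y_i) and λ given z is a Gamma(2, 1+∑z_i). The `grid_mean` helper is only there to check the sampler against a brute-force numerical integration of (*).

```python
import math
import random


def target(lam, y):
    """Unnormalised target (*): lam * exp(-lam) * prod_i (1 - exp(-lam * y_i)) / lam."""
    value = lam * math.exp(-lam)
    for yi in y:
        value *= -math.expm1(-lam * yi) / lam
    return value


def gibbs(y, n_iter=50_000, seed=0):
    """Gibbs sampler on (lam, z); returns the successive values of lam."""
    rng = random.Random(seed)
    lam = 1.0  # arbitrary starting value
    draws = []
    for _ in range(n_iter):
        # z_i | lam: Exp(lam) truncated to (0, y_i), simulated by inverting the cdf
        z = [-math.log1p(rng.random() * math.expm1(-lam * yi)) / lam for yi in y]
        # lam | z: density proportional to lam * exp(-lam * (1 + sum z)), a Gamma(2, rate)
        lam = rng.gammavariate(2.0, 1.0 / (1.0 + sum(z)))  # second argument is a scale
        draws.append(lam)
    return draws


def grid_mean(y, upper=60.0, step=1e-3):
    """Mean of the target (*) by a midpoint rule, as a check on the sampler."""
    num = den = 0.0
    lam = step / 2
    while lam < upper:
        f = target(lam, y)
        num += lam * f
        den += f
        lam += step
    return num / den


if __name__ == "__main__":
    y = [1.2, 0.5, 2.3]  # made-up observations, for illustration only
    draws = gibbs(y)
    print(sum(draws) / len(draws), grid_mean(y))
```

The two printed values should agree up to Monte Carlo error if the conditionals above are the right ones.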


Obviously, since unbiased estimators of the likelihood can be justified by auxiliary variable arguments, this is not in fine a big surprise. Still, I had not thought of the analogy between demarginalisation and unbiased likelihood estimation previously.
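For version (b), one way to exploit an unbiased estimator is a pseudo-marginal Metropolis–Hastings step. This is my choice for the illustration, not necessarily what produced the graph. In the minimal sketch below, the names `noisy_target` and `pseudo_marginal`, the log-normal random walk and its scale are all assumptions of mine. The z_i’s are drawn afresh from U(0,y_i) for each proposed λ and the estimate attached to the current value is recycled until the next acceptance.

```python
import math
import random


def noisy_target(lam, y, rng):
    """One unbiased estimate of the target (*) at lam, up to the constant prod(y_i)."""
    z = [rng.uniform(0.0, yi) for yi in y]  # z_i ~ U(0, y_i), fresh at each call
    return lam * math.exp(-lam * (1.0 + sum(z)))


def pseudo_marginal(y, n_iter=50_000, scale=1.0, seed=1):
    """Pseudo-marginal Metropolis-Hastings on lam with a log-normal random walk."""
    rng = random.Random(seed)
    lam = 1.0  # arbitrary starting value
    est = noisy_target(lam, y, rng)
    draws = []
    for _ in range(n_iter):
        prop = lam * math.exp(scale * rng.gauss(0.0, 1.0))
        est_prop = noisy_target(prop, y, rng)
        # accept with probability min(1, est_prop * prop / (est * lam));
        # the factor prop / lam is the Hastings correction of the multiplicative move
        if rng.random() * est * lam < est_prop * prop:
            lam, est = prop, est_prop  # keep the estimate along with the value
        draws.append(lam)
    return draws


if __name__ == "__main__":
    y = [1.2, 0.5, 2.3]  # made-up observations, for illustration only
    draws = pseudo_marginal(y)
    print(sum(draws) / len(draws))
```

Keeping `est` fixed between acceptances, rather than re-estimating the current value at each iteration, is what preserves the exact target as stationary distribution.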

Bangalore workshop [ಬೆಂಗಳೂರು ಕಾರ್ಯಾಗಾರ]

Posted in pictures, Running, Statistics, Travel, University life, Wines on July 30, 2014 by xi'an

First day at the Indo-French Centre for Applied Mathematics and the get-together (or speed-dating!) workshop. The campus of the Indian Institute of Science in Bangalore, where we all stay, is very pleasant, with plenty of greenery in the middle of a very busy city. Plus, being at about 1000m means the temperature remains tolerable for me, to the point of letting me run in the morning. Plus, staying in a guest house on the campus also means genuine and enjoyable south Indian food.

The workshop is a mix of statisticians and of mathematicians working in neuroscience, from both India and France, and we are few enough to have a lot of opportunities for discussion and potential joint projects. I gave the first talk this morning (hence a fairly short run!) on ABC model choice with random forests and, given the mixed audience, may have launched too quickly into the technicalities of the forests, even though I think I kept the statisticians on board for most of the talk. While the mathematical biology talks mostly went over my head (esp. when I could not resist dozing!), I enjoyed the presentation by Francis Bach of a fast stochastic gradient algorithm, where the stochastic average is only updated one term at a time, for apparently much faster convergence results. This is related to a joint work with Éric Moulines that both Éric and Francis presented in the past month. And it makes me wonder about the intuition behind the major speed-up. Shrinkage to the mean maybe?

PMC for combinatoric spaces

Posted in Statistics, University life on July 28, 2014 by xi'an

I received this interesting [edited] email from Xiannian Fan at CUNY:

I am trying to use PMC to solve a Bayesian network structure learning problem (which lives in a combinatorial space, not a continuous space).

In PMC, the proposal distributions qi,t can be very flexible, even specific to each iteration and each instance. My problem occurs due to the combinatorial space.

For importance sampling, the requirement for proposal distribution, q, is:

support (p) ⊂ support (q)             (*)

For PMC, what is the required support of the proposal distributions at iteration t? Is it

support (p) ⊂ ∪i support(qi,t)    (**)

or does (*) apply to every qi,t?

For continuous problems, this is not a big issue. We can use a Normal random walk to make local moves that satisfy (*). But for combinatorial search, a local move only reaches a finite set of states and thus does not satisfy (*). For example, for the permutation (1,3,2,4), a random swap has only choose(4,2)=6 neighbor states.

Fairly interesting question about population Monte Carlo (PMC), a sequential version of importance sampling we worked on with French colleagues in the early 2000s. (The name population Monte Carlo comes from Iba, 2000.) While MCMC samplers do not have to cover the whole support of p at each iteration, it is much harder for importance samplers as their core justification is to provide an unbiased estimator of all integrals of interest. Thus, when using the PMC estimate,

1/n ∑i,t {p(xi,t)/qi,t(xi,t)}h(xi,t),  xi,t~qi,t(x)

this estimator is only unbiased when the supports of the qi,t’s all contain the support of p. The only other cases I can think of are

  1. associating the qi,t’s with a partition Si,t of the support of p and using instead

    ∑i,t {p(xi,t)/qi,t(xi,t)}h(xi,t), xi,t~qi,t(x)

  2. resorting to AMIS under the assumption (**) and using instead

    1/n ∑i,t {p(xi,t)/[1/n ∑j,s qj,s(xi,t)]}h(xi,t), xi,t~qi,t(x)

but I am open to further suggestions!
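To make the support issue concrete on the permutation example of the email, here is a minimal Python sketch under my own assumptions: a toy target p(x) ∝ exp(-number of inversions) on the 24 permutations of (1,2,3,4), proposals that are uniform over the 6 swap-neighbours of each centre, and one draw per proposal. All function names are hypothetical. Since the space is tiny, the expectations of both estimators are computed exactly by enumeration rather than by simulation.

```python
from itertools import combinations, permutations
import math


def swap_neighbours(perm):
    """All permutations reachable from perm by swapping two entries."""
    out = []
    for i, j in combinations(range(len(perm)), 2):
        x = list(perm)
        x[i], x[j] = x[j], x[i]
        out.append(tuple(x))
    return out


def inversions(perm):
    """Number of pairs of positions i < j with perm[i] > perm[j]."""
    return sum(1 for i, j in combinations(range(len(perm)), 2) if perm[i] > perm[j])


def toy_target(states):
    """Hypothetical target: p(x) proportional to exp(-number of inversions)."""
    w = {x: math.exp(-inversions(x)) for x in states}
    total = sum(w.values())
    return {x: v / total for x, v in w.items()}


def expected_estimates(p, h, centres):
    """Exact expectations of the naive and mixture-weighted estimators, with one
    draw x_i ~ q_i and q_i uniform on the swap-neighbours of centres[i]."""
    n = len(centres)
    supports = [set(swap_neighbours(c)) for c in centres]

    def q(i, x):
        return 1.0 / len(supports[i]) if x in supports[i] else 0.0

    naive = mixture = 0.0
    for i in range(n):
        for x in supports[i]:
            qi = q(i, x)
            # naive weight p/q_i, averaged over the n proposals
            naive += qi * (p[x] / qi) * h(x) / n
            # mixture weight p / (average of all the q_j's), as in AMIS
            mix = sum(q(j, x) for j in range(n)) / n
            mixture += qi * (p[x] / mix) * h(x) / n
    return naive, mixture


if __name__ == "__main__":
    states = list(permutations((1, 2, 3, 4)))
    p = toy_target(states)
    h = lambda x: float(x[0])  # integrand: first entry of the permutation
    truth = sum(p[x] * h(x) for x in states)
    naive, mixture = expected_estimates(p, h, states)
    print(truth, naive, mixture)
```

With every permutation used once as a centre, each state is a neighbour of exactly 6 centres, so the naive average only recovers a quarter of the true integral, while the mixture weights return it exactly, in line with (**) being sufficient for the AMIS-type estimator.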

off to Bangalore

Posted in Statistics, Travel, University life on July 26, 2014 by xi'an

I am off to Bangalore for a few days, taking part in an Indo-French workshop on statistics and mathematical biology run by the Indo-French Centre for Applied Mathematics (IFCAM).

