Archive for Brown University

label switching by optimal transport: Wasserstein to the rescue

Posted in Books, Statistics, Travel on November 28, 2019 by xi'an

A new arXival by Pierre Monteiller et al. on resolving label switching by optimal transport. To appear in NeurIPS 2019, next month (where I will be, but extra muros, as I have not registered for the conference). Among other things, the paper was inspired by an answer of mine on X validated, presumably a première (and a dernière?!). Rather than picketing [in the likely unpleasant weather] on the pavement outside the conference centre, here are my raw reactions to the proposal made in the paper. (Usual disclaimer: I was not involved in the review of this paper.)

“Previous methods such as the invariant losses of Celeux et al. (2000) and pivot alignments of Marin et al. (2005) do not identify modes in a principled manner.”

Unprincipled, me?! We did not aim at identifying all modes but only one of them, since the posterior distribution is invariant under permutation of the component labels. Without any bad feeling (!), I still maintain my position that using a permutation invariant loss function is a most principled and Bayesian approach towards a proper resolution of the issue. Even though figuring out the resulting Bayes estimate may prove tricky.
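For readers unfamiliar with the pivot alignment mentioned in the quote, here is a minimal sketch of the idea, under simplifying assumptions: each MCMC draw holds one scalar parameter per component, the pivot is supplied by hand (in practice it could be an approximate MAP estimate), and all k! relabellings are enumerated, which is only feasible for small k. The function name `relabel_to_pivot` and the toy data are hypothetical and are not taken from any of the papers cited.

```python
from itertools import permutations

import numpy as np


def relabel_to_pivot(draws, pivot):
    """Permute the components of each draw to be closest to the pivot.

    draws: array of shape (n_draws, k), one scalar parameter per component.
    pivot: array of shape (k,), the reference labelling.
    """
    k = draws.shape[1]
    perms = [list(p) for p in permutations(range(k))]  # all k! relabellings
    relabelled = np.empty_like(draws)
    for i, theta in enumerate(draws):
        # squared distance to the pivot under each relabelling
        costs = [np.sum((theta[p] - pivot) ** 2) for p in perms]
        relabelled[i] = theta[perms[int(np.argmin(costs))]]
    return relabelled


# Toy "MCMC output": two well-separated means with random label switching.
rng = np.random.default_rng(0)
true_means = np.array([-2.0, 3.0])
draws = true_means + 0.1 * rng.standard_normal((500, 2))
flip = rng.random(500) < 0.5
draws[flip] = draws[flip][:, ::-1]

pivot = np.array([-2.0, 3.0])  # hypothetical pivot, chosen by hand here
aligned = relabel_to_pivot(draws, pivot)
print(draws.mean(axis=0))    # raw averages are blurred by the switching
print(aligned.mean(axis=0))  # averages after alignment to the pivot
```

The raw component-wise averages mix both modes, while the aligned draws target a single mode, which is all this approach aims at.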

The paper thus adopts a different approach, towards giving a manageable meaning to the average of the mixture distributions over all permutations, not in a linear Euclidean sense but thanks to a Wasserstein barycentre. Which indeed allows for an averaged mixture density, although a point-by-point estimate that does not require switching to occur at all was already proposed in earlier papers of ours. Including the Bayesian Core. As shown above. What was first unclear to me is how necessary the Wasserstein formalism proves to be in this context. In fact, the major difference with the above picture is that the estimated barycentre is a mixture with the same number of components. Computing time? Bayesian estimate?

Green’s approach to the problem via a point process representation [briefly mentioned on page 6] of the mixture itself, as for instance presented in our mixture analysis handbook, should have been considered. As well as issues about Bayes factors examined in Gelman et al. (2003) and our more recent work with Kate Jeong Eun Lee. Where the practical impossibility of considering all possible permutations is processed by importance sampling.

An idle thought that came to me while reading this paper (in Seoul) was that a more challenging problem would be to face a model invariant under the action of a group with only a subset of known elements of that group. Or simply too many elements in the group. In which case averaging over the orbit would become an issue.
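As a toy illustration of the "too many elements" case, and only for the symmetric group, the orbit average can be approximated by sampling permutations rather than enumerating them. The sketch below uses uniform sampling, which is the simplest choice and not the importance sampling scheme mentioned above; the function names and the test function `f` are hypothetical.

```python
import math
from itertools import permutations

import numpy as np


def orbit_average_exact(f, theta):
    """Average f over all k! permutations of theta (small k only)."""
    k = len(theta)
    total = sum(f(theta[list(p)]) for p in permutations(range(k)))
    return total / math.factorial(k)


def orbit_average_mc(f, theta, n_samples, rng):
    """Estimate the same average from uniformly sampled permutations."""
    values = [f(rng.permutation(theta)) for _ in range(n_samples)]
    return float(np.mean(values))


def f(theta):
    """Hypothetical test function that is NOT permutation invariant."""
    weights = np.arange(1, len(theta) + 1)
    return float(np.dot(weights, theta))


rng = np.random.default_rng(1)
theta = np.array([0.5, -1.0, 2.0, 0.0, 1.5])
exact = orbit_average_exact(f, theta)             # enumerates 5! = 120 terms
approx = orbit_average_mc(f, theta, 20_000, rng)  # 20,000 sampled permutations
print(exact, approx)
```

With k = 5 both versions are available and can be compared; for large k only the sampled version remains feasible, and its accuracy then depends on how much f varies along the orbit.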

at the centre of Bayes

Posted in Mountains, pictures, Statistics, Travel, University life on October 14, 2019 by xi'an

ICERM, Brown, Providence, RI (#3)

Posted in pictures, Running, Statistics, Travel, University life on December 3, 2012 by xi'an

[Picture: ICERM building, Providence, RI, Nov. 29, 2012]

Just yet another perfect day in Providence! Especially when I thought it was going to be a half-day: After a longer and slightly warmer run in the early morning around the peninsula, I attended the lecture by Eric Moulines on his recent results on adaptive MCMC and the equi-energy sampler. At this point, we were told that, since Peter Glynn was sick, the afternoon talks were moved forward. This meant that I could attend Mylène Bédard’s talk in the morning and most of Xiao-Li Meng’s talk, before catching my bus to the airport, making it a full day in the end!

The research presented by Mylène (and coauthored with Randal Douc and Eric Moulines) was on multiple-try MCMC and delayed-rejection MCMC, with optimal scaling results and a comparison of the efficiency of the schemes involved. I had not seen the work before and was quite impressed by the precision of the results and the potential for huge efficiency gains. One of the most interesting tricks was to use an antithetic move for the second step, considerably improving the acceptance rate in the process. An exciting side point was to realise that the hit-and-run solution was also open to wide time-savings thanks to some factorisation.

While Xiao-Li’s talk had connections with his earlier illuminating talk in New York last year, I am quite sorry to have missed [the most novel] half of it (and still caught my bus by a two-minute margin!), especially because it connected beautifully with the constant estimation controversy! Indeed, Xiao-Li started his presentation with the pseudo-paradox that the likelihood cannot be written as a function of the normalising constant, simply because this is not a free parameter. He then switched to his usual theme that the dominating measure was to be replaced with a substitute and estimated. The normalising constant being a function of the dominating measure, it is a by-product of this estimation step. And can even be embedded within a Bayesian framework. Obviously, one can always argue against the fact that the dominating measure is truly unknown, however this gives a very elegant safe-conduct to escape the debate about the constant that did not want to be estimated… So to answer Xiao-Li’s question as I was leaving the conference room, I have now come to a more complete agreement with his approach. And think further advances could be contemplated along this path…

ICERM, Brown, Providence, RI (#2)

Posted in pictures, Running, Statistics, Travel, University life on November 30, 2012 by xi'an

[Picture: ICERM building by the canal, Providence, RI, Nov. 29, 2012]

Just another perfect day in Providence! After a brisk run in the early morning which took me through Brown campus, I attended the lecture by Sean Meyn on feedback particle filters. As it was mostly on diffusions with control terms, just too far from my field, I missed most of the points. (My fault, not Sean’s!) Then Ramon van Handel gave a talk about the curse(s) of dimensionality in particle filters, much closer to my interests, with a good summary of why (optimal) filters were not suffering from a curse in n, the horizon size, but in d, the dimension of the space, followed by an argument that some degree of correlation decay could overcome this dimensional curse as well. After the lunch break (where I thought further about the likelihood principle!), Dana Randall gave a technical talk on mixing properties of the hardcore model on Z² and bounding the cutoff parameter, which is when I appreciated the ability to follow talks from the ICERM lounge, watching slides and video of the talk taking place on the other side of the wall! At last, and in a programming counterpoint from slowly mixing to fastest mixing, Jim Fill presented his recent work on ordering Markov chains and finding fastest-mixing chains, which of course reminded me of Peskun ordering although there may be little connection in the end. The poster session in the evening had sufficiently few posters to make the discussion with each author enjoyable and relevant. A consistent feature of the meeting thus, allowing for quality interaction time between participants. I am now looking forward to the final day with a most intriguing title by my friend Eric Moulines on TBA…
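The curse in d can be seen in a toy, one-step importance weighting experiment, sketched below under assumptions that are purely illustrative and not taken from the talk: particles are drawn from a N(0, I_d) prior and weighted by a N(0, I_d) likelihood evaluated at the observation y = 0. The function names are hypothetical.

```python
import numpy as np


def effective_sample_size(log_w):
    """ESS = (sum w)^2 / sum w^2, computed stably from log-weights."""
    w = np.exp(log_w - np.max(log_w))
    return float(w.sum() ** 2 / np.sum(w ** 2))


def one_step_ess(d, n_particles, rng):
    """ESS of N(0, I_d) prior particles weighted by a N(0, I_d) likelihood at y = 0."""
    x = rng.standard_normal((n_particles, d))
    log_w = -0.5 * np.sum(x ** 2, axis=1)  # log-likelihood up to a constant
    return effective_sample_size(log_w)


rng = np.random.default_rng(2)
ess_by_dim = {d: one_step_ess(d, 1000, rng) for d in (1, 5, 20, 50)}
for d, ess in ess_by_dim.items():
    print(d, round(ess, 1))
```

In this toy model the large-sample ratio ESS/N is (√3/2)^d, hence a geometric collapse with the dimension for a fixed number of particles, even though only a single weighting step is involved.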

ICERM, Brown, Providence, RI (#1)

Posted in Statistics, Travel, University life on November 29, 2012 by xi'an

As I mentioned yesterday, and earlier, I was rather excited about visiting the ICERM building. As it happens, the centre is located on the top floor of a (rather bland!) 11-floor building sitting between Main St. and the river. It is quite impressive indeed, with a feeling of space due to the high ceilings and the glass walls all around the conference room, plus pockets of quietness with blackboards to the rescue. The whiteboard that makes the wall between the conference room and the lobby is also appreciable for discussion as it is huge (the whole wall is the whiteboard!) and made of a glassy material that makes writing on it a true pleasure (the next step would be to have a recording device embedded in it!). When I gave my talk and attended the other three talks of the day, I kind of regretted that the dual projector system would not allow for a lag of sorts in the presentation. Even though the pace of the other talks was quite reasonable (mine was a bit hurried I am afraid!), writing down a few notes was enough for me to miss some point from the previous slide. With huge walls, it should be easy to project at least the previous slide at the same time and maybe even all of the previous slides (maybe, maybe not, as it would get quickly confusing…)

Paul Dupuis’ talk covered new material (at least for me) on importance sampling for diffusions and the exploration of equilibria, and it was thus quite enjoyable, even when fighting one of my dozing attacks. Gareth Roberts’ talk provided a very broad picture of the different optimal scalings (à la 0.234!) for MCMC algorithms (while I have attended several lectures by Gareth on this theme, there is always something new and interesting coming out of them!). Krzysztof Latuszynski’s talk on irreducible diffusions and the construction of importance sampling solutions replacing the (unavailable) exact sampling of Beskos et al. (2006) led to some discussion on the handling of negative weights. This is a question that has always intrigued me: if unbiasedness or exact simulation or something else induces negative weights in a sample, how can we process those weights when resampling? The conclusion of the discussion was that truncating the weights to zero seemed like the best solution, at least when resampling since the weights can be used as such in averages, but I wonder if there is a more elaborate scheme involving mixtures or whatnot!
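The distinction drawn in that discussion can be made concrete with a minimal sketch: signed weights enter a self-normalised average as they are, while resampling first truncates the negative ones to zero. The weights below are simulated for illustration only and do not come from any of the algorithms discussed; the function names are hypothetical.

```python
import numpy as np


def signed_average(values, weights):
    """Self-normalised average using the signed weights as they are."""
    return float(np.sum(weights * values) / np.sum(weights))


def resample_truncated(values, weights, n_out, rng):
    """Multinomial resampling after truncating negative weights to zero."""
    clipped = np.clip(weights, 0.0, None)
    probs = clipped / clipped.sum()
    idx = rng.choice(len(values), size=n_out, p=probs)
    return values[idx]


rng = np.random.default_rng(3)
values = rng.standard_normal(1000)
# Hypothetical signed weights: mostly positive, a few negative.
weights = 1.0 + 0.8 * rng.standard_normal(1000)

estimate = signed_average(values, weights)
resampled = resample_truncated(values, weights, 1000, rng)
print(estimate, resampled.mean())
```

Truncation discards whatever correction the negative weights carried, so the resampled population is in general biased relative to the signed average, which is the price paid for getting a valid resampling distribution.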

ICERM, Brown, Providence, RI (#0)

Posted in Running, Statistics, Travel, University life on November 28, 2012 by xi'an

I have just arrived in Providence, RI, for the ICERM workshop on Performance Analysis of Monte Carlo Methods. While the plane trip was uneventful and even relaxing, as I could work on the revision to our ABCel (soon to be BCel!) paper, the bus trip from Boston to Providence, while smooth, quiet, wirelessed, and on-time, was a wee bit too much as it was already late by my standards… Anyway, I am giving one of the talks tomorrow, with a pot-pourri on ABC and empirical likelihood as in Ames and Chicago last month. The format of the workshop sounds very nice, with only four talks a day, which should leave a lot of space for interactions between participants (if I do not crash from my early early rise…) And, as mentioned earlier, I am looking forward to visiting the futuristic building.

Computational Challenges in Probability [ICERM, Sept. 5 – Dec. 7]

Posted in Statistics, Travel, University life on May 18, 2012 by xi'an

I have just received an invitation to take part in the program “Computational Challenges in Probability” organised by ICERM (Institute for Computational and Experimental Research in Mathematics, located in what sounds like a terrific building!) next semester. Here is the purpose statement:

The Fall 2012 Semester on “Computational Challenges in Probability” aims to bring together leading experts and young researchers who are advancing the use of probabilistic and computational methods to study complex models in a variety of fields. The goal is to identify common challenges, exchange existing tools, reveal new application areas and forge new collaborative efforts. The semester includes four workshops – Bayesian Nonparametrics, Uncertainty Quantification, Monte Carlo Methods in the Physical and Biological Sciences and Performance Analysis of Monte Carlo Methods. In addition, synergistic activities will be planned throughout the duration of the semester. In particular, there will be several short courses and plenary invited talks by experts on related topics such as graphical models, randomized algorithms and stochastic networks, regular weekly seminars and relevant film screenings.

There are thus four workshops organised over the period and an impressive collection of long-term participants. I will most likely take part in the last workshop, “Performance Analysis of Monte Carlo Methods”, although I would like to attend all of them! (Interesting side remark: while looking at the ICERM website, I found that May 18th is the Day of Data! Great, except that neither the word statistics nor the word statistician appears on the page…)