Archive for Université Paris Dauphine

evidence estimation in finite and infinite mixture models

Posted in Books, Statistics, University life on May 20, 2022 by xi'an

Adrien Hairault (PhD student at Dauphine), Judith and I just arXived a new paper on evidence estimation for mixtures. This may sound like a well-trodden path that I have repeatedly explored in the past, but methinks that estimating the model evidence doth remain a notoriously difficult task for large-sample or many-component finite mixtures, and even more so for “infinite” mixture models corresponding to a Dirichlet process. The different Monte Carlo techniques advocated in the past, like Chib’s (1995) method, SMC, or bridge sampling, exhibit a wide range of performances, not least in terms of computing time… One novel (?) approach in the paper is to write Chib’s (1995) identity for partitions rather than parameters, as it bypasses the label switching issue (which we already noted in Hurn et al., 2000); another is to exploit Geyer’s (1991-1994) reverse logistic regression technique in the more challenging Dirichlet mixture setting; and yet another is a sequential importance sampling solution à la Kong et al. (1994), as also noticed by Carvalho et al. (2010). [We did not cover nested sampling as it quickly becomes onerous.]
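
For intuition, here is a minimal sketch of Chib's (1995) identity in its classical parameter-based form (the paper's novelty is to rewrite it for partitions instead). The helpers log_lik, log_prior, log_full_conditional and the Gibbs draws are hypothetical placeholders standing in for a specific mixture implementation, not the paper's code.

```python
import numpy as np

# Chib's (1995) identity: log m(y) = log p(y|θ*) + log π(θ*) - log π(θ*|y),
# evaluated at a fixed point θ*, typically one of high posterior density.
# The posterior ordinate π(θ*|y) is Rao-Blackwellised over Gibbs draws of
# the latent allocations z, since π(θ*|y) = E[π(θ*|y,z)] under π(z|y).

def chib_log_evidence(y, theta_star, gibbs_draws, log_lik, log_prior,
                      log_full_conditional):
    """Chib-style log-evidence estimate from Gibbs output (sketch)."""
    # π(θ*|y) ≈ (1/M) Σ_m π(θ*|y, z^(m)), with z^(m) drawn by the Gibbs sampler
    log_ordinates = np.array([log_full_conditional(theta_star, y, z)
                              for z in gibbs_draws])
    log_post_ordinate = (np.logaddexp.reduce(log_ordinates)
                         - np.log(len(gibbs_draws)))
    return log_lik(y, theta_star) + log_prior(theta_star) - log_post_ordinate
```

The partition-based version of the identity replaces the parameter ordinate with a partition (allocation) ordinate, which is what removes the label switching difficulty mentioned above.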

Applications are numerous. In particular, testing for the number of components in a finite mixture model, or testing the fit of a finite mixture model to a given dataset, has long been and still is an issue of much interest and diverging opinions, albeit one still missing a fully satisfactory resolution. Using a Bayes factor to find the right number of components K in a finite mixture model is known to provide a consistent procedure. We furthermore establish the consistency of the Bayes factor when comparing a parametric family of finite mixtures against the nonparametric ‘strongly identifiable’ Dirichlet Process Mixture (DPM) model.
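
In practice, once log-evidence estimates are available for each candidate K, the comparison is a simple difference of logs. A toy illustration, with made-up log-evidence values used purely to show the arithmetic:

```python
import numpy as np

# Hypothetical log-evidence estimates for K = 1,...,4 component mixtures
# (these numbers are invented for illustration only).
log_evidence = {1: -512.3, 2: -487.9, 3: -489.4, 4: -492.1}

# Bayes factor of K=2 against K=3, and posterior over K under a uniform prior on K
log_bf_23 = log_evidence[2] - log_evidence[3]
logs = np.array(list(log_evidence.values()))
post_K = np.exp(logs - np.logaddexp.reduce(logs))
print(f"log BF(2 vs 3) = {log_bf_23:.2f}", dict(zip(log_evidence, post_K.round(3))))
```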

a journal of the plague and pestilence year

Posted in Books, Kids, Mountains, pictures, Travel, University life on May 5, 2022 by xi'an

Saw our first Ukrainian applications for graduate studies at Dauphine; presumably the numbers are going to rise in the coming weeks as the Russian aggression continues in the East and South of Ukraine…

Read The Unbroken by Cherae Clark, in part because it had been nominated for the 2022 Locus Award. The universe is vaguely inspired by the French colonisation of North Africa, with additional layers of magic and royals (the French occupation of Algeria actually started in 1830, during a monarchic interlude, but went full blast when the Republic resumed). The central character is a colonial soldier, stolen from her parents at a young age and trained in the dominating kingdom, called Balladaire. (This sounds vaguely French, if meaningless in the vernacular, and there are a few French locations in the story. The suppression of religion in the empire could also be inspired by the French secular laws of the late 19th Century, even though it is unclear to me that secularism was at all enforced in North Africa, witness the existence of Muslim courts, as most inhabitants were not French citizens.) While this could have been a great setting, the story falls flat (and even one-dimensional), as it is driven by a tiny number of characters that sadly lack depth, to the extent of feeling like a school-yard conflict.

Cooked mostly curried butternut soups over the past month! And just restarted making radish-stem pancakes as radishes are back on market stalls, often at a bargain. Plus made an attempt at palak paneer and aloo gobi, just missing the paneer (which I did not have time to make) and using mascarpone instead!

Watched Partners for Justice (검법남녀), a sort of Korean NCIS between judicial prosecution and forensic medicine, pleasant enough if burdened by too many coincidences and plenty of red herrings. Especially the second season, with its darker sides of corruption, murder, and child abuse. A shocking moment was when the young (and central) prosecutor asks for the death penalty during a trial, as I had not realised capital punishment was still a possibility in Korea (although not implemented since 1997). There was also an episode with a schizophrenic suspect where the scripted treatment of his condition was abysmal… hopefully not reflecting the societal perception.

O’Bayes 2022 in UC Santa X

Posted in Statistics on March 4, 2022 by xi'an

primaire [game over]

Posted in Kids, pictures on January 31, 2022 by xi'an

ABC by classification

Posted in pictures, Statistics, Travel, University life on December 21, 2021 by xi'an

As a(nother) coincidence, we had a reading group discussion at Paris Dauphine yesterday, a few days after Veronika Rockova presented the paper in person in Oaxaca. The idea in ABC by classification, which she co-authored with Yuexi Wang and Tetsuya Kaji, is to use the empirical Kullback-Leibler divergence as a substitute for the intractable likelihood at the parameter value θ, in the generalised Bayes setting of Bissiri et al. Since this quantity is not available, it is estimated as well, by a classification method that somehow relates to Geyer’s 1994 reverse logistic proposal, using the (ABC) pseudo-data generated from the model associated with θ. The convergence of the algorithm obviously depends on the choice of the discriminator used in practice. The paper also makes a connection with GANs as a potential alternative for the generalised Bayes representation. It mostly focuses on the frequentist validation of the ABC posterior, in the sense of exhibiting a posterior concentration rate in n, the sample size, while requiring performances of the discriminators that may prove hard to check in practice. This expands our 2018 result to this setting, with the tolerance decreasing more slowly than the Kullback-Leibler estimation error.
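
To fix ideas, here is a minimal sketch of the classification trick, not the authors' implementation: assuming an observed sample obs (an n×d array), a hypothetical simulator simulate(θ, n), and a plain logistic-regression discriminator standing in for whatever classifier is used in the paper. With balanced samples, the classifier's logit approximates the log density ratio log p_obs(x)/p_θ(x), so its average over the observed points estimates KL(p_obs ‖ p_θ), which can then serve as an ABC distance with a tolerance.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimated_kl(obs, pseudo):
    """Estimate KL(p_obs || p_theta) by classification: with balanced samples,
    the logistic-regression logit approximates log p_obs(x)/p_theta(x), and
    averaging it over the observed points estimates the KL divergence."""
    X = np.vstack([obs, pseudo])
    y = np.r_[np.ones(len(obs)), np.zeros(len(pseudo))]
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    logit = clf.decision_function(obs)      # ≈ log density ratio at observed points
    return logit.mean()

def abc_by_classification(obs, simulate, prior_sample, n_prop=1000, tol=0.05):
    """Toy rejection-ABC loop keeping θ's whose estimated KL is below the tolerance."""
    kept = []
    for _ in range(n_prop):
        theta = prior_sample()
        pseudo = simulate(theta, len(obs))   # pseudo-data of the same size as obs
        if estimated_kl(obs, pseudo) < tol:
            kept.append(theta)
    return np.array(kept)
```

Note that in this naive sketch a fresh discriminator is fitted for every proposed θ, which is precisely the cost issue raised in the discussion below.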

Besides the shared appreciation that working with the Kullback-Leibler divergence was a nice and under-appreciated direction, one point that came out of our discussion is that using the (estimated) Kullback-Leibler divergence as a form of distance (with an attached tolerance) is less prone to variability (or more robust) than directly using the estimate, without a tolerance, as a substitute for the intractable likelihood, if we interpreted the discrepancy in Figure 3 properly. Another item was about the discriminator function itself: while a machine learning methodology such as neural networks could be used, albeit with unclear theoretical guarantees, it was unclear to us whether or not a new discriminator needed to be constructed for each value of the parameter θ, even when the simulations are run by a deterministic transform.
