Archive for better together

Markov melding

Posted in Books, Statistics, University life on July 2, 2020 by xi'an

“An alternative approach is to model smaller, simpler aspects of the data, such that designing these submodels is easier, then combine the submodels.”

An interesting paper by Andrew Manderson and Robert Goudie I read on arXiv on merging (or melding) several models together. With different data and different parameters. The assumption is one of a common parameter φ shared by all (sub)models. Since the product of the joint distributions across the m submodels involves m replicates of φ, the melded distribution is the product of the conditional distributions given φ, times a common (or pooled) prior on φ. Which leads to a perfectly well-defined joint distribution provided the support of this pooled prior is compatible with all conditionals.
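To make the construction concrete, here is a minimal numerical sketch of the melded log-density: the submodel log-densities conditional on the shared φ are summed and a single pooled log-prior is added. The toy Gaussian submodels and all function names are my own illustration, not the paper's models.

```python
import numpy as np

def melded_log_density(phi, thetas, ys, log_cond_submodels, log_pooled_prior):
    """log p(phi, theta_1..m, y_1..m) up to a constant:
    log p_pool(phi) + sum_i log p_i(theta_i, y_i | phi)."""
    total = log_pooled_prior(phi)
    for theta, y, log_cond in zip(thetas, ys, log_cond_submodels):
        total += log_cond(theta, y, phi)
    return total

# toy example: two Gaussian submodels sharing a common mean phi,
# each with its own scale parameter theta and its own data y
def log_cond(theta, y, phi):
    return (-0.5 * np.sum((y - phi) ** 2) / theta**2
            - len(y) * np.log(theta))

def log_pooled_prior(phi):
    return -0.5 * phi**2  # standard normal pooled prior on phi

val = melded_log_density(0.3, [1.0, 2.0],
                         [np.array([0.1, 0.5]), np.array([-0.2])],
                         [log_cond, log_cond], log_pooled_prior)
```

The point of the sketch is that each submodel only needs to expose its own conditional log-density given φ; the pooled prior is the single place where the m replicates of φ are reconciled.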

The MCMC aspects of such a target are interesting in that the submodels can easily be exploited to return proposal distributions on their own parameters (plus φ). Although the notion is fraught with danger when considering a flat prior on φ, since the posterior is not necessarily well-defined. Or at the very least unrelated to the actual marginal posterior. This first stage is used to build a particle approximation to the posterior distribution of φ, exploited in the later simulation of the other submodel parameters and updates of φ. Since the (submodel) marginal prior on φ is rarely available in closed form, it is replaced in the paper by a kernel density estimate. Not a great idea as (a) it is unstable and (b) it is costly, while the joint density does exist! Which brings the authors to set a goal of estimating a ratio. Of the same marginal density at two different values of φ. (Not our frequent problem of the ratio of different marginals!) They achieve this by targeting another joint, using a weight function both for the simulation and the kernel density estimation… which requires calibrating the weight function and produces a biased estimate of the ratio.
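To see why only a ratio is needed, recall that a Metropolis acceptance probability for a move φ → φ′ involves the marginal prior solely through p(φ′)/p(φ). Below is a deliberately naive sketch of estimating that ratio by a plain Gaussian kernel density estimate from prior draws; the paper's weight-function calibration is omitted, so this version is purely illustrative (and, like any KDE plug-in, biased).

```python
import numpy as np

def kde_log_density(x, samples, bandwidth):
    """Gaussian kernel density estimate of the sample density at x."""
    z = (x - samples) / bandwidth
    return (np.log(np.mean(np.exp(-0.5 * z**2)))
            - 0.5 * np.log(2 * np.pi) - np.log(bandwidth))

def kde_log_ratio(phi_new, phi_cur, samples, bandwidth=0.2):
    """Estimate of log p(phi_new) - log p(phi_cur) from draws of p."""
    return (kde_log_density(phi_new, samples, bandwidth)
            - kde_log_density(phi_cur, samples, bandwidth))

rng = np.random.default_rng(1)
prior_samples = rng.normal(size=5000)   # pretend draws from p_i(phi)
log_ratio = kde_log_ratio(0.0, 1.0, prior_samples)
# for a standard normal marginal the true log-ratio is 0.5
```

Note the bias: the KDE targets the convolution of the marginal with the kernel, so even with many samples the estimated log-ratio is shrunk relative to the truth, which is one motivation for the authors' more careful weighted construction.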

While the paper concentrates very much on computational improvements, including the possible recourse to unbiased MCMC, I also feel it misses out on the Bayesian aspects, since the construction of the multi-level Bayesian model faces many challenges. In a sense this is an alternative to our better together paper, where cuts are used to avoid the duplication of common parameters.

punch [him] back, Britain!

Posted in Statistics on December 12, 2019 by xi'an

Better together in Kolkata [slides]

Posted in Books, pictures, Statistics, Travel, University life on January 4, 2018 by xi'an

Here are the slides of the talk on modularisation I am giving today at the PC Mahalanobis 125 Conference in Kolkata, mostly borrowed from Pierre's talk at O'Bayes 2017 last month:

[which made me realise Slideshare has discontinued the option to update one’s presentation, forcing users to create a new presentation for each update!] Incidentally, the amphitheatre at ISI is located right on top of a geological exhibit room with a reconstituted Barapasaurus tagorei so I will figuratively ride a dinosaur during my talk!

better together?

Posted in Books, Mountains, pictures, Statistics, University life on August 31, 2017 by xi'an

Yesterday came out on arXiv a joint paper by Pierre Jacob, Lawrence Murray, Chris Holmes and myself, Better together? Statistical learning in models made of modules, a paper that was conceived during the MCMski meeting in Chamonix, 2014! Indeed it is mostly due to Martyn Plummer's talk at this meeting about the cut issue that we started to work on this topic at the fringes of the [standard] Bayesian world. Fringes because a standard Bayesian approach to the problem would always lead to using the entire dataset and the entire model to infer about a parameter of interest. [Disclaimer: the use of the very slogan of the anti-secessionists during the Scottish Independence Referendum of 2014 in our title is by no means a measure of support of their position!] Comments and suggested applications most welcome!

The setting of the paper is inspired by realistic situations where a model is made of several modules, connected within a graphical model that represents the statistical dependencies, each relating to a specific data modality. In a standard Bayesian analysis, given data, a conventional statistical update then allows for coherent uncertainty quantification and information propagation through and across the modules. However, misspecification of or even massive uncertainty about any module in the graph can contaminate the estimates and updates of parameters of other modules, often in unpredictable ways. Particularly so when certain modules are trusted more than others. Hence the appearance of cut models, where practitioners prefer skipping the full model and limiting the information propagation between these modules, for example by restricting propagation to only one direction along the edges of the graph. (Which is sometimes represented as a diode on the edge.) The paper investigates in which situations and under which formalism such modular approaches can outperform the full model approach in misspecified settings, by developing the appropriate decision-theoretic framework, meaning we can choose between [several] modular and full-model approaches.
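As a toy illustration of the diode idea (entirely my own two-module Gaussian example, not one taken from the paper), a cut sampler draws the shared parameter from the first module's posterior alone, with no feedback from the second module's data, and only then propagates it forward:

```python
import numpy as np

# module 1: y1 ~ N(phi, 1) informs the shared parameter phi
# module 2: y2 ~ N(phi + theta, 1) has its own parameter theta
# cut distribution: p_cut(phi, theta) = p(phi | y1) p(theta | phi, y2),
# i.e. the "diode" blocks information flowing from y2 back to phi
rng = rng_phi = np.random.default_rng(0)
y1 = rng.normal(0.5, 1.0, size=50)
y2 = rng.normal(2.0, 1.0, size=50)

def sample_phi_module1(n):
    # conjugate normal posterior phi | y1 under a N(0,1) prior
    prec = 1 + len(y1)
    return rng.normal(y1.sum() / prec, 1 / np.sqrt(prec), size=n)

def sample_theta_module2(phi):
    # conjugate normal posterior theta | phi, y2 under a N(0,1) prior
    prec = 1 + len(y2)
    return rng.normal(np.sum(y2 - phi) / prec, 1 / np.sqrt(prec))

phis = sample_phi_module1(2000)            # phi never sees y2
thetas = np.array([sample_theta_module2(p) for p in phis])
```

In the full-model posterior, by contrast, φ would also condition on y2; when module 2 is misspecified, that feedback is precisely what the cut prevents from contaminating φ.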

EU turns 60!

Posted in Kids, pictures on March 25, 2017 by xi'an