Archive for discussion paper

latent nested nonparametric priors

Posted in Books, Statistics on September 23, 2019 by xi'an

A paper on an extended type of non-parametric priors by Camerlenghi et al. [all good friends!] is about to appear in Bayesian Analysis, with a discussion open for contributions (until October 15). While a fairly theoretical piece of work, it validates a Bayesian approach to the non-parametric clustering of separate populations with, broadly speaking, common clusters. More formally, it constructs a new family of models that allows for partial or complete equality between two probability measures, without forcing full identity whenever the associated samples share some common values. Indeed, the more traditional structures prohibit one or the other: the Dirichlet process (DP) prohibits two probability measure realisations from being equal or even partly equal; the hierarchical DP (HDP) already allows for common atoms across measure realisations, but prohibits complete identity between two realised distributions; and the nested DP offers one extra level of randomness, but with an infinity of DP realisations, which prohibits any common atomic support besides a completely identical support (and hence distribution).

The current paper imagines two realisations of random measures, each written as the sum of a common random measure and of its own, almost independent, random measure: (14) is the core formula of the paper, which allows for partial or total equality. An extension to a setting with more than two samples seems complicated, if only because of the number of common measures one has to introduce, from the totally common measure to measures that are only shared by a subset of the samples, except in the simplified framework where a single, universally common measure is adopted (with enough justification). The randomness of the model is handled via different completely random measures, involving something like four levels of hierarchy in the Bayesian model.
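As a crude illustration of this additive construction (and certainly not the authors' actual model, which relies on completely random measures rather than on the fixed mixing weight π used below), here is a minimal numpy sketch of two discrete random probability measures sharing a common part:

```python
import numpy as np

rng = np.random.default_rng(0)

def stick_breaking(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a DP(alpha) realisation."""
    betas = rng.beta(1.0, alpha, size=n_atoms)
    weights = betas * np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    return weights / weights.sum()  # renormalise after truncation

def latent_nested_pair(alpha, pi, n_atoms=50, rng=rng):
    """Two random probability measures p_i = pi * p_common + (1 - pi) * q_i,
    sharing the atoms of p_common but not those of the idiosyncratic q_i."""
    common_atoms = rng.normal(size=n_atoms)
    common_weights = stick_breaking(alpha, n_atoms, rng)
    measures = []
    for _ in range(2):
        own_atoms = rng.normal(size=n_atoms)
        own_weights = stick_breaking(alpha, n_atoms, rng)
        atoms = np.concatenate([common_atoms, own_atoms])
        weights = np.concatenate([pi * common_weights, (1.0 - pi) * own_weights])
        measures.append((atoms, weights))
    return measures

# pi = 1 forces total equality, pi = 0 gives almost surely disjoint supports,
# and intermediate values yield partially shared atomic supports
p1, p2 = latent_nested_pair(alpha=1.0, pi=0.5)
```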

Since the example is somewhat central to the paper, the case of two two-component Normal mixtures with a common component (but with different mixture weights) is handled by the approach, although it seems that this case was already covered by the HDP. Having exactly the same component (i.e., with the very same weight) is not, but this may be less interesting in real-life applications. Note that alternative, easily constructed, parametric constructions are already available in this specific case, involving a limited prior input and a lighter computational burden, although the Gibbs sampler behind the model proves extremely simple on paper. (One may wonder at the robustness of the sampler once the case of identical distributions is visited.)
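For concreteness, here is the data-generating side of this central example, with illustrative (hypothetical) parameter values of my own choosing: the common component is N(0,1), while the weights and idiosyncratic means differ across the two samples:

```python
import numpy as np

rng = np.random.default_rng(1)

def two_component_mixture(n, w, mu_common, mu_own, sigma=1.0, rng=rng):
    """Sample from the mixture w*N(mu_common, sigma^2) + (1-w)*N(mu_own, sigma^2),
    where the first component is shared across populations."""
    from_common = rng.random(n) < w
    return np.where(from_common,
                    rng.normal(mu_common, sigma, n),
                    rng.normal(mu_own, sigma, n))

# same common component N(0,1), different weights and second components
x1 = two_component_mixture(500, w=0.7, mu_common=0.0, mu_own=3.0)
x2 = two_component_mixture(500, w=0.3, mu_common=0.0, mu_own=-3.0)
```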

Due to the combinatorial explosion associated with a larger number of observed samples, despite obvious practical situations calling for them, one may wonder at any feasible (and possibly sequential) extension that would further remain coherent under marginalisation (in the number of samples). And also whether or not multiple testing could be coherently envisioned in this setting, for instance when handling all hospitals in the UK. Another consistency question concerns the Bayes factor used to assess whether or not the two distributions behind the samples are identical. (One may wonder at the importance of the question, hopefully applied to more relevant datasets than the Iris data!)

Bayesian conjugate gradients [open for discussion]

Posted in Books, pictures, Statistics, University life on June 25, 2019 by xi'an

When fishing for an illustration for this post on Google, I came upon this Bayesian methods for hackers cover, a book about which I have no clue whatsoever (!) but that mentions probabilistic programming. Which serves as a perfect (?!) introduction to the call for discussion in Bayesian Analysis of the incoming Bayesian conjugate gradient method by Jon Cockayne, Chris Oates (formerly Warwick), Ilse Ipsen and Mark Girolami (still partially Warwick!). The paper is indeed about probabilistic numerics à la Mark and co-authors, surprisingly dealing with solving the deterministic equation Ax=b by Bayesian methods. The method produces a posterior distribution on the solution x⁰, given a fixed computing effort, which makes it belong to the family of anytime algorithms. It also relates to an earlier 2015 paper by Philipp Hennig where the posterior is on A⁻¹ rather than x⁰ (which is quite a surprising, if valid, approach to the problem!). The computing effort is translated here into computing projections of Ax along search directions, which can be made compatible with conjugate gradient steps. Interestingly, the choice of the prior on x is quite important, as it may induce a low or high convergence rate… The deadline is August 04!
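To make the idea more concrete, here is a simplified dense-algebra sketch, mine rather than the authors', of the Gaussian conditioning at the heart of the method: a prior N(x₀, Σ₀) on the solution is updated on observing the projections SᵀAx = Sᵀb, for m search directions stored as the columns of S. The actual BayesCG algorithm picks these directions sequentially and conjugately rather than at random, but the conditioning step is the same:

```python
import numpy as np

rng = np.random.default_rng(2)

def bayes_linear_solver(A, b, S, x0=None, Sigma0=None):
    """Gaussian posterior on the solution of Ax = b after observing the
    projections S^T A x = S^T b (S holds m search directions as columns).
    A simplified caricature of the BayesCG idea: condition a Gaussian
    prior N(x0, Sigma0) on m linear functionals of the solution."""
    n = A.shape[0]
    x0 = np.zeros(n) if x0 is None else x0
    Sigma0 = np.eye(n) if Sigma0 is None else Sigma0
    y = S.T @ (b - A @ x0)              # observed residual projections
    M = S.T @ A                         # observation operator
    G = M @ Sigma0 @ M.T                # Gram matrix of the functionals
    K = Sigma0 @ M.T @ np.linalg.solve(G, np.eye(S.shape[1]))  # gain
    x_post = x0 + K @ y
    Sigma_post = Sigma0 - K @ M @ Sigma0
    return x_post, Sigma_post

n, m = 20, 5
A = rng.normal(size=(n, n)); A = A @ A.T + n * np.eye(n)  # s.p.d. system
x_true = rng.normal(size=n); b = A @ x_true
S = rng.normal(size=(n, m))             # random search directions
x_post, Sigma_post = bayes_linear_solver(A, b, S)
```

Each extra column of S refines the posterior, which is what makes the method an anytime algorithm: stop whenever the computing budget runs out.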

the end of the Series B’log…

Posted in Books, Statistics, University life on September 22, 2017 by xi'an

Today is the final day of the Series B'log, as David Dunson, Piotr Fryzlewicz and I have decided to stop the experiment, faute de combattants [for lack of combatants], as we say in French. The authors nicely contributed long abstracts of their papers, for which I am grateful, but, with a single exception, no one came up with comments or criticisms, and the idea of turning some Series B papers into discussion papers does not seem to appeal, at least in this format. Maybe the concept will be rekindled in another form in the near future, but for now we let it lie. So be it!

beyond objectivity, subjectivity, and other ‘bjectivities

Posted in Statistics on April 12, 2017 by xi'an

Here is my discussion of Gelman and Hennig at the Royal Statistical Society, which I am about to deliver!

objective and subjective RSS Read Paper next week

Posted in Books, pictures, Statistics, Travel, University life, Wines on April 5, 2017 by xi'an

Andrew Gelman and Christian Hennig will give a Read Paper presentation next Wednesday, April 12, 5pm, at the Royal Statistical Society, London, on their paper “Beyond subjective and objective in statistics“. Which I hope to attend, or else to write a discussion for. Since the discussion (to be published in Series A) is open to everyone, I strongly encourage ‘Og’s readers to take a look at the paper and the “radical” views therein, and hopefully to contribute to this discussion, either as a written discussion or as comments on this very post.

a response by Ly, Verhagen, and Wagenmakers

Posted in Statistics on March 9, 2017 by xi'an

Following my demise [of the Bayes factor] paper, Alexander Ly, Josine Verhagen, and Eric-Jan Wagenmakers wrote a very detailed response, which I just saw the other day while in Banff. (If not in Schiphol, which would have been more appropriate!)

“In this rejoinder we argue that Robert’s (2016) alternative view on testing has more in common with Jeffreys’s Bayes factor than he suggests, as they share the same ‘‘shortcomings’’.”

Rather unsurprisingly (!), the authors agree with my position on the dangers of ignoring decisional aspects when using the Bayes factor. A point of dissension is the resolution of the Jeffreys[-Lindley-Bartlett] paradox. One consequence derived by Alexander and co-authors is that priors should change between testing and estimating, because the parameters have a different meaning under the null and under the alternative. A point I agree with, in that these parameters are indexed by the model [index!], but with which I disagree when it comes to arguing that the same parameter (e.g., a mean under model M¹) should have two priors when moving from testing to estimation. To state that the priors within the marginal likelihoods “are not designed to yield posteriors that are good for estimation” (p.45) amounts to wishful thinking. I also do not find within the paper or the response a strong justification for choosing an improper prior on the nuisance parameter, e.g. σ, with the same constant under both models. Another a posteriori validation, in my opinion. However, I agree with the conclusion that the Jeffreys paradox prohibits the use of an improper prior on the parameter being tested (or of the test itself). A second point made by the authors is that Jeffreys’ Bayes factor is information consistent, which is correct but does not solve my quandary with the lack of precise calibration of the object, namely that alternatives abound in a non-informative situation.
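To recall what is at stake in the paradox, here is a quick numerical illustration (mine, not from the paper or the response) in the Normal mean case, where inflating the prior scale τ under the alternative mechanically drives the Bayes factor towards the null:

```python
import numpy as np
from scipy.stats import norm

# Jeffreys-Lindley paradox in the Normal mean case: test H0: theta = 0
# against H1: theta ~ N(0, tau^2), with xbar ~ N(theta, sigma^2 / n).
n, sigma, xbar = 100, 1.0, 0.25        # xbar sits 2.5 standard errors from 0

for tau in [0.5, 1.0, 10.0, 100.0, 1e4]:
    m0 = norm.pdf(xbar, 0.0, sigma / np.sqrt(n))              # marginal under H0
    m1 = norm.pdf(xbar, 0.0, np.sqrt(sigma**2 / n + tau**2))  # marginal under H1
    print(f"tau = {tau:8.1f}   BF01 = {m0 / m1:10.2f}")

# BF01 grows without bound as tau does: a vaguer prior under the alternative
# mechanically favours the null, whence the prohibition of an improper prior
# on the parameter being tested.
```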

“…the work by Kamary et al. (2014) impressively introduces an alternative view on testing, an algorithmic resolution, and a theoretical justification.”

The second part of the comments is highly supportive of our mixture approach, and I obviously very much appreciate this support! Especially if we ever manage to turn the paper into a discussion paper! The authors also draw a connection with Harold Jeffreys’ distinction between testing and estimation, based upon Laplace’s succession rule (an unbearably slow succession law!). Which is well-taken, if somewhat specious, since this is a testing framework where a single observation can send the Bayes factor to zero or +∞. (I further enjoyed the connection of the Poisson-versus-Negative Binomial test with Jeffreys’ call for common parameters, and the supportive comments on our recent mixture reparameterisation paper with Kaniav Kamari and Kate Lee.) The other point, that the Bayes factor is more sensitive to the choice of the prior (beware the tails!), can be viewed as a plus for mixture estimation, as acknowledged there. (The final paragraph about the faster convergence of the weight α is not strongly…
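For readers unfamiliar with the mixture approach of Kamary et al. (2014), here is a bare-bones sketch of it, under the simplifying (and non-representative) assumption that both competing densities are fully specified, so that the encompassing weight α can be Gibbs-sampled in a few lines:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

def mixture_test(x, f1, f2, a=0.5, n_iter=5000, rng=rng):
    """Testing-by-mixture sketch: embed the two models within the
    encompassing mixture alpha*f1 + (1-alpha)*f2 and Gibbs-sample the
    posterior on alpha under a Beta(a, a) prior, via latent allocations.
    Here f1 and f2 are fully specified densities, unlike the general
    case in the paper where their parameters are unknown."""
    n = len(x)
    d1, d2 = f1(x), f2(x)
    alpha, draws = 0.5, np.empty(n_iter)
    for t in range(n_iter):
        p1 = alpha * d1 / (alpha * d1 + (1 - alpha) * d2)
        z = rng.random(n) < p1                     # latent allocations
        alpha = rng.beta(a + z.sum(), a + n - z.sum())
        draws[t] = alpha
    return draws

x = rng.normal(0.0, 1.0, 100)                      # data from model 1
draws = mixture_test(x,
                     f1=lambda x: norm.pdf(x, 0.0, 1.0),
                     f2=lambda x: norm.pdf(x, 1.0, 1.0))
print("posterior mean of alpha:", draws[1000:].mean())  # drifts towards 1
```

The posterior on α then concentrates near one or zero as evidence accumulates for either model, replacing the Bayes factor with an estimation target.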

nonparametric Bayesian clay for robust decision bricks

Posted in Statistics on January 30, 2017 by xi'an

Just received an email today that our discussion, written with Judith, of Chris Holmes and James Watson’s paper is now published as Statistical Science 2016, Vol. 31, No. 4, 506-510… While it is almost identical to the arXiv version, it can be read on-line.