## Archive for prior selection

## O’Bayes 2019 conference program

Posted in Kids, pictures, Statistics, Travel, University life with tags Bayesian conference, Bayesian model selection, BNP12, England, frequentist inference, imprecise probabilities, ISBA, O'Bayes 2019, objective Bayes, prior selection, Statistical learning, University of Warwick on May 13, 2019 by xi'an

**T**he full and definitive program of the O’Bayes 2019 conference in Warwick is now online, including discussants for all papers, and the three [and free] tutorials on Friday afternoon, 28 June, on model selection (M. Barbieri), recent advances in MCMC (G.O. Roberts), and BART (E.I. George). Registration remains open at the reduced rate, and conference participants can still send me poster submissions.

## leave Bayes factors where they once belonged

Posted in Statistics with tags Bayes factors, Bayesian Analysis, Bayesian decision theory, cross validated, prior comparison, prior predictive, prior selection, The Bayesian Choice, The Beatles, using the data twice, xkcd on February 19, 2019 by xi'an

**I**n the past weeks I have received and read several papers (and X validated entries) where the Bayes factor is used to compare priors. This does not look right to me, not [only] on the basis of my general dislike of Bayes factors, but because it seems to clash with the (my?) concept of Bayesian model choice, and because data should not play a role in that situation: the data is used to select a *prior*, hence at least twice to run the inference; the comparison resorts to a *single* parameter value (namely the one behind the data) to decide between two distributions; it has no asymptotic justification; and it eventually favours the prior concentrated on the maximum likelihood estimator. And more. But I fear that this reticence to test for prior adequacy also extends to the prior predictive, or Box’s p-value, namely the probability under this prior predictive of observing something “more extreme” than the current observation, to quote from David Spiegelhalter.
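For concreteness, here is a minimal sketch of Box’s prior predictive p-value in a toy setting entirely of my own making (a Normal mean with a vague Normal prior, and the absolute sample mean as the discrepancy statistic); none of the model choices below come from the papers discussed above:

```python
import numpy as np

rng = np.random.default_rng(0)

def prior_predictive_pvalue(y_obs, n_sims=10_000):
    """Box's p-value: the probability, under the prior predictive, of a
    discrepancy statistic at least as extreme as the observed one.
    Hypothetical setup: a N(mu, 1) model with a N(0, 10^2) prior on mu,
    and |sample mean| as the discrepancy."""
    n = len(y_obs)
    t_obs = np.abs(np.mean(y_obs))
    mu = rng.normal(0.0, 10.0, size=n_sims)                 # draws from the prior
    y_rep = rng.normal(mu[:, None], 1.0, size=(n_sims, n))  # prior predictive samples
    t_rep = np.abs(y_rep.mean(axis=1))
    return float(np.mean(t_rep >= t_obs))

y = rng.normal(1.0, 1.0, size=20)   # toy "observed" data
p = prior_predictive_pvalue(y)
```

With such a diffuse prior the replicated statistic is typically more extreme than the observed one, so the p-value is unexceptional, which is precisely why a vague prior is hard to reject this way.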

## JSM 2018 [#4½]

Posted in Statistics, University life with tags anchor, British Columbia, Canada, handbook of mixture analysis, JSM 2018, overfitting, prior selection, regularisation, Vancouver on August 10, 2018 by xi'an

**A**s I wrote my previous blog entry on JSM2018 before the sessions, I did not have the chance to comment on our mixture session, which I found most interesting, with new entries on the topic and a great discussion by Bettina Grün, including the important call for linking the weights with the other parameters, as independence between the two groups does not make sense when the number of components is uncertain. (Incidentally, our paper with Kaniav Kamary and Kate Lee does create such a dependence.) The talk by Deborah Kunkel was about anchored mixture estimation, joint work with Mario Peruggia, another arXival that I had missed.

The notion of anchoring found in this paper is to allocate specific observations to specific components. These observations are thus *anchored* to these components. Among other things, this modification of the sampling model implies the removal of the unidentifiability problem, and hence, formally, of the label-switching issue (or the lack thereof). (Although, as Peter Green repeatedly mentioned, visualising the parameter space as a point process eliminates the issue.) This idea is somewhat connected with the constraint Jean Diebolt and I imposed in our 1990 mixture paper, namely that no component would have fewer than two observations allocated to it, but imposing which ones are which of course reduces drastically the complexity of the model. Another (related) aspect of anchoring is that the observations anchored to the components act as parts of the prior model, modifying the initial priors (which can then become improper, as in our 1990 paper). The difficulty of the anchoring approach is to find observations to anchor in an unsupervised setting. The paper proceeds by optimising the allocations, which somewhat turns the prior into a data-dependent prior, since all observations are used to set the anchors and then used again for the standard Bayesian processing. In that respect, I would rather follow the sequential procedure developed by Nicolas Chopin and Florian Pelgrin, where the number of components grows by steps with the number of observations.
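To make the anchoring idea concrete, here is a minimal and entirely hypothetical Gibbs sketch for a two-component Gaussian mixture with known unit variances and equal weights, where two observations are pinned to their components; this illustrates the general mechanism only, not the actual estimator of Kunkel and Peruggia’s paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data from a two-component Gaussian mixture (hypothetical example)
y = np.concatenate([rng.normal(-2.0, 1.0, 60), rng.normal(2.0, 1.0, 40)])

# Anchoring: observation 0 is pinned to component 0 and observation 60 to
# component 1; their allocations are never resampled, which breaks the
# label-switching symmetry.
anchors = {0: 0, 60: 1}

mu = np.array([-1.0, 1.0])                    # initial component means
for _ in range(200):                          # minimal Gibbs sweeps
    # allocation update given the means (unit variances, equal weights)
    d = 0.5 * ((y - mu[0]) ** 2 - (y - mu[1]) ** 2)  # log-odds of component 1
    p1 = 1.0 / (1.0 + np.exp(-np.clip(d, -30, 30)))
    z = (rng.random(len(y)) < p1).astype(int)
    for i, k in anchors.items():              # re-impose the anchored allocations
        z[i] = k
    # mean update given the allocations (flat prior on each mean)
    for k in range(2):
        yk = y[z == k]
        mu[k] = rng.normal(yk.mean(), 1.0 / np.sqrt(len(yk)))
```

Because each component keeps its anchor, no component can empty out and the labels stay fixed across sweeps, which is the formal removal of the identifiability problem mentioned above.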

## JSM 2018 [#4]

Posted in Mountains, Statistics, Travel, University life with tags British Columbia, Canada, COPSS Award, Denver, Elisabeth L. Scott Award, finite mixtures, Jeffreys prior, JSM 2018, location-scale parameterisation, objective Bayes, prior selection, R.A. Fisher Award, reparameterisation, Vancouver on August 3, 2018 by xi'an

**A** last ½ day of sessions at JSM2018, in an almost deserted conference centre, with a first session put together by Mario Peruggia and a second on Advances in Bayesian Nonparametric Modeling and Computation for Complex Data. Here are the slides of my talk this morning in the Bayesian mixture estimation session.

(I updated the slides last night, as Slideshare most absurdly does not let you update versions!)

Since I missed the COPSS Award ceremony for a barbecue with friends on Locarno Beach, I only discovered this morning that the winner this year is Richard Samworth, from Cambridge University, who eminently deserves this recognition, if only because of his contributions to journal editing, as I can attest from my years with JRSS B. Congrats to him as well as to Bin Yu and Susan Murphy for their E.L. Scott and R.A. Fisher Awards! I also found out from an email to JSM participants that the next edition is in Denver, Colorado, which I visited only once in 1993, on a trip to Fort Collins to visit Kerrie Mengersen and Richard Tweedie. Given the proximity to the Rockies, I am thinking of submitting an invited session on ABC issues, which were not particularly well covered by this edition of JSM. (Feel free to contact me if you are interested in joining the session.)

## objectivity in prior distributions for the multinomial model

Posted in Statistics, University life with tags Haldane's prior, Laplace's prior, multinomial distribution, non-informative priors, objective Bayes, prior selection on March 17, 2016 by xi'an

**T**oday, Danilo Alvares, visiting from the Universitat de València, gave a talk at CREST about choosing a prior for the multinomial distribution, comparing different Dirichlet priors. In a sense this is a hopeless task, first because there is no reason to pick a particular prior unless one picks a very specific and a-Bayesian criterion to discriminate between priors, and second because the multinomial is a weird distribution, hardly a distribution at all in that it results from grouping observations into classes, often based on the observations themselves. A construction that should maybe be included within the choice of the prior? But there lurks the danger of ending up with a data-dependent prior. My other remark about this problem is that, among the token priors, Perks’ prior, using 1/k as its hyper-parameter [where k is the number of categories], is rather difficult to justify compared with 1/k² or 1/k³, except to some extent for aggregation consistency. And Laplace’s prior gets highly concentrated as the number of categories grows.
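To illustrate the last two points, here is a quick Monte Carlo sketch (my own toy comparison, not from the talk) contrasting the symmetric Dirichlet priors with α = 1 (Laplace) and α = 1/k (Perks): as k grows, Laplace’s prior pushes the largest cell probability towards 1/k, i.e. concentrates near equiprobability, while α = 1/k keeps most prior mass near the vertices of the simplex:

```python
import numpy as np

rng = np.random.default_rng(2)

def mean_max_cell(alpha, k, n_sims=5_000):
    """Average largest cell probability under a symmetric
    Dirichlet(alpha, ..., alpha) prior on a k-cell multinomial."""
    draws = rng.dirichlet(np.full(k, alpha), size=n_sims)
    return float(draws.max(axis=1).mean())

k = 50
laplace = mean_max_cell(1.0, k)      # Laplace: uniform on the simplex
perks = mean_max_cell(1.0 / k, k)    # Perks-type prior with alpha = 1/k
```

Under Laplace’s prior the expected maximum behaves like log(k)/k, shrinking towards equiprobability, whereas with α = 1/k a single cell typically dominates the draw, a very different notion of “non-informative”.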

## who’s afraid of the big B wolf?

Posted in Books, Statistics, University life with tags Allan Birnbaum, asymptotics, Bayes factor, Bayesian inference, Jeffreys-Lindley paradox, likelihood ratio, p-values, prior selection, testing of hypotheses, The Likelihood Principle, Theory of Probability on March 13, 2013 by xi'an

**A**ris Spanos just published a paper entitled “Who should be afraid of the Jeffreys-Lindley paradox?” in the journal *Philosophy of Science*. This piece is a continuation of the debate about frequentist versus likelihoodist versus Bayesian (should it be Bayesianist?! or Laplacist?!) testing approaches, exposed in Mayo and Spanos’ *Error and Inference*, and discussed in several posts of the ‘Og. I started reading the paper in conjunction with a paper I am currently writing for a special volume in honour of Dennis Lindley, a paper that I will discuss later on the ‘Og…

“…the postdata severity evaluation (…) addresses the key problem with Fisherian p-values in the sense that the severity evaluation provides the “magnitude” of the warranted discrepancy from the null by taking into account the generic capacity of the test (that includes n) in question as it relates to the observed data” (p. 88)

**F**irst, the antagonistic style of the paper reminds me of Spanos’ previous works, in that it relies on repeated value judgements (such as *“Bayesian charge”*, *“blatant misinterpretation”*, *“Bayesian allegations that have undermined the credibility of frequentist statistics”*, *“both approaches are far from immune to fallacious interpretations”*, *“only crude rules of thumbs”*, &tc.) and rhetorical sleights of hand. (See, e.g., *“In **contrast**, the severity account **ensures** learning from data by employing **trustworthy** evidence (…), the **reliability** of evidence being calibrated in terms of the **relevant** error probabilities”* [my stress].) Connectedly, Spanos often resorts to an unusual [at least for statisticians] vocabulary that amounts to newspeak. Here are some illustrations: *“summoning the generic capacity of the test”*, *“substantively significant”*, *“custom tailoring the generic capacity of the test”*, *“the fallacy of acceptance”*, *“the relevance of the generic capacity of the particular test”*; yes, the term *“generic capacity”* occurs there with a truly high frequency.