## Bayesian goodness of fit

**P**ersi Diaconis and Guanyang Wang have just arXived an interesting reflection on the notion of Bayesian goodness of fit tests. Which is a notion that has always bothered me, in a rather positive sense (!), as

“I also have to confess at the outset to the zeal of a convert, a born again believer in stochastic methods. Last week, Dave Wright reminded me of the advice I had given a graduate student during my algebraic geometry days in the 70’s :`Good Grief, don’t waste your time studying statistics. It’s all cookbook nonsense.’ I take it back! …”David Mumford

The paper starts with a reference to David Mumford, whose paper with Wu and Zhou on exponential “maximum entropy” synthetic distributions is at the source (?) of this paper, and whose name appears in its very title: “A conversation for David Mumford”…, about his conversion from pure (algebraic) maths to applied maths. The issue of (Bayesian) goodness of fit is addressed, with card shuffling examples, the null hypothesis being that the permutation resulting from the shuffling is uniformly distributed if shuffling takes enough time. Interestingly, while the parameter space is compact as a distribution on a finite set, Lindley’s paradox still occurs, namely that the null (the permutation comes from a Uniform) is always accepted provided there is no repetition under a “flat prior”, which is the Dirichlet D(1,…,1) over all permutations. (In this finite setting an improper prior is definitely improper as it does not get proper after accounting for observations. Although I do not understand why the Jeffreys prior is not the Dirichlet(½,…,½) in this case…) When resorting to the exponential family of distributions entertained by Zhou, Wu and Mumford, including the uniform distribution as one of its members, Diaconis and Wang advocate the use of a conjugate prior (exponential family, right?!) to compute a Bayes factor that simplifies into a ratio of two intractable normalising constants. For which the authors suggest using importance sampling, thermodynamic integration, or the exchange algorithm. Except that they rely on the (dreaded) harmonic mean estimator for computing the Bayes factor in the following illustrative section! Due to the finite nature of the space, I presume this estimator still has a finite variance. (Remark 1 calls for convergence results on exchange algorithms, which can be found I think in the just as recent arXival by Christophe Andrieu and co-authors.) An interesting if rare feature of the example processed in the paper is that the sufficient statistic used for the permutation model can be directly simulated from a Multinomial distribution. This is rare as seen when considering the benchmark of Ising models, for which the summary and sufficient statistic cannot be directly simulated. (If only…!) In fine, while I enjoyed the paper a lot, I remain uncertain as to its bearings, since defining an objective alternative for the goodness-of-fit test becomes quickly challenging outside simple enough models.

## Leave a Reply