## Archive for exponential families

## linearity, reversed

Posted in Books, Kids with tags Annals of Statistics, cross validated, exponential families, linearity, Persi Diaconis, posterior mean on September 19, 2020 by xi'an

**W**hile answering a question on X validated about the posterior mean being a weighted sum of the prior mean and of the maximum likelihood estimator, with weights that do not depend on the data, which holds in conjugate natural exponential family settings, I re-read this wonderful 1979 paper of Diaconis & Ylvisaker establishing the converse, namely that when the linear combination holds, the prior must be conjugate! This holds within exponential families, but I cannot think of a reasonable case outside exponential families where the linearity holds (again with constant weights, as otherwise it always holds in dimension one, albeit with weights possibly outside [0,1]).

## a glaringly long explanation

Posted in Statistics with tags ABC, Bayesian Choice, cross validated, exponential families, proof, socks, sufficient statistics, teaching, Thomas Bayes' portrait, undergraduates, Université Paris Dauphine on December 19, 2018 by xi'an

**I**t is funny that, when I am teaching the rudiments of Bayesian statistics to my undergraduate students at Paris-Dauphine, including ABC via Rasmus’ socks, specific questions about the book (The Bayesian Choice) start popping up on X validated! Last week was about the proof that ABC is exact when the tolerance is zero and the summary statistic is sufficient.
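The tolerance-zero exactness can be checked numerically on a toy model (my own made-up numbers, not from the book): with Binomial data and a Beta prior, the number of successes is sufficient, so accepting only pseudo-data that exactly reproduce it recovers the exact Beta posterior.

```python
import random

random.seed(42)

# Toy check that ABC with zero tolerance is exact when conditioning on a
# sufficient statistic: Binomial(n, p) data with a Beta(a, b) prior on p.
# Accepting only pseudo-data whose success count matches the observed s
# reproduces the exact Beta(a + s, b + n - s) posterior.

n, s = 20, 13          # observed: 13 successes out of 20 trials
a, b = 1.0, 1.0        # uniform Beta prior

accepted = []
while len(accepted) < 5000:
    p = random.betavariate(a, b)                          # draw from the prior
    pseudo = sum(random.random() < p for _ in range(n))   # simulate pseudo-data
    if pseudo == s:                                       # zero-tolerance ABC
        accepted.append(p)

abc_mean = sum(accepted) / len(accepted)
exact_mean = (a + s) / (a + b + n)        # Beta(a + s, b + n - s) posterior mean
print(abc_mean, exact_mean)               # the two means should be close
```

With 5,000 accepted draws the ABC estimate of the posterior mean agrees with the analytic value up to Monte Carlo error; any strictly positive tolerance would instead target a smeared version of the posterior.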

This week is about conjugate distributions for exponential families (not that there are many others!), which led me to explain both the validation of the conjugacy and the derivation of the posterior expectation of the mean of the natural sufficient statistic in far more detail than in the book itself. Hopefully in a profitable way.
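The linearity of this posterior expectation, with data-independent weights, can be verified on a standard conjugate pair (a sketch with hypothetical numbers of my own): for Poisson observations under a Gamma prior, the posterior mean of the rate is a convex combination of the prior mean and the MLE.

```python
# Numerical check of the posterior-mean identity for a conjugate pair:
# with x_1, ..., x_n ~ Poisson(lam) and a Gamma(a, b) prior on lam
# (b a rate parameter), the posterior is Gamma(a + sum(x), b + n), so
#   E[lam | x] = w * (a / b) + (1 - w) * xbar,   with   w = b / (b + n),
# a convex combination of the prior mean and the MLE whose weight w
# depends only on n, never on the data.

a, b = 2.0, 1.0                 # Gamma prior with mean a/b = 2
x = [3, 5, 2, 4, 6, 1, 3]       # hypothetical Poisson sample
n, xbar = len(x), sum(x) / len(x)

post_mean = (a + sum(x)) / (b + n)       # exact conjugate posterior mean
w = b / (b + n)                          # data-independent weight
blend = w * (a / b) + (1 - w) * xbar     # prior mean / MLE combination

print(post_mean, blend)                  # identical up to floating point
```

Diaconis & Ylvisaker's converse says this is no coincidence: within a natural exponential family, a prior producing such a linear posterior expectation with constant weights has to be conjugate.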

## indecent exposure

Posted in Statistics with tags ABC, Bayesian optimisation, Bretagne, Brittany, exponential families, image analysis, image processing, inference, Lugano, maximum likelihood estimation, MCqMC 2018, pre-processing, Rennes on July 27, 2018 by xi'an

**W**hile attending my last session at MCqMC 2018, in Rennes, before taking a train back to Paris, I was confronted with this radical opinion upon our previous work with Matt Moores (Warwick) and other coauthors from QUT, where the speaker, Maksym Byshkin from Lugano, defended a new approach for maximum likelihood estimation using novel MCMC methods. The approach is based on the fixed-point equation characterising maximum likelihood estimators for exponential families, where the theoretical and empirical moments of the natural statistic are equal. Using a Markov chain with the said exponential family as stationary distribution, the fixed-point equation can be turned into a zero-divergence equation, requiring simulation of pseudo-data from the model, which depends on the unknown parameter. Breaking this circular argument, the authors note that simulating pseudo-data that reproduce the observed value of the sufficient statistic is enough. This is related to Geyer and Thompson's (1992) famous paper on Monte Carlo maximum likelihood estimation. From there I was and remain lost, as I cannot see why a derivative of the expected divergence with respect to the parameter θ can be computed when this divergence is found by Monte Carlo rather than exhaustive enumeration. And later used in a stochastic gradient move on the parameter θ… Especially when the null divergence is imposed on the parameter. In any case, the final slide shows an application to a large image and an Ising model, solving the problem (?) in 140 seconds and suggesting indecency, when our much slower approach is intended to produce a complete posterior simulation in this context.
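The fixed-point characterisation itself is easy to illustrate on a toy exponential family (this is my own minimal sketch, not the talk's actual method): for Bernoulli trials with natural parameter θ, the MLE solves E_θ[x] = x̄, and a Robbins-Monro recursion driven by simulated pseudo-data can solve this moment equation without ever evaluating the likelihood.

```python
import math
import random

random.seed(1)

# Moment-matching characterisation of the MLE in an exponential family:
# at the MLE, the expected sufficient statistic equals its observed value.
# For Bernoulli trials with natural parameter theta, E[x] = sigmoid(theta),
# so a Robbins-Monro recursion fed with simulated pseudo-data solves the
# fixed-point equation  E_theta[x] = s_obs  by stochastic approximation.

def simulate_mean(theta, m=200):
    """Monte Carlo estimate of E[x] under Bernoulli(sigmoid(theta))."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return sum(random.random() < p for _ in range(m)) / m

s_obs = 0.7          # observed mean of the sufficient statistic
theta = 0.0          # starting point
for t in range(1, 3001):
    gamma = 1.0 / t ** 0.6                # slowly decreasing step size
    theta += gamma * (s_obs - simulate_mean(theta))

p_hat = 1.0 / (1.0 + math.exp(-theta))
print(theta, p_hat)   # p_hat near 0.7, theta near log(0.7 / 0.3)
```

In this one-dimensional toy the recursion converges to the exact MLE, θ̂ = log(0.7/0.3); the hard part in realistic models like the Ising model is that each `simulate_mean` call requires an MCMC run, which is where the approaches diverge.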

## Larry Brown (1940-2018)

Posted in Books, pictures, Statistics, University life with tags decision theory, exponential families, James-Stein estimator, Larry Brown, mathematical statistics, Philadelphia, Wharton Business School on February 21, 2018 by xi'an

**J**ust learned a few minutes ago that my friend Larry Brown has passed away today, after fiercely fighting cancer till the end. My thoughts of shared loss and deep support first go to my friend Linda, his wife, and to their children. And to all their colleagues and friends at Wharton. I have known Larry for all of my career, from working on his papers during my PhD to being a temporary tenant in his Cornell University office in White Hall while he was mostly away on sabbatical during the academic year 1988-1989, and then periodically meeting with him in Cornell and then Wharton over the years. He and Linda were always unbelievably welcoming and I fondly remember many times at their place or in superb restaurants in Philly and elsewhere. And of course remembering just as fondly the many chats we had over these years about decision theory, admissibility, James-Stein estimation, and all aspects of mathematical statistics he loved and handled at an ethereal level of abstraction. His book on exponential families remains to this day one of the central books in my library, to which I kept referring on a regular basis… For certain, I will miss the friend and the scholar in the coming years, but will keep returning to this book and have shared memories come back to me as I browse through its yellowed pages and typewriter-style print. Farewell, Larry, and thanks for everything!

## admissible estimators that are not Bayes

Posted in Statistics with tags admissibility, Bayes estimators, Cornell University, decision theory, exponential families, hypothesis testing, loss function on December 30, 2017 by xi'an

**A** question that popped up on X validated made me search for a little while for point estimators that are both admissible (under a certain loss function) and not generalised Bayes (under the same loss function), before asking Larry Brown, Jim Berger, or Ed George. The answer came through Larry's book on exponential families, with the two examples attached. (Following our 1989 collaboration with Roger Farrell at Cornell U, I knew about the existence of testing procedures that were both admissible and not Bayes.) The most surprising feature is that the associated loss function is strictly convex, as I would have thought that a less convex loss would have helped in finding such counter-examples.