Archive for Bayesian Analysis

leave Bayes factors where they once belonged

Posted in Statistics with tags Bayes factors, Bayesian Analysis, Bayesian decision theory, cross validated, prior comparison, prior predictive, prior selection, The Bayesian Choice, The Beatles, using the data twice, xkcd on February 19, 2019 by xi'an

In the past weeks I have received and read several papers (and X validated entries) where the Bayes factor is used to compare priors. Which does not look right to me, not only on the basis of my general dislike of Bayes factors!, but simply because this seems to clash with the (my?) concept of Bayesian model choice, and also because data should not play a role in that situation: from being used to select a prior, hence running the inference at least twice, to resorting to a single parameter value (namely the one behind the data) to decide between two distributions, to having no asymptotic justification, to eventually favouring the prior concentrated on the maximum likelihood estimator. And more. But I fear that this reticence to test for prior adequacy also extends to the prior predictive, or Box's p-value, namely the probability under this prior predictive of observing something "more extreme" than the current observation, to quote from David Spiegelhalter.

Bayesian intelligence in Warwick
Posted in pictures, Statistics, Travel, University life, Wines with tags ABC, AI, artificial intelligence, Bayesian Analysis, Bayesian intelligence, CRiSM, effective dimension, estimating constants, Monte Carlo integration, neural network, paradoxes, seminar, University of Warwick on February 18, 2019 by xi'an

This is an announcement for an exciting CRiSM Day in Warwick on 20 March 2019, with speakers
10:00-11:00 Xiao-Li Meng (Harvard): “Artificial Bayesian Monte Carlo Integration: A Practical Resolution to the Bayesian (Normalizing Constant) Paradox”
11:00-12:00 Julien Stoehr (Dauphine): “Gibbs sampling and ABC”
14:00-15:00 Arthur Ulysse Jacot-Guillarmod (École Polytechnique Fédérale de Lausanne): “Neural Tangent Kernel: Convergence and Generalization of Deep Neural Networks”
15:00-16:00 Antonietta Mira (Università della Svizzera italiana e Università degli studi dell’Insubria): “Bayesian identifications of the data intrinsic dimensions”
Abstracts are available on the workshop webpage and attendance is free. The title of the workshop mentions Bayesian Intelligence: this obviously includes human intelligence and not just AI!
statistics in Nature [a tale of the two Steves]
Posted in Books, pictures, Statistics with tags Bayesian Analysis, causality, clinical trials, frequentism, Nature, p-value hacking, placebo effect, statistical evidence, Stephen Senn, variability on January 15, 2019 by xi'an

In the 29 November issue of Nature, Stephen Senn (formerly at Glasgow) wrote an article about the pitfalls of personalized medicine, as the statistics behind the reasoning are flawed.
“What I take issue with is the de facto assumption that the differential response to a drug is consistent for each individual, predictable and based on some stable property, such as a yet-to-be-discovered genetic variant.” S. Senn
One (striking) reason being that the studies rest on a sort of low-level determinism that does not account for many sources of variability, resulting in over-confidence in causality. Stephen argues that improvement lies in insisting on repeated experiments on the same subjects (with an increased challenge in modelling, since this requires longitudinal models with dependent observations). And in “drop[ping] the use of dichotomies”, favouring instead continuous modelling of measurements.
And in the 6 December issue, Steven Goodman calls (in the World View tribune) for probability statements to be attached as confidence indices to scientific claims, which he takes great pains to distinguish from p-values and links with Bayesian analysis. (Bayesian analysis that Stephen regularly objects to.) While I applaud the call, I am quite pessimistic about the follow-up it will generate, the primary reply being that posterior probabilities can be manipulated just as easily as p-values. And that Bayesian probabilities are not “real” probabilities (dixit Don Fraser or Deborah Mayo).
talks at CIRM with special tee-shirts
Posted in Books, pictures, Statistics, University life with tags Þe Norse face, Bayesian Analysis, Centre International de Rencontres Mathématiques, CIRM, CNRS, HMC, JASP, logo, Luminy, Marseille, master class, Monte Carlo Statistical Methods, STAN, tee-shirt, Université Aix Marseille, videoed lectures, ye Norse farce on November 21, 2018 by xi'an

X entropy
Posted in Books, Kids, pictures, Statistics, Travel, University life with tags Bayesian Analysis, Bayesian econometrics on November 16, 2018 by xi'an

Another discussion on X validated related to maximum entropy priors and their dependence on the dominating measure μ chosen to define the entropy. With the same electrical engineering student as previously. In the wee hours at Casa Matemática Oaxaca. As I took the [counter-]example of the Lebesgue dominating measure versus a Normal density times the Lebesgue measure, both producing the same maximum entropy distribution [with obviously the same density wrt the Lebesgue measure] when the constraints involve the second moment, this confused the student and I spent some time constructing another example with different outcomes, comparing the Lebesgue measure with the [artificial] dx/√|x| measure.
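For concreteness, the point of the example and counter-example can be sketched as follows (my notation: λ is the Lagrange multiplier calibrated by the second-moment constraint E_π[X²] = σ²):

```latex
% maximum entropy of \pi relative to a dominating measure \mu,
% under the second-moment constraint E_\pi[X^2]=\sigma^2
\[
\max_\pi \; -\int \log\frac{\mathrm{d}\pi}{\mathrm{d}\mu}\,\mathrm{d}\pi
\qquad\Longrightarrow\qquad
\frac{\mathrm{d}\pi}{\mathrm{d}\mu}(x) \propto \exp\{\lambda x^2\}
\]
% \mu = \mathrm{d}x :               \pi(x) \propto e^{\lambda x^2}, a Normal density
% \mu = e^{-x^2/2}\,\mathrm{d}x :   \pi(x) \propto e^{(\lambda-1/2)x^2}, the same
%                                   Normal density wrt Lebesgue once \lambda is
%                                   recalibrated to the constraint
% \mu = \mathrm{d}x/\sqrt{|x|} :    \pi(x) \propto |x|^{-1/2} e^{\lambda x^2},
%                                   no longer Normal: the maximum entropy prior
%                                   genuinely changes with \mu
```

In the first two cases the exponential tilt absorbs the Normal base measure, which is why the student saw no difference; the |x|^{-1/2} factor in the third case cannot be absorbed into a quadratic exponent, hence the different outcome.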
I am actually surprised at how little discussion of that point occurs in the literature (or at least in my googling attempts). Just a mention in Bayesian Analysis in Statistics and Econometrics.
optimal proposal for ABC
Posted in Statistics with tags ABC, ABC-PMC, ABC-SMC, adaptive importance sampling, Bayesian Analysis, computational astrophysics, effective sample size, Ewan Cameron, kernel density estimator, Kullback-Leibler divergence, mixtures of distributions on October 8, 2018 by xi'an

As pointed out by Ewan Cameron in a recent c’Og’ment, Justin Alsing, Benjamin Wandelt, and Stephen Feeney arXived a paper last August where they discuss an optimal proposal density for ABC-SMC and ABC-PMC, optimality being understood as maximising the effective sample size.
“Previous studies have sought kernels that are optimal in the (…) Kullback-Leibler divergence between the proposal KDE and the target density.”
The effective sample size for ABC-SMC is actually the regular ESS multiplied by the fraction of accepted simulations. Which surprisingly converges to the ratio

E[q(θ)/π(θ)|D] / E[π(θ)/q(θ)|D]

under the (true) posterior. (Where q(θ) is the importance density and π(θ) the prior density.) When optimised in q, this usually produces an implicit equation whose solution is a form of geometric mean between posterior and prior. The paper looks at approximate ways to find this optimum, especially via an upper bound on q. Something I do not understand from the simulations is that the starting point seems to be the plain geometric mean between posterior and prior, in a setting where the posterior is supposedly unavailable… Actually the paper is silent on how the optimum can be approximated in practice, for the very reason I just mentioned, apart from using a non-parametric or mixture estimate of the posterior after each SMC iteration, which may prove extremely costly when processed through the optimisation steps. However, an interesting side outcome of these simulations is that the above geometric mean does much better than the posterior itself when considering the effective sample size.
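As a minimal sketch of the criterion (function and variable names are mine, not from the paper), the ABC-SMC effective sample size described above is the usual importance-sampling ESS multiplied by the acceptance fraction:

```python
import numpy as np

def abc_smc_ess(weights, accepted):
    """ABC-SMC effective sample size: the regular importance-sampling
    ESS, (sum w)^2 / sum w^2, multiplied by the fraction of accepted
    simulations."""
    w = np.asarray(weights, dtype=float)
    ess = w.sum() ** 2 / (w ** 2).sum()
    # scale by the proportion of simulations accepted by the ABC kernel
    return ess * np.mean(accepted)

# toy check: equal weights and full acceptance recover the sample size
n = 1000
print(abc_smc_ess(np.ones(n), np.ones(n, dtype=bool)))  # -> 1000.0
```

With unequal weights the first factor drops below n, and a low ABC acceptance rate penalises the proposal further, which is the quantity the optimal q is meant to maximise.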
off to Vancouver
Posted in Mountains, pictures, Running, Statistics, Travel, University life with tags Bayesian Analysis, British Columbia, Canada, default prior, Joint Statistical Meeting, JSM 2018, mixture of distributions, objective Bayes, summer of British conferences, Vancouver Island on July 29, 2018 by xi'an

I am off today to Vancouver for JSM 2018, eight years after I visited the West Coast for another JSM! And a contender for the Summer of British Conferences, since it is in British Columbia.
And again looking forward to the city, (some of) the meeting, and getting together with long-time-no-see friends. Followed by a fortnight of vacation on Vancouver Island, where ‘Og posting may get sparse…
I hope I can take advantage of the ten hours in the plane from Paris to write my talk from scratch about priors for mixtures of distributions. Based on our papers with Clara Grazian and with Kaniav Kamary and Kate Lee. Still having some leeway since my talk is on Thursday morning, on the last day of the meeting…