## Archive for likelihood function

## my likelihood is dominating my prior [not!]

Posted in Kids, Statistics with tags Bayesian inference, cross validated, likelihood function, Likelihood Principle, magnitude, scaling on August 29, 2019 by xi'an

**A**n interesting misconception read on X validated today, with a confusion between the absolute value of the likelihood function and its variability. Which I have trouble explaining except possibly by an extrapolation from the discrete case and a confusion between the probability density of the data [scaled as a probability] and the likelihood function [scale-less]. I also had trouble convincing the originator of the question of the irrelevance of the scale of the likelihood *per se*, even when demonstrating that |𝚺| could vanish from the posterior with no consequence whatsoever. It is only when I thought of the case where the likelihood is constant in 𝜃 [and the posterior hence equal to the prior, whatever the magnitude of that constant] that I managed to make my case.

## are there a frequentist and a Bayesian likelihoods?

Posted in Statistics with tags Bayes factor, Bayes formula, cross validated, dominating measure, Harold Jeffreys, likelihood function, Metron, probability theory, R.A. Fisher, University of Amsterdam, wikipedia on June 7, 2018 by xi'an

**A** question that came up on X validated and led me to spot rather poor entries in Wikipedia about both the likelihood function and Bayes’ Theorem. Where unnecessary and confusing distinctions are made between the frequentist and Bayesian versions of these notions. I have already discussed the latter (Bayes’ theorem) a fair amount here. The discussion about the likelihood is quite bemusing, in that the likelihood function is the … function of the parameter equal to the density indexed by this parameter at the observed value.

“What we can find from a sample is the likelihood of any particular value of r, if we define the likelihood as a quantity proportional to the probability that, from a population having the particular value of r, a sample having the observed value of r, should be obtained.” R.A. Fisher, *On the “probable error” of a coefficient of correlation deduced from a small sample*. Metron 1, 1921, p.24

By mentioning an informal side to likelihood (rather than to likelihood function), and then stating that the likelihood is not a probability in the frequentist version but a probability in the Bayesian version, the W page makes a complete and unnecessary mess. Whoever is ready to rewrite this introduction is more than welcome! (Which reminded me of an earlier question also on X validated asking why a common reference measure was needed to define a likelihood function.)

This also led me to read a recent paper by Alexander Etz, whom I met at E.J. Wagenmakers’ lab in Amsterdam a few years ago. Following Fisher, as Jeffreys complained about

“..likelihood, a convenient term introduced by Professor R.A. Fisher, though in his usage it is sometimes multiplied by a constant factor. This is the probability of the observations given the original information and the hypothesis under discussion.” H. Jeffreys, *Theory of Probability*, 1939, p.28

Alexander defines the likelihood up to a constant, which causes extra-confusion, for free!, as there is no foundational reason to introduce this degree of freedom rather than imposing an exact equality with the density of the data (albeit with an arbitrary choice of dominating measure, never neglect the dominating measure!). The paper also repeats the message that the likelihood is not a probability (density, *missing in the paper*). And provides intuitions about maximum likelihood, likelihood ratio and Wald tests. But does not venture into a separate definition of the likelihood, being satisfied with the fundamental notion to be plugged into the magical formula

posterior ∝ prior × likelihood
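As a minimal sketch of why a constant factor in the likelihood is harmless once plugged into this formula, here is a grid approximation of a posterior computed twice, once with the exact density of the sample and once with its normalising constant dropped. Everything in it is made up for illustration and none of it comes from Alexander's paper: the data, the Normal model with known σ, the N(0, 10²) prior, the grid, and the helper names `grid_posterior`, `log_density`, `log_kernel`.

```python
import math

def grid_posterior(log_lik, log_prior, grid):
    """Normalise prior x likelihood over a finite grid of parameter values."""
    log_post = [log_lik(t) + log_prior(t) for t in grid]
    m = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(v - m) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical data, assumed N(theta, sigma^2) with sigma known
data = [1.2, 0.7, 1.9, 1.4, 0.3]
sigma = 2.0
n = len(data)

def log_density(theta):
    """Exact log-density of the sample, normalising constant included."""
    const = -0.5 * n * math.log(2 * math.pi * sigma ** 2)
    return const - sum((x - theta) ** 2 for x in data) / (2 * sigma ** 2)

def log_kernel(theta):
    """Same function of theta, with the constant factor dropped."""
    return -sum((x - theta) ** 2 for x in data) / (2 * sigma ** 2)

def log_prior(theta):
    """Assumed N(0, 10^2) prior, itself written up to a constant."""
    return -theta ** 2 / (2 * 10.0 ** 2)

# Grid of 1001 values of theta between -5 and 5
grid = [-5 + 0.01 * i for i in range(1001)]

post_exact = grid_posterior(log_density, log_prior, grid)
post_kernel = grid_posterior(log_kernel, log_prior, grid)

# Largest pointwise gap between the two posteriors (rounding error only)
max_gap = max(abs(a - b) for a, b in zip(post_exact, post_kernel))
print(max_gap)
```

The two posteriors agree up to floating-point rounding, since any factor free of θ cancels in the normalisation. Which is a statement about the posterior only: the sketch does not say anything about quantities like the evidence or a Bayes factor, where the same constant (and the same dominating measure) must be kept across models.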

## never mind the big data here’s the big models [workshop]

Posted in Kids, pictures, Statistics, Travel, University life with tags approximate likelihood, Bayesian model comparison, Bayesian statistics, big data, big models, GAMs, gaussian process, latent Gaussian models, likelihood function, misspecified model, model criticism, modelliing, point processes, Sex Pistols, spatial statistics, University of Warwick on December 22, 2015 by xi'an

**M**aybe the last occurrence this year of the pastiche of the iconic LP of the Sex Pistols!, made by Tamara Polajnar. The last workshop as well of the big data year in Warwick, organised by the Warwick Data Science Institute. I appreciated the different talks this afternoon, but enjoyed particularly Dan Simpson’s and Rob Scheichl’s. The presentation by Dan was so hilarious that I could not resist asking him for permission to post the slides here:

Not only hilarious [and I have certainly missed 67% of the jokes], but quite deep about the meaning(s) of modelling and his views about getting around the most blatant issues. Rob presented a more computational talk on the ways to reach petaflops on current supercomputers, in connection with weather prediction models used (or soon to be used) by the Met office. For a prediction area of 1 km². Along with significant improvements resulting from multiscale Monte Carlo and quasi-Monte Carlo. Definitely impressive! And a brilliant conclusion to the Year of Big Data (and big models).

## never mind the big data here’s the big models [workshop]

Posted in Kids, pictures, Statistics with tags Bayesian model comparison, big data, big models, likelihood function, misspecified model, model criticism, Sex Pistols, University of Warwick on December 10, 2015 by xi'an

**A** perfect opportunity to recycle the pastiche of the iconic LP of the Sex Pistols!, that Mark Girolami posted for the ATI Scoping workshop last month in Warwick. There is an open workshop on the theme of big data/big models next week in Warwick, organised by the Warwick Data Science Institute. It will take place on December 15, from noon till 5:30pm in the Zeeman Building. Invited speakers are

*“To avoid fainting, keep repeating ‘It’s only a model’…”*

## intractable likelihoods (even) for Alan

Posted in Kids, pictures, Statistics with tags ABC, Alan Turing Institute, consensus, decision theory, intractable likelihood, likelihood function, misspecified model, network, privacy, RKHS, Sex Pistols, summary statistics, University of Warwick on November 19, 2015 by xi'an

**I**n connection with the official launch of the Alan Turing Institute (or ATI, of which Warwick is a partner), it funded an ATI Scoping workshop ~~yesterday~~ a week ago in Warwick around the notion(s) of intractable likelihood(s) and how this could/should fit within the themes of the Institute [hence the scoping]. This is one among many such scoping workshops taking place at all partners, as reported on the ATI website. A workshop that was quite relaxed and great fun, if only for getting together with most people (and friends) in the UK interested in the topic. But also pointing out some new themes I had not previously thought of as related to ilike. For instance, questioning the relevance of likelihood for inference and putting forward decision theory under model misspecification, connecting with privacy and ethics [hence making intractable “good”!], introducing uncertain likelihood, getting more into network models, RKHS as a natural summary statistic, swarm of solutions for consensus inference… (And thanks to Mark Girolami for this homage to the iconic LP of the Sex Pistols!, that I played maniacally all over 1978…) My own two cents into the discussion were mostly variations of other discussions, borrowing from ABC (and ABC slides) to call for a novel approach to approximate inference:

## Statistics slides (4)

Posted in Books, Kids, Statistics, University life with tags asymptotics, Bayesian statistics, Don Rubin, EM algorithm, likelihood function, likelihood surface, missing values, Paris, score function, Université Paris Dauphine on November 10, 2014 by xi'an

**H**ere is the fourth set of slides for my third year statistics course, trying to build intuition about the likelihood surface and why on Earth would one want to find its maximum?!, through graphs. I am yet uncertain whether or not I will reach the point where I can teach more asymptotics so maybe I will also include asymptotic normality of the MLE under regularity conditions in this chapter…