**A**n interesting question from X validated about constructing pseudo-priors for Bayesian model selection. Namely, how useful are these for the concept rather than the implementation? The only case where I am aware of pseudo-priors being used is in Bayesian MCMC algorithms such as Carlin and Chib (1995), where the distributions are used to complement the posterior distribution conditional on a single model (index) into a joint distribution across all model parameters. The trick of this construction is that the pseudo-priors can be essentially anything, including depending on the data as well. And while they impact the ability of the resulting Markov chain to move between spaces, they have no say in the resulting inference, either when choosing a model or when estimating the parameters of a chosen model. The concept of pseudo-priors was also central to the mis-interpretations found in Congdon (2006) and Scott (2002). Which we reanalysed with Jean-Michel Marin in Bayesian Analysis (2008) as the distinction between model-based posteriors and joint pseudo-posteriors.
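As a minimal sketch of the Carlin and Chib (1995) construction (a toy pair of Normal models and a deliberately data-dependent pseudo-prior, all choices of mine, not from the paper), the following Python code checks that the estimated posterior probability of the first model matches the exact conjugate answer, regardless of the pseudo-prior's role in mixing:

```python
import math
import random

random.seed(0)
x = [random.gauss(0.5, 1.0) for _ in range(20)]        # synthetic data
n, S, Q = len(x), sum(x), sum(xi * xi for xi in x)

# M1: x_i ~ N(theta, 1) with theta ~ N(0, 1); M2: x_i ~ N(0, 1), no parameter.
# Pseudo-prior for theta under M2: the M1 posterior N(S/(n+1), 1/(n+1)),
# a deliberately data-dependent choice, which the construction permits.
mu, v = S / (n + 1), 1.0 / (n + 1)

def log_norm(t, m, var):                               # log N(m, var) density at t
    return -0.5 * math.log(2.0 * math.pi * var) - (t - m) ** 2 / (2.0 * var)

hits, n_iter = 0, 20_000
for _ in range(n_iter):
    # theta | m, x: M1 conditional posterior if m = 1, pseudo-prior if m = 2
    # (the two coincide here, by this choice of pseudo-prior)
    theta = random.gauss(mu, math.sqrt(v))
    # m | theta, x: likelihood times true prior (M1) or pseudo-prior (M2);
    # the factor exp(-Q/2) (2 pi)^(-n/2), common to both models, cancels
    l1 = theta * S - 0.5 * n * theta ** 2 + log_norm(theta, 0.0, 1.0)
    l2 = log_norm(theta, mu, v)
    m = 1 if random.random() < 1.0 / (1.0 + math.exp(l2 - l1)) else 2
    hits += (m == 1)

post_m1 = hits / n_iter
# exact P(M1 | x) from the conjugate marginal likelihoods, for comparison
log_m1 = -0.5 * math.log(n + 1.0) - 0.5 * (Q - S * S / (n + 1))
log_m2 = -0.5 * Q
exact = 1.0 / (1.0 + math.exp(log_m2 - log_m1))
```

With this particular pseudo-prior the model indicator mixes perfectly; a poorer pseudo-prior would slow the moves between models, but the chain would still target the same posterior model probability.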

## Archive for cross validated

## are pseudopriors required in Bayesian model selection?

Posted in Books, Kids, pictures, Statistics, University life with tags Bayesian Analysis, canons, cross validated, Invalides, Joe Abercrombie, joint pseudo-posterior, model choice, Paris, posterior probability, pseudo-priors, The Last Argument of Kings on February 29, 2020 by xi'an

## an elegant sampler

Posted in Books, Kids, R, University life with tags cross validated, MCMC, Metropolis-Hastings algorithm, R, random walk, sampling from an atomic population, simplex, uniform simulation on January 15, 2020 by xi'an

**F**ollowing an X validated question on how to simulate a multinomial with fixed average, W. Huber produced a highly elegant and efficient resolution with the compact R code

```r
tabulate(sample.int((k-1)*n, s-n) %% n + 1, n) + 1
```

where *k* is the number of classes, *n* the number of draws, and *s* equal to *n* times the fixed average. The R function *sample.int* is an alternative to *sample* that seems faster. Interpreting the outcome of

```r
sample.int((k-1)*n, s-n)
```

as nonzero positions in a *(k-1) x n* matrix and adding a row of *n* 1’s leads to a simulation of integers between 1 and *k* by counting the 1’s in each of the *n* columns, which is the meaning of the above picture, where the colour code is added after counting the number of 1’s. Since there are *s* 1’s in this matrix, the sum is automatically equal to *s*. Since the *s-n* positions are chosen uniformly over the *(k-1) x n* locations, the outcome is uniform. The rest of the R code is a brutally efficient way to translate the idea into a function. (By comparison, I brute-forced the question by suggesting a basic Metropolis algorithm.)
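For readers outside R, the construction can be sketched in Python (the function name and arguments are mine, not from the post):

```python
import random

def fixed_sum_multinomial(k, n, s, rng=random):
    """Draw n integers in {1,...,k} summing to s, uniformly over such vectors.

    Huber's construction: pick s-n of the (k-1)*n matrix cells without
    replacement; each chosen cell adds a 1 in its column, on top of an
    extra row of 1's, so the column counts sum to n + (s-n) = s.
    """
    counts = [1] * n                       # the added row of n 1's
    for cell in rng.sample(range((k - 1) * n), s - n):
        counts[cell % n] += 1              # column index of the chosen cell
    return counts
```

The tabulate/sample.int one-liner performs exactly these steps in vectorised form.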

## sampling-importance-resampling is not equivalent to exact sampling [triste SIR]

Posted in Books, Kids, Statistics, University life with tags asymptotics, cross validated, importance sampling, infinite variance estimators, sampling w/o replacement, self-normalised importance sampling, SIR on December 16, 2019 by xi'an

**F**ollowing an X validated question on the topic, I reassessed a previous impression I had that sampling-importance-resampling (SIR) is equivalent to direct sampling for a given sample size. (As suggested in the above fit between a N(2,½) target and a N(0,1) proposal.) Indeed, when one produces a sample x₁,…,xₙ from the importance density *g* and resamples with replacement from this sample using importance weights proportional to f(xᵢ)/g(xᵢ), the resulting sample x̃₁,…,x̃ₙ is neither “i.” nor “i.d.”, since the resampling step involves a self-normalisation of the weights and hence a global bias in the evaluation of expectations. In particular, if the importance function *g* is a poor choice for the target *f*, meaning that the exploration of the whole support is imperfect, when at all possible (i.e., when both supports are equal), a given sample may well fail to reproduce the properties of an iid sample, as shown in the graph below where a Normal density is used for *g* while *f* is a Student t⁵ density:
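The effect can be reproduced with a hedged Python sketch of SIR in this very setting, a Student t with 5 degrees of freedom as target and a N(0,1) proposal (all names are mine; the densities only need to be known up to a constant):

```python
import math
import random

def sir(log_f, log_g, sample_g, n, rng=random):
    """Sampling-importance-resampling: n proposal draws, then a weighted
    resample with replacement, which is not an exact draw from f at finite n."""
    xs = [sample_g() for _ in range(n)]
    logw = [log_f(x) - log_g(x) for x in xs]
    top = max(logw)                              # stabilise the exponentials
    w = [math.exp(lw - top) for lw in logw]      # importance weights (up to a constant)
    return rng.choices(xs, weights=w, k=n)       # choices() renormalises w internally

# target: unnormalised Student t with 5 df; proposal: standard Normal
random.seed(3)
log_f = lambda z: -3.0 * math.log1p(z * z / 5.0)
log_g = lambda z: -0.5 * z * z
resampled = sir(log_f, log_g, lambda: random.gauss(0.0, 1.0), 10_000)
```

Because the Normal tails are lighter than the t⁵ tails, the weights have infinite variance and the resample concentrates on a few heavily weighted atoms; the resulting ties are one visible symptom of the lack of independence.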

## 45 votes for Jensen’s inequality

Posted in Books, Statistics with tags cross validated, Jensen's inequality, uniform distribution, vote on November 27, 2019 by xi'an

**F**ollowing a question on X validated as to why the mean of the log of a uniform distribution is not log(0.5), I replied with the obvious link to Jensen’s inequality and the more general, if equally obvious, remark that expectation is rarely invariant under transforms, and ended up with a high number of up-votes on that answer. Which bemuses me, given the basic question and equally basic answer..!
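The gap is easy to check numerically (a sketch with 10⁵ simulated uniforms, my choice of sample size): E[log U] = ∫₀¹ log u du = −1, while log E[U] = log(½) ≈ −0.693, consistent with Jensen's inequality for the concave log.

```python
import math
import random

random.seed(42)
u = [1.0 - random.random() for _ in range(100_000)]   # uniforms in (0, 1], so log is safe
mean_log = sum(math.log(t) for t in u) / len(u)       # estimates E[log U] = -1
log_mean = math.log(sum(u) / len(u))                  # estimates log E[U] = log(1/2)
```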

## conditioning on zero probability events

Posted in Books, Kids, pictures, Statistics, University life with tags conditional probability, cross validated, StackExchange, sufficiency, zero measure set on November 15, 2019 by xi'an

**A**n interesting question on X validated as to how come a statistic T(X) can be sufficient when its support depends on the parameter θ behind the distribution of X. The reasoning there being that the distribution of X given T(X)=t does depend on θ since it is not defined for some values of θ … Which is not correct in that the conditional distribution of X depends on the realisation of T, meaning that if this realisation is impossible, then the conditional is arbitrary and of no relevance. Which also led me to tangentially notice and bemoan that most (Stack) exchanges on conditioning on zero probability events are pretty unsatisfactory in that they insist on interpreting P(X=x) [equal to zero] in a literal sense when it is merely a notation in the continuous case. And undefined when X has a discrete support. (Conditional probability is always a sore point for my students!)

## Why do we draw parameters to draw from a marginal distribution that does not contain the parameters?

Posted in Statistics with tags accept-reject algorithm, Animal Farm, auxiliary variables, cross validated, importance sampling, marginalisation, multiple importance methods, probability basics on November 3, 2019 by xi'an

**A** revealing question on X validated about a simulation concept students (and others) have trouble coming to grips with. Namely, using auxiliary variates to simulate from a marginal distribution, since these auxiliary variables are later dismissed and hence appear to them (students) of no use at all. Even after being exposed to the accept-reject algorithm. Or to multiple importance sampling. In the sense that a realisation of a random variable can be associated with a whole series of densities in an importance weight, all of them being valid (but some more equal than others!).
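As a concrete instance, here is a Python sketch of accept-reject for a Beta(2,2) target from Uniform(0,1) proposals (a toy example of mine, not from the question): the auxiliary uniform *u* decides acceptance and is then thrown away, never appearing in the output.

```python
import random

def beta22(rng=random):
    """Accept-reject for a Beta(2,2) target, f(x) = 6x(1-x), using Uniform(0,1)
    proposals with bound M = 3/2, so the acceptance probability is 4x(1-x)."""
    while True:
        x = rng.random()              # proposal draw
        u = rng.random()              # auxiliary uniform variate
        if u <= 4.0 * x * (1.0 - x):  # accept with probability f(x) / (M g(x))
            return x                  # u is dismissed: only x is ever reported

random.seed(7)
sample = [beta22() for _ in range(10_000)]
```

The pair (x, u) is drawn jointly, yet marginalising out *u* is exactly what turns the uniform proposals into draws from the target, mean 1/2 for Beta(2,2).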