## Develop Tynton Farm into a Visitor and Information Centre

We call on the Welsh Government to recognise the important contribution of Dr Richard Price, not only to the Enlightenment of the eighteenth century but also to the making of the modern world we live in today, and to develop his birthplace and childhood home into an information centre for visitors, where people of every nation and age can discover how his substantial contributions to theology, mathematics and philosophy have influenced the modern world.

Filed under: Books, pictures, Statistics, Travel, University life Tagged: ISBA, Richard Price, Richard Price Society, Thomas Bayes, Wales, Welsh

Î(θ,**u**)q(**u**)/C

and a Metropolis-Hastings proposal on that target simulating from k(θ,θ’)q(**u’**) *[meaning the auxiliary is simulated independently]* recovers the pseudo-marginal Metropolis-Hastings ratio

Î(θ’,**u**’)k(θ’,θ)/Î(θ,**u**)k(θ,θ’)

(which is a nice alternative proof that the method works!). The novel idea in the paper is that the proposal on the auxiliary **u** can take a different form while remaining manageable, for instance a two-block Gibbs sampler, or an elliptical slice sampler for the **u** component. The argument is that an independent update of **u** may lead the joint chain to get stuck. Among the illustrations in the paper, an Ising model (with no phase transition issue?) and a Gaussian process applied to the Pima Indian data set (despite a recent prohibition!). From the final discussion, I gather that the modification should be applicable in every (?) case where a pseudo-marginal approach is available, since the auxiliary distribution q(**u**) is treated as a black box. Quite an interesting read and proposal!
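To fix ideas, here is a toy Python sketch of the plain pseudo-marginal step that the paper generalises, with the auxiliary **u** refreshed independently at each proposal. The model is entirely my own hypothetical choice: the likelihood N(θ; 0, 1+τ²) is estimated unbiasedly by averaging the N(u_m, 1) density at θ over u_m ~ N(0, τ²), and the prior on θ is flat.

```python
import numpy as np

rng = np.random.default_rng(0)
M, tau = 10, 1.0   # hypothetical choices: auxiliary sample size and latent scale

def L_hat(theta, u):
    # unbiased estimator of the likelihood N(theta; 0, 1 + tau^2),
    # averaging the N(u_m, 1) density at theta over u_m ~ N(0, tau^2)
    return np.mean(np.exp(-0.5 * (theta - u)**2) / np.sqrt(2 * np.pi))

def pm_mh(T=20_000, step=1.0):
    theta, u = 0.0, rng.normal(0, tau, M)
    lhat = L_hat(theta, u)
    chain = np.empty(T)
    for t in range(T):
        theta_p = theta + step * rng.normal()   # random walk k(theta, theta')
        u_p = rng.normal(0, tau, M)             # independent refresh of u
        lhat_p = L_hat(theta_p, u_p)
        # flat prior on theta, so the pseudo-marginal ratio is lhat_p / lhat
        if rng.random() < lhat_p / lhat:
            theta, u, lhat = theta_p, u_p, lhat_p
        chain[t] = theta
    return chain

chain = pm_mh()   # marginally targets N(0, 2) exactly, despite the noisy lhat
```

Replacing the independent draw of u_p by a move that leaves q(**u**) invariant (a Gibbs block, an elliptical slice step) is precisely the modification the paper studies.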

Filed under: Books, Statistics, University life Tagged: Alan Turing Institute, auxiliary variable, doubly intractable problems, pseudo-marginal MCMC, slice sampling, University of Warwick

Filed under: Kids, Uncategorized Tagged: International Day for the Elimination of Violence against Women, Orange Day, UNiTE to End Violence against Women

```r
#target is N(0,1), proposal is N(0,.01)
T=1e5
prop=x=rnorm(T,sd=.01)
#log Metropolis-Hastings ratio of target over proposal
ratop=dnorm(prop,log=TRUE)-dnorm(prop,sd=.01,log=TRUE)
ratav=ratop[1]
logu=ratop-log(runif(T))
for (t in 2:T){
  if (logu[t]>ratav){
    x[t]=prop[t];ratav=ratop[t]
  }else{
    x[t]=x[t-1]}
}
```

It produces outputs of the following shape

which is quite amazing given the small variance. The reason for the lengthy freezes of the chain is the occurrence, with positive probability, of realisations from the proposal with very small proposal density values, since they induce very small Metropolis-Hastings acceptance probabilities and are almost “impossible” to leave. This is due to the lack of control of the target, which is flat over the domain of the proposal for all practical purposes. Obviously, in such a setting, the outcome is unrelated to the N(0,1) target!

Nor is the pathology specific to the normal proposal, in that switching to a t distribution with 3 degrees of freedom produces a similar outcome:

It is only when using a Cauchy proposal that the pattern vanishes:
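One way to quantify the three cases is to bound the target-to-proposal log-ratio, which drives the acceptance probabilities of the independent sampler. A quick numerical check (the grid is my own choice, and I am assuming the t and Cauchy proposals keep the same 0.01 scale as the normal one):

```python
import numpy as np

x = np.linspace(-5, 5, 100_001)   # grid covering the bulk of the N(0,1) target

log_target = -0.5 * x**2 - 0.5 * np.log(2 * np.pi)

def log_norm(x, sd):
    return -0.5 * (x / sd)**2 - np.log(sd) - 0.5 * np.log(2 * np.pi)

def log_cauchy(x, scale):
    return np.log(scale / np.pi) - np.log(x**2 + scale**2)

def log_t3(x, scale):
    c = 2 / (np.pi * np.sqrt(3))   # normalising constant of the t_3 density
    return np.log(c) - 2 * np.log1p((x / scale)**2 / 3) - np.log(scale)

# suprema of the log importance ratio over the grid
sup_norm = np.max(log_target - log_norm(x, 0.01))
sup_t3 = np.max(log_target - log_t3(x, 0.01))
sup_cauchy = np.max(log_target - log_cauchy(x, 0.01))
```

The N(0,.01) proposal gives an astronomically large supremum, the scaled t₃ a large one, and only the Cauchy keeps the ratio moderate, matching the fact that the freezing pattern only vanishes in the Cauchy case.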

Filed under: Kids, pictures, R, Statistics, University life Tagged: acceptance probability, convergence assessment, heavy-tail distribution, independent Metropolis-Hastings algorithm, Metropolis-Hastings algorithm, normal distribution, Student's t distribution

*“In this paper we have demonstrated the potential benefits, both theoretical and practical, of the independence sampler over the random walk Metropolis algorithm.”*

**P**eter Neal and Tsun Man Clement Lee arXived a paper on optimising the independent Metropolis-Hastings algorithm. I was a bit surprised at this “return” of the independent sampler, which I hardly mention in my lectures, so I had a look at the paper. The goal is to produce an equivalent to what Gelman, Roberts and Gilks (1996) obtained for random walk samplers. In the formal setting where the target is a product of n identical densities f, the optimal number k of components to update in one Metropolis-Hastings (within Gibbs) round is approximately 2.835/I, where I is the symmetrised Kullback-Leibler divergence between the (univariate) target f and the independent proposal q, provided I is finite. The most surprising part is that the optimal acceptance rate is again 0.234, as in the random walk case. This is surprising in that I usually associate the independent Metropolis-Hastings algorithm with high acceptance rates. But this is of course when calibrating the proposal q, not the block size k of the Gibbs part. Hence, while this calibration of the independent Metropolis-within-Gibbs sampler is worth the study and almost automatically applicable, it only applies to the category of problems where blocking can take place, as in the disease models illustrating the paper. And it requires an adequate choice of proposal distribution for, otherwise, the above quote becomes inappropriate.
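As a concrete instance of the 2.835/I rule, here is a small Python computation with a hypothetical Gaussian proposal, using the standard closed form of the symmetrised Kullback-Leibler divergence between two centred normals:

```python
def sym_kl_normal(sigma):
    # symmetrised KL divergence between N(0,1) and N(0, sigma^2):
    # KL(p||q) + KL(q||p) = (sigma^2 + 1/sigma^2)/2 - 1
    return 0.5 * (sigma**2 + 1 / sigma**2) - 1

sigma = 2.0            # hypothetical proposal scale
I = sym_kl_normal(sigma)
k_opt = 2.835 / I      # approximate optimal block size from the paper
```

With this (hypothetical) mismatch, I = 1.125 and the rule suggests updating two or three components per round; a proposal closer to the target makes I small and the optimal block correspondingly large.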

Filed under: Books, Statistics Tagged: 0.234, block sampling, Gibbs sampler, independent Metropolis-Hastings algorithm, Metropolis-within-Gibbs algorithm, optimal acceptance rate, random walk

Filed under: Books, Kids, Statistics Tagged: continuity, importance sampling, infinite variance estimators, moments, Monte Carlo experiment, Monte Carlo Statistical Methods

If a sample taken from an arbitrary distribution on {0,1}⁶ is censored of its (0,0,0,0,0,0) elements, and if the marginal probabilities are known for all six components of the random vector, what is an estimate of the proportion of (missing) (0,0,0,0,0,0) elements?

Since the censoring modifies all probabilities by the same renormalisation, i.e. divides them by the probability *ρ* of being different from (0,0,0,0,0,0), this probability can be estimated from the marginal probabilities of being equal to 1, which are the original and known marginal probabilities divided by *ρ*. Here is a short R code illustrating the approach, written in the taxi home last night:

```r
#generate vectors
N=1e5
zprobs=c(.1,.9) #iid example
smpl=matrix(sample(0:1,6*N,rep=TRUE,prob=zprobs),ncol=6)
pty=apply(smpl,1,sum)
smpl=smpl[pty>0,]              #censor the all-zero rows
ps=apply(smpl,2,mean)          #marginal frequencies after censoring
cor=mean(ps/rep(zprobs[2],6))  #estimates 1/rho
#estimated original size
length(smpl[,1])*cor
```
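As a sanity check, the same estimator in Python, with a hypothetical smaller success probability (0.2 instead of 0.9) so that the censoring removes a noticeable fraction of the rows:

```python
import numpy as np

rng = np.random.default_rng(1)
N, p1 = 100_000, 0.2   # hypothetical: smaller P(component = 1) than above

smpl = (rng.random((N, 6)) < p1).astype(int)
smpl = smpl[smpl.sum(axis=1) > 0]   # censor the all-zero rows

ps = smpl.mean(axis=0)    # censored marginals, each close to p1 / rho
cor = np.mean(ps / p1)    # averages to an estimate of 1 / rho
N_hat = len(smpl) * cor   # recovered original sample size
```

Here rho = 1 − 0.8⁶ ≈ 0.738, so about a quarter of the rows are censored, and N_hat still recovers the original N to within a fraction of a percent.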

A broader question is how many values (and which values) of the sample can be removed before this recovery becomes impossible (with the same amount of information).

Filed under: Books, Kids, R Tagged: conditional probability, cross validated, mathematical puzzle, R

Filed under: Kids, pictures Tagged: fundamentals

Filed under: pictures, Travel, Wines Tagged: Cabernet, Columbia Valley, Petit Verdot, Seattle, Walla Walla, Washington State

*“Within this unified context, it is possible to interpret that all the MIS algorithms draw samples from a equal-weighted mixture distribution obtained from the set of available proposal pdfs.”*

**I**n a very special (important?!) week for importance sampling, Elvira et al. arXived a paper about generalized multiple importance sampling. The setting is the same as in earlier papers by Veach and Guibas (1995) or Owen and Zhou (2000) [and in our AMIS paper], namely a collection of importance functions and of simulations from those functions. However, there is no adaptivity in the construction of the importance functions and no Markov (MCMC) dependence in the generation of the simulations.

“One of the goals of this paper is to provide the practitioner with solid theoretical results about the superiority of some specific MIS schemes.”

A first part deals with the fact that a random point taken from the conjunction of those samples is distributed from the equally weighted mixture, a fact I had much appreciated when reading Owen and Zhou (2000). From there, the authors discuss the various choices of importance weighting, meaning the different degrees of Rao-Blackwellisation that can be applied to the sample. As we discovered in our population Monte Carlo research [which is well referenced within this paper], conditioning too much leads to useless adaptivity. Again a sort of epiphany for me, in that a whole family of importance functions could be used for the same target expectation and the very same simulated value: it all depends on the degree of conditioning employed for the construction of the importance function. To get around the annoying fact that self-normalised estimators are never unbiased, the authors borrow Liu’s (2000) notion of proper importance sampling estimators, where the ratio of the expectations returns the right quantity. (Which amounts to recovering the correct normalising constant(s), I believe.) They then introduce five (5!) different possible importance weights that all produce proper estimators. However, those weights correspond to different sampling schemes, so they do not apply to the same sample. In other words, they are not recycling weights as in AMIS, and they do not cover the adaptive cases where the weights and parameters of the different proposals change along iterations. Unsurprisingly, the smallest variance estimator is the one based on sampling without replacement and an importance weight made of the entire mixture. But this result does not apply to the self-normalised version, whose variance remains intractable.
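A minimal Python sketch of the contrast between the standard MIS weights (each sample weighted by its own proposal) and the full-mixture weights, on a toy target and two hypothetical Gaussian proposals; both estimators are proper in the above sense:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000   # draws per proposal (my own choice)

def logf(x):         # target N(0,1), normalising constant known
    return -0.5 * x**2 - 0.5 * np.log(2 * np.pi)

def logq(x, mu):     # unit-variance Gaussian proposal centred at mu
    return -0.5 * (x - mu)**2 - 0.5 * np.log(2 * np.pi)

x1 = rng.normal(-1, 1, n)   # draws from q1 = N(-1,1)
x2 = rng.normal(+1, 1, n)   # draws from q2 = N(+1,1)
x = np.concatenate([x1, x2])

# standard MIS: each sample weighted by its own proposal only
w_std = np.concatenate([np.exp(logf(x1) - logq(x1, -1)),
                        np.exp(logf(x2) - logq(x2, +1))])

# full-mixture weighting: every sample weighted by the equal-weight mixture
logmix = np.logaddexp(logq(x, -1), logq(x, +1)) - np.log(2)
w_mix = np.exp(logf(x) - logmix)

# both estimate E_f[x^2] = 1, but the mixture weights are bounded
est_std = np.mean(w_std * x**2)
est_mix = np.mean(w_mix * x**2)
```

The mixture weight is bounded by construction (the mixture covers the target), which is the intuition behind the variance ordering reported in the paper.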

I find this survey of existing and non-existing multiple importance methods quite relevant and a must-read for my students (and beyond!). My reservations (for reservations there must be!) are that the study stops short of pushing the optimisation further. Indeed, the available importance functions are not equivalent with respect to the target, and hence weighting them equally is sub-efficient. The adaptive part of the paper touches upon this issue but does not conclude.

Filed under: Books, Statistics, University life Tagged: adaptive mixture importance sampling, AMIS, importance sampling, MCMC, Monte Carlo Statistical Methods, multiple importance methods, multiple mixtures, population Monte Carlo, simulation