## Archive for typos

## a typo that went under the radar

Posted in Books, R, Statistics, University life with tags Bayesian Core, Bayesian Essentials with R, Bayesian model choice, cross validated, Jean-Michel Marin, model posterior probabilities, R, typos on January 25, 2017 by xi'an

**A** chance occurrence on X validated: a question about an incomprehensible formula for Bayesian model choice which, most unfortunately, appeared in *Bayesian Essentials with R*! Eeech! It looks like one line in our LaTeX file got erased and the likelihood part in the denominator vanished altogether. Apologies to all readers confused by this nonsensical formula!

## Example 7.3: what a mess!

Posted in Books, Kids, R, Statistics, University life with tags beta distribution, cross validated, George Casella, Gibbs sampling, Introducing Monte Carlo Methods with R, Metropolis-Hastings algorithm, typos on November 13, 2016 by xi'an

**A** rather obscure question on Metropolis-Hastings algorithms on X Validated ended up being about our first illustration in *Introducing Monte Carlo Methods with R*. And exposing some inconsistencies in the following example… Example 7.2 is based on a [toy] joint Beta × Binomial target, which leads to a basic Gibbs sampler. We thought this was straightforward, but it may confuse readers who think of using Gibbs sampling for posterior simulation as, in this case, there is neither observation nor posterior, but simply a (joint) target in (x,θ).
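For readers puzzled by what a Gibbs sampler targets when there is no posterior, the two-stage sampler for such a joint Beta × Binomial target can be sketched as follows (a minimal illustration in Python rather than the book's R; the values n=15, a=3, b=7 are arbitrary choices, not the book's):

```python
import random

# Joint target: f(x, theta) ∝ C(n,x) theta^(x+a-1) (1-theta)^(n-x+b-1),
# whose full conditionals are
#   x | theta ~ Binomial(n, theta)   and   theta | x ~ Beta(a+x, b+n-x).
n, a, b = 15, 3, 7
random.seed(42)

def gibbs(n_iter=20_000):
    theta, samples = 0.5, []
    for _ in range(n_iter):
        # x | theta: a Binomial(n, theta) draw as a sum of Bernoulli trials
        x = sum(random.random() < theta for _ in range(n))
        # theta | x: conjugate Beta update
        theta = random.betavariate(a + x, b + n - x)
        samples.append((x, theta))
    return samples

samples = gibbs()
# Marginally theta ~ Beta(a, b), so its mean should be near a/(a+b) = 0.3
mean_theta = sum(t for _, t in samples) / len(samples)
print(round(mean_theta, 2))
```

There is no data anywhere in this loop: the chain simply explores the joint distribution of (x, θ), which is the point of the example.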

And then it indeed came out that we had written Example 7.3 incorrectly on the [toy] Normal posterior, at times using a Normal mean prior with a [prior] variance scaled by the sampling variance and at times one with a [prior] variance unscaled by the sampling variance. I am rather amazed that this did not show up earlier, although typos had already been listed about that example.
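For the record, the two parameterisations lead to different posteriors; in standard conjugate-normal notation (a reconstruction from textbook formulas, not the book's exact notation), for a sample x₁,…,xₙ from N(θ,σ²) with σ² known:

```latex
% Unscaled prior: \theta \sim \mathcal{N}(\mu, \tau^2)
\theta \mid x_1,\dots,x_n \sim
\mathcal{N}\!\left(\frac{\sigma^2\mu + n\tau^2\bar{x}}{\sigma^2 + n\tau^2},\;
\frac{\sigma^2\tau^2}{\sigma^2 + n\tau^2}\right)
% Scaled prior: \theta \sim \mathcal{N}(\mu, \sigma^2\tau^2),
% i.e. \tau^2 replaced by \sigma^2\tau^2 above:
\theta \mid x_1,\dots,x_n \sim
\mathcal{N}\!\left(\frac{\mu + n\tau^2\bar{x}}{1 + n\tau^2},\;
\frac{\sigma^2\tau^2}{1 + n\tau^2}\right)
```

Mixing the two conventions changes both the posterior mean and the posterior variance, which is presumably how the inconsistency slipped in.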

## from down-under, Lake Menteith upside-down

Posted in Books, R, Statistics with tags Bayesian Core, image processing, Lake of Menteith, Loch Lomond, typos on January 23, 2013 by xi'an

**T**he dataset used in *Bayesian Core* for the chapter on image processing is a Landsat picture of Lake of Menteith in Scotland (close to Loch Lomond). (Yes, Lake of Menteith, not Loch Menteith!) Here is the image produced in the book. I just got an email from Matt Moores at QUT that the image is both rotated and flipped:

The image of Lake Menteith in figure 8.6 of *Bayesian Core* is upside-down and back-to-front, so to speak. Also, I recently read a paper by Lionel Cucala & J-M Marin that has the same error.

This is due to the difference between matrix indices and image coordinates: matrices in R are indexed by [row,column] but image coordinates are [x,y]. Also, y=1 is the first row of the matrix, but the bottom row of pixels in an image.
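The row-reversal plus transpose can be seen on a tiny example (a sketch in Python rather than R; the matrix values are arbitrary):

```python
# A 2x3 "image" stored as an R-style [row, column] matrix:
m = [[1, 2, 3],
     [4, 5, 6]]

# Reversing the rows puts row 1 at the bottom (image coordinates have
# y = 1 as the bottom row of pixels); zip(*...) then transposes
# [row, column] into [x, y] order, mirroring the corrected R call
# t(as.matrix(lm3)[100:1, ]).
corrected = [list(col) for col in zip(*m[::-1])]
print(corrected)  # → [[4, 1], [5, 2], [6, 3]]
```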

Only a one-line change to the R code is required to display the image in the correct orientation:

```r
image(1:100, 1:100, t(as.matrix(lm3)[100:1, ]),
      col = gray(256:1/256), xlab = "", ylab = "")
```

As can be checked on Google Maps, the picture is indeed rotated by a -90° angle and the flip-plus-transpose correction does the job!

## yet more questions about Monte Carlo Statistical Methods

Posted in Books, Statistics, University life with tags Brigham Young University, Cauchy-Schwarz inequality, finite variance, importance sampling, Monte Carlo Statistical Methods, Provo, simulation, textbook, typos, Utah, variance reduction on December 8, 2011 by xi'an

**A**s a coincidence, here is the third email this week about typos in *Monte Carlo Statistical Methods*, from Peng Yu this time. (Which suits me well in terms of posts as I am currently travelling to Provo, Utah!)

I’m reading the section on importance sampling. But there are a few cases in your book MCSM2 that are not clear to me.

On page 96: “Theorem 3.12 suggests looking for distributions g for which |h|f/g is almost constant with finite variance.”

What is the precise meaning of “almost constant”? If |h|f/g is almost constant, how come its variance is not finite?

“Almost constant” is not a well-defined property, I am afraid. By this sentence on page 96 we meant using densities g that make *|h|f/g* vary as little as possible while remaining manageable. Hence the insistence on the finite variance. Of course, the closer *|h|f/g* is to a constant function, the more likely the variance is to be finite.

“It is important to note that although the finite variance constraint is not necessary for the convergence of (3.8) and of (3.11), importance sampling performs quite poorly when (3.12) ….”

It is not obvious to me why importance sampling performs poorly when (3.12) holds. I might have overlooked some very simple facts. Would you please remind me why it is the case? From the previous discussion in the same section, it seems that h(x) is missing in (3.12). I think that (3.12) should be (please compare with the first equation in section 3.3.2)

The preference for a finite variance of *f/g*, and against (3.12), comes from wanting the importance function *g* to work well for most integrable functions *h*. Hence the requirement that the importance weight *f/g* itself behaves well: it guarantees some robustness across the *h*‘s and also avoids checking the finite-variance condition (as in your displayed equation) for every function *h* that is square-integrable against *g*, by virtue of the Cauchy-Schwarz inequality.
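A well-behaved importance weight is easiest to appreciate on the classic tail-probability illustration (a sketch in Python; estimating P(Z > 4.5) for a standard normal with a shifted exponential proposal, in the spirit of the book's importance sampling chapter):

```python
import math
import random

random.seed(1)

def normal_pdf(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Target quantity: P(Z > 4.5) for Z ~ N(0,1), so far in the tail that
# naive Monte Carlo almost never produces a single hit.
# Proposal g: an exponential shifted to start at 4.5,
# g(x) = exp(-(x - 4.5)) for x > 4.5, simulated by inversion.
N = 100_000
total = 0.0
for _ in range(N):
    x = 4.5 + random.expovariate(1.0)         # draw from g
    w = normal_pdf(x) / math.exp(-(x - 4.5))  # importance weight f/g
    total += w
estimate = total / N
print(estimate)  # close to the true value, about 3.4e-6
```

Here the weight f/g is bounded on the support of g, so the estimator has finite variance for any bounded h, which is exactly the kind of robustness the reply describes.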

## confusing errata in Monte Carlo Statistical Methods

Posted in Books, Statistics, University life with tags errata, Monte Carlo Statistical Methods, simulation, slice sampling, typos, uniform distribution on December 7, 2011 by xi'an

**F**ollowing the earlier errata on *Monte Carlo Statistical Methods*, I received this email from Nick Foti:

I have a quick question about example 8.2 in *Monte Carlo Statistical Methods*, which derives a slice sampler for a truncated N(-3,1) distribution (note, the book says it is a N(3,1) distribution, but the code and derivation are for a N(-3,1)). My question is that the slice set A^{(t+1)} is described as

which makes sense if u ~ U(0,1) as it corresponds to the previously described algorithm. However, in the errata for the book it says that u should actually be u^{(t+1)}, which according to the algorithm should be distributed as U(0,f_{1}(x)). So unless something clever is being done with ratios of the f_{1}‘s, it seems like the u^{(t+1)} should be U(0,1) in this case, right?

**T**here is indeed a typo in Example 8.4: the mean 3 should be -3… As for the errata, it addresses the missing time indices. Nick is right in assuming that those uniforms are indeed on *(0,1)*, rather than on *(0,f_{1}(x))* as in Algorithm A.31. Apologies to all those confused by the errata!
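For the record, the slice sampler in question alternates a uniform auxiliary variable with a uniform draw on the resulting slice; here is a minimal sketch in Python for a N(-3,1) density truncated to [0,∞) (the exact truncation set of the book's example may differ across printings, but the mechanics are identical):

```python
import math
import random

random.seed(7)

def f1(x):
    # Unnormalised N(-3, 1) density, restricted to x >= 0
    return math.exp(-(x + 3) ** 2 / 2)

x = 0.5
samples = []
for _ in range(50_000):
    # Auxiliary variable: u ~ U(0, f1(x)), as in Algorithm A.31
    u = random.uniform(0, f1(x))
    # Slice {x >= 0 : f1(x) >= u} = [0, sqrt(-2 log u) - 3]
    upper = math.sqrt(-2 * math.log(u)) - 3
    x = random.uniform(0, upper)
    samples.append(x)

mean_x = sum(samples) / len(samples)
print(round(mean_x, 2))  # the truncated-normal mean is about 0.28
```

Writing the auxiliary draw as u ~ U(0, f1(x)), as above, or as ω ~ U(0,1) with the slice {x : f1(x) ≥ ω f1(x^{(t)})} gives the very same algorithm, which is the source of the confusion in the errata.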

## new typos in Monte Carlo Statistical Methods

Posted in Books, Statistics, University life with tags exponential distribution, gamma distribution, importance sampling, Metropolis-Hastings algorithms, Monte Carlo Statistical Methods, scale, simulation, truncated normal, typos on December 7, 2011 by xi'an

**T**hanks to Jay Bartroff for pointing out those typos after teaching from *Monte Carlo Statistical Methods*:

- On page 52, the gamma *Ga(α, β)* distribution uses *β* as a rate parameter while in other places it is a scale parameter, see, e.g., eqn (2.2) *[correct, I must say the parameterisation of the gamma distribution is a pain and, while we tried to homogenise the notation with the second parameter being the rate, there are places like this where either the rate convention (as in the exponential distribution) or the scale convention (as in the generation) is the natural one…]*

- Still on page 52, in Example 2.20, truncated normals are said to be discussed after Example 1.5, but they’re not. *[There is a mention made of constrained parameters right after but this is rather cryptic!]*

- On page 53, the ratio *f/g_{α}* following the second displayed eqn is missing some terms *[or, rather, the equality sign should be a proportional sign]*

- Still on page 53, in eqn (2.11), the whole expression, rather than the square root, should be divided by 2 *[yes, surprising typo given that it was derived correctly in the original paper!]*

- On page 92, the exact constraint is that supp(g) actually needs only contain the intersection of supp(f) and supp(h), such as when approximating tail probabilities *[correct if the importance sampling method is only used for a single function h, else the condition stands as is]*

- On page 94, f_{Y} does not need that integral in the denominator *[correct, we already corrected for the truncation by subtracting 4.5 in the exponential]*

- On page 114, Problem 3.22, ω_{i} is missing a factor of 1/n *[correct]*

- On page 218, in Example 6.24, P_{00}=0 *[indeed, our remark that P_{xx}>0 should start with x=1. Note that this does not change the aperiodicity, though]*

- On page 282, the *log α* after the 2nd displayed equation should be *e^{α}* *[correct, this was pointed out in a previous list of typos, but clearly not corrected in the latest printing!]*

- On page 282, in the 5th displayed equation there are missing factors *π(α’|b)/π(α_{0}|b)* in the rejection probability *[actually, no, because, those terms being both proposals and priors, they cancel in the ratio. We could add a sentence to this effect to explain why, though.]*

- On page 634, the reference page for the exponential distribution is mistakenly given as 99 *[wow, very thorough reading! There is an exponential distribution involved on page 99 but I agree this is not the relevant page…]*
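The rate-versus-scale confusion in the first item above is worth a concrete check (a sketch in Python; its standard library `random.gammavariate(alpha, beta)` follows the scale convention):

```python
import random

random.seed(3)

# Ga(α, β) with β a *rate* has mean α/β; with β a *scale* it has mean αβ.
# Python's random.gammavariate(alpha, beta) treats beta as a scale, so
# simulating from a rate-parameterised Ga(α, β) requires passing 1/β.
alpha, rate = 4.0, 2.0
n = 200_000
draws = [random.gammavariate(alpha, 1.0 / rate) for _ in range(n)]
mean = sum(draws) / n
print(round(mean, 1))  # near alpha / rate = 2.0
```

Forgetting the reciprocal silently multiplies every draw by β², which is precisely why mixing the two conventions in a book (or in code) is such a pain.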

## more typos in Monte Carlo statistical methods

Posted in Books, Statistics, University life with tags capture-recapture, EM algorithm, frequentist inference, integer set, Jensen's inequality, missing data, Monte Carlo Statistical Methods, optimisation, typos, UNC on October 28, 2011 by xi'an

**J**an Hanning kindly sent me this email about several difficulties with Chapters 3, *Monte Carlo Integration*, and 5, *Monte Carlo Optimization*, when teaching out of our book *Monte Carlo Statistical Methods* *[my replies in italics between square brackets, apologies for the late reply and posting, as well as for the confusion thus created. Of course, the additional typos will soon be included in the typo lists on my book webpage.]*:

- I seem to be unable to reproduce *Table 3.3* on page 88 – especially the chi-square column does not look quite right. *[No, they definitely are not right: the true χ² quantiles should be 2.70, 3.84, and 6.63, at the levels 0.1, 0.05, and 0.01, respectively. I actually fail to understand how we got this table *that* wrong…]*

- The second question I have is the choice of the U(0,1) in *Example 3.6*. It feels to me that a choice of Beta(23.5,18.5) for *p_{1}* and Beta(36.5,5.5) for *p_{2}* might give a better representation based on the data we have. Any comments? *[I am plainly uncertain about this… Yours is the choice based on the posterior Beta coefficient distributions associated with Jeffreys prior, hence making the best use of the data. I wonder whether or not we should remove this example altogether… It is certainly “better” than the uniform. However, in my opinion, there is no proper choice for the distribution of the *p_{i}*‘s because we are mixing there a likelihood-ratio solution with a Bayesian perspective on the predictive distribution of the likelihood-ratio. If anything, this exposes the shortcomings of a classical approach, but it is likely to confuse the students! Anyway, this is a very interesting problem.]*

- My students discovered that *Problem 5.19* has the following typos, copying from their e-mail: “x_x” should be “x_i” *[sure!]*. There are a few “( )”s missing here and there *[yes!]*. Most importantly, the likelihood/density seems incorrect. The normalizing constant should be the reciprocal of the one showed in the book *[oh dear, indeed, the constant in the exponential density did not get to the denominator…]*. As a result, all the formulas would differ except the ones in part (a). *[they clearly need to be rewritten, sorry about this mess!]*

- I am unsure about the *if and only if* part of **Theorem 5.15** *[namely that the likelihood sequence is stationary *if and only if* the Q function in the E step has reached a stationary point]*. It appears to me that a condition for the “if part” is missing *[the “only if” part is a direct consequence of Jensen’s inequality]*. Indeed Theorem 1 of Dempster et al 1977 has an extra condition *[note that the original proof for convergence of EM has a flaw, as discussed here]*. Am I missing something obvious? *[maybe: it seems to me that, once Q reaches a fixed point, the likelihood L does not change… It is thus tautological, not a proof of convergence! But the theorem says a wee more, so this needs investigating. As Jan remarked, there is no symmetry in the Q function…]*

- Should there be a (n-m) in the last term of formula *(5.17)*? *[yes, indeed!, multiply the last term by (n-m)]*

- Finally, I am a bit confused about the likelihood in *Example 5.22* *[which is a capture-recapture model]*. Assume that H_{ij}=k *[meaning the animal i is in state k at time j]*. Do you assume that you observed X_{ijr} *[which is the capture indicator for animal i at time j in zone k: it is equal to 1 for at most one k]* as a Binomial B(n,p_{r}) even for r≠k? *[no, we observe all X_{ijr}‘s with r≠k equal to zero]* The nature of the problem seems to suggest that the answer is no *[for other indices, X_{ijr} is always zero, indeed]*. If that is the case I do not see where the power on top of (1-p_{k}) in the middle of the page 185 comes from *[when the capture indices are zero, they do not contribute to the sum, which explains for this condensed formula. Therefore, I do not think there is anything wrong with this over-parameterised representation of the missing variables.]*

- In Section 5.3.4, there seems to be a missing minus sign in the approximation formula for the variance *[indeed, shame on us for missing the minus in the observed information matrix!]*

- I could not find the definition of ℕ in Theorem 6.15. Is it all natural numbers or all integers? Maybe it would help to include it in Appendix B. *[Surprising! This is the set of all positive integers, I thought this was a standard math notation…]*

- In Definition 6.27, you probably want to say covering of *A* and not *X*. *[Yes, we were already thinking of the next theorem, most likely!]*

- In Proposition 6.33 – all x in A instead of all x in X. *[Yes, again! As shown in the proof. Even though it also holds for all x in X]*

Thanks a ton to Jan and to his UNC students (and apologies for leading them astray with those typos!!!)