## Archive for EM algorithm

## EM gets the Nobel (of statistics)

Posted in Statistics with tags EM algorithm, Harvard University, International Prize in Statistics, longitudinal studies, Nan Laird, random effects on March 23, 2021 by xi'an## folded Normals

Posted in Books, Kids, pictures, R, Running, Statistics with tags Annals of Applied Statistics, EM algorithm, folded normal, La Butte aux Cailles, letter to the editor, maximum likelihood estimation, nlm, outdoor swimming, Paris, R, Rfast, swimming pool, wikipedia on February 25, 2021 by xi'an**W**hile having breakfast (after an early morn swim at the vintage La Butte aux Cailles pool, which let me in free!), I noticed a letter to the Editor in the Annals of Applied Statistics, which I was unaware existed. (The concept, not this specific letter!) The point of the letter was to indicate that finding the MLE for the mean and variance of a folded normal distribution was feasible without resorting to the EM algorithm. Since the folded normal distribution is a special case of mixture (with fixed weights), using EM is indeed quite natural, but the author, Iain MacDonald, remarked that an optimiser such as R nlm() could be called instead. The few lines of relevant R code were even included. While this is a correct if minor remark, I am a wee bit surprised at seeing it included in the journal, the more because the authors of the original paper using the EM approach were given the opportunity to respond, noticing EM is much faster than nlm in the cases they tested, and Iain MacDonald had a further rejoinder! The more because the Wikipedia page mentioned the use of optimisers much earlier (and pointed out at the R package Rfast as producing MLEs for the distribution).

## hard birthday problem

Posted in Books, Kids, R, Statistics with tags birthday problem, cross validated, EM algorithm, mathematical puzzle, Poisson approximation, R.A. Fisher, SAS, strong birthday problem, William Feller, Wordpress on February 4, 2021 by xi'anFrom an X validated question, found that WordPress now allows for direct link to pdf documents, like the above paper by my old friend Anirban Das Gupta! The question is about estimating a number M of individuals with N distinct birth dates over a year of T days. After looking around I could not find a simpler representation of the probability for N=r other than (1) in my answer,

borrowed from a paper by Fisher et al. (Another Fisher!) Checking Feller leads to the probability (p.102)

which fits rather nicely simulation frequencies, as shown using

apply(!apply(matrix(sample(1:Nb,T*M,rep=TRUE),T,M),1,duplicated),2,sum)

Further, Feller (1970, pp.103-104) justifies an asymptotic Poisson approximation with parameter$

from which an estimate of $M$ can be derived. With the birthday problem as illustration (pp.105-106)!

It may be that a completion from N to (R¹,R²,…) where the components are the number of days with one birthdate, two birthdates, &tc. could help design an EM algorithm that would remove the summation in (1) but I did not spend more time on the problem (than finding a SAS approximation to the probability!).

## a neat EM resolution

Posted in Books, Kids, Statistics, University life with tags cross validated, EM algorithm, final exam, maximum likelihood estimation, missing data, MLE, normal model on February 3, 2021 by xi'anRead (and answered) this question on X validation about finding the maximum likelihood estimator of a 2×2 Gaussian covariance matrix when some observations are partly missing. The neat thing is that, in this case, the maximisation step is identical to the maximum likelihood estimation of the 2×2 Gaussian covariance matrix by redefining the empirical covariance matrix into Z and maximising

in *Σ*. Nothing involved but fun to explain, nonetheless. (In my final exam this year, no student even approached the EM questions!)

## artificial EM

Posted in Books, Kids, R, Statistics, University life with tags cross validated, EM algorithm, fixed point, joint distribution, maximum likelihood estimation, something is wrong on the Internet, wikipedia on October 28, 2020 by xi'an**W**hen addressing an X validated question on the use of the EM algorithm when estimating a Normal mean, my first comment was that it was inappropriate since there is no missing data structure to anchor by (right preposition?). However I then reflected upon the infinite number of ways to demarginalise the normal density into a joint density

∫ f(x,z;μ)dz = φ(x–μ)

from the (slice sampler) call to an indicator function for f(x,z;μ) to a joint Normal distribution with an arbitrary correlation. While the joint Normal representation produces a sequence converging to the MLE, the slice representation utterly fails as the indicator functions make any starting value of μ a fixed point for EM.

Incidentally, when quoting from Wikipedia on the purpose of the EM algorithm, the following passage

Finding a maximum likelihood solution typically requires taking the derivatives of the likelihood function with respect to all the unknown values, the parameters and the latent variables, and simultaneously solving the resulting equations.

struck me as confusing and possibly wrong since it seems to suggest to seek a maximum in *both* the parameter and the latent variables. Which does not produce the same value as the observed likelihood maximisation.