Archive for EM algorithm

EM degeneracy

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , , , , , , on June 16, 2021 by xi'an

At the MHC 2021 conference today (to which I biked to attend for real!, first time since BayesComp!) I listened to Christophe Biernacki exposing the dangers of EM applied to mixtures in the presence of missing data, namely that the algorithm has a rising probability to reach a degenerate solution, namely a single observation component. Rising in the proportion of missing data. This is not hugely surprising as there is a real (global) mode at this solution. If one observation components are prohibited, they should not be accepted in the EM update. Just as in Bayesian analyses with improper priors, the likelihood should bar single or double  observations components… Which of course makes EM harder to implement. Or not?! MCEM, SEM and Gibbs are obviously straightforward to modify in this case.

Judith Rousseau also gave a fascinating talk on the properties of non-parametric mixtures, from a surprisingly light set of conditions for identifiability to posterior consistency . With an interesting use of several priors simultaneously that is a particular case of the cut models. Namely a correct joint distribution that cannot be a posterior, although this does not impact simulation issues. And a nice trick turning a hidden Markov chain into a fully finite hidden Markov chain as it is sufficient to recover a Bernstein von Mises asymptotic. If inefficient. Sylvain LeCorff presented a pseudo-marginal sequential sampler for smoothing, when the transition densities are replaced by unbiased estimators. With connection with approximate Bayesian computation smoothing. This proves harder than I first imagined because of the backward-sampling operations…

EM gets the Nobel (of statistics)

Posted in Statistics with tags , , , , , on March 23, 2021 by xi'an

folded Normals

Posted in Books, Kids, pictures, R, Running, Statistics with tags , , , , , , , , , , , , on February 25, 2021 by xi'an

While having breakfast (after an early morn swim at the vintage La Butte aux Cailles pool, which let me in free!), I noticed a letter to the Editor in the Annals of Applied Statistics, which I was unaware existed. (The concept, not this specific letter!) The point of the letter was to indicate that finding the MLE for the mean and variance of a folded normal distribution was feasible without resorting to the EM algorithm. Since the folded normal distribution is a special case of mixture (with fixed weights), using EM is indeed quite natural, but the author, Iain MacDonald, remarked that an optimiser such as R nlm() could be called instead. The few lines of relevant R code were even included. While this is a correct if minor remark, I am a wee bit surprised at seeing it included in the journal, the more because the authors of the original paper using the EM approach were given the opportunity to respond, noticing EM is much faster than nlm in the cases they tested, and Iain MacDonald had a further rejoinder! The more because the Wikipedia page mentioned the use of optimisers much earlier (and pointed out at the R package Rfast as producing MLEs for the distribution).

hard birthday problem

Posted in Books, Kids, R, Statistics with tags , , , , , , , , , on February 4, 2021 by xi'an

Click to access birthday.pdf

From an X validated question, found that WordPress now allows for direct link to pdf documents, like the above paper by my old friend Anirban Das Gupta! The question is about estimating a number M of individuals with N distinct birth dates over a year of T days. After looking around I could not find a simpler representation of the probability for N=r other than (1) in my answer,

\frac{T!}{(\bar N-r)!}\frac{m!}{T^m}  \sum_{(r_1,\ldots,r_m);\\\sum_1^m r_i=r\ \&\\\sum_1^m ir_i=m}1\Big/\prod_{j=1}^m r_j! (j!)^{r_j}

borrowed from a paper by Fisher et al. (Another Fisher!) Checking Feller leads to the probability (p.102)

{T \choose r}\sum_{\nu=0}^r (-1)^{\nu}{r\choose\nu}\left(1-\frac{T-r+\nu}T \right)^m

which fits rather nicely simulation frequencies, as shown using

apply(!apply(matrix(sample(1:Nb,T*M,rep=TRUE),T,M),1,duplicated),2,sum)

Further, Feller (1970, pp.103-104) justifies an asymptotic Poisson approximation with parameter$

\lambda(M)=\bar{N}\exp\{-M/\bar N\}

from which an estimate of $M$ can be derived. With the birthday problem as illustration (pp.105-106)!

It may be that a completion from N to (R¹,R²,…) where the components are the number of days with one birthdate, two birthdates, &tc. could help design an EM algorithm that would remove the summation in (1) but I did not spend more time on the problem (than finding a SAS approximation to the probability!).

a neat EM resolution

Posted in Books, Kids, Statistics, University life with tags , , , , , , on February 3, 2021 by xi'an

Read (and answered) this question on X validation about finding the maximum likelihood estimator of a 2×2 Gaussian covariance matrix when some observations are partly missing.  The neat thing is that, in this case, the maximisation step is identical to the maximum likelihood estimation of the 2×2 Gaussian covariance matrix by redefining the empirical covariance matrix into Z and maximising

-n\log|\Sigma|-\text{trace}(Z\Sigma^{-1})

in Σ. Nothing involved but fun to explain, nonetheless. (In my final exam this year, no student even approached the EM questions!)