## observed vs. complete in EM algorithm

Posted in Statistics on November 17, 2022 by xi'an

While answering a question related to the EM algorithm on X validated, I realised a global (or generic) feature of the (objective) E function, namely that

$E(\theta'|\theta)=\mathbb E_{\theta}[\log\,f_{X,Z}(x^\text{obs},Z;\theta')|X=x^\text{obs}]$

can always be written as

$\log\,f_X(x^\text{obs};\theta')+\mathbb E_{\theta}[\log\,f_{Z|X}(Z|x^\text{obs},\theta')|X=x^\text{obs}]$

and therefore always includes the (log-) observed likelihood, at least in this formal representation. While the proof that EM is monotone in the values of the observed likelihood uses this decomposition as well, in that

$\log\,f_X(x^\text{obs};\theta')=\log\,\mathbb E_{\theta}\left[\frac{f_{X,Z}(x^\text{obs},Z;\theta')}{f_{Z|X}(Z|x^\text{obs},\theta)}\big|X=x^\text{obs}\right]$

I wonder if the appearance of the actual target in the temporary target $E(\theta'|\theta)$ can be exploited any further.
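As a quick numerical sanity check of the decomposition and of the resulting monotonicity, here is a minimal Python sketch on a toy example that is not part of the original question: a two-component Gaussian mixture with known equal weights and unit variances, where only the two means are unknown. The function names, the simulated data and the parameter values are all illustrative assumptions.

```python
import numpy as np

WEIGHTS = np.array([0.5, 0.5])  # assumed known mixture weights (toy setting)

def log_joint(x, theta):
    """log f_{X,Z}(x_i, k; theta) for each observation i and label k, shape (n, 2)."""
    mu = np.asarray(theta, dtype=float)
    log_normal = -0.5 * np.log(2 * np.pi) - 0.5 * (x[:, None] - mu[None, :]) ** 2
    return np.log(WEIGHTS)[None, :] + log_normal

def log_observed(x, theta):
    """Observed log-likelihood log f_X(x; theta), marginalising over the labels."""
    lj = log_joint(x, theta)
    return np.logaddexp(lj[:, 0], lj[:, 1]).sum()

def log_conditional(x, theta):
    """log f_{Z|X}(k | x_i, theta) for each observation i and label k, shape (n, 2)."""
    lj = log_joint(x, theta)
    return lj - np.logaddexp(lj[:, 0], lj[:, 1])[:, None]

def e_function(x, theta_new, theta_old):
    """E(theta'|theta): expected complete log-likelihood under Z | x, theta_old."""
    resp = np.exp(log_conditional(x, theta_old))
    return (resp * log_joint(x, theta_new)).sum()

def expected_log_conditional(x, theta_new, theta_old):
    """Second term of the decomposition: E_theta[log f_{Z|X}(Z | x, theta') | x]."""
    resp = np.exp(log_conditional(x, theta_old))
    return (resp * log_conditional(x, theta_new)).sum()

def em_step(x, theta_old):
    """Maximiser of E(.|theta_old) in this toy model: responsibility-weighted means."""
    resp = np.exp(log_conditional(x, theta_old))
    return (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)

# Simulated data (arbitrary illustrative values)
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
x_obs = rng.normal(loc=np.where(labels == 0, -1.0, 2.0), scale=1.0)

# Check the decomposition at an arbitrary pair (theta, theta')
theta_old = np.array([0.0, 1.0])
theta_new = np.array([-0.5, 1.5])
lhs = e_function(x_obs, theta_new, theta_old)
rhs = log_observed(x_obs, theta_new) + expected_log_conditional(x_obs, theta_new, theta_old)
decomposition_gap = abs(lhs - rhs)

# Track the observed log-likelihood along a few EM iterations
theta = theta_old
loglik_path = [log_observed(x_obs, theta)]
for _ in range(20):
    theta = em_step(x_obs, theta)
    loglik_path.append(log_observed(x_obs, theta))
```

On this simulated sample, `decomposition_gap` should be zero up to floating-point error and `loglik_path` should be non-decreasing, which is what the two identities predict.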

## Statistics slides (4)

Posted in Books, Kids, Statistics, University life on November 10, 2014 by xi'an

Here is the fourth set of slides for my third-year statistics course, trying to build intuition through graphs about the likelihood surface and why on Earth one would want to find its maximum?! I am as yet uncertain whether or not I will reach the point where I can teach more asymptotics, so maybe I will also include asymptotic normality of the MLE under regularity conditions in this chapter…

## 10 Little’s simple ideas

Posted in Books, Statistics, University life on July 17, 2013 by xi'an

“I still feel that too much of academic statistics values complex mathematics over elegant simplicity — it is necessary for a research paper to be complicated in order to be published.” Roderick Little, JASA, p.359

Roderick Little wrote his Fisher lecture, recently published in JASA, around ten simple ideas for statistics. Its title is “In praise of simplicity not mathematistry! Ten simple powerful ideas for the statistical scientist”. While this title is rather antagonistic, blaming mathematical statistics for the rise of mathematistry in the field (a term borrowed from Fisher, who also invented the adjective ‘Bayesian’), the paper focuses on those 10 ideas and very little on why there is (would be) too much mathematics in statistics:

1. Make outcomes univariate
2. Bayes rule, for inference under an assumed model
3. Calibrated Bayes, to keep inference honest
4. Embrace well-designed simulation experiments
5. Distinguish the model/estimand, the principles of estimation, and computational methods
6. Parsimony — seek a good simple model, not the “right” model
7. Model the Inclusion/Assignment and try to make it ignorable
8. Consider dropping parts of the likelihood to reduce the modeling part
9. Potential outcomes and principal stratification for causal inference
10. Statistics is basically a missing data problem

“The mathematics of problems with infinite parameters is interesting, but with finite sample sizes, I would rather have a parametric model. “Mathematistry” may eschew parametric models because the asymptotic theory is too simple, but they often work well in practice.” Roderick Little, JASA, p.365

Both those rules and the illustrations that abound in the paper reflect Little’s research focus and obviously apply to his model in a fairly coherent way. However, while a mostly parametric model user myself, I fear the rejection of non-parametric techniques is far too radical. It is more and more my conviction that we cannot handle the full complexity of a realistic structure in a standard Bayesian manner and that we have to give up on the coherence and completeness goals at some point… Using non-parametrics and/or machine learning on some bits and pieces then makes sense, even though it hurts elegance and simplicity.

“However, fully Bayes inference requires detailed probability modeling, which is often a daunting task. It seems worth sacrificing some Bayesian inferential purity if the task can be simplified.” Roderick Little, JASA, p.366

I will not discuss those ideas in detail, as some of them make complete sense to me (like Bayesian statistics laying its assumptions in the open) and others remain obscure (e.g., causality) or with limited applicability. It is overall a commendable Fisher lecture that focuses on methodology and the practice of statistical science, rather than on theory. However, I do not see the reason why maths should be blamed for this state of the field. Nor why mathematical statistics journals like AoS would carry some responsibility in the lack of further applicability in other fields. Students of statistics do need a strong background in mathematics and I fear we are losing ground in this respect, at least judging by the growing difficulty in finding measure theory courses abroad for our exchange undergraduates from Paris-Dauphine. (I also find the model misspecification aspects mostly missing from this list.)