## a well-hidden E step

**A** recent question on X validated ended up being quite interesting! The model under consideration is made of parallel Markov chains on a finite state space, all with the same Markov transition matrix, **M**, which turns into a hidden Markov model when the only summary available is the number of chains in a given state at a given time. When writing down the EM algorithm, the E step involves the expected number of moves from a given state to a given state at a given time. The conditional distribution of those numbers of chains is a product of multinomials across times and starting states, with no Markov structure since the number of chains starting from a given state is known at each instant. Except that those multinomials are constrained by the number of “arrivals” in each state at the next instant and that this makes the computation of the expectation intractable, as far as I can see.

A solution by Monte Carlo EM means running the moves for each instant under the above constraints, which is thus a sort of multinomial distribution with fixed margins, enjoying a closed-form expression but for the normalising constant. The direct simulation soon gets too costly as the number of states increases and I thus considered a basic Metropolis move, using one margin (row or column) or the other as proposal, with the correction taken on another margin. This is very basic but apparently enough for the purpose of the exercise. If I find time in the coming days, I will try to look at the ABC resolution of this problem, a logical move when starting from non-sufficient statistics!

February 3, 2017 at 10:33 am

[…] Please comment on the article here: R – Xi’an’s Og […]

February 3, 2017 at 10:22 am

Hi Xi’an,

Thanks for the post!

I find it a bit confusing, however, as I am unaware of the Statistical terms involved. Would you be able to refer me to resources that can act as a starting point for somebody who would like to understand the concepts you covered here?

February 3, 2017 at 9:12 pm

Errr…, I cannot really help you if you do not have the Statistics background to read a Statistics blog. This entry is about EM that does not work for a special HMM and about the MCEM I devised instead: I cannot imagine this makes much sense without a Statistical training and suggest you first acquire the expertise to understand the terms used therein. Taking a computational statistics course or reading a computational statistics textbook for instance.