Lorenzo Rimella, Chris Jewell, and Paul Fearnhead have recently arXived a paper entitled Simulation Based Composite Likelihood, where they consider a composite likelihood approximation for running inference on HMM parameters, in the specific scenario of HMMs with an N-dimensional hidden state whose components take values in a finite space X, where computing the likelihood by the forward algorithm incurs a huge cost of order card(X)^{2N}:
“Inference for high-dimensional hidden Markov models is challenging due to the exponential-in-dimension computational cost of the forward algorithm.”
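To make the quoted bottleneck concrete, here is a minimal exact forward algorithm run on the joint state space of an N-component chain. The model itself (random row-stochastic transitions, Gaussian emissions) is a hypothetical placeholder, not the paper's model; the point is only that with K states per component there are K^N joint states, so one forward step involves a (K^N)×(K^N) transition matrix, i.e. card(X)^{2N} entries.

```python
import numpy as np

# Joint-space forward algorithm for an N-component HMM, K states per
# component: S = K**N joint states, so the transition matrix A holds
# S*S = card(X)^{2N} entries, the cost quoted above. All model choices
# below are hypothetical placeholders.
K, N, T = 2, 4, 10
S = K ** N                          # card(X)^N joint states (16 here)
rng = np.random.default_rng(1)

A = rng.random((S, S))
A /= A.sum(axis=1, keepdims=True)   # row-stochastic S x S transition matrix
pi = np.full(S, 1.0 / S)            # uniform initial distribution
means = rng.normal(size=S)          # per-joint-state emission means
y = rng.normal(size=T)              # fake scalar observations

def emit(yt):
    # (unnormalised) Gaussian emission likelihood for every joint state
    return np.exp(-0.5 * (yt - means) ** 2)

alpha = pi * emit(y[0])
c = alpha.sum(); loglik = np.log(c); alpha /= c
for t in range(1, T):               # each step: one S x S mat-vec product
    alpha = (alpha @ A) * emit(y[t])
    c = alpha.sum(); loglik += np.log(c); alpha /= c
print(S, loglik)
```

Already at N = 4 binary components the matrix has 256 entries; at N = 30 it would have about 10^18, which is what drives the search for composite alternatives.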
The authors make an assumption (2) of total factorisation across dimensions for both the current hidden and the current observed terms, given the previous hidden states, which is a very strong assumption, bordering on a complete separation into independent component-wise HMMs. It helps however in deriving a Monte Carlo approximation to the likelihood of one component of the HMM sequence, the full likelihood being then approximated in a composite (likelihood) manner by the product of these component marginals. The remaining difficulty of computing the marginals of the component-wise observed (pseudo-) Markov chains is attenuated
“by fixing the state of all but one component n of the latent process, [since] we can leverage the factorisation and calculate probabilities related to the time-trajectory of the remaining [latent] state”
but it requires simulation of the hidden chain, at an overall cost of order O(PTN²card(X)²) where P is the number of MCMC simulations, which can be improved by a factor N by removing a feedback step through a further marginal likelihood approximation. Interestingly, this falls into the prediction-correction pattern usual in sequential simulations. All this demonstrates craftsmanship of a high order, even though the issue of using an approximate composite likelihood does not seem to be addressed.
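A toy sketch of the general idea may help, with the caveat that this is NOT the authors' exact algorithm: the other components are simulated from the prior (i.e. without the feedback step mentioned above), and all model choices (logistic transitions that factorise across components, Gaussian emissions, parameter values) are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical factorised HMM: N binary components; each component's
# transition depends on the whole previous state but factorises across
# components; Gaussian emissions per component.
N, T, sigma = 4, 30, 0.5
a, b = 1.0, 0.5                     # hypothetical coupling parameters
NORM = sigma * np.sqrt(2 * np.pi)

def trans_prob(prev):
    # P(x_t^n = 1 | x_{t-1}) for every n, via a logistic link
    return 1.0 / (1.0 + np.exp(-(a * prev.mean() + b * prev - 0.75)))

def simulate(T):
    x = np.zeros((T, N), dtype=int)
    x[0] = rng.integers(0, 2, N)
    for t in range(1, T):
        x[t] = (rng.random(N) < trans_prob(x[t - 1])).astype(int)
    y = x + sigma * rng.normal(size=(T, N))
    return x, y

def emit_lik(y_tn):
    # Gaussian likelihood of one scalar observation under states 0 and 1
    return np.exp(-0.5 * ((y_tn - np.array([0.0, 1.0])) / sigma) ** 2) / NORM

def component_logliks(y, n, P=50):
    # For each of P simulated trajectories of the OTHER components, fix
    # them and run the exact 2-state forward algorithm for component n.
    out = np.empty(P)
    for p in range(P):
        xsim, _ = simulate(T)
        alpha = 0.5 * emit_lik(y[0, n])
        c = alpha.sum(); alpha /= c; ll = np.log(c)
        for t in range(1, T):
            p1 = np.empty(2)
            for s in (0, 1):         # enumerate component n's previous state
                prev = xsim[t - 1].copy(); prev[n] = s
                p1[s] = trans_prob(prev)[n]
            trans = np.column_stack([1 - p1, p1])  # 2x2 transition matrix
            alpha = (alpha @ trans) * emit_lik(y[t, n])
            c = alpha.sum(); alpha /= c; ll += np.log(c)
        out[p] = ll
    return out

def composite_loglik(y, P=50):
    # Composite log-likelihood: sum over components of the log of the
    # Monte Carlo average of the per-trajectory component likelihoods.
    total = 0.0
    for n in range(N):
        ll = component_logliks(y, n, P)
        m = ll.max()
        total += m + np.log(np.exp(ll - m).mean())
    return total

_, y = simulate(T)
print(composite_loglik(y))
```

The inner loop is where fixing all but one component pays off: given the simulated trajectory of the others, component n reduces to an ordinary two-state HMM that the forward algorithm handles exactly, and the component marginals are then multiplied (summed on the log scale) into the composite likelihood.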