**F**ollowing the previous post, I went and had a (long) look at Puolamäki and Kaski’s paper. I must acknowledge that, despite several passes through the paper, I still have trouble with the approach… From what I understand, the authors use a Bernoulli mixture pseudo-model to reallocate the observations to components. That is, given an MCMC output with simulated allocation variables (a.k.a. hidden or latent variables), they create a (*T*×*K*)×*n* matrix of binary component indicators, e.g., for a three-component mixture,

0 1 0 0 1 0…

1 0 0 0 0 0…

0 0 1 1 0 1…

0 1 0 0 1 1…

and estimate a probability to be in component *j* for each of the *n* observations, according to the (pseudo-)likelihood
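For concreteness, a generic Bernoulli mixture likelihood over the rows of such a matrix can be written as below. The notation is an assumption of mine and not necessarily the paper's: $z_{ri}$ is the binary entry in row $r$ and column $i$, $R = T \times K$ is the number of rows, $\pi_j$ are mixture weights, and $\beta_{ij}$ is the probability that observation $i$ is "in" component $j$. The exact expression in the paper may differ, for instance in how the weights are handled.

$$
\prod_{r=1}^{R} \; \sum_{j=1}^{K} \pi_j \prod_{i=1}^{n} \beta_{ij}^{\,z_{ri}} \, (1-\beta_{ij})^{\,1-z_{ri}}
$$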

It took me a few days, between morning runs and those wee hours when I cannot get back to sleep (!), to make some sense of this Bernoulli modelling. The allocation vectors are used *together* to estimate *jointly* the probabilities of being “in” component *j*. However, the data (which is the outcome of an MCMC simulation and *de facto* does not originate from that Bernoulli mixture) does not seem appropriate for this model, both because it is produced by an MCMC simulation and because it is made of blocks of highly correlated rows [the rows within a block sum up to one in each column]. The Bernoulli likelihood above also defines a new model, with many more parameters than in the original mixture model. And I fail to see why perfect, partial or nonexistent label switching [in the MCMC sequence] is not going to impact the estimation of the Bernoulli mixture. Nor do I see why an argument based on a fixed parameter value (Theorem 3) extends to an MCMC outcome where the parameters themselves are subject to some degree of label switching. Bemused, I remain…
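To make the structure of the stacked indicator matrix concrete, here is a minimal sketch of how it could be built from MCMC allocations, along with a check of the block constraint mentioned above. The function name `stack_indicators` and the toy allocations are hypothetical: the first iteration is chosen to reproduce the first three rows displayed earlier, while the second iteration is made up apart from matching the fourth displayed row. This is not the authors' code.

```python
import numpy as np

def stack_indicators(alloc, K):
    """Stack one-hot component indicators into a (T*K) x n binary matrix.

    alloc : (T, n) integer array, alloc[t, i] in {0, ..., K-1} is the
            component of observation i at MCMC iteration t (assumed layout).
    """
    alloc = np.asarray(alloc)
    T, n = alloc.shape
    # one_hot[t, k, i] = 1 if observation i sits in component k at iteration t
    one_hot = (alloc[:, None, :] == np.arange(K)[None, :, None]).astype(int)
    # Row t*K + k of the result is the indicator vector of component k at iteration t
    return one_hot.reshape(T * K, n)

# Toy allocations for n = 6 observations, K = 3 components, T = 2 iterations
alloc = np.array([
    [1, 0, 2, 2, 0, 2],   # iteration 1: matches the first block shown in the post
    [1, 0, 2, 2, 0, 0],   # iteration 2: hypothetical, only its first row is shown in the post
])
Z = stack_indicators(alloc, K=3)
print(Z[:4])

# Within each block of K rows, every column sums to one, so the rows of a
# block are deterministically linked rather than independent draws.
block_sums = Z.reshape(alloc.shape[0], 3, alloc.shape[1]).sum(axis=1)
print(block_sums)
```

The last check is the point of the sketch: once *K* − 1 rows of a block are known, the remaining row is fixed, which is one way to see why treating all rows as independent Bernoulli vectors is questionable.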