Thank you for your comment. This is my answer to your comment of May 7, 2021 at 8:06 am.

From your comment of May 7, 2021 at 8:06 am, I cannot determine which statistical model you mean, so I cannot say anything about it directly. However, we have at least the following theoretical and numerical results.

If a statistical model is given by p(x)=aN(x|0)+(1-a)N(x|b), where the dimension of b is 2, then its real log canonical threshold and a corresponding numerical experiment are given in the book, Example 67.

In Takumi Watanabe's multinomial-mixture paper (1), for p(x)=aM(x|b)+(1-a)M(x|c), the theoretical result for the real log canonical threshold is derived using resolution of singularities, and it coincides with the numerical experiments.

For a general normal mixture with a prior that is positive and finite (Dirichlet index = 1), the real log canonical threshold was studied in Yamazaki's pioneering work (2).

In the book, Example 52 and Fig. 7.2 on p. 222 show that the posterior of a normal mixture also has a phase transition according to the locations of the true distributions.

Model selection phenomena in a normal mixture are shown in Fig. 2.8 on p. 58 (where DIC fails). Numerical experiments on model selection for different Dirichlet indices are shown in Tables 1 and 2 of (3).

(1) Takumi Watanabe, Asymptotic Behavior of Bayesian Generalization Error in Multinomial Mixtures. IEICE Technical report, IBISML2019-18, pp. 1-8, 2020.

(2) Keisuke Yamazaki, et al., Singularities in mixture models and upper bounds of stochastic complexity, Neural Networks, Vol. 16, pp. 1029-1038, 2003.

(3) S. Watanabe, WAIC and WBIC for mixture models. Behaviormetrika vol. 48, pp.5–21, 2021.

Actually, when I discussed this with Judith Rousseau, she pointed out to me that the result in their 2011 paper does not apply to the location mixture.

Thank you for your interest in singular cases.

This is my answer to your comment of May 6, 2021 at 7:15 pm.

Let a sample be generated from one L-dimensional multinomial distribution M(x), and let the statistical model be p(x)=(1-a)M(x-b)+aM(x-c). Then M(x)=p(x) if and only if a ∈ {0, 1} or b=c=0. This is a singular case. If the index of the Dirichlet prior is > (L-1)/2, then the posterior concentrates on a free, (b,c) ~ (0,0). If the index is < (L-1)/2, then the posterior concentrates on a ~ 0 or a ~ 1, with (b,c) free. Both cases are rather stable. If the index = (L-1)/2 (the critical point), then a ~ 0 or a ~ 1 and (b,c) ~ 0, but the posterior is unstable (convergence of MCMC becomes very slow). We expect that a normal mixture has almost the same behavior. The generalization loss depends on the index of the prior. In a numerical experiment, the critical point can be found by cross validation or WAIC; at the critical point, their variances become very large.
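The three regimes above depend only on how the Dirichlet index compares to (L-1)/2. A minimal sketch of that classification (the function name and the regime labels are my own illustrative choices, not notation from the book):

```python
def posterior_regime(index: float, L: int) -> str:
    """Classify the posterior behavior of the two-component
    L-dimensional multinomial mixture by the Dirichlet index,
    following the three cases described in the reply."""
    critical = (L - 1) / 2
    if index > critical:
        # mixture weight a stays free, component locations collapse
        return "a free, (b, c) ~ (0, 0)"
    if index < critical:
        # one component is emptied, locations stay free
        return "a ~ 0 or a ~ 1, (b, c) free"
    # critical point: both collapse, MCMC mixes very slowly
    return "critical: a ~ 0 or 1, (b, c) ~ 0, unstable posterior"

# Example: L = 5, so the critical index is (5-1)/2 = 2
print(posterior_regime(1.0, 5))
print(posterior_regime(2.0, 5))
print(posterior_regime(3.0, 5))
```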

Thanks again for taking the trouble to reply to my questions! However, I remain puzzled because Rousseau & Mengersen show that, as n goes to infinity, “quite generally the posterior distribution has a stable and interesting behaviour, since it tends to empty the extra component” when the Dirichlet weight is smaller than d/2.

Thank you for reading again.

The model on p. 282 is the same as in Rousseau & Mengersen (2011); however, on p. 282 we study another phase transition, one caused by increasing the sample size n. Assume that the true distribution is a mixture of two nearly located normal distributions. If n is small, the posterior is almost the same as if the true distribution were a single normal distribution. As n becomes large, the posterior moves to the case where the true distribution is a mixture of two distributions. Hence a phase transition is caused by increasing n, as illustrated in Fig. 9.6. It can be observed through the generalization loss, cross validation, and WAIC.

In this model (a mixture of two free normal distributions), the critical point with respect to the index a of the Dirichlet prior is still unknown. However, it was proved by Takumi Watanabe in (1) that the critical point for the mixture of two L-dimensional multinomial distributions is a=(L-1)/2. Since the parameter dimension of one multinomial distribution is (L-1), this result is formally equal to the consistency condition of Rousseau & Mengersen (2011). Takumi also proved that the real log canonical threshold is (L-1)/2+min(a/2,(L-1)/4), so the asymptotic free energy and the generalization error are also clarified.
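The threshold formula above can be evaluated directly; note that the two branches of the min meet exactly at the critical index a=(L-1)/2. A small sketch (the helper names are mine; the formula λ=(L-1)/2+min(a/2,(L-1)/4) and the λ·log n free-energy term are as stated in the reply and in singular learning theory):

```python
import math

def rlct(L: int, a: float) -> float:
    """Real log canonical threshold of the two-component
    L-dimensional multinomial mixture with Dirichlet index a,
    per Takumi Watanabe's result quoted above."""
    return (L - 1) / 2 + min(a / 2, (L - 1) / 4)

def free_energy_log_term(L: int, a: float, n: int) -> float:
    # The asymptotic free energy behaves like n*S + lambda*log(n);
    # this returns the lambda*log(n) part.
    return rlct(L, a) * math.log(n)

L = 5
# Both branches of the min coincide at the critical index a = (L-1)/2 = 2:
print(rlct(L, 1.9), rlct(L, 2.0), rlct(L, 2.1))
print(free_energy_log_term(L, 2.0, 1000))
```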

(1) Takumi Watanabe, Asymptotic Behavior of Bayesian Generalization Error in Multinomial Mixtures. IEICE Technical report, IBISML2019-18, pp. 1-8, 2020.

Thank you for your question.

(1) In a hypothesis test, we prepare the null (prior_0, model_0) and the alternative (prior_1, model_1). The ratio of the two marginal likelihoods is then the statistic of the most powerful test, and we can determine the rejection region for a given level, under the assumption that the sample is generated from the null.

(2) In Bayesian model comparison, the two pairs (prior_0, model_0) and (prior_1, model_1) are compared by their posterior probabilities, computed from the sample and the prior probabilities of the pairs. This again reduces to the ratio of the two marginal likelihoods.

Both the test and the comparison reduce to the ratio of marginal likelihoods; however, the decision regions they determine are different. An example is given in Example 64.
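A minimal sketch of "both reduce to a marginal likelihood ratio", using two conjugate Beta-Binomial pairs so the marginal likelihoods are exact. The particular priors Beta(50,50) for the null and Beta(1,1) for the alternative, and the helper names, are my own illustrative choices, not from the book:

```python
from math import comb, lgamma, exp, log

def log_beta(x: float, y: float) -> float:
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def log_marginal(k: int, n: int, alpha: float, beta: float) -> float:
    # Beta-Binomial evidence: p(k|n) = C(n,k) B(k+a, n-k+b) / B(a,b)
    return (log(comb(n, k))
            + log_beta(k + alpha, n - k + beta)
            - log_beta(alpha, beta))

k, n = 7, 10
# Null pair: prior concentrated near theta = 0.5; alternative: uniform prior.
log_m0 = log_marginal(k, n, 50.0, 50.0)
log_m1 = log_marginal(k, n, 1.0, 1.0)

bayes_factor = exp(log_m1 - log_m0)
# In the test view, this ratio is compared with a threshold fixed by the
# level under the null; in the comparison view, it is combined with prior
# model probabilities to give posterior probabilities of the two pairs.
post_prob_m1 = bayes_factor / (1.0 + bayes_factor)  # equal prior odds
print(bayes_factor, post_prob_m1)
```

The same ratio appears in both cases; only the decision rule built on top of it differs, which is the point of the reply above.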

Interestingly, you differentiate between hypothesis testing and model choice essentially through not setting prior probabilities on the hypotheses while setting prior probabilities on the models, which makes the Bayes factor adequate only in the second situation, if I am not confused.

Thanks, I was looking at the general location Normal mixture on p. 282. In that case there is a single Normal component, which corresponds to case (2) in the discussion. This would be closer to the case covered by Rousseau & Mengersen (2011), wouldn't it?

Thank you, I can now see the minus (-) in the earlier equations leading to a plus (+) there.

Thank you for reading again.

C_n on p. 88 is derived from (3.15), (3.17), and (3.22).