For DPMs, the number of ‘components’ is the number of ‘atoms’ in the mixing distribution, or the number of distinct stochastic processes necessary to explain the data. Is this a sufficient definition?

I’m not sure that they do have a different number of parameters from a model point of view. I’d suspect (having put no previous thought into this) that any mixture model (fixing the family and the parameters of the mixture components) has two parameters: the number of components and the vector of weights. If you split up the weights, you run into the ‘different paths’ problem that Xi’an mentioned.

From the approximation point of view, if the procedure converges algebraically, the ratio of errors would be 1 + o(1), which would make the test impossible (the problem, as you say in the comment above, is rather that order-k mixtures can approximate order-(k+1) mixtures *extremely* well). Things are slightly better if the convergence of the scheme is geometric, but even so the best I’d expect is to be able to test whether the number of components is of the correct order of magnitude.
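A hypothetical numerical sketch of that point (the separation value and grid are made up for illustration): a two-component Normal mixture with nearly coincident means is almost indistinguishable, in sup-norm, from a single moment-matched Normal.

```python
import numpy as np

def npdf(x, mu, sd):
    # Normal density, written out so the example needs only NumPy
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

x = np.linspace(-8.0, 8.0, 4001)
eps = 0.2  # half-separation between the two mixture means (assumed value)

# order-2 mixture with nearly coincident components
f2 = 0.5 * npdf(x, -eps, 1.0) + 0.5 * npdf(x, eps, 1.0)

# order-1 approximation matching the mixture's mean (0) and variance (1 + eps^2)
f1 = npdf(x, 0.0, np.sqrt(1.0 + eps ** 2))

gap = np.max(np.abs(f2 - f1))
print(gap)  # tiny sup-norm gap: the order-1 fit is already excellent
```

The gap shrinks rapidly as the two means approach each other, which is exactly what makes testing k against k+1 so hard.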

Loss (price) or prior, you indeed have to build into the problem a definition of what makes a component. Otherwise, you can add an identical component and move from k to (k+1) with the same fit. I think the problem is deeper than a mere zero-mean test, in that the null hypothesis (k) is reached on the (k+1)-parameter space along many different paths. (I cannot track down the reference at the moment, but there is an old paper on this theme…)
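The duplicated-component point can be checked directly (weights and means below are made up for illustration): splitting one atom's weight over two identical atoms yields literally the same density, so k and k+1 are indistinguishable from the data alone.

```python
import numpy as np

def npdf(x, mu, sd):
    # Normal density, NumPy only
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

x = np.linspace(-6.0, 6.0, 1001)

# a k = 2 mixture: 0.6 N(-1, 1) + 0.4 N(2, 1)
dens_k = 0.6 * npdf(x, -1.0, 1.0) + 0.4 * npdf(x, 2.0, 1.0)

# a "k = 3" mixture: the first weight split across two identical atoms
dens_k1 = 0.3 * npdf(x, -1.0, 1.0) + 0.3 * npdf(x, -1.0, 1.0) \
        + 0.4 * npdf(x, 2.0, 1.0)

diff = np.max(np.abs(dens_k - dens_k1))
print(diff)  # zero up to floating point: the same density, two component counts
```

Any rule for counting components therefore has to say when two atoms count as one, whether through the prior or through the loss.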

At least doing a test to see if we can reject k=1 seems plausible to me.

Since k+1 has more parameters than k, I think it is valid to ask whether it is worthwhile to pay the price… is the better fit we get with k+1 just like what you’d get if the truth had k components and you tried to fit k+1 parameters?
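A minimal sketch of what "paying the price" can look like, under made-up choices (seed, sample size, a BIC-style penalty): simulate from a single Normal (k = 1), fit k = 1 in closed form and k = 2 by a short EM run, and compare the log-likelihood gain against the penalty for the extra parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(0.0, 1.0, n)  # truth is a single Normal, i.e. k = 1

def norm_logpdf(x, mu, sd):
    return -0.5 * np.log(2.0 * np.pi) - np.log(sd) - 0.5 * ((x - mu) / sd) ** 2

# k = 1: the MLE is available in closed form
ll1 = norm_logpdf(x, x.mean(), x.std()).sum()

# k = 2: a short EM run from a deliberately split starting point
w, mu, sd = np.array([0.5, 0.5]), np.array([-0.5, 0.5]), np.array([1.0, 1.0])
for _ in range(200):
    logp = np.log(w) + np.column_stack([norm_logpdf(x, m, s) for m, s in zip(mu, sd)])
    r = np.exp(logp - logp.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)   # E-step: responsibilities
    nk = r.sum(axis=0)
    w = nk / n                          # M-step: weights, means, scales
    mu = (r * x[:, None]).sum(axis=0) / nk
    sd = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

logp = np.log(w) + np.column_stack([norm_logpdf(x, m, s) for m, s in zip(mu, sd)])
ll2 = np.logaddexp.reduce(logp, axis=1).sum()

# the "price": k = 2 adds 3 parameters (one weight, one mean, one sd),
# so a BIC-style penalty of (3/2) log n
price = 1.5 * np.log(n)
print(ll2 - ll1, price)  # the fit gain should fall well short of the price here
```

The extra components always fit at least as well in-sample; the question in the comment is whether the gain exceeds what k+1 free parameters would buy you on k-component data anyway.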

But I think any kind of test like this will carry the additional assumption that we have a mixture of Normals. And how would you know whether you rejected k in favor of k+1, or rejected the assumption that you have a mixture of Normals at all?
