What are the distributions on the positive k-dimensional quadrant with parametrizable covariance matrix?
This is the question I posted this morning on StackOverflow, following an exchange two days ago with a user who could not see why the linear transform of a log-normal vector X,
Y = μ + Σ X
could lead to negative components in Y. After searching for a little while, I could not think of a joint distribution on the positive k-dimensional quadrant where I could specify the covariance matrix in advance, except for a pedestrian construction of (x1,x2) where x1 is an arbitrary Gamma variate [with a given variance] and x2, conditional on x1, is a Gamma variate with parameters specified by the covariance matrix. This construction does not extend nicely to larger dimensions.
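To see why Y = μ + Σ X can leave the positive quadrant even though every component of X is positive, here is a minimal simulation sketch. The values of `mu` and `Sigma` are hypothetical, picked only so that Σ has a negative off-diagonal entry; nothing here comes from the original exchange.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values, chosen for illustration only
mu = np.array([0.5, 0.5])
Sigma = np.array([[1.0, -0.8],
                  [-0.8, 1.0]])

# X is log-normal: the exponential of a standard bivariate normal,
# so every component of X is strictly positive
X = np.exp(rng.standard_normal((100_000, 2)))

# Y = mu + Sigma X, applied to each simulated row of X
Y = mu + X @ Sigma.T

# Fraction of draws with at least one negative component
frac_negative = np.mean((Y < 0).any(axis=1))
print(frac_negative)
```

As soon as Σ has a negative entry (or μ a negative component), a large enough value of the corresponding X coordinate pushes Y below zero, so the linear transform does not preserve the support.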
April 4, 2012 at 10:19 am
[…] about the question I posted on Friday (on StackExchange, no satisfactory answer so far!), I looked further at the special case of the […]
March 30, 2012 at 12:59 pm
If I needed a multivariate distribution on R^k_+, I would use the multivariate lognormal log(Y) ~ N(mu, Sigma). It doesn’t have a “known” covariance matrix (I don’t think – Google doesn’t readily reveal one and I’m not near books) but it does have sufficient flexibility to model “second-order” structure.
Why do you want the covariance matrix explicitly?
A more esoteric option (which may be useful if “zeros” are meaningful/wanted) would be the projection of a multivariate distribution onto R^k_+, which is not intractable (at least in the normal case) but will typically lead to an awkward point mass on the boundaries. Colin Fox from Otago wrote a nice paper on this recently.
Another option (maybe) would be to take a “manifold” view and see R^k_+ as a (boring!) manifold-with-boundary and “promote” a sensible distribution from its tangent space. This is how I would make distributions on positive definite matrices (partially because I love computing matrix functions), but I guess it should also work here.
It’s possible that one of these would lead to an easy relationship between the parameters of the base distribution and the covariance of the induced distribution, but it’s not fully obvious that it would….
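The projection option in this comment can be illustrated with a small sketch, assuming the Euclidean projection onto R^k_+ (which is the componentwise positive part) and a bivariate normal base distribution. The mean and covariance below are hypothetical values for illustration, and this is not meant to reproduce the paper mentioned above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical base distribution, for illustration only
mean = np.array([0.5, 0.5])
cov = np.array([[1.0, 0.6],
                [0.6, 1.0]])

Z = rng.multivariate_normal(mean, cov, size=100_000)

# Euclidean projection onto the non-negative orthant: componentwise max(z, 0)
Y = np.maximum(Z, 0.0)

# Share of draws sitting exactly on the boundary (at least one zero component)
frac_on_boundary = np.mean((Y == 0).any(axis=1))

# Empirical covariance after projection, to compare with the base covariance
sample_cov = np.cov(Y, rowvar=False)
print(frac_on_boundary)
print(sample_cov)
```

The boundary share is far from negligible here, which is the awkward point mass, and the covariance of the projected vector differs from `cov`, so the base parameters do not directly set the covariance of the induced distribution.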
March 30, 2012 at 1:17 pm
Thanks, Dan. The multivariate log-normal came to my mind too; however, only the covariance matrix of log(X) is set… As to why one would need to start from a fixed covariance matrix, this is a question for my interlocutor. More naïvely, I could end up with a case where correlations are paramount and have to be fixed.
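For what it is worth, the moments of the multivariate log-normal are available in closed form: if log(X) ~ N(mu, S), then E[X_i] = exp(mu_i + S_ii/2) and Cov(X_i, X_j) = E[X_i] E[X_j] (exp(S_ij) − 1). The sketch below inverts these relations to go from a target mean and covariance to (mu, S). The function names and the numerical targets are hypothetical, and the inversion only works when the resulting S is a valid covariance matrix, which rules out some targets (strong negative correlations in particular).

```python
import numpy as np

def lognormal_params(mean, cov):
    """Map a target mean/covariance of X to (mu, S) with log(X) ~ N(mu, S).

    Raises ValueError when the target cannot be matched by a log-normal.
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    ratio = 1.0 + cov / np.outer(mean, mean)
    if np.any(ratio <= 0):
        raise ValueError("target covariance too negative for these means")
    S = np.log(ratio)
    # S must itself be positive semi-definite to be a covariance matrix
    if np.min(np.linalg.eigvalsh(S)) < -1e-12:
        raise ValueError("implied covariance of log(X) is not PSD")
    mu = np.log(mean) - 0.5 * np.diag(S)
    return mu, S

def lognormal_moments(mu, S):
    """Closed-form mean and covariance of X when log(X) ~ N(mu, S)."""
    mean = np.exp(mu + 0.5 * np.diag(S))
    cov = np.outer(mean, mean) * (np.exp(S) - 1.0)
    return mean, cov

# Hypothetical targets, for illustration only
target_mean = np.array([1.0, 2.0])
target_cov = np.array([[0.5, 0.3],
                       [0.3, 1.0]])

mu, S = lognormal_params(target_mean, target_cov)
mean_back, cov_back = lognormal_moments(mu, S)
print(mu, S)
```

So the covariance of X can be fixed in advance within the log-normal family, but only over the subset of covariance matrices for which the two checks above pass.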
March 30, 2012 at 1:39 pm
Maybe the work that Besag used to do (I’m thinking specifically of MRF models {e.g. the Auto-Poisson model} with the [big] assumption that a similar construction works for log-normal RVs) could be useful here. Although again, it only specifies conditionals, not correlations (but it’s closer….).