Archive for log-normal distribution

exam question

Posted in Kids, Statistics, University life on June 24, 2016 by xi'an

A question for my third-year statistics exam that I borrowed from Cross Validated: no student even attempted to solve this question…!

And another one borrowed from the highly popular post on the random variable [almost] always smaller than its mean!
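A quick numerical illustration of that second fact through the log-normal (my own sketch, not part of the exam): if X = exp(σZ) with Z ~ N(0,1), then E[X] = exp(σ²/2), so X < E[X] exactly when Z < σ/2, and P(X < E[X]) = Φ(σ/2) goes to one as σ grows.

```r
# P(X < E[X]) for X log-normal, X = exp(sigma*Z), Z ~ N(0,1):
# since E[X] = exp(sigma^2/2), the event X < E[X] is Z < sigma/2
sapply(c(1, 5, 10), function(s) pnorm(s / 2))
# about 0.69, 0.994, 0.9999997: almost surely below its mean for large sigma
```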

merging MCMC subposteriors

Posted in Books, Statistics, University life on June 8, 2016 by xi'an

Christopher Nemeth and Chris Sherlock arXived a paper yesterday about an approach to distributed MCMC sampling via Gaussian processes. As in several other papers commented on the ‘Og, the issue is to merge MCMC samples from sub-posteriors into a sample, or some other approximation, of the complete (product) posterior. I am quite sympathetic to the approach adopted in this paper, namely to use a log-Gaussian process representation of each sub-posterior and then to replace each sub-posterior with its log-Gaussian process posterior expectation in an MCMC or importance scheme, and to assess its variability through the posterior variance of the sum of log-Gaussian processes. As pointed out by the authors, the closed-form representation of the posterior mean of the log-posterior is invaluable as it allows for an HMC implementation, and for importance solutions as well. The probabilistic numerics behind this perspective are also highly relevant.

A few arguable (?) points:

  1. The method often relies on importance sampling, and hence on the choice of an importance function, which is most likely influential but delicate to calibrate in complex settings, as I presume the Gaussian estimates are not useful in this regard;
  2. Using Monte Carlo to approximate the value of the approximate density at a given parameter value (by simulating from the posterior distribution) is natural but is it that efficient?
  3. It could be that, by treating all sub-posterior samples as noisy versions of the same (true) posterior, a more accurate approximation of this posterior could be constructed;
  4. The method relies on the exponentiation of a posterior expectation or simulation. As of yesterday, I am somehow wary of log-normal expectations!
  5. If the purpose of the exercise is to approximate univariate integrals, it would seem more profitable to use the Gaussian processes at the univariate level;
  6. The way the missing normalising constants and the duplicate simulations are processed (or not) could deserve further exploration;
  7. Computing costs are in fine unclear when compared with the other methods in the toolbox.
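The source of the wariness in point 4 can be checked in a few lines (a sketch of mine, not from the paper): if a log-density value is estimated by a Gaussian with mean m and variance s², exponentiating the mean alone misses the exp(s²/2) correction in the log-normal expectation.

```r
# log-normal expectation: if Z ~ N(m, s2), then E[exp(Z)] = exp(m + s2/2),
# so the plug-in exp(m) is biased low by a factor exp(s2/2)
set.seed(1)
m = 0; s2 = 4
z = rnorm(1e6, m, sqrt(s2))
mean(exp(z))   # Monte Carlo estimate, close to exp(m + s2/2) = exp(2)
exp(m)         # naive plug-in, equal to 1 here: off by a factor exp(2)
```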

What are the distributions on the positive k-dimensional quadrant with parametrizable covariance matrix? (solved)

Posted in R, Statistics, University life on April 8, 2012 by xi'an

Paulo (from the Instituto de Matemática e Estatística, Universidade de São Paulo, Brazil) has posted an answer to my earlier question both as a comment on the ‘Og and as a solution on StackOverflow (with a much more readable LaTeX output). His solution is based on the observation that the multidimensional log-normal distribution still allows for closed-form expressions of both the mean and the variance, and that those expressions can further be inverted to impose the pair (μ,Σ) on the log-normal vector. In addition, he shows that the only constraint on the covariance matrix is that each covariance σij is larger than −μiμj. Very neat!
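To make this observation concrete, here is a sketch (my own notation and function names, not Paulo's code): writing X = exp(Z) with Z ~ N(ν,Ω), the closed forms E[X_i] = exp(ν_i + Ω_ii/2) and Cov(X_i,X_j) = E[X_i]E[X_j](exp(Ω_ij) − 1) invert into Ω_ij = log(1 + S_ij/(m_i m_j)) and ν_i = log m_i − Ω_ii/2 for a target mean m and covariance S, well-defined exactly when S_ij > −m_i m_j.

```r
# invert the target mean m and covariance S of the log-normal vector
# X = exp(Z) into the Gaussian parameters (nu, Omega) of Z
lognormal.params = function(m, S){
  Omega = log(1 + S / outer(m, m))   # requires S[i,j] > -m[i]*m[j]
  nu = log(m) - diag(Omega) / 2
  list(nu = nu, Omega = Omega)
}
# forward map, to check the inversion in closed form
lognormal.moments = function(nu, Omega){
  m = exp(nu + diag(Omega) / 2)
  list(m = m, S = outer(m, m) * (exp(Omega) - 1))
}
m = c(2, 5); S = matrix(c(1, .3, .3, 2), 2, 2)
p = lognormal.params(m, S)
lognormal.moments(p$nu, p$Omega)   # recovers m and S exactly
```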

In the meanwhile, I corrected my earlier R code on the gamma model, thanks to David Epstein pointing out a mistake in the resolution of the moment equation, and I added the constraint on the covariance already noticed by David in his question. Here is the full code:

sol=function(mu,sigma){
  # solve the moment equations of the gamma model for (alpha,beta),
  # given the target mean mu and (flattened) covariance sigma
  solub=TRUE
  alpha=rep(0,3)
  beta=rep(0,2)
  beta[1]=mu[1]/sigma[1]
  alpha[1]=mu[1]*beta[1]
  coef=mu[2]*sigma[1]-mu[1]*sigma[3]
  if (coef<0){
    solub=FALSE
  }else{
    beta[2]=coef/(sigma[1]*sigma[2]-sigma[3]^2)
    alpha[2]=sigma[3]*beta[2]/sigma[1]^2
    alpha[3]=mu[2]*beta[2]-mu[1]*alpha[2]
    # a negative shape parameter means no gamma solution exists
    if (alpha[3]<0) solub=FALSE
  }
list(solub=solub,alpha=alpha,beta=beta)
}

mu=runif(2,0,10)
sig=c(mu[1]^2/runif(1),mu[2]^2/runif(1))
sol(mu,c(sig,runif(1,max(-sqrt(prod(sig)),-mu[1]*mu[2]),sqrt(prod(sig)))))

and I did not get any FALSE outcome when running this code several times.