I think that, rather than being an armchair expert on GLD, the best way is to actually try some of the packages like GLDEX and see how well the technique fits empirical data. It seems to me that the author of this post just read and formed his opinion, without actually trying to see whether the technique actually works…

Thanks in advance!

Krishna

Thanks for the details! The very short answer is no! The short answer is no, I do not think GLDs are worth investing in and no, I do not know much about those distributions and their applications, as I only wrote this highly negative book review… So I would very frankly advise you not to spend much time on them.

Thank you very much for your reply. I think I should have made my comments clearer.

1. I understand how to find the asymptotic variance of the MLE and the bootstrap method in general (I have a PhD in econometrics). However, in the papers I found online, e.g. Lakhany & Mausser (2000), Tarsitano (2004) and Su (2007), most attention is paid to “fitting” and to how well the (fitted) theoretical moments match the empirical ones, rather than to the uncertainty of the resulting estimators. You also mentioned “the glaring absence of confidence statements”. This made me wonder whether this is just difficult for GLD, which is why I asked the question. Bootstrapping appears to be straightforward here and I guess someone has already tried it, but I could not find such a paper.

2a. I had also thought that “So, on principle, they can be used everywhere a normal distribution, say, is used.” Then I searched on Google and could only find one such paper, Dean & King (2009). And in it the authors only discuss its application in SIMPLE regression! Well, my impression is that even the generalization to multiple regression is non-trivial… And I would like to know whether any other research has been done in this direction.

2b. Well, given my impression and after some thought, I then wondered whether it is worth doing. In my (applied) work, most of the time I am more interested in the (functional) relationship between response and covariates (regression), or between past values and the current value (time series). Much, much less attention is paid to the error distribution. This just makes sense to me. I know we can sometimes use a t distribution instead of a normal to account for outliers, but I have not seen many people doing this in practice. If we then replace the error distribution by the GLD, much more effort has to be allocated to fitting the error distribution. I agree with you that “I personally do not see any gain in using it”.

I must confess that I had not heard of GLD until a few days ago, when I was looking for alternative distributions for copulas. Therefore it is very likely that I got things wrong in my comments. GLD caught my attention because of its flexibility. But I soon questioned how it can really be used in practice. For example, in Tarsitano (2004), the author investigates the distribution of income data… To me, a more natural question is how income is related to other variables, say education, age, etc. Even if we are really interested in just the distribution, I guess a Gaussian mixture model can do quite a good job (though it has its own problems).

Basically, I am curious why researchers have put so much effort into this distribution. Are there any interesting applications I have missed?

1. This is too broad a question for me to answer [here]! Given an estimation method, you may try to figure out the asymptotics of the estimator, which gives a first entry to confidence sets and testing. If this is not possible, bootstrap is often an answer, if not necessarily the most efficient.

2a. GLDs in regression: You have to understand that GLDs are particular parametric distributions. So, on principle, they can be used everywhere a normal distribution, say, is used. I personally do not see any gain in using it, but this is a personal opinion.

2b. I do not get your last point, however I think this is unrelated to GLDs per se, because of the previous reason. GLDs are not more or less robust than other families of distributions…
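To make point 1 concrete, here is a minimal sketch of the nonparametric bootstrap for a standard error and a percentile interval. It is not tied to any GLD package: `bootstrap` is a hypothetical helper, the sample median merely stands in for whatever GLD parameter estimator is actually used, and the simulated normal sample is purely illustrative.

```python
import random
import statistics


def bootstrap(data, estimator, n_boot=2000, level=0.95, seed=0):
    """Nonparametric bootstrap: standard error and percentile interval.

    `estimator` is any function mapping a list of observations to a number.
    """
    rng = random.Random(seed)
    n = len(data)
    # Re-estimate on n_boot resamples drawn with replacement from the data.
    reps = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    se = statistics.stdev(reps)
    alpha = 1.0 - level
    lo = reps[int((alpha / 2) * (n_boot - 1))]
    hi = reps[int((1 - alpha / 2) * (n_boot - 1))]
    return se, (lo, hi)


# Illustrative data only: 200 draws from a standard normal.
data_rng = random.Random(1)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(200)]

# The median is a placeholder for a fitted GLD parameter (assumption).
se, (lo, hi) = bootstrap(sample, statistics.median)
print(f"bootstrap SE = {se:.3f}, 95% percentile interval = ({lo:.3f}, {hi:.3f})")
```

With a GLD fit, `estimator` would wrap the whole fitting routine and return one parameter at a time, so the cost is roughly `n_boot` refits, which is one reason the bootstrap is "not necessarily the most efficient" answer.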

I am not an expert on GLD at all, so I have a few questions about it:

1. For any estimation method, e.g. MLE or L-moments, can we calculate standard errors so as to construct confidence intervals or perform hypothesis tests?

2. Can we use GLD within, say, regression analysis? If yes, given its complexity, what is the gain of using it? As far as I know, many regression models are robust to error distribution when sample size is moderate. I ask this question because in all my work I am more interested in the ‘structure’ or ‘pattern’ exhibited in the data so I never need to fit a distribution to the data ONLY.

I understand your position but I remain unconvinced: GLDs are parametric distributions, thus not universally able to model any situation. They are also mostly implicit in that, while simulation is straightforward, inference is hard because of the lack of a closed-form likelihood. The last point is about interpretation: once a dataset is fitted with a GLD, what else can be said about the underlying phenomenon?
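As an illustration of why simulation is the easy part, here is a sketch assuming the Ramberg-Schmeiser parameterisation, in which the GLD is defined through its quantile function Q(u) = λ1 + (u^λ3 − (1−u)^λ4)/λ2. Sampling is then plain inverse transform, whereas the density is only available as a function of u, i.e. f(Q(u)) = λ2 / (λ3 u^(λ3−1) + λ4 (1−u)^(λ4−1)), so evaluating a likelihood at an observation x first requires solving Q(u) = x numerically. The function names and parameter values below are illustrative and not taken from the book or from any package.

```python
import random


def gld_quantile(u, lam1, lam2, lam3, lam4):
    """Quantile function of the GLD (Ramberg-Schmeiser form, assumed here)."""
    return lam1 + (u ** lam3 - (1.0 - u) ** lam4) / lam2


def gld_density_at_quantile(u, lam2, lam3, lam4):
    """Density evaluated at x = Q(u); note it is indexed by u, not by x."""
    return lam2 / (lam3 * u ** (lam3 - 1.0) + lam4 * (1.0 - u) ** (lam4 - 1.0))


def gld_sample(n, lam1, lam2, lam3, lam4, seed=0):
    """Inverse-transform sampling: push uniform draws through Q."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # skip u == 0 so negative shape parameters cannot divide by zero
            draws.append(gld_quantile(u, lam1, lam2, lam3, lam4))
    return draws


# Illustrative symmetric parameter values giving a roughly bell-shaped law.
params = (0.0, 0.1975, 0.1349, 0.1349)
draws = gld_sample(1000, *params)
print(min(draws), max(draws))
print(gld_density_at_quantile(0.5, *params[1:]))
```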

GLD is actually quite flexible in terms of fitting a distribution to data. Try fitting this distribution to some real-life data using the gld or GLDEX package and you will see that it tends to fit data fairly well. The benefit of GLD comes from the fact that we can often approximate the density of our data without having to try lots of different distributions. Don’t forget that in real life, you are not going to know the true distribution. So if you want to estimate the distribution, it would often be better to choose a flexible distribution.

Why would it be beneficial to fit a distribution to data? In many ways, if you can do this reasonably accurately, you can get all the statistical properties under one roof. You can get mean, variance, median, quantiles, whereas currently, you often need to use different statistical techniques to get different statistical properties of the data. E.g. you might estimate the sample mean, but you might estimate the density using kernel density estimation. It would be much more elegant just to fit a distribution and get all the estimated statistical properties in one go.
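To illustrate the "one roof" argument, here is a sketch of reading several summaries off a single fitted quantile function. It assumes the Ramberg-Schmeiser parameterisation Q(u) = λ1 + (u^λ3 − (1−u)^λ4)/λ2 with λ3, λ4 > −1/2, so that the mean and variance exist and follow from the moments of a uniform variable. The parameter tuple `fit` is a hypothetical stand-in for the output of a fitting routine, and all function names are illustrative.

```python
import math


def gld_quantile(u, lam1, lam2, lam3, lam4):
    """Quantile function of the GLD (Ramberg-Schmeiser form, assumed here)."""
    return lam1 + (u ** lam3 - (1.0 - u) ** lam4) / lam2


def beta_fn(a, b):
    """Beta function via the gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def gld_mean(lam1, lam2, lam3, lam4):
    # E[U^k] = E[(1-U)^k] = 1 / (k + 1) for U uniform on (0, 1).
    return lam1 + (1.0 / (1.0 + lam3) - 1.0 / (1.0 + lam4)) / lam2


def gld_variance(lam2, lam3, lam4):
    a = 1.0 / (1.0 + lam3) - 1.0 / (1.0 + lam4)
    # E[(U^l3 - (1-U)^l4)^2]; the cross term is E[U^l3 (1-U)^l4] = B(1+l3, 1+l4).
    b = (1.0 / (1.0 + 2.0 * lam3) + 1.0 / (1.0 + 2.0 * lam4)
         - 2.0 * beta_fn(1.0 + lam3, 1.0 + lam4))
    return (b - a * a) / (lam2 * lam2)


# Hypothetical fitted parameters (lam1, lam2, lam3, lam4).
fit = (0.0, 0.1975, 0.1349, 0.1349)

summary = {
    "mean": gld_mean(*fit),
    "variance": gld_variance(*fit[1:]),
    "median": gld_quantile(0.5, *fit),
    "q25": gld_quantile(0.25, *fit),
    "q75": gld_quantile(0.75, *fit),
}
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Whether these numbers are trustworthy still depends entirely on how well the fitted GLD matches the data, which is the point under dispute in this thread.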

I would like to point out that MLE is not the only way to fit GLD; you can also fit GLD using L-moments and other methods. And actually, MLE is not that difficult to achieve. The real difficulty is to find suitable initial values, but this is the same problem for *many* numerical estimation problems, and a solution has been proposed in Su (2007) in CSDA.

Of course the fitting method is going to be approximate, but the question is whether it is a sufficiently good estimate (and this is something we check using QQ plots, goodness-of-fit tests, etc.). In fact, a number of estimation methods to fit GLD to data are now available in the GLDEX package, which incidentally is also covered in this book.

The chapter on fitting mixtures was not by Su; it was by Ning, Gao and Dudewicz. A newcomer to this area could just read Su (2007) in JSS or the chapter on using GLDEX and start using the package to fit GLD to data. Also, the book contains a number of applications of GLD which the reviewer never commented on…

There are also things I dislike about the book. There does not seem to be a very integrated effort on distributional fitting methods for GLD and others: there are different approaches, some of which use Karian and Dudewicz’s original work, and there could have been a better flow between them. The tables are probably not needed in the electronic age we live in (they could be stored electronically), and they are based on a particular method which is known to be unstable (i.e. matching moments does not guarantee a good fit to the overall distribution).

As for saving a tree, you can just buy the ebook version :)

I think that, as researchers, we all have our own judgement about what is useful and what is not. However, given that GLD has been used in various disciplines over the years, I think it is unfair to create the impression from this review that GLD is totally useless.

Thanks. I do not know of a book focussing on fitting all distributions, but encyclopedias like Johnson and Kotz’s obviously include traditional estimation methods for the standard distributions.

I would welcome any suggestions for books that cover the fitting of statistical distributions in a broader sense.
