Archive for tails

the ABC-SubSim algorithm

Posted in pictures, Statistics with tags , , , , , , on April 29, 2014 by xi'an

cutcut3In a nice coincidence with my ABC tutorial at AISTATS 2014 – MLSS, Manuel Chiachioa, James Beck, Juan Chiachioa, and Guillermo Rus arXived today a paper on a new ABC algorithm, called ABC-SubSim. The SubSim stands for subset simulation and corresponds to an approach developed by one of the authors for rare-event simulation. This approach looks somewhat similar to the cross-entropy method of Rubinstein and Kroese, in that successive tail sets are created towards reaching a very low probability tail set. Simulating from the current subset increases the probability to reach the following and less probable tail set. The extension to the ABC setting is done by looking at the acceptance region (in the augmented space) as a tail set and by defining a sequence of tolerances.  The paper could also be connected with nested sampling in that constrained simulation through MCMC occurs there as well. Following the earlier paper, the MCMC implementation therein is a random-walk-within-Gibbs algorithm. This is somewhat the central point in that the sample from the previous tolerance level is used to start a Markov chain aiming at the next tolerance level. (Del Moral, Doucet and Jasra use instead a particle filter, which could easily be adapted to the modified Metropolis move considered in the paper.) The core difficulty with this approach, not covered in the paper, is that the MCMC chains used to produce samples from the constrained sets have to be stopped at some point, esp. since the authors run those chains in parallel. The stopping rule is not provided (see, e.g., Algorithm 3) but its impact on the resulting estimate of the tail probability could be far from negligible… Esp. because there is no burnin/warmup. (I cannot see how “ABC-SubSim exhibits the benefits of perfect sampling” as claimed by the authors, p. 6!)  The authors re-examined the MA(2) toy benchmark we had used in our earlier survey, reproducing as well the graphical representation on the simplex as shown above.

Good size swans and turkeys

Posted in Books, Statistics with tags , , , , on February 24, 2009 by xi'an

In connection with The Black Swan, Nassim Taleb wrote a small essay called The Fourth Quadrant on The Edge. I found it much more pleasant to read than the book because (a) it directly focus on the difficulty of dealing with fat tail distributions and the prediction of extreme events, and (b) it is delivered in a much more serene tone than the book (imagine, just the single remark about the Frenchs!). The text contains insights on loss functions and inverse problems which, even though they are a bit vague, do mostly make sense. As for The Black Swan, I deplore (a) the underlying determinism of the author, which still seems to believe in an unknown (and possibly unachievable) model that would rule the phenomenon under study and (b) the lack of temporal perspective and of the possibility of modelling jumps as changepoints, i.e. model shifts. Time series have no reason to be stationary, the less so the more they depend on all kinds of exogeneous factors. I actually agree with Taleb that, if there is no information about the form of the tails of the distribution corresponding to the phenomenon under study—assuming there does exist a distribution—, estimating the shape of this tail from the raw data is impossible.

The essay is followed by a technical appendix that expands on fat tails, but not so deeply as to be extremely interesting. A surprising side note is that Taleb seems to associate stochastic volatility with mixtures of Gaussians. In my personal book of models, stochastic volatility is a noisy observation of the exponential of a random walk, something like\nu_t={\exp(ax_{t-1}+b\epsilon_t)},thus with much higher variation (and possibly no moments). To state that Student’s t distributions are more variable than stochastic volatility models is therefore unusual… There is also an analysis over a bizillion datasets of the insanity of computing kurtosis when the underlying distribution may not have even a second moment. I could not agree more: trying to summarise fat tail distributions by their four first moments does not make sense, even though it may sell well. The last part of the appendix shows the equal lack of stability of estimates of the tail index{\alpha},which again is not a surprising phenomenon: if the tail bound K is too low, it may be that the power law has not yet quicked in while, if it is too large, then we always end up with not enough data. The picture shows how the estimate widely varies with K around its theoretical value for the log-normal and three Pareto distributions, based on a million simulations. (And this is under the same assumption of stationarity as above.) So I am not sure what the message is there. (As an aside, there seems to be a mistake in the tail expectation: it should be

\dfrac{\int_K^\infty x x^{-\alpha} dx}{\int_K^\infty x^{-\alpha} dx} = \dfrac{K(\alpha-1)}{(\alpha-2)}

if the density decreases in\alpha\cdotsIt is correct when\alphais the tail power of the cdf.)power estimate