Archive for median

insufficient Gibbs sampling bridges as well!

Posted in Books, Kids, pictures, R, Statistics, University life on March 3, 2024 by xi'an

Antoine Luciano, Robin Ryder and I posted a revised version of our insufficient Gibbs sampler on arXiv last week (along with three other revisions or new deposits of mine!), following comments and suggestions from referees. Thanks to this revision, we realised that the evidence based on an (insufficient) statistic can also be approximated by a Monte Carlo estimate attached to the completed sample simulated by the insufficient sampler. Better still, a bridge sampling estimator can be used under the same conditions as when the full data is available! In this new version, we thus revisited toy examples first explored in some of my ABC papers on testing (with insufficient statistics), as illustrated by both graphs in this post.
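As a toy check of the bridge sampling identity invoked above (and not the paper's actual estimator), one can estimate the ratio of normalising constants of two Gaussian kernels from samples of each, via the Meng and Wong iterative scheme; the densities, sample sizes, and iteration count below are illustrative choices:

```python
import math
import random

def bridge_sampling_ratio(q1, q2, x1, x2, iters=50):
    """Iterative (optimal) bridge sampling estimate of Z1/Z2,
    given draws x1 ~ q1/Z1 and x2 ~ q2/Z2 (Meng & Wong, 1996)."""
    n1, n2 = len(x1), len(x2)
    s1, s2 = n1 / (n1 + n2), n2 / (n1 + n2)
    r = 1.0  # initial guess for the ratio Z1/Z2
    for _ in range(iters):
        num = sum(q1(y) / (s1 * q1(y) + s2 * r * q2(y)) for y in x2) / n2
        den = sum(q2(x) / (s1 * q1(x) + s2 * r * q2(x)) for x in x1) / n1
        r = num / den
    return r

random.seed(1)
# unnormalised N(0,1) and N(0,4) kernels: true Z1/Z2 = 1/2
q1 = lambda x: math.exp(-x * x / 2)
q2 = lambda x: math.exp(-x * x / 8)
x1 = [random.gauss(0, 1) for _ in range(20000)]
x2 = [random.gauss(0, 2) for _ in range(20000)]
r_hat = bridge_sampling_ratio(q1, q2, x1, x2)  # should settle near 0.5
```

With these sample sizes the estimate should sit close to the true ratio 1/2; in the setting of the paper, the two samples would instead come from the completed posteriors under the two competing models.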

mostly MC[bruary]

Posted in Books, Kids, Statistics, University life on February 18, 2024 by xi'an

insufficient Gibbs at One World ABC [25/01]

Posted in Kids, pictures, Statistics, Travel, University life on January 22, 2024 by xi'an

The next [on-line] One World Approximate Bayesian Computation (ABC) Seminar will be delivered by Antoine Luciano, currently writing his PhD with Robin Ryder and me. It will take place at 9am UK/GMT on Thursday 25 January, with members of the stats lab here in CEREMADE attending Antoine's lecture live at the PariSanté campus. Here is the abstract of the talk:

In some applied scenarios, the availability of complete data is restricted, often due to privacy concerns, and only aggregated, robust and inefficient statistics derived from the data are accessible. These robust statistics are not sufficient, but they demonstrate reduced sensitivity to outliers and offer enhanced data protection due to their higher breakdown point. In this article, operating within a parametric framework, we propose a method to sample from the posterior distribution of parameters conditioned on different robust and inefficient statistics: specifically, the pairs (median, MAD) or (median, IQR), or one or more quantiles. Leveraging a Gibbs sampler and the simulation of latent augmented data, our approach facilitates simulation according to the posterior distribution of parameters belonging to specific families of distributions. We demonstrate its applicability on the Gaussian, Cauchy, and translated Weibull families.

based on our recent arXival.
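The data-augmentation idea in the abstract can be sketched in its simplest instance, a N(μ, 1) sample observed only through its median. This is a minimal illustration under assumed settings (known unit variance, flat prior on μ, odd sample size), not the paper's general (median, MAD) or quantile algorithm:

```python
import random

def trunc_normal(mu, below, bound):
    """Rejection sampler for N(mu, 1) truncated below/above `bound`.
    Adequate here since `bound` stays near `mu`; not for far tails."""
    while True:
        z = random.gauss(mu, 1)
        if (z < bound) == below:
            return z

def insufficient_gibbs_median(m, n, n_iter=2000):
    """Gibbs sampler targeting p(mu | median = m) for a N(mu, 1)
    sample of odd size n, under a flat prior on mu."""
    assert n % 2 == 1
    k = (n - 1) // 2
    mu, draws = m, []
    for _ in range(n_iter):
        # 1. complete the sample given mu and the observed median:
        #    one value at m, k values below it, k values above it
        x = ([m] + [trunc_normal(mu, True, m) for _ in range(k)]
                 + [trunc_normal(mu, False, m) for _ in range(k)])
        # 2. update mu given the completed sample (conjugate step)
        mu = random.gauss(sum(x) / n, 1 / n ** 0.5)
        draws.append(mu)
    return draws

random.seed(42)
draws = insufficient_gibbs_median(0.0, 11)
# by symmetry the posterior is centred at the observed median
```

The completion step is exact because, conditional on the median, the remaining observations are independent truncated Normals on either side of it; the bookkeeping grows once the MAD or several quantiles are fixed as well.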

insufficient Gibbs sampling

Posted in Books, Kids, Statistics, University life on July 29, 2023 by xi'an

Antoine Luciano, Robin Ryder, and I, from Université Paris Dauphine, have just arXived our paper on insufficient Gibbs sampling. This is Antoine's first paper and part of his PhD. (In particular, he wrote the entire code.) The idea stemmed from a discussion on ABC benchmarks, like the one where the pair (median, MAD) is the only available observation. With no joint density available, the setting seems to prohibit calling for an MCMC sampler. However, simulating the complete data set conditional on these statistics proves feasible, with a bit of bookkeeping, and with obviously much better results [demonstrated above for a Cauchy example] than when calling ABC, at a very similar cost (if one does not account for the ability of ABC to be parallelised). The idea can obviously be extended to other settings, as long as completion remains achievable. (And a big thanks to our friend Ed George, who suggested the title while at CIRM. I had suggested "Gibbs for boars" as a poster title, in connection with the historical time-line of

Gibbs for Kids (Casella and George) — Gibbs for Pigs (Gianola) — Gibbs for Robust Pigs = Gibbs for Boars

and the abundance of boars on the Luminy campus, but this did not sound convincing enough for Antoine.)
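For contrast, the ABC rejection benchmark alluded to above can be sketched for a Cauchy location parameter with (median, MAD) summaries; the prior, tolerance, and sample sizes below are illustrative assumptions rather than the paper's settings:

```python
import math
import random
import statistics

def cauchy_sample(loc, n):
    # tan of a uniform angle yields a standard Cauchy draw
    return [loc + math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]

def mad(x):
    med = statistics.median(x)
    return statistics.median(abs(v - med) for v in x)

def abc_median_mad(obs, n_sim=20000, eps=0.3):
    """ABC rejection: N(0, 10) prior on the location, scale fixed to 1,
    (median, MAD) summaries, Euclidean distance, tolerance eps."""
    s_obs = (statistics.median(obs), mad(obs))
    n, accepted = len(obs), []
    for _ in range(n_sim):
        theta = random.gauss(0, 10)
        x = cauchy_sample(theta, n)
        d = ((statistics.median(x) - s_obs[0]) ** 2
             + (mad(x) - s_obs[1]) ** 2) ** 0.5
        if d < eps:
            accepted.append(theta)
    return accepted

random.seed(7)
obs = cauchy_sample(0.0, 101)   # pseudo-observed data, true location 0
post = abc_median_mad(obs)
```

Each accepted draw costs a full pseudo-sample, and the output only targets the posterior up to the tolerance, which is where the exact (if sequential) Gibbs completion gains its edge.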

information loss from the median

Posted in Books, Kids, Statistics on April 19, 2022 by xi'an

An interesting side item from an X validated question about calculating the Fisher information of the Normal median (as an estimator of the mean). While this information is not available in closed form, it has a "nice" expression

1+n\mathbb E[Z_{n/2:n}\varphi(Z_{n/2:n})]-n\mathbb E[Z_{n/2:n-1}\varphi(Z_{n/2:n-1})]+\frac{n(n-1)}{n/2-2}\,\mathbb E[\varphi(Z_{n/2-2:n-2})^2]+\frac{n(n-1)}{n-n/2-1}\,\mathbb E[\varphi(Z_{n/2:n-2})^2]

which can easily be approximated by simulation (much faster than by estimating the variance of said median). This shows that the median is about 1.57 times less informative than the empirical mean. Bonus points for computing the information brought by the MAD statistic! (Its information loss against the MLE is a factor of 2.69, since the Monte Carlo ratio of their variances is 0.37.)
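The 1.57 ≈ π/2 factor can be checked by a quick simulation comparing the Monte Carlo variances of the sample median and the sample mean of a Normal sample (the sample size and replication count below are arbitrary choices):

```python
import random
import statistics

random.seed(0)
n, reps = 101, 10000
medians, means = [], []
for _ in range(reps):
    x = [random.gauss(0, 1) for _ in range(n)]
    medians.append(statistics.median(x))
    means.append(sum(x) / n)

# the asymptotic relative efficiency of the median is 2/pi,
# so the variance ratio should approach pi/2 ~ 1.57
ratio = statistics.pvariance(medians) / statistics.pvariance(means)
```

Estimating the two variances directly like this is noisier than simulating the expectations in the displayed expression, but it suffices to recover the factor.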