## non-negative unbiased estimators

**P**ierre Jacob and Alexandre Thiéry just arXived a highly pertinent paper on the most debated issue of non-negative unbiased estimators (of positive quantities). If you remember that earlier post of mine, I mentioned the issue in connection with the Russian roulette estimator(s) of Mark Girolami et al. And, as Pierre and Alexandre point out in the paper, there is also a clear and direct connection with the Bernoulli factory problem. And with our Vanilla Rao-Blackwellisation technique (sadly overlooked, once more!).

**T**he first thing I learned from the paper is how to turn a converging sequence into an unbiased estimator. If *(E _{n})* is this converging sequence, with limit μ, and if N is an integer-valued random variable, independent of the sequence, with ℙ(N=n)=p_n>0 for all n, then (setting E_{-1}=0)

$$Z=\frac{E_N-E_{N-1}}{p_N}$$

is unbiased..! Amazing. Indeed, 𝔼[Z]=Σ_n p_n (E_n−E_{n−1})/p_n = Σ_n (E_n−E_{n−1}) = μ, at least when the interchange of sum and limit is licit. Even though the choice of the distribution of N matters towards getting a finite variance estimator, this transform is simply amazing. (Of course, once one looks at it, one realises it is the “old” trick of turning a series into a sequence and vice-versa. Still…!) And then you can reuse it to get an unbiased estimator of almost any transform of μ.
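A minimal numerical sketch of this single-term debiasing trick (illustrative Python, not code from the paper): take the deterministic sequence E_n = Σ_{k≤n} 1/k!, which converges to e, draw N from a geometric distribution, and average many independent copies of Z.

```python
import math
import random

def partial_sum(n):
    """E_n = sum_{k=0}^{n} 1/k!, converging to e."""
    return sum(1.0 / math.factorial(k) for k in range(n + 1))

def single_term_estimator(q=0.5):
    """One draw of Z = (E_N - E_{N-1}) / P(N = n) evaluated at n = N,
    with N geometric: P(N = n) = (1 - q) * q**n, and E_{-1} = 0."""
    n = 0
    while random.random() < q:       # sample N
        n += 1
    p_n = (1.0 - q) * q ** n
    e_prev = partial_sum(n - 1) if n > 0 else 0.0
    return (partial_sum(n) - e_prev) / p_n

random.seed(42)
draws = [single_term_estimator() for _ in range(200_000)]
print(sum(draws) / len(draws))       # close to e = 2.718...
```

Here the geometric tail of N decays much more slowly than the squared increments (1/n!)², so the variance is finite; with heavier increments, the choice of the distribution of N is exactly what makes or breaks the estimator.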

**T**he second novel thing in the paper is the characterisation of impossible cases for non-negative unbiased estimators. For instance, if the original sequence has an unbounded support, there cannot be such an estimator. If the support is a half-line, the transform must be ~~monotonous~~ monotonic. If the support is a bounded interval (a,b), then the transform must be bounded from below by a polynomial bound

(where the extra parameters obviously relate to the transform). (In this latter case, the authors also show how to derive a Bernoulli estimator from the original unbiased estimator.)

October 3, 2013 at 5:53 pm

Thanks for the review! :-)

I’d be happy to talk more about the links with your vanilla RB paper.

Just to clarify: in the half-line [0,infty) case, the function has to be increasing, not only monotonic. In particular f(x) = 1/x doesn’t work… which is too bad! And for (-infty,0] the function f has to be decreasing.

Dan, I think I see what you mean. You assume V_n is a Monte Carlo estimate based on n samples, right? In which case you’re absolutely right, it smells like a dead end. However you can always assume V_n relies on r_n samples with r_n being an increasing sequence of integers. Then, certainly some choice of r_n makes the method “valid” (I suspect r_n = hypersupermultifactorial(n) would suit many choices of N!). Not necessarily practical…

In the Bernoulli factory literature, it seems that first people were interested in what’s possible and what’s not (existence), and then a considerable additional work was done in the direction of efficiency (precise analysis of the law of the required number of coin flips). I guess we’re still at the Stone Age level of non-negative unbiased estimators.

October 3, 2013 at 1:56 am

This is a brilliant paper!

(Slight warning – late night maths follows, so this could be very wrong)

One of the slightly awkward things (on a slightly shallow reading) about the *really cool* method for unbiasing sequences is that for finite variance it is required that (slight obvious change of notation)

\sum_n E(X_n − X)^2 / P(N = n) < \infty

But, in the obvious case where V_n is an MC estimate of V and X_n = f(V_n) is a biased but consistent estimate of f(V), I'm not sure you can make that sum converge.

At least the obvious bound, assuming f is Lipschitz, gives E(X_n − X)^2 = O(1/n), so no proper distribution for N will lead to a convergent series (P(N = n) would have to grow with n to stop the sum diverging).

Of course some better analysis may improve the upper bound to the point where we get something summable, but I don't see how.
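In fact, if the squared error is genuinely of order 1/n (as the delta method suggests for f(V̄_n) whenever f′(V) ≠ 0), Cauchy–Schwarz seems to rule it out outright: for any proper distribution of N,

$$\Bigl(\sum_n \sqrt{c/n}\Bigr)^{2} \;\le\; \sum_n P(N=n)\;\times\;\sum_n \frac{c/n}{P(N=n)} \;=\; \sum_n \frac{c/n}{P(N=n)},$$

and the left-hand side is infinite since \sum_n n^{-1/2} diverges, so the variance sum diverges for every choice of N.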

October 3, 2013 at 9:01 am

Hi Dan,

yes, you’re perfectly right that in some situations of interest, this trick cannot deliver an unbiased estimator with both a finite variance and a finite (average) computational budget (we hesitated to start discussing this issue in the paper… and we may do so in a subsequent version). For example, we mentioned in the paper that one can “debias” MCMC estimators, but with the telescoping series trick one cannot obtain a finite variance estimator (in some other cases, e.g. multilevel MC, the trick works surprisingly well though!). Note also that one can always make the sum you mentioned finite by taking V_n to be an MC estimate that uses r_n ≫ n samples.

At the beginning, I found it very surprising that one could “debias” an MCMC simulation at all; now that we know it is possible with a finite (average) computational budget, somebody clever might come up with a finite-variance estimator!

Also, I just wanted to point out that this telescoping series trick is an old chestnut (surprisingly unknown to statisticians)!
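For concreteness, here is a toy sketch (illustrative code, not from the paper) of the coupled-sum form of the trick, Z = Σ_{n=0}^{N} (E_n − E_{n−1})/P(N ≥ n), again on the deterministic sequence E_n = Σ_{k≤n} 1/k! → e:

```python
import math
import random

def coupled_sum_estimator(increment, q=0.5):
    """One draw of Z = sum_{n=0}^{N} (E_n - E_{n-1}) / P(N >= n),
    with N geometric, so P(N >= n) = q**n, and E_{-1} = 0."""
    n = 0
    while random.random() < q:       # sample N: P(N = n) = (1 - q) q^n
        n += 1
    return sum(increment(k) / q ** k for k in range(n + 1))

# Toy sequence: E_n = sum_{k<=n} 1/k! -> e, so the k-th increment is 1/k!.
random.seed(42)
draws = [coupled_sum_estimator(lambda k: 1.0 / math.factorial(k))
         for _ in range(100_000)]
print(sum(draws) / len(draws))       # close to e
```

Here each realisation is bounded by Σ_k 2^k/k! = e², so the variance is finite; the whole game in less friendly settings is choosing N (and, per the discussion above, the number of samples behind each E_n) so that this remains true.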

October 3, 2013 at 1:43 am

I think you mean “monotonic” not “monotonous”, which means boring! But, maybe you mean that the transform is boring!

October 3, 2013 at 5:49 am

Darn! Yes, I meant monotonic. I keep making this mistake as, in French, *monotone* works for both meanings (because, after all, a monotonic function offers few surprises and is therefore somehow boring indeed!)… Thanks.