## non-negative unbiased estimators

Pierre Jacob and Alexandre Thiéry just arXived a highly pertinent paper on the most debated issue of non-negative unbiased estimators (of positive quantities). If you remember that earlier post of mine, I mentioned the issue in connection with the Russian roulette estimator(s) of Mark Girolami et al. And, as Pierre and Alexandre point out in the paper, there is also a clear and direct connection with the Bernoulli factory problem. And with our Vanilla Rao-Blackwellisation technique (sadly overlooked, once more!).

The first thing I learned from the paper is how to turn a converging sequence into an unbiased estimator. If (En) is this converging sequence, with limit μ, then

$\sum_{n=0}^N (E_n-E_{n-1}) / \mathbb{P}(N\ge n)$

is unbiased..! Amazing. Even though the choice of the distribution of N matters towards getting a finite variance estimator, this transform is simply amazing. (Of course, once one looks at it, one realises it is the “old” trick of turning a series into a sequence and vice-versa. Still…!) And then you can reuse it into getting an unbiased estimator for almost any transform of μ.

The second novel thing in the paper is the characterisation of impossible cases for non-negative unbiased estimators. For instance, if the original sequence has an unbounded support, there cannot be such an estimator. If the support is an half-line, the transform must be monotonous monotonic. If the support is a bounded interval (a,b), then the transform must be bounded from below by a polynomial bound

$\epsilon\,\min\{(x-a)^m,(b-x)^n\}$

(where the extra-parameters obviously relate to the transform). (In this later case, the authors also show how to derive a Bernoulli estimator from the original unbiased estimator.)

### 5 Responses to “non-negative unbiased estimators”

1. Pierre Jacob Says:

Thanks for the review! :-)

I’d be happy to talk more about the links with your vanilla RB paper.

Just to clarify: in the half-line [0,infty) case, the function has to be increasing, not only monotonic. In particular f(x) = 1/x doesn't work... which is too bad! And for (-infty,0] the function f has to be decreasing.

Dan, I think I see what you mean. You assume V_n is a Monte Carlo estimate based on n samples, right? In which case you’re absolutely right, it smells like a dead end. However you can always assume V_n relies on r_n samples with r_n being an increasing sequence of integers. Then, certainly some choice of r_n makes the method “valid” (I suspect r_n = hypersupermultifactorial(n) would suit many choices of N!). Not necessarily practical…

In the Bernoulli factory literature, it seems that first people were interested in what’s possible and what’s not (existence), and then a considerable additional work was done in the direction of efficiency (precise analysis of the law of the required number of coin flips). I guess we’re still at the Stone Age level of non-negative unbiased estimators.

2. Dan Simpson Says:

This is a brilliant paper!

(Slight warning – late night maths follows, so this could be very wrong)

One of the slightly awkward things (on a slightly shallow reading) about the *really cool* method for unbiasing sequences is that for finite variance it is required that (slight obvious change of notation)
\sum E(X_n – X)^2/P(N=n) < infintiy

But , in the obvious case where V_n is a MC estimate of V and X_n = f(V_n) is a biased but consistent estimate of f(V), I'm not sure you can make that sum converge.

At least the obvious bound, assuming f is Lipschitz, gives E(X_n-X)^2 <= O(1/N), so no proper distribution for N will lead to a convergent series (P(N =n) must grow to stop the sum diverging).

Of course some better analysis may improve the upper bound to the point where we get something summable, but I don't see how.

• Hi Dan,

yes, you’re perfectly right that in some situations of interest, one cannot obtain with this trick an unbiased estimator with a finite variance with a finite (average) computational budget (we hesitated to start discussing this issue in the paper… and we may do so in a subsequent version). For example, we mentioned in the paper that one can “debias” MCMC estimators, but with the telescoping series trick one cannot obtain a finite variance estimator . (in some other cases e.g. multilevel MC, this trick works surprisingly well though!). Note also that one can always make the sum you mentioned finite by taking V_n an MC estimate that uses r_n>>n samples.

At the beginning, I found it very surprising that one could “debias” a MCMC simulation; now that one knows that it is possible to do with a finite (average) computational budget, somebody clever might come up with a finite-variance estimator!

Also, I just wanted to point out that this telescoping series trick is an old-chestnut (surprisingly unknown to statistician)!

3. Alan J. Izenman Says:

I think you mean “monotonic” not “monotonous”, which means boring! But, maybe you mean that the transform is boring!

• Darn! Yes, I meant monotonic. I keep doing this mistake as, in French, monotone works for both meanings (because, after all, a monotonic function offers little surprises and is therefore somehow boring indeed!)… Thanks.