A discrete Bernoulli factory

Posted in Books, Kids, Statistics with tags , , , , , , on October 18, 2021 by xi'an

A rather confusing (and now closed) question on X validated contained an interesting challenge of simulating an arbitrary discrete distribution using a single (standard) dice. It indeed made me think of the (more challenging) Bernoulli factory problem of simulating B(f(p)) using a B(p) simulator (with p unknown). I still do not see what the optimal solution is but the core challenge is to avoid simulating U(0,1) variate by exploiting the discrete nature of the target. Which may be an issue if the probabilities of the target are irrational and one is considering the cdf inversion approach. An alternative is to use an accept-reject approach, which also works for discrete distributions, by first deriving an instrumental distribution on the discrete support of the target from dice rolls, second finding the maximum of the ratio instrument to target, and third devising a discrete approach to selecting a generation with a probability taking a finite number of values. Which may prove quite costly. Finally, the least debatable approach is to turn the dice into a Uniform generator by using each draw as a digit in the base 5 representation of this Uniform variate, up to the precision desired for the resolution, and then apply the most efficient algorithm for the target distribution.

Buffon machines

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , on December 22, 2020 by xi'an

By chance I came across a 2010 paper entitled On Buffon Machines and Numbers by Philippe Flajolet, Maryse Pelletier and Michèle Soria. Which relates to Bernoulli factories, a related riddle, and the recent paper by Luis Mendo I reviewed here. What the authors call a Buffon machine is a device that produces a perfect simulation of a discrete random variable out of a uniform bit generator. Just like (George Louis Leclerc, comte de) Buffon’s needle produces a Bernoulli outcome with success probability π/4. out of a real Uniform over (0,1). Turned into a sequence of Uniform random bits.

“Machines that always halt can only produce Bernoulli distributions whose parameter is a dyadic rational.”

When I first read this sentence it seemed to clash with the earlier riddle, until I realised the later started from a B(p) coin to produce a fair coin, while this paper starts with a fair coin.

The paper hence aims at a version of the Bernoulli factory problem (see Definition 2), although the term is only mentioned at the very end, with the added requirement of simplicity and conciseness translated as a finite expected number of draws and possibly an exponential tail.

It first recalls the (Forsythe–)von Neumann method of generating exponential (and other) variates out of a Uniform generator (see Section IV.2 in Devroye’s generation bible). Expanded into a general algorithm for generating discrete random variables whose pmf’s are related to permutation cardinals,

$\mathbb P(N=n)\propto P_n\lambda^n/n!$

if the Bernoulli generator has probability λ. These include the Poisson and the logarithmic distributions and as a side product Bernoulli distributions with some logarithmic, exponential and trigonometric transforms of λ.

As a side remark, a Bernoulli generator with probability 1/π is derived from Ramanujan identity

$\frac{1}{\pi} = \sum_{n=0}^\infty {2n \choose n}^3 \frac{6n+1}{2^{8n+2}}$

as “a discrete analogue of Buffon’s original. In a neat connection with Mendo’s paper, the authors of this 2010 paper note that Euler’s constant does not appear to be achievable by a Buffon machine.

Simulating a coin with irrational bias using rational arithmetic

Posted in Books, Statistics with tags , , , , , , on December 17, 2020 by xi'an

An arXived paper by Luis Mendo adresses the issue of simulating coins with irrational probabilities from a fair coin, somehow connected with one of the latest riddles. Where I realised only irrational coins could be simulated in a fixed and finite number of throws! The setting of the paper is however similar to the one of a Bernoulli factory in that an unlimited number of coins can be generated (but it relies on a  fair coin). And the starting point is a series representation of the irrational ζ as a sum of positive and rational terms. As well as an earlier paper by the Warwick team of Łatuszyński et al. (2011). The solution is somewhat anticlimactic in that the successive draws of the fair coin lead to a sequence of intervals with length divided by 2 at each step. And stopping when a certain condition is met, requiring some knowledge on the tail error of the series. The paper shows further that the number of inputs used by its algorithm has an exponential tail. The examples provided therein are Euler’s constant

$\gamma =\frac{1}{2} + \sum_{i=1}^\infty \frac{B(i)}{2i(2i+1)(2i+2)}$

where B(j) is the number of binary digits of j, and π/4 which can be written as an alternating series. An idle question that came to me while reading this paper is the influence of the series chosen to represent the irrational ζ as it seems that a faster decrease in the series should lead to fewer terms being used. However, the number of iterations is a geometric random variable with parameter 1/2, therefore the choice of the series curiously does not matter.

Bernoulli factory in the Riddler

Posted in Books, Kids, R, Statistics with tags , , , , , , , , , , on December 1, 2020 by xi'an

“Mathematician John von Neumann is credited with figuring out how to take a p biased coin and “simulate” a fair coin. Simply flip the coin twice. If it comes up heads both times or tails both times, then flip it twice again. Eventually, you’ll get two different flips — either a heads and then a tails, or a tails and then a heads, with each of these two cases equally likely. Once you get two different flips, you can call the second of those flips the outcome of your “simulation.” For any value of p between zero and one, this procedure will always return heads half the time and tails half the time. This is pretty remarkable! But there’s a downside to von Neumann’s approach — you don’t know how long the simulation will last.” The Riddler

The associated riddle (first one of the post-T era!) is to figure out what are the values of p for which an algorithm can be derived for simulating a fair coin in at most three flips. In one single flip, p=½ sounds like the unique solution. For two flips, p²,(1-p)^2,2p(1-p)=½ work, but so do p+(1-p)p,(1-p)+p(1-p)=½, and the number of cases grows for three flips at most. However, since we can have 2³=8 different sequences, there are 2⁸ ways to aggregate these events and thus at most 2⁸ resulting probabilities (including 0 and 1). Running a quick R code and checking for proximity to ½ of any of these sums leads to

[1] 0.2062997 0.7937005 #p^3
[1] 0.2113249 0.7886753 #p^3+(1-p)^3
[1] 0.2281555 0.7718448 #p^3+p(1-p)^2
[1] 0.2372862 0.7627143 #p^3+(1-p)^3+p(1-p)^2
[1] 0.2653019 0.7346988 #p^3+2p(1-p)^2
[1] 0.2928933 0.7071078 #p^2
[1] 0.3154489 0.6845518 #p^3+2p^2(1-p)
[1] 0.352201  0.6477993 #p^3+p(1-p)^2+p^2(1-p)
[1] 0.4030316 0.5969686 #p^3+p(1-p)^2+3(1-p)p^2
[1] 0.5


which correspond to

1-p³=½, p³+(1-p)³=½,(1-p)³+(1-p)p²=½,p³+(1-p)³+p²(1-p),(1-p)³+2(1-p)p²=½,1-p²=½, p³+(1-p)³+p²(1-p)=½,(1-p)³+p(1-p)²+p²(1-p)=½,(1-p)³+p²(1-p)+3p(1-p)²=½,p³+p(1-p)²+3(p²(1-p)=½,p³+2p(1-p)²+3(1-p)p²=½,p=½,

(plus the symmetric ones), leading to 19 different values of p producing a “fair coin”. Missing any other combination?!

Another way to look at the problem is to find all roots of the $2^{2^n}$ equations

$a_0p^n+a_1p^{n-1}(1-p)+\cdots+a_{n-1}p(1-p)^{n-1}+a_n(1-p)^n=1/2$

where

$0\le a_i\le{n \choose i}$

(None of these solutions is rational, by the way, except p=½.) I also tried this route with a slightly longer R code, calling polyroot, and finding the same 19 roots for three flips, [at least] 271 for four, and [at least] 8641 for five (The Riddler says 8635!). With an imprecision in the exact number of roots due to rather poor numerical rounding by polyroot. (Since the coefficients of the above are not directly providing those of the polynomial, I went through an alternate representation as a polynomial in (1-p)/p, with a straightforward derivation of the coefficients.)

not a Bernoulli factory

Posted in Books, Kids, pictures, R with tags , , , , , , , on May 20, 2020 by xi'an

A Riddler riddle I possibly misunderstood:

Four isolated persons are given four fair coins, which can be either flipped once or returned without being flipped. If all flipped coins come up heads, the team wins! Else, if any comes up tails, or if no flip at all is done, it looses. Each person is further given an independent U(0,1) realisation. What is the best strategy?

Since the players are separated, I would presume the same procedure is used by all. Meaning that a coin is tossed with probability p, ie if the uniform is less than p, and untouched otherwise. The probability of winning is then

4(1-p)³p½+6(1-p)³p½²+4(1-p)p³½³+p⁴½⁴

which is maximum for p=0.3420391, with a winning probability of 0.2848424.

solve $$x⌊x⌊x⌊x⌋⌋⌋=2020$$

Where the integral part is the integer immediately below x. Puzzle that I first fail solving by brute force, because I did not look at negative x’s… Since the fourth root of 2020 is between 6 and 7, the solution is either x=6+ε or x=-7+ε, with ε in (0,1). The puzzle then becomes either

$$(6+ε)⌊(6+ε)⌊(6+ε)⌊6+ε⌋⌋⌋ = (6+ε)⌊(6+ε)⌊36+6ε⌋⌋ = (6+ε)⌊(6+ε)(36+⌊6ε⌋)⌋ = 2020$$

where there are 6 possible integer values for $$⌊6ε⌋, with only ⌊6ε⌋=5 being possible, turning the equation into$$

$$(6+ε)⌊41(6+ε)⌋ = (6+ε)(246+⌊41ε⌋) = 2020$$

where again only $$⌊42ε⌋$$=40 being possible, ending up with

$$1716+286ε = 2020$$

which has no solution in (0,1). In the second case

(-7+$$ε$$)$$⌊(-7+ε)⌊(-7+ε)⌊-7+ε⌋⌋⌋$$ = (-7+$$ε$$)$$⌊(-7+ε)(49+⌊-7ε⌋)⌋ = 2020$$

shows that only $$⌊-7ε⌋$$=-3 is possible, leading to

(-7+$$ε$$)$$⌊46(-7+ε))⌋$$ = (-7+$$ε$$) ($$-322+⌊46ε⌋)=2020$$

with only $$⌊46ε⌋$$=17 possible, hence

2135-305$$ε$$=2020

and

$$ε$$=115/305.

A brute force simulated annealing resolution returns x=-6.622706 after 10⁸ iterations. A more interesting question is to figure out the discontinuity points of the function

$$ℵ(x) = x$$$$⌊x⌊x⌊x⌋⌋⌋$$

as they seem to be numerous:

For instance, only 854 of the first 2020 integers enjoy a solution to $$ℵ(x)$$=n.