## biased sample!

Posted in Statistics with tags , , , , , , , , , , , on May 21, 2019 by xi'an A chance occurrence led me to this thread on R-devel about R sample function generating a bias by taking the integer part of the continuous uniform generator… And then to the note by Kellie Ottoboni and Philip Stark analysing the reason, namely the fact that R uniform [0,1) pseudo-random generator is not perfectly continuously uniform but discrete, by the nature of numbers on a computer. Knuth (1997) showed that in this case the range of probabilities is larger than (1,1), the largest range being (1,1.03). As noted in the note, exploiting directly the pseudo-random bits of the pseudo-random generator. Shocking, isn’t it!  A fast and bias-free alternative suggested by Lemire is available as `dqsample::sample`

As an update of June 2019, sample is now fixed.

## Le Monde puzzle [#919]

Posted in Books, Kids, Statistics, University life with tags , , , , on July 19, 2015 by xi'an A rather straightforward Le Monde mathematical puzzle:

Find 3 digit integers x such that the integer created by collating x with (x+1) gives a 6 digit integer that is a perfect square.

Easy once you rewrite the constraint as 1000x+x+1 being a perfect square a², which means that x is equal to (a-1)(a+1)/1001, hence that 1001=7x11x13 divides either a+1 or a=1.

```sol=NULL
vals=as.vector(outer(c(7,11,13),1:999,"*"))
vals=c(vals-1,vals+1)
for (a in vals){
x=round((a-1)*(a+1)/1001)
if ((1000*x+x+1==a^2)&(x<999)&(x>99)) sol=c(sol,x)}
```

which returns four solutions:

```> unique(sol)
 183 328 528 715
```

An addendum to the puzzle is

Find 4 digit integers x such that the integer created by collating x with (x+1) gives an 8 digit integrer that is a perfect square.

Similarly easy once you rewrite the constraint as 10,000x+x+1 being a perfect square a², which means that x is equal to (a-1)(a+1)/10,001, hence that 10,001=73×137 divides either a+1 or a=1.

```sol=NULL
vals=as.vector(outer(c(73,137),(1:9999),"*"))
vals=c(vals-1,vals+1)
for (a in vals){
x=round((a-1)*(a+1)/10001)
if ((10000*x+x+1==a^2)&(x<9999)&(x>999)) sol=c(sol,x)}
```

leading to the conclusion there is a single solution:

```> unique(sol)
 6099
```

## Le Monde puzzle [#909]

Posted in Books, Kids, R with tags , , on May 1, 2015 by xi'an Another of those “drop-a-digit” Le Monde mathematical puzzle:

Find all integers n with 3 or 4 digits, no exterior zero digit, and a single interior zero digit, such that removing that zero digit produces a divider of x.

As in puzzle #904, I made use of the digin R function:

```digin=function(n){
as.numeric(strsplit(as.character(n),"")[])}
```

and simply checked all integers up to 10⁶:

```plura=divid=NULL
for (i in 101:10^6){
dive=rev(digin(i))
if ((min(dive,rev(dive))>0)&
(sum((dive[-c(1,length(dive))]==0))==1)){
dive=dive[dive>0]
dive=sum(dive*10^(0:(length(dive)-1)))
if (i==((i%/%dive)*dive)){
plura=c(plura,i)
divid=c(divid,dive)}}}
```

```> plura
1] 105 108 405 2025 6075 10125 30375 50625 70875
> plura/divid
 7 6 9 9 9 9 9 9 9
```

leading to the conclusion there is no solution beyond 70875. (Allowing for more than a single zero within the inner digits sees many more solutions.)

## Le Monde puzzle [#809]

Posted in Kids, R, Travel with tags , , , , , , , on February 22, 2013 by xi'an Another number theory puzzle, completed in the plane to Hamburg:

Integers n are called noble if they can be decomposed as a sum n=a+b+… of distinct integers such that 1/a+1/b+…=1. They are called bourgeois if they are not noble but can be decomposed as a sum n=a+b+… of integers, some of them identical, such that 1/a+1/b+…=1. If neither noble nor bourgeois, they are called villeins. Find all nobles less than 60. And characterise villeins.

Again a case where writing a brute-force R code seems harder than solving the problem on a piece of paper… However, a mix could make the best of both worlds. It is rather straightforward to see that if n is not a villein, then 2n+2 is not a villein (moving from 1/2+1/2=1 to 1/2(1/n)+1/2=1). Similarly for 3n+6 (9=3+3+3), 4n+6 (10=4+4+2), 6n+5 (2+3+6=11). Each new decomposition brings a whole series of non-villeins. Another path is to remark that products keep the property: if a and b are non-villeins, so is ab, a², b², 2(a+b), and actually all products of the form m(m+a-1) for any integer m.  Starting with the most basic solutions (1,4,9,10,11), we can then repeat the application of all those rules until no new non-villein is exposed.

Here is my R code for finding some decompositions with 4 and 5 terms (since there are only one or two decompositions with one (1), two (2+2) and three (2+3+6, 3+3+3): I first define a function checking for a decomposition with no numerical error [(sum(1/deco)==1) does not work!]:

```inva=function(deco){
numerat=0
for (i in 1:length(deco))
numerat=numerat+prod(deco[-i])
return((numerat==prod(deco)))}
```

then check for 4

```reve4=NULL
a=c(2,2,2,2)
while ((a<5)&&(sum(a)<626)){
while ((sum(1/a[-4])>=1)&&(sum(a)<626))
a[3:4]=rep(a+1,2)
a=max(a,trunc(1/(1-sum(1/a[-4])))-1)
while ((sum(1/a)>1)&&(sum(a)<626)) a=a+1
if (inva(a)) reve4=rbind(reve4,a)
if ((sum(1/a[-4])+1/a>1)&&(sum(a)<625)){
a[3:4]=rep(a+1,2)}else{
b=rep(a+1,3)
if (a+sum(b)<625){
a=c(a,b)}else{a=rep(a+1,4)
}}
}
```

and 5 term decompositions:

```reve5=NULL
a=rep(2,5)
while ((a<5)&&(sum(a)<626)){
while ((sum(1/a[-5])>=1)&&(sum(a)<626)) a[4:5]=rep(a+1,2)
a=max(a,trunc(1/(1-sum(1/a[-5])))-1)
while ((sum(1/a)>1)&&(sum(a)<626)) a=a+1
if (inva(a)) reve5=rbind(a,reve5)
if ((sum(1/a[-5])+1/a>1)&&(sum(a)<624)){
a[4:5]=rep(a+1,2)}else{
b=rep(a+1,3)
if (sum(a[1:2])+sum(b)<625){a=c(a[1:2],b)}else{
b=rep(a+1,4)
if (a+sum(b)<625){a=c(a,b)}else{
a=rep(a+1,5)
}}}
}
```

From there, we have a basis on which we can fill a 25×25 table of further non-villeins, applying the above rules:

```nob=rep(0,25^2)
nob[c(1,4,9,10,11,16,24)]=1
nob[apply(reve4,1,sum)]=1
nob[apply(reve5,1,sum)]=1
ref=1
newref=sum(nob)
while (newref>ref){
for (i in (1:25^2)[nob==1]){
if (i<26) nob[i^2]=1
if (2*i+2<625) nob[2*i+2]=1
if (2*i+9<626) nob[2*i+9]=1
if (3*i+6<626) nob[3*i+6]=1
if (4*i+6<626) nob[4*i+6]=1
if (6*i+5<626) nob[6*i+5]=1
for (m in 2:25)
if (m*(m+i-1)<626) nob[m*(m+i-1)]=1}
ref=newref;newref=sum(nob)
image(1:25,1:25,matrix(nob,25))}
```

This produces two false villeins, namely 47 and 103, since all numbers larger than 24 are non-villeins: Since both 47 and 103 are prime numbers, using other product rules would not change anything to the result. An extra R code looking at all possible decompositions of those numbers into sums would show them as false villeins. Or moving to 6 term decompositions since 47 = 3+4+8+8+12+12. Here is a small random search:

```a=c(3,rep(2,23))
N=47
while (!inva(a)){
b=sample(2:N,1)
a0=c(b,N-sum(b))
while (sum(1/a0)<1){
b=sample(2:(N-sum(a0[-length(a0)])),1)
a0=c(a0[-length(a0)],b)
a0=c(a0,N-sum(a0))}
a=a0}
```

which returns 47 =4+6+20+6+5. And 103=40+35+8+14+2+4

Interesting features of this problem are that (a) it appears fairly frequently in the mathematical games literature, like the Olympiads, see e.g. this resolution incl. the proof about 24 as the upper limit to the existence of villeins, (b) the name of those integers varies from case to case, like good integers, and (c) the denomination of noble integers is used in mathematics for “irrational numbers having a continued fraction representation that becomes an infinite sequence of 1s at some point“. (There was also a very similar question on stack exchange a few weeks ago but I cannot trace it…)