Archive for the R Category

call for sessions and labs at Bay2sC0mp²⁰

Posted in pictures, R, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , , on February 22, 2019 by xi'an

A call to all potential participants to the incoming BayesComp 2020 conference at the University of Florida in Gainesville, Florida, 7-10 January 2020, to submit proposals [to me] for contributed sessions on everything computational or training labs [to David Rossell] on a specific language or software. The deadline is April 1 and the sessions will be selected by the scientific committee, other proposals being offered the possibility to present the associated research during a poster session [which always is a lively component of the conference]. (Conversely, we reserve the possibility of a “last call” session made from particularly exciting posters on new topics.) Plenary speakers for this conference are

and the first invited sessions are already posted on the webpage of the conference. We dearly hope to attract a wide area of research interests into a as diverse as possible program, so please accept this invitation!!!

simulation fodder for future exams

Posted in Books, Kids, R, Statistics with tags , , , , on February 20, 2019 by xi'an

Here are two nice exercises for a future simulation exam, seen and solved on X validated.The first one is about simulating a Gibbs sampler associated with the joint target

exp{-|x|-|y|-a|y-x|}

defined over IR² for a≥0 (or possibly a>-1). The conditionals are identical and non-standard, but a simple bound on the conditional density is the corresponding standard double exponential density, which makes for a straightforward accept-reject implementation. However it is also feasible to break the full conditional into three parts, depending on the respective positions of x, y, and 0, and to obtain easily invertible cdfs on the three intervals.The second exercise is about simulating from the cdf

F(x)=1-\exp\{-ax-bx^{p+1}/(p+1)\}

which can be numerically inverted. It is however more fun to call for an accept-reject algorithm by bounding the density with a ½ ½ mixture of an Exponential Exp(a) and of the 1/(p+1)-th power of an Exponential Exp(b/(p+1)). Since no extra constant appears in the solution,  I suspect the (p+1) in b/(p+1) was introduced on purpose. As seen in the above fit for 10⁶ simulations (and a=1,b=2,p=3), there is no deviation from the target! There is nonetheless an even simpler and rather elegant resolution to the exercise: since the tail function (1-F(x)) appears as the product of two tail functions, exp(-ax) and the other one, the cdf is the distribution of the minimum of two random variates, one with the Exp(a) distribution and the other one being the 1/(p+1)-th power of an Exponential Exp(b/(p+1)) distribution. Which of course returns a very similar histogram fit:

Le Monde puzzle [#1085]

Posted in Books, Kids, R with tags , , , , , on February 18, 2019 by xi'an

A new Le Monde mathematical puzzle in the digit category:

Given 13 arbitrary relative integers chosen by Bo, Abigail can select any subset of them to be drifted by plus or minus one by Bo, repeatedly until Abigail reaches the largest possible number N of multiples of 5. What is the minimal possible value of N under the assumption that Bo tries to minimise it?

I got stuck on that one, as building a recursive functiion led me nowhere: the potential for infinite loop (add one, subtract one, add one, …) rather than memory issues forced me into a finite horizon for the R function, which then did not return anything substantial in a manageable time. Over the week and the swimming sessions, I thought of simplifying the steps, like (a) work modulo 5, (b) bias moves towards 1 or 4, away from 2 and 3, by keeping only one entry in 2 and 3, and all but one at 1 and 4, but could only produce five 0’s upon a sequence of attempts… With the intuition that only 3 entries should remain in the end, which was comforted by Le Monde solution the week after.

Le Monde puzzle [#1083]

Posted in Books, Kids, R, Travel with tags , , , , , , on February 7, 2019 by xi'an

A Le Monde mathematical puzzle that seems hard to solve without the backup of a computer (and just simple enough to code on a flight to Montpellier):

Given the number N=2,019, find a decomposition of N as a sum of non-trivial powers of integers such that (a) the number of integers in the sum is maximal or (b) all powers are equal to 4.  Is it possible to write N as a sum of two powers?

It is straightforward to identify all possible terms in these sums by listing all powers of integers less than N

pool=(1:trunc(sqrt(2019)))^2
for (pow in 3:11)
  pool=unique(c(pool,(2:trunc(2019^(1/pow)))^pow))

which leads to 57 distinct powers. Sampling at random from this collection at random produces a sum of 21 perfect powers:

 1+4+8+9+16+25+27+32+36+49+64+81+100+121+125+128+144+169+196+243+441

But looking at the 22 smallest numbers in the pool of powers leads to 2019, which is a sure answer. Restricting the terms to powers of 4 leads to the sequence

1⁴+2⁴+3⁴+5⁴+6⁴ = 2019

And starting from the pools of all possible powers in a decomposition of 2019 as the sum of two powers shows this is impossible.

missing digit in a 114 digit number [a Riddler’s riddle]

Posted in R, Running, Statistics with tags , , , , , , , on January 31, 2019 by xi'an

A puzzling riddle from The Riddler (as Le Monde had a painful geometry riddle this week): this number with 114 digits

530,131,801,762,787,739,802,889,792,754,109,70?,139,358,547,710,066,257,652,050,346,294,484,433,323,974,747,960,297,803,292,989,236,183,040,000,000,000

is missing one digit and is a product of some of the integers between 2 and 99. By comparison, 76! and 77! have 112 and 114 digits, respectively. While 99! has 156 digits. Using WolframAlpha on-line prime factor decomposition code, I found that only 6 is a possible solution, as any other integer between 0 and 9 included a large prime number in its prime decomposition:

However, I thought anew about it when swimming the next early morning [my current substitute to morning runs] and reasoned that it was not necessary to call a formal calculator as it is reasonably easy to check that this humongous number has to be divisible by 9=3×3 (for else there are not enough terms left to reach 114 digits, checked by lfactorial()… More precisely, 3³³x33! has 53 digits and 99!/3³³x33! 104 digits, less than 114), which means the sum of all digits is divisible by 9, which leads to 6 as the unique solution.

 

more concentration, everywhere

Posted in R, Statistics with tags , , , , , , , , , , on January 25, 2019 by xi'an

Although it may sound like an excessive notion of optimality, one can hope at obtaining an estimator δ of a unidimensional parameter θ that is always closer to θ that any other parameter. In distribution if not almost surely, meaning the cdf of (δ-θ) is steeper than for other estimators enjoying the same cdf at zero (for instance ½ to make them all median-unbiased). When I saw this question on X validated, I thought of the Cauchy location example, where there is no uniformly optimal estimator, albeit a large collection of unbiased ones. But a simulation experiment shows that the MLE does better than the competition. At least than three (above) four of them (since I tried the Pitman estimator via Christian Henning’s smoothmest R package). The differences to the MLE empirical cd make it clearer below (with tomato for a score correction, gold for the Pitman estimator, sienna for the 38% trimmed mean, and blue for the median):I wonder at a general theory along these lines. There is a vague similarity with Pitman nearness or closeness but without the paradoxes induced by this criterion. More in the spirit of stochastic dominance, which may be achievable for location invariant and mean unbiased estimators…

Le Monde puzzle [#1081]

Posted in Books, Kids, R, Travel with tags , , , , on January 24, 2019 by xi'an

A “he said-she said” Le Monde mathematical puzzle (again in the spirit of the famous Singapore high-school birthdate problem):

Abigail and Corentin are both given a positive integer, a and b, such that a+b is either 19 or 20. They are asked one after the other and repeatedly if they are sure of the other’s number. What is the maximum number of times they are questioned?

If Abigail is given a 19, b=1 necessarily. Hence if Abigail does not reply, a<19. This implies that, if Corentin is given b=1 or b=19, he can reply a+b=19 or a+b=20, necessarily. Else, 1<b<19 implies that, if a=1 or a=18, b=18 or b=2. And so on…which leads to a maximum of 20 questions, 10 for Abigail and 10 for Corentin. Here is my R implementation

az=bz=cbind(20-(1:19),19-(1:19))
qwz=0;at=TRUE;bt=FALSE
while ((max(az)>0)&(max(bz)>0)){
 if (at){ 
  for (i in 1:19){ 
   if (sum(az[i,]>0)==2){
   for (j in az[i,az[i,]>0]){ 
     if (sum(bz[j,]==0)==2) az[i,]=rep(0,2)}}
   if (sum(az[i,]>0)<2){ 
    az[i,]=rep(0,2)}}} 
  if (bt){ 
   for (i in 1:19){ 
    if (sum(bz[i,bz[i,]>0]>0)==2){
     for (j in bz[i,bz[i,]>0]){ 
      if (sum(az[j,]==0)==2) bz[i,]=rep(0,2)}}
     if (sum(bz[i,]>0)<2){ bz[i,]=rep(0,2)}}}
  bt=!bt;at=!at;qwz=qwz+1}