Archive for the R Category

weakly informative reparameterisations

Posted in Books, pictures, R, Statistics, University life with tags , , , , , , , , , on February 14, 2018 by xi'an

Our paper, weakly informative reparameterisations of location-scale mixtures, with Kaniav Kamary and Kate Lee, got accepted by JCGS! Great news, which comes in perfect timing for Kaniav as she is currently applying for positions. The paper proposes a unidimensional mixture Bayesian modelling based on the first and second moment constraints, since these turn the remainder of the parameter space into a compact. While we had already developed an associated R package, Ultimixt, the current editorial policy of JCGS imposes the R code used to produce all results to be attached to the submission and it took us a few more weeks than it should have to produce a directly executable code, due to internal library incompatibilities. (For this entry, I was looking for a link to our special JCGS issue with my picture of Edinburgh but realised I did not have this picture.)

València summer school

Posted in Kids, pictures, R, Running, Statistics, Travel, University life, Wines with tags , , , , , , , , on January 31, 2018 by xi'an

In another continuation of the summer of Bayesian conferences in Europe, the Universidat de Valencià is organising a summer school on Bayesian statistics, from 16 July till 20 July, 2018. Which thus comes right after our summer school on computational statistics at Warwick. With a basic course on Bayesian learning (2 days). And a more advanced course on Bayesian modeling with BayesX. And a final day workshop.

Le Monde puzzle [#1037]

Posted in Books, Kids, R with tags , , , , , , , on January 24, 2018 by xi'an

lemondapariA purely geometric Le Monde mathematical puzzle this (or two independent ones, rather):

Find whether or not there are inscribed and circumscribed circles to a convex polygon with 2018 sides of lengths ranging 1,2,…,2018.

In the first (or rather second) case, the circle of radius R that is tangential to the polygon and going through all nodes (assuming such a circle exists) is such that a side L and its corresponding inner angle θ satisfy

L²=R²2(1-cos(θ))

leading to the following R code

R=3.2e5
step=1e2
anglz=sum(acos(1-(1:2018)^2/(2*R^2)))
while (abs(anglz-2*pi)>1e-4){
R=R-step+2*step*(anglz>2*pi)*(R>step)
anglz=sum(acos(1-(1:2018)^2/(2*R^2))) 
step=step/1.01}

and the result is

> R=324221
> sum(acos(1-(1:2018)^2/(2*R^2)))-2*pi
[1] 9.754153e-05

(which is very close to the solution of Le Monde when replacing sin(α) by α!). The authors of the quoted paper do not seem to consider the existence an issue.

In the second case, there is a theorem that states that if the equations

x¹+x²=L¹,…,x²⁰¹⁸+x¹=L²⁰¹⁸

have a solution on R⁺ then there exists a circle such that the polygon is tangential to this circle. Quite interestingly, if the number n of sides is even there are an infinitude of tangential polygons if any.  Now, and rather obviously, the matrix A associated with the above system of linear equations is singular with a kernel induced by the vector (1,-1,…,1,-1). Hence the collection of the sides must satisfy

L¹-L²…+L²⁰¹⁷-L²⁰¹⁸ =0

which puts a constraint on the sequence of sides, namely to divide them into two groups with equal sum, 2018×2019/4, which is not an integer. Hence, a conclusion of impossibility! [Thanks to my office neighbours François and Julien for discussing the puzzle with me.]

Le Monde puzzle [#1036]

Posted in Books, Kids, R with tags , , , , on January 4, 2018 by xi'an

lemondapariAn arithmetic Le Monde mathematical puzzle to conclude 2017:

Find (a¹,…,a¹³), a permutation of (1,…,13) such that

a¹/a²+a³=a²+a³/a³+a⁴+a⁵=b¹<1
a⁶/a⁶+a⁷=a⁶+a⁷/a⁷+a⁸+a⁹=a⁷+a⁸+a⁹/a⁵+a⁹+a¹⁰=b²<1
a¹¹+a¹²/a¹²+a¹³=a¹²+a¹³/a¹³+a¹⁰=b³<1

The question can be solved by brute force simulation, checking all possible permutations of (1,…,13). But 13! is 6.6 trillion, a wee bit too many cases. Despite the problem being made of only four constraints and hence the target function taking only five possible values, a simulated annealing algorithm returned a solution within a few calls:

(a¹,…,a¹³)=(6,1,11,3,10,8,4,9,5,12,7,2,13)
(b¹,b²,b³)=(1/2,2/3,3/5)

using the following code:

checka=function(a){ #target to maximise
 return(1*(a[1]/sum(a[2:3])==sum(a[2:3])/sum(a[3:5]))+
  1*(sum(a[6:7])/sum(a[7:9])==a[6]/sum(a[6:7]))+
  1*(sum(a[7:9])/(a[5]+sum(a[9:10]))==a[6]/sum(a[6:7]))+
  1*(sum(a[11:12])/sum(a[12:13])==sum(a[12:13])/
    (a[10]+a[13])))}
parm=sample(1:13)
cheka=checka(parm)
beta=1
for (t in 1:1e6){
  qarm=parm
  idx=sample(1:13,sample(2:12))
  qarm[idx]=sample(qarm[idx])
  chekb=checka(qarm)
  if (log(runif(1))<beta*(chekb-cheka)){
     cheka=chekb;parm=qarm}
  beta=beta*(1+log(1.00001))
  if (cheka==4) break()}

correlation for maximal coupling

Posted in Books, Kids, pictures, R, Statistics, University life with tags , , , , , , on January 3, 2018 by xi'an

An interesting (if vaguely formulated) question on X validated: given two Gaussian variates that are maximally coupled, what is the correlation between these variates? The answer depends on the parameters of both Gaussian, with a correlation of one when both Gaussians are identical. Answering the question by simulation (as I could not figure out the analytical formula on Boxing Day…) led me back to Pierre Jacob’s entry on the topic on Statisfaction, where simulating the maximal coupling stems from the decompositions

p(x)=p(x)∧q(x)+{p(x)-p(x)∧q(x)}  and  q(x)=p(x)∧q(x)+{q(x)-p(x)∧q(x)}

and incidentally to the R function image.plot (from the R library fields) for including the side legend.

cyclic riddle [not in NYC!]

Posted in Kids, R, Running, Travel with tags , , , , , , , , , , on December 29, 2017 by xi'an

In the riddle of this week on fivethirtyeight, the question is to find the average number of rounds when playing the following game: P=6 players sitting in a circle each have B=3 coins and with probability 3⁻¹ they give one coin to their right or left side neighbour, or dump the coin to the centre. The game ends when all coins are in the centre. Coding this experiment in R is rather easy

situz=rep(B,P)
r=1
while (max(situz)>0){
 unz=runif(P)
 newz=situz-(situz>0)
 for (i in (1:P)[unz<1/3]) 
  newz[i%%P+1]=newz[i%%P+1]+(situz[i]>0)
 for (i in (1:P)[unz>2/3])
  newz[i-1+P*(i==1)]=newz[i-1+P*(i==1)]+(situz[i]>0)
 situz=newz
 r=r+1}

resulting in an average of 15.58, but I cannot figure out (quickly enough) an analytic approach to the problem. (The fit above is to a Gamma(â-1,ĝ) distribution.)

In a completely unrelated aparté (aside), I read earlier this week that New York City had prohibited the use of electric bikes. I was unsure of what this meant (prohibited on sidewalks? expressways? metro carriages?) so rechecked and found that electric bikes are simply not allowed for use anywhere in New York City. Because of the potential for “reckless driving”. The target is apparently the high number of delivery e-bikes, but this ban sounds so absurd that I cannot understand it passed. Especially when De Blasio has committed the city to the Paris climate agreement after Trump moronically dumped it… Banning the cars would sound much more in tune with this commitment! (A further aparté is that I strongly dislike e-bikes, running on nuclear power plants,  especially when they pass me on sharp hills!, but they are clearly a lesser evil when compared with mopeds and cars.)

random wake

Posted in Books, Kids, R, Statistics with tags , , , , , , on December 27, 2017 by xi'an

Just too often on X validated, one sees questions displaying a complete ignorance of the basics that makes one purposelessly wonder what is the point of trying to implement advanced methods when missing the necessary background. And just as often, I reacted to the question by wondering out loud about this… In the current case, the question was about debugging an R code for a mixture of two exponential distributions and the random walk Metropolis algorithm that comes with it. Except that the Normal noise was replaced with a Uniform U(0,1) noise, leading to a most obviously transient Markov chain.I eventually corrected the R code, which returned a rather nicely label-switching output. And which did not necessarily help with the comprehension of the fundamentals.