Archive for the Books Category

an integer programming riddle

Posted in Books, Kids, R on April 21, 2016 by xi'an

A puzzle on The Riddler this week ends up as a standard integer programming problem. Stripping away the little story around the question, it boils down to maximising

200a+100b+50c+25d

under the constraints

400a+400b+150c+50d≤1000, b≤a, a≤1, c≤8, d≤4,

with (a,b,c,d) all non-negative integers. My first attempt was a brute-force R search, since there are only 3 x 9 x 5 = 135 cases:

f.obj<-c(200,100,50,25)        # objective: 200a+100b+50c+25d
f.con<-matrix(c(40,40,15,5,    # budget constraint (divided by 10)
  -1,1,0,0,                    # b <= a
   1,0,0,0,                    # a <= 1
   0,0,1,0,                    # c <= 8
   0,0,0,1),ncol=4,byrow=TRUE) # d <= 4
f.dir<-rep("<=",5)             # all five constraints are inequalities
f.rhs<-c(100,0,1,8,4)

sol=0
for (a in 0:1)
for (b in 0:a)
for (k in 0:8)
for (d in 0:4){
  cost=f.con%*%c(a,b,k,d)-f.rhs     # check all five constraints at once
  if (max(cost)<=0){                # feasible point
    gain=f.obj%*%c(a,b,k,d)
    if (gain>sol){
      sol=gain
      argu=c(a,b,k,d)}}}

which returns the value:

> sol
     [,1]
[1,]  425
> argu
[1] 1 0 3 3

This is confirmed by a call to an integer programming solver like lpSolve:

> lp("max",f.obj,f.con,f.dir,f.rhs,all.int=TRUE)
Success: the objective function is 425
> lp("max",f.obj,f.con,f.dir,f.rhs,all.int=TRUE)$sol
[1] 1 0 3 3

which provides the same solution.
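For completeness, here is a minimal self-contained version of that check, assuming the lpSolve package is installed and the objects defined above are in the workspace:

library(lpSolve)
out <- lp("max", f.obj, f.con, f.dir, f.rhs, all.int=TRUE)
out$objval    # 425
out$solution  # 1 0 3 3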

measuring honesty, with p=.006…

Posted in Books, Kids, pictures, Statistics on April 19, 2016 by xi'an


Simon Gächter and Jonathan Schulz recently published a paper in Nature attempting to link intrinsic (individual) honesty with a measure of corruption in the subjects' home countries, based on more than 2,500 subjects in 23 countries. [I am now reading Nature on a regular basis, thanks to our lab's coffee-room subscription!] Now I may sound naïvely surprised at the methodological contents of the paper and at its publication in Nature, but I never read psychology papers, only Andrew's rants at 'em!!!

“The results are consistent with theories of the cultural co-evolution of institutions and values, and show that weak institutions and cultural legacies that generate rule violations not only have direct adverse economic consequences, but might also impair individual intrinsic honesty that is crucial for the smooth functioning of society.”

The experiment behind this article and its rather deep claims is however quite modest: the authors asked people to throw a die twice without monitoring and rewarded them according to the reported result of the first throw. Being dishonest here means reporting a false result towards a larger monetary gain. This sounds rather artificial and difficult to relate to dishonest behaviours in realistic situations, as I do not see much appeal in cheating for 50 cents or so, especially since the experiment accounted for differences in wealth background by adapting the stakes to the hourly wage in each country ("from $0.7 in Vietnam to $4.2 in the Netherlands"). Furthermore, the subjects of this experiment were undergraduate students in economics departments: depending on the country, this may create a huge bias in terms of social background, as I do not think access to universities is the same in Germany and in Guatemala, say.

“Our expanded scope of societies therefore provides important support and qualifications for the generalizability of these theories—people benchmark their justifiable dishonesty with the extent of dishonesty they see in their societal environment.”

The statistical analysis behind this "important support" is not earth-shattering either. The main argument is based on the empirical cdfs of the gain distributions per country (in the above graph), with tests that the overall empirical cdf for low-corruption countries is higher than the corresponding one for high-corruption countries. The comparison of the cumulated or pooled cdf across countries from each group is disputable, in that there is no reason the different countries have the same "honesty" cdf. The groups themselves are built on a rough measure of "prevalence of rule violations". It is also rather surprising that for both groups the percentage of zero-gain answers is "significantly" larger than the expected value of 2.8% under the assumption of "justified dishonesty". In any case, there is no compelling argument as to why students not reporting the value of the first die would naturally opt for the maximum of the two. Hence a certain bemusement at this paper appearing in Nature and even deserving an introductory coverage in the first part of the journal…
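As a quick sanity check on that 2.8% figure, here is a small R computation (my own illustration, under the assumption that the zero-payoff face is reported only when both throws show it, i.e., that subjects report whichever throw pays more):

# benchmark under "justified dishonesty": the zero-payoff face (taken to be
# the six here, as an assumption) is reported only when both throws show it
1/36                                        # exact value, about 0.028

set.seed(1)                                 # quick Monte Carlo check
throws <- matrix(sample(1:6, 2e6, replace=TRUE), ncol=2)
mean(throws[,1]==6 & throws[,2]==6)         # close to 2.8%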

a day in Roma

Posted in Books, pictures, Running, Statistics, Travel, University life, Wines on April 18, 2016 by xi'an

Last Friday I spent about 24 hours in Roma, as Clara Grazian was defending her thesis there, which was awarded the highest PhD degree from both Sapienza Università di Roma and Université Paris-Dauphine. Her thesis was composed of her papers on ABC for integrated likelihood, on Jeffreys priors for mixtures (which sadly was rejected a few weeks ago), and on scoring rules à la Dawid for model choice. Clara was the first student to graduate from the joint graduate program between Sapienza and Paris-Dauphine, and I look forward to the graduation of the next students! It was absolutely wonderful to be there, not only to attend the defence with Marilena Barbieri, Fabrizio Leisen, and Brunero Liseo (who was also Clara's supervisor) and to congratulate Clara on the completion of her thesis, but also to meet [albeit much too briefly] with old friends, to enjoy great Roman food, perfect weather, my usual long run along the Tiber and twelve of its bridges in the glorious Roman morning, and "just" this unique feeling of Roma in Spring…

flotsam [art brut]

Posted in Books, pictures, Travel on April 17, 2016 by xi'an

duck, duck, …magnetic duck

Posted in Books, Kids, pictures, Travel on April 17, 2016 by xi'an

“Mallards in tight formation tend to face either north or south when landing. Because vision alone cannot prevent collision between these high-speed flyers, the ducks use sensors in their eyes, beaks and ears to align themselves to Earth’s magnetic field.” Nature, vol. 531, 31 March 2016

slice sampling revisited

Posted in Books, pictures, Statistics on April 15, 2016 by xi'an

[Figure 1, (c) Neal, 2003]

Thanks to an X validated question, I re-read Radford Neal’s 2003 Slice sampling paper. Which is an Annals of Statistics discussion paper, and rightly so. While I was involved in the editorial processing of this massive paper (!), I had only vague memories left about it. Slice sampling has this appealing feature of being the equivalent of random walk Metropolis-Hastings for Gibbs sampling, without the drawback of setting a scale for the moves.

“These slice sampling methods can adaptively change the scale of changes made, which makes them easier to tune than Metropolis methods and also avoids problems that arise when the appropriate scale of changes varies over the distribution  (…) Slice sampling methods that improve sampling by suppressing random walks can also be constructed.” (p.706)

One major theme in the paper is fighting random walk behaviour, a fight of which Radford is a strong proponent. Even now, I am a bit surprised by this emphasis, as component-wise slice sampling exhibits clear features of a random walk, exploring the subgraph of the target by random vertical and horizontal moves, hence facing the potential drawback of backtracking to previously visited places.

“A Markov chain consisting solely of overrelaxed updates might not be ergodic.” (p.729)

Overrelaxation is presented as a means to avoid random walk behaviour by removing rejections. The proposal is actually deterministic, projecting the current value to the “other side” of the approximate slice; if it stays within the slice it is accepted. This “reflection principle” [in that it takes the point symmetric with respect to the centre of the slice] is also connected with antithetic sampling, in that it induces a rather negative correlation between successive simulations. The last methodological section covers reflective slice sampling, which appears as a slice version of Hamiltonian Monte Carlo (HMC). Given the difficulty in implementing exact HMC (reflected in the later literature), it is no wonder that Radford proposes an approximation scheme that is valid if somewhat involved.
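As an illustration of the reflection step, here is a minimal R sketch of a single overrelaxed update, under the (strong) assumption that an approximate slice interval is available; approx.endpoints is a hypothetical helper, not part of Neal's code, and his paper refines such endpoints by bisection:

# a minimal sketch of one overrelaxed update for a univariate target density f;
# approx.endpoints(u, x0) is a hypothetical helper returning an approximate
# slice interval c(L, R) around x0
overrelax_update <- function(x0, f, approx.endpoints){
  u <- runif(1, 0, f(x0))           # auxiliary height defining the slice {x: f(x) > u}
  ends <- approx.endpoints(u, x0)   # approximate endpoints of the slice
  x1 <- sum(ends) - x0              # reflect x0 about the centre of the interval
  if (f(x1) > u) x1 else x0         # keep the move only if it lands within the slice
}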

“We can show invariance of this distribution by showing (…) detailed balance, which for a uniform distribution reduces to showing that the probability density for x¹ to be selected as the next state, given that the current state is x⁰, is the same as the probability density for x⁰ to be the next state, given that x¹ is the current state, for any states x⁰ and x¹ within [the slice] S.” (p.718)

In direct connection with the X validated question there is a whole section of the paper on implementing single-variable slice sampling that I had completely forgotten, with a collection of practical implementations when the slice

S = {x : u < f(x)}

cannot be computed in an exact manner. One such implementation is the “stepping out” procedure, where the resulting set (interval) in which the uniform simulation of x takes place may well miss some connected component(s) of the slice. This quote may sound like a strange argument, in that the move may well leave a part of the slice out and still satisfy the condition. Not really, since it states that it must hold for any pair of states within S… The very positive side of this section is to allow for slice sampling in cases where the inversion of u < f(x) is intractable, hence with a strong practical implication. The multivariate extension of the approximation procedure is (potentially) more fraught with danger, in that it may fall victim to the curse of dimensionality: the box for the uniform simulation of x may be much too large when compared with the true slice (or slice of the slice). I had more of a memory of the “trail of crumbs” idea, mostly because of the name, I am afraid!, which links with delayed rejection, as indicated in the paper, but seems awfully delicate to calibrate.
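For the record, here is a rough R sketch of such a single-variable slice sampler with “stepping out” and shrinkage, in the spirit of Neal's procedures (a simplified rendering, not his exact pseudo-code):

# one slice sampling update for a univariate log-density logf, using
# "stepping out" to bracket the slice and shrinkage to sample within it
slice_update <- function(x0, logf, w=1, m=100){
  logu <- logf(x0) - rexp(1)              # log of the auxiliary height u ~ U(0, f(x0))
  # stepping out: expand an interval of width w until both ends leave the slice
  L <- x0 - runif(1)*w; R <- L + w
  J <- floor(runif(1)*m); K <- m - 1 - J  # random split of the maximal number of steps
  while (J > 0 && logf(L) > logu){ L <- L - w; J <- J - 1 }
  while (K > 0 && logf(R) > logu){ R <- R + w; K <- K - 1 }
  # shrinkage: sample uniformly in (L,R), shrinking towards x0 after each rejection
  repeat{
    x1 <- runif(1, L, R)
    if (logf(x1) > logu) return(x1)
    if (x1 < x0) L <- x1 else R <- x1
  }
}

# toy usage: a few updates targeting a standard normal
x <- 0
for (t in 1:5) x <- slice_update(x, function(x) dnorm(x, log=TRUE))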

exact, unbiased, what else?!

Posted in Books, Statistics, University life on April 13, 2016 by xi'an

Last week, Matias Quiroz, Mattias Villani, and Robert Kohn arXived a paper on exact subsampling MCMC that contributes to the current literature on approximating MCMC samplers for large datasets, in connection with an earlier paper of Quiroz et al. discussed here last week.

[screen capture from Quiroz et al.]

The “exact” in the title is to be understood in the Russian roulette sense. By using Rhee and Glynn’s debiasing device, the authors achieve an unbiased estimator of the likelihood as in Bardenet et al. (2015). The central tool for the derivation of an unbiased and positive estimator is to find a control variate for each component of the log likelihood that is good enough for the difference between the component and the control to be lower bounded, by the constant a in the screen capture above. In that expression, the individual terms d in the product are iid unbiased estimates of the log likelihood difference, and q is the sum of the control variates, or maybe more accurately of the cheap substitutes to the exact log likelihood components, thus still of complexity O(n), which makes the application to tall data more difficult to contemplate.
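To fix ideas, here is a toy R sketch of the control-variate (difference) estimator idea, writing the full log likelihood as the sum of the cheap proxies plus a subsampled correction; loglik_i and proxy_i are hypothetical per-observation functions, so this is my own illustration rather than the authors' construction:

# toy difference estimator: sum_i l_i(theta) = q(theta) + sum_i {l_i(theta) - q_i(theta)},
# with the second sum estimated unbiasedly from a subsample of size m
loglik_hat <- function(theta, data, loglik_i, proxy_i, m){
  n <- nrow(data)
  q <- sum(proxy_i(theta, data))          # control variates, still summed over all n terms
  idx <- sample.int(n, m, replace=TRUE)   # uniform subsample with replacement
  d <- loglik_i(theta, data[idx, , drop=FALSE]) - proxy_i(theta, data[idx, , drop=FALSE])
  q + n * mean(d)                         # unbiased for the full log likelihood
}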

The $64 question is obviously how to produce cheap and efficient control variates that kill the curse of the tall data. (It still irks me to resort to this term of control variate, really!) Section 3.2 in the paper suggests clustering the data and building an approximation for each cluster, which seems to imply manipulating the whole dataset at this early stage, at a cost of O(Knd). Furthermore, because finding a correct lower bound a is close to impossible in practice, the authors use a “soft lower bound”, meaning that it is only an approximation and thus that (3.4) above can get negative from time to time, which cancels the validation of the method as a pseudo-marginal approach. The resolution of this difficulty is to resort to the same proxy as in the Russian roulette paper, replacing the unbiased estimator with its absolute value, an answer I already discussed for the Russian roulette paper. An additional step is proposed by Quiroz et al., namely correlating the random numbers between numerator and denominator in their final importance sampling estimator, via a Gaussian copula as in Deligiannidis et al.

This paper made me wonder (idly wonder, mind!) anew how to get rid of the vexing unbiasedness requirement. From a statistical and especially from a Bayesian perspective, unbiasedness is a second-order property that cannot be achieved for most transforms of the parameter θ, and that is not preserved under reparameterisation. It is thus vexing and perplexing that unbiasedness is so central to the validation of our Monte Carlo techniques and that any divergence from this canon leaves us wandering blindly with no guarantee of ever reaching the target of the simulation experiment…
