Archive for optimisation

Le Monde puzzle [#1006]

Posted in Kids, R on May 3, 2017 by xi'an

Once the pseudo-story [noise] is removed, a linear programming Le Monde mathematical puzzle:

For the integer linear programming problem

max 2x₁+2x₂+x₃+…+x₁₀

under the constraints

x₁>x₂+x₃, x₂>x₃+x₄, …, x₉>x₁₀+x₁, x₁₀>x₁+x₂

find a solution with the maximal number of positive entries.

Expressed this way, it becomes quite straightforward to solve with the help of an R linear programming package like lpSolve. Indeed, a simple iteration of the constraints shows that positive entries are necessarily bracketed by negative entries, since, e.g.,

x₂ < −88x₁/55, x₁₀ < −33x₁/55

(which applies to all consecutive triplets since the constraints are invariant under circular permutation of the indices). Hence there are at most five positive entries, but calling lpSolve with this option

> lp(direction="max", c(2,2,rep(1,8)), ...)
Error: no feasible solution found

shows this is not possible. (The added vector is my way of getting around the constraint that lpSolve only considers non-negative entries. I therefore shifted the negative entries by 20, meaning they are assumed to be larger than -20. Using the larger bound 50 does not change the outcome.) From there, there are 10 possible configurations of vectors with four positive entries and a simple loop returns

> masume
[1] -90
> topast
 [1] -11 1 -13 1 -15 1 -17 1 -19 -9

as the maximal value of the criterion and the corresponding argument, with four positive entries.
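These claims are easy to double-check numerically. The sketch below (in Python with scipy.optimize.linprog rather than lpSolve, for self-containedness) solves the continuous relaxation, with the lower bound of -20 playing the role of the shift described above and the strict integer inequalities encoded as a gap of one, and then verifies the reported solution directly:

```python
import numpy as np
from scipy.optimize import linprog

# cyclic constraints x_i > x_{i+1} + x_{i+2}: over the integers this is
# x_i >= x_{i+1} + x_{i+2} + 1, i.e. -x_i + x_{i+1} + x_{i+2} <= -1,
# with indices taken modulo 10
A = np.zeros((10, 10))
for i in range(10):
    A[i, i] = -1.0
    A[i, (i + 1) % 10] = 1.0
    A[i, (i + 2) % 10] = 1.0
b = -np.ones(10)

# criterion 2x1 + 2x2 + x3 + ... + x10, maximised (linprog minimises)
obj = np.array([2.0, 2.0] + [1.0] * 8)

# continuous relaxation, entries bounded below by -20 as with the lpSolve shift
res = linprog(-obj, A_ub=A, b_ub=b, bounds=[(-20, None)] * 10)

# the solution returned by the R loop, with four positive entries
x = np.array([-11, 1, -13, 1, -15, 1, -17, 1, -19, -9])
feasible = bool(np.all(A @ x <= b))
value = int(obj @ x)
print(-res.fun, feasible, value)  # relaxation bound, True, -90
```

Since the relaxation can only do better than the integer programme, its value upper-bounds the -90 found above.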

As an aside, the previous Le Monde puzzle [#1005] also reduced to a linear system: given 64 cubes, each of the 384 faces being painted in one of four colours, with exactly 40 of these cubes exhibiting the four colours, the question was about the number of cubes that could be bi-colour so that a mono-colour super-cube could be reconstituted for all colours. This amounted to solving the four equations

4a+2b=24, 4c+2d=40, b+c=8, d+3a=24,

leading to 40 quadri-colour, 16 tri-colour, and 8 bi-colour cubes.
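The little system can be checked numerically, as sketched below, with the (assumed) identification that a+d counts the tri-colour and b+c the bi-colour cubes, which matches the counts quoted above:

```python
import numpy as np

# the four linear equations 4a+2b=24, 4c+2d=40, b+c=8, d+3a=24,
# in the unknowns (a, b, c, d)
A = np.array([[4, 2, 0, 0],
              [0, 0, 4, 2],
              [0, 1, 1, 0],
              [3, 0, 0, 1]], dtype=float)
rhs = np.array([24.0, 40.0, 8.0, 24.0])

a, b, c, d = np.linalg.solve(A, rhs)
print(a, b, c, d)  # → 4.0 4.0 4.0 12.0
```

This gives a+d=16 tri-colour and b+c=8 bi-colour cubes, on top of the 40 quadri-colour ones.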

Posted in Books, Kids, pictures, Running, Statistics, Travel, University life on April 12, 2017 by xi'an

The reason for my short visit to Berlin last week was an OxWaSP (Oxford and Warwick Statistics Program) workshop hosted by Amazon Berlin, with talks between statistics and machine learning, plus posters from our second year students. While the workshop was quite intense, I enjoyed very much the atmosphere and the variety of talks there. (Just sorry that I left too early to enjoy the social programme at a local brewery, Brauhaus Lemke, and the natural history museum. But I still managed nice runs east and west!) One thing I found most interesting (if obvious in retrospect) was the different focus of academic and production talks: the latter do not aim at full generality or at a guaranteed improvement over the existing solution, as long as the new methodology brings a gain in efficiency over it.

This connected nicely with my reading several Nature articles on quantum computing during that trip, where researchers from Google predict commercial products appearing in the coming five years, even though the technology is far from perfect and the qubit outcomes error-prone. Among the examples they provided were quantum simulation (not meaning what I consider to be simulation!), quantum optimisation (as a way to overcome multimodality), and quantum sampling (targeting given probability distributions). I find the inclusion of the last item puzzling, in that simulation (in that sense) shows very little tolerance for errors, especially systematic bias. It may be that specific quantum architectures can be designed for specific probability distributions, just like some are already conceived for optimisation. (It may even be the case that quantum solutions are (just next to) available for intractable constants as in Ising or Potts models!)

Le Monde puzzle [#1002]

Posted in Kids, R on April 4, 2017 by xi'an

For once and only because it is part of this competition, a geometric Le Monde mathematical puzzle:

Given both diagonals of lengths p=105 and q=116, what is the parallelogram with the largest area? and when the perimeter is furthermore constrained to be L=290?

This made me jump right away to the quadrilateral page on Wikipedia, which reminds us that the largest area occurs when the diagonals are orthogonal, in which case it is A=½pq. Only the angle between the diagonals matters. Imposing the perimeter 2s in addition is not solved there, so I wrote an R code looking at all the integer solutions, based on one of the numerous formulae for the area, like ½pq sin(θ), where θ is the angle between both diagonals, and discretising in terms of the fractions of both diagonals at the intersection, and of the angle θ:

p=105;q=116;s=145;maxur=0 #diagonals, semi-perimeter, running maximal area
for (alpha in (1:500)/1000){
 for (beta in (1:999)/1000){
  for (teta in (1:9999)*pi/10000){
   a=sqrt((alpha*p)^2+(beta*q)^2-2*alpha*beta*p*q*cos(teta)) #law of cosines
   b=sqrt(((1-alpha)*p)^2+(beta*q)^2+2*(1-alpha)*beta*p*q*cos(teta))
   c=sqrt(((1-alpha)*p)^2+((1-beta)*q)^2-2*(1-alpha)*(1-beta)*p*q*cos(teta))
   d=sqrt((alpha*p)^2+((1-beta)*q)^2+2*alpha*(1-beta)*p*q*cos(teta))
   if (abs(a+b+c+d-2*s)<.01)
    if (p*q*sin(teta)>2*maxur) maxur=p*q*sin(teta)/2
}}}
This code returned an area of 4350, to compare with the optimal 6090 (which is recovered by the above R code when the diagonal lengths are identical and the perimeter is the one of the associated square). (As Jean-Louis Foulley pointed out to me, this area can be found directly by assuming the quadrilateral is a parallelogram and maximising in the length of one side.)
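Following Jean-Louis Foulley's parallelogram argument, the constrained maximum is actually available in closed form, as sketched below: the two sides u and v of the parallelogram satisfy u+v=L/2 and u²+v²=(p²+q²)/2 (parallelogram law), the angle θ between the diagonals satisfies pq·cosθ=|v²−u²|, and the area is ½pq·sinθ:

```python
from math import sqrt

p, q, L = 105.0, 116.0, 290.0

# parallelogram sides u, v: u + v = L/2 and u^2 + v^2 = (p^2 + q^2)/2
s2 = (p * p + q * q) / 2.0          # u^2 + v^2
uv = ((L / 2.0) ** 2 - s2) / 2.0    # u*v, from (u+v)^2 = u^2 + v^2 + 2uv
# p*q*cos(theta) = |v^2 - u^2| = |v - u| * (v + u)
diff2 = sqrt((L / 2.0) ** 2 - 4.0 * uv) * (L / 2.0)
cos_t = diff2 / (p * q)
area = 0.5 * p * q * sqrt(1.0 - cos_t ** 2)

print(area)            # constrained maximum, 4350
print(0.5 * p * q)     # unconstrained maximum, 6090
```

The numbers even come out exactly, with sinθ=5/7 and area 6090·5/7=4350, matching the grid search.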

A knapsack riddle [#2]?

Posted in Kids, R, Statistics on February 17, 2017 by xi'an


Still about this allocation riddle of the past week, and still with my confusion about the phrasing of the puzzle: when looking at a probabilistic interpretation of the game, rather than at a given adversary's y, the problem turns into the search for the maximum of

\mathbb{E}[L(x,Y)]=\sum_{i=1}^{10} i\{P(Y_i<x_i)-P(Y_i>x_i)\}

where the Y’s are Binomial B(100,p). Given those p’s, this function of x is available in closed form and can thus maximised by a simulated annealing procedure, coded as

#basic term in utility
#relies on a given or estimated p
baz=function(j,x,p) j*(pbinom(x[j]-1,100,p[j])-1+pbinom(x[j],100,p[j]))
utility=function(x,p) sum(sapply(1:10,baz,x=x,p=p))
#random search
x=rep(10,10);loss=losref=newloss=utility(x,p);argz=x;t=1;T=1e4
while ((t<T)||(newloss>loss)){
 #moving one unit between two counters
 i=sample(1:10,1);j=sample(1:10,1)
 if ((i!=j)&(x[i]>0)){
  xp=x;xp[i]=xp[i]-1;xp[j]=xp[j]+1
  #corresponding utility change
  newloss=utility(xp,p)
  #soft annealing move
  if (log(runif(1))<(newloss-loss)*log(t+1)){x=xp;loss=newloss}
  if (newloss>losref){losref=newloss;argz=xp}}
 t=t+1}

which seems to work, albeit not always returning the same utility. For instance,

> p=cy/sum(cy)
> utility(argz,p)
[1] 78.02
> utility(cy,p)
[1] 57.89


> p=sy/sum(sy)
> utility(argz,p)
[1] 82.04
> utility(sy,p)
[1] 57.78

Of course, this does not answer the question as intended and reworking the code to that purpose is not worth the time!

a knapsack riddle?

Posted in Books, pictures, R, Statistics, Travel on February 13, 2017 by xi'an


The [then current now past] riddle of the week is a sort of multi-armed bandit optimisation. Of sorts. Or rather a generalised knapsack problem. The question is about optimising the allocation of 100 indistinguishable units to 10 distinct boxes against a similarly endowed adversary, when the loss function is

L(x,y)=\sum_{i=1}^{10} i\,\text{sign}(x_i-y_i)
and the distribution q of the adversary is unknown. As usual (!), the phrasing of the riddle is somewhat ambiguous but I am under the impression that the game is played sequentially, hence that one can learn about the distribution of the adversary, at least when assuming this adversary keeps the same distribution q at all times.
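Reading the loss as L(x,y)=Σᵢ i·sign(xᵢ−yᵢ), its expectation against a Binomial adversary is available in closed form, which a quick Monte Carlo experiment confirms. A minimal Python sketch, where the allocation x and probabilities p are illustrative placeholders rather than the riddle's data:

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(0)

# illustrative (hypothetical) allocation and adversary probabilities
x = np.array([5, 5, 5, 10, 10, 10, 15, 15, 15, 10])
p = np.full(10, 0.1)
i = np.arange(1, 11)

# closed form: sum_i i * {P(Y_i < x_i) - P(Y_i > x_i)}, Y_i ~ B(100, p_i)
closed = np.sum(i * (binom.cdf(x - 1, 100, p) - binom.sf(x, 100, p)))

# Monte Carlo check of E[ sum_i i * sign(x_i - Y_i) ]
Y = rng.binomial(100, p, size=(200_000, 10))
mc = np.mean(np.sum(i * np.sign(x - Y), axis=1))
print(closed, mc)
```

The two values agree to Monte Carlo precision, which is what makes the annealing search over x practical.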

optimality stands in the eye of the riddler

Posted in Books, Kids, pictures, Statistics on March 22, 2016 by xi'an

When looking at US elections on FiveThirtyEight, I came across this riddle:

Two players go (…) into separate booths (…) and a random number between zero and one appears on a screen (…) chosen from a standard uniform distribution. They can choose to keep that first number, or to (…) get a second random number, which they must keep (…) The prize (…) is awarded to the player who kept the higher number (…) Which number is the optimal cutoff for players to discard their first number and choose another? [The Riddler, Mar 4, 2016]

While the solution is now available, I wonder at the wording of this riddle, where “optimality” is not spelled out. Unless I missed some key entry, as it often happens with puzzles… Assuming both players use the same “optimal” cut-off C to make their decision to keep or drop the original uniform, the probability that one does better than the other is exactly ½ since both outcomes are iid. When considering the expected value of the number kept by a player, a simple integral shows that this value is

\mathbb{E}[Z]=(1-C^2)/2+C/2
which is maximal for C=½. If one considers instead the median of the number Z kept by a player, a bit more computation leads to the function med(Z) = 1/(2C) when C²>1/2 and (½+C)/(1+C) when C²<1/2, which is maximal for C²=½.
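Both maximisers are easy to confirm numerically, as sketched below: the expected kept value (1+C−C²)/2 follows from conditioning on whether the first draw exceeds C, and the median formulas from the CDF of the kept number:

```python
import numpy as np

# grid of candidate cutoffs C in (0, 1)
C = np.linspace(0.01, 0.99, 9801)

# expected kept value: E[Z] = (1 - C^2)/2 + C/2, from keeping U when U > C
expected = (1.0 - C**2) / 2.0 + C / 2.0

# median of the kept value: 1/(2C) when C^2 > 1/2, (1/2 + C)/(1 + C) otherwise
median = np.where(C**2 > 0.5, 1.0 / (2.0 * C), (0.5 + C) / (1.0 + C))

print(C[np.argmax(expected)])   # maximiser of the mean, near 1/2
print(C[np.argmax(median)])     # maximiser of the median, near 1/sqrt(2)
```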

“…using the golden ratio gives the best chance to win the gold bullion!” [The Riddler, Mar 4, 2016]

Incidentally (or not), the solution on 538 is also difficult to understand in that it starts with the event that the first draw is C, which is an event of probability zero. However, when trying to optimise the choice of C for one player, given that the other player has a known cutoff D, I found that the only case where C=D is the solution proposed by the Riddler, namely ½(√5-1). To check whether or not this derivation was correct, I also plotted the theoretical (right) versus the empirical (left) versions of the winning probability:

There is no difference between the two. But the diagonal is exactly at the .5 level, as expected, with an interesting second curve at the .5 probability level. These two level sets happen to meet at the “golden number” solution, ½(√5-1), which comes as a confirmation of my earlier remark. [Another question connected with this riddle would be to figure out the value of D used by the other player from a sequence of games.]

Approximate Maximum Likelihood Estimation

Posted in Books, Mountains, pictures, Statistics, Travel, University life on September 21, 2015 by xi'an

Bertl et al. arXived last July a paper on a maximum likelihood estimator based on an alternative to ABC techniques. And to indirect inference. (One of the authors is Andreas Futschik, whom I visited last year in Linz.) A paper that I only spotted when gathering references for a reading list on ABC… The method is related to the “original ABC paper” of Diggle and Gratton (1984), which, parallel to Rubin (1984), contains in retrospect the idea of ABC methods. The starting point is stochastic approximation, namely the optimisation of a function of a parameter θ written as an expectation of a random variable Y, E[Y|θ], as in the Kiefer-Wolfowitz algorithm. However, in the case of the likelihood function, there is rarely an unbiased estimator and the authors propose instead to use a kernel density estimator of the density of the summary statistic. This means that, at each iteration of the Kiefer-Wolfowitz algorithm, two sets of observations and hence of summary statistics are simulated and two kernel density estimates derived, both evaluated at the observed summary. The sequences underlying the Kiefer-Wolfowitz algorithm are taken from (the excellent optimisation book of) Spall (2003), along with on-the-go adaptation and a convergence test.
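A minimal sketch of the idea, under simplifying assumptions of my own (a one-parameter normal model with the sample mean as summary; the gains, gradient clipping and simulation sizes are illustrative, not those of Bertl et al.): at each step the log of a kernel density estimate of the summary, evaluated at the observed summary, replaces the intractable log-likelihood in a Kiefer-Wolfowitz finite-difference update.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)

# observed data: n draws from N(theta0, 1), with the sample mean as summary
n, theta0 = 50, 2.0
s_obs = rng.normal(theta0, 1.0, n).mean()

def log_kde(theta, m=200):
    # kernel density estimate of the summary density, evaluated at s_obs
    sims = rng.normal(theta, 1.0, (m, n)).mean(axis=1)
    return np.log(gaussian_kde(sims)(s_obs)[0])

theta, c = 1.0, 0.1
for k in range(1, 201):
    # Kiefer-Wolfowitz finite-difference ascent on the estimated log-likelihood
    grad = (log_kde(theta + c) - log_kde(theta - c)) / (2.0 * c)
    grad = np.clip(grad, -50.0, 50.0)  # ad-hoc guard against wild tail estimates
    theta += 0.01 / k**0.6 * grad

print(theta, s_obs)  # the iterates should settle near the observed mean
```

In this toy model the MLE is the observed mean itself, so the iterates have a known target; each iteration costs two batches of m pseudo-samples, which hints at the computational overhead discussed below.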

The theoretical difficulty in this extension is however that the kernel density estimator is not unbiased and thus that, rigorously speaking, the validation of the Kiefer-Wolfowitz algorithm does not apply here. On the practical side, the need for multiple starting points and multiple simulations of pseudo-samples may induce a considerable time overhead, especially if a bootstrap is used to evaluate the precision of the MLE approximation. Besides normal and M/G/1 queue examples, the authors illustrate the approach on a population genetics dataset of Borneo and Sumatra orang-utans, with 5 parameters and 28 summary statistics. Which thus means using a kernel density estimator in dimension 28, a rather perilous adventure..!