Archive for optimisation

computational methods for numerical analysis with R [book review]

Posted in Books, Kids, pictures, R, Statistics, University life on October 31, 2017 by xi'an

This is a book by James P. Howard, II, that I received from CRC Press for review in CHANCE. (As usual, the customary warning applies: most of this blog post will appear later in my book review column in CHANCE.) It consists of a traditional introduction to numerical analysis, backed by R codes and packages. The early chapters set the scene, from basics on R to notions of numerical error, before moving to linear algebra, interpolation, optimisation, integration, differentiation, and ODEs. The book comes with a package, cmna, that reproduces the algorithms and their tests. While I do not find much originality in the book, given its adherence to simple resolutions of the above topics, I could nonetheless use it for an elementary course in our first-year classes, with maybe the exception of the linear algebra chapter, which I did not find very helpful.

“…you can have a solution fast, cheap, or correct, provided you only pick two.” (p.27)

The (minor) issue I have with the book, and that a potential mathematically keen student could face as well, is that there is little in the way of justifying a particular approach to a given numerical problem (as opposed to others) and of characterising the limitations and failures of the presented methods (although this happens from time to time, as e.g. for gradient descent, p.191). [Steeped in my Gallic “mal-être”, I am prone to over-criticise methods in class, to the (increased) despair of my students!, but I also feel that avoiding over-rosy presentations is a good way to avoid later disappointments or even disasters.] In the case of this book, finding [more] ways of detecting would-be disasters would have been nice.

An uninteresting and highly idiosyncratic side comment is that the author preferred the French style for long division to the American one, reminding me of my first exposure to the latter, a few months ago! Another comment from a statistician is that mentioning time series inter- or extra-polation without a statistical model sounds close to anathema! And makes extrapolation a weapon without a cause.

“…we know, a priori, exactly how long the [simulated annealing] process will take since it is a function of the temperature and the cooling rate.” (p.199)

Unsurprisingly, the section on Monte Carlo integration is disappointing for a statistician/probabilistic numericist like me, as it fails to give a complete enough picture of the methodology. All simulations seem to proceed there from a large enough hypercube. And recommending the “fantastic” (p.171) R function integrate as a default is scary, given the ability of the selected integration bounds to mislead its users. Similarly, I feel that the simulated annealing section does not provide enough of a cautionary tale about the highly sensitive impact of cooling rates and absolute temperatures. It is only through the raw output of the algorithm applied to the travelling salesman problem that the novice reader can perceive the impact of some of these factors. (The acceptance bound on the jump (6.9) is incidentally wrongly called a probability on p.199, since it can take values larger than one.)
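For readers wondering about the standard fix to that last point: the usual Metropolis-style acceptance step caps the ratio at one so that it is a genuine probability. A minimal sketch, my own code rather than the book's (delta being the increase in the criterion to be minimised):

```r
# standard simulated annealing acceptance probability: the ratio is capped at one
accept_prob=function(delta,temp) min(1,exp(-delta/temp))
accept_prob(-1,1) # improving move: accepted with probability 1
```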

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Books Review section in CHANCE.]

Le Monde puzzle [#1707]

Posted in Books, Kids, R on July 28, 2017 by xi'an

A geometric Le Monde mathematical puzzle:

  1. Given a pizza of diameter 20cm, how should it be cut by two perpendicular lines through a point distant 5cm from the centre so as to maximise the surface of two opposite slices?
  2. Using the same point as the tip of the four slices, how should four cuts with equal arcs be made from that tip, again so as to maximise the surface of two opposite slices?

For both questions, I did not bother with the maths but went directly to a discretisation of the disk, counting the proportion of points within two opposite slices and letting the inclination of these slices move from zero to π/2. Unsurprisingly, for the first question, the answer is π/4, given that there is no difference between both surfaces at angles 0 and π/2. My R code is as follows, using (5,0) as the tip:

M=100
surfaz=function(alpha){
  #proportion of the disk falling within two opposite slices
  #cut by perpendicular lines through (5,0) at angle alpha
  surfz=0
  cosal=cos(alpha);sinal=sin(alpha)
  X=Y=seq(-10,10,le=M)
  Xcosal=(X-5)*cosal
  Xsinal=(X-5)*sinal
  for (i in 1:M){
    norm=sqrt(X[i]^2+Y^2)
    scal1=Xsinal[i]+Y*cosal
    scal2=-Xcosal[i]+Y*sinal
    surfz=surfz+sum((norm<=10)*(scal1*scal2>0))}
  return(4*surfz/M/M/pi)}
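As a cross-check, here is a vectorised variant of the same discretisation (surfaz2 is my own naming, with the tip again at (5,0)). It tracks one pair of opposite slices; the two pairs are balanced at angles 0 and π/2 and most unbalanced at π/4, which is where the larger pair is maximised:

```r
surfaz2=function(alpha,M=200){
  # grid over the bounding square of the radius-10 disk
  g=expand.grid(x=seq(-10,10,le=M),y=seq(-10,10,le=M))
  s1=(g$x-5)*sin(alpha)+g$y*cos(alpha)   # signed side w.r.t. the first cut
  s2=-(g$x-5)*cos(alpha)+g$y*sin(alpha)  # signed side w.r.t. the perpendicular cut
  in.disk=g$x^2+g$y^2<=100
  sum(in.disk&(s1*s2>0))/sum(in.disk)    # proportion of the disk in one opposite pair
}
alphas=seq(0,pi/2,le=21)
alphas[which.max(abs(sapply(alphas,surfaz2)-1/2))] # close to pi/4
```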

The second puzzle can be solved by a similar code, except that the slice area between two lines has to be determined by a cross product:

surfoz=function(alpha,ploz=FALSE){
  sinal=sin(alpha);cosal=cos(alpha)
  X=Y=seq(-10,10,le=M)
  frsterm=cosal*(10*cosal-5)+sinal*(10*sinal-5)
  trdterm=cosal*(10*cosal+5)+sinal*(10*sinal+5)
  surfz=0
  for (i in 1:M){
    norm=sqrt(X[i]^2+Y^2)
    scal1=(10*(Y[i]-5)*cosal-(10*sinal-5)*X)*frsterm
    scal2=-(-10*(Y[i]-5)*sinal-(10*cosal-5)*X)*frsterm
    scal3=(-10*(Y[i]-5)*cosal+(10*sinal+5)*X)*trdterm
    scal4=-(10*(Y[i]-5)*sinal+(10*cosal+5)*X)*trdterm
    surfz=surfz+sum((norm<=10)* 
    ((scal1>0)*(scal2>0)+
     (scal3>0)*(scal4>0)))}
 return(4*surfz/M/M/pi)}

a code that shows that all cuts lead to identical surfaces for both sets of slices. A fairly surprising result!

 

Le Monde puzzle [#1006]

Posted in Kids, R on May 3, 2017 by xi'an

Once the pseudo-story [noise] is removed, a linear programming Le Monde mathematical puzzle:

For the integer linear programming problem

max 2x₁+2x₂+x₃+…+x₁₀

under the constraints

x₁>x₂+x₃, x₂>x₃+x₄, …, x₉>x₁₀+x₁, x₁₀>x₁+x₂

find a solution with the maximal number of positive entries.

Expressed this way, it becomes quite straightforward to solve with the help of the linear programming R package lpSolve. Indeed, a simple iteration of the constraints shows that positive entries are necessarily bracketed by negative entries, since, e.g.,

x₂ < -88x₁/55, x₁₀ < -33x₁/55

(which applies to all consecutive triplets since the constraints are invariant under cyclic permutation of the indices). Hence there are at most five positive entries, but calling lpSolve with this option

> lp(direction="max",
    objective.in=c(2,2,rep(1,8)),
    const.mat=A,
    const.dir=rep(">=",10),
    const.rhs=rep(1,10)+A%*%c(rep(c(20,-1),5)),
    all.int=TRUE)
Error: no feasible solution found

shows this is not possible. (The added vector is my way of getting around the constraint that lpSolve only considers non-negative entries. I therefore shifted the negative entries by 20, meaning they are assumed to be larger than -20. Using the larger bound 50 does not change the outcome.) From there, there are 10 possible versions of vectors with four positive entries, and a simple loop returns

> masume
[1] -90
> topast
 [1] -11 1 -13 1 -15 1 -17 1 -19 -9

as the maximal criterion and argument of this maximum with four positive entries.
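For completeness, the constraint matrix A is not spelled out in the post; a plausible construction, assuming row i encodes the strict integer inequality x_i > x_{i+1} + x_{i+2} as x_i − x_{i+1} − x_{i+2} ≥ 1, with indices taken cyclically:

```r
# hypothetical construction of the cyclic 10x10 constraint matrix A used above
nxt=function(i) i%%10+1   # next index modulo 10
A=matrix(0,10,10)
for (i in 1:10){
  A[i,i]=1
  A[i,nxt(i)]=-1
  A[i,nxt(nxt(i))]=-1}
```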

As an aside, the previous Le Monde puzzle [#1005] also reduced to a linear system: given 64 cubes, each of the 384 faces being painted in one of four colours, with exactly 40 of these cubes exhibiting the four colours, the question was about the number of cubes that could be bicolour so that a mono-colour super-cube could be reconstituted for each colour. This amounted to solving the four equations

4a+2b=24, 4c+2d=40, b+c=8, d+3a=24,

leading to 40 quadri-colour, 16 tri-colour, and 8 bi-colour cubes.
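The four equations can be checked directly with base R's solve, the unknowns being ordered (a,b,c,d) (cubes is my own naming for the coefficient matrix):

```r
# solving 4a+2b=24, 4c+2d=40, b+c=8, d+3a=24 for (a,b,c,d)
cubes=rbind(c(4,2,0,0),
            c(0,0,4,2),
            c(0,1,1,0),
            c(3,0,0,1))
solve(cubes,c(24,40,8,24)) # a=4, b=4, c=4, d=12
```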

oxwasp@amazon.de

Posted in Books, Kids, pictures, Running, Statistics, Travel, University life on April 12, 2017 by xi'an

The reason for my short visit to Berlin last week was an OxWaSP (Oxford and Warwick Statistics Program) workshop hosted by Amazon Berlin, with talks between statistics and machine learning, plus posters from our second year students. While the workshop was quite intense, I enjoyed very much the atmosphere and the variety of talks there. (Just sorry that I left too early to enjoy the social programme at a local brewery, Brauhaus Lemke, and the natural history museum. But still managed nice runs east and west!) One thing I found most interesting (if obvious in retrospect) was the different focus of academic and production talks, where the latter do not aim at full generality or at a guaranteed improvement over existing solutions, as long as the new methodology brings a gain in efficiency.

This connected nicely with my reading several Nature articles on quantum computing during that trip, where researchers from Google predict commercial products appearing in the coming five years, even though the technology is far from perfect and the output qubits error-prone. Among the examples they provided: quantum simulation (not meaning what I consider to be simulation!), quantum optimisation (as a way to overcome multimodality), and quantum sampling (targeting given probability distributions). I find the inclusion of the last one puzzling, in that simulation (in that sense) shows very little tolerance for errors, especially systematic bias. It may be that specific quantum architectures can be designed for specific probability distributions, just like some are already conceived for optimisation. (It may even be the case that quantum solutions are (just next to) available for intractable constants as in Ising or Potts models!)

Le Monde puzzle [#1002]

Posted in Kids, R on April 4, 2017 by xi'an

For once and only because it is part of this competition, a geometric Le Monde mathematical puzzle:

Given both diagonals of lengths p=105 and q=116, what is the quadrilateral with the largest area? And when the perimeter is furthermore constrained to be L=290?

This made me jump right away to the quadrilateral page on Wikipedia, which reminds us that the largest area occurs when the diagonals are orthogonal, in which case it is A=½pq: only the angle between the diagonals matters. Imposing the perimeter 2s in addition is not solved there, so I wrote an R code looking at all the integer solutions, based on one of the numerous formulae for the area, ½pq sin(θ), where θ is the angle between both diagonals, and discretising in terms of the fractions of both diagonals at the intersection and of the angle θ:

p=105
q=116
s=145
maxur=0 #best area so far (initialisation missing in the original)
for (alpha in (1:500)/1000){
  ap=alpha*p;ap2=ap^2;omap=p-ap;omap2=omap^2
  for (beta in (1:999)/1000){
    bq=beta*q;bq2=bq^2;ombq=q-bq;ombq2=ombq^2
    for (teta in (1:9999)*pi/10000){
      #sides from the law of cosines on the four sub-triangles
      d=sqrt(ap2+bq2-2*ap*bq*cos(teta))
      a=sqrt(ap2+ombq2+2*ap*ombq*cos(teta))
      b=sqrt(omap2+ombq2-2*omap*ombq*cos(teta))
      c=sqrt(omap2+bq2+2*omap*bq*cos(teta))
      #keep only quadrilaterals matching the perimeter 2s
      if (abs(a+b+c+d-2*s)<.01){
        if (p*q*sin(teta)>2*maxur){
          maxur=p*q*sin(teta)/2
          sole=c(a,b,c,d,alpha,beta,teta)}}}}}

This code returned an area of 4350, to compare with the optimal 6090 (which is recovered by the above R code when the diagonal lengths are identical and the perimeter is that of the associated square). (As Jean-Louis Foulley pointed out to me, this area can be found directly by assuming the quadrilateral is a parallelogram and maximising in the length of one side.)
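Jean-Louis Foulley's direct route can be sketched as a one-dimensional solve (my own code, not from the post): in a parallelogram the diagonals bisect each other, so the angle θ between them determines both sides by the law of cosines, and the perimeter constraint pins θ down:

```r
p=105; q=116; s=145
# half-perimeter of the parallelogram as a function of the diagonal angle theta:
# sides are sqrt(p^2+q^2-/+2pq cos(theta))/2 since the diagonals bisect each other
halfper=function(theta)
  (sqrt(p^2+q^2-2*p*q*cos(theta))+sqrt(p^2+q^2+2*p*q*cos(theta)))/2
theta=uniroot(function(t) halfper(t)-s,c(1e-6,pi/2),tol=1e-9)$root
p*q*sin(theta)/2 # area under the perimeter constraint, 4350
```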

A knapsack riddle [#2]?

Posted in Kids, R, Statistics on February 17, 2017 by xi'an


Still about this allocation riddle of the past week, and still with my confusion about the phrasing of the puzzle: when looking at a probabilistic interpretation of the game, rather than at a given adversary's y, the problem turns into the search for the maximum of

\mathbb{E}[L(x,Y)]=\sum_{i=1}^{10} i\{P(Y_i<x_i)-P(Y_i>x_i)\}

where the Y’s are Binomial B(100,p). Given those p’s, this function of x is available in closed form and can thus be maximised by a simulated annealing procedure, coded as

utility=function(x,p){
  #closed-form E[L(x,Y)] (up to an additive constant)
  ute=2*pbinom(x[1]-1,100,prob=p[1])+
    dbinom(x[1],100,p[1])
  for (i in 2:10)
    ute=ute+2*i*pbinom(x[i]-1,100,prob=p[i])+
      i*dbinom(x[i],100,p[i])
  return(ute)}
#basic term in utility
baz=function(i,x,p){
  return(i*dbinom(x[i],100,p[i])+
    i*dbinom(x[i]-1,100,p[i]))}
#relies on a given or estimated p
x=rmultinom(n=1,size=100,prob=p)
maxloz=loss=0
newloss=losref=utility(x,p)
#random search
T=1e3
Te=1e2
baza=rep(0,10)
t=1
while ((t<T)||(newloss>loss)){
  loss=newloss
  i=sample(1:10,1,prob=(10:1)*(x>0))
  #moving all other counters by one
  xp=x+1;xp[i]=x[i]
  #corresponding utility change
  for (j in 1:10) baza[j]=baz(j,xp,p)
  proz=exp(log(t)*(baza-baza[i])/Te)
  #soft annealing move
  j=sample(1:10,1,prob=proz)
  if (i!=j){x[i]=x[i]-1;x[j]=x[j]+1}
  newloss=loss+baza[j]-baza[i]
  if (newloss>maxloz){
    maxloz=newloss;argz=x}
  t=t+1
  if ((t>T-10)&(newloss<losref)){
    t=1;loss=0
    x=rmultinom(n=1,size=100,prob=p)
    newloss=losref=utility(x,p)}}

which seems to work, albeit not always returning the same utility. For instance,

> p=cy/sum(cy)
> utility(argz,p)
[1] 78.02
> utility(cy,p)
[1] 57.89

or

> p=sy/sum(sy)
> utility(argz,p)
[1] 82.04
> utility(sy,p)
[1] 57.78

Of course, this does not answer the question as intended and reworking the code to that purpose is not worth the time!
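As a sanity check on the closed form behind utility() (my own verification, with an arbitrary p): since P(Y<x)−P(Y>x)=2P(Y<x)+P(Y=x)−1, the expectation E[L(x,Y)] equals the utility above minus the constant 55, which a quick Monte Carlo estimate confirms:

```r
set.seed(42)
p=rep(0.1,10); x=rep(10,10)
# closed form: sum_i i*(2*P(Y_i<x_i)+P(Y_i=x_i)-1)
closed=sum((1:10)*(2*pbinom(x-1,100,p)+dbinom(x,100,p)-1))
# Monte Carlo estimate of E[L(x,Y)] with Y_i ~ Binomial(100,p_i)
mc=mean(replicate(1e4,{y=rbinom(10,100,p);sum((1:10)*((x>y)-(x<y)))}))
c(closed,mc)
```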

a knapsack riddle?

Posted in Books, pictures, R, Statistics, Travel on February 13, 2017 by xi'an


The [then current now past] riddle of the week is a sort of multi-armed bandit optimisation. Of sorts. Or rather a generalised knapsack problem. The question is about optimising the allocation of 100 indistinguishable units to 10 distinct boxes against a similarly endowed adversary, when the loss function is

L(x,y)=(x_1>y_1)-(x_1<y_1)+...+10((x_{10}>y_{10})-(x_{10}<y_{10}))

and the distribution q of the adversary is unknown. As usual (!), the phrasing of the riddle is somewhat ambiguous, but I am under the impression that the game is played sequentially, hence that one can learn about the distribution of the adversary, at least when assuming this adversary keeps the same distribution q at all times.
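The loss above can be written as a small helper (illustrative, my own naming): box i pays i to whoever puts strictly more units in it, and nothing on ties:

```r
# L(x,y): +i when x wins box i, -i when it loses it, 0 on ties
L=function(x,y) sum((1:10)*((x>y)-(x<y)))
L(rep(10,10),c(rep(0,9),100)) # wins boxes 1-9, loses box 10: 45-10 = 35
```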