Archive for residuals

Le Monde puzzle [#1053]

Posted in Books, Kids, R on June 21, 2018 by xi'an

An easy arithmetic Le Monde mathematical puzzle again:

  1. If coins come in units of 1, x, and y, what is the optimal value of (x,y) that minimises the number of coins representing an arbitrary price between 1 and 149?
  2.  If the number of units is now four, what is the optimal choice?

The first question is fairly easy to code:

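coinz <- function(x,y){
  z=(1:149)                 # all prices to be paid
  if (y<x){xx=x;x=y;y=xx}   # make sure that x<y
  ny=z%/%y                  # as many y coins as possible
  nx=(z%%y)%/%x             # then as many x coins as possible
  no=z-ny*y-nx*x            # remainder paid in unit coins
  return(max(no+nx+ny))     # worst case over all prices
}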

Minimising this function over (x,y) returns M=12 as the smallest maximal number of coins, attained for x=4 and y=22, the worst case being reached for a price tag of 129. For the second question, one unit is still necessarily 1 (!) and there is just an extra loop to add to the above, which returns M=8, with the other three units taking several possible values:

[1] 40 11  3
[1] 41 11  3
[1] 55 15  4
[1] 56 15  4
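Going back to the first question, the minimisation over (x,y) is not spelled out above. Here is a minimal sketch of what it could look like, assuming an exhaustive search over all pairs 1<x<y≤149 (this range and the name `best` are my assumptions, not part of the original code):

```r
# coinz() as in the post: greedy count, maximised over prices 1..149
coinz <- function(x, y) {
  z <- 1:149
  if (y < x) { xx <- x; x <- y; y <- xx }
  ny <- z %/% y                 # number of y coins
  nx <- (z %% y) %/% x          # number of x coins
  no <- z - ny * y - nx * x     # number of unit coins
  max(no + nx + ny)
}

# Exhaustive search over pairs 1 < x < y <= 149 (assumed range)
best <- c(M = Inf, x = NA, y = NA)
for (x in 2:148)
  for (y in (x + 1):149) {
    M <- coinz(x, y)
    # strict inequality: ties keep the first pair encountered
    if (M < best["M"]) best <- c(M = M, x = x, y = y)
  }
best
```

Since ties are resolved in favour of the first pair met by the loop, the pair printed need not be the (4,22) quoted above, even when the value of M agrees.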

A quick search revealed that this problem (or a variant) is solved in many places, from stackexchange (for an average number of coins rather than a maximal one, and why average?, as it does not make sense when looking at real prices), to a paper by Shalit calling for the 18¢ coin, to Freakonomics, to Wikipedia, although the latter is about finding the minimum number of coins summing up to a given value, using fixed currency denominations (a knapsack problem). This Wikipedia page made me realise that my solution is not necessarily optimal, as my code is greedy, using the remainders from the larger denominations, while there may be more efficient divisions. For instance, running the following dynamic programming code

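coz=function(x,y){
  minco=1:149                # start from unit coins only
  if (x<y){ xx=x;x=y;y=xx}   # make sure that x>y
  for (i in 2:149){
    if (i%%x==0)             # exact multiple of x
      minco[i]=i%/%x
    if (i%%y==0)             # exact multiple of y
      minco[i]=min(minco[i],i%/%y)
    # best split of price i into two smaller prices j and i-j
    for (j in 1:max(1,trunc((i+1)/2)))
      minco[i]=min(minco[i],minco[j]+minco[i-j])
  }
  return(max(minco))}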

returns the lower value of M=11 (with x=7,y=23) in the first case and M=7 in the second one.
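To see where the greedy and the optimal counts part ways, here is a hedged sketch of the same dynamic programming idea in the more usual change-making form, where the best count for price i is one plus the best count for i-u, minimised over the units u≤i. The function names `mincoins` and `greedy`, and the argument `units`, are mine and not from the code above:

```r
# Minimal number of coins for every price 1..top by dynamic programming:
# best(i) = 1 + min over units u <= i of best(i - u)
mincoins <- function(units, top = 149) {
  units <- sort(unique(c(1, units)))   # unit 1 makes every price reachable
  best <- c(0, rep(Inf, top))          # best[i + 1] holds the count for price i
  for (i in 1:top) {
    u <- units[units <= i]
    best[i + 1] <- 1 + min(best[i - u + 1])
  }
  best[-1]
}

# Greedy count: always use the largest unit that still fits
greedy <- function(units, top = 149) {
  units <- sort(unique(c(1, units)), decreasing = TRUE)
  z <- 1:top
  n <- 0 * z
  for (u in units) {
    n <- n + z %/% u
    z <- z %% u
  }
  n
}

opt <- mincoins(c(7, 23))
gre <- greedy(c(7, 23))
c(optimal = opt[135], greedy = gre[135])
```

For the price 135 with units (1,7,23), the optimal split uses 11 coins (4×23+6×7+1) against 13 for the greedy one (5×23+2×7+6×1). Passing a vector of units also covers the four-unit case without further changes.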

The Seven Pillars of Statistical Wisdom [book review]

Posted in Books, pictures, Statistics, University life on June 10, 2017 by xi'an

I remember quite well attending the ASA Presidential address of Stephen Stigler at JSM 2014, Boston, on the seven pillars of statistical wisdom. In connection with T.E. Lawrence’s 1926 book. Itself in connection with Proverbs IX:1. Unfortunately wrongly translated as seven pillars rather than seven sages.

As pointed out in the Acknowledgements section, the book came prior to the address by several years. I found it immensely enjoyable, first for putting the field in a (historical and) coherent perspective through those seven pillars, second for exposing new facts and curios about the history of statistics, third because of a literary style one would wish to see more often in scholarly texts and of a most pleasant design (and the list of reasons could go on for quite a while, one being the several references to Jorge Luis Borges!). But the main reason is to highlight the unified nature of Statistics and the reasons why it does not constitute a subfield of either Mathematics or Computer Science. In these days where centrifugal forces threaten to split the field into seven or more disciplines, the message is welcome and urgent.

Here are Stephen’s pillars (some comments being already there in the post I wrote after the address):

  1. aggregation, which leads to gaining information by throwing away information, aka the sufficiency principle. One (of several) remarkable stories in this section is the attempt by Francis Galton, never lacking in imagination, to visualise the average man or woman by superimposing the pictures of several people of a given group. In 1870!
  2. information accumulating at the √n rate, aka precision of statistical estimates, aka CLT confidence [quoting de Moivre at the core of this discovery]. Another nice story is Newton’s wardenship of the English Mint, with musings about [his] potentially exploiting this concentration to cheat the Mint and remain undetected!
  3. likelihood as the right calibration of the amount of information brought by a dataset [including Bayes’ essay as an answer to Hume and Laplace’s tests] and by Fisher in possibly the most impressive single-handed advance in our field;
  4. intercomparison [i.e. scaling procedures from variability within the data, sample variation], from Student’s [a.k.a., Gosset’s] t-test, better understood and advertised by Fisher than by the author, and eventually leading to the bootstrap;
  5. regression [linked with Darwin’s evolution of species, albeit paradoxically, as Darwin claimed to have faith in nothing but the irrelevant Rule of Three, a challenging consequence of this theory being an unobserved increase in trait variability across generations] exposed by Darwin’s cousin Galton [with a detailed and exhilarating entry on the quincunx!] as conditional expectation, hence as a true Bayesian tool, the Bayesian approach being more specifically addressed in (on?) this pillar;
  6. design of experiments [re-enters Fisher, with his revolutionary vision of changing all factors in Latin square designs], with a fascinating insert on the 18th Century French Loterie, which by 1811, i.e., during the Napoleonic wars, provided 4% of the national budget!;
  7. residuals which again relate to Darwin, Laplace, but also Yule’s first multiple regression (in 1899), Fisher’s introduction of parametric models, and Pearson’s χ² test. Plus Nightingale’s diagrams that never cease to impress me.

The conclusion of the book revisits the seven pillars to ascertain the nature and potential need for an eighth pillar. It is somewhat pessimistic, at least my reading of it was, as it cannot (and presumably does not want to) produce any direction about this new pillar and hence about the capacity of the field of statistics to handle incoming challenges and competition. With some amount of exaggeration (!) I do hope the analogy of the seven pillars that raises in me the image of the beautiful ruins of a Greek temple atop a Sicilian hill, in the setting sun, with little known about its original purpose, remains a mere analogy and does not extend to predict the future of the field! By its very nature, this wonderful book is about foundations of Statistics and therefore much more set in the past and on past advances than on the present, but those foundations need to move, grow, and be nurtured if the field is not to become a field of ruins, a methodology of the past!

JSM 2014, Boston

Posted in Books, Mountains, pictures, Running, Statistics, Travel, University life on August 6, 2014 by xi'an

A new Joint Statistical Meeting (JSM), first one since JSM 2011 in Miami Beach. After solving [or not] a few issues on the home front (late arrival, one lost bag, morning run, flat in a purely residential area with no grocery store nearby and hence no milk for tea!), I “trekked” to [and then through] the faraway and sprawling Boston Convention Centre and was there in (plenty of) time for Mathias Drton’s Medallion Lecture on linear structural equations. (The room was small and crowded and I was glad to be there early enough!, although there were no Cerberus [Cerberi?] to prevent additional listeners from sitting on the ground, as in Washington D.C. a few years ago.) The award was delivered to Mathias by Nancy Reid from Toronto (and reminded me of my Medallion Lecture in exotic Fairbanks ten years ago). I had alas missed Gareth Roberts’ Blackwell Lecture on Rao-Blackwellisation, as I was still on the plane from Paris, trying to cut down on my slides and to spot known Icelandic locations from glancing sideways at the movie The Secret Life of Walter Mitty played on my neighbour’s screen. (Vik?)

Mathias started his wide-ranging lecture by linking linear structural models with graphical models and specific features of covariance matrices. I did not spot a motivation for the introduction of confounding factors, a point that always puzzles me in this literature [as I must have repeatedly mentioned here]. The “reality check” slide made me hopeful but it was mostly about causality [another of or the same among my stumbling blocks]… What I have trouble understanding is how much results from the modelling and how much follows from this “reality check”. A notion that was new to me in the talk was the “trek rule”, expressing the covariance between variables as a sum, over the “treks” (sequences of edges) linking those variables, of products of edge coefficients. This is not a new notion, as it was introduced by Wright (1921), but it is a very elegant representation of the matrix inversion of (I-Λ) as a power series. Mathias made it sound quite intuitive even though I would have difficulties rephrasing the principle solely from memory! It made me [vaguely] wonder at computational implications for simulation of posterior distributions on covariance matrices. Although I missed the fundamental motivation for those mathematical representations. The last part of the talk was a series of mostly open questions about the maximum likelihood estimation of covariance matrices, from existence to unimodality to likelihood-ratio tests. And an interesting instance of favouring bootstrap subsampling. As in random forests.
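As a reminder to myself of what the power series representation means, here is a small numerical sketch. It assumes the usual convention X=ΛᵀX+ε with Cov(ε)=Ω, so that the covariance matrix is Σ=(I-Λ)⁻ᵀΩ(I-Λ)⁻¹, and a made-up acyclic graph on three nodes. The graph, the edge weights and the variable names are all hypothetical and not taken from the lecture:

```r
# Hypothetical acyclic graph: 1 -> 2, 2 -> 3 and 1 -> 3, with made-up weights
p <- 3
Lambda <- matrix(0, p, p)
Lambda[1, 2] <- 0.5            # edge 1 -> 2
Lambda[2, 3] <- -1.2           # edge 2 -> 3
Lambda[1, 3] <- 0.8            # edge 1 -> 3
Omega <- diag(c(1, 2, 0.5))    # independent errors (no bidirected edges)

# Power series I + Lambda + Lambda^2 + ..., finite here as Lambda is nilpotent
series <- diag(p)
term <- diag(p)
for (k in 1:(p - 1)) {
  term <- term %*% Lambda
  series <- series + term
}
inv <- solve(diag(p) - Lambda)
gap <- max(abs(series - inv))  # numerically zero

# Covariance matrix implied by the structural equations
Sigma <- t(inv) %*% Omega %*% inv

# Cov(X2, X3) as a sum over treks: 2 -> 3 (top node 2),
# then 2 <- 1 -> 3 and 2 <- 1 -> 2 -> 3 (top node 1)
trek23 <- Omega[2, 2] * Lambda[2, 3] +
  Omega[1, 1] * Lambda[1, 2] * (Lambda[1, 3] + Lambda[1, 2] * Lambda[2, 3])
c(Sigma[2, 3], trek23)
```

Both entries agree, each trek contributing the variance of its top node times the product of the edge weights along its two sides.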

I also attended the ASA Presidential address of Stephen Stigler on the seven pillars of statistical wisdom. In connection with T.E. Lawrence’s 1927 book. (Actually, 1922.) Itself in connection with Proverbs IX:1. Unfortunately wrongly translated as seven pillars rather than seven sages.  Here are Stephen’s pillars:

  1. aggregation, which leads to gaining information by throwing away information, aka the sufficiency principle [one may wonder at the extension of this principle to non-exponential families]
  2. information accumulating at the √n rate, aka precision of statistical estimates, aka CLT confidence [quoting our friend de Moivre at the core of this discovery]
  3. likelihood as the right calibration of the amount of information brought by a dataset [including Bayes’ essay]
  4. intercomparison [i.e. scaling procedures from variability within the data, sample variation], eventually leading to the bootstrap
  5. regression [linked with Darwin’s evolution of species, albeit paradoxically] as conditional expectation, hence as a Bayesian tool
  6. design of experiments [enters Fisher, with his revolutionary vision of changing all factors in Latin square designs]
  7. residuals [aka goodness of fit but also ABC!]

Maybe missing the positive impact of the arbitrariness of picking or imposing a statistical model upon an observed dataset. Maybe not as it is somewhat covered by #3, #4 and #7. The reliance on the reproducibility of the data could be the ground on which those pillars stand.