Archive for the Kids Category

a trip back in time [and in Rouen]

Posted in Kids, pictures, Running, Statistics, Travel, University life on June 24, 2017 by xi'an

On Monday, I took part in a celebration of the remarkable career of a former colleague of mine in Rouen, Gérard Grancher, who is retiring after a life-long position as CNRS engineer in the department of maths of the University of Rouen, a job title that tells very little about the numerous facets of his interactions with mathematics, from his handling of all computing aspects in the laboratory to his support of all colleagues there, including fresh PhD students like me in 1985!, to his direction of the CNRS lab in 2006 and 2007 at a time of deep division and mistrust, to his numerous collaborations on statistical projects with local actors, to his Norman federalism in bringing the maths departments of Caen and Rouen into a regional federation, to an unceasing activism to promote maths in colleges and high schools and science fairs all around Normandy, to his contributions to professional training in statistics for CNRS agents, and much, much more… Which explains why the science auditorium of the University of Rouen was packed with mathematicians and high school maths teachers and friends! (The poster of the day was made by Gérard’s accomplices in science popularisation, Élise Janvresse and Thierry Delarue, based on a sample of points randomly drawn from Gérard’s picture, maybe using a determinantal process, and the construction of a travelling salesman path over those points.)

This was a great day with mostly popularisation talks (including one about Rasmus’ socks..!) and reminiscences about Gérard’s career at Rouen. As I had left the university in 2000 to move to Paris-Dauphine, this was a moving day as well, as I met with old friends I had not seen for ages, including our common PhD advisor, Jean-Pierre Raoult.

This trip back in time was also an opportunity to (re-)visit the beautifully preserved medieval centre of Rouen, with its Norman-style wooden houses, the numerous churches, including Monet’s cathedral, the Justice Hall… Last time I strolled those streets, George Casella was visiting!

Le Monde puzzle [#1013]

Posted in Books, Kids on June 23, 2017 by xi'an

A purely arithmetic Le Monde mathematical puzzle:

An operation þ applies to all pairs of natural integers with the properties

0 þ (a+1) = (0 þ a)+1, (a+1) þ (b+1)=(a þ b)+1, 271 þ 287 = 77777, 2018 þ 39 = 2018×39

Find the smallest integer d>287 such that there exists c<d leading to c þ d = c×d, and the smallest integer f>2017 such that 2017 þ f = 2017×40. Is there any known integer f such that f þ 2017 = 40×2017?

The major appeal in this puzzle (where no R programming seems to help!) is that the “data” does not completely define the operation þ! Indeed, when a<b, it is straightforward to deduce that a þ b = (0 þ 0)+b, hence solving the first two questions by deriving (0 þ 0)=270×287 [since c þ d = c×d then means (c-1)×d=270×287, d is the smallest divisor of 270×287 above 287, namely d=315 with c=247, and f=2017×40-270×287=3190], but the opposed quantity b þ a is not defined, apart from (2018-39) þ 0. This however brings a resolution since

(2018-39) þ 0 = 2017×39 and (2018-39+2017) þ 2017 = 2017×39+2017 = 2017×40

leading to f=2018-39+2017=3996.
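Even though no programming is needed, the arithmetic above is easy to double-check. Here is a minimal Python sketch, assuming only the reduction a þ b = (0 þ 0)+b for a<b and the single known value of the form x þ 0; the function names are mine and purely illustrative.

```python
def offset_from_known_value(a, b, value):
    """Return 0 þ 0 from a known value of a þ b with a < b,
    using the reduction a þ b = (0 þ 0) + b."""
    assert a < b
    return value - b


def smallest_d(base, lower):
    """Smallest d > lower such that some c < d satisfies base + d == c * d,
    that is, (c - 1) * d == base. Returns the pair (c, d)."""
    d = lower + 1
    while True:
        if base % d == 0:
            c = 1 + base // d
            if c < d:
                return c, d
        d += 1


base = offset_from_known_value(271, 287, 77777)   # 0 þ 0
c, d = smallest_d(base, 287)                      # first question
f_second = 2017 * 40 - base                       # second question: base + f = 2017*40

# Third question: 2018 þ 39 = (1979 þ 0) + 39 gives the one known value x þ 0.
shift = 2018 - 39
value_at_zero = 2018 * 39 - 39                    # (2018-39) þ 0
f_third = shift + 2017                            # f þ 2017 = value_at_zero + 2017
assert value_at_zero + 2017 == 40 * 2017

print(base, (c, d), f_second, f_third)
```

The search over d always terminates since d equal to 0 þ 0 itself is a solution (with c=2).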

La Rochambelle, 25000⁺ coureuses! [39:29, 24°, 164th & 7th V2…]

Posted in Kids, pictures, Running, Travel on June 18, 2017 by xi'an

As almost every year in the last decade, I ran the 10K in Caen for Courants de la Liberté, with 5000⁺ runners, on a new route completely within the city of Caen, and partly downhill..! It did not go well (although I started at 3:44 per k over the first three k’s) as I ended up at a poor position (8th) in my category, which is not surprising with some runners now 8 years younger than I! (The runner next to me is the second V3.) The weather was also fairly hot, especially for a Norman early morning… Several runners fainted during the race or upon arrival and the faces of most runners showed the strain. But I first and primarily want to congratulate my mom for walking the 6⁻ km the previous evening despite serious health issues in the previous months, as well as my mother-in-law who walked with her.

Bayesian decision riddle

Posted in Books, Kids, Statistics on June 15, 2017 by xi'an

The current puzzle on The Riddler is a version of the secretary problem with an interesting (?) Bayesian solution.

Given four positive numbers x¹, x², x³, x⁴, observed sequentially, the associated utility is the value of x at the stopping time. What is the optimal stopping rule?

While nothing is mentioned about the distribution of the x’s, I made the assumption that they were iid and uniformly distributed over (0,M), with M unknown, and tried a Bayesian resolution with the non-informative prior π(M)=1/M. And failed. The reason for this failure is that the expected utility is infinite at the first step: while the posterior expected utility is finite with three and two observations, meaning I can compare stopping and continuing at the second and third steps, the predicted expected reward for continuing after observing x¹ does not exist because the expected value of max(x¹,x²) given x¹ does not exist, as the predictive density of x² is proportional to max(x¹,x²)⁻²… Several alternatives are possible to bypass this impossible resolution, from changing the utility function to picking another reference prior.

For instance, using a prior like π(M)=1/M² (and the same monetary return utility) leads to a proper optimal solution, namely

  1. always wait for the second observation x²
  2. stop at x² if x²>11x¹/12, else wait for x³
  3. stop at x³ if x³>2 max(x¹,x²)/3, else observe x⁴

obtained analytically on a bar table in Rouen (and checked numerically later).
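For a check of the backward induction, here is a minimal Python sketch in exact arithmetic. It assumes the model stated above (iid uniforms on (0,M), prior π(M)=1/M^p, utility equal to the value at the stopping time), under which the posterior on M after k observations with running maximum m is a Pareto distribution with shape k+p-1 and scale m, so that every continuation value is a multiple of m. The function name and its arguments are mine.

```python
from fractions import Fraction


def continuation_coefficients(n_obs=4, prior_power=2):
    """Return {k: v_k} such that v_k * m is the expected reward of continuing
    after k observations with running maximum m, under iid U(0, M) draws
    and the prior pi(M) = 1 / M**prior_power (assumptions of this sketch)."""
    v = {}
    for k in range(n_obs - 1, 0, -1):
        alpha = Fraction(k + prior_power - 1)   # Pareto shape of the posterior on M
        if alpha <= 1:
            raise ValueError("no finite continuation value after %d observation(s)" % k)
        if k == n_obs - 1:
            # Last decision: continuing means taking the final draw, E[x] = E[M]/2.
            v[k] = alpha / (2 * (alpha - 1))
        else:
            t = v[k + 1]
            # Next draw below m: the reward is max(x, t*m), integrated over (0, m).
            below = (1 + t * t) / 2 if t <= 1 else t
            # Next draw above m: it becomes the new maximum, the reward is max(1, t)*x.
            above = max(Fraction(1), t) / (alpha - 1)
            v[k] = alpha / (alpha + 1) * (below + above)
    return v


coefficients = continuation_coefficients()
for k in sorted(coefficients):
    print(k, coefficients[k])
```

The rule reads as: stop at the k-th observation when it exceeds v_k times the running maximum, a coefficient above one meaning that stopping at that step is never optimal. Calling the function with prior_power=1 raises an error at the first step only, in line with the failure of the π(M)=1/M prior.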

Another approach is to try to optimise the probability to pick the largest amount of the four x’s, but this is not leading to an interesting solution, since it corresponds to picking the first maximum after x¹, while picking the largest among remaining ones leads to a somewhat convoluted solution I have no patience to produce here! Plus this is not a really pertinent loss function as it does not discriminate enough against waiting…

Le Monde puzzle [#1012]

Posted in Books, Kids on June 14, 2017 by xi'an

A basic geometric Le Monde mathematical puzzle:

Take a triangle ABC such that the side AB is c=42 long, each side has an integer length, and the area is 756. Given an inner point D, draw three lines parallel to the three sides of ABC through D in order to construct three triangles with common vertex D and bases supported by these three sides.

  1. How far is D from the base AB when all three triangles have perimeters equal to the sides that support their bases?
  2. How far is D from the previous solution when the sum of the areas of the three triangles is minimal?

Since the puzzle is purely geometric, I was quite tempted to bypass it and to watch instead the British elections and the Comey hearing! However, the sides a and b are easily found by an exhaustive search, a=39 and b=45 (or the reverse). From there, the problem resolution proceeds by a similar triangles argument, since all triangles constructed by the game rule have the same angles, hence proportional sides. For the first question, this leads to a straightforward determination of the base of each triangle by the perimeter equation: the triangle resting on AB has a perimeter of 42, a third of the perimeter 126 of ABC, hence a height equal to a third of the height 2×756/42=36 of ABC, meaning that D is then 12 units away from AB. The second question is not harder in that the area of a triangle with base a and adjacent angles β and γ can be written as

a² sin(β) sin(γ) / [2 sin(β+γ)]

meaning it suffices to minimise a²+a’²+a”² under the constraint that the sum of the three sides parallel to BC is the complete length of BC, a+a’+a”=39. The solution is then that all three triangles are identical, leading to a vertex D’ at a distance 12 from AB, again!, but in the middle of the segment parallel to AB at that height, hence a distance to the earlier D equal to one.
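The whole resolution can be replayed in a few lines. This is a minimal Python sketch, assuming coordinates A=(0,0) and B=(42,0) (my choice, not part of the puzzle) and using Heron's formula for the exhaustive search over integer sides; the function names are illustrative.

```python
from fractions import Fraction
from math import isqrt


def integer_sides(c, area):
    """All integer pairs (a, b), a <= b, closing a triangle of base c and given area."""
    # Heron's formula: 16*area^2 = ((a+b)^2 - c^2) * (c^2 - (b-a)^2)
    target = 16 * area * area
    pairs = []
    for t in range(c):                      # t = b - a < c by the triangle inequality
        q, r = divmod(target, c * c - t * t)
        if r:
            continue
        s2 = q + c * c                      # candidate value of (a + b)^2
        s = isqrt(s2)
        if s * s == s2 and (s - t) % 2 == 0:
            pairs.append(((s - t) // 2, (s + t) // 2))
    return pairs


def special_points(a, b, c, area):
    """D (perimeter condition) and D' (minimal total area), with A=(0,0), B=(c,0),
    a = BC and b = CA."""
    h = Fraction(2 * area, c)                       # height of C above AB
    x = Fraction(b * b + c * c - a * a, 2 * c)      # abscissa of C
    p = a + b + c
    # The triangle resting on a side has similarity ratio side/p under the perimeter
    # condition, which is also the barycentric weight of the opposite vertex.
    d = ((b * c + c * x) / p, c * h / p)
    # Equal ratios 1/3 minimise the sum of squares: D' is the centroid.
    d_prime = ((c + x) / 3, h / 3)
    return d, d_prime


sides = integer_sides(42, 756)
a, b = sides[0]
d, d_prime = special_points(a, b, 42, 756)
print(sides, d, d_prime)
```

Both points sit at height 12 above AB, so their distance is the difference between their abscissas.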

ACDC versus ABC

Posted in Books, Kids, pictures, Statistics, Travel on June 12, 2017 by xi'an

At the Bayes, Fiducial and Frequentist workshop last month, I discussed with the authors of this newly arXived paper, Approximate confidence distribution computing, Suzanne Thornton and Min-ge Xie. Which they abbreviate as ACC and not as ACDC. While I have discussed the notion of confidence distribution in some earlier posts, this paper aims at producing proper frequentist coverage within a likelihood-free setting. Given the proximity with our recent paper on the asymptotics of ABC, as well as with Li and Fearnhead’s (2016) parallel endeavour, it is difficult (for me) to spot the actual distinction between ACC and ABC given that we also achieve (asymptotically) proper coverage when the limiting ABC distribution is Gaussian, which is the case for a tolerance decreasing quickly enough to zero (in the sample size).

“Inference from the ABC posterior will always be difficult to justify within a Bayesian framework.”

Indeed the ACC setting is eerily similar to ABC apart from the potential of the generating distribution to be data dependent. (Which is fine when considering that the confidence distributions have no Bayesian motivation but are a tool to ensure proper frequentist coverage.) That it is “able to offer theoretical support for ABC” (p.5) is unclear to me, given both this data dependence and the constraints it imposes on the [sampling and algorithmic] setting. Similarly, I do not understand how the authors “are not committing the error of doubly using the data” (p.5) and why they should be concerned about it, standing outside the Bayesian framework. If the prior involves the data as in the Cauchy location example, it literally uses the data [once], followed by an ABC comparison between simulated and actual data, that uses the data [a second time].

“Rather than engaging in a pursuit to define a moving target such as [a range of posterior distributions], ACC maintains a consistently clear frequentist interpretation (…) and thereby offers a consistently cohesive interpretation of likelihood-free methods.”

The frequentist coverage guarantee comes from a bootstrap-like assumption that [with tolerance equal to zero] the distribution of the ABC/ACC/ACDC random parameter around an estimate of the parameter given the summary statistic is identical to the [frequentist] distribution of this estimate around the true parameter [given the true parameter, although this conditioning makes no sense outside a Bayesian framework]. (There must be a typo in the paper when the authors define [p.10] the estimator as minimising the derivative of the density of the summary statistic, while still calling it an MLE.) That this bootstrap-like assumption holds is established (in Theorem 1) under a CLT on this MLE and assumptions on the data-dependent proposal that connect it to the density of the summary statistic. Connections that seem to imply a data-dependence as well as a certain knowledge about this density. What I find most surprising in this derivation is the total absence of conditions or even discussion on the tolerance level which, as we have shown, is paramount to the validation or invalidation of ABC inference. It sounds like the authors of Approximate confidence distribution computing are setting ε equal to zero for those theoretical derivations. While in practice they apply rules [for choosing ε] they do not spell out, but which result in very different acceptance rates for the ACC version they contrast with an ABC version. (In all illustrations, it seems that ε=0.1, which does not make much sense.) All in all, I am thus rather skeptical about the practical implications of the paper in that it seems to achieve confidence guarantees by first assuming proper if implicit choices of summary statistics and parameter generating distribution.
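To make the role of the tolerance concrete, here is a minimal Python sketch of a plain ABC rejection sampler on a toy problem of my own choosing, unrelated to the examples of the paper: a normal mean with unit variance, the sample mean as summary statistic, and a uniform prior on (-5,5). All names and settings are assumptions made for the illustration.

```python
import random


def abc_rejection(obs_mean, n, eps, n_sims=2000, seed=1):
    """Plain ABC rejection: keep the prior draws whose simulated sample mean
    falls within eps of the observed one (toy normal-mean model, unit variance)."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)      # assumed prior, not data-dependent
        sim_mean = sum(rng.gauss(theta, 1.0) for _ in range(n)) / n
        if abs(sim_mean - obs_mean) <= eps:
            accepted.append(theta)
    return accepted


wide = abc_rejection(obs_mean=1.0, n=20, eps=0.5)
narrow = abc_rejection(obs_mean=1.0, n=20, eps=0.05)
print(len(wide) / 2000, len(narrow) / 2000)
```

With a common seed both runs use the same simulations, so the draws accepted under the smaller tolerance are a subset of those accepted under the larger one, and the acceptance rate drops accordingly.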

Alex Honnold free solos Freeride (5.13a/7c+)

Posted in Books, Kids, Mountains, pictures, Travel on June 11, 2017 by xi'an