## Archive for Germany

## Die Mauer ist weg!

Posted in Statistics with tags 9 November 1989, Alexanderplatz, Berlin, Berlin wall, Berliner Mauer, Bornholmer Straße, East Germany, GDR, Germany on November 9, 2019 by xi'an

## Hausdorff school on MCMC [28 March-02 April, 2020]

Posted in pictures, Statistics, Travel with tags ABC, ABC in Grenoble, Bonn, Garching, Germany, Hausdorff metric, likelihood-free methods, MCMC, SIAM, Technische Universität München, travel support, tutorial, uncertainty quantification, UQ20 on September 26, 2019 by xi'an

**T**he Hausdorff Centre for Mathematics will hold a week on recent advances in MCMC in Bonn, Germany, on March 30 – April 3, 2020, preceded by two days of tutorials. (“These tutorials will introduce basic MCMC methods and mathematical tools for studying the convergence to the invariant measure.”) There is travel support available, but the application deadline is quite close, namely 30 September.

Note that, in a Spring of German conferences, the SIAM Conference on Uncertainty Quantification will take place in Munich (Garching) the week before, on March 24-27, with at least one likelihood-free session. Not to mention the ABC in Grenoble workshop in France, on 19-20 March. (Although these places are not exactly nearby!)

## 9 pitfalls of data science [book review]

Posted in Books, Kids, Statistics, Travel, University life with tags Austria, book review, CHANCE, Germany, jiu-jitsu, lotus, OUP, Oxford University Press, poker, Salzburg, The Book of Why, Theranos, train travel, USA on September 11, 2019 by xi'an

**I** received The 9 pitfalls of data science by Gary Smith [who has written a significant number of general public books on personal investment, statistics and AIs] and Jay Cordes from OUP for review a few weeks ago and read it on my trip to Salzburg. This short book contains a lot of anecdotes and what I would qualify as small talk on job experiences and colleagues’ idiosyncrasies… More fundamentally, it reads as a sequence of examples of bad or misused statistics, as many general public books on statistics do, but with little to say on how to spot such misuses of statistics. Its title (it seems like *the 9 pitfalls of…* is a rather common début for a book title!) however started a (short) conversation with my neighbour on the train to Salzburg, as she wanted to know if the job opportunities in data science were better in Germany than in Austria. A practically important question about which I had no clue. And I do not think the book would have helped either! (My neighbour in the earlier plane to München had a book on growing lotus, which was not particularly enticing for launching a conversation either.)

Chapter I “*Using bad data*” is made of examples of truncated or cherry-picked data, often associated with poor graphics. Only one-dimensional outcomes and also very US-centric. Chapter II “*Data before theory*” highlights spurious correlations and post hoc predictions, with criticism of data mining, some examples being quite standard. Chapter III “*Worshiping maths*” sounds like the perfect opposite of the previous chapter: it discusses the fact that all models are wrong but some may be more wrong than others. And gives examples of overfitting, p-value hacking, and regression applied to longitudinal data. With the message that (maths) assumptions are handy and helpful but not always realistic. Chapter IV “*Worshiping computers*” is about the new golden calf and contains rather standard stuff on trusting the computer output because it is a machine. However, the book somewhat falls foul of the same mistake by trusting a Monte Carlo simulation of a shortfall probability for retirees, since Monte Carlo also depends on a model! Computer simulations may be fine for Bingo night or poker tournaments but are much more uncertain for complex decisions like retirement investments. It is also missing the biasing aspects in constructing recidivism prediction models pointed out in Weapons of math destruction. Until Chapter 9 at least. The chapter also mentions adversarial attacks, if not GANs (!). Chapter V “*Torturing data*” mentions famous cheaters like Wansink of the bottomless bowl and pizza papers and contains more about p-hacking and reproducibility. Chapter VI “*Fooling yourself*” is a rather weak chapter in my opinion. Apart from Ioannidis’ take on Theranos’ lack of scientific backing, it spends quite a lot of space on stories about poker gains in the unregulated era of online poker, with boasts of significant gains that are possibly earned from compulsive gamblers playing their family savings, which is not particularly praiseworthy. And about Brazilian jiu-jitsu.
Chapter VII “*Correlation vs causation*” predictably mentions Judea Pearl (whose Book of Why I just could not finish after reading one rant too many about statisticians being unable to get causality right! Especially after discussing the book with Andrew.). But there is not so much to gather from the chapter, which could have instead delved into deep learning and its ways to avoid overfitting. The first example of this chapter is more about confusing conditionals (what is conditional on what?) than turning causation around. Chapter VIII “*Regression to the mean*” sees Galton’s quincunx reappearing here after Pearl’s book, where I learned (and checked with Steve Stigler) that the device was indeed intended for that purpose of illustrating regression to the mean. While the attractive fallacy is worth pointing out, there are much worse abuses of regression that could be presented. CHANCE’s Howard Wainer also makes an appearance, along with SAT scores. Chapter IX “*Doing harm*” does engage with the issue that predicting social features like recidivism by a (black box) software is highly worrying (and just plain wrong) if only because of this black box nature. Moving predictably to chess and go, with the right comment that this does not say much about real data problems. A word of warning about DNA testing containing very little about ancestry, if only because of the company’s limited and biased database. With further calls for data privacy and a rather useless entry on North Korea. Chapter X “*The Great Recession*”, which discusses the subprime scandal (as in Stewart’s book), contains a set of (mostly superfluous) equations from Samuelson’s paper (supposed to scare or impress the reader?!) leading to the rather obvious result that the expected concave utility of a weighted average of iid positive rvs is maximal when all the weights are equal, a result that is criticised by laughing at the assumption of iid-ness in the case of mortgages.
Along with those who bought exotic derivatives whose construction they could not understand. The (short) chapter keeps going through all the (a posteriori) obvious ingredients for a financial disaster to link them to most of the nine pitfalls. Except the second about data before theory, because there was no data, only theory with no connection with reality. This final chapter is rather enjoyable, if coming after the fact. And containing this altogether unnecessary mathematical entry. *[Usual warning: this review or a revised version of it is likely to appear in CHANCE, in my book reviews column.]*

## MaxEnt 2019 [last call]

Posted in pictures, Statistics, Travel with tags conference, Garching, Germany, MaxEnt 2019, maximum entropy, München, O'Bayes 2019, Warwick on April 30, 2019 by xi'an

**F**or those who definitely do *not* want to attend O’Bayes 2019 in Warwick, the MaxEnt 2019 conference is taking place at the Max Planck Institute for plasma physics in Garching, near Munich, (south) Germany at the same time. Registration is still open at a reduced rate and it is not yet too late to submit an abstract. A few hours left. (While it is still possible to submit a poster for O’Bayes 2019 and to register.)

## trip to the past

Posted in Books, pictures with tags cavalry, D Day, Dragons regiment, first World War, French army, Germany, Gustrow, refugees, war prisonner, WW I on January 6, 2019 by xi'an

**W**hen I visited my mother for the Xmas break, she showed me this picture of her grand-father, Médéric, in his cavalry uniform, taken before the First World War, in 1905. During the war, as an older man, he did not come close to the front lines, but died from a disease caught from the horses he was taking care of. Two other documents I had not seen before were these refugee cards that my grand-parents got after their house in Saint-Lô was destroyed on June 7, 1944.

And this receipt for the tinned rabbit meat packages my grand-mother was sending to a brother-in-law who was a POW in Gustrow, Germany, a receipt that she kept despite the hardships she faced in the years following the D Day landing.

## Max Ent at Max Planck

Posted in Statistics with tags Bayesian inference, Carl Friedrich Gauss, conference, Gauß, Germany, Max Planck Institute, MaxEnt 2019, maximum entropy, München, O'Bayes 2019, University of Warwick on December 21, 2018 by xi'an

## truncated Gumbels

Posted in Books, Kids, pictures, Statistics with tags Columbia University, cross validated, Emil Julius Gumbel, extreme value theory, Germany, Gumbel distribution, Heidelberg, Weimar on April 6, 2018 by xi'an

**A**s I had to wake up pretty early on Easter morning to give my daughter a ride, while waiting I came upon this calculus question on X validated of computing the conditional expectation of a Gumbel variate, conditional on its drifted version being larger than another independent Gumbel variate with the same location-scale parameters. (Just reminding readers that a Gumbel G(0,1) variate is a double log-uniform, i.e., can be generated as X=-log(-log(U)).) And found after a few minutes (and a call to the Wolfram Alpha integrator) that

$$\mathbb{E}[X \mid X + c > Y] = \gamma + \log(1 + e^{-c})$$

(for independent G(0,1) variates X and Y, a drift c, and γ the Euler–Mascheroni constant),

which is simple enough to make me wonder if there is a simpler derivation than the call to the exponential integral Ei(x) function. (And easy to check by simulation.)
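
The simulation check can be sketched as below. This is only a minimal sketch under stated assumptions: it takes both variates to be standard G(0,1), writes the drift as `c`, and assumes that the closed form being checked is γ + log(1 + e^{-c}), with γ the Euler–Mascheroni constant. The function names are made up for the illustration.

```python
import numpy as np


def conditional_mean_mc(c, n=1_000_000, seed=0):
    """Monte Carlo estimate of E[X | X + c > Y] for independent G(0,1) X and Y."""
    rng = np.random.default_rng(seed)
    # Gumbel draws via the double-log transform X = -log(-log(U))
    x = -np.log(-np.log(rng.random(n)))
    y = -np.log(-np.log(rng.random(n)))
    # keep only the draws where the drifted X exceeds Y
    keep = x + c > y
    return x[keep].mean()


def conditional_mean_closed_form(c):
    """Assumed closed form: gamma + log(1 + exp(-c))."""
    return np.euler_gamma + np.log1p(np.exp(-c))


# compare both versions for a few (arbitrary) values of the drift
for c in (-1.0, 0.0, 2.0):
    print(c, conditional_mean_mc(c), conditional_mean_closed_form(c))
```

With a million draws the two columns should agree to about two decimal places; a general location-scale version would only require rescaling the variates and the drift.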

Incidentally, I discovered that Emil Gumbel had applied statistical analysis to the study of four years of political murders in the Weimar Republic, demonstrating the huge bias of the local justice system in favour of right-wing murderers. When he signed the urgent call [for the union of the socialist and communist parties] against fascism in 1932, he got expelled from his professor position in Heidelberg and emigrated to France, which he had to leave again for the USA on the Nazi invasion in 1940, and where he became a professor at Columbia.