Archive for the Books Category

holy sister [book review]

Posted in Books on October 13, 2019 by xi'an

Third and last volume in Mark Lawrence's series, this book did not disappoint me, as conclusions so often do. Maybe because I was in a particularly serene mood after my month in Japan! The characters were the same, obviously, but had grown in depth and maturity, including the senior nuns who were previously somewhat caricatures of themselves. The superposition of two time lines helped with the story tension, as did the imminent destruction of the spatial apparatus keeping the planet from freezing, with some time spent under the Ice (although the notion of permanent tunnels there is rather unrealistic!), and the petty fantasy boarding-school stories had all but vanished (or remained with a purpose). There were also unpredictable twists and a whole new scale for the magical abilities of the characters, some sad deaths and happy survivals. While Lawrence somehow specializes in anti-heroes, the central character is very much redeemed of the blackness that could have attached to her, especially when [no spoiler!] occurs! The book is also so well connected with the previous two volumes that it would almost make re-reading them compulsory. If anything, this last volume could have benefited from being thicker!

ABC-SAEM

Posted in Books, Statistics, University life on October 8, 2019 by xi'an

In connection with the recent PhD thesis defence of Juliette Chevallier, in which I took a somewhat virtual part, being physically in Warwick, I read a paper she wrote with Stéphanie Allassonnière on stochastic approximation versions of the EM algorithm. Computing the MAP estimator can be achieved via versions of EM adapted to simulated annealing, possibly using MCMC, as for instance in the Monolix software and its MCMC-SAEM algorithm, where SA stands sometimes for stochastic approximation and sometimes for simulated annealing. The approach was originally developed by Gilles Celeux and Jean Diebolt, then reframed by Marc Lavielle and Éric Moulines [friends and coauthors], with an MCMC step because the simulation of the latent variables involves an intractable normalising constant. (Contrary to this paper, Umberto Picchini and Adeline Samson proposed in 2015 a genuine ABC version of this approach, a paper that I thought I had missed, although I now remember discussing it with Adeline at JSM in Seattle: ABC is used as a substitute for the conditional distribution of the latent variables given data and parameter, to be exploited in the Q step of the (SA)EM algorithm. One more approximation step and one more simulation step and we would reach a form of ABC-Gibbs!) In this version, very few assumptions are made on the approximation sequence, except that it converges with the iteration index to the true distribution (for a fixed observed sample) if convergence of ABC-SAEM is to happen. The paper takes as an illustrative sequence a collection of tempered versions of the true conditionals, but this is quite formal as I cannot fathom a setting where simulating from the tempered version is feasible but not from the untempered one. It is thus much more a version of tempered SAEM than truly connected with ABC (although a genuine ABC-EM version could be envisioned).
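
As a refresher on what an SAEM iteration looks like, here is a minimal R sketch of mine on a toy two-component Gaussian mixture (my own illustration, unrelated to the models and code of the paper): simulate the latent allocations, update the sufficient statistics by stochastic approximation, and re-estimate the means.

set.seed(1)
x <- c(rnorm(150, -2), rnorm(150, 3))  # toy sample, true means -2 and 3
n <- length(x); p <- 0.5               # known equal weights and unit variances
mu <- c(-1, 1)                         # starting values for the two means
S <- matrix(0, 2, 2)                   # running estimates of the sufficient statistics
for (k in 1:500) {
  gam <- 1 / k                         # decreasing stochastic approximation step size
  # simulation step: draw the latent allocations from their conditional distribution
  w <- p * dnorm(x, mu[1]) / (p * dnorm(x, mu[1]) + (1 - p) * dnorm(x, mu[2]))
  z <- rbinom(n, 1, w)
  # stochastic approximation step: shrink the statistics towards the simulated ones
  snew <- rbind(c(sum(z), sum(z * x)), c(sum(1 - z), sum((1 - z) * x)))
  S <- S + gam * (snew - S)
  # maximisation step: update the means from the approximated statistics
  mu <- S[, 2] / S[, 1]
}
mu                                     # should end up close to (-2, 3)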

what if what???

Posted in Books, Statistics on October 7, 2019 by xi'an

[Here is a section of the Wikipedia page on Monte Carlo methods which makes little sense to me. What if it was not part of this page?!]

Monte Carlo simulation versus “what if” scenarios

There are ways of using probabilities that are definitely not Monte Carlo simulations – for example, deterministic modeling using single-point estimates. Each uncertain variable within a model is assigned a “best guess” estimate. Scenarios (such as best, worst, or most likely case) for each input variable are chosen and the results recorded.[55]

By contrast, Monte Carlo simulations sample from a probability distribution for each variable to produce hundreds or thousands of possible outcomes. The results are analyzed to get probabilities of different outcomes occurring.[56] For example, a comparison of a spreadsheet cost construction model run using traditional “what if” scenarios, and then running the comparison again with Monte Carlo simulation and triangular probability distributions shows that the Monte Carlo analysis has a narrower range than the “what if” analysis. This is because the “what if” analysis gives equal weight to all scenarios (see quantifying uncertainty in corporate finance), while the Monte Carlo method hardly samples in the very low probability regions. The samples in such regions are called “rare events”.
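
For what it is worth, here is a quick R experiment of mine at reproducing the comparison on a toy two-component cost model (all numbers hypothetical), with a hand-rolled triangular sampler:

rtri <- function(n, a, b, m) {  # triangular sampler by inversion: min a, max b, mode m
  u <- runif(n)
  f <- (m - a) / (b - a)
  ifelse(u < f, a + sqrt(u * (b - a) * (m - a)),
         b - sqrt((1 - u) * (b - a) * (b - m)))
}
set.seed(1)
labor    <- rtri(1e4, 80, 150, 100)  # hypothetical cost component (min, max, mode)
material <- rtri(1e4, 40, 90, 60)    # another hypothetical cost component
total <- labor + material
c(best = 80 + 40, likely = 100 + 60, worst = 150 + 90)  # "what if" point scenarios
quantile(total, c(.025, .975))  # central 95% of the Monte Carlo outcomes, a narrower range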

Les Indes Fourbes [book review]

Posted in Books, pictures on October 6, 2019 by xi'an

Among the pile of Le Monde issues I found when I came back from Japan, I spotted a rather enthusiastic review of a bédé (comics) by Alain Ayroles and Juanjo Guarnido called Les Indes Fourbes, a pastiche continuation of the early 17th century picaresque Spanish novel El Buscón by Francisco de Quevedo. While picaresque novels often end up overwhelming the reader with their endless inclusion of stories within stories, this one is quite amazing and does not suffer from its length (200 pages) or from its constant plot switches, the ending being particularly terrific. Ayroles also wrote the De Cape et de Crocs series, which I enjoyed and which takes place at about the same time, inspired by Cyrano de Bergerac and his trip to the Moon. The drawings, by Guarnido, are quite elaborate and well in tune with the story. Definitely enjoyable!

Le Monde puzzle [#1112]

Posted in Books, Kids, R on October 3, 2019 by xi'an

Another low-key arithmetic problem as the current Le Monde mathematical puzzle:

Find the 16 integers x¹,x²,x³,x⁴,y¹,y²,y³,y⁴,z¹,z²,z³,z⁴,w¹,w²,w³,w⁴ such that each group x¹,y¹,z¹,w¹, etc., is made of distinct positive integers, the sum of the x's is 24, of the y's 20, of the z's 19, and of the w's 17. Furthermore, x¹<max(y¹,z¹,w¹), x²<max(y²,z²,w²), x³<max(y³,z³,w³)=z³, and x⁴<max(y⁴,z⁴,w⁴), while x¹<x².

There are thus 4 x 3 = 12 free unknowns once the four sum constraints are accounted for, all bounded by 20-3=17, or by 19-3=16 for x³, y³, and z³ since z³ is the maximum of its step. It is then a case for brute-force resolution, drawing quadruplets via rmultinom until all the conditions encoded in the following check are satisfied

valid <- function(x, y, z, w) {
  # z³ must be the maximum of the third step, x¹ < x², and the four sums must match
  (z[3] > max(y[3], w[3])) & (x[1] < x[2]) & (sum(x) == 24) &
    (sum(y) == 20) & (sum(z) == 19) & (sum(w) == 17)
}
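
A possible rejection driver around valid(), assuming uniform multinomial draws shifted to positive integers (my reconstruction, not the original code):

repeat {
  # four positive integers summing to 24, 20, 19, and 17 respectively
  x <- 1 + rmultinom(1, 24 - 4, rep(1/4, 4))
  y <- 1 + rmultinom(1, 20 - 4, rep(1/4, 4))
  z <- 1 + rmultinom(1, 19 - 4, rep(1/4, 4))
  w <- 1 + rmultinom(1, 17 - 4, rep(1/4, 4))
  if (valid(x, y, z, w)) break
}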

This scheme returns several solutions quickly. Meaning I misread the question and missed the constraint that the four values at each step are the same up to a permutation, decreasing the number of unknowns to four, a<b<c<d. And then to three, because the sum of the four is 20, the average of the four group sums. Since the sum of the x's is 24 and the x's only involve a, b, and c (as x is never the maximum of its step), 4c must exceed 24 (equality would force all four x's equal to c, contradicting x¹<x²), i.e., c>6, hence d>7, while a>0 and b>1, leaving only two degrees of freedom for choosing the four values, meaning that only

 1  2  7 10
 1  2  8  9
 1  3  7  9
 1  4  7  8
 2  3  7  8
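
As a sanity check (my own addition, not part of the original solution), these five quadruplets can be recovered by direct enumeration:

# all quadruplets a<b<c<d of positive integers with a+b+c+d=20 and c>6
quads <- subset(expand.grid(a = 1:17, b = 1:17, c = 7:17, d = 8:17),
                a < b & b < c & c < d & a + b + c + d == 20)
quads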

are possible. Sampling at random across these possible choices and allocating the numbers at random to x,y,z, and w leads rather quickly to the solution

     [,1] [,2] [,3] [,4]
[1,]    3    7    7    7
[2,]    2    8    2    8
[3,]    7    2    8    2
[4,]    8    3    3    3
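
A possible, if inefficient, implementation of this random allocation (my reconstruction, not the original code; since the displayed solution reuses a single quadruplet across the four steps, each attempt draws one quadruplet and permutes it within each step, and the rejection loop may take a minute or so):

sols <- matrix(c(1,2,7,10, 1,2,8,9, 1,3,7,9, 1,4,7,8, 2,3,7,8), nrow = 4)
repeat {
  q <- sols[, sample(5, 1)]                 # pick one admissible quadruplet
  M <- sapply(1:4, function(i) sample(q))   # rows x, y, z, w; columns are the steps
  if (all(rowSums(M) == c(24, 20, 19, 17)) &&    # the four group sums
      all(M[1, ] < apply(M[2:4, ], 2, max)) &&   # x below each step maximum
      M[3, 3] == max(M[2:4, 3]) &&               # z³ is the third-step maximum
      M[1, 1] < M[1, 2]) break                   # x¹ < x²
}
M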

from here to infinity

Posted in Books, Statistics, Travel on September 30, 2019 by xi'an

“Introducing a sparsity prior avoids overfitting the number of clusters not only for finite mixtures, but also (somewhat unexpectedly) for Dirichlet process mixtures which are known to overfit the number of clusters.”

On my way back from Clermont-Ferrand, in an old train that reminded me of my previous ride on that line, which took place in… 1975!, I read a fairly interesting paper published in Advances in Data Analysis and Classification by [my Viennese friends] Sylvia Frühwirth-Schnatter and Gertrud Malsiner-Walli, where they describe how sparse finite mixtures and Dirichlet process mixtures can achieve similar results when clustering a given dataset, provided the hyperparameters in both approaches are calibrated accordingly. In both cases these hyperparameters (the scale of the Dirichlet process mixture versus the scale of the Dirichlet prior on the weights) are endowed with Gamma priors, both depending on the number of components in the finite mixture. Another interesting feature of the paper is to witness how close the related MCMC algorithms are when exploiting the stick-breaking representation of the Dirichlet process mixture, with a resolution of the label-switching difficulties via a point process representation and k-means clustering in the parameter space. [The title of the paper is inspired from Ian Stewart's book.]
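
To get a feel for the calibration at stake, here is a quick R sketch of mine comparing the two weight priors, a symmetric Dirichlet on the weights of a finite mixture versus the (truncated) stick-breaking weights of a Dirichlet process, with hypothetical Gamma hyperprior values rather than those of the paper:

set.seed(1)
K <- 20                                  # number of components / truncation level
e0    <- rgamma(1, 1, 20)                # hypothetical hyperprior on the Dirichlet scale
alpha <- rgamma(1, 2, 4)                 # hypothetical hyperprior on the DP concentration
g <- rgamma(K, e0)                       # Dirichlet(e0,...,e0) weights via Gamma draws
w_fin <- g / sum(g)
v <- rbeta(K, 1, alpha)                  # stick-breaking, truncated at K sticks
w_dp <- v * cumprod(c(1, 1 - v[-K]))
c(finite = sum(w_fin > 0.01), dp = sum(w_dp > 0.01))  # "active" components under each prior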

Prussian blue [book review]

Posted in Books, Travel on September 28, 2019 by xi'an

This is the one-before-last volume in Philip Kerr's Bernie Gunther series (one-before-last since the author passed away last year). I picked it in a local bookstore because it takes place in Berchtesgaden, a few kilometres from Salzburg, which I passed on my way there (and back) last week. Very good title, full of double meanings!

“When you’re working for people who are mostly thieves and murderers, a little of it comes off on your hands now and then.”

Two time-lines run in parallel in Prussian Blue, from 1939 Nazi Germany to 1956 France, from (mostly) hunter to hunted. There are plenty of wisecracks worth quoting throughout the book, mostly à la Marlowe, but also singling out Berlin(ers) from the rest of Germany. Bernie Gunther is an anti-hero if there ever was one, working as a policeman for the Nazi State, aiming to uphold the law in a lawless era and to catch murderers at a time when the highest officials were all murderers, about to take this qualification to a scale never envisioned before. He is still working under Heydrich's orders to solve a murder despite the interference of other arch-villains like Martin Bormann and Ernst Kaltenbrunner, as well as a helpful (if Hitler-supporting!) Gerdy Troost. Among the Gunther novels I have read so far, this one brings him closest to the ultimate evil, Hitler himself, who considered the Berghof in Berchtesgaden his favourite place, although the two never meet. The gratuitous violence and bottomless corruption inherent to the fascist regime are most realistically rendered in the thriller, to the point of making the very possibility of a Bernie Gunther debatable!

‘Making a nuisance of yourself is what being a policeman is all about and suspecting people who were completely above suspicion was about the only thing that made doing the job such fun in Nazi Germany.’

As I kept reading the book I could not but draw a connection with the pre-war novel Rogue Male, imperfect but nonetheless impressive, where an English "sport" hunter travels to Berchtesgaden to shoot (or at least aim at) Hitler, only to get spotted by soldiers before committing the act and to become hunted in his turn throughout Europe, ending up [spoiler!] trapped by Nazi secret services in a burrow [well, this is not exactly the end!]. This connection has been pointed out in some reviews, but the role of the burrows and oppressive underground, and the complicity of the local police forces, are strongly present in both books, which somewhat decreases the appeal of this novel. Especially since the 1956 thread is a much less convincing plot than the 1939 one, despite involving conveniently forgotten old colleagues, the East German Stasi, hopeless French policemen and clergymen, the Saar referendum, and [much maligned!] andouillettes and onions.