Archive for Book

back to the Bayesian Choice

Posted in Books, Kids, Statistics, University life on October 17, 2018 by xi'an

Surprisingly (or not?!), I received two requests about some exercises from The Bayesian Choice, one from a group of students from McGill having difficulties solving the above, wondering about the properness of the posterior (but missing the integration of x), to whom I sent back this correction. And another one from the Czech Republic about a difficulty with the term “evaluation” by which I meant (pardon my French!) estimation.

ABC intro for Astrophysics

Posted in Books, Kids, Mountains, R, Running, Statistics, University life on October 15, 2018 by xi'an

Today I received in the mail a copy of the short book published by EDP Sciences after the courses we gave last year at the astrophysics summer school, in Autrans. Which contains a quick introduction to ABC extracted from my notes (which I still hope to turn into a book!). As well as a longer coverage of Bayesian foundations and computations by David Stenning and David van Dyk.

Is that a big number? [book review]

Posted in Books, Kids, pictures, Statistics on July 31, 2018 by xi'an

A book I received prior to its publication a few days ago from Oxford University Press (OUP), as a book editor for CHANCE (usual provisions apply: the contents of this post will be more or less reproduced in my column in CHANCE when it appears). Copy that I found in my mailbox in Warwick last week and read over the (very hot) weekend.

The overall aim of this book by Andrew Elliott is to encourage numeracy (or fight innumeracy) by making sense of absolute quantities by putting them in perspective, teaching about log scales, visualisation, and divide-and-conquer techniques. And providing a massive list of examples and comparisons, sometimes for page after page… The book is associated with a fairly rich website, itself linked with the many blogs of the author and a myriad of other links and items of information (among which I learned of the recent and absurd launch of Elon Musk’s Tesla car in space! A première in garbage dumping…). From what I can gather from these sites, some (most?) of the material in the book seems to have emerged from the various blog entries.

“Length of River Thames (386 km) is 2 x length of the Suez Canal (193.3 km)”

Maybe I was too exhausted by heat and a very busy week in Warwick for our computational statistics week, the football 2018 World Cup having nothing to do with this, but I could not keep reading the chapters of the book in a continuous manner, suffering from massive information overdump! Being given thousands of entries kills [for me] the appeal of putting weight or sense to large and very large and humongous quantities. And the final vignette in each chapter of pairing of numbers like the one above or the one below

“Time since earliest writing (5200 y) is 25 x time since birth of Darwin (208 y)”

only evokes the remote memory of some kids’ journal I read from time to time as a kid with this type of entries (I cannot remember the name of the journal!). Or maybe it was a journal I would browse while waiting at the hairdresser’s (which brings back memories of endless waits, maybe because I did not like going to the hairdresser…). Some of the background about measurement and other curios carries a sense of Wikipediesque absolute in its minute details.

A last point of disappointment about the book is the poor graphical design or support. While the author insists on the importance of visualisation for grasping the scales of large quantities, and the webpage is full of such entries, there is very little backup with great graphs to be found in “Is that a big number?” Some of the pictures seem taken from an anonymous databank (where are the towers of San Gimignano?!) and there are not enough graphics. For instance, the fantastic graphics of xkcd, such as the xkcd money chart poster. Or the one about the future. Or many, many others.

While the style is sometimes light and funny, an overall impression of dryness remains and in comparison I much preferred Kaiser Fung’s Numbers rule your world and even more both Guesstimation books!

resampling methods

Posted in Books, pictures, Running, Statistics, Travel, University life on December 6, 2017 by xi'an

A paper that was arXived [and that I missed!] last summer is a work on resampling by Mathieu Gerber, Nicolas Chopin (CREST), and Nick Whiteley. Resampling is used to sample from a weighted empirical distribution and to correct for very small weights in a weighted sample that otherwise lead to degeneracy in sequential Monte Carlo (SMC). Since this step is based on random draws, it induces noise (while improving the estimation of the target). Reducing this noise is preferable, hence the appeal of replacing plain multinomial sampling with more advanced schemes. The initial motivation is for sequential Monte Carlo where resampling is rife and seemingly compulsory, but this also applies to importance sampling when considering several schemes at once. I remember discussing alternative schemes with Nicolas, then completing his PhD, as well as Olivier Cappé, Randal Douc, and Eric Moulines at the time (circa 2004) we were working on the Hidden Markov book. And getting then a somewhat vague idea as to why systematic resampling failed to converge.

In this paper, Mathieu, Nicolas and Nick show that stratified sampling (where a uniform is generated on every interval of length 1/n) enjoys some form of consistency, while systematic sampling (where the “same” uniform is generated on every interval of length 1/n) does not necessarily enjoy this consistency. There actually exist cases where convergence does not occur. However, a residual version of systematic sampling (where systematic sampling is applied to the residuals of the decimal parts of the n-enlarged weights) is itself consistent.
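
As a rough illustration of the two schemes described above (a minimal sketch, not the authors' code), both can be written as an inversion of the weighted empirical cdf at n points, one per interval of length 1/n. The function names below are made up for the occasion, and the only difference between the two schemes is whether a fresh uniform or the same uniform is used on every interval.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def _inverse_cdf(weights, points):
    """Map points in [0, 1] to particle indices via the weighted empirical cdf."""
    total = sum(weights)
    cdf = list(accumulate(w / total for w in weights))
    last = len(weights) - 1
    # min(...) guards against floating-point round-off at the right end of the cdf
    return [min(bisect_right(cdf, u), last) for u in points]


def stratified_resample(weights, rng=random):
    """Stratified scheme: a fresh uniform on every interval [i/n, (i+1)/n)."""
    n = len(weights)
    points = [(i + rng.random()) / n for i in range(n)]
    return _inverse_cdf(weights, points)


def systematic_resample(weights, rng=random):
    """Systematic scheme: the same uniform shifted to every interval [i/n, (i+1)/n)."""
    n = len(weights)
    u = rng.random()
    points = [(i + u) / n for i in range(n)]
    return _inverse_cdf(weights, points)


if __name__ == "__main__":
    rng = random.Random(42)  # hypothetical seed, for reproducibility only
    w = [0.1, 0.2, 0.3, 0.4]
    print(stratified_resample(w, rng))
    print(systematic_resample(w, rng))
```

Both functions return the indices of the selected particles, in non-decreasing order. The weights need not be normalised since the helper divides by their sum.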

The paper also studies the surprising feature uncovered by Kitagawa (1996) that stratified sampling applied to an ordered sample brings an error of O(1/n²) between the cdfs rather than the usual O(1/n). It took me a while to even understand the distinction between the original and the ordered version (maybe because Nicolas used the empirical cdf during his SAD (Stochastic Algorithm Day!) talk, an ecdf that is the same for ordered and initial samples). And both systematic and deterministic sampling become consistent in this case. The result was shown in dimension one by Kitagawa (1996) but extends to larger dimensions via the magical trick of the Hilbert curve.

mea culpa!

Posted in Books, Kids, R, Statistics, University life on October 9, 2017 by xi'an

An entry about our Bayesian Essentials book on X validated alerted me to a typo in the derivation of the Gaussian posterior..! When deriving the posterior (which was left as an exercise in the Bayesian Core), I just forgot the term expressing the divergence between the prior mean and the sample mean. Mea culpa!!!
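
For illustration, a term of this kind appears when completing the square in μ. The following is only a sketch under assumed notation, namely a normal sample x_1,…,x_n with mean μ and variance σ², and a conjugate N(μ_0, σ²/λ) prior on μ given σ². The symbols λ, μ_0, μ_n and s² are assumptions that may not match the notation of the book.

```latex
\begin{align*}
\sum_{i=1}^n (x_i-\mu)^2 + \lambda(\mu-\mu_0)^2
  &= (n+\lambda)(\mu-\mu_n)^2 + s^2
     + \frac{n\lambda}{n+\lambda}\,(\bar{x}-\mu_0)^2, \\
\mu_n &= \frac{n\bar{x}+\lambda\mu_0}{n+\lambda}, \qquad
s^2 = \sum_{i=1}^n (x_i-\bar{x})^2 .
\end{align*}
```

The last term on the first line is the one measuring the divergence between the prior mean and the sample mean. In this assumed setting, and with an inverse gamma prior on σ², it ends up in the scale of the posterior on σ², which is why dropping it changes the result.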

LaTeX issues from Vienna

Posted in Books, Statistics, University life on September 21, 2017 by xi'an

When working on the final stage of our edited handbook on mixtures, in Vienna, I came across unexpected practical difficulties! One was that by working on Dropbox with Windows users, file and directory names suddenly switched from upper-case to lower-case letters!, making hard-wired paths to figures and subsections void in the numerous LaTeX files used for the book. And forcing us to change to lower case everywhere. Having not worked under Windows since George Casella gave me my first laptop in the mid 90’s!, I am amazed that this inability to handle both upper- and lower-case names is still an issue. And that Dropbox replicates it. (And that some people see that as a plus.)

The other LaTeX issue that took a while to solve was that we opted for one chapter, one bibliography, rather than having a single bibliography at the end of the book, mainly because CRC Press asked for this feature in order to sell chapters individually… This was my first encounter with this issue and I found the solutions to produce individual bibliographies incredibly heavy-handed, whether through chapterbib or bibunits, since one has to bibtex one .aux file for each chapter. Even with a one-line bash command,

for f in bu*.aux; do bibtex "$(basename "$f" .aux)"; done

this is annoying in the extreme!

zurück nach Wien

Posted in pictures, Running, Statistics, Travel, University life, Wines on September 16, 2017 by xi'an

Today, I am travelling to Vienna for a few days, primarily for assessing a grant renewal for a research consortium federating most Austrian research groups on a topic for which Austria is a world-leader. (Sorry for being cryptic but I am unsure how much I can disclose about this assessment!) And taking advantage of being in Vienna, for a two-day editing session with Sylvia Frühwirth-Schnatter and Gilles Celeux on our Handbook of mixtures analysis project. Which started a few years ago with another meeting in Vienna. And taking further advantage of being in Vienna, for an evening at the Volksoper, conveniently playing Die Zauberflöte!