Archive for the Books Category

Hugo 2021 nominations

Posted in Books with tags , , , , , , , , , , on June 13, 2021 by xi'an

I received an email from Tor about their books shortlisted for the Hugo Awards this year, which made me check the nominated novels (as there was little chance I had read novellas, novelettes, or short stories in the other lists, except those by P. Djèlí Clark who did win the Nebula last week!):

Of which I have only read the [great] Network Effect from the Murderbot series, but with Muir’s, Clarke’s and Kowal’s opera on my reading list.

journal of the [second] plague year [away]

Posted in Books, Kids, Mountains, pictures, Travel, University life, Wines with tags , , , , , , , , , , , , , , , on June 12, 2021 by xi'an

Read Fred Vargas’s Seeking Whom He May Devour (L’Homme à l’envers), which I found on a bookshelf of our vacation rental in Annecy. And got more quickly bored by the story as it is plagued with the same defects as the ones I read before, from a definitive issue with Canadians (!), to an attempt to bring supernatural causes in the story and reveal them as fake by the end of the book, to a collection of implausible and caricaturesque characters surrounded by the usual backcountry morons that would rather fit a Paasilinna novel, and to the incomprehensible intuitions of Inspector Adamsberg. I also went through the sequel to Infomocracy, Null states, albeit this was a real chore as it lacked substance and novelty (the title by itself should have been a warning!).

Watched Night in Paradise (낙원의 밤), another Korean gangster movie, which seems to repeat the trope of bad-guy-on-the-run-meets-lost-girl found in my previously watched Korean Jo-Phil: The Dawning Rage, where the main character, a crooked police officer is radically impacted after failing to save a lost teenager.  (And also in the fascinating The Wild Goose Lake.) The current film is stronger however in creating the bond between the few-words gangster on the run and the reluctant guest Jae-yeon who is on a run of a different magnitude. While the battle scenes are still grand-guignolesque (if very violent) in a Kill Bill spirit, and the gang leaders always caricaturesque, the interplay between the main characters makes Night in Paradise a pretty good film (and explains why it got selected for the Venice Film Festival of 2020). Also went through the appalling Yamakasi by Luc Besson, a macho, demagogical, sexist, simplist, non-story…

more air [&q’s] for MCMC [comments]

Posted in Books, pictures, Statistics with tags , , , , , , , , , on June 11, 2021 by xi'an

[A rich set of comments by Tom Loredo about convergence assessments for MCMC that I feel needs reposting:]

Two quick points:

  • By coincidence (and for a different problem), I’ve just been looking at the work of Gorham & Mackey that I believe Pierre is referring to. This is probably the relevant paper: “Measuring Sample Quality with Kernels“.
  • Besides their new rank-based R-hat, bloggers on Gelman’s blog have also pointed to another R-hat replacement, R, developed by some Stan team members; it is “based on how well a machine learning classifier model can successfully discriminate the individual chains.” See: “R: A robust MCMC convergence diagnostic with uncertainty using decision tree classifiers”.

In addition, here’s an anecdote regarding your comment, “I remain perplexed and frustrated by the fact that, 30 years later, the computed values of the visited likelihoods are not better exploited.”

That has long bothered me, too. During a SAMSI program around 2006, I spent time working on one approach that tried to use the prior*likelihood (I call it q(θ), for “quasiposterior” and because it’s next to “p”!) to compute the marginal likelihood. It would take posterior samples (from MCMC or another approach) and find their Delaunay triangulation. Then, using q(θ) on the nodes of the simplices comprising the triangulation, it used a simplicial cubature rule to approximate the integral of q(theta) over the volume spanned by the samples.

As I recall, I only explored it with multivariate normal and Student-t targets. It failed, but in an interesting way. It worked well in low dimensions, but gave increasingly poor estimates as dimension grew. The problem appeared related to concentration of measure (or the location of the typical set), with the points not sufficiently covering the center or the large volume in the tails (or both; I can’t remember what diagnostics said exactly).

Another problem is that Delaunay triangulation gets expensive quickly with growing dimension. This method doesn’t need an optimal triangulation, so I wondered if there was a faster sub-optimal triangulation algorithm, but I couldn’t find one.

An interesting aspect of this approach is that the fact that the points are drawn from the prior doesn’t matter. Any set of points is a valid set of points for approximating the integral (in the spanned volume). I just used posterior samples because I presumed those would be available from MCMC. I briefly did some experiments taking the samples, and reweighting them to draw a subset for the cubature that was either over- or under-dispersed vs. the target. And one could improve things this way (I can’t remember what choice was better). This suggests that points drawn from q(theta) aren’t optimal for such cubature, but I never tried looking formally for the optimal choice.

I called the approach “adaptive simplicial cubature,” adaptive in the sense that the points are chosen in a way that depends on the integrand.

The only related work I could find at the time was work by you and Anne Philippe on Riemanns sums with MCMC (https://doi.org/10.1023/A:1008926514119). I later stumbled upon a paper on “random Riemann sum estimators” as an alternative to Monte Carlo that seems related but that I didn’t explore further (https://doi.org/10.1016/j.csda.2006.09.041).

I still find it hard to believe that the q values aren’t useful. Admittedly, in an n-dimensional distribution, it’s just 1 more quantity available beyond the n that comprise the sample location. But it’s a qualitatively different type of information from the sample location, and I can’t help but think there’s some clever way to use it (besides emulating the response surface).

inf R ! [book review]

Posted in Books, R, Travel with tags , , , , , , , , , , , on June 10, 2021 by xi'an

Thanks to my answering a (basic) question on X validated involving an R code, R mistakes and some misunderstanding about Bayesian hierarchical modelling, I got pointed out to Patrick Burns’ The R inferno. This is not a recent book as the second edition is of 2012, with a 2011 version still available on-line. Which is the version I read. As hinted by the cover, the book plays on Dante’s Inferno and each chapter is associated with a circle of Hell… Including drawings by Botticelli. The style is thus most enjoyable and sometimes hilarious. Like hell!

The first circle (reserved for virtuous pagans) is about treating integral reals as if they were integers, the second circle (attributed to gluttons, although Dante’s is for the lustful) is about allocating more space along the way, as in the question I answered and in most of my students’ codes! The third circle (allocated here to blasphemous sinners, destined for Dante’s seven circle, when Dante’s third circle is to the gluttons) points out the consequences of not vectorising, with for instance the impressive capacities of the ifelse() function [exploited to the max in R codecolfing!].  And the fourth circle (made for the lustfuls rather than Dante’s avaricious and prodigals) is a short warning about the opposite over-vectorising. Circle five (destined for the treasoners, and not Dante’s wrathfuls) pushes for and advises about writing R functions. Circle six recovers Dante’s classification, welcoming (!) heretics, and prohibiting global assignments, in another short chapter. Circle seven (alloted to the simoniacs—who should be sharing the eight circle with many other sinners—rather than the violents as in Dante’s seventh) discusses object attributes, with the distinction between S3 and S4 methods somewhat lost on me. Circle eight (targeting the fraudulents, as in Dante’s original) is massive as it covers “a large number of ghosts, chimeras and devils”, a collection of difficulties and dangers and freak occurences, with the initial warning that “It is a sin to assume that code does what is intended”. A lot of these came as surprises to me and I was rarely able to spot the difficulty without the guidance of the book. Plenty to learn from these examples and counter-examples. Reaching Circle nine (where live (!) the thieves, rather than Dante’s traitors). A “special place for those who feel compelled to drag the rest of us into hell.” Discussing the proper ways to get help on fori. Like Stack Exchange. Concluding with the tongue-in-cheek comment that “there seems to be positive correlation between a person’s level of annoyance at [being asked several times the same question] and ability to answer questions.” This being a hidden test, right?!, as the correlation should be negative.

genuine Latinsquare rectangle

Posted in Books, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , on June 9, 2021 by xi'an