a journal of the plague and pestilence [and war] year

Posted in Books, Kids, Mountains, pictures, Travel, University life on May 28, 2022 by xi'an

Received my first biking ticket ever, not for [cautiously!] crossing one of the 60⁺ red lights on my bike route but for riding [most respectfully!] on the sidewalk in order to reach Dauphine, as roads are currently under construction in the area, traffic is a mess, and bike lanes are closed. Had I realised this was at all possible (considering the absence of sanctions on reckless car and truck drivers!), I would have stopped before reaching the Paris traffic police, who were already ticketing another cyclist.

Read Upright Women Wanted [on Kindle, a courtesy gift from Tor] for just a few dozen pages and then almost gave up out of boredom! I found it of limited literary or scenaristic interest, despite its nominations for both the Hugo and Locus Awards 2021, but finished it on the train to Roissy airport… I am still stuck (and much disappointed!) on the first pages of Susanna Clarke’s Piranesi, as the story (?) takes place in an endless complex of empty rooms and the descriptions are endless. By comparison, the growing madness seeping through the Gormenghast series at least provides a guiding thread that makes it worth reading! Although it won the 2021 Women’s Prize for Fiction, and was praised everywhere and nominated for many prizes, imho, Piranesi stands as far as possible from Clarke’s earlier masterpiece Jonathan Strange & Mr. Norrell… I do not think I will ever manage to finish this book!

Cooked a batch of kouign amann but failed to include enough butter! Still eatable. And made a rather successful attempt at tortillas, following a NYT recipe.

Watched Witch at Court (마녀의 법정), which proposes a [of course] highly unrealistic story of an evil policeman turned politician who is eventually confronted with his crimes by the daughter of one of his early victims. As often in K-drama, everyone is connected to the case, with prosecutors being relatives of victims or culprits (but not bothered by conflicts of interest), red herrings abounding, and trial outcomes being decided on the flimsiest evidence. Nonetheless, of all the series I (fast-forward) watched, this is the one that most frontally addressed the exploitation of women and sexual crimes.

Bayesian sampling without tears

Posted in Books, Kids, R, Statistics on May 24, 2022 by xi'an

Following a question on Stack Overflow trying to replicate a figure from the paper written by Alan Gelfand and Adrian Smith (1990) for The American Statistician, Bayesian sampling without tears, which precedes their historical MCMC papers, I looked at the R code produced by the OP and could not spot an issue as to why their simulation did not fit the posterior produced in the paper. The paper proposes acceptance-rejection and sampling-importance-resampling as two solutions to approximately simulate from the posterior, the latter being illustrated by simulations from the prior being weighted by the likelihood… The illustration is made of 3 observations from the sum of two Binomials with different success probabilities, θ₁ and θ₂, with a Uniform prior on both.

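# importance weights: l[i] is the likelihood at theta[i,], a draw from the Uniform prior
# (N, n1, n2, y and the N x 2 matrix theta are assumed to be already defined)
l<-rep(1,N)
for (i in 1:N)
  for (k in 1:3){
    llh<-0
    # j is the contribution of the first Binomial, y[k]-j that of the second,
    # hence max(0,y[k]-n2[k]) <= j <= min(y[k],n1[k])
    for (j in max(0,y[k]-n2[k]):min(y[k],n1[k]))
      llh<-llh+choose(n1[k],j)*choose(n2[k],y[k]-j)*
        theta[i,1]^j*(1-theta[i,1])^(n1[k]-j)*theta[i,2]^(y[k]-j)*
        (1-theta[i,2])^(n2[k]-y[k]+j)
    l[i]=l[i]*llh}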
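The resampling step that turns these weights into an approximate posterior sample can be sketched as follows. This is only a minimal illustration: the data values for n1, n2 and y, the number of draws N, and the helper lik_k are assumptions made for the example, not taken from the OP's code.

```r
set.seed(42)
# data values assumed for illustration: three sums of two Binomials
n1 <- c(5, 6, 4); n2 <- c(5, 4, 6); y <- c(7, 5, 6)
N <- 1e4
theta <- matrix(runif(2 * N), ncol = 2)   # draws from the Uniform prior

# likelihood of observation y[k], evaluated at all prior draws at once
# (lik_k is a hypothetical helper name)
lik_k <- function(k) {
  j <- max(0, y[k] - n2[k]):min(y[k], n1[k])   # support of the first Binomial
  # one column per value of j, one row per prior draw
  terms <- vapply(j, function(jj)
    dbinom(jj, n1[k], theta[, 1]) * dbinom(y[k] - jj, n2[k], theta[, 2]),
    numeric(N))
  rowSums(terms)
}
l <- lik_k(1) * lik_k(2) * lik_k(3)       # importance weights

# sampling-importance-resampling: resample the prior draws
# with probabilities proportional to the likelihood
idx <- sample(N, N, replace = TRUE, prob = l)
post <- theta[idx, ]                      # approximate posterior sample
```

A scatterplot of post (or a two-dimensional density estimate) can then be compared with the figure in the paper.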


To double-check, I also wrote a Gibbs version:

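# Gibbs sampler on (x1,theta), x1[j] being the latent first Binomial within y[j]
# (T is the number of iterations; n1, n2, y as above)
theta=matrix(runif(2),nrow=T,ncol=2) # only the first row matters, as starting value
x1=rep(NA,3)
for(t in 1:(T-1)){
  for(j in 1:3){
    # support of x1[j] given y[j] (sample() below requires length(a)>1)
    a<-max(0,y[j]-n2[j]):min(y[j],n1[j])
    x1[j]=sample(a,1,
      prob=choose(n1[j],a)*choose(n2[j],y[j]-a)*
        theta[t,1]^a*(1-theta[t,1])^(n1[j]-a)*
        theta[t,2]^(y[j]-a)*(1-theta[t,2])^(n2[j]-y[j]+a)
    )}
  # conjugate Beta updates under the Uniform priors
  theta[t+1,1]=rbeta(1,sum(x1)+1,sum(n1)-sum(x1)+1)
  theta[t+1,2]=rbeta(1,sum(y)-sum(x1)+1,sum(n2)-sum(y)+sum(x1)+1)}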


which did not show any difference with the above. Nor with the likelihood surface.

brave [not!] new [not!] world

Posted in Books, Kids, Travel on May 23, 2022 by xi'an

“…the “central paradox” in the debate over the future of abortion: [14] States with the most restrictive abortion policies also show the weakest maternal and child health outcomes and are least likely to invest in at-risk populations.” The Commonwealth Fund, March 8

“In Louisiana, lawmakers are considering a proposal to classify ending a pregnancy at any point from the moment of fertilization as homicide. And the Idaho State Legislature may hold hearings on outlawing emergency contraceptives…” NYT, May 11

“Arizona enacted an abortion ban in cases of genetic indication, and South Dakota banned abortion if the fetus has Down syndrome.” Guttmacher Institute,

“Most of the 21 states with laws on the books that would “snap back” abortion restrictions if the court overturns Roe fall into the bottom half of state rankings on a wide array of measures tracking the well-being of children and families, including childhood poverty, low birth weight and premature births, access to health insurance for low-income mothers, availability of prenatal care and the share of kids enrolled in early childhood education…” CNN, December 14, 2021

“Six states banned providers from mailing the abortion medication to patients, and seven states either required the provider and patient to meet in person or banned the use of telehealth.” Guttmacher Institute,

“Arkansas also passed legislation in 2021 that would make abortion in the state an unclassified felony unless a procedure is undertaken to save the life of a pregnant woman.” Newsweek, May 20, 2021

“…in Alabama, legislation signed in 2019 bans the procedure at any stage of a pregnancy, with doctors facing the possibility of life imprisonment for performing one.” Newsweek, May 20, 2021

“Lawmakers in Missouri weighed legislation early this year that would allow individuals to sue anyone helping a patient cross state lines for an abortion (…) In Texas, a law passed last year made it illegal to ship medication for self-managed abortion, including across state lines” The Guardian, 5 May

Another harmonic mean

Posted in Books, Statistics, University life on May 21, 2022 by xi'an

Yet another paper that addresses the approximation of the marginal likelihood by a truncated harmonic mean, a popular theme of mine. A 2020 paper by Johannes Reich, entitled Estimating marginal likelihoods from the posterior draws through a geometric identity and published in Monte Carlo Methods and Applications.

The geometric identity it aims at exploiting is that

$m(x) = \dfrac{\int_A \,\text{d}\theta}{\int_A \dfrac{\pi(\theta|x)}{\pi(\theta)f(x|\theta)}\,\text{d}\theta}$

for any (positive volume) compact set $A$. This is exactly the same identity as in an earlier and uncited 2017 paper by Ana Pajor, with the also quite similar (!) title Estimating the Marginal Likelihood Using the Arithmetic Mean Identity and which I discussed on the ‘Og, linked with another 2012 paper by Lenk. Also discussed here. This geometric or arithmetic identity is again related to the harmonic mean correction based on a HPD region A that Darren Wraith and myself proposed at MaxEnt 2009. And that Jean-Michel and I presented at Frontiers of statistical decision making and Bayesian analysis in 2010.

In this avatar, the set A is chosen close to an HPD region, once more, with a structure that allows for an exact computation of its volume. Namely an ellipsoid that contains roughly 50% of the simulations from the posterior (rather than our non-intersecting union of balls centered at the 50% HPD points), which assumes a Euclidean structure of the parameter space (or, in other words, depends on the parameterisation). In the mixture illustration, the author surprisingly omits Chib’s solution, despite symmetrised versions avoiding the label (un)switching issues. What I do not get is how this solution gets around the label switching challenge in that set A remains an ellipsoid for multimodal posteriors, which means it either corresponds to a single mode [but then how can a simulation be restricted to a “single permutation of the indicator labels“?] or it covers all modes but also the unlikely valleys in-between.
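To make the identity concrete, here is a minimal sketch in R on a toy conjugate Normal model where m(x) is available in closed form. Everything in it (the model, the observation x, the number of draws nsim, the variable names) is an assumption for illustration and not taken from Reich's paper. The ellipsoid is simply taken as the Mahalanobis ball that holds half of the posterior draws, and the estimate follows from the posterior expectation of the indicator of A divided by π(θ)f(x|θ) being equal to vol(A)/m(x).

```r
set.seed(1)
d <- 2
x <- c(1, -0.5)        # single hypothetical observation
nsim <- 1e5
# toy model: theta ~ N(0, I) and x | theta ~ N(theta, I),
# hence theta | x ~ N(x/2, I/2) and m(x) is the N(0, 2I) density at x
theta <- matrix(rnorm(nsim * d, mean = rep(x / 2, each = nsim), sd = sqrt(0.5)),
                ncol = d)
logprior <- rowSums(dnorm(theta, log = TRUE))
loglik   <- rowSums(dnorm(sweep(theta, 2, x), log = TRUE))

# ellipsoid A = {theta: squared Mahalanobis distance <= r2},
# with r2 set so that A contains about 50% of the posterior draws
mu <- colMeans(theta)
S  <- cov(theta)
d2 <- mahalanobis(theta, mu, S)
r2 <- median(d2)
volA <- pi^(d / 2) / gamma(d / 2 + 1) * r2^(d / 2) * sqrt(det(S))

# truncated harmonic mean estimate of the marginal likelihood
inA  <- d2 <= r2
mhat <- volA / mean(inA * exp(-logprior - loglik))

mtrue <- prod(dnorm(x, sd = sqrt(2)))   # exact value for this toy model
c(estimate = mhat, exact = mtrue)
```

On a unimodal posterior like this one the integrand is bounded over A, which is what keeps the variance under control. The sketch says nothing about the multimodal mixture case discussed above.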

evidence estimation in finite and infinite mixture models

Posted in Books, Statistics, University life on May 20, 2022 by xi'an

Adrien Hairault (PhD student at Dauphine), Judith and I just arXived a new paper on evidence estimation for mixtures. This may sound like a well-trodden path that I have repeatedly explored in the past, but methinks that estimating the model evidence doth remain a notoriously difficult task for large-sample or many-component finite mixtures, and even more so for “infinite” mixture models corresponding to a Dirichlet process. The different Monte Carlo techniques advocated in the past, like Chib’s (1995) method, SMC, or bridge sampling, exhibit a range of performances in terms of computing time… One novel (?) approach in the paper is to write Chib’s (1995) identity for partitions rather than parameters, as it bypasses the label switching issue (as we already noted in Hurn et al., 2000), another one is to exploit Geyer’s (1991-1994) reverse logistic regression technique in the more challenging Dirichlet mixture setting, and yet another one is a sequential importance sampling solution à la Kong et al. (1994), as also noticed by Carvalho et al. (2010). [We did not cover nested sampling as it quickly becomes onerous.]
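As a reminder of what is being rewritten, Chib's identity is Bayes' formula evaluated at a fixed point. The notation below, with $\theta^*$ a fixed parameter value and $\mathcal{C}^*$ a fixed partition of the observations, is an assumption for illustration and not necessarily the notation of the paper. The parameter version and the partition version read

$m(x) = \dfrac{f(x|\theta^*)\,\pi(\theta^*)}{\pi(\theta^*|x)} \qquad\text{and}\qquad m(x) = \dfrac{f(x|\mathcal{C}^*)\,\pi(\mathcal{C}^*)}{\pi(\mathcal{C}^*|x)}$

where, in both cases, only the posterior ordinate in the denominator has to be estimated from the simulations. Since a partition is invariant under relabelling of the components, the second version is not affected by label switching.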

Applications are numerous. In particular, testing for the number of components in a finite mixture model, or against the fit of a finite mixture model for a given dataset, has long been and still is an issue of much interest and diverging opinions, albeit still missing a fully satisfactory resolution. Using a Bayes factor to find the right number of components K in a finite mixture model is known to provide a consistent procedure. We furthermore establish there the consistency of the Bayes factor when comparing a parametric family of finite mixtures against the nonparametric ‘strongly identifiable’ Dirichlet Process Mixture (DPM) model.