Archive for evidence

Natural nested sampling

Posted in Books, Statistics, University life on May 28, 2023 by xi'an

“The nested sampling algorithm solves otherwise challenging, high-dimensional integrals by evolving a collection of live points through parameter space. The algorithm was immediately adopted in cosmology because it partially overcomes three of the major difficulties in Markov chain Monte Carlo, the algorithm traditionally used for Bayesian computation. Nested sampling simultaneously returns results for model comparison and parameter inference; successfully solves multimodal problems; and is naturally self-tuning, allowing its immediate application to new challenges.”

I came across a review on nested sampling in the May 2022 issue of Nature Reviews Methods Primers, with a large number of contributing authors, some of whom I knew from earlier papers in astrostatistics. As illustrated by the above quote from the introduction, the tone is definitely optimistic about the capacities of the method, reproducing the original argument that the evidence is the expectation of the likelihood L(θ) under the prior. Which representation, while valid, does not translate into a dimension-free methodology, since the parameters θ still need to be simulated.

“Nested sampling lies in a class of algorithms that form a path of bridging distributions and evolves samples along that path. Nested sampling stands out because the path is automatic and smooth — compression along log X by, on average, 1/n at each iteration — and because the path is through constrained priors, rather than from the prior to the posterior. This was a motivation for nested sampling as it avoids phase transitions — abrupt changes in the bridging distributions — that cause problems for other methods, including path samplers, such as annealing.”

The elephant in the room is eventually processed, namely the simulation from the prior constrained to the likelihood level sets, which in my experience (with, e.g., mixture posteriors) proves most time consuming. This stems from the fact that these level sets are notoriously difficult to evaluate from a given sample: all points stand within the set but they hardly provide any indication of the boundaries of said set… Region sampling requires constructing a region that bounds the likelihood level set, which in turn requires some knowledge of the likelihood variations to have a chance of remaining efficient, incl. in cosmological applications, while regular MCMC steps require an increasing number of steps as the constraint gets tighter and tighter, for otherwise the move essentially amounts to duplicating a live particle.
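To make the mechanics concrete, here is a minimal nested-sampling sketch in Python (my own toy construction, not code from the review): a uniform prior on [-5,5], a standard Gaussian likelihood, and plain rejection sampling from the constrained prior, a choice only viable because the example is one-dimensional. The true evidence here is about 1/10.

```python
import numpy as np

rng = np.random.default_rng(0)

def loglik(theta):
    # toy Gaussian log-likelihood, centred at zero with unit variance
    return -0.5 * theta**2 - 0.5 * np.log(2 * np.pi)

n_live, n_iter = 100, 600
lo, hi = -5.0, 5.0                  # uniform prior on [-5, 5]
live = rng.uniform(lo, hi, n_live)  # live points drawn from the prior
ll = loglik(live)
logZ, logX = -np.inf, 0.0           # log evidence, log prior volume

for _ in range(n_iter):
    worst = np.argmin(ll)               # lowest-likelihood live point dies
    logX_new = logX - 1.0 / n_live      # average compression of log X
    logw = np.log(np.exp(logX) - np.exp(logX_new))  # width of the shell
    logZ = np.logaddexp(logZ, logw + ll[worst])     # evidence increment
    logX = logX_new
    # replace the dead point by a prior draw above the likelihood threshold;
    # this rejection step is exactly the bottleneck discussed above
    while True:
        prop = rng.uniform(lo, hi)
        if loglik(prop) > ll[worst]:
            live[worst], ll[worst] = prop, loglik(prop)
            break

# remaining contribution of the final live points
logZ = np.logaddexp(logZ, logX + np.log(np.mean(np.exp(ll))))
est = np.exp(logZ)   # should be close to 1/10
```

As the constraint tightens, the acceptance rate of the rejection step decays like the prior volume X, which is the one-dimensional shadow of the difficulty raised above for mixture posteriors and cosmological applications.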

inferring the number of components [remotely]

Posted in Statistics on October 14, 2022 by xi'an

day one at ISBA 22

Posted in pictures, Statistics, Travel, University life on June 29, 2022 by xi'an

Started the day with a much appreciated swimming practice in the [alas warm⁺⁺⁺] outdoor 50m pool on the Island with no one but me in the slooow lane. And had my first ride with the biXi system, surprised at having to queue behind other bikes at red lights! More significantly, it was a great feeling to reunite at last with so many friends I had not met for more than two years!!!

My friend Adrian Raftery gave the very first plenary lecture, on his work on the Bayesian approach to long-term population projections, a work recently censored by some US States, then counter-censored by the Supreme Court [too busy to kill Roe v. Wade!]. Great to see the use of Bayesian methods validated by the UN Population Division [with at least one branch of the UN…]

Steffen Lauritzen returning to de Finetti's notion of a model as something not real or true at all, back to exchangeability. Making me wonder when exchangeability is more than a convenient assumption leading to the Hewitt-Savage theorem. And sufficiency. I mean, without falling into a Keynesian fallacy, each point of the sample has unique specificities that cannot be taken into account in an exchangeable model. Nice to hear some measure theory, though!!! Plus a comment on the median never being sufficient, recouping an older (and presumably not original) point of mine. Steffen's (or Fisher's?) argument being that the median cannot be recursively computed!

Antonietta Mira and I had our ABC session this afternoon, with Cecilia Viscardi, Sirio Legramanti, and Massimiliano Tamborino (Warwick) as speakers. Cecilia linked ABC with normalising flows, in collaboration with Dennis Prangle (whose earlier paper on this connection was presented as the first One World ABC seminar). Thus using past simulations to approximate the posterior by a neural network, possibly with a significant increase in computing time when compared with more rudimentary SMC-ABC methods in larger dimensions. Sirio considered summary-free ABC based on discrepancies like the Rademacher complexity. Which more or less contains MMD, Kullback-Leibler, Wasserstein and more, although it seems to depend on the parameterisation of the observations. An interesting opening at the end was that this approach could apply to non-iid settings. Massi presented a paper coauthored with Umberto that had just been arXived, on sequential ABC with a dependence on the summary statistic (hence guided), further bringing copulas into the game, although this forces another choice [for the marginals] in the method.

Tamara Broderick talked about a puzzling leverage effect of some observations in economic studies, where a tiny portion of individuals may modify the significance or the sign of a coefficient, for which I cannot tell whether the data or the reliance on statistical significance are to blame. Robert Kohn presented mixture-of-Gaussian copulas [not to be confused with mixture of Gaussian-copulas!] and Nancy Reid concluded my first [and somewhat exhausting!] day at ISBA with a BFF talk on the different statistical paradigms' takes on confidence (for which the notion of calibration seems to remain frequentist).

Side comments: First, most people at the conference are wearing masks, which is great! Also, I find it hard to read slides from the screen, which I presume is an age issue (?!) Even more aside, I had a Korean lunch in a place that refused to serve me a glass of water, which I find amazing.

taking advantage of the constant

Posted in Books, Kids, pictures, R, Statistics, University life on May 19, 2022 by xi'an

A question from X validated had enough appeal for me to procrastinate about it for ½ an hour: what difference does it make [for simulation purposes] that a target density is properly normalised? In the continuous case, I do not see much to exploit in this knowledge, apart from the value potentially leading to a control variate (in a Gelfand and Dey, 1994, spirit) and possibly to a stopping rule (by checking that the portion of the space visited so far has mass close to one, but this is more delicate than it sounds).
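To make the control-variate remark concrete, here is a hedged Python sketch of my own (the densities and numbers are illustrative, not from the post): when the target density π is properly normalised, E_π[q(θ)/π(θ)] = 1 for any other proper density q, so q(θ)/π(θ) − 1 is an exact zero-mean control variate for Monte Carlo estimates under π.

```python
import numpy as np

rng = np.random.default_rng(1)

# properly normalised target: standard Gaussian density
def pi_pdf(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

# auxiliary proper density q, with lighter tails than pi
def q_pdf(x):
    return np.exp(-0.5 * (x / 0.8)**2) / (0.8 * np.sqrt(2 * np.pi))

x = rng.standard_normal(100_000)    # iid draws from pi
f = x**2                            # estimating E_pi[theta^2] = 1
h = q_pdf(x) / pi_pdf(x) - 1.0      # zero mean *because* pi is normalised

plain = f.mean()
beta = np.cov(f, h)[0, 1] / h.var() # near-optimal control-variate weight
cv = (f - beta * h).mean()          # same target, reduced variance
```

The correction leaves the estimate unbiased, and since h is a monotone function of θ² here, the variance reduction is substantial; without the normalising constant, the mean of q/π would be unknown and the trick unavailable.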

In a (possibly infinite) countable setting, it seems to me one gain (?) is that approximating expectations by Monte Carlo no longer requires iid simulations, in the sense that, once visited, atoms need not be visited again. Self-avoiding random walks and their generalisations thus appear as a natural substitute for MC(MC) methods in this setting, provided finding unexplored atoms proves manageable. For instance, a stopping rule is always available, namely that the cumulated weight of the visited fraction of the space is close enough to one. The above picture shows a toy example on a 500 x 500 grid with 0.1% of the mass remaining at the almost invisible white dots. (In my experiment, neighbours for the random exploration were chosen at random over the grid, as I assumed no global information was available about the distribution over the grid of either the mass function or the function whose expectation was sought.)
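A minimal sketch of the idea, assuming nothing from the original experiment beyond the setting (the target, the integrand, and the uniform proposal below are my own choices): since the mass function is normalised, an exploration that never revisits an atom can weight each atom exactly once and stop as soon as the cumulated visited mass exceeds 99.9%, matching the 0.1% of remaining mass in the picture.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# normalised target on a 500 x 500 grid: a discretised Gaussian blob
ii, jj = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
logp = -((ii - 250)**2 + (jj - 150)**2) / (2 * 40.0**2)
p = np.exp(logp)
p /= p.sum()                    # normalising constant assumed known

f = ii + jj                     # integrand: estimate E[i + j]
truth = (f * p).sum()

# unexplored atoms proposed uniformly at random, never revisited
order = rng.permutation(n * n)
mass, est = 0.0, 0.0
for k in order:
    i, j = divmod(int(k), n)
    mass += p[i, j]             # each atom weighted exactly once
    est += f[i, j] * p[i, j]
    if mass >= 0.999:           # stopping rule: 0.1% of mass unexplored
        break
```

The estimate is deterministic given the visited set (no averaging over repeated visits), and the missing mass bounds the error; the catch, as noted above, is that a uniform proposal has no clue where the unexplored mass sits, so the stopping time can be close to exhaustive.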

put the data aside [SCOTUS v. evidence]

Posted in Statistics on May 18, 2022 by xi'an

Reposted from a Nature editorial:

(…) Moving in the opposite direction runs contrary to 50 years of research from around the world showing that abortion access is a crucial component of health care and is important for women’s equal participation in society. After the Supreme Court agreed to hear Mississippi’s case last year, Nature covered some of this evidence, submitted to the court by US scientific societies and more than 800 US researchers in public health, reproductive health, social sciences and economics, in advance of the case’s hearing in December.

Some outcomes of outlawing abortion can be predicted by what’s known. Researchers expect overall infant and maternal health to decline in the United States in the wake of abortion bans, because more unintended pregnancies will be brought to term. Unintended pregnancies are associated with an increased risk of health problems for babies, and often for mothers, for several reasons — including reduced prenatal care.

Maternal health is also expected to decline overall. One straightforward reason is that the risks of dying from pregnancy-related causes are much greater than the risks of dying because of a legal abortion. A predicted rise in maternal mortality among Black women in the United States is particularly distressing, because the rate is already unacceptably high. In one study, sociologist Amanda Stevenson at the University of Colorado Boulder modelled a hypothetical situation in which abortions are banned throughout the United States, and found that the lifetime risk of dying from pregnancy-related causes for non-Hispanic Black women would rise from 1 in 1,300 to 1 in 1,000.

One claim made by abortion opponents in this case is that abortions no longer benefit women and even cause them harm, but dozens of studies contradict this. In just one, health economist Sarah Miller at the University of Michigan in Ann Arbor and her colleagues assessed around 560 women of comparable age and financial standing who sought abortions. They found that, five years after pregnancy, women who were denied the procedure had experienced a substantial increase in debt, bankruptcies, evictions and other dire financial events — whereas the financial standing of women who had received an abortion had remained stable or improved. A primary reason that women give for wanting an abortion is an inability to afford to raise the child, and this study suggests that they understand their own situations.

Abortion bans will extract an unequal toll on society. Some 75% of women who choose to have abortions are in a low income bracket and nearly 60% already have children, according to one court brief submitted ahead of the December hearing and signed by more than 150 economists. Travelling across state lines to receive care will be particularly difficult for people who do not have the funds for flights or the ability to take time off work, or who struggle to find childcare.

Unfortunately, some of the justices seem to be disregarding these data. At the December hearing, Julie Rikelman, a lawyer at the non-profit Center for Reproductive Rights, headquartered in New York City, brought up studies presented in the economists’ brief; Roberts interrupted her and suggested “putting that data aside”. In the leaked draft opinion, Alito also elides a body of research on abortion policy, writing that it’s “hard for anyone — and in particular for a court — to assess” the effect of the right to abortion on women’s lives.

Such an attitude suggests that the justices see research as secondary to the question of whether the US Constitution should protect abortion. But the outcome of this ruling isn’t an academic puzzle. The Supreme Court needs to accept that the consensus of research, knowledge and scholarship — the evidence on which societies must base their laws — shows how real lives hang in the balance. Already, the United States claims the highest rate of maternal and infant mortality among wealthy nations. Should the court overturn Roe v. Wade, these grim statistics will only get worse.
