Mark Beaumont on One World ABC webinar [30 May, 9am]

Posted in Books, pictures, Statistics, Travel, University life on May 17, 2024 by xi'an

For the final talk of this Spring season of the One World ABC webinar, we are very glad to welcome Mark Beaumont, a central figure in the development of ABC methods and inference! (And a coauthor of our ABC-PMC paper.)

Model misspecification in population genomics
Mark Beaumont
University of Bristol
30th May 2024, 9.00am UK time

Abstract
In likelihood-free settings, problematic effects of model misspecification can manifest themselves during computation, leading to nonsensical answers, and in particular to convergence problems in sequential algorithms. This issue has been well studied in the last 10 years, leading to a number of methods for robust inference. In practical applications, likelihood-free methods tend to be applied to the output of complex simulations where there is a choice of summary statistics that can be computed. One approach to handling misspecification is to simply not use summary statistics computed from simulations of the model under the prior that cannot be matched with those observed in the data. This presentation gives a brief review of methods for detecting and handling misspecification in ABC and SBI, and then discusses approaches that we have explored in a population genomic modelling framework.
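As a purely illustrative reading of that last idea, here is a minimal Python sketch of screening summary statistics against prior-predictive simulations, with all names, thresholds, and numbers mine rather than the talk's:

import numpy as np

def compatible_statistics(sim_stats, obs_stats, lower_q=0.005, upper_q=0.995):
    # keep only summaries whose observed value falls inside the central
    # prior-predictive range of the simulated values
    lo = np.quantile(sim_stats, lower_q, axis=0)
    hi = np.quantile(sim_stats, upper_q, axis=0)
    return np.flatnonzero((obs_stats >= lo) & (obs_stats <= hi))

# toy illustration (synthetic numbers, not from the talk)
rng = np.random.default_rng(0)
sim = rng.normal(size=(10_000, 5))              # prior-predictive summaries
obs = np.array([0.1, -0.3, 8.0, 0.5, -1.2])     # third summary is clearly incompatible
print(compatible_statistics(sim, obs))          # -> [0 1 3 4]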

6th Workshop on Sequential Monte Carlo Methods

Posted in Mountains, pictures, Statistics, Travel, University life on May 16, 2024 by xi'an

Very glad to be back at an SMC workshop, as it has been nine years since I attended SMC 2015 in Malakoff! All the more so for the workshop taking place in Edinburgh, at the Bayes Centre, one of these places where I feel I am somewhat returning to familiar grounds, with accumulated memories. Like my last visit there, when I had tea with Mike Titterington…

The overall pace of the workshop was quite nice, with long breaks for informal discussions (and time for ‘oggin’!) and interesting late-afternoon poster sessions, helped by the small number of posters at each instance, incl. one on reversible jump HMC. Here are a few scribbled entries about some talks along the first two days.

After my opening talk (!), Joaquín Míguez talked about the impact of a sequential (Euler-Maruyama) discretisation scheme for stochastic differential equations on Bayesian filtering, with control of the approximation effect. Axel Finke (in joint work with Adrien Corenflos, now an ERC Ocean postdoc in Warwick) built a sequence of particle filter algorithms targeting good performances (high expected jumping distance) against both large dimensions and long time horizons, exploiting MALA-like gradient shifts as well as the impact of the prior, with the conclusion that their jack-of-all-trades solutions, Particle-MALA and Particle-mGRAD, enjoy this resistance in nearly normal models. Interesting reminder of the auxiliary particle trick and good insights on using the smoothing target, even when accounting for the computing time, but too many versions for a single talk to digest without checking against the preprint.
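As a refresher (not taken from the talk), the Euler-Maruyama scheme replaces the SDE dX_t = a(X_t)dt + b(X_t)dW_t with a Gaussian Markov transition that can be plugged into a particle filter; a minimal Python sketch, with a toy Ornstein-Uhlenbeck drift as placeholder:

import numpy as np

def euler_maruyama_step(x, drift, diffusion, dt, rng):
    # one transition x -> x + a(x) dt + b(x) sqrt(dt) Z, with Z ~ N(0,1)
    z = rng.standard_normal(np.shape(x))
    return x + drift(x) * dt + diffusion(x) * np.sqrt(dt) * z

# placeholder Ornstein-Uhlenbeck drift and unit diffusion
drift = lambda x: -0.5 * x
diffusion = lambda x: 1.0

rng = np.random.default_rng(1)
particles = rng.standard_normal(1_000)          # particle cloud at time t
particles = euler_maruyama_step(particles, drift, diffusion, dt=0.01, rng=rng)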

The SMC sampler-like algorithm involves propagating N “seed” particles z(i), with a mutation mechanism consisting of the generation of N integrator snippets z := (z, ψ(z), ψ²(z), …) started at every seed particle z(i), resulting in N×(T+1) particles which are then whittled down to a set of N seed particles using a standard resampling scheme. Andrieu et al., 2024

Christophe Andrieu talked about Monte Carlo sampling with integrator snippets, starting with recycling solutions for the leapfrog integrator of HMC and unfolding Hamiltonians to move more easily. With snippets representing discretised paths along the level sets being used as particles, and zero, one, or more particles picked along each path, the importance weights connect with multinomial HMC.
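My (possibly inaccurate) reading of the mutation step, sketched below in Python: each seed is unfolded into a snippet of T+1 leapfrog states, all pooled states are importance weighted, and N new seeds are resampled from the pool. The weights are a mere caricature of those in Andrieu et al. (2024):

import numpy as np

def leapfrog(q, p, grad_logpi, eps):
    # one leapfrog step of the Hamiltonian flow psi
    p = p + 0.5 * eps * grad_logpi(q)
    q = q + eps * p
    p = p + 0.5 * eps * grad_logpi(q)
    return q, p

def snippet_mutation(seeds, logpi, grad_logpi, T=5, eps=0.1, rng=None):
    # unfold each seed into a snippet (z, psi(z), ..., psi^T(z)), weight every
    # snippet state, then resample N new seeds from the pooled N x (T+1) states
    rng = np.random.default_rng() if rng is None else rng
    N = len(seeds)
    pool, logw = [], []
    for q in seeds:
        p = rng.standard_normal()                   # refreshed momentum
        for _ in range(T + 1):
            pool.append(q)
            logw.append(logpi(q) - 0.5 * p ** 2)    # caricature of the actual weights
            q, p = leapfrog(q, p, grad_logpi, eps)
    logw = np.array(logw) - np.max(logw)
    w = np.exp(logw)
    w /= w.sum()
    idx = rng.choice(len(pool), size=N, p=w)        # standard resampling scheme
    return np.array(pool)[idx]

# toy standard normal target
seeds = np.random.default_rng(2).standard_normal(100)
new_seeds = snippet_mutation(seeds, lambda q: -0.5 * q ** 2, lambda q: -q)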

This relatively small algorithmic modification of the conditional particle filter, which we call the conditional backward sampling particle filter, has a dramatically improved performance over the conditional particle filter. Karjalainen et al., 2024

Anthony Lee looked at mixing times for backward sampling SMC (CBPF/ancestor sampling), cf. Lee et al. (2020), where the backward step consists in computing the weight of a randomly drawn backward or ancestral history, improving on earlier results to reach a mixing time of O(log T) and a complexity of O(T log T) (with T the time horizon), thanks to maximal coupling and boundedness assumptions on the prior and likelihood functions.
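For readers unfamiliar with the backward step, here is a generic Python sketch of backward sampling in a particle filter; the coupling construction behind the O(log T) mixing result is not reproduced, and the transition density is a toy random walk of my own choosing:

import numpy as np

def backward_sample(particles, logweights, log_trans, rng):
    # draw a single trajectory backwards: the final index comes from the terminal
    # weights, then at each earlier time t the index is drawn with weights
    # proportional to w_t(i) * f(x*_{t+1} | x_t(i))
    T = len(particles)
    idx = np.empty(T, dtype=int)
    w = np.exp(logweights[-1] - np.max(logweights[-1]))
    idx[-1] = rng.choice(len(w), p=w / w.sum())
    for t in range(T - 2, -1, -1):
        lw = logweights[t] + log_trans(particles[t], particles[t + 1][idx[t + 1]])
        w = np.exp(lw - np.max(lw))
        idx[t] = rng.choice(len(w), p=w / w.sum())
    return np.array([particles[t][idx[t]] for t in range(T)])

# toy random-walk transition density, log f(x' | x) up to a constant
log_trans = lambda x, x_next: -0.5 * (x_next - x) ** 2

rng = np.random.default_rng(3)
particles = [rng.standard_normal(50) for _ in range(20)]     # 20 times, 50 particles
logweights = [np.zeros(50) for _ in range(20)]               # uniform weights, for the sketch
trajectory = backward_sample(particles, logweights, log_trans, rng)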

Neil Chada presented work on Bayesian multilevel Monte Carlo for deep networks, à la Giles, with a telescoping identity (recalled below). Always puzzling to envision a prior on all parameters of a neural network. Achieving, at best, a computational cost of the order of the inverse MSE. With a useful reminder that pushing the size of the NN to infinity results in a (poor) Gaussian process prior (Sell et al., 2023).
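For the record (and not taken from the talk's slides), the telescoping identity à la Giles at the core of multilevel Monte Carlo writes, with P_ℓ denoting the level-ℓ approximation of the quantity of interest,

\mathbb{E}[P_L]=\mathbb{E}[P_0]+\sum_{\ell=1}^{L}\mathbb{E}[P_\ell-P_{\ell-1}]

each expectation in the sum being estimated independently, with fewer samples allocated to the finer (and costlier) levels since the level differences have smaller variance, which is where the cost savings come from.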

On my first evening, I stopped with a friend at my favourite Blonde [restaurant], as on almost every other visit to Edinburgh, enjoyable as always, but I also found the huge offer of Asian minimarkets in the area too tempting to resist, between Indian, Korean, and Chinese products. (Although with a disappointing hojicha!) As I could not reach any new Munro by train or bus within a reasonable time range, I resorted to the nearer Pentland Hills, with a stop by Rosslyn Chapel (mostly of Da Vinci Code fame, if classic enough!). And some delays in finding a bus to get there (misled by Google Maps!) and a trail (misled by my poor map-reading skills) up the actual hills. The mist did not help either.

war is personal

Posted in Books, Kids, pictures on May 15, 2024 by xi'an

robust privacy

Posted in Books, Statistics, University life on May 14, 2024 by xi'an

During a recent working session, some members of the OCEAN ERC project (incl. me) went reading Privacy-Preserving Parametric Inference: A Case for Robust Statistics by Marco Avella-Medina (JASA, 2022), where robust criteria are advanced as efficient statistical tools in private settings. In this paper, robustness means using M-estimators T, seen as functions of the empirical cdf and based on score functions Ψ, defined as

\sum_{i=1}^n\Psi(x_i,T(\hat F_n))=0,

where Ψ is bounded. The construction further requires that one can assess the sensitivity (in the sense of Dwork et al., 2006) of a queried function, sensitivity itself being linked with a measure of differential privacy. Because standard robustness approaches à la Huber allow for a portion of the sample to issue from an outlying (arbitrary) distribution, as in ε-contaminations, it makes perfect sense that robustness emerges within the differential privacy framework. However, this common-sense perception does not prove sufficient for achieving differential privacy, and the paper introduces a further randomization, with noise scaled by (n,ε,δ) in the following way

T(\hat F_n)+\gamma(T,\hat F_n)5\sqrt{2\log(n)\log(2/\delta)/\epsilon_n}Z

that also applies to test statistics. This scaling seems to constitute the central result of the paper, which establishes asymptotic validity in the sense of statistical consistency (with the sample size n). But I am left wondering whether this outcome counts as supporting differential privacy as a sensible notion…
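To fix ideas, here is a hedged Python caricature of such a noisy release, using a Huber location M-estimator as the bounded-score example; the sensitivity term γ and the constants below are placeholders of mine and should be read off the paper instead:

import numpy as np

def huber_psi(r, c=1.345):
    # bounded (Huber) score function
    return np.clip(r, -c, c)

def huber_m_estimate(x, c=1.345, iters=100):
    # location M-estimator solving sum_i Psi(x_i - T) = 0 by fixed-point iteration
    t = np.median(x)
    for _ in range(iters):
        t = t + np.mean(huber_psi(x - t, c))
    return t

def private_release(x, eps, delta, gamma, rng):
    # release T(F_n) plus Gaussian noise scaled in n, eps, delta
    # (sensitivity term gamma and constants are caricatured, not the paper's)
    n = len(x)
    scale = gamma * np.sqrt(2 * np.log(n) * np.log(2 / delta)) / (eps * n)
    return huber_m_estimate(x) + scale * rng.standard_normal()

rng = np.random.default_rng(4)
x = rng.standard_t(df=3, size=1_000)            # heavy-tailed toy data
print(private_release(x, eps=1.0, delta=1e-5, gamma=1.345, rng=rng))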

“…our proofs for the convergence of noisy gradient descent and noisy Newton’s method rely on showing that with high probability, the noise introduced to the gradients and Hessians has a negligible effect on the convergence of the iterates (up to the order of the statistical error of the non-noisy versions of the algorithms).” Avella-Medina, Bradshaw, & Loh

As a sequel, I then read a more recent publication of Avella-Medina, Differentially private inference via noisy optimization, written with Casey Bradshaw & Po-Ling Loh, which appeared in the Annals of Statistics (2023). It again considers privatised estimation and inference for M-estimators, obtained by noisy optimization procedures (noisy gradient descent, noisy Newton's method) and by constructing noisy confidence regions, which output differentially private avatars of standard M-estimators. Here the noisification goes through a randomisation of the gradient step, as in

\theta^{(k+1)}=\theta^{(k)}-\frac{\eta}{n}\sum_i\Psi(x_i,\theta^{(k)})+\frac{\eta B\sqrt K}{n}Z_k

where B is an upper bound on the gradient Ψ, η is a discretization step, and K is the total number of iterations (thus fixed in advance). The above stochastic gradient sequence converges with high probability to the actual M-estimator as n grows, not as K does, since the upper bound on the distance scales as √K/n. Where does the attached privacy guarantee come from? It proceeds by a composition argument over a sequence of differentially private outputs, all based on the same dataset.
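A minimal Python rendering of the displayed recursion, for a Huber-type location model; the choices of B, η, and K are mine, and the clipping/sensitivity analysis that actually delivers the privacy guarantee is not reproduced:

import numpy as np

def huber_grad(x, theta, c=1.345):
    # gradient in theta of the Huber loss, i.e. Psi(x, theta) = -clip(x - theta, -c, c)
    return -np.clip(x - theta, -c, c)

def noisy_gradient_descent(x, Psi, B, eta, K, theta0, rng):
    # iterate theta_{k+1} = theta_k - (eta/n) sum_i Psi(x_i, theta_k) + (eta B sqrt(K)/n) Z_k
    n = len(x)
    theta = theta0
    for _ in range(K):
        grad = np.sum(Psi(x, theta)) / n
        theta = theta - eta * grad + (eta * B * np.sqrt(K) / n) * rng.standard_normal()
    return theta

rng = np.random.default_rng(5)
x = rng.standard_t(df=3, size=2_000)
theta_dp = noisy_gradient_descent(x, huber_grad, B=1.345, eta=0.5, K=50, theta0=0.0, rng=rng)
print(theta_dp)    # should land near the non-private Huber location estimate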

“…the larger the number [K] of data (gradient) queries of the algorithm, the more prone it will be to privacy leakage.”

The Newton method version is a variation on the above stochastic gradient descent, except that it seems to converge faster, as illustrated in the paper.

Privacy-preserving Computing [book review]

Posted in Books, Statistics on May 13, 2024 by xi'an

Privacy-preserving Computing for Big Data Analytics and AI, by Kai Chen and Qiang Yang, is a rather short 2024 CUP book, translated (by the authors) from the 2022 Chinese version. It covers secret sharing, homomorphic encryption, oblivious transfer, garbled circuits, differential privacy, trusted execution environments, federated learning, privacy-preserving computing platforms, and case studies. The style is survey-like, meaning it often is too light for my liking, with too many lists of versions and extensions and, more importantly, a lack of detail to rely (solely) on it for a course, standing at times closer to a Wikipedia-level introduction to a topic. For instance, the chapter on homomorphic encryption [Chap.5] does not connect with the (presumably narrow) picture I have of this method. And the chapter on differential privacy [Chap.6] does not get much further than Laplace and Gaussian randomization, as in, e.g., the stochastic gradient perturbation of Abadi et al. (2016), while the privacy requirement is hardly discussed. The chapter on federated learning [Chap.8] is longer, if not much more detailed, being based on an entire book on federated learning of which Qiang Yang is the primary author. (With all figures in that chapter being reproduced from said book.) The next chapter [Chap.9] describes to some extent several computing platforms that can be used for privacy purposes, such as FATE, CryptDB, MesaTEE, Conclave, and PrivPy, while the final one goes through case studies from different areas, but without enough depth to be truly formative for neophyte readers and students. Overall, too light for my liking.
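For the neophyte readers just mentioned, the Laplace randomization covered in [Chap.6] amounts to releasing f(D) + Lap(Δf/ε), with Δf the sensitivity of the query; a two-line Python illustration (on a toy counting query, not taken from the book):

import numpy as np

def laplace_mechanism(value, sensitivity, eps, rng):
    # release value + Laplace(sensitivity/eps) noise, giving eps-differential privacy
    return value + rng.laplace(scale=sensitivity / eps)

rng = np.random.default_rng(6)
data = rng.integers(0, 2, size=1_000)
print(laplace_mechanism(data.sum(), sensitivity=1, eps=0.5, rng=rng))   # private count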

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Books Review section in CHANCE.]