Archive for information theory

Contextual Integrity for Differential Privacy #4 [23w5106]

Posted in Books, Mountains, pictures, Running, Statistics, Travel, University life on August 5, 2023 by xi'an

Mostly short talks. First talk by Thomas Steinke (Google) on interpreting ε, with a side wondering of mine on the relation between exp(ε) and the uncertainty that comes with a Monte Carlo outcome. Which may relate to this 2022 paper by Ruobin Gong. Second talk by Gautam Kamath (U Waterloo) on large language models under privacy with “public” data. Questioning the appropriateness of ML benchmarks in terms of privacy. Third talk by Mark Bun (Boston U) on replicability, privacy and adaptive generalisation in machine learning, with a strange criticism of confidence intervals on the same parameter not intersecting for two independent studies. And proposing high probability replicable algorithms that can be put in duality with differentially private algorithms at the cost of lowering precision and effective sample size. We also had another group discussion on how to reach out about privacy guarantees, which made me realise there was GDPR compliance software available.
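As a side note on reading ε, here is a minimal sketch of one standard interpretation, which is not necessarily the one presented in the talk. Under pure ε-differential privacy, the likelihood ratio between two neighbouring datasets is bounded by exp(ε), so the posterior odds of an adversary about one individual are at most exp(ε) times the prior odds. The function name `posterior_upper_bound` and the values of ε below are illustrative assumptions.

```python
import math


def posterior_upper_bound(prior: float, eps: float) -> float:
    """Largest posterior probability reachable from `prior` under pure eps-DP.

    Relies on: posterior odds <= exp(eps) * prior odds, a consequence of the
    exp(eps) bound on the likelihood ratio between neighbouring datasets.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if eps < 0.0:
        raise ValueError("eps must be non-negative")
    odds = math.exp(eps) * prior / (1.0 - prior)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical values of eps, chosen only to show how the bound moves.
    for eps in (0.1, 1.0, 5.0):
        bound = posterior_upper_bound(0.5, eps)
        print(f"eps={eps}: a prior of 0.5 can rise to at most {bound:.3f}")
```

With ε = 0 the bound returns the prior unchanged, and it approaches one as ε grows, which is one way of seeing why large values of ε carry little protection.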

In the afternoon session, Shlomi Hod (Boston U) presented a practical case of designing a privacy preserving protocol for the Israeli birth record. With a strong opposition from stakeholders to using synthetic data, due to a semantic drift from synthetic to manipulated to fake, to lying. Wanrong Zhang did not talk about her stunning recent ICML paper but instead about another practical case connected with mobile-based Covid case predictions, by adding minimal noise to mobility data. Nidhi Hegde (U Alberta) gave up talking on Thompson sampling with privacy protection, to focus on an ongoing health application for Alberta as more suited for the workshop. And Rei Safavi-Naini (U Calgary) drew a parallel between information theory and DP versus CI.

While the workshop was scheduled till Friday noon, as per usual BIRS habits (!), the morning session was cancelled, since most people were leaving Kelowna in the morning.

Monte Carlo methods for Potts models

Posted in pictures, Statistics, University life on March 10, 2016 by xi'an

There will be a seminar talk by Mehdi Molkaraie (Pompeu Fabra) next week at Institut Henri Poincaré (IHP), Paris, on his paper with Vincent Gomez.

We consider the problem of estimating the partition function of the ferromagnetic q-state Potts model. We propose an importance sampling algorithm in the dual of the normal factor graph representing the model. The algorithm can efficiently compute an estimate of the partition function when the coupling parameters of the model are strong (corresponding to models at low temperature) or when the model contains a mixture of strong and weak couplings. We show that, in this setting, the proposed algorithm significantly outperforms the state of the art methods.
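To make the estimated quantity concrete, here is a minimal sketch that is not the dual factor graph algorithm of the paper. It computes the partition function of a tiny ferromagnetic q-state Potts model by brute force and compares it with a naive importance sampling estimate based on a uniform proposal. The function names, the three-site chain, and the coupling value are all illustrative assumptions. A uniform proposal is only sensible for weak couplings, since its variance deteriorates at low temperature, which is the regime the abstract targets.

```python
import itertools
import math
import random


def potts_weight(config, edges, coupling):
    """Unnormalised weight exp(J * number of agreeing edges) of one configuration."""
    agreements = sum(1 for i, j in edges if config[i] == config[j])
    return math.exp(coupling * agreements)


def partition_exact(n_sites, q, edges, coupling):
    """Brute-force partition function: sum of weights over all q**n_sites states."""
    return sum(
        potts_weight(config, edges, coupling)
        for config in itertools.product(range(q), repeat=n_sites)
    )


def partition_uniform_is(n_sites, q, edges, coupling, n_samples, seed=0):
    """Importance sampling estimate of Z with a uniform proposal (naive baseline)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        config = [rng.randrange(q) for _ in range(n_sites)]
        total += potts_weight(config, edges, coupling)
    # Each configuration has proposal probability q**(-n_sites),
    # hence the q**n_sites factor in front of the average weight.
    return q ** n_sites * total / n_samples


if __name__ == "__main__":
    # Hypothetical toy model: a chain of three sites, q = 3, weak coupling J = 0.5.
    chain = [(0, 1), (1, 2)]
    print("exact    :", partition_exact(3, 3, chain, 0.5))
    print("estimate :", partition_uniform_is(3, 3, chain, 0.5, n_samples=20_000))
```

On a tree with n sites the exact value has the closed form q (e^J + q − 1)^(n−1), which gives a convenient check of the enumeration.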

The talk is at 14:30, March 17. It is part of a trimester program on information and computation theories I was completely unaware of.

The Unimaginable Mathematics of Borges’ Library of Babel [book review]

Posted in Books, Statistics, Travel, University life on September 30, 2014 by xi'an

This is a book I carried away from JSM in Boston as the Oxford University Press representative kindly provided me with a copy at the end of the meeting. After I asked for it, as I was quite excited to see a book linking Jorge Luis Borges’ great Library of Babel short story with mathematical concepts. Even though many other short stories by Borges have a mathematical flavour and are bound to fascinate mathematicians, the Library of Babel is particularly prone to mathematisation as it deals with the notions of infinity, periodicity, permutation, randomness… As it happens, William Goldbloom Bloch [a patronym that would surely have inspired Borges!], professor of mathematics at Wheaton College, Mass., published the unimaginable mathematics of Borges’ Library of Babel in 2008, so this is not a recent publication. But I had managed to miss it through the several conferences where I stopped at the OUP exhibit booth. (Interestingly William Bloch has also published a mathematical paper on Neal Stephenson’s Cryptonomicon.)

Now, what is unimaginable in the maths behind Borges’ great Library of Babel??? The obvious line of entry to the mathematical aspects of the book is combinatorics: how many different books are there in total? [Ans. 10¹⁸³⁴⁰⁹⁷…] how many hexagons are needed to shelve that many books? [Ans. 10⁶⁸¹⁵³¹…] how long would it take to visit all those hexagons? how many librarians are needed for a Library containing all volumes once and only once? how many different libraries are there? [Ans. 1010⁶…] Then the book embarks upon some cohomology, Cavalieri’s infinitesimals (mentioned by Borges in a footnote), Zeno’s paradox, topology (with Klein’s bottle), graph theory (and the important question as to whether or not each hexagon has one or two stairs), information theory, Turing’s machine. The concluding chapters are comments about other mathematical analyses of Borges’ Grand Œuvre and a discussion on how much maths Borges knew.
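The first of these counts is easy to check. A minimal sketch, assuming the figures given in Borges’ story (25 orthographic symbols, 410 pages per book, 40 lines per page, about 80 characters per line): every book is a string of 410 × 40 × 80 = 1,312,000 symbols, so there are 25^1,312,000 distinct books. The constant and function names below are mine, not Bloch’s.

```python
import math

SYMBOLS = 25  # orthographic symbols, as stated in the story
PAGES = 410   # pages per book
LINES = 40    # lines per page
CHARS = 80    # characters per line ("some eighty" in the story)


def slots_per_book() -> int:
    """Number of character positions in a single book."""
    return PAGES * LINES * CHARS


def log10_number_of_books() -> float:
    """Base-10 logarithm of the number of distinct books, SYMBOLS ** slots."""
    return slots_per_book() * math.log10(SYMBOLS)


def exponent_is_exact(k: int) -> bool:
    """Exact integer check that 10**k <= number of books < 10**(k + 1)."""
    books = SYMBOLS ** slots_per_book()
    return 10 ** k <= books < 10 ** (k + 1)


if __name__ == "__main__":
    exponent = math.floor(log10_number_of_books())
    print("character slots per book:", slots_per_book())
    print("number of books is about 10 **", exponent)
    print("confirmed with exact integers:", exponent_is_exact(exponent))
```

The exact check compares big integers directly rather than printing the number, since converting a number of nearly two million digits to a string is slow and is blocked by default in recent Python versions.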

So a nice escapade through some mathematical landscapes with more or less connection with the original masterpiece. I am not convinced it brings any further dimension or insight about it, or even that one should try to dissect it that way, because it kills the poetry in the story, especially the play around the notion(s) of infinity. The fact that the short story is incomplete [and short on details] makes its beauty: if one starts wondering at the possibility of the Library or at the daily life of the librarians [like, what do they eat? why are they there? where are the readers? what happens when they die? &tc.] the intrusion of realism closes the enchantment! Nonetheless, the unimaginable mathematics of Borges’ Library of Babel provides a pleasant entry into some mathematical concepts and as such may initiate a layperson not too shy of maths formulas to the beauty of mathematics.

did I mean endemic? [pardon my French!]

Posted in Books, Statistics, University life on June 26, 2014 by xi'an

[Picture: clouds, Nov. 02, 2011]

Deborah Mayo wrote a Saturday night special column on our Big Bayes stories issue in Statistical Science. She (predictably?) focussed on the critical discussions, esp. David Hand’s most forceful arguments where he essentially considers that, due to our (special issue editors’) selection of successful stories, we biased the debate by providing a “one-sided” story. And that we or the editor of Statistical Science should also have included frequentist stories. To which Deborah points out that demonstrating that “only” a frequentist solution is available may be beyond the possible. And still, I could think of partial information and partial inference problems like the “paradox” raised by Jamie Robins and Larry Wasserman in the past years. (Not the normalising constant paradox but the one about censoring.) Anyway, the goal of this special issue was to provide a range of realistic illustrations where Bayesian analysis was a most reasonable approach, not to raise the Bayesian flag against other perspectives: in an ideal world it would have been more interesting to have discussants produce alternative analyses bypassing the Bayesian modelling but obviously discussants only have a limited amount of time to dedicate to their discussion(s) and the problems were complex enough to deter any attempt in this direction.

As an aside and in explanation of the cryptic title of this post, Deborah wonders at my use of endemic in the preface and at the possible mis-translation from the French. I did mean endemic (and endémique) in a half-joking reference to a disease one cannot completely get rid of. At least in French, the term extends beyond diseases, but presumably pervasive would have been less confusing… Or ubiquitous (as in Ubiquitous Chip for those with Glaswegian ties!). She also expresses “surprise at the choice of name for the special issue. Incidentally, the “big” refers to the bigness of the problem, not big data. Not sure about “stories”.” Maybe another occurrence of lost in translation… I had indeed no intent of connection with the “big” of “Big Data”, but wanted to convey the notion of big as in a major problem. And of a story explaining why the problem was considered and how the authors reached a satisfactory analysis. The story of the Air France Rio-Paris crash resolution is representative of that intent. (Hence the explanation for the above picture.)

Decision systems and nonstochastic randomness

Posted in Books, Statistics, University life on October 26, 2011 by xi'an

“Thus the informativity of stochastic experiment turned out to depend on the Bayesian system and to coincide to within the scale factor with the previous “value of information”.” V. Ivanenko, Decision systems and nonstochastic randomness, p.208

This book, Decision systems and nonstochastic randomness, written by the Ukrainian researcher Victor Ivanenko, is related to decision theory and information theory, albeit with a statistical component as well. It however works at a fairly formal level and the reading is certainly not light. The randomness it addresses is the type formalised by Andreï Kolmogorov (also covered in the book Randomness through Computation I [rather negatively] reviewed a few months ago, inducing angry comments and scathing criticisms in the process). The terminology is slightly different from the usual one, but the basics are those of decision theory as in DeGroot (1970). However, the tone quickly gets much more mathematical and the book lost me early in Chapter 3 (Indifferent uncertainty) on a casual reading. The following chapter on non-stochastic randomness reminded me of von Mises for its use of infinite sequences, and of the above book for its purpose, but otherwise offered an uninterrupted array of definitions and theorems that sounded utterly remote from statistical problems. After failing to make sense of the chapter on the informativity of experiment in Bayesian decision problems, I simply gave up… I thus cannot judge from this cursory reading whether or not the book is “useful in describing real situations of decision-making” (p.208). It just sounds very remote from my centres of interest. (Anyone interested in writing a review?)