Archive for differential privacy

distracting redistricting?

Posted in Books, Statistics on August 26, 2021 by xi'an

“We at FiveThirtyEight will be tracking the whole redistricting process, from proposed maps to final maps, so watch this space for updates!”

FiveThirtyEight is keeping a tracker on the “redistricting” of U.S. states, namely the decennial redrawing of electoral districts. This is still at an early stage, when no map has been validated by the state legislatures, and hence I cannot tell whether FiveThirtyEight will be analysing gerrymandering in a statistical manner, that is, figuring out how extreme a proposed map is within the collection of all possible electoral maps. The States being the States, the rules vary widely between them, from the legislators themselves setting the boundaries (while sometimes being very open about their intention to favour their own side) to independent commissions being in charge. I did not spot any clear involvement of statisticians in the process.

“The application of differential privacy will bring significant harm to Alabama (…) The Census Bureau has not shown that other disclosure avoidance methods would not satisfy the privacy requirements.” Case No. 3:21-cv-00211

While looking at this highly informative webpage maintained by Doug Spencer of the University of Colorado Law School, I came across this federal court challenge by the State of Alabama against the Census Bureau for using differential privacy! A statistical version of “shoot the messenger”?! The legal argument of the State rests on “the Fifth Amendment, alleging that differential privacy is a violation of the one-person, one-vote principle and will result in the dilution of their votes.” I however wonder what the genuine (political) reason for this challenge is!

One World ABC seminar [season 2]

Posted in Books, Statistics, University life on March 23, 2021 by xi'an

The One World ABC seminar will resume its series of talks on ABC methods with a talk on Thursday, 25 March, at 12:30 CET, by Mijung Park, from the Max Planck Institute for Intelligent Systems, on the exciting topic of producing differential privacy by ABC. (Talks will take place on a monthly basis.)

Monte Carlo fusion

Posted in Statistics on January 18, 2019 by xi'an

Hongsheng Dai, Murray Pollock, and Gareth Roberts (both of the University of Warwick) just arXived a paper we discussed together last year while I was at Warwick. Where fusion means bringing different parts of the target distribution

f(x) \propto f^1(x)\, f^2(x) \cdots

together, once simulation from each part has been done. This is in the same spirit as Scott et al.’s (2016) consensus Monte Carlo, where for instance the components of the target cannot be computed simultaneously, either because of the size of the dataset or because of privacy issues. The idea in this paper is to target an augmented density with the above marginal, using for each component of f an auxiliary variable x¹, x², …, and a target that is the product of the squared components f¹(x¹)², f²(x²)², …, each multiplied by a transition density keeping f¹(·)², f²(·)², … invariant:

f^c(x^c)^2\, p_c(y \mid x^c)\, /\, f^c(y)

as for instance the transition density of a Langevin diffusion. The marginal of

\prod_c f^c(x^c)^2\, p_c(y \mid x^c)\, /\, f^c(y)

as a function of y is then the targeted original product (when p_c is reversible with respect to f^c(·)², each factor integrates over x^c to f^c(y)). Simulating from this new extended target can be achieved by rejection sampling. (Any impact of the number of auxiliary variables on the convergence?) The practical implementation actually relies on the path-space rejection sampling methods in the Read Paper of Beskos et al. (2006). (An extreme case of the algorithm is actually an (exact) ABC version, where the simulations x¹, x², … from all components have to be identical and equal to y. The opposite extreme is the consensus Monte Carlo algorithm, which explains why that algorithm is not an efficient solution.) An alternative is based on an Ornstein-Uhlenbeck bridge. While the paper remains at a theoretical level, with toy examples, I heard from the same sources that applications to more realistic problems and implementations on parallel processors are under way.
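
Out of curiosity, here is a minimal Python sketch of the fusion principle in its most elementary (non-path-space) form: two Gaussian “sub-posteriors” f¹ and f² are recombined into the product target f ∝ f¹f² by plain rejection sampling, using the first component as proposal and the second (bounded by one as an unnormalised kernel) as acceptance probability. This is obviously not the authors’ algorithm, only an illustration of recovering the product target from separately simulated components; the means, scales, and function names are made up for the example.

import numpy as np

rng = np.random.default_rng(0)

# two hypothetical Gaussian "sub-posteriors" f¹ and f² (unnormalised kernels)
mu1, s1 = -1.0, 1.0
mu2, s2 = 2.0, 1.5

def f2(y):
    # unnormalised kernel of the second component, bounded by 1
    return np.exp(-0.5 * ((y - mu2) / s2) ** 2)

def sample_product(n):
    # rejection sampler for f(y) ∝ f¹(y) f²(y):
    # propose y from the first component, accept with probability f²(y) ≤ 1
    out = []
    while len(out) < n:
        y = rng.normal(mu1, s1)
        if rng.uniform() < f2(y):
            out.append(y)
    return np.array(out)

ys = sample_product(10_000)

# sanity check against the closed-form product of the two Gaussians
prec = 1 / s1**2 + 1 / s2**2
print(ys.mean(), (mu1 / s1**2 + mu2 / s2**2) / prec)  # fused mean
print(ys.var(), 1 / prec)                             # fused variance

The empirical mean and variance of the accepted draws match the closed-form fusion of the two Gaussians, at the cost of an acceptance rate that collapses when the components disagree, hence the appeal of more elaborate constructions like the above.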

Nature Outlook on AI

Posted in Statistics on January 13, 2019 by xi'an

The 29 November 2018 issue of Nature had a series of papers on AIs (in its Outlook section), pitched more at the general-public (awareness) level than as in-depth machine-learning articles. Including one on the forecasted consequences of ever-growing automation on jobs, quoting from a 2013 paper by Carl Frey and Michael Osborne [of probabilistic numerics fame!] that up to 47% of US jobs could become automated. The paper is inconclusive on how taxation could help in or deter from transferring jobs to other branches, although it mentions the cascading effect of taxing labour and subsidizing capital. Another article covers the progress in digital government, with Estonia as a role model, including the risks of hacking (but without mentioning Russia’s state-driven attacks). Differential privacy is discussed as a way to keep data “secure” (but not cryptography à la Louis Aslett!). With another surprising entry that COBOL is still in use in some administrative systems. Followed by a paper on the apparently limited impact of digital technologies on mental health, despite the advertising efforts of big tech companies being described as a “race to the bottom of the brain stem”! And another one on (overblown) public expectations about AIs, although the New York Times had an entry yesterday on people in Arizona attacking self-driving cars with stones and pipes… Plus a paper on the growing difficulty of saving online documents and culture for the future (although saving all tweets ever published does not sound like a major priority to me!).

Interesting (?) aside, the same issue contains a general-public article on the use of AIs for peer review (of submitted papers). The claim being that “peer review by artificial intelligence (AI) is promising to improve the process, boost the quality of published papers — and save reviewers time.” A wee bit over-optimistic, I would say, as the AIs developed so far at best check “that statistics and methods in manuscripts are sound”. For instance, producing “key concepts to summarize what the paper is about” is not particularly useful. Assessing the degree of innovation compared with the existing literature would be. Or an automated way to adapt the paper style to the strict and somewhat elusive Biometrika style!
