Archive for anonymised data

estimation exam [best of]

Posted in Books, Kids, Statistics with tags , , , , , , , , on January 29, 2019 by xi'an

Yesterday, I received a few copies of our CRC Press Handbook of Mixture Analysis, while grading my mathematical statistics exam 160 copies. Among the few goodies, I noticed the always popular magical equality

E[1/T]=1/E[T]

that must have been used in so many homeworks and exam handouts by now that it should become a folk theorem. More innovative is the argument that E[1/min{X¹,X²,…}] does not exist for iid U(0,θ) because it is the minimum and thus is the only one among the order statistics with the ability to touch zero. Another universal shortcut was the completeness conclusion that when the integral

\int_0^\theta \varphi(x) x^k \text{d}x

was zero for all θ’s then φ had to be equal to zero with no further argument (only one student thought to take the derivative). Plus a growing inability in the cohort to differentiate even simple functions… (At least, most students got the bootstrap right, as exemplified by their R code.) And three stars to the student who thought of completely gluing his anonymisation tag, on every one of his five sheets!, making identification indeed impossible, except by elimination of the 159 other names.

bitcoin and cryptography for statistical inference and AI

Posted in Books, Mountains, pictures, Running, Statistics, Travel, University life with tags , , , , , , , , , , , , on April 16, 2018 by xi'an

A recent news editorial in Nature (15 March issue) reminded me of the lectures Louis Aslett gave at the Gregynog Statistical Conference last week, on the advanced use of cryptography tools to analyse sensitive and private data. Lectures that reminded me of a graduate course I took on cryptography and coding, in Paris 6, and which led me to visit a lab at the Université de Limoges during my conscripted year in the French Navy. With no research outcome. Now, the notion of using encrypted data towards statistical analysis is fascinating in that it may allow for efficient inference and personal data protection at the same time. As opposed to earlier solutions of anonymisation that introduced noise and data degradation, not always providing sufficient protection of privacy. Encryption that is also the notion at the basis of the Nature editorial. An issue completely missing from the paper, while stressed by Louis, is that this encryption (like Bitcoin) is costly, in order to deter hacking, and hence energy inefficient. Or limiting the amount of data that can be used in such studies, which would turn the idea into a stillborn notion.