**I** very recently read a 2021 paper by Mijung Park, Margarita Vinaroz, and Wittawat Jitkrittum on running ABC while ensuring data privacy (published in Entropy).

*“…adding noise to the distance computed on the real observations and pseudo-data suffices the privacy guarantee of the resulting posterior samples”*

For the ABC distance, they use the maximum mean discrepancy (MMD) and, for privacy, the standard if unconvincing notion of differential privacy, defined by an upper bound on the variation in the probability ratio when replacing/removing/adding an observation. (But not clearly convincing users that their data is secure.)
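For reference, the textbook form of this bound can be written as follows. The notation is mine rather than the paper's: $M$ is the randomised algorithm, $D$ and $D'$ are two datasets differing in a single observation, $S$ is any measurable set of outputs, and the privacy budget is written $\epsilon_{\mathrm{DP}}$ to keep it apart from the ABC tolerance ε.

$$
\Pr\left[M(D) \in S\right] \;\le\; e^{\epsilon_{\mathrm{DP}}}\, \Pr\left[M(D') \in S\right]
\qquad \text{for all neighbouring } D, D' \text{ and all } S .
$$

Since the roles of $D$ and $D'$ can be swapped, the ratio of the two probabilities is bounded between $e^{-\epsilon_{\mathrm{DP}}}$ and $e^{\epsilon_{\mathrm{DP}}}$.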

While I have no reservation about the validation of the double-noise approach, I find it surprising that noise must be added (twice) when vanilla ABC is already (i) noisy, since based on random pseudo-data, and (ii) producing only a sample from an approximate posterior instead of returning an exact posterior. My impression indeed was that ABC should be good enough by itself to achieve privacy protection, in the sense that the accepted parameter values are those that generated random samples sufficiently close to the actual data, hence not only compatible with the true data but also producing artificial datasets that are close enough to the data. Presumably these artificial datasets should not be returned, as the intersection of their ε neighbourhoods may prove enough to identify the actual data. (The proposed algorithm does return all generated datasets.) Instead, the supported algorithm involves randomisation of both the tolerance ε and the distance ρ to the observed data (with the side issue that they may become negative since the noise is Laplace).
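To make the randomisation of ε and ρ concrete, here is a minimal sketch of a rejection ABC step where Laplace noise is added to both the MMD distance and the tolerance before comparing them. It is my own illustration and not the authors' algorithm: the function names (`mmd2`, `noisy_abc`), the Gaussian kernel with unit bandwidth, the one-dimensional data, and the free `noise_scale` parameter are all assumptions. In particular, the noise scale is not calibrated to any sensitivity or privacy budget, so the sketch carries no privacy guarantee.

```python
import numpy as np


def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two
    1-D samples, using a Gaussian kernel (an assumed choice)."""
    def gram(a, b):
        d2 = (a[:, None] - b[None, :]) ** 2
        return np.exp(-d2 / (2.0 * bandwidth ** 2))
    return gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()


def noisy_abc(observed, simulate, sample_prior, n_draws, tol, noise_scale, rng):
    """Rejection ABC with Laplace noise on both distance and tolerance.

    Hypothetical sketch: `noise_scale` is a free parameter here, not
    derived from a sensitivity bound or a privacy budget.
    """
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        pseudo = simulate(theta, len(observed), rng)
        rho = mmd2(observed, pseudo)
        # Both perturbed quantities can turn negative, since Laplace
        # noise is symmetric around zero.
        noisy_rho = rho + rng.laplace(scale=noise_scale)
        noisy_tol = tol + rng.laplace(scale=noise_scale)
        if noisy_rho <= noisy_tol:
            accepted.append(theta)
    return np.array(accepted)


# Toy usage: Normal data with unknown mean and a uniform prior.
rng = np.random.default_rng(0)
observed = rng.normal(1.0, 1.0, size=200)
accepted = noisy_abc(
    observed,
    simulate=lambda theta, n, r: r.normal(theta, 1.0, size=n),
    sample_prior=lambda r: r.uniform(-3.0, 3.0),
    n_draws=300,
    tol=0.05,
    noise_scale=0.01,
    rng=rng,
)
print(len(accepted), accepted.mean() if len(accepted) else None)
```

Note that only the accepted parameter values are returned here, not the pseudo-datasets, and that a fresh noise draw is made for the tolerance at every iteration, which is again a choice made for this sketch.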
