## ABC by classification

**A**s a(nother) coincidence, we had a reading group discussion at Paris Dauphine yesterday, a few days after Veronika Rockova presented the paper in person in Oaxaca. The idea in ABC by classification, which she co-authored with Yuexi Wang and Tetsuya Kaji, is to use the empirical Kullback-Leibler divergence as a substitute for the intractable likelihood at the parameter value θ, in the generalised Bayes setting of Bissiri et al. Since this quantity is not available either, it is estimated as well, by a classification method that somehow relates to Geyer’s 1994 inverse logistic proposal, using the (ABC) pseudo-data generated from the model associated with θ. The convergence of the algorithm obviously depends on the choice of the discriminator used in practice. The paper also makes a connection with GANs as a potential alternative for the generalised Bayes representation. It mostly focuses on the *frequentist* validation of the ABC posterior, in the sense of exhibiting a posterior concentration rate in n, the sample size, while requiring performances of the discriminators that may prove hard to check in practice. This expands our 2018 result to this setting, with the tolerance decreasing more slowly than the Kullback-Leibler estimation error.
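To fix ideas, here is a minimal sketch of the mechanism in Python (standard library only). It is a toy of my own making and not the authors' implementation: the model is assumed to be a unit-variance Normal location model, the discriminator a plain logistic regression fitted by Newton-Raphson, and the function names (`fit_logistic`, `kl_estimate`, `abc_by_classification`), the uniform prior range and the tolerance are all hypothetical choices. With equal numbers of observed and simulated points, the fitted log-odds a + b·x stands for the log density ratio, and its average over the observed sample serves as the Kullback-Leibler estimate.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, labels, n_iter=15, ridge=1e-3):
    """Fit P(label=1 | x) = sigmoid(a + b*x) by Newton-Raphson.

    A small ridge penalty (an assumption of this sketch) keeps the
    2x2 Hessian invertible when the two samples are well separated.
    """
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        g0, g1 = -ridge * a, -ridge * b      # gradient of penalised log-likelihood
        h00, h01, h11 = ridge, 0.0, ridge    # negative Hessian
        for x, y in zip(xs, labels):
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def kl_estimate(observed, theta, rng):
    """Classification-based estimate of KL(data distribution || model at theta).

    Pseudo-data are simulated from the (assumed) N(theta, 1) model, a
    discriminator is trained to tell observed (label 1) from simulated
    (label 0), and the fitted log-odds are averaged over the observed sample.
    """
    n = len(observed)
    simulated = [rng.gauss(theta, 1.0) for _ in range(n)]
    a, b = fit_logistic(observed + simulated, [1] * n + [0] * n)
    return sum(a + b * x for x in observed) / n


def abc_by_classification(observed, n_draws, tolerance, rng, lo=-2.0, hi=2.0):
    """Rejection ABC using the estimated KL divergence as the distance."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(lo, hi)          # hypothetical uniform prior
        if kl_estimate(observed, theta, rng) <= tolerance:
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    rng = random.Random(2024)
    data = [rng.gauss(0.0, 1.0) for _ in range(300)]   # "true" theta is 0
    sample = abc_by_classification(data, n_draws=200, tolerance=0.1, rng=rng)
    print(len(sample), "accepted draws, mean", sum(sample) / len(sample))
```

Note that in this sketch a fresh discriminator is fitted for every proposed value of θ, which is the costly part, and that the linear log-odds happen to be correctly specified for the Normal location model, something one could not count on in a realistic intractable setting.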

Besides the shared appreciation that working with the Kullback-Leibler divergence is a nice and under-appreciated direction, one point that came out of our discussion is that using the (estimated) Kullback-Leibler divergence as a form of distance (attached to a tolerance) is less prone to variability (or more robust) than using the estimate directly (and without tolerance) as a substitute for the intractable likelihood, if we interpreted the discrepancy in Figure 3 properly. Another item was about the discriminator function itself: while a machine learning methodology such as neural networks could be used, albeit with unclear theoretical guarantees, it was unclear to us whether or not a *new* discriminator needed to be constructed for *each* value of the parameter θ, even when the simulations are run by a deterministic transform.
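To spell out the contrast in the first point above, in a notation that is mine rather than the paper's, write $\hat K(\theta)$ for the classification-based estimate of the Kullback-Leibler divergence between the data-generating distribution and the model at θ, and ε for the tolerance. Since the average log-likelihood equals minus this divergence up to a term free of θ, the two uses of the estimate roughly correspond to

$$
\pi_\epsilon(\theta \mid x_{1:n}) \propto \pi(\theta)\,\mathbb{I}\{\hat K(\theta) \le \epsilon\}
\qquad\text{versus}\qquad
\tilde\pi(\theta \mid x_{1:n}) \propto \pi(\theta)\,\exp\{-n\,\hat K(\theta)\}.
$$

In the second version the estimation error in $\hat K(\theta)$ gets multiplied by n inside the exponential, which may be one reason why the tolerance-based version appears more stable, although this is only my reading of the comparison.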
