Archive for deep learning
matrix multiplication [cover]
Posted in Books, pictures, Statistics, University life with tags algorithms, AlphaTensor, cover, deep learning, deep neural network, DeepMind, Google, London, matrix algebra, matrix multiplication, Monte Carlo algorithm, Nature, reinforcement learning, tensor, UK on December 15, 2022 by xi'ansufficient statistics for machine learning
Posted in Books, Running, Statistics, Travel with tags deep learning, George Darmois, ICML 2019, minimal sufficient statistic, Pitman-Koopman theorem, sufficiency on April 26, 2022 by xi'anBy chance, I came across this ICML¹⁹ paper of Milan Cvitkovic and Günther Koliander, Minimal Achievable Sufficient Statistic Learning, on a form of sufficiency for machine learning. The paper starts with “our” standard notion of sufficiency albeit in a predictive sense, namely that Z=T(X) is sufficient for predicting Y if the conditional distribution of Y given Z is the same as the conditional distribution of Y given X. It also acknowledges that minimal sufficiency may be out of reach. However, and without pursuing this question into the depths of said paper, I am surprised that any type of sufficiency can be achieved there since the model stands outside exponential families… In accordance with the Darmois-Pitman-Koopman lemma. Obviously, this is not a sufficiency notion in the statistical sense, since there is no likelihood (albeit there are parameters involved in the deep learning network). And Y is a discrete variate, which means that
is a sufficient “statistic” for a fixed conditional, but I am lost at how the solution proposed in the paper, could be minimal when the dimension and structure of T(x) are chosen from the start. A very different notion, for sure!
Metropolis-Hastings via Classification [One World ABC seminar]
Posted in Statistics, University life with tags ABC, ABC consistency, Chicago, Chicago Booth School of Business, classification, deep learning, discriminant analysis, GANs, logistic regression, Metropolis-Hastings algorithm, seminar, summary statistics, synthetic likelihood, University of Oxford, University of Warwick, webinar on May 27, 2021 by xi'anToday, Veronika Rockova is giving a webinar on her paper with Tetsuya Kaji Metropolis-Hastings via classification. at the One World ABC seminar, at 11.30am UK time. (Which was also presented at the Oxford Stats seminar last Feb.) Please register if not already a member of the 1W ABC mailing list.
NCE, VAEs, GANs & even ABC…
Posted in Statistics with tags ABC, Bayesian GANs, CDT, deep learning, energy based model, generative adversarial networks, noise contrasting estimation, normalising constant, normalising flow, partition function, PhD course, Teams, University of Warwick, variational autoencoders on May 14, 2021 by xi'anAs I was preparing my (new) lectures for a PhD short course “at” Warwick (meaning on Teams!), I read a few surveys and other papers on all these acronyms. It included the massive Guttmann and Hyvärinen 2012 NCE JMLR paper, Goodfellow’s NIPS 2016 tutorial on GANs, and Kingma and Welling 2019 introduction to VAEs. Which I found a wee bit on the light side, maybe missing the fundamentals of the notion… As well as the pretty helpful 2019 survey on normalising flows by Papamakarios et al., although missing on the (statistical) density estimation side. And also a nice (2017) survey of GANs by Shakir Mohamed and Balaji Lakshminarayanan with a somewhat statistical spirit, even though convergence issues are not again not covered. But misspecification is there. And the many connections between ABC and GANs, if definitely missing on the uncertainty aspects. While Deep Learning by Goodfellow, Bengio and Courville adresses both the normalising constant (or partition function) and GANs, it was somehow not deep enough (!) to use for the course, offering only a few pages on NCE, VAEs and GANs. (And also missing on the statistical references addressing the issue, incl. [or excl.] Geyer, 1994.) Overall, the infinite variations offered on GANs leave me uncertain about their statistical relevance, as it is unclear how good the regularisation therein is for handling overfitting and consistent estimation. (And if I spot another decomposition of the Kullback-Leibler divergence, I may start crying…)
sampling with neural networks [seminar]
Posted in Statistics with tags deep learning, Flatiron Institute, generative model, neural network sampling, New York, webinar on March 29, 2021 by xi'anTomorrow (30 March, 11am ET, 16 GMT, 17 CET) Grant Rotskoff will give a webinar on Sampling with neural networks: prospects and perils, with links to developments in generative modeling to sample distributions that are challenging to sample with local dynamics, and the perils of neural network driven sampling to accelerate sampling.