importance weighting without importance weights [ABC for bandits?!]

I did not read very far into the recent arXival by Neu and Bartók, but I got the impression that it was a version of ABC for bandit problems where the probabilities behind the bandit arms are not available but arms can be generated from them. Indeed, the stopping rule found in “Recurrence weighting for multi-armed bandits” is the generation of an arm equal to the learner’s draw (p.5). Since there is no tolerance there, the method is exact (“unbiased”). As no reference is made to the ABC literature, this may be after all a mere analogy…
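To make the stopping rule concrete, here is a minimal sketch of how I understand it, not the authors' code: keep simulating arms until one matches the learner's draw, and use the number of simulations as the weight. The waiting time is geometric with success probability equal to the (unavailable) probability p of the chosen arm, so its expectation is 1/p, which is the usual importance weight. The names `recurrence_weight` and `check_unbiased` and the toy probabilities are my own assumptions for illustration.

```python
import random


def recurrence_weight(sample_arm, chosen_arm):
    """Count the simulations needed until chosen_arm is generated again.

    sample_arm is any zero-argument function returning an arm; the
    probabilities behind it are never evaluated, only sampled from.
    """
    draws = 0
    while True:
        draws += 1
        if sample_arm() == chosen_arm:  # exact match, no tolerance
            return draws


def check_unbiased(probs, chosen_arm, n_rep=20000, seed=0):
    """Compare the average waiting time with the true weight 1/p.

    probs is only used here to build a toy sampler and the reference value.
    """
    rng = random.Random(seed)
    arms = list(range(len(probs)))

    def sample_arm():
        return rng.choices(arms, weights=probs)[0]

    total = sum(recurrence_weight(sample_arm, chosen_arm) for _ in range(n_rep))
    return total / n_rep, 1.0 / probs[chosen_arm]


if __name__ == "__main__":
    estimate, target = check_unbiased([0.5, 0.3, 0.2], chosen_arm=1)
    print(f"average waiting time: {estimate:.3f}, true weight 1/p: {target:.3f}")
```

The parallel with ABC is the accept-on-match loop; the difference is that the match is exact on a finite set of arms, hence no tolerance and no approximation in the sketch above.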

This entry was posted on March 27, 2015 at 12:15 am and is filed under Books, Statistics, University life with tags ABC, machine learning, multi-armed bandits, tolerance, Zurich.

Xi'an's Og