importance weighting without importance weights [ABC for bandits?!]

I did not read very far in the recent arXival by Neu and Bartók, but I got the impression that it was a version of ABC for bandit problems where the probabilities behind the bandit arms are not available but can be generated. Since the stopping rule found in the “Recurrence weighting for multi-armed bandits” is the generation of an arm equal to the learner’s draw (p.5). Since there is no tolerance there, the method is exact (“unbiased”). As no reference is made to the ABC literature, this may be after all a mere analogy…

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s