
Emilie Kaufmann (CNRS, France)

11 March 2016 @ 12:00

 


The information complexity of sequential resource allocation

I will talk about sequential resource allocation under the so-called stochastic multi-armed bandit model. In this model, an agent interacts with a set of (unknown) probability distributions, called ‘arms’ (in reference to ‘one-armed bandits’, another name for slot machines in a casino). When the agent draws an arm, he observes a sample from the associated distribution. This sample can be seen as a reward, and the agent then aims to maximize the sum of his rewards during the interaction. This ‘regret minimization’ objective makes sense in many practical applications, starting with the medical trials that motivated the introduction of bandit problems in the 1930s. Another possible objective for the agent, called best-arm identification, is to discover as quickly as possible the best arm(s), that is, the arms whose distributions have the highest mean, without being penalized for drawing ‘bad’ arms along the way.
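To fix ideas, here is a minimal sketch (not one of the speaker's algorithms) of the interaction described above: a Bernoulli bandit with hypothetical arm means, played by the classical UCB1 index policy, with the resulting regret measured against the best arm.

```python
# Minimal sketch: a stochastic Bernoulli bandit played by UCB1.
# The arm means below are illustrative, not taken from the talk.
import numpy as np

rng = np.random.default_rng(0)

def ucb1(means, horizon):
    """Play a Bernoulli bandit whose means are unknown to the agent."""
    k = len(means)
    counts = np.zeros(k)   # number of draws of each arm
    sums = np.zeros(k)     # sum of observed rewards per arm
    rewards = []
    for t in range(horizon):
        if t < k:
            arm = t        # draw each arm once to initialise
        else:
            index = sums / counts + np.sqrt(2 * np.log(t + 1) / counts)
            arm = int(np.argmax(index))
        r = rng.binomial(1, means[arm])   # sample a reward from the chosen arm
        counts[arm] += 1
        sums[arm] += r
        rewards.append(r)
    return np.array(rewards)

means = [0.5, 0.6, 0.45]   # hypothetical arms
T = 10_000
collected = ucb1(means, T)
regret = max(means) * T - collected.sum()   # best-arm benchmark minus collected reward
print(f"regret after {T} rounds: {regret:.1f}")
```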
 
For each of these objectives, our goal will be to define a distribution-dependent notion of optimality, thanks to lower bounds on the performance of good strategies, and to propose algorithms that can be deemed optimal with respect to these lower bounds. For some classes of parametric bandit models, this makes it possible to characterize the complexity of regret minimization and best-arm identification in terms of (different) information-theoretic quantities.
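As one classical example of such a distribution-dependent lower bound (not specific to this talk), the Lai and Robbins result for Bernoulli bandits with means $\mu_1,\dots,\mu_K$ and best mean $\mu^\star = \max_a \mu_a$ states that any uniformly efficient strategy satisfies
$$\liminf_{T\to\infty} \frac{\mathbb{E}[R_T]}{\log T} \;\ge\; \sum_{a:\,\mu_a<\mu^\star} \frac{\mu^\star-\mu_a}{\mathrm{kl}(\mu_a,\mu^\star)},$$
where $\mathrm{kl}(x,y) = x\log\frac{x}{y} + (1-x)\log\frac{1-x}{1-y}$ is the binary Kullback-Leibler divergence. The right-hand side is one instance of the information-theoretic quantities referred to above.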