Fast Learning in Bandit Optimization
We consider several problems of "bandit optimization", inspired by stochastic optimization and online learning, in settings where gradients cannot be computed.
In the first problem, the objective is to minimize a global, not necessarily cumulative, convex loss function. This framework allows us to study a very general class of problems, with applications in statistics, machine learning, and other fields. To solve it, we analyze the Upper-Confidence Frank-Wolfe algorithm, inspired by techniques ranging from bandits to convex optimization.
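To make the first setting concrete, here is a minimal Python sketch of an Upper-Confidence Frank-Wolfe step in the simplest special case, a linear loss L(p) = -<mu, p> over the simplex, where the method reduces to a classical UCB rule. The function name, the confidence bonus, and the Gaussian noise model are illustrative assumptions for this sketch, not the algorithm exactly as analyzed in the talk; p_T is the empirical distribution of the actions played, and each Frank-Wolfe step with step size 1/t amounts to playing one arm.

```python
import numpy as np

rng = np.random.default_rng(0)

def ucb_frank_wolfe(T, mu, sigma=1.0):
    """Sketch: minimize L(p) = -<mu, p> over the simplex from bandit feedback.

    p_t is the empirical distribution of actions played so far; each
    Frank-Wolfe step moves p_t toward a vertex e_i with step size 1/t,
    i.e. plays arm i once.  The linear-minimization oracle uses an
    optimistic (lower-confidence) estimate of the gradient, which for
    this linear loss coincides with the classical UCB index.
    """
    K = len(mu)
    counts = np.zeros(K)          # number of pulls of each arm
    means = np.zeros(K)           # empirical reward means (gradient is -mu)
    for i in range(K):            # play each arm once to initialize
        means[i] = rng.normal(mu[i], sigma)
        counts[i] = 1
    for t in range(K + 1, T + 1):
        bonus = np.sqrt(2 * np.log(t) / counts)   # confidence width
        i = int(np.argmin(-means - bonus))        # optimistic FW vertex
        reward = rng.normal(mu[i], sigma)         # bandit feedback
        means[i] += (reward - means[i]) / (counts[i] + 1)
        counts[i] += 1
    p = counts / T                # final iterate p_T
    return p, -p @ mu             # point and its global loss L(p_T)

p_T, loss = ucb_frank_wolfe(T=5000, mu=np.array([0.2, 0.5, 0.45]))
print(p_T, loss)
```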
The second problem concerns the optimization of convex functions that are highly regular, i.e., smoother than is standard in machine learning. We show how to leverage this additional regularity to learn faster.
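As a rough illustration of what optimizing without gradients looks like in this smooth setting, the sketch below uses a standard two-point randomized gradient estimator built from function evaluations only. The estimator, step sizes, and quadratic test function are assumptions made for this sketch; the talk's way of exploiting higher-order regularity is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def two_point_gradient_estimate(f, x, delta, rng):
    """Estimate grad f(x) from two function evaluations only.

    Samples a random unit direction u and takes a finite difference
    along u; for smooth f this approximates the gradient of a
    smoothed version of f, with no access to exact gradients.
    """
    d = len(x)
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)
    return d * (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * u

def zeroth_order_descent(f, x0, T, step=0.1, delta=1e-2):
    """Gradient-style descent driven only by evaluations of f."""
    x = np.array(x0, dtype=float)
    for t in range(1, T + 1):
        g = two_point_gradient_estimate(f, x, delta, rng)
        x -= step / np.sqrt(t) * g      # decaying step size
    return x

# toy example: a smooth convex quadratic with minimizer at the all-ones vector
f = lambda x: 0.5 * np.sum((x - 1.0) ** 2)
print(zeroth_order_descent(f, x0=np.zeros(3), T=2000))
```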
For the last problem, if time permits, I will present an interesting application of bandit techniques to fast learning in repeated auctions.