Near-Optimal Policies for Dynamic Multinomial Logit Assortment Selection Models

Wang, Yining; Chen, Xi; Zhou, Yuan

Near-Optimal Policies for Dynamic Multinomial Logit Assortment Selection Models

Yining Wang, Xi Chen, Yuan Zhou

Advances in Neural Information Processing Systems 31 (NeurIPS 2018)

Abstract

In this paper we consider the dynamic assortment selection problem under an uncapacitated multinomial-logit (MNL) model. By carefully analyzing a revenue potential function, we show that a trisection based algorithm achieves an item-independent regret bound of O(sqrt(T log log T), which matches information theoretical lower bounds up to iterated logarithmic terms. Our proof technique draws tools from the unimodal/convex bandit literature as well as adaptive confidence parameters in minimax multi-armed bandit problems.

Abstract

Name Change Policy