This paper proposes a method for learning text embeddings that uses extreme value theory (EVT) to generalize better in extreme regions of the input space. The contribution is sound, meaningful, and timely, and I think the paper will be of significant interest to a large portion of the NeurIPS audience.