Reviews: Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers

The paper proposes integrating a stochastic relabelling operator for embeddings into the training of a neural network. The reviewers and the area chair are convinced of the merits of the approach, which comes with a theoretical justification (smoothing the Rademacher complexity in the uniform case) and solid comparative empirical evidence. The visualization of the embeddings and their interpretation (in the supplementary material and in the rebuttal) are appreciated. The AC hopes that the authors will take into account the suggestions and questions in the reviews, specifically those concerning the scope of the approach and its limitations, when writing the camera-ready version of the paper. Another question that comes to mind is whether the knowledge graph (e.g., as learned from a teacher network) can facilitate the training of a student network, e.g., by reducing the computational complexity.

Paper ID:	23
Title:	Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers