NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID: 5963
Title: Image Captioning: Transforming Objects into Words

The paper incorporates an object relation module into the transformer model and demonstrates improvements with this approach. After reading the rebuttal, the reviewers agreed that this is an interesting direction to pursue. They liked the method and, in part, the results presented in the rebuttal. However, the reviewers remained concerned that additional evidence is necessary (e.g., proper evaluation on the test server, experimentation with different spatial features, a more in-depth discussion of the attention visualizations, an empirical comparison to prior work, and human evaluation). The AC concurs and recommends acceptance as a poster.