Part of Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
Dmitriy Smirnov, Michael Gharbi, Matthew Fisher, Vitor Guizilini, Alexei Efros, Justin M. Solomon
Artists and video game designers often construct 2D animations using libraries of sprites---textured patches of objects and characters. We propose a deep learning approach that decomposes sprite-based video animations into a disentangled representation of recurring graphic elements in a self-supervised manner. By jointly learning a dictionary of possibly transparent patches and training a network that places them onto a canvas, we deconstruct sprite-based content into a sparse, consistent, and explicit representation that can be easily used in downstream tasks, like editing or analysis. Our framework offers a promising approach for discovering recurring visual patterns in image collections without supervision.
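To make the high-level idea concrete, the following is a minimal, hypothetical PyTorch sketch of the kind of model the abstract describes: a learnable dictionary of RGBA patches (the "possibly transparent" sprites) plus a network that predicts placements, composited onto a canvas and trained self-supervised by reconstructing the input frame. All names, layer sizes, and the single-translation-per-sprite placement scheme are illustrative assumptions, not the authors' released architecture.

```python
# Hypothetical sketch of sprite-dictionary learning with a placement network.
# Not the paper's implementation; shapes and hyperparameters are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpriteDecomposer(nn.Module):
    def __init__(self, num_sprites=16, patch_size=16, canvas_size=64):
        super().__init__()
        # Learnable dictionary of sprites: RGB + alpha channel (transparency).
        self.sprites = nn.Parameter(torch.rand(num_sprites, 4, patch_size, patch_size))
        # Placement network: per sprite, predict a 2D shift and a presence score.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, num_sprites * 3),  # (tx, ty, presence) per sprite
        )
        self.canvas_size = canvas_size
        self.patch_size = patch_size

    def forward(self, frame):
        b = frame.shape[0]
        params = self.encoder(frame).view(b, -1, 3)
        shifts = torch.tanh(params[..., :2])        # normalized translations
        presence = torch.sigmoid(params[..., 2:])   # soft on/off per sprite

        canvas = torch.zeros(b, 3, self.canvas_size, self.canvas_size, device=frame.device)
        for k in range(self.sprites.shape[0]):
            # Pad the sprite to canvas size, then translate it with an affine grid.
            patch = self.sprites[k].clamp(0, 1)
            pad = (self.canvas_size - self.patch_size) // 2
            patch = F.pad(patch, (pad, pad, pad, pad)).unsqueeze(0).expand(b, -1, -1, -1)
            theta = torch.zeros(b, 2, 3, device=frame.device)
            theta[:, 0, 0] = 1.0
            theta[:, 1, 1] = 1.0
            theta[:, :, 2] = shifts[:, k]
            grid = F.affine_grid(theta, patch.shape, align_corners=False)
            placed = F.grid_sample(patch, grid, align_corners=False)
            rgb, alpha = placed[:, :3], placed[:, 3:] * presence[:, k].view(b, 1, 1, 1)
            canvas = alpha * rgb + (1 - alpha) * canvas  # "over" alpha compositing
        return canvas


# Self-supervised objective: reconstruct the input frame from the placed sprites,
# so the dictionary and the placement network are learned jointly.
model = SpriteDecomposer()
frame = torch.rand(2, 3, 64, 64)
loss = F.mse_loss(model(frame), frame)
loss.backward()
```

Once trained on a sprite-based animation, the learned dictionary and the per-frame placements form the sparse, explicit representation the abstract refers to, which can then be inspected or edited directly.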