GauDP: Reinventing Multi-Agent Collaboration through Gaussian-Image Synergy in Diffusion Policies

Ziye Wang, Li Kang, Yiran Qin, Jiahua Ma, Zhanglin Peng, Lei Bai, Ruimao Zhang

Advances in Neural Information Processing Systems 38 (NeurIPS 2025) Main Conference Track

Despite significant advances in robotic policy generation, effective coordination in embodied multi-agent systems remains a fundamental challenge, particularly in scenarios where agents must balance individual perspectives with global environmental awareness. Existing approaches often struggle to reconcile fine-grained local control with comprehensive scene understanding, resulting in limited scalability and compromised collaboration quality. In this paper, we present GauDP, a novel Gaussian-image synergistic representation that facilitates scalable, perception-aware imitation learning in multi-agent collaborative systems. Specifically, GauDP reconstructs a globally consistent 3D Gaussian field from local-view RGB images, allowing all agents to dynamically query task-relevant features from a shared scene representation. This design supports both fine-grained control and globally coherent behavior without requiring additional sensing modalities. We evaluate GauDP on the RoboFactory benchmark, which includes diverse multi-arm manipulation tasks. Our method outperforms existing image-based methods and approaches the effectiveness of point-cloud-driven methods, while maintaining strong scalability as the number of agents increases. Extensive ablations and visualizations further demonstrate the robustness and efficiency of our unified local-global perception framework for multi-agent embodied learning.