What Data Enables Optimal Decisions? An Exact Characterization for Linear Optimization

Omar Bennouna, Amine Bennouna, Saurabh Amin, Asuman Ozdaglar

Advances in Neural Information Processing Systems 38 (NeurIPS 2025) Main Conference Track

We study the fundamental question of how informative a dataset is for solving a given decision-making task. In our setting, the dataset provides partial information about unknown parameters that influence task outcomes. Focusing on linear programs, we characterize when a dataset is sufficient to recover an optimal decision, given an uncertainty set on the cost vector. Our main contribution is a sharp geometric characterization that identifies the directions of the cost vector that matter for optimality, relative to the task constraints and uncertainty set. We further develop a practical algorithm that, for a given task, constructs a minimal or least-costly sufficient dataset. Our results reveal that small, well-chosen datasets can often fully determine optimal decisions---offering a principled foundation for task-aware data selection.