Learning Bayesian Networks with Thousands of Variables

Part of Advances in Neural Information Processing Systems 28 (NIPS 2015)

Bibtex Metadata Paper Reviews Supplemental


Mauro Scanagatta, Cassio P. de Campos, Giorgio Corani, Marco Zaffalon


We present a method for learning Bayesian networks from data sets containingthousands of variables without the need for structure constraints. Our approachis made of two parts. The first is a novel algorithm that effectively explores thespace of possible parent sets of a node. It guides the exploration towards themost promising parent sets on the basis of an approximated score function thatis computed in constant time. The second part is an improvement of an existingordering-based algorithm for structure optimization. The new algorithm provablyachieves a higher score compared to its original formulation. On very large datasets containing up to ten thousand nodes, our novel approach consistently outper-forms the state of the art.