Learning Bayesian Networks with Thousands of Variables

Part of Advances in Neural Information Processing Systems 28 (NIPS 2015)

Bibtex »Metadata »Paper »Reviews »Supplemental »


Mauro Scanagatta, Cassio P. de Campos, Giorgio Corani, Marco Zaffalon


We present a method for learning Bayesian networks from data sets containingthousands of variables without the need for structure constraints. Our approachis made of two parts. The first is a novel algorithm that effectively explores thespace of possible parent sets of a node. It guides the exploration towards themost promising parent sets on the basis of an approximated score function thatis computed in constant time. The second part is an improvement of an existingordering-based algorithm for structure optimization. The new algorithm provablyachieves a higher score compared to its original formulation. On very large datasets containing up to ten thousand nodes, our novel approach consistently outper-forms the state of the art.