Evaluating the statistical significance of biclusters

Part of Advances in Neural Information Processing Systems 28 (NIPS 2015)

Bibtex Metadata Paper Reviews


Jason D. Lee, Yuekai Sun, Jonathan E. Taylor


Biclustering (also known as submatrix localization) is a problem of high practical relevance in exploratory analysis of high-dimensional data. We develop a framework for performing statistical inference on biclusters found by score-based algorithms. Since the bicluster was selected in a data dependent manner by a biclustering or localization algorithm, this is a form of selective inference. Our framework gives exact (non-asymptotic) confidence intervals and p-values for the significance of the selected biclusters. Further, we generalize our approach to obtain exact inference for Gaussian statistics.