                   BPMF-1  BPMF-2  BPMF-3  Clustal [24]  ProbCons [25]

< 25% identity      0.68    0.74    0.76      0.71          0.72
20–40% identity     0.94    0.95    0.95      0.89          0.92
> 35% identity      0.97    0.98    0.98      0.97          0.98
All                 0.88    0.91    0.91      0.88          0.89

Table 1: Average SP scores in the ref1/test1 directory of BAliBASE. BPMF-i denotes the average SP score of the BPMF algorithm after i iterations of (parallel) message passing.

described in Section 3.1 specifically for HBM, is tighter. The experimental setup is based on a generative model over noisy observations of bipartite perfect matchings described in Appendix C.2. We show in Figure 3(c) the results of a sequence of these experiments for different bipartite component sizes N/2. These experiments demonstrate the scalability of sophisticated factorizations and their superiority over simpler ones.

4.2 Multiple sequence alignment

To assess the practical significance of this framework, we also apply it to BAliBASE [6], a standard protein multiple sequence alignment benchmark. We compared our system to Clustal 2.0.12 [24], the most popular multiple alignment tool, and to ProbCons 1.12 [25], a state-of-the-art system that also relies on enforcing transitivity constraints, but which is not derived via the optimization of an objective function. Our system uses a basic pair HMM [26] to score pairwise alignments. This scoring function captures a proper subset of the biological knowledge exploited by Clustal and ProbCons.6 The advantage of our system over the other systems is the better optimization technique, based on the measure factorization described in Section 3.2. We used a standard technique to transform the pairwise alignment marginals into a single valid multiple sequence alignment (see Appendix C.3). Our system outperformed both baselines after three BPMF parallel message passing iterations. The algorithm converged in all protein groups, and performance was identical after more than three iterations.
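As a concrete illustration, the column-based SP (sum-of-pairs) accuracy reported in Table 1 can be sketched as follows. This is a minimal sketch under our own naming: unlike the official BAliBASE scorer, it ignores annotated core blocks and scores every column, measuring the fraction of residue pairs aligned in the reference that the test alignment also places in a common column.

```python
from itertools import combinations

def aligned_pairs(alignment):
    """Collect, for each pair of sequences, the residue-index pairs that
    occupy the same column of the alignment (gap characters excluded)."""
    pairs = set()
    n = len(alignment)
    idx = [0] * n  # position within each ungapped sequence
    for col in range(len(alignment[0])):
        residues = []
        for s in range(n):
            if alignment[s][col] != '-':
                residues.append((s, idx[s]))
                idx[s] += 1
        # every pair of residues sharing this column counts as aligned
        for a, b in combinations(residues, 2):
            pairs.add((a, b))
    return pairs

def sp_score(test, reference):
    """Fraction of residue pairs aligned in the reference alignment that
    are also aligned in the test alignment (a simplified SP score)."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(aligned_pairs(test) & ref_pairs) / len(ref_pairs)
```

For example, with reference alignment ["AB-C", "A-DC"], a test alignment ["ABC-", "A-DC"] recovers the aligned A–A pair but not the C–C pair, giving an SP score of 0.5.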
Although the overall performance gain is not statistically significant according to a Wilcoxon signed-rank test, the largest gains were obtained in the low-identity subset, the "twilight zone" on which research in multiple sequence alignment has focused.

One caveat of this multiple alignment approach is its running time, which is cubic in the length of the longest sequence, whereas most multiple sequence alignment approaches are quadratic. For example, the running time for one iteration of BPMF in this experiment was 364.67s, but only 0.98s for Clustal; this is why we have restricted the experiments to the short sequences section of BAliBASE. Fortunately, several techniques are available to decrease the computational complexity of this algorithm: the transitivity factors can be subsampled using a coarse pass, or along a phylogenetic tree; and computation of the factors can be entirely parallelized. These improvements are orthogonal to the main point of this paper, so we leave them for future work.

5 Conclusion

Computing the moments of discrete exponential families can be difficult for two reasons: the structure of the sufficient statistic, which can create junction trees of high tree-width, and the structure of the base measure, which can induce an intractable combinatorial space. Most previous work on variational approximations has focused on the first difficulty; however, the second challenge also arises frequently in machine learning. In this work, we have presented a framework that fills this gap. It is based on an intuitive notion of measure factorization which, as we have shown, applies to a variety of combinatorial spaces. This notion enables variational algorithms to be adapted to the combinatorial setting.
Our experiments, on both synthetic and naturally occurring data, demonstrate the viability of the method compared to competing state-of-the-art algorithms.

6 More precisely, it captures long gap and hydrophobic core modeling.

References

[1] Alexander Karzanov and Leonid Khachiyan. On the conductance of order Markov chains. Order, 8(1):7–15, March 1991.

[2] Mark Jerrum, Alistair Sinclair, and Eric Vigoda. A polynomial-time approximation algorithm for the permanent of a matrix with non-negative entries. In Proceedings of the Annual ACM Symposium on Theory of Computing, pages 712–721, 2001.

[3] David Wilson. Mixing times of lozenge tiling and card shuffling Markov chains. The Annals of Applied Probability, 14:274–325, 2004.

[4] Adam Siepel and David Haussler. Phylogenetic estimation of context-dependent substitution rates by maximum likelihood. Mol Biol Evol, 21(3):468–488, 2004.

[5] Martin J. Wainwright and Michael I. Jordan. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1:1–305, 2008.

[6] Julie Thompson, Frédéric Plewniak, and Olivier Poch. BAliBASE: A benchmark alignments database for the evaluation of multiple sequence alignment programs. Bioinformatics, 15:87–88, 1999.

[7] David A. Smith and Jason Eisner. Dependency parsing by belief propagation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 145–156, Honolulu, October 2008.

[8] David Burkett, John Blitzer, and Dan Klein. Joint parsing and alignment with weakly synchronized grammars. In North American Association for Computational Linguistics, Los Angeles, 2010.

[9] Bert Huang and Tony Jebara. Approximating the permanent with belief propagation. ArXiv e-prints, 2009.

[10] Yusuke Watanabe and Michael Chertkov. 
Belief propagation and loop calculus for the permanent of a non-negative matrix. J. Phys. A: Math. Theor., 2010.

[11] Ben Taskar, Dan Klein, Michael Collins, Daphne Koller, and Christopher Manning. Max-margin parsing. In EMNLP, 2004.

[12] Ben Taskar, Simon Lacoste-Julien, and Dan Klein. A discriminative matching approach to word alignment. In EMNLP, 2005.

[13] John Duchi, Daniel Tarlow, Gal Elidan, and Daphne Koller. Using combinatorial optimization within max-product belief propagation. In Advances in Neural Information Processing Systems, 2007.

[14] Aron Culotta, Andrew McCallum, Bart Selman, and Ashish Sabharwal. Sparse message passing algorithms for weighted maximum satisfiability. In New England Student Symposium on Artificial Intelligence, 2007.

[15] Percy Liang, Ben Taskar, and Dan Klein. Alignment by agreement. In North American Association for Computational Linguistics (NAACL), pages 104–111, 2006.

[16] Percy Liang, Dan Klein, and Michael I. Jordan. Agreement-based learning. In Advances in Neural Information Processing Systems (NIPS), 2008.

[17] Leslie G. Valiant. The complexity of computing the permanent. Theoret. Comput. Sci., 1979.

[18] Jonathan S. Yedidia, William T. Freeman, and Yair Weiss. Generalized belief propagation. In Advances in Neural Information Processing Systems, pages 689–695, Cambridge, MA, 2001. MIT Press.

[19] Carsten Peterson and James R. Anderson. A mean field theory learning algorithm for neural networks. Complex Systems, 1:995–1019, 1987.

[20] Martin J. Wainwright, Tommi S. Jaakkola, and Alan S. Willsky. Tree-reweighted belief propagation algorithms and approximate ML estimation by pseudomoment matching. In Proceedings of the International Conference on Artificial Intelligence and Statistics, 2003.

[21] Alexandre Bouchard-Côté and Michael I. Jordan. Optimization of structured mean field objectives. 
In Proceedings of Uncertainty in Artificial Intelligence, 2009.

[22] Graham Brightwell and Peter Winkler. Counting linear extensions. Order, 1991.

[23] Lars Eilstrup Rasmussen. Approximating the permanent: A simple approach. Random Structures and Algorithms, 1992.

[24] Des G. Higgins and Paul M. Sharp. CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene, 73:237–244, 1988.

[25] Chuong B. Do, Mahathi S. P. Mahabhashyam, Michael Brudno, and Serafim Batzoglou. PROBCONS: Probabilistic consistency-based multiple sequence alignment. Genome Research, 15:330–340, 2005.

[26] David B. Searls and Kevin P. Murphy. Automata-theoretic models of mutation and alignment. In Proc Int Conf Intell Syst Mol Biol, 1995.