Paper ID: | 8657 |
---|---|

Title: | An Algorithm to Learn Polytree Networks with Hidden Nodes |

Learning causal structures with latent variables is a major challenge. This paper takes a shot at one of the simplest cases, polytree causal networks. While this is a limited special case, the ideas and methods may be useful more generally. It is interesting that the method needs only second and third order statistics of the observed variables. The paper would benefit from greater clarity in several areas. The paper defines of a quasi-skeleton and a collapsed representation, but I don't see a definition of a collapsed quasi-skeleton. Step 3 of the process in Section 3 should be worded more carefully. First, as noted above, collapsed quasi-skeletons are not defined. Second, what does it mean for the collapsed quasi-skeletons to "partially overlap"? The collapsed representation replaces each hidden cluster with a single hidden variable; so what does it mean for two single hidden variables to "partially overlap"? An example of partial overlap might be helpful; none of your quasi-skeleton examples have any overlap in their hidden clusters. Your supplemental material gives conditions required for this to happen, but there are no examples. This makes the algorithm hard to understand. The authors have already noted in the supplemental material that Figure 1(b) needed to include orientations on the edges. This confused me because I read the paper prior to the supplemental material. Thanks also for pointing out the errors in HRRA. The abstract could be greatly improved. The first sentence says that an approach is given by a formulation. This is nonsense! An ancestral graph is a mathematical object; what is the "ancestral graph approach"? Near the end of a very difficult to follow abstract, you finally say what the paper actually is about, but don't relate it back to the problems you have been discussing with "the ancestral graph approach" (whatever that is). Update: I thank the authors for their comments, which have satisfied me.

The paper considers learning a specific type of latent variable model from conditional independence relations, a latent polytree network. The paper shows that when the underlying network has a tree structure and when certain degree conditions are met for the latent variables, the latent variable model can be exactly recovered. They show the degree conditions are complete in the sense that when they are not met, there is a polytree network with fewer latent variables which satisfy the same conditional independence relations. An algorithm is provided for learning latent polytrees when these conditions are met. In general the paper is well written. The results are novel, sound, and theoretically interesting. The paper doesn't provide much motivation for the approach so it's difficult to gauge the significance. It's not completely obvious what a realistic scenario would be where you can assume both a tree structure and the given degree conditions on the latent nodes, but don't already understand the exact latent structure. No motivating examples were provided and there are no empirical results.

The text is very well-written; the algorithm is a bit involved (as it tests several cases), but a careful reader can follow all steps provided. The class of learnable polytrees is somewhat very limited; in particular, the assumption that the number of hidden variables is known limits the applicability of the method (from a machine learning perspective). Yet I find the work an important contribution for the field, and can certainly find application in some specific domain.