One of the advantages of supervised learning is that the final error metric is available during training. For classifiers, the algorithm can directly reduce the number of misclassifications on the training set. Unfortunately, when modeling human learning or constructing classifiers for autonomous robots, supervisory labels are often unavailable or too expensive. In this paper we show that we can substitute for the labels by making use of structure between the pattern distributions to different sensory modalities. We show that minimizing the disagreement between the outputs of networks processing patterns from these different modalities is a sensible approximation to minimizing the number of misclassifications in each modality, and leads to similar results. Using the Peterson-Barney vowel dataset, we show that the algorithm performs well in finding appropriate placement for the codebook vectors, particularly when the confusable classes are different for the two modalities.
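To make the disagreement measure concrete, the following minimal sketch counts how often two nearest-codebook classifiers, one per modality, give different outputs on co-occurring pattern pairs. It is an illustration under stated assumptions, not the paper's implementation: the helper names `nearest_label` and `disagreement_rate`, the one-dimensional toy codebooks, and the pattern pairs are all hypothetical, and no learning rule is shown because this passage does not specify one.

```python
# Illustrative sketch only: the helper names and toy data are hypothetical.

def nearest_label(pattern, codebook):
    """Return the label of the codebook vector closest to `pattern`.

    `codebook` is a list of (vector, label) pairs.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(codebook, key=lambda entry: sq_dist(pattern, entry[0]))
    return label


def disagreement_rate(pairs, codebook_a, codebook_b):
    """Fraction of co-occurring pattern pairs on which the two outputs differ."""
    differing = sum(
        nearest_label(xa, codebook_a) != nearest_label(xb, codebook_b)
        for xa, xb in pairs
    )
    return differing / len(pairs)


# Toy one-dimensional codebooks, one per modality (made-up values).
codebook_a = [((0.0,), 0), ((1.0,), 1)]
codebook_b = [((0.0,), 0), ((2.0,), 1)]

# Each pair holds patterns that occurred together in the two modalities.
pairs = [((0.1,), (0.2,)), ((0.9,), (1.9,)), ((0.4,), (1.6,)), ((0.8,), (0.3,))]

rate = disagreement_rate(pairs, codebook_a, codebook_b)
print(rate)
```

Note that the measure uses only the two outputs for each pair and never a supervisory label, which is what lets it stand in for the misclassification count when labels are unavailable.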