This paper presents a formulation for unsupervised learning of clusters reflecting multiple causal structure in binary data. Unlike the standard mixture model, a multiple cause model accounts for observed data by combining assertions from many hidden causes, each of which can pertain to varying degree to any subset of the observable dimensions. A crucial issue is the mixing function for combining beliefs from different cluster centers in order to generate data reconstructions whose errors are minimized both during recognition and learning. We demonstrate a weakness inherent to the popular weighted sum followed by sigmoid squashing, and offer an alternative form of the nonlinearity. Results are presented demonstrating the algorithm's ability to successfully discover coherent multiple causal representations in noisy test data and in images of printed characters.
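To make the role of the mixing function concrete, the following minimal sketch contrasts two ways of combining beliefs from several hidden causes into a reconstruction of a binary data vector. The first is the weighted sum followed by sigmoid squashing named above. The second is a noisy-OR combination, used here only as one plausible alternative nonlinearity; the abstract does not specify the form of the alternative, so this choice is an assumption. The names `sigmoid_mix`, `noisy_or_mix`, `m` (cause activities), and `c` (cluster centers) are hypothetical and introduced for illustration.

```python
import math


def sigmoid_mix(m, c):
    """Weighted sum of cause contributions, then sigmoid squashing.

    m: list of K cause activities.
    c: K x J list of cluster-center weights (one row per cause).
    Returns a list of J predicted values in (0, 1).
    """
    n_dims = len(c[0])
    out = []
    for j in range(n_dims):
        total = sum(m[k] * c[k][j] for k in range(len(m)))
        out.append(1.0 / (1.0 + math.exp(-total)))
    return out


def noisy_or_mix(m, c):
    """Noisy-OR combination (an assumed alternative, not taken from the text).

    m: list of K cause activities in [0, 1].
    c: K x J list of cluster-center values in [0, 1].
    A dimension is predicted "on" unless every cause fails to turn it on.
    """
    n_dims = len(c[0])
    out = []
    for j in range(n_dims):
        p_all_off = 1.0
        for k in range(len(m)):
            p_all_off *= 1.0 - m[k] * c[k][j]
        out.append(1.0 - p_all_off)
    return out


# Two fully active causes over four observable dimensions (illustrative values).
activities = [1.0, 1.0]
centers = [
    [1.0, 1.0, 0.0, 0.0],  # cause 0 asserts dimensions 0 and 1
    [0.0, 1.0, 1.0, 0.0],  # cause 1 asserts dimensions 1 and 2
]

sigmoid_prediction = sigmoid_mix(activities, centers)
noisy_or_prediction = noisy_or_mix(activities, centers)
```

With these illustrative values, the noisy-OR prediction is exactly `[1.0, 1.0, 1.0, 0.0]`, while the sigmoid prediction assigns 0.5 to the fourth dimension, which no cause asserts, because the sigmoid of a zero sum is 0.5. Whether this is the weakness the paper demonstrates is not stated in the abstract.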