Philip Sterne, Joerg Bornschein, Abdul-saboor Sheikh, Jörg Lücke, Jacquelyn Shelton
Modelling natural images with sparse coding (SC) has faced two main challenges: ﬂexibly representing varying pixel intensities and realistically representing low- level image components. This paper proposes a novel multiple-cause generative model of low-level image statistics that generalizes the standard SC model in two crucial points: (1) it uses a spike-and-slab prior distribution for a more realistic representation of component absence/intensity, and (2) the model uses the highly nonlinear combination rule of maximal causes analysis (MCA) instead of a lin- ear combination. The major challenge is parameter optimization because a model with either (1) or (2) results in strongly multimodal posteriors. We show for the ﬁrst time that a model combining both improvements can be trained efﬁciently while retaining the rich structure of the posteriors. We design an exact piece- wise Gibbs sampling method and combine this with a variational method based on preselection of latent dimensions. This combined training scheme tackles both analytical and computational intractability and enables application of the model to a large number of observed and hidden dimensions. Applying the model to image patches we study the optimal encoding of images by simple cells in V1 and compare the model’s predictions with in vivo neural recordings. In contrast to standard SC, we ﬁnd that the optimal prior favors asymmetric and bimodal ac- tivity of simple cells. Testing our model for consistency we ﬁnd that the average posterior is approximately equal to the prior. Furthermore, we ﬁnd that the model predicts a high percentage of globular receptive ﬁelds alongside Gabor-like ﬁelds. Similarly high percentages are observed in vivo. Our results thus argue in favor of improvements of the standard sparse coding model for simple cells by using ﬂexible priors and nonlinear combinations.