{"title": "Comparing Bayesian models for multisensory cue combination without mandatory integration", "book": "Advances in Neural Information Processing Systems", "page_first": 81, "page_last": 88, "abstract": null, "full_text": "Comparing Bayesian models for multisensory cue combination without mandatory integration

Ulrik R. Beierholm
Computation and Neural Systems
California Institute of Technology
Pasadena, CA 91025
beierh@caltech.edu

Ladan Shams
Department of Psychology
University of California, Los Angeles
Los Angeles, CA 90095
ladan@psych.ucla.edu

Konrad P. Körding
Rehabilitation Institute of Chicago
Northwestern University, Dept. PM&R
Chicago, IL 60611
konrad@koerding.com

Wei Ji Ma
Department of Brain and Cognitive Sciences
University of Rochester
Rochester, NY 14620
weijima@gmail.com

Abstract

Bayesian models of multisensory perception traditionally address the problem of estimating an underlying variable that is assumed to be the cause of the two sensory signals. The brain, however, has to solve a more general problem: it also has to establish which signals come from the same source and should be integrated, and which ones do not and should be segregated. In the last couple of years, a few models have been proposed to solve this problem in a Bayesian fashion. One of these has the strength that it formalizes the causal structure of sensory signals. We first compare these models on a formal level. Furthermore, we conduct a psychophysics experiment to test human performance in an auditory-visual spatial localization task in which integration is not mandatory. 
We find that the causal Bayesian inference model accounts for the data better than other models.
Keywords: causal inference, Bayesian methods, visual perception.

1 Multisensory perception

In the ventriloquist illusion, a performer speaks without moving his/her mouth while moving a puppet's mouth in synchrony with his/her speech. This makes the puppet appear to be speaking. This illusion was first conceptualized as “visual capture”, occurring when visual and auditory stimuli exhibit a small conflict ([1, 2]). Only recently has it been demonstrated that the phenomenon may be seen as a byproduct of a much more flexible and nearly Bayes-optimal strategy ([3]), and therefore is part of a large collection of cue combination experiments showing such statistical near-optimality [4, 5]. In fact, cue combination has become the poster child for Bayesian inference in the nervous system.
In previous studies of multisensory integration, two sensory stimuli are presented which act as cues about a single underlying source. For instance, in the auditory-visual localization experiment by Alais and Burr [3], observers were asked to envisage each presentation of a light blob and a sound click as a single event, like a ball hitting the screen. In many cases, however, the brain is not only posed with the problem of identifying the position of a common source, but also of determining whether there was a common source at all. In the on-stage ventriloquist illusion, it is indeed primarily the causal inference process that is being fooled, because veridical perception would attribute independent causes to the auditory and the visual stimulus.
To extend our understanding of multisensory perception to this more general problem, it is necessary to manipulate the degree of belief assigned to there being a common cause within a multisensory task. 
Intuitively, we expect that when two signals are very different, they are less likely to be perceived as having a common source. It is well-known that increasing the discrepancy or inconsistency between stimuli reduces the influence that they have on each other [6, 7, 8, 9, 10, 11]. In auditory-visual spatial localization, one variable that controls stimulus similarity is spatial disparity (another would be temporal disparity). Indeed, it has been reported that increasing spatial disparity leads to a decrease in auditory localization bias [1, 12, 13, 14, 15, 16, 17, 2, 18, 19, 20, 21]. This decrease also correlates with a decrease in the reports of unity [19, 21]. Despite the abundance of experimental data on this issue, no general theory exists that can explain multisensory perception across a wide range of cue conflicts.

2 Models

The success of Bayesian models for cue integration has motivated attempts to extend them to situations of large sensory conflict and a consequent low degree of integration. In one recent study taking this approach, subjects were presented with concurrent visual flashes and auditory beeps and asked to count both the number of flashes and the number of beeps [11]. The advantage of the experimental paradigm adopted here was that it probed the joint response distribution by requiring a dual report. Human data were accounted for well by a Bayesian model in which the joint prior distribution over visual and auditory number was approximated from the data. In a similar study, subjects were presented with concurrent flashes and taps and asked to count either the flashes or the taps [9, 22]. The Bayesian model proposed by these authors assumed a joint prior distribution with a near-diagonal form. The corresponding generative model assumes that the sensory sources somehow interact with one another. 
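The near-diagonal joint prior assumed in these studies can be sketched in a few lines of Python. This is our own illustration, not the authors' code: `sigma_coupling` and `omega` stand for the models' coupling parameters, and the prior is left unnormalized.

```python
import math

def coupling_prior(s_v, s_a, sigma_coupling, omega=0.0):
    """Unnormalized bisensory prior over (s_v, s_a).

    A Gaussian ridge along the diagonal s_v = s_a plus a flat background;
    omega = 0 recovers the purely near-diagonal form.
    """
    return omega + math.exp(-(s_v - s_a) ** 2 / (2 * sigma_coupling ** 2))
```

The ridge makes nearby source positions a priori more probable than discrepant ones, while a nonzero background keeps widely separated positions from being ruled out entirely.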
A third experiment modulated the rates of flashes and beeps. The task was to judge either the visual or the auditory modulation rate relative to a standard [23]. The data from this experiment were modeled using a joint prior distribution which is the sum of a near-diagonal prior and a flat background.
While all these models are Bayesian in a formal sense, their underlying generative model does not formalize the model selection process that underlies the combination of cues. This makes it necessary either to estimate an empirical prior [11] by fitting it to human behavior or to assume an ad hoc form [22, 23]. However, we believe that such assumptions are not needed. It was shown recently that human judgments of spatial unity in an auditory-visual spatial localization task can be described using a Bayesian inference model that infers causal structure [24, 25]. In this model, the brain not only estimates a stimulus variable, but also infers the probability that the two stimuli have a common cause. In this paper we compare these different models on a large data set of human position estimates in an auditory-visual task.
In this section we first describe the traditional cue integration model, then the recent models based on joint stimulus priors, and finally the causal inference model. To relate to the experiment in the next section, we will use the terminology of auditory-visual spatial localization, but the formalism is very general.

2.1 Traditional cue integration

The traditional generative model of cue integration [26] has a single source location s which produces on each trial an internal representation (cue) of visual location, xV, and one of auditory location, xA. We assume that the noise processes by which these internal representations are generated are conditionally independent of each other and follow Gaussian distributions. 
That is, p(xV|s) ∼ N(xV; s, σV) and p(xA|s) ∼ N(xA; s, σA), where N(x; μ, σ) stands for the normal distribution over x with mean μ and standard deviation σ. If on a given trial the internal representations are xV and xA, the probability that their source was s is given by Bayes' rule,

p(s|xV, xA) ∝ p(xV|s) p(xA|s).

If a subject performs maximum-likelihood estimation, then the estimate will be

ŝ = (wV xV + wA xA)/(wV + wA), where wV = 1/σV² and wA = 1/σA².

It is important to keep in mind that this is the estimate on a single trial. A psychophysical experimenter can never have access to xV and xA, which are the noisy internal representations. Instead, an experimenter will want to collect estimates over many trials and is interested in the distribution of ŝ given sV and sA, which are the sources generated by the experimenter. In a typical cue combination experiment, xV and xA are not actually generated by the same source, but by different sources, a visual one sV and an auditory one sA. These sources are chosen close to each other so that the subject can imagine that the resulting cues originate from a single source and thus implicitly have a common cause. The experimentally observed distribution is then

p(ŝ|sV, sA) = ∫∫ p(ŝ|xV, xA) p(xV|sV) p(xA|sA) dxV dxA.

Given that ŝ is a linear combination of two normally distributed variables, it will itself follow a normal distribution, with mean ⟨ŝ⟩ = (wV sV + wA sA)/(wV + wA) and variance σŝ² = 1/(wV + wA). The reason that we emphasize this point is because many authors identify the estimate distribution p(ŝ|sV, sA) with the posterior distribution p(s|xV, xA). This is justified in this case because all distributions are Gaussian and the estimate is a linear combination of cues. 
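The reliability-weighted average above can be sketched in a few lines of Python (a minimal illustration with made-up cue values and noise levels, not the fitted parameters of the experiment):

```python
import math

def combine_cues(x_v, x_a, sigma_v, sigma_a):
    """Maximum-likelihood combination of two conditionally independent Gaussian cues."""
    w_v = 1.0 / sigma_v ** 2   # visual reliability (inverse variance)
    w_a = 1.0 / sigma_a ** 2   # auditory reliability
    s_hat = (w_v * x_v + w_a * x_a) / (w_v + w_a)   # single-trial estimate
    sigma_hat = math.sqrt(1.0 / (w_v + w_a))        # its standard deviation
    return s_hat, sigma_hat

# With vision much more reliable than audition, the combined estimate is
# pulled toward the visual cue and is more precise than either cue alone.
s_hat, sigma_hat = combine_cues(x_v=0.0, x_a=10.0, sigma_v=2.0, sigma_a=9.0)
```

Note that the combined standard deviation is always smaller than that of the more reliable cue, which is the classic signature of mandatory integration.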
However, in the case of causal inference, these conditions are violated and the estimate distribution will in general not be the same as the posterior distribution.

2.2 Models with bisensory stimulus priors

Models with bisensory stimulus priors propose the posterior over source positions to be proportional to the product of unimodal likelihoods and a two-dimensional prior:

p(sV, sA|xV, xA) ∝ p(sV, sA) p(xV|sV) p(xA|sA)

The traditional cue combination model has p(sV, sA) = p(sV) δ(sV − sA), usually (as above) even with p(sV) uniform. The question arises what bisensory stimulus prior is appropriate. In [11], the prior is estimated from data, has a large number of parameters, and is therefore limited in its predictive power. In [23], it has the form

p(sV, sA) ∝ ω + exp(−(sV − sA)²/(2σcoupling²)),

while in [22] the additional assumption ω = 0 is made¹. In all three models, the response distribution p(ŝV, ŝA|sV, sA) is obtained by identifying it with the posterior distribution p(sV, sA|xV, xA). This procedure thus implicitly assumes that marginalizing over the latent variables xV and xA is not necessary, which leads to a significant error for non-Gaussian priors. In this paper we correctly deal with these issues and in all cases marginalize over the latent variables. The parametric models used for the coupling between the cues lead to an elegant low-dimensional model of cue integration that allows for estimates of single cues that differ from one another.

2.3 Causal inference model

In the causal inference model [24, 25], we start from the traditional cue integration model but remove the assumption that two signals are caused by the same source. 
Instead, the number of sources can be one or two and is itself a variable that needs to be inferred from the cues.

Figure 1: Generative model of causal inference. (Panels: for C = 1, a single source s generates both xV and xA; for C = 2, independent sources sV and sA generate xV and xA.)

¹This family of Bayesian posterior distributions also includes one used to successfully model cue combination in depth perception [27, 28]. In depth perception, however, there is no notion of segregation, as always a single surface is assumed.

If there are two sources, they are assumed to be independent. Thus, we use the graphical model depicted in Fig. 1. We denote the number of sources by C. The probability distribution over C given internal representations xV and xA is given by Bayes' rule:

p(C|xV, xA) ∝ p(xV, xA|C) p(C).

In this equation, p(C) is the a priori probability of C. We will denote the probability of a common cause by pcommon, so that p(C = 1) = pcommon and p(C = 2) = 1 − pcommon. The probability of generating xV and xA given C is obtained by inserting a summation over the sources:

p(xV, xA|C = 1) = ∫ p(xV, xA|s) p(s) ds = ∫ p(xV|s) p(xA|s) p(s) ds

Here p(s) is a prior for spatial location, which we assume to be distributed as N(s; 0, σP). 
Then all three factors in this integral are Gaussians, allowing for an analytic solution:

p(xV, xA|C = 1) = 1/(2π √(σV²σA² + σV²σP² + σA²σP²)) exp[−½ ((xV − xA)²σP² + xV²σA² + xA²σV²)/(σV²σA² + σV²σP² + σA²σP²)].

For p(xV, xA|C = 2) we realize that xV and xA are independent of each other and thus obtain

p(xV, xA|C = 2) = (∫ p(xV|sV) p(sV) dsV)(∫ p(xA|sA) p(sA) dsA)

Again, as all these distributions are assumed to be Gaussian, we obtain an analytic solution,

p(xV, xA|C = 2) = 1/(2π √((σV² + σP²)(σA² + σP²))) exp[−½ (xV²/(σV² + σP²) + xA²/(σA² + σP²))].

Now that we have computed p(C|xV, xA), the posterior distribution over sources is given by

p(si|xV, xA) = Σ_{C=1,2} p(si|xV, xA, C) p(C|xV, xA)

where i can be V or A and the posteriors conditioned on C are well-known:

p(si|xA, xV, C = 1) = p(xA|si) p(xV|si) p(si) / ∫ p(xA|s) p(xV|s) p(s) ds,   p(si|xA, xV, C = 2) = p(xi|si) p(si) / ∫ p(xi|si) p(si) dsi

The former is the same as in the case of mandatory integration with a prior, the latter is simply the unimodal posterior in the presence of a prior. Based on the posterior distribution on a given trial, p(si|xV, xA), an estimate has to be created. For this, we use a sum-squared-error cost function, Cost = ⟨p(C = 1|xV, xA)(ŝ − s)²⟩ + ⟨p(C = 2|xV, xA)(ŝ − sV or A)²⟩. 
Then the best estimate is the mean of the posterior distribution, for instance for the visual estimation:

ŝV = p(C = 1|xA, xV) ŝV,C=1 + p(C = 2|xA, xV) ŝV,C=2

where ŝV,C=1 = (xV σV⁻² + xA σA⁻² + xP σP⁻²)/(σV⁻² + σA⁻² + σP⁻²) and ŝV,C=2 = (xV σV⁻² + xP σP⁻²)/(σV⁻² + σP⁻²), with xP = 0 the mean of the spatial prior. If pcommon equals 0 or 1, this estimate reduces to one of the conditioned estimates and is linear in xV and xA. If 0 < pcommon < 1, the estimate is a nonlinear combination of xV and xA, because of the functional form of p(C|xV, xA). The response distributions, that is, the distributions of ŝV and ŝA given sV and sA over many trials, now cannot be identified with the posterior distribution on a single trial and cannot be computed analytically either. The correct way to obtain the response distribution is to simulate an experiment numerically.
Note that the causal inference model above can also be cast in the form of a bisensory stimulus prior by integrating out the latent variable C, with:

p(sA, sV) = p(C = 1) δ(sA − sV) p(sA) + p(C = 2) p(sA) p(sV)

However, in addition to justifying the form of the interaction between the cues, the causal inference model has the advantage of being based on a generative model that formalizes salient properties of the world, and it thereby also allows one to predict judgments of unity.

3 Model performance and comparison

To examine the performance of the causal inference model and to compare it to previous models, we performed a human psychophysics experiment in which we adopted the same dual-report paradigm as was used in [11]. 
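Before turning to the experiment, the causal-inference estimate derived above can be sketched numerically. The following is our own minimal Python implementation, with illustrative parameter values rather than the fitted ones; it combines the two conditional estimates, weighted by the inferred probability of a common cause:

```python
import math

def causal_inference_estimate(x_v, x_a, sigma_v, sigma_a, sigma_p, p_common):
    """Posterior-mean visual estimate under the causal inference model (prior mean 0)."""
    var_v, var_a, var_p = sigma_v ** 2, sigma_a ** 2, sigma_p ** 2
    # Likelihood of the two cues under a common cause (C = 1)
    denom1 = var_v * var_a + var_v * var_p + var_a * var_p
    like1 = math.exp(-0.5 * ((x_v - x_a) ** 2 * var_p + x_v ** 2 * var_a
                             + x_a ** 2 * var_v) / denom1) / (2 * math.pi * math.sqrt(denom1))
    # Likelihood under independent causes (C = 2)
    like2 = math.exp(-0.5 * (x_v ** 2 / (var_v + var_p)
                             + x_a ** 2 / (var_a + var_p))) \
        / (2 * math.pi * math.sqrt((var_v + var_p) * (var_a + var_p)))
    # Posterior probability of a common cause, p(C = 1 | x_v, x_a)
    p_c1 = like1 * p_common / (like1 * p_common + like2 * (1 - p_common))
    # Conditional posterior-mean estimates (spatial prior centered at 0)
    s_v_c1 = (x_v / var_v + x_a / var_a) / (1 / var_v + 1 / var_a + 1 / var_p)
    s_v_c2 = (x_v / var_v) / (1 / var_v + 1 / var_p)
    # Model average: the minimum-squared-error visual estimate
    return p_c1 * s_v_c1 + (1 - p_c1) * s_v_c2, p_c1

# Nearby cues yield a high probability of a common cause; discrepant cues do not.
est_near, p_near = causal_inference_estimate(5.0, 5.0, 2.0, 9.0, 12.0, 0.28)
est_far, p_far = causal_inference_estimate(5.0, -20.0, 2.0, 9.0, 12.0, 0.28)
```

For coincident cues the visual estimate is pulled toward the auditory cue, whereas for widely separated cues it reverts to the vision-only posterior, reproducing the breakdown of integration at large disparities.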
Observers were simultaneously presented with a brief visual stimulus and an auditory stimulus, each of which could originate from one of five locations on an imaginary horizontal line (−10°, −5°, 0°, 5°, or 10° with respect to the fixation point). Auditory stimuli were 32 ms of white noise filtered through an individually calibrated head-related transfer function (HRTF) and presented through a pair of headphones, whereas the visual stimuli were high-contrast Gabors on a noisy background presented on a 21-inch CRT monitor. Observers had to report by means of a key press (1-5) the perceived positions of both the visual and the auditory stimulus. Each combination of locations was presented with the same frequency over the course of the experiment. In this way, for each condition, visual and auditory response histograms were obtained.
We obtained response distributions for each of the three models described above by numerical simulation. On each trial, estimation is followed by a step in which the key corresponding to the position closest to the best estimate is selected. The simulated histograms obtained in this way were compared to the measured response frequencies of all subjects by computing the R² statistic. The parameters in the causal inference model were optimized using fminsearch in MATLAB to maximize R². The best combination of parameters yielded an R² of 0.97. The response frequencies are depicted in Fig. 2. The bisensory prior models also explain most of the variance, with R² = 0.96 for the Roach model and R² = 0.91 for the Bresciani model. This shows that it is possible to model cue combination for large disparities well using such models.

Figure 2: A comparison between subjects' performance and the causal inference model. The blue lines indicate the frequency of subjects' responses to visual stimuli, the red lines the responses to auditory stimuli. 
Each set of lines is one audio-visual stimulus condition. Rows of conditions have a constant visual stimulus, columns a constant auditory stimulus. Model predictions are indicated by the dotted red and blue lines.

3.1 Model comparison

To facilitate quantitative comparison with other models, we now fit the parameters of each model² to individual subject data, maximizing the likelihood of the model, i.e., the probability of the response frequencies under the model. The causal inference model fits human data better than the other models. Compared to the best fit of the causal inference model, the Bresciani model has a maximal log likelihood ratio (base e) of the data of −22 ± 6 (mean ± s.e.m. over subjects), and the Roach model has a maximal log likelihood ratio of the data of −18 ± 6. A causal inference model that maximizes the probability of being correct instead of minimizing the mean squared error has a maximal log likelihood ratio of −18 ± 3. These values are considered decisive evidence in favor of the causal inference model that minimizes the mean squared error (for details, see [25]).
The parameter values found in the likelihood optimization of the causal model are as follows: pcommon = 0.28 ± 0.05, σV = 2.14 ± 0.22°, σA = 9.2 ± 1.1°, σP = 12.3 ± 1.1° (mean ± s.e.m. over subjects). We see that there is a relatively low prior probability of a common cause. In this paradigm, auditory localization is considerably less precise than visual localization. Also, there is a weak prior for central locations.

3.2 Localization bias

A useful quantity to gain more insight into the structure of multisensory data is the cross-modal bias. 
In our experiment, relative auditory bias is defined as the difference between the mean auditory estimate in a given condition and the real auditory position, divided by the difference between the real visual position and the real auditory position in this condition. If the influence of vision on the auditory estimate is strong, then the relative auditory bias will be high (close to one). It is well-known that bias decreases with spatial disparity and our experiment is no exception (solid line in Fig. 3; data were combined between positive and negative disparities).
It can easily be shown that a traditional cue integration model would predict a bias equal to (1 + σV²/σA²)⁻¹, which would be close to 1 and independent of disparity, unlike the data. This shows that a mandatory integration model is an insufficient model of multisensory interactions.
We used the individual subject fits from above and averaged the auditory bias values obtained from those fits (i.e., we did not fit the bias data themselves). Fits are shown in Fig. 3 (dashed lines). We applied a paired t-test to the differences between the 5° and 20° disparity conditions (model-subject comparison). Using a double-sided test, the null hypothesis that the difference between the bias in the 5° and 20° conditions is correctly predicted by each model is rejected for the Bresciani model (p < 0.002) and the Roach model (p < 0.042) and accepted for the causal inference model (p > 0.17). Alternatively, with a single-sided test, the hypothesis is rejected for the Bresciani model (p < 0.001) and the Roach model (p < 0.021) and accepted for the causal inference model (p > 0.9).

Figure 3: Auditory bias as a function of spatial disparity. Solid blue line: data. Red: causal inference model. Green: model by Roach et al. [23]. Purple: model by Bresciani et al. [22]. Models were optimized on response frequencies (as in Fig. 2), not on the bias data.

The reason that the Bresciani model fares worst is that its prior distribution does not include a component that corresponds to independent causes. On the contrary, the prior used in the Roach model contains two terms, one term that is independent of the disparity and one term that decreases with increasing disparity. It is thus functionally somewhat similar to the causal inference model.

²The Roach et al. model has four free parameters (ω, σV, σA, σcoupling), the Bresciani et al. model has three (σV, σA, σcoupling), and the causal inference model has four (pcommon, σV, σA, σP). We do not consider the Shams et al. model here, since it has many more parameters and it is not immediately clear how in this model the erroneous identification of posterior with response distribution can be corrected.

4 Discussion

We have argued that any model of multisensory perception should account not only for situations of small, but also of large conflict. In these situations, segregation is more likely, in which the two stimuli are not perceived to have the same cause. Even when segregation occurs, the two stimuli can still influence each other.
We compared three Bayesian models designed to account for situations of large conflict by applying them to auditory-visual spatial localization data. We pointed out a common mistake: for non-Gaussian bisensory priors without mandatory integration, the response distribution can no longer be identified with the posterior distribution. After correct implementation of the three models, we found that the causal inference model is superior to the models with ad hoc bisensory priors. This is
This is\nexpected, as the nervous system actually needs to solve the problem of deciding which stimuli have\na common cause and which stimuli are unrelated.\nWe have seen that multisensory perception is a suitable tool for studying causal inference. How-\never, the causal inference model also has the potential to quantitatively explain a number of other\nperceptual phenomena, including perceptual grouping and binding, as well as within-modality cue\ncombination [27, 28]. Causal inference is a universal problem: whenever the brain has multiple\npieces of information it must decide if they relate to one another or are independent.\nAs the causal inference model describes how the brain processes probabilistic sensory information,\nthe question arises about the neural basis of these processes. Neural populations encode probability\ndistributions over stimuli through Bayes\u2019 rule, a type of coding known as probabilistic population\ncoding. Recent work has shown how the optimal cue combination assuming a common cause can\nbe implemented in probabilistic population codes through simple linear operations on neural activ-\nities [29]. This framework makes essential use of the structure of neural variability and leads to\nphysiological predictions for activity in areas that combine multisensory input, such as the superior\ncolliculus. Computational mechanisms for causal inference are expected have a neural substrate that\ngeneralizes these linear operations on population activities. A neural implementation of the causal\ninference model will open the door to a complete neural theory of multisensory perception.\n\nReferences\n[1] H.L. Pick, D.H. Warren, and J.C. Hay. Sensory con\ufb02ict in judgements of spatial direction. Percept.\n\nPsychophys., 6:203205, 1969.\n\n[2] D. H. Warren, R. B. Welch, and T. J. McCarthy. The role of visual-auditory \u201dcompellingness\u201d in the ven-\ntriloquism effect: implications for transitivity among the spatial senses. 
Percept Psychophys, 30(6):557–64, 1981.

[3] D. Alais and D. Burr. The ventriloquist effect results from near-optimal bimodal integration. Curr Biol, 14(3):257–62, 2004.

[4] R. A. Jacobs. Optimal integration of texture and motion cues to depth. Vision Res, 39(21):3621–9, 1999.

[5] R. J. van Beers, A. C. Sittig, and J. J. Gon. Integration of proprioceptive and visual position-information: An experimentally supported model. J Neurophysiol, 81(3):1355–64, 1999.

[6] D. H. Warren and W. T. Cleaves. Visual-proprioceptive interaction under large amounts of conflict. J Exp Psychol, 90(2):206–14, 1971.

[7] C. E. Jack and W. R. Thurlow. Effects of degree of visual association and angle of displacement on the “ventriloquism” effect. Percept Mot Skills, 37(3):967–79, 1973.

[8] G. H. Recanzone. Auditory influences on visual temporal rate perception. J Neurophysiol, 89(2):1078–93, 2003.

[9] J. P. Bresciani, M. O. Ernst, K. Drewing, G. Bouyer, V. Maury, and A. Kheddar. Feeling what you hear: auditory signals can modulate tactile tap perception. Exp Brain Res, 162(2):172–80, 2005.

[10] R. Gepshtein, P. Leiderman, L. Genosar, and D. Huppert. Testing the three step excited state proton transfer model by the effect of an excess proton. J Phys Chem A Mol Spectrosc Kinet Environ Gen Theory, 109(42):9674–84, 2005.

[11] L. Shams, W. J. Ma, and U. Beierholm. Sound-induced flash illusion as an optimal percept. Neuroreport, 16(17):1923–7, 2005.

[12] G. Thomas. Experimental study of the influence of vision on sound localisation. J Exp Psychol, 28:167–177, 1941.

[13] W. R. Thurlow and C. E. Jack. Certain determinants of the “ventriloquism effect”. Percept Mot Skills, 36(3):1171–84, 1973.

[14] C. S. Choe, R. B. Welch, R. M. Gilford, and J. F. Juola. The “ventriloquist effect”: visual dominance or response bias. 
Perception and Psychophysics, 18:55–60, 1975.

[15] R. I. Bermant and R. B. Welch. Effect of degree of separation of visual-auditory stimulus and eye position upon spatial interaction of vision and audition. Percept Mot Skills, 42(43):487–93, 1976.

[16] R. B. Welch and D. H. Warren. Immediate perceptual response to intersensory discrepancy. Psychol Bull, 88(3):638–67, 1980.

[17] P. Bertelson and M. Radeau. Cross-modal bias and perceptual fusion with auditory-visual spatial discordance. Percept Psychophys, 29(6):578–84, 1981.

[18] P. Bertelson, F. Pavani, E. Ladavas, J. Vroomen, and B. de Gelder. Ventriloquism in patients with unilateral visual neglect. Neuropsychologia, 38(12):1634–42, 2000.

[19] D. A. Slutsky and G. H. Recanzone. Temporal and spatial dependency of the ventriloquism effect. Neuroreport, 12(1):7–10, 2001.

[20] J. Lewald, W. H. Ehrenstein, and R. Guski. Spatio-temporal constraints for auditory-visual integration. Behav Brain Res, 121(1-2):69–79, 2001.

[21] M. T. Wallace, G. E. Roberson, W. D. Hairston, B. E. Stein, J. W. Vaughan, and J. A. Schirillo. Unifying multisensory signals across time and space. Exp Brain Res, 158(2):252–8, 2004.

[22] J. P. Bresciani, F. Dammeier, and M. O. Ernst. Vision and touch are automatically integrated for the perception of sequences of events. J Vis, 6(5):554–64, 2006.

[23] N. W. Roach, J. Heron, and P. V. McGraw. Resolving multisensory conflict: a strategy for balancing the costs and benefits of audio-visual integration. Proc Biol Sci, 273(1598):2159–68, 2006.

[24] K. P. Kording and D. M. Wolpert. Bayesian decision theory in sensorimotor control. Trends Cogn Sci, 2006.

[25] K. P. Kording, U. Beierholm, W. J. Ma, S. Quartz, J. Tenenbaum, and L. Shams. Causal inference in multisensory perception. PLoS ONE, 2(9):e943, 2007.

[26] Z. Ghahramani. 
Computation and psychophysics of sensorimotor integration. PhD thesis, Massachusetts Institute of Technology, 1995.

[27] D. C. Knill. Mixture models and the probabilistic structure of depth cues. Vision Res, 43(7):831–54, 2003.

[28] D. C. Knill. Robust cue integration: A Bayesian model and evidence from cue conflict studies with stereoscopic and figure cues to slant. Journal of Vision, 7(7):2–24.

[29] W. J. Ma, J. M. Beck, P. E. Latham, and A. Pouget. Bayesian inference with probabilistic population codes. Nat Neurosci, 9(11):1432–8, 2006.
", "award": [], "sourceid": 368, "authors": [{"given_name": "Ulrik", "family_name": "Beierholm", "institution": null}, {"given_name": "Ladan", "family_name": "Shams", "institution": null}, {"given_name": "Wei Ji", "family_name": "Ma", "institution": null}, {"given_name": "Konrad", "family_name": "Koerding", "institution": null}]}