The authors present a way to improve uncertainty quantification in regression networks by placing "evidential priors" over the conditional distributions and optimizing the marginal likelihood.

Pros:
- Simplicity and elegance of the approach
- Experimental results showing that this can match SOTA (deep ensemble) performance with lower computational complexity at test time
- Code accompanying the paper that allows the community to easily build on this work

Cons:
- Theoretical justification could be improved. In particular, R1 raised some concerns about why this works. Some of the ablations in the rebuttal partly address this concern. I'd also encourage the authors to include additional ablations to investigate the relative contributions of (i) using a prior over the Gaussian parameters and (ii) the details of how these parameters are optimized (e.g. maximum likelihood vs. the proposed regularization).

During the discussion, the consensus leaned towards accept. I have read the paper as well and I think this is a useful contribution. I recommend accept.