Efficient Forward Architecture Search
Do Deep Nets Really Need to be Deep?
Using multiple samples to learn mixture models
(Not) Bounding the True Error
Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping
Promoting Poor Features to Supervisors: Some Inputs Work Better as Outputs
Using the Future to "Sort Out" the Present: Rankprop and Multitask Learning for Medical Risk Evaluation
Learning Many Related Tasks at the Same Time with Backpropagation