Promoting Poor Features to Supervisors: Some Inputs Work Better as Outputs

Part of Advances in Neural Information Processing Systems 9 (NIPS 1996)

Bibtex Metadata Paper


Rich Caruana, Virginia de Sa


In supervised learning there is usually a clear distinction between inputs and outputs - inputs are what you will measure, outputs are what you will predict from those measurements. This paper shows that the distinction between inputs and outputs is not this simple. Some features are more useful as extra outputs than as inputs. By using a feature as an output we get more than just the case values but can. learn a mapping from the other inputs to that feature. For many features this mapping may be more useful than the feature value itself. We present two regression problems and one classification problem where performance improves if features that could have been used as inputs are used as extra outputs instead. This result is surprising since a feature used as an output is not used during testing.