Harris Drucker, Robert Schapire, Patrice Simard
A boosting algorithm converts a learning machine with error rate less than 50% to one with an arbitrarily low error rate. However, the algorithm discussed here depends on having a large supply of independent training samples. We show how to circumvent this problem and generate an ensemble of learning machines whose performance in optical character recognition problems is dramatically improved over that of a single network. We report the effect of boosting on four databases (all handwritten) consisting of 12,000 digits from segmented ZIP codes from the United State Postal Service (USPS) and the following from the National Institute of Standards and Testing (NIST): 220,000 digits, 45,000 upper case alphas, and 45,000 lower case alphas. We use two performance measures: the raw error rate (no rejects) and the reject rate required to achieve a 1% error rate on the patterns not rejected. Boosting improved performance in some cases by a factor of three.