Part of Advances in Neural Information Processing Systems 35 (NeurIPS 2022) Main Conference Track
Miklós Horváth, Mark Müller, Marc Fischer, Martin Vechev
Tree-based models are used in many high-stakes application domains such as finance and medicine, where robustness and interpretability are of utmost importance. Yet, methods for improving and certifying their robustness are severely under-explored, in contrast to those focusing on neural networks. Targeting this important challenge, we propose deterministic smoothing for decision stump ensembles. Whereas most prior work on randomized smoothing focuses on evaluating arbitrary base models approximately under input randomization, the key insight of our work is that decision stump ensembles enable exact yet efficient evaluation via dynamic programming. Importantly, we obtain deterministic robustness certificates, even jointly over numerical and categorical features, a setting ubiquitous in the real world. Further, we derive an MLE-optimal training method for smoothed decision stumps under randomization and propose two boosting approaches to improve their provable robustness. An extensive experimental evaluation on computer vision and tabular data tasks shows that our approach yields significantly higher certified accuracies than the state-of-the-art for tree-based models. We release all code and trained models at https://github.com/eth-sri/drs.
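To illustrate why stump ensembles admit exact evaluation under input randomization, the following is a minimal sketch and not the released implementation. It assumes independent Gaussian noise on each numerical feature, integer-valued (discretized) leaf outputs, and a hypothetical ensemble encoding as `(feature, threshold, left_value, right_value)` tuples; all function names are illustrative. Because each stump reads a single feature, the per-feature output distribution follows directly from the noise CDF, and the distribution of the ensemble sum is obtained by combining features one at a time, which is the dynamic-programming step.

```python
import math
from collections import defaultdict


def gaussian_cdf(z):
    """Standard normal CDF; handles +/- infinity via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def feature_pmf(x_i, stumps, sigma):
    """Exact PMF of the summed outputs of all stumps on one feature.

    stumps: list of (threshold, left_value, right_value) with integer
    leaf values (an assumption of this sketch). A stump returns
    left_value if the noisy feature is <= threshold, else right_value.
    The noisy feature is x_i + N(0, sigma^2).
    """
    thresholds = sorted({t for t, _, _ in stumps})
    edges = [-math.inf] + thresholds + [math.inf]
    pmf = defaultdict(float)
    # All stumps are constant on each interval (lo, hi] between thresholds.
    for lo, hi in zip(edges[:-1], edges[1:]):
        prob = gaussian_cdf((hi - x_i) / sigma) - gaussian_cdf((lo - x_i) / sigma)
        total = sum(left if hi <= t else right for t, left, right in stumps)
        pmf[total] += prob
    return dict(pmf)


def convolve(pmf_a, pmf_b):
    """PMF of the sum of two independent integer-valued variables."""
    out = defaultdict(float)
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            out[a + b] += pa * pb
    return dict(out)


def smoothed_output_pmf(x, ensemble, sigma):
    """Exact PMF of the ensemble sum under per-feature Gaussian noise.

    ensemble: list of (feature_index, threshold, left_value, right_value).
    Features are processed one by one; the running PMF is the DP state.
    """
    by_feature = defaultdict(list)
    for feat, t, left, right in ensemble:
        by_feature[feat].append((t, left, right))
    running = {0: 1.0}
    for feat, stumps in by_feature.items():
        running = convolve(running, feature_pmf(x[feat], stumps, sigma))
    return running


# Two stumps on different features, each input sitting exactly on its threshold.
example_ensemble = [(0, 0.0, 0, 1), (1, 0.0, 0, 1)]
example_pmf = smoothed_output_pmf([0.0, 0.0], example_ensemble, sigma=1.0)
print(example_pmf)
```

Since the resulting distribution is computed exactly rather than estimated by sampling, any quantity derived from it (such as the probability that the sum exceeds a decision threshold) carries no sampling error, which is what makes a deterministic certificate possible in this setting.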