# Tradeoffs between robustness and accuracy

Standard machine learning produces models that are highly accurate on average but that degrade dramatically when the test distribution deviates from the training distribution. While one can train robust models, this often comes at the expense of standard accuracy (on the training distribution). We study this tradeoff in two settings, adversarial examples and minority groups, creating simple examples which highlight generalization issues as a major source of this tradeoff. For adversarial examples, we show that even augmenting with correct data can produce worse models, but we develop a simple method, robust self-training, that mitigates this tradeoff using unlabeled data. For minority groups, we show that overparametrization of models can hurt accuracy on the minority groups, though it improves standard accuracy. These results suggest that the "more data" and "bigger models" strategy, which works well in the standard setting where train and test distributions are close, need not work in out-of-domain settings. This is based on joint work with Sang Michael Xie, Shiori Sagawa, Pang Wei Koh, Fanny Yang, John Duchi and Percy Liang.

Previous works attempt to explain the tradeoff between standard error and robust error in two settings: when no accurate classifier is consistent with the perturbed data (Tsipras et al., 2019; Zhang et al., 2019; Fawzi et al., 2018), and when the hypothesis class is not expressive enough to contain the true classifier (Nakkiran, 2019). Although this problem has been widely studied empirically, much remains unknown concerning the theory underlying this tradeoff. We identify a tradeoff between robustness and accuracy that serves as a guiding principle in the design of defenses against adversarial examples. We define Mutually Exclusive Perturbations (MEPs) as pairs of perturbation types for which robustness to one type implies vulnerability to the other.

Haotao Wang*, Tianlong Chen*, Shupeng Gui, Ting-Kuei Hu, Ji Liu, Zhangyang Wang.

This paper asks a new question: how can one quickly calibrate a trained model in-situ, to examine the achievable tradeoffs between its standard and robust accuracies, without (re-)training it many times? We present a novel once-for-all adversarial training (OAT) framework that addresses a new and important goal: an in-situ "free" tradeoff between robustness and accuracy at testing time. Experimental results show that OAT/OATS achieve similar or even superior performance when compared to dedicatedly trained models. Our approaches meanwhile cost only one model and no re-training.
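
The difference between standard accuracy and robust accuracy referred to above can be made concrete with a small sketch. The example below is illustrative only and is not taken from any of the works cited here: it assumes a linear classifier with labels in {-1, +1} and perturbations bounded in the l-infinity norm by a hypothetical budget `eps`. For a linear model the worst-case perturbation lowers the margin by exactly `eps` times the l1 norm of the weights, so robust accuracy can be computed in closed form. The function names, the toy data, and the weights are all assumptions made for illustration.

```python
# Illustrative sketch: standard vs. robust accuracy of a linear classifier
# under l_inf-bounded perturbations. All names and data are hypothetical.

def margin(w, b, x, y):
    """Signed margin y * (w . x + b); positive means correctly classified."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

def standard_accuracy(w, b, data):
    """Fraction of unperturbed points (x, y) that are classified correctly."""
    return sum(margin(w, b, x, y) > 0 for x, y in data) / len(data)

def robust_accuracy(w, b, data, eps):
    """Fraction of points that stay correct under every perturbation delta
    with max_i |delta_i| <= eps. For a linear model the worst case lowers
    the margin by exactly eps * ||w||_1."""
    l1_norm = sum(abs(wi) for wi in w)
    return sum(margin(w, b, x, y) - eps * l1_norm > 0 for x, y in data) / len(data)

# Toy dataset with labels in {-1, +1}; two points lie close to the boundary.
data = [
    ((2.0, 0.0), 1),
    ((0.5, 0.1), 1),
    ((-2.0, 0.0), -1),
    ((-0.3, 0.2), -1),
]
w, b = (1.0, 0.0), 0.0  # assumed weights: classify by the sign of x[0]

std_acc = standard_accuracy(w, b, data)
rob_acc = robust_accuracy(w, b, data, eps=0.4)
print(f"standard accuracy: {std_acc:.2f}, robust accuracy: {rob_acc:.2f}")
```

On this toy data every clean point is classified correctly, but the point with margin 0.3 can be flipped by a perturbation of size 0.4, so robust accuracy is lower than standard accuracy. With `eps=0` the two quantities coincide.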