Paper ID: 1252

Title: Blind Attacks on Machine Learners

The paper provides minimax rates for estimation when the data is provided by a user seeking to protect his privacy. This implies that the data is coming from an alpha-contaminated class. However, no references are provided to the extensive work on alpha-contamination.

The paper reads very well, and it explains the problem and model in detail. While the results are in some sense natural, they are interesting from the perspective of the interplay between robustness and security. However, no mention is made of the extensive literature on estimation under epsilon-contamination, e.g. "A General Decision Theory for Huber’s epsilon-Contamination Model". I am not quite sure whether this paper is subsumed by previous results; however, I think it could still be a worthy contribution to NIPS if the added value of this paper were made clear. As the authors state in their response, it is true that the contamination here is chosen with less knowledge, but this link should be made more prominent and discussed in more detail. In particular, are there cases where the minimax rates coincide, e.g. when the optimal informed contamination is the same as the optimal uninformed contamination?

2-Confident (read it all; understood it all reasonably well)

An interesting model for maliciously injecting data into learning. Hard for me to compare with previous work.

This is an interesting paper in an adversarial model of learning, where an attacker injects malicious data into a data stream without knowing the data distribution used for learning or the examples the learner learns from. This model appears to be new and is well-motivated. Of course, in theory, one cannot model this without giving the attacker some data, so the attacker gets a set of data-sets and the learner will get one of them uniformly at random. Now the attacker has a minimax problem to solve (for minimax risk). Then, there are two models involving what the learner knows -- informed vs. uninformed, depending on whether the learner gets to know the malicious data distribution or not. This immediately makes me think of variants of adversarial or agnostic learning, where an "attacker" can corrupt a label distribution based on input data. Of course, there an attacker would be more powerful than what is needed here. But I immediately wished for some comparison to adversarial PAC models. Then minimax rates of convergence are given for the permutations of the various learner/attacker families. I became somewhat uncomfortable once I saw the introduction of a Markov chain to model the parameters of the setting, since this did not seem to be part of the modeling assumptions. I am also unfamiliar with the Le Cam method, so the importance of the theorems was not so clear to me. In the end, my review is closer to an informed guess. I like the idea of the paper, and the model seems reasonable. However, I wished this paper's analysis had started in the PAC setting, where I think one can compare to many things that are already known, including the various connections to differential privacy, statistical queries, etc. I am borderline on this paper, but it is closer to a guess.

1-Less confident (might not have understood significant parts)

The paper investigates attacks on learners, where the training set is drawn from a mixed distribution with some control by malicious actors. Both the informed learner/blind attack and the blind learner/blind attack settings are studied.

This paper makes weaker assumptions than previous studies by assuming a blind attacker with no knowledge of the training set, the learner’s algorithm, or the learner’s hypothesis. The problem addressed is a one-shot data injection attack carried out by a blind attacker who does not observe the training set, the true distribution of interest, or the learner’s algorithm. The results have impact with regard to both the security setting and the privacy setting of learning systems. Further, the results improve our knowledge of what influence blind attackers can have on learning systems.

1-Less confident (might not have understood significant parts)

This paper establishes theoretical lower bounds on the minimax risk of a learner within the statistical decision theory framework. The setting is that an attacker is allowed to add data of their choice to the training set to attempt to confuse the learner. The paper characterizes the attacker's effectiveness in two primary scenarios: when the learner is aware of the malicious data distribution, and when the learner is unaware of the malicious distribution but is aware of the percentage of corrupt data. The paper studies two different attacks: one where the attacker draws uniformly from the data distribution, and one where an informed attacker mimics the learning distribution. Overall, this paper represents a small but important theoretical step toward addressing the problem of training on a mixture of genuine and malicious training data.

Overall, I find this branch of work quite interesting and am glad the authors are choosing to study this problem. The attacks mentioned in the paper may become feasible in the age of large web-scale datasets or human-in-the-loop training systems, along with the privacy scenario mentioned in the paper. The authors do an excellent job of motivating the problem. The paper appears clearly written if the reader is an expert within the field of statistical decision theory. I must admit that this is not my area of expertise, so I am unable to comment intelligently on the proofs given in the paper. Perhaps the authors could consider presenting these ideas more clearly for a broader audience. I did find a potential problem with the equation given on Line 355, which states that a mimic attack reduces the effective sample size by a factor of (2a-1)^2 / (1-a). This result doesn't seem right: this function is zero at a = 0.5, nonnegative everywhere else (the numerator is squared), and greater than 1 for a > 0.75. This means that if 90% of the data is genuine and 10% of the data comes from a malicious attacker, the effective sample size of the training set is increased by a factor of 6.4. I would have expected the opposite: the sample size should be effectively 0 for any a < 0.5, and should smoothly increase to 1 as a approaches 1.0. I don't think the effective training set size should ever increase given the presence of malicious data. I do not know whether this is a simple typo or whether it indicates a deeper problem in the proof. Perhaps a more competent reviewer can comment on this point. I do encourage the authors to consider harder, more realistic scenarios in future work. For example, most actual learning systems will not have knowledge about alpha, the amount of malicious data in the dataset. Perhaps there should be some bounds on ways to detect whether an attack is taking place. This work is theoretical in nature.
I would appreciate some experiments (even synthetic ones) simply to help understand what's happening, but the authors do acknowledge that creating an actual learner that satisfies the given minimax risk bounds is left to future work, making experiments difficult. Finally, I do appreciate that the authors took the time to restate their main results using simpler concrete examples in sections 3.1 and 3.2.
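The arithmetic behind this concern can be checked directly; a minimal sketch in Python (the function name is my own label for the expression, and the formula is the one quoted from Line 355 of the paper):

```python
# Evaluate the quoted effective-sample-size factor (2a - 1)^2 / (1 - a),
# where a is the fraction of genuine data. The function name here is
# illustrative, not taken from the paper.
def effective_sample_factor(a):
    """Factor by which a mimic attack allegedly rescales the sample size."""
    return (2 * a - 1) ** 2 / (1 - a)

for a in (0.5, 0.75, 0.9):
    print(f"a = {a}: factor = {effective_sample_factor(a):.4f}")
```

At a = 0.9 the factor evaluates to 6.4, matching the computation above, and it crosses 1 at a = 0.75; for a < 0.5 the squared numerator keeps it nonnegative rather than negative.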

1-Less confident (might not have understood significant parts)

This paper abstracts a problem of interest in (applied) adversarial machine learning, whereby an adversary manipulates a learner trained partially on benign stochastic data, with adversarially-chosen, stochastic data. Specifically the paper studies a learner intending to estimate parameter (theta) for distribution (F) from which training data is drawn, and a blind attacker – not knowing the distribution or the training set – who injects data with from a different distribution (G(phi)), in order to maximise the minimax risk for the learner. The authors discuss two settings – “informed learner” and “blind learner”, and investigate the effect of the relationship (KL divergence) between F(theta) and G(phi).

The paper presents an appropriate theoretical treatment of an interesting question that has persisted in adversarial machine learning: that of the effect of a blind attacker poisoning training data without access to benign iid training data. The paper's approach is one of minimax theory, with analysis variations given for a defender/learner aware of the attacker's poison distribution, and (the more realistic) case of a blind defender who is unaware of the attacker's choice. Results depend on the KL divergence between the benign and malicious data distributions. There have been relatively few theoretical treatments of adversarial machine learning settings. The authors suitably note the recent work of Duchi et al. on minimax rates and differential privacy (which appeared recently at NIPS), with this paper being complementary. However, some issues remain. The work of Kearns and Li (1993), who looked at the effect of a fraction of contaminated data in the PAC learning framework, should be cited and related briefly. Similarly, the vast body of work on robust statistics: the presented framework seems not unlike the gross-error model (the sampling distribution is a mixture of a benign and a contamination distribution; the minimax approach there is to consider a worst-case contamination distribution; this yields Huber estimators, a robust class of estimators in that setting; etc.). The paper could be better structured. For example, Section 1 spans half of the paper and would be better organised if divided into 2-3 sections. While the paper is written appropriately for a specialised COLT sub-audience present at NIPS, NIPS is broader (than ever), and the (more applied) adversarial learning community should find the paper interesting; it would be beneficial to rework the exposition and background with such readers in mind.
For example, detailing the connection between the problems at #89 would help. Several notations are used before being properly defined, e.g., delta in #199 and Vol(Z) in #292 (instead, these are defined in the supplementary material). I am not sure Figs. 1 and 2 strictly represent Markov chains (which typically have identical transition probabilities across finite states, etc.), but this is no significant matter. Also, consider shading nodes that are observed. Quotation marks leading quoted phrases should be opening quotation marks. Something in the typesetting (#161) seems broken in the definition of theta in relation to theta_i. After author response: Thank you for the thoughtful response. The assurances that connections with robust statistics and PAC learning will be added as appropriate are well received. The initial explanation of the differences between the blind model and the typical (non-agnostic) model in robustness appears to make sense.

1-Less confident (might not have understood significant parts)