NeurIPS 2020

### Review 1

Summary and Contributions: This paper considers explanations of voting outcomes as a series of formal proofs based on a set of axioms. They demonstrate that Borda outcomes can be explained in a small number of steps.

Strengths: The paper considers an important problem: explainability of the outcome of a non-majority-based voting system. Having such methods in place could lead to more trust in the outcome of something like a Borda score ranking.

Weaknesses: My main comment is that I am not sure that NeurIPS is the correct venue for this paper. The results seem interesting, and perhaps applicable to interpretable ML; however, the methods and results are not really aligned with the NeurIPS community. It is on the authors to justify why NeurIPS is an appropriate venue for this work.

Correctness: I struggled to understand the proofs of the results.

Clarity: I found the paper difficult to read, most likely due to my lack of knowledge in this area. I didn't really understand the proofs of the main results. More examples would have been helpful; for example, Figure 1 needed some more explanation.

Relation to Prior Work: Yes.

Reproducibility: Yes

Additional Feedback: ##### After Author Feedback ##### I have read the author feedback and the other reviews. I still feel that I have very little confidence in reviewing this fairly; however, I am happy to raise my score to a marginal accept. I would urge the authors to communicate these ideas more clearly to the community at large.

### Review 2

Summary and Contributions: The paper builds on a recent paper by Cailloux and Endriss '16, which proposes a way to justify outcomes of a particular voting rule by "proofs" of how this particular outcome follows from a specific set of axioms (a particular axiomatization of the voting rule). In this paper the authors consider the *length* of such proofs. They show that the CE'16 axioms can prove any outcome of Borda in O(m^2) steps. The main result describes an abstract class of axiomatizations (which can apply to a broad class of voting rules) and provides a lower bound on the (maximal and average) proof length using such axiomatizations. ** after rebuttal ** I raised my score a bit, since I understand that there is *something* that makes the INIT axiom you use more reasonable than the INIT' axiom I suggested as a thought experiment. However, I am not fully happy with your answer. I think there should be some more formal criterion for deciding what reasonable axioms are (e.g. by their description size). In any case, I think this requires some more discussion in the paper.

Strengths: The idea is very cool. I also see the potential application to machine learning once the framework becomes more developed: the Borda rule is very simple to explain even without logical proofs (in fact, I suspect that most people would find a "proof" that just adds up candidate scores more convincing). But if we have some black-box function and all we know is that it satisfies some axioms (are there good examples of such functions in machine learning?), then justifying its outcome with a short logical proof could be great.
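To make the "just adds up candidate scores" point concrete, here is a minimal sketch of that direct calculation (my own illustration, not code from the paper): each voter's ballot ranks the m candidates, a candidate in position i (0-indexed, best first) receives m - 1 - i points, and the winner is the candidate with the highest total.

```python
# Illustrative sketch (not from the paper): the direct score-addition
# "proof" of a Borda outcome that a layperson might find most convincing.

def borda_winner(profile):
    """profile: list of ballots, each a list of candidates, best first."""
    m = len(profile[0])
    scores = {}
    for ballot in profile:
        for pos, cand in enumerate(ballot):
            # Position i (0-indexed) is worth m - 1 - i points.
            scores[cand] = scores.get(cand, 0) + (m - 1 - pos)
    return max(scores, key=scores.get), scores

winner, scores = borda_winner([["a", "b", "c"],
                               ["a", "c", "b"],
                               ["b", "a", "c"]])
# a: 2+2+1 = 5, b: 1+0+2 = 3, c: 0+1+0 = 1, so "a" wins.
```

The contrast with the paper's axiomatic proofs is exactly the reviewer's point: this arithmetic is short and transparent, but it appeals to the definition of the rule rather than to first principles.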

Weaknesses: I have serious concerns about the axiomatic system used in the main proof. The length of explanations is tightly related to the specific set of axioms we use. In this respect, the main theorem essentially shows that *for a particular set of axioms* the length is lower bounded by the dimension. But a priori it is not clear why the set of axioms in Def. 2 is the one we should care about. In particular, the INIT axiom (which is at the heart of the lower bound proof) seems somewhat arbitrary: essentially it is a long table specifying the outcome of every profile in S. Suppose we added an alternative axiom INIT': \forall R \in \cal R [R -> f(R)] (i.e. one that applies to all profiles rather than to profiles in S alone); then clearly we can provide a proof of length 1 for any profile of any voting rule.

Possibly related, it seems like there should be a simple embedding of Borda (and in fact any PSR) into the space N^m that essentially counts the scores of all candidates. It seems that this space would satisfy properties 1-3 in Def. 1. Are there no simple axioms that capture this? (E.g. the INIT axiom could apply to a single-voter profile, and then we just need to somehow add all votes.) It seems like the length of proofs with such an embedding should be \Theta(nm) for all PSRs.
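The embedding suggested in this thought experiment can be sketched as follows (my own illustration under the reviewer's assumptions, not from the paper): map each single ballot to its positional score vector in N^m, and embed a full profile as the coordinate-wise sum of its ballots' vectors. Any positional scoring rule (PSR) is then recovered by n vector additions, each touching m coordinates, which is where the Θ(nm) intuition comes from.

```python
# Sketch of the N^m embedding for positional scoring rules (not from the
# paper): one score vector per ballot, summed coordinate-wise over voters.

def ballot_vector(ballot, candidates, weights):
    """Score vector of a single ballot under positional weights (best first)."""
    vec = {c: 0 for c in candidates}
    for pos, cand in enumerate(ballot):
        vec[cand] = weights[pos]
    return vec

def embed_profile(profile, candidates, weights):
    total = {c: 0 for c in candidates}
    for ballot in profile:            # n additions...
        v = ballot_vector(ballot, candidates, weights)
        for c in candidates:          # ...each over m coordinates
            total[c] += v[c]
    return total

cands = ["a", "b", "c"]
borda_weights = [2, 1, 0]             # Borda weights for m = 3
print(embed_profile([["a", "b", "c"], ["b", "c", "a"]], cands, borda_weights))
# -> {'a': 2, 'b': 3, 'c': 1}
```

Swapping in other weight vectors (e.g. [1, 0, 0] for plurality) yields other PSRs from the same embedding, which is what makes the reviewer's question about simple axioms for this space natural.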

Correctness: seems ok

Clarity: the paper is well written

Relation to Prior Work: I wonder how Def. 1 relates to the definition of generalized scoring rules. Does either of them generalize the other?

Reproducibility: Yes

Additional Feedback: It looks like there should be some way to define/measure the "strength/dimension" of an axiomatic system (e.g. by the description length of the axioms themselves?), and then have "tradeoff theorems" stating that if the axiomatic system is "simple/small/weak" then the proofs must be long (on average/in the worst case). Theorem 2 shows one point on this frontier (and the INIT' axiom is another), but it is hard to say why any particular point is important. In that sense I also find Def. 3 a bit too specific. It would make more sense to define the "asymptotically weaker" relation in a way that would apply to any two axiomatic systems and not just to S_emb.

Another idea is to measure the strength of the axiom system by the set of voting rules it *can* explain. Intuitively we would expect that if S is sufficiently strong to explain many rules then proofs will be short (as in the case of the INIT' axiom), and thus possibly surprising results like Thm 2 show the existence of systems that are strong enough to explain any rule (all linear-embedding rules?) and yet still require long proofs. I'm not sure how appealing this direction is, though.

Is there a meaningful sense in which you can treat the axiomatic system and the definition of f (in what language?) as "syntax" and "semantics"?

In Def. 2, EMB: I think you meant [R1 -> A] -> [R2 -> A].

### Review 3

Summary and Contributions: This paper aims at explaining complicated voting mechanisms using axioms in natural language. The contributions of this paper are all theoretical. The authors first prove that Borda can be explained using at most O(m^2) steps, then give an asymptotic lower bound on the number of steps to explain other voting rules in general.

Strengths: 1. I believe this paper tackles an important problem, since it is usually hard to persuade people to use good but complicated voting mechanisms. 2. The theorems in this paper are sound and novel to the best of my knowledge. I also appreciate the proof sketches.

Weaknesses: 1. No empirical examples are provided in this paper. I believe they are not strictly necessary, but simple real-world examples could help improve this paper. 2. I am not sure whether the NeurIPS community will appreciate this paper, since it belongs to an area newly added to NeurIPS.

Correctness: I think so.

Clarity: Yes.

Relation to Prior Work: Yes.

Reproducibility: Yes

Additional Feedback: The paragraph in lines 230-231 is confusing at first glance. It should be made explicit that $\circ$ is an abstract operation. ------------------ after rebuttal ------------------ I have read all reviews and the authors' response. My score remains unchanged.

### Review 4

Summary and Contributions: The paper addresses the challenge of explaining why---given the preferences of several voters in an election---a given alternative should win that election. These explanations appeal to first principles about how to treat voters fairly and consistently (so-called axioms) rather than the definition of the voting rule in use (because that voting rule may seem overly complicated or ill-motivated to the person consuming the explanation). The main contribution is a lower bound on the length of such explanations (Theorem 2). The derivation of this result required the development of a novel framework for embedding a voting rule into a vector space relative to a given axiomatisation of that rule. The dimension of that vector space then is the central parameter determining the length of explanations in terms of axioms coming from that axiomatisation. Applications of the general result to three specific and widely-used voting rules are discussed, with most attention being given to the Borda rule. For that rule, the authors also provide a matching upper bound (Theorem 1), showing that we can explain the Borda winner for any given profile of preferences using at most a number of steps that is quadratic in the number of alternatives. This complements an earlier result in the literature, where the existence of such explanations was proved but no bounds on their length had been established.

Strengths: This is a technically strong paper, asking an original question ("can we bound the length of explanations in voting?") and addressing an important domain of applications (various forms of online decision support, including democratic decision making). The results are deep and were somewhat surprising to me. I particularly liked the idea of embedding voting rules into vector spaces and the idea of grouping familiar axioms from social choice theory into certain "classes" of axioms (ADD, EMB, INIT, PRED) to be able to prove general results about them. This seems like a very powerful technique that I can imagine will also prove helpful for addressing other problems. The topic of the paper (algorithms for generating explanations for why a given election outcome is the right one), without any doubt, is highly significant. While it certainly is not a typical topic for a paper at NeurIPS, the authors make a convincing case for why it indeed is relevant to the conference. The most direct connection to machine learning is the fact that political elections are not the only scenarios in which we have to aggregate the preferences of several agents; we face that very same problem also, for instance, in ensemble learning when combining the outputs of several models. Beyond that, the paper connects to the broader discussion of what different subcommunities within the AI community mean when they talk about "explanations". Finally, the paper points to future challenges of an algorithmic nature (computing optimal explanations) that might also be addressed using approaches more commonly discussed at NeurIPS.

Weaknesses: I cannot point to any specific weakness in the paper. Having said this, in my opinion, the division into main paper and supplementary material is not very natural for this kind of topic. My personal preference would have been to reduce the scope of the paper and to present the remaining part in more depth, ideally avoiding the need for supplementary material altogether. The rest of the material could have been saved for a later journal paper.

Correctness: I have only skimmed through the supplementary material. But I found the proof sketches in the paper itself very useful and believe I managed to get a fairly good impression of how the full proofs might work in detail. Based on this, I have confidence in the technical correctness of the results reported, even if I have not verified them line by line. A few specific remarks regarding very minor points that could be clarified follow.

At the very end of Section 2, when you give a concrete example for a sound and complete proof system, it is not immediately obvious to me that the system you give (axioms + tautologies + modus ponens) really is complete. Is it obvious that modus ponens is the only rule required? Maybe a simple reference to a logic textbook will do.

In line 287, you write "[this] implies [formula] satisfies ADD...". It's not clear what you mean here (I think a formula cannot "satisfy" an axiom, which is another set of formulas). Should this maybe be "is consistent with"?