Review for NeurIPS paper: Algorithmic recourse under imperfect causal knowledge: a probabilistic approach

NeurIPS 2020

Review 1

Summary and Contributions: This paper proposes a new method for algorithmic recourse when complete causal knowledge may be unavailable. Under the assumption that the causal graph is known (but not the structural equations): i) A negative result is proved, showing that recourse cannot be guaranteed without knowing the structural equations. ii) For the class of structural equations with additive Gaussian noise, recourse is framed as a counterfactual query to the SCM, along with an algorithm that estimates the counterfactual distribution while accounting for uncertainty over the functional form of the SCM. iii) The additive Gaussian noise assumption is then relaxed, and recourse is formulated as an interventional query that conditions on a subpopulation (defined by similarity over a subset of features determined by the causal graph) under the different interventions (potential feasible actions that could change outcomes). Finally, an extensive experimental evaluation demonstrates the benefits of each of these methods.
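To make the counterfactual framing in item ii) concrete, here is a minimal Python sketch of the abduction, action, and prediction steps for an additive noise model. The three-variable graph and the functions f2 and f3 are hypothetical and chosen only for illustration. Unlike the paper's setting, the sketch assumes the structural equations are known, so it shows what the query computes, not how the paper estimates it under uncertainty.

```python
# Hypothetical 3-variable additive noise model (illustrative, not from the paper):
#   X1 = U1,  X2 = f2(X1) + U2,  X3 = f3(X1, X2) + U3
def f2(x1):
    return 0.5 * x1


def f3(x1, x2):
    return x1 - 0.25 * x2 * x2


def counterfactual(x_factual, interventions):
    """Abduction-action-prediction for the toy model above.

    x_factual: dict with keys "x1", "x2", "x3" holding the observed values.
    interventions: dict mapping a subset of those keys to forced values.
    """
    # Abduction: with additive noise, each noise term is the residual
    # between the observed value and the mechanism applied to its parents.
    u1 = x_factual["x1"]
    u2 = x_factual["x2"] - f2(x_factual["x1"])
    u3 = x_factual["x3"] - f3(x_factual["x1"], x_factual["x2"])

    # Action and prediction: intervened variables take the forced value;
    # all others are recomputed in topological order with the abducted noise.
    x1 = interventions.get("x1", u1)
    x2 = interventions.get("x2", f2(x1) + u2)
    x3 = interventions.get("x3", f3(x1, x2) + u3)
    return {"x1": x1, "x2": x2, "x3": x3}
```

For example, counterfactual({"x1": 2.0, "x2": 1.5, "x3": 1.0}, {"x1": 4.0}) propagates the change in x1 to both descendants while reusing the individual's abducted noise, whereas an empty intervention dict returns the factual point unchanged.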

Strengths: The paper is well motivated, and the results are technically as well as practically interesting. This is, in my opinion, a significant contribution to the area of actionable recourse in the fairness literature and is definitely relevant to the NeurIPS audience. The results on GP-SCMs are probably of independent interest, and the authors could highlight that more strongly.

Weaknesses: 1. High-level comment: Since the proposed solutions are based on both a counterfactual and an interventional query, it seems important to justify which framing is more appropriate for the recourse task. On the face of it, recourse is a counterfactual query rather than an interventional one. The justification and motivation for each potential formulation are unclear, and it would be good if the authors incorporated them in their motivation. 2. The authors do not address the issue of feasibility in the level of detail that recourse warrants. This makes the formulation a little impractical for actual use. Can the authors clarify the details of feasibility or the enumeration of feasible actions? All comments in the paper allude to searching over potential intervention sets \mathcal{I}. However, there are causal dependencies in the graph that can determine the allowable feasible sets. It is unclear how to address this challenge here. 4. Does the learned CVAE completely respect the causal graph? The procedure for training CVAEs for interventional recourse should be clarified in further detail. 5. In the experimental evaluation, I did not see a comparison to existing recourse baselines, nor a qualitative assessment of the type of recourse that the model learns.

Correctness: I have gone over the proofs and they are reasonably detailed and correct. Empirical claims need further evaluation.

Clarity: The paper is well written, well motivated, and well structured. The experimental evaluation is also presented extensively.

Relation to Prior Work: The paper is clearly situated in the existing literature. However, the authors fail to cite an important critique of recourse work, "The philosophical basis of algorithmic recourse" by Venkatasubramanian and Alfano (FAT* 2020), and to contextualize their contributions in terms of the issues raised in that work. This is important for calibrating broader impact.

Reproducibility: Yes

Additional Feedback: Edit: I have read the rebuttal and the other reviews, and I acknowledge that the authors addressed my concerns.

Review 2

Summary and Contributions: This paper provides a probabilistically correct method of recourse under a more realistic expectation of knowledge of the underlying causal structure. The paper demonstrates concretely the weakness of the most similar result, showing that there is no guaranteed recourse for unknown structural equations. The paper then goes on to provide two forms of recourse, one individualized and one subgroup-wise, both based on GP structural causal models. Finally, the authors provide an optimization technique to solve the proposed recourse problems and present experimental results showing improvements in validity and cost.

Strengths: This work provides a sound solution to an important problem and extends the literature in a novel way. Providing recourse recommendations is required or desirable in many domains, so reliable techniques for doing so under realistically specified problems are important. This is relevant to the NeurIPS community, as explanation is required by regulation and actionable recourse solves the underlying problem in a more robust way for impacted individuals. The claims are sound and the empirical evidence is clear. Enough description and discussion of the propositions is included in the paper that they are believable, even though all proofs are left to the supplemental materials.

Weaknesses: The results are only on synthetic data, though this is understandable because true SCMs are not generally available. Collaboration with a

Correctness: The claims are well supported and correct. The method is correct at the modeling level, and the algorithmic solution makes sense. The baselines of the experiments and the metrics are well described.

Clarity: Overall the paper is well written and clear. There are some parts that are difficult to parse or places where clarity could be improved. In the early sections, "intervention" and "action" seem to refer to related but different concepts, but later they seem to be used interchangeably. It could make the overall exposition clearer to use "intervention" for the more general causal intervention and "action" for the recourse-specific action. UPDATE: Thanks for recognizing this and agreeing to update it. Minor: The Figure 1 caption is hard to parse quickly because the subpart labels are embedded in the sentence, some before and some after the content they refer to. Line 119: the "not" in the dashed clause reads like a double negative and could be removed. Figure 2c is labeled as a tradeoff between two variables, but each plot shows one variable against the control parameter. Also, validity and cost increase in the same direction; because high cost is bad, this is a tradeoff in a sense, but the figure doesn't match expectations when labeled "tradeoff". Line 189: put commas around nd(I). Table 2's caption incorrectly says that it covers 3 variables, whereas the paper text says that Table 2 contains the 7-variable model results.

Relation to Prior Work: The work is clearly positioned in the broader context and the innovations are clearly stated.

Reproducibility: Yes

Additional Feedback:

Review 3

Summary and Contributions: The paper studies the problem of algorithmic recourse when the true underlying structural model is unknown. The paper first shows that algorithmic recourse cannot be guaranteed if the true structural equations are unknown. Then, the paper proposes two probabilistic approaches to solve the problem: 1) the first approach assumes the structural equations to be additive noise models (ANMs) with Gaussian noise and estimates a counterfactual distribution under this assumption, 2) the second approach estimates the effect of interventions on a subpopulation around the given factual datapoint. All approaches have been evaluated on three classes of SCMs for a synthetic 3-variable case and a semi-synthetic 7-variable loan approval problem.
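The subpopulation-based idea in approach 2) can be illustrated with a small Monte Carlo sketch in Python. It assumes the subpopulation consists of individuals who share the factual values of the non-descendants of the intervened variable, which is one reading of the nd(I) notation mentioned in the reviews. The mechanism f3, the standard Gaussian noise, and the decision rule are hypothetical stand-ins, not the paper's learned CVAE or classifier.

```python
import random


# Hypothetical mechanism for X3 given its parents X1 and X2 (not from the paper).
def f3(x1, x2):
    return x1 - 0.25 * x2 * x2


# Hypothetical fixed decision rule: favourable when the feature sum exceeds 3.
def classifier(x1, x2, x3):
    return x1 + x2 + x3 > 3.0


def subpopulation_validity(x1_factual, x2_new, n_samples=5000, seed=0):
    """Estimate P(favourable | do(X2 = x2_new), X1 = x1_factual).

    X1 is a non-descendant of X2, so it stays at its factual value.
    X3 is a descendant, so it is resampled with fresh noise instead of
    the individual's own (abducted) noise.
    """
    rng = random.Random(seed)
    favourable = 0
    for _ in range(n_samples):
        u3 = rng.gauss(0.0, 1.0)  # assumed standard Gaussian noise
        x3 = f3(x1_factual, x2_new) + u3
        favourable += classifier(x1_factual, x2_new, x3)
    return favourable / n_samples
```

The returned fraction plays the role of a validity estimate for the action do(X2 = x2_new): it averages over similar individuals, so it requires no assumption about how noise enters the structural equations beyond the ability to sample from the interventional distribution.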

Strengths: - The paper builds upon the idea of counterfactual explanation to generate actionable recommendations that lead to a change in prediction. - The paper does not treat features as independently manipulable and takes the causal structure between features into account. - The proposed approaches don’t need the true underlying structural model to work. - The paper is written very well.

Weaknesses: - What are the computational costs of the proposed approaches, and how do they compare to non-probabilistic baselines? - For the 3-variable case, the non-probabilistic baselines perform as well as CATE-CVAE for the non-additive SCM, and the kernel regression baseline performs as well as CATE-CVAE for the non-linear SCM. - In both experiments, the subpopulation-based approach CATE-CVAE underperforms the pseudo-counterfactual M-CVAE in terms of both validity and cost. The paper should include a scenario where the subpopulation-based approach might be preferable. - The authors should include some application-based examples in the paper to highlight the advantages of their approach over prior work, as is done in [7]. Update after rebuttal: I thank the authors for their responses to my questions and concerns. They satisfactorily answer most of my concerns. Therefore, I am upgrading my rating.

Correctness: The work is technically sound. I have some questions/concerns about the experimental results, as highlighted in the weaknesses section above.

Clarity: The paper is very well written.

Relation to Prior Work: Yes, the paper situates itself well among prior works.

Reproducibility: Yes

Additional Feedback: