__ Summary and Contributions__: In this paper, the authors propose a learning framework that uses outcomes and biological measures to learn a patient's underlying mental state for treatment recommendation. The proposed method was evaluated on simulated data and on real-world randomized controlled trial data.

__ Strengths__: - The proposed method has several advantages: it uses multi-domain data, models patient heterogeneity, yields cleaner and more reliable representations of patients' mental health status, and learns a representation that is invariant before and after treatment, which effectively doubles the amount of data available for training.
- The authors evaluated the proposed method on both a synthetic dataset and real-world clinical trial data, demonstrating the effectiveness of the method. Comparisons with other related methods were also conducted.
- Availability of code for the reproduction of the experiments.

__ Weaknesses__: The work is quite well done, with a nice derivation of the proposed method and evaluation on synthetic and real-world datasets. To nitpick, it would be interesting to examine the mental state representation more closely and understand what it means.

__ Correctness__: Yes, claims and methods seem to be correct.

__ Clarity__: Yes, the paper is well written.

__ Relation to Prior Work__: Yes, related works are described in the "Related work" section and are compared in the experiments.

__ Reproducibility__: Yes

__ Additional Feedback__:

__ Summary and Contributions__: The paper proposes a novel approach for evaluating individualized treatment rules for mental disorders. The gist of the approach is to model the observed symptoms using latent variables that reflect an underlying mental state, together with the transition of that state under different treatment options. The maps from latent state to symptoms are time-invariant. The approach is tested on data from a randomized treatment trial for mental disorders and shows superior results compared to existing models.

__ Strengths__: The paper proposes a novel model for predicting treatment effects with a latent space relating the observed symptoms. Predicted scores are calculated on the end-intervention symptoms predicted through the latent space transition model.
A clear presentation of the model and its comparison to the existing treatment models.
Suitable for NIPS, though the novelty of the model in the context of general ML research and existing learning algorithms is limited.

__ Weaknesses__: - The scope of the paper is limited to randomized trials. The approach is tailored to randomized trial data, where the initial and end states for each intervention are clearly defined. Confounding and treatment group matching are not considered. This should be stressed as one of the limitations of the approach for estimating treatment effects.
- The model and the algorithms for learning it are straightforward and not surprising. The benefit and impact are on the application side.
- It would be good if the authors added some text on how their model can be used in domains other than clinical data.

__ Correctness__: Appears correct. Develops models for treatment effects and machine learning solutions/algorithms tailored for these models.

__ Clarity__: Yes.

__ Relation to Prior Work__: The paper reviews the majority of existing methods for modeling treatment effects and compares the proposed method to them experimentally. However, it should also place the method in the context of existing ML work on latent space transition models and general latent space models.

__ Reproducibility__: Yes

__ Additional Feedback__: Post rebuttal comments:
The authors gave reasonable answers to most of the questions and points raised in the reviews. Because of the somewhat limited novelty on the ML methodology side, I would like to keep my rating at marginal accept.

__ Summary and Contributions__: This paper proposes a method for learning an optimal treatment policy when the target outcome is a function of latent constructs, in this case, latent psychological constructs. The method starts by simultaneously estimating the conditional distribution of latent variables given the confounders and treatment variable. Next, given some observed features and measurements, the MAP latent constructs are inferred. Finally, treatment is chosen to maximize a predefined function of the MAP latent variables. This methodology is tested using synthetic and real data.

__ Strengths__: Optimizing treatment when the true outcome of interest is latent is an extremely challenging problem that I think is highly relevant to the NeurIPS community. Further, I think the high level ideas proposed in this paper are a step in the right direction. In particular, I think that the idea of utilizing knowledge about the causal structure to learn the latent constructs is an excellent idea.

__ Weaknesses__: I have two major concerns about the paper:
1. The premise of the paper is that practitioners would like to optimize a predefined function of latent patient state; however, such a function cannot be predefined since the learned latent constructs do not, a priori, have meaning. More specifically, latent psychological constructs learned by a model such as factor analysis only take on meaning *after* practitioners interpret the learned factors in the context of the underlying items. Further, latent variables may be subject to many different identifiability issues that make defining the function g impossible without first examining the learned latent factors. For example, the proposed model cannot distinguish between Z and 1-Z which would have complementary interpretations. To see this, simply observe that \alpha + \sum_k \beta_k Z_k = \alpha' + \sum_k \beta'_k U_k, where U_k = 1 - Z_k, \alpha' = \alpha + \sum_k \beta_k, and \beta'_k = -\beta_k.
2. More broadly, I have serious concerns about the potential risks of optimizing treatment with respect to learned latent constructs without first extensively validating those constructs. As I'm sure the authors are aware, there is a substantial body of literature on validating psychological measures in order to ensure that they measure what is expected. In particular, in the broader impact statement, the authors state "No individual may be put at disadvantage from this research"; however, the authors absolutely cannot guarantee that latent factors learned using the proposed method do not have biases. Latent constructs such as IQ are well known to have racial and socio-economic biases, and using such measures to optimize treatment before testing for such biases may lead to biased treatment.
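The identifiability point in concern 1 can be checked numerically. The following is a minimal sketch, assuming only the linear form written in that concern; the helper names `linear_predictor` and `flip_parameterization` and the random values are hypothetical and are not taken from the paper or its code.

```python
import random


def linear_predictor(alpha, betas, zs):
    """Compute alpha + sum_k beta_k * z_k."""
    return alpha + sum(b * z for b, z in zip(betas, zs))


def flip_parameterization(alpha, betas):
    """Return (alpha', beta') for the flipped factors U_k = 1 - Z_k.

    alpha' = alpha + sum_k beta_k and beta'_k = -beta_k.
    """
    return alpha + sum(betas), [-b for b in betas]


# Hypothetical parameters and latent values, drawn only for illustration.
rng = random.Random(0)
num_factors = 3
alpha = rng.uniform(-1, 1)
betas = [rng.uniform(-1, 1) for _ in range(num_factors)]
zs = [rng.random() for _ in range(num_factors)]
us = [1 - z for z in zs]  # complementary factors U_k = 1 - Z_k

alpha_flipped, betas_flipped = flip_parameterization(alpha, betas)
original = linear_predictor(alpha, betas, zs)
flipped = linear_predictor(alpha_flipped, betas_flipped, us)

# The two parameterizations give the same predictor up to rounding error.
gap = abs(original - flipped)
print(f"gap between parameterizations: {gap:.2e}")
```

Because both parameterizations fit the data equally well, a function g that rewards large Z would reward the opposite clinical state under the flipped solution, which is why g arguably cannot be fixed before the learned factors are inspected.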

__ Correctness__: The paper appears technically correct.

__ Clarity__: I found the description of the methodology hard to follow. I recommend modifying Section 3 to start with an overview or pseudocode description of the main steps of the method. Something like:
1. Specify a latent variable causal model describing the relationship between the treatment, latent constructs, and measurements.
2. Estimate the parameters of the model using a hard-EM-style procedure. This gives us a mapping, m, from pre-treatment measurements and a treatment value to post-treatment latent constructs.
3. For a pre-specified function of the post-treatment latent constructs, g, estimate the optimal treatment policy as the policy that maximizes g(m(Y_0,X,a)).
Then structure the description of the method around each of these pieces: model structure, estimation method, policy optimization.
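To make step 3 of the outline above concrete, here is a minimal sketch of the policy optimization step over a finite treatment set. The functions `toy_m` and `toy_g` are hypothetical stand-ins invented for illustration; they are not the authors' fitted model or their choice of g.

```python
def optimal_treatment(y0, x, treatments, m, g):
    """Return the treatment a that maximizes g(m(y0, x, a)) over a finite set."""
    return max(treatments, key=lambda a: g(m(y0, x, a)))


def toy_m(y0, x, a):
    """Hypothetical mapping to post-treatment latent severity.

    Treatment 1 helps more when x > 0; treatment 0 helps more otherwise.
    """
    effect = 0.5 if (a == 1) == (x > 0) else 0.1
    return [severity - effect for severity in y0]


def toy_g(z):
    """Hypothetical pre-specified objective: lower total severity is better."""
    return -sum(z)


# The same pre-treatment measurements with two different covariate values.
choice_pos = optimal_treatment([1.0, 2.0], 0.7, [0, 1], toy_m, toy_g)
choice_neg = optimal_treatment([1.0, 2.0], -0.7, [0, 1], toy_m, toy_g)
print(choice_pos, choice_neg)
```

Separating m (model structure and estimation) from g (objective) in this way mirrors the suggested three-part organization of the method description.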

__ Relation to Prior Work__: I thought the discussion of related work was missing any description of the relevant methods from psychometrics. In particular, could the authors please describe in more detail how this approach is related to, or distinct from, first using a latent variable structural equation model to estimate latent constructs (as is standard in psychology) and then choosing treatment based on those constructs?

__ Reproducibility__: No

__ Additional Feedback__: --- Post discussion phase ---
After discussion with the other reviewers and reading the author response, I have decided to keep my score at a 4. In particular, I do not think the authors have adequately addressed the concerns related to potentially harmful bias encoded in the method. Specifically, I would like to respond to the following points made in the author response:
1) The authors claim that because the individual items in HAMD have been sufficiently validated, the risk of unintentional bias is low. This may be true in the case of HAMD (and it is laudable that the authors consulted psychologists regarding the resulting measures), but the method is presented as general, not just for HAMD. Further, items often cannot be evaluated in isolation from the way they are aggregated. For example, the degree (and even direction) of racial bias in IQ tests can be changed by changing the weight given to various questions and sections.
2) The authors argue in their response that "We have empirically shown that our latent constructs lead to improved value function evaluated by other external outcomes not used in training (Table 2)". The value function is an expected reward and does not reflect potential disparities. I feel that the literature on health disparities and algorithmic fairness is pretty clear on this point.
3) Finally, the authors respond that "the risk of such bias is not unique to our method". This is absolutely true, but by removing the typical validation step, I believe the risk of such bias increases.
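Point 2 can be illustrated with a small sketch: two policies can have identical value (expected reward) while differing sharply in how that reward is distributed across groups. The group labels and reward numbers below are hypothetical and chosen only for illustration; they are not drawn from the paper or from STAR*D.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def policy_value(rewards_by_group):
    """Average reward over all patients, ignoring group membership."""
    pooled = [r for rewards in rewards_by_group.values() for r in rewards]
    return mean(pooled)


def group_gap(rewards_by_group):
    """Largest difference in mean reward between any two groups."""
    group_means = [mean(rewards) for rewards in rewards_by_group.values()]
    return max(group_means) - min(group_means)


# Hypothetical per-patient rewards under two policies.
policy_a = {"group_1": [0.5, 0.5, 0.5, 0.5], "group_2": [0.5, 0.5, 0.5, 0.5]}
policy_b = {"group_1": [0.75, 0.75, 0.75, 0.75], "group_2": [0.25, 0.25, 0.25, 0.25]}

for name, policy in [("A", policy_a), ("B", policy_b)]:
    print(name, policy_value(policy), group_gap(policy))
```

Both policies have the same pooled value, so a comparison based on the value function alone would not separate them, whereas a per-group comparison would.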

__ Summary and Contributions__: The paper presents a model for a latent-space representation of mental state, integrated into a predictive multi-domain outcome model using deep neural networks. The model is then used to learn optimal individualized treatment rules (ITRs). The authors compared the model against 5 other methods in two experiments, using simulated data and data from a randomized clinical trial. The results provide evidence of the model's superiority over previously suggested models in the field.

__ Strengths__: The motivation and objectives of this study are well presented. The theoretical grounding and empirical evaluation are adequate. The work offers fair novelty to the field, building on previous work by presenting a framework that incorporates multi-domain outcomes. The implementation of the model preserves principles from psychiatry and information theory, making this a cross-disciplinary work that is relevant to the NeurIPS community.

__ Weaknesses__: Information on the number of patients in the clinical trial and on the data distribution in the experiments is missing. Further experimental analysis on different psychiatric disorders is necessary to validate the model. While presenting these analyses might be out of scope for this work, the authors should discuss the future studies that could lead to a practical, validated, and applicable model.

__ Correctness__: The claims and methods presented in this work are correct.

__ Clarity__: The paper is communicated with clarity.

__ Relation to Prior Work__: Relation to previous work is well established throughout the paper. The paragraph on advantages of the proposed method is particularly useful.

__ Reproducibility__: Yes

__ Additional Feedback__: Please spell out the abbreviations and give more detail on the outcome scores used in the STAR*D experiment (e.g., HAM-D, QIDS). Since these are not typical machine-learning performance measures and are taken from the application domain, it is important to give readers a short explanation so they can better interpret the results.