__ Summary and Contributions__: In this paper, the authors propose an approach to learning physical dynamical systems using NNs via optimization under constraints in Cartesian coordinates. They discuss the effectiveness of their approach compared with the existing Hamiltonian and Lagrangian NNs, which learn systems in generalized coordinates with implicit constraints.

__ Strengths__: Their approach is based on explicitly enforcing the constraints that the system must satisfy when deriving the differential equations of the Hamiltonian system, by transforming the constrained objective into an unconstrained one with Lagrange multipliers. This sounds reasonable and follows well-established principles in physics, ML, and other fields.

__ Weaknesses__: The approach, which incorporates constraints in Cartesian coordinates by transforming the objective into an unconstrained one with Lagrange multipliers, sounds somewhat straightforward. Also, the procedure is restricted to physical systems, typically in 3D (or 2D) space, because it requires computing the Hamiltonian system, although I admit this restricted focus is still an important problem.

__ Correctness__: The claims and method in the paper seem to be correct.
-----
Thank you for your rebuttal. I found a misunderstanding in my first review.

__ Clarity__: I think the paper omits some necessary information. For example, I wonder what the guidelines are for designing f_\theta with NNs for a given target physical system.

__ Relation to Prior Work__: The paper describes its relation to existing work.

__ Reproducibility__: No

__ Additional Feedback__:

__ Summary and Contributions__: The authors show certain advantages of using explicit constraints in learning physical systems.

__ Strengths__: The approach appears sound and has an impressive performance.

__ Weaknesses__: There is a bigger-picture question: in classical mechanics, we typically try to avoid explicit constraints because they are numerically more brittle. Why is that not the case for the presented approach?
Also, the competing approaches were evaluated on complex real-world systems, e.g., DeLaN was evaluated on a 7 DoF robot for control. Why should your results be believable?

__ Correctness__: The paper appears correct.

__ Clarity__: Only experts on the topic can follow the paper. More background is needed.

__ Relation to Prior Work__: Yes

__ Reproducibility__: Yes

__ Additional Feedback__:

__ Summary and Contributions__: The authors propose a parametrization to learn Hamiltonian or Lagrangian dynamics using neural networks. Their idea is to modify the variational problem of Hamilton's principle so that the expression in Cartesian coordinates satisfies the constraints imposed by the problem. Finally, the Hamiltonian (and Lagrangian) mechanics are modified with the projection operator so that the resulting trajectories satisfy the constraints. They show the effectiveness of the proposed formulation with a variety of numerical examples.

__ Strengths__: The idea is simple yet reasonable, and the experiments convincingly demonstrate the validity of the proposed parameterization. I believe this work is very important in the area of learning physical systems using neural networks.

__ Weaknesses__: Overall, I did not find any major flaws when reading.

__ Correctness__: The method and the empirical methodology indeed seem valid and correct.

__ Clarity__: The paper is very clear. I really enjoyed reading the paper.

__ Relation to Prior Work__: I think the related work is sufficiently mentioned.

__ Reproducibility__: Yes

__ Additional Feedback__: Line 211: What does V(X) denote? I could not find where it was defined. Sorry if I just missed it somewhere.
If I were to add something more, I would say that negative examples, if any, could make the discussion more complete. I understand that the proposed parameterization, seemingly just an incremental modification, is advantageous in many physics-learning problems, since Cartesian coordinates are a standard representation. Could you elaborate on cases where the proposed parameterization is not really advantageous and demonstrate such examples numerically?
-----
Thank you for the rebuttal. As my evaluation was positive from the start and I raised no serious concerns, I leave my score unchanged.

__ Summary and Contributions__: In this paper, the authors make explicit use of Cartesian coordinates to represent the state of dynamical systems. They then use this Cartesian representation to derive either the Hamiltonian or the Lagrangian governing the system dynamics.
They derive the mathematics for both cases and show that the approach may handle many complex systems, not only in 3D but in spaces of any dimension (which is required for problems going beyond mechanics).

__ Strengths__: The proposed approach is very sound. The mathematics and derivations are classic and very standard (perhaps they should appear in the appendix instead of Section 5).

__ Weaknesses__: It would have been nice to have a complete picture of the proposed approach to really describe the whole contribution. For instance, the learning problem is detailed in various places in the paper, making it hard to get the whole picture.
I would suggest that the authors reorganize the paper so that it reads more smoothly.
It appears that the matrix in Eqs. (5) and (6) might be singular. How do the authors handle this in their numerical implementation?

__ Correctness__: The experimental validation seems to contain sufficient comparisons to show the effectiveness of the approach. The authors illustrate their approach on various dynamical systems. It would have been nice to really show the effectiveness on a robotic system like a 6D arm, which may evolve in 3D with hard joint constraints.

__ Clarity__: The paper is clear and sound. I do think that the paper's organization can be largely improved by moving some results (like the energy one) to the main body and shortening the introduction (which seems a bit long, with too many details that could be deferred to the appendix).
It seems that the notation on line 188 differs from that of Eq. (7).
The summary on line 206 is not clear and should be revised.

__ Relation to Prior Work__: The differences from existing state-of-the-art approaches are clearly stated.

__ Reproducibility__: Yes

__ Additional Feedback__: