NeurIPS 2020
### Distance Encoding: Design Provably More Powerful Neural Networks for Graph Representation Learning

### Meta Review

This exciting paper introduces interesting and novel theoretical contributions to the graph neural network literature, and the authors verify some of their theoretical findings empirically.
This paper is worth presenting at NeurIPS on the condition that the authors address the concerns raised by the reviewers regarding writing and clarity.
This paper makes valuable contributions toward better characterizing 1-WL's power, yet it jumps too abruptly from 1-WL directly to distance encoding (DE), skipping discussion of more straightforward conditioning approaches (such as annotation). Without that discussion, it is unclear how much is gained by simpler conditioning alone. We suggest the authors address this in the camera-ready version by revising the writing accordingly.
More concretely, after discussion with the other reviewers, and in particular with reviewer 3, we have identified the following changes that need to be made for the camera-ready version of the paper:
1. Discuss the differences between DE and node subset conditioning. Explicitly point out that distance encoding incorporates the annotation trick and is more powerful than node conditioning (the author response addressed this well). Also point out that node conditioning is itself more powerful than 1-WL (i.e., DE > node conditioning > 1-WL).
2. Clarify that the contribution is the theoretical characterization of DE, not the proposal of DE itself. This can be done by adding some discussion of prior work that uses DE (such as SEAL and PGNN) to Section 4 or to the introduction.