Review for NeurIPS paper: Neural Unsigned Distance Fields for Implicit Function Learning

NeurIPS 2020

Neural Unsigned Distance Fields for Implicit Function Learning

Review 1

Summary and Contributions: The paper proposes to regress an unsigned distance function instead of the signed distance commonly used in implicit shape representations. It shows that, with the unsigned version of implicit functions, complicated 3D objects with interior structures can be faithfully represented.

Strengths: Using an unsigned distance field to represent open surfaces is an interesting idea. The paper also provides enough technical detail on how to perform different kinds of rendering with the proposed unsigned implicit representation. It certainly holds technical value for the ML and graphics communities.

Weaknesses: I'm a bit confused by the claim that one of the benefits of NDF over SDF is the ability to also model functions and manifolds (Fig. 1). Isn't it trivial to fit a function directly with a network? What does it mean to use an unsigned distance field to represent a function, and what are the benefits? A similar question applies to manifolds. The current experiments are limited to toy settings where near-perfect 3D ground truth is already given for training. What happens if the ground truth is corrupted, incomplete, or sparse? Are there any comments or insights on how the proposed NDF could be applied to unsupervised learning with differentiable rendering?

Correctness: The intuition behind using an unsigned distance representation, as well as the technical details of rendering, appear to be correct to me. The minor issues I had are discussed above.

Clarity: The paper is mostly clearly written. It would be better if the authors could explain a bit more why they consider modeling functions and manifolds with NDF preferable.

Relation to Prior Work: Yes

Reproducibility: Yes

Additional Feedback:

Review 2

Summary and Contributions: The paper proposes an approach that produces an unsigned distance field as a 3D shape representation from a sparse input point cloud. This approach facilitates training on 3D datasets for which watertight meshes are hard to generate. Furthermore, it is also applicable to general curve, surface, and manifold (spiral) approximation. The approach is simple and general, which promises wider applicability. Update: The authors have provided more experimental results on ShapeNet and a comparison with SAL. Since the baseline methods do not generate internal surfaces, the Chamfer distance will definitely be lower for NDF when full meshes are used as ground truth. Therefore, the provided Chamfer distance scores for ShapeNet seem reasonable. I would be interested in an experiment where the Chamfer distance is computed using only external surfaces. Also, the renderings are a bit noisy. I acknowledge the limitation that the method does not fit into the classical rendering pipeline, but I still think that the paper can be a good contribution to the community, considering that it helps in generating internal surfaces and can work on non-watertight meshes (garments). As an afterthought, how realistic is the setting of having just points without normals? Scanners give both points and normals, albeit noisy ones. It seems a bit unrealistic that you would have points sampled from internal surfaces without any normals. If we have noisy points with normals, we can use the approaches from the following papers: 1. Implicit Geometric Regularization for Learning Shapes; 2. Implicit Neural Representations with Periodic Activation Functions. Both papers should be considered contemporary, though. I am willing to accept the paper if the authors include detailed experiments on ShapeNet categories, along with experiments on non-watertight meshes like garments and open scenes.
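The Chamfer-distance remark above can be illustrated with a minimal sketch. It assumes one common definition of the metric (mean squared nearest-neighbor distance, summed over both directions), which may differ from the one used in the paper, and it uses two concentric spheres as a hypothetical stand-in for a shape with an outer shell and an internal surface. All names here are illustrative.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed definition: mean squared nearest-neighbor distance, summed over
    both directions. Brute force, so only suitable for small point sets.
    """
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)  # (N, M)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

def sample_sphere(rng, n, radius):
    """Sample n points uniformly on a sphere of the given radius."""
    v = rng.normal(size=(n, 3))
    return radius * v / np.linalg.norm(v, axis=1, keepdims=True)

rng = np.random.default_rng(0)
outer = sample_sphere(rng, 500, 1.0)   # toy "external" surface
inner = sample_sphere(rng, 500, 0.5)   # toy "internal" surface
full_gt = np.concatenate([outer, inner])

# A method that reconstructs only the outer shell is penalized for every
# missing inner point when the full mesh is the ground truth.
cd_outer_only = chamfer_distance(outer, full_gt)
# A method that reconstructs both surfaces perfectly scores zero.
cd_full = chamfer_distance(full_gt, full_gt)
print(cd_outer_only, cd_full)
```

In this toy setup the outer-only prediction is penalized purely for the missing interior, which is why a score restricted to external surfaces would separate reconstruction accuracy from interior coverage.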

Strengths: 1. Although it is obvious how to produce an unsigned distance field using implicit-function neural networks, it is less obvious that a simple gradient-descent-based method can recover the underlying surface, as shown in Algorithms 1 and 2. The approach is founded on the observation that a surface point can be recovered by moving in the negative gradient direction of the distance field, weighted by some constant. The observation is well founded. 2. The main contribution is how to extract the surface once you have a network that gives a reasonable unsigned distance field, which is explained in Algorithms 1 and 2. 3. I appreciate the wider applicability of the approach, in terms of its use on non-watertight surfaces and the approximation of manifolds, though more rigorous analysis is needed to establish the validity of the latter. 4. Experiments are done on the car category from ShapeNet and on garments for shape reconstruction, and they definitely show improvement over the presented baselines.
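The observation in point 1 can be sketched as follows. This is not the paper's Algorithm 1. It is a minimal illustration that assumes an exact, analytic unsigned distance to a unit sphere in place of a learned network, with the step length taken to be the predicted distance. The functions `udf_sphere`, `udf_sphere_grad`, and `project_to_surface` are hypothetical names. With a trained network, the gradient would come from automatic differentiation instead.

```python
import numpy as np

def udf_sphere(p, radius=1.0):
    """Exact unsigned distance from points p (N, 3) to a sphere at the origin."""
    return np.abs(np.linalg.norm(p, axis=1) - radius)

def udf_sphere_grad(p, radius=1.0):
    """Analytic gradient of udf_sphere (stand-in for autodiff on a network)."""
    r = np.linalg.norm(p, axis=1, keepdims=True)
    return np.sign(r - radius) * p / np.maximum(r, 1e-12)

def project_to_surface(f, grad_f, p, num_steps=3):
    """Repeatedly move each point against the normalized gradient of f,
    using the predicted distance f(q) as the step length."""
    q = p.copy()
    for _ in range(num_steps):
        g = grad_f(q)
        g = g / np.maximum(np.linalg.norm(g, axis=1, keepdims=True), 1e-12)
        q = q - f(q)[:, None] * g
    return q

rng = np.random.default_rng(0)
points = rng.uniform(-1.5, 1.5, size=(1000, 3))  # samples on both sides of the surface
projected = project_to_surface(udf_sphere, udf_sphere_grad, points)
max_residual = udf_sphere(projected).max()       # distance left after projection
print(max_residual)
```

For an exact distance field a single step already lands on the surface. Several steps matter only when the predicted distances and gradients are approximate, as they would be for a trained network.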

Weaknesses: 1. Unsigned distance fields have been explored before in the context of surface approximation [A], which addresses the scenario where the signed distance is not available for training. This makes the approach less novel. [A]: SAL: Sign Agnostic Learning of Shapes from Raw Data. 2. Since the experiments are only done on cars and garments in the main paper and on large scene reconstruction in the supplementary material, it is hard to place this work in the wider context of shape reconstruction tasks in the contemporary literature. For example, I would like to know how well it performs on other ShapeNet categories. Is it better than signed-distance-based approaches? How does it compare with the SAL [A] approach for shape reconstruction on watertight surfaces? I would prefer more details on the comparison with SAL, even if only in the supplementary material. If a researcher wants to use this method, can they expect better performance on watertight as well as non-watertight surfaces?

Correctness: 1. The experiment section lacks a comparison on all ShapeNet categories, which makes it hard to judge whether the approach is applicable to a wider range of categories. 2. A comparison with SAL is missing, which in my opinion is very relevant to this work.

Clarity: 1. The paper is mostly well written. 2. There is a typo on line 76 (manifoldq -> manifold).

Relation to Prior Work: The paper covers most relevant works.

Reproducibility: Yes

Additional Feedback: 1. From what I understand, the projection of a point p to a surface point q, as detailed in line 54, is only valid when the norm of the gradient is 1. If so, please either use a different symbol or state this explicitly. 2. Please provide more information about run time and num_steps. How does performance vary with num_steps? Can higher-order derivatives help in finding the surface points faster? 3. A good reconstruction describes the shape using a small number of triangles. The fact that this algorithm can process millions of points and recover the underlying shape may not be desirable if millions of triangles are required to faithfully reconstruct the shape. This work needs a bit more analysis of how many initial points are needed to get reasonable performance, along with the tradeoffs. 4. Why are normals computed away from the surface a good approximation of normals on the surface? How good is this approximation quantitatively? 5. Unlike signed distance function learning, does this approach guarantee that f(p) is zero on the surface up to a certain tolerance? I can imagine the performance of this approach being sensitive to hyper-parameters. Does the hyper-parameter range vary much across different categories? In general, this paper can be a good contribution to the community if more detailed analysis and experiments are done. As it stands, the paper does not inspire confidence in this particular approach.
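The condition raised in point 1 can be spelled out with a short derivation. It assumes that f is the exact unsigned distance to a surface S and that p lies off S at a point where the closest surface point q is unique. The notation is illustrative and not necessarily the paper's.

```latex
% Assumption: f(p) = \min_{s \in S} \|p - s\| with a unique minimizer q, and p \notin S.
\[
  f(p) = \|p - q\|, \qquad
  \nabla_p f(p) = \frac{p - q}{\|p - q\|}
  \quad\Longrightarrow\quad
  \|\nabla_p f(p)\| = 1 .
\]
% Substituting both expressions recovers the closest surface point in one step:
\[
  p - f(p)\,\nabla_p f(p)
  = p - \|p - q\|\,\frac{p - q}{\|p - q\|}
  = q .
\]
% For a learned approximation with \|\nabla_p f(p)\| \neq 1, normalizing the gradient
% makes the intended step explicit, but q is then reached only approximately:
\[
  q \approx p - f(p)\,\frac{\nabla_p f(p)}{\|\nabla_p f(p)\|} .
\]
```

Under these assumptions the unit-norm property follows from f being an exact distance. A learned field satisfies it only approximately, which is one reason a projection of this kind might be iterated.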

Review 3

Summary and Contributions: The authors propose to represent 3D shapes using "NDFs" --- deep implicit unsigned distance functions. This representation admits a broader class of shapes than the recently popular SDF representations. Algorithms are provided to extract point clouds, meshes, or images from the learned implicit NDFs.

Strengths: Deep learning on 3D shapes in general and deep implicit representations for 3D geometry in particular are useful and exciting research topics. Expanding the class of shapes that can be represented in this fashion (e.g., not being limited to watertight manifolds) is a step towards more general 3D learning pipelines.

Weaknesses: The authors claim that their method has the benefits of (1) being able to learn from unoriented geometry, for which an SDF is not known or available, and (2) being able to reconstruct non-manifold geometry that does not have a well-defined SDF. With respect to (1), more discussion of and experimental comparison to "SAL: Sign Agnostic Learning of Shapes from Raw Data" is necessary. While the authors mention this work, they note that it still outputs SDFs. This should not preclude comparison experiments, e.g., on ShapeNet. On the other hand, with respect to (2), the authors don't sufficiently motivate the benefit of being able to learn implicit representations for non-manifold geometry; the examples under "Functions and Manifolds" are mainly toy examples. A lot of effort has been put into computing orientations for unoriented surfaces or point clouds, since algorithms for rendering, simulation, etc. often require consistent normals. By definition, the geometry represented by the proposed method is incompatible with these graphics pipelines. Additionally, insufficient experimental detail is provided to evaluate the method against previous work. The authors only compare reconstruction quality on a single shape category (cars), and, moreover, very little information is provided about the set-up: what are the network architecture, learning rate, and training time? Is there a train/test split, and are quantitative statistics provided on the test set?

Correctness: Overall, the claims made in the paper are valid, though a lot of technical details are missing that make it difficult to fully evaluate the experiments. The authors state that SDFs are "limited to 3D shape representations," whereas their method is more generic. This isn't true --- SDFs can certainly be used to capture 2D watertight manifolds (e.g., closed curves).

Clarity: The paper is generally clearly written. As mentioned above, more details about the specific set-up for the experiments are necessary for reproducibility and fair comparison.

Relation to Prior Work: The prior work is discussed sufficiently, and novel contributions are explicitly presented.

Reproducibility: No

Additional Feedback: Post-rebuttal update: Thank you to the authors for the clarifications. These have largely addressed my concerns with respect to comparisons with SAL. However, I am still not convinced that the contribution is sufficiently motivated. Generalizing SDFs to unsigned distances is not a particularly novel idea, and it has even been tried in the context of deep learning. I'm not sure that the demonstrated target application of modeling internal surfaces (e.g., cars) is a sufficiently convincing use case. I think the paper would be much stronger if it showed some of the following: novel theoretical results or ideas specific to unsigned learned implicit fields, application to modeling scenarios that truly require such a representation (e.g., garments), or an extension of standard rendering pipelines to this representation.

Review 4

Summary and Contributions: This paper proposes an implicit representation for 3D geometry. Previous works use signed distance fields and are limited to watertight surfaces. In contrast, this paper proposes an unsigned distance field that can represent both watertight and non-watertight surfaces. The experiments show that the proposed representation can accurately represent complex geometries as well as curves and manifolds.

Strengths: 1. The introduction of the unsigned distance field tackles a major limitation of the commonly used signed distance field and achieves more accurate reconstructions on non-watertight surfaces, such as vehicles with complex interior structures. 2. The paper proposes solutions for extracting dense point clouds from the learnt unsigned distance field and presents techniques for rendering surfaces and curves from the distance field. 3. The experimental results on representing complex interior structures of objects are impressive. Such a representation could be used in many other tasks in 3D reconstruction and neural rendering.

Weaknesses: 1. While the unsigned distance field is applied here for the first time in a deep learning setting, I would imagine that it has been well studied in traditional CG and CV. However, I didn't find much discussion of this in the paper. How do the algorithms used for point cloud extraction and ray tracing relate to previous works? Adding citations to previous works on related topics would help to better position the paper. 2. Compared to SDF, the advantage of NDF is obvious. What are the potential disadvantages of NDF? A discussion of limitations would help readers better understand the method. Overall, I like the results presented in the paper, and I also find the method interesting. The use of NDF would be a good addition to 3D deep learning.

Correctness: The paper is technically sound.

Clarity: The paper is well written.

Relation to Prior Work: More discussion of previous methods based on unsigned distance fields could be added.

Reproducibility: No

Additional Feedback: More details on the network architectures should be provided. Post-rebuttal: The rebuttal addresses my concerns. While the unsigned distance field has its own problems, I believe it can be useful in some scenarios where watertight surfaces are not applicable. So I keep my original score and agree with accepting the paper.