__ Summary and Contributions__: Carlsson’s TDA is one of the most elegant analyses for deciding on the most stable topology of a cloud of points. Here, that theory is used in an innovative manner to study task-evoked within-subject brain state similarities (here expressed as clusters) over the decoded topology. The authors associate this construct with functional connectivity (ln 34), although one could argue over whether the two constructs are the same or merely analogous. Regardless of the philosophical discussion, this is an excellent paper, with rock-solid maths, replicable experiments, strong and clear results, and conclusions commensurate with the evidence provided.
It is my understanding that the distinctive keys to the proposal are (i) the chosen projection to the cubical complex, as this defines the topology from which the filtration of the simplicial complex is derived, and (ii) the trajectory analysis, which encodes the solution to the neuroscientific question at hand. Although the method is exemplified by capturing age differences, it is easy to see how it can be reused to explore other questions.
In terms of innovation, this paper is arguably the best of my reviewing lot for this year. Moreover, the paper makes a critical statement: the simplicial complex generalizes the graph (ln 64). This is known (no news here), yet the remark is extremely important in my opinion; despite all the excellent contributions to understanding brain function based on graph theory, the graph (any kind!) is insufficient to express all the complexity of the brain's integrative activity, and thus more general mathematical objects must be explored.

__ Strengths__: + The maths in section 2 makes an excellent effort to convey the essence of TDA, which is otherwise an extremely sophisticated theory. The complement in A.1 is superb.
+ This is a very innovative work.

__ Weaknesses__: + The link between the construct explored and “classical” neuroscientific questions, e.g. connectivity, is weak. This could be a strength if it is indeed a new construct with value for neuroscience, but this is not shown.
+ Synthetic verification and validation are lacking.

__ Correctness__: The underpinning TDA theory is very strong mathematically. Theorem 1 (stability) is relatively simple, and the proof in the supplementary material is virtually trivial. So, unless I’m missing something, the theory is correct.
Experimentally, the authors have opted for a relatively simple design, which also adds to the strength of the paper and its message. The dataset is external and hence control over its acquisition is limited, but this is in principle irrelevant to the hypothesis made.

__ Clarity__: Given the complexity of the underpinning theory (TDA), the clarity is excellent in my opinion (although I may be biased in the sense that I am fairly familiar with TDA). A novice may struggle to follow.
The rest of the paper (introduction, experiment description and results) is also easy to follow, again provided you are somewhat knowledgeable in the field. Perhaps only some more detail on the processing of the MRI dataset is missing, e.g. specific functions and parameterizations. (Note that this is not the description of the fMRI dataset, which can be found in Supp. Material A.3.)
Figure labels are in some cases unreadable because of the font size (sorry, I’m old and my sight is starting to fail!). Since some of the colors are only for aesthetic purposes, e.g. Fig 2c, Fig A.5, etc., they could perhaps be dropped to avoid potential confusion.

__ Relation to Prior Work__: The relation to previous work is perhaps one of the weakest parts of the paper. However, the text is very dense throughout the full draft (the authors succeed in saying a lot with very few words); hence, I believe that they might have sacrificed a bit of the referential framework to have more room for presenting their research, which is a totally understandable decision.

__ Reproducibility__: Yes

__ Additional Feedback__: + Something that I appreciate about graph-theoretic analysis in neuroscience is that most properties of the graph have a clear neuroscientific interpretation. Bringing a new object to the field requires some time to establish such a “dictionary” between mathematical properties and associated domain interpretations. Here, the curvature analysis is fascinating from a mathematical point of view, and it is brilliantly used to discriminate groups. However, prediction does not necessarily go hand in hand with explanation, and I reckon there would be other mathematical properties of the trajectory object that would be expressive of such group differences. Is there any translational meaning of the curvature to neuroscience?
+ Only face validity is established. Other types of validity could have been attempted, although I reckon the authors are keeping this for a journal.
+ The relevance of the work is, in my opinion, beyond doubt. Notwithstanding, it is difficult not to wonder whether a simpler approach would perhaps have succeeded in answering the question at hand. No effort is made to show whether this is the case or not. For instance, in the coarsest sense, classification of age groups from the observations could have been addressed with a classical classification approach, also leading to significant differences.
===========
After Rebuttal
===========
Following the authors' rebuttal, the discussion on this paper was opened. If I am interpreting my fellow reviewers' opinions correctly, my feeling is that, although we did not reach full agreement (the rejection position was not too strong, but neither was there a push for an oral), acceptance of the paper can at least be recommended.

__ Summary and Contributions__: The authors propose to apply time-varying persistence diagrams from algebraic topology to fMRI volumes, to show that these topological representations are capable of capturing age-related differences between adults and children.

__ Strengths__: - This work makes use of topological representations from persistence diagrams, which are robust to noise and variability among individuals; this is an interesting approach.
- This is the first work to apply topological data analysis directly on fMRI data, as claimed by the authors.

__ Weaknesses__: - Persistent homology on brain topology has been studied before, e.g., Chung et al., "Persistent homological approach to detecting white matter abnormality in maltreated children: MRI and DTI multimodal study", and many other papers from his group. What is the difference between Chung's work and this one, and what novelty does this paper provide other than going from networks to 4D?
- What is the dimension of the real data?
- The paper lacks a detailed explanation of the neuroscience background, which is supposed to be a very important piece of this paper.

__ Correctness__: Yes

__ Clarity__: Not quite. It is a little difficult to follow.

__ Relation to Prior Work__: Not quite.

__ Reproducibility__: Yes

__ Additional Feedback__: The authors mainly apply existing methods from algebraic topology, and the paper seems to lack critical contributions. Even as an application paper, it is not very clear why this work is significant for the neuroscience domain.

__ Summary and Contributions__: The paper applies persistent homology with cubical complexes to time-varying fMRI data. The idea is to use the whole-brain fMRI image (with a whole-brain mask or ROI masks applied) as the filter function for the computation of persistent homology. The persistence diagrams over different time points are then used to differentiate age groups. A first study uses the total persistence measurement (one scalar value per diagram) over different time points. A second study uses persistence images, which map the diagrams into vectors. Both cases show that the topological features are able to differentiate the age groups.
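To make the first summary concrete, the total-persistence step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each diagram is already available as an array of (birth, death) pairs, and the function names and the exponent `p` are hypothetical choices made here; the paper's exact definition or normalization may differ.

```python
import numpy as np

def total_persistence(diagram, p=1.0):
    """Sum of |death - birth|**p over the finite points of one persistence diagram.

    `diagram` is an iterable of (birth, death) pairs (assumed input format).
    Points with an infinite coordinate (essential classes) are ignored.
    """
    pts = np.asarray(diagram, dtype=float).reshape(-1, 2)
    finite = np.isfinite(pts).all(axis=1)
    lifetimes = np.abs(pts[finite, 1] - pts[finite, 0])
    return float(np.sum(lifetimes ** p))

def total_persistence_curve(diagrams, p=1.0):
    """One scalar per time point: maps a sequence of diagrams to a 1-D array."""
    return np.array([total_persistence(d, p) for d in diagrams])
```

Calling `total_persistence_curve` on the list of per-time-point diagrams for one subject yields a time series of scalars, which is the kind of object that can then be compared across age groups.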

__ Strengths__: The result seems scientifically significant. The usage of topological methods is well thought out and properly carried out. The experiments are well executed. The method is compared against baselines built on non-topological descriptors for fMRI. The research will clearly lead to interesting discoveries about the data.

__ Weaknesses__: My main concern is that the novelty of the methodology is very limited given the abundant previous applications of persistent homology to various images (including fMRI). There is a long list of previous results on applying persistent homology to fMRI, structural MRI (mostly resting-state though), and EEG data (the first published in 2009, "Persistence Diagrams of Cortical Surface Data", IPMI 2009). These methods should have been cited and compared against.
I do agree that the findings on the dataset are potentially impactful, and I think the paper is quite well written. However, I think its value is only in the application of the method to this particular dataset and the novel domain-specific insights. I do not think this paper fits the NeurIPS conference very well. It seems to be a better fit for a neuroscience venue such as NeuroImage, Human Brain Mapping, etc.
After reading the rebuttal and other reviews, I am raising my score but I am staying on the negative side.
This paper is a good one for the application of persistent homology to this particular long-term task fMRIs. I do not think it is a good fit to NeurIPS if we were to judge by the methodology part. But I am partially convinced by my fellow reviewers that if NeurIPS were to have any paper on the neuroscience track, this should be one.
The idea of using cubical persistence in the imaging context is straightforward, although I have not seen any methods using cubical complexes on fMRI.
One thing I would encourage the authors to add to the final version of the paper: the baselines (baseline-PP and PP-TDA) only take average values within 100 ROIs (a 100-times dimension reduction), and they only use the correlation between ROIs. Meanwhile, persistent homology uses the full voxel image. To be more convincing, the authors should use the actual full fMRI image (like MVPA). If the data samples are insufficient, the authors should use the average/max/min values of each ROI as features and show that persistent homology features outperform these voxel-value-based features.
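The suggested voxel-value-based baseline could look roughly like the sketch below. It is only an illustration under stated assumptions: the volume and the ROI parcellation are taken to be NumPy arrays of the same shape, label 0 is assumed to mean background, and the function name `roi_summary_features` is hypothetical, not something from the paper or its code.

```python
import numpy as np

def roi_summary_features(volume, labels, n_rois):
    """Mean, max and min voxel value for each ROI of a single fMRI volume.

    volume : array of voxel intensities (any shape).
    labels : integer array of the same shape; 0 = background, 1..n_rois = ROI id
             (assumed labelling convention).
    Returns an (n_rois, 3) array; rows of empty ROIs are left as NaN.
    """
    volume = np.asarray(volume, dtype=float)
    labels = np.asarray(labels)
    feats = np.full((n_rois, 3), np.nan)
    for roi in range(1, n_rois + 1):
        vals = volume[labels == roi]
        if vals.size:
            feats[roi - 1] = (vals.mean(), vals.max(), vals.min())
    return feats
```

Flattening the result per time point gives a feature vector three times the size of the mean-only baseline, which could be fed to the same classifier as the persistent homology features for a like-for-like comparison.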

__ Correctness__: Yes.

__ Clarity__: Yes. The paper is clearly written.

__ Relation to Prior Work__: No. While the authors tried to cover quite a few papers from the original persistent homology literature and from recent progress in applying persistent homology to learning, it seems that they missed a whole literature of persistent homology applied to brain imaging.

__ Reproducibility__: Yes

__ Additional Feedback__: The implementation details are well presented and the code is (or will be) shared. I really appreciate it. I am a bit curious about the time-series visualization tool. Perhaps a brief explanation in the paper would be helpful.