__ Summary and Contributions__: This paper studies the coding accuracy of networks with 'tight' or 'classic' balance within a mean-field theory. The new formulation makes it possible to dissociate the different factors influencing network performance (type of balance, delays, weight disorder, and noise), and is simple enough to allow for in-depth mathematical analysis. This analysis is used to study several network properties and behaviors, such as the expected bias and variance of a coded signal, and the interactions of balance, noise (or chaos), and delays.

__ Strengths__: Overall, the paper is well written, and I think it presents an elegant unification of various properties of balanced networks in a mathematically tractable way. Some of the results had been known before individually, and this paper sheds new light on them.

__ Weaknesses__: A potential weakness of the paper is that it seeks to analyze the properties of balanced *spiking* networks, but does so in a firing-rate framework. That leaves open the question of how the presented firing-rate models relate to the previous spiking models (beyond a reference to an anonymized paper, which I therefore cannot judge). These differences should be elaborated more clearly.
Also, the paper only shows results for static 1D representations. While the supplementary material provides theoretical results for extensions (which seem sound), it would be nice if these were demonstrated and compared to numerical results.

__ Correctness__: Everything seems correct.

__ Clarity__: The paper is generally well written. A few suggestions for improvement:
(1) Figure 1 - the font sizes in the figure vary too much.
(2) Figure 2 - the example traces and sampling of parameters could use more than one 'oscillations' example.
(3) All figures - I appreciate that the paper should fit in 8 pages, but the figures, axis labels, etc. are so small as to be hard to read in many places.
(4) Line 38 - When discussing predictive coding, a reference is made to Eliasmith & Anderson (2004), which does not seem to be the correct reference.

__ Relation to Prior Work__: Yes

__ Reproducibility__: Yes

__ Additional Feedback__:

__ Summary and Contributions__: This paper addresses the theoretical principles governing the efficacy of balanced predictive coding and its robustness to noise, synaptic weight heterogeneity, and communication delays. For this purpose, the paper introduces an analytically tractable model of balanced predictive coding, in which the degree of balance and the degree of weight disorder can be dissociated. This model is then used to derive a mean-field theory of coding accuracy. The work is a step forward towards dissecting the differential contributions of neural noise, synaptic disorder, chaos, synaptic delays, and balance to the fidelity of predictive neural codes.

__ Strengths__: The mathematical formulation seems to be sound, and the results are relevant.

__ Weaknesses__: The decoding of the input used in the main body of the paper rests on a strongly simplifying assumption that largely ignores the sophisticated nonlinear transformation performed by the network itself.

__ Correctness__: The claims made by the paper seem to be correct.

__ Clarity__: The paper is clearly written, but it may be hard to read for researchers not familiar with the theoretical work on spiking neural networks.

__ Relation to Prior Work__: The previous work seems to be well addressed.

__ Reproducibility__: Yes

__ Additional Feedback__:

__ Summary and Contributions__: The goal of this paper is to understand under which conditions recurrent networks efficiently encode stimuli. To this end, the manuscript focuses on a class of models that interpolate between randomly-connected and predictive coding networks. Using a detailed mean-field theory, the authors determine the influence of different parameters (recurrent feedback, noise, random connectivity, delays) on stimulus encoding. The theoretical predictions show an excellent match to simulations, and provide a clear picture of how various parameters influence coding.

__ Strengths__: - remarkable theoretical analysis
- unified picture for a variety of different modelling frameworks

__ Weaknesses__: The paper ends up not answering the main question set up in the Abstract and Intro. The initial goal is to determine the conditions under which the coding error scales as 1/N rather than 1/sqrt(N), where N is the size of the network. But most of the paper focuses on the scaling with other parameters, in particular the feedback b. The scaling with N is neither shown in the Figures nor mentioned in the Discussion (although it can be directly inferred from the results). This looks like an important omission, but it can be fixed easily. My understanding is that 1/N scaling can never be achieved in the presence of delays?
Update: unfortunately the last question was not answered in the authors' response.

__ Correctness__: yes

__ Clarity__: Very clearly written.

__ Relation to Prior Work__:
The model in the absence of delays, and much of the corresponding mean-field theory, look like a specific case of the class of models studied in Ref 23. It would be fair to state this. This fact does not take anything away from the merits of the present study. The main questions addressed here (negative feedback and its impact on coding) were not treated in Ref 23.
Discussion: it may be important to comment on the fact that Ref 7 ascribed the 1/N scaling to the spiking nature of the networks. One possible criticism of the present work is that it deals with rate rather than spiking networks. I don't think the conclusions would change in spiking networks, but it may be useful to pre-empt this criticism.

__ Reproducibility__: Yes

__ Additional Feedback__: Minor comments:
l. 87: "were" -> "where"
l. 128: the relation to E-I balanced networks could be made more explicit. In some versions of those networks, there are also two independent effective parameters that separately scale the negative feedback and the variance of the connectivity (see e.g. Mastrogiuseppe and Ostojic 2017).
l. 223: "the full solution for the chaotic system is highly involved" - the solution for adiabatic inputs seems to be available from Ref 23, but perhaps the situation here is different? My understanding is that we are here in the adiabatic limit, not in the case of Ref 38? In the adiabatic case, why does the (finite) correlation timescale of the noise matter for coding? Is there a transition out of chaos as either b or the strength of the stimulus is increased? It would be useful to clarify these points.

__ Summary and Contributions__: This work theoretically analyzes the dynamics of neural networks that achieve predictive coding under balanced conditions, and offers insight into how different factors, including noise, synaptic disorder, synaptic delays, and balance, affect network performance. It is a valuable piece of work for the field.
The authors partly addressed my concerns, and I keep the same score.

__ Strengths__: The theoretical analysis is quite deep. The concept of balanced predictive coding had already been proposed, but this study gives us some new insight into the details.

__ Weaknesses__: The analysis is based on a simple form of the network dynamics, Eqs. 1-2, which seems quite artificial: e.g., the feedforward input takes the form w_i x, and the same w_i appears in the recurrent connections and the read-out vector. It is unclear to me how a real neural system would achieve this. Can the authors give an example?

__ Correctness__: The claims and methods in this paper are reasonable.

__ Clarity__: It is OK, considering that the authors had to compress many mathematical details into 8 pages.

__ Relation to Prior Work__: It is clearly stated.

__ Reproducibility__: Yes

__ Additional Feedback__: