__ Summary and Contributions__: The paper presents an privacy-preserving protocol for learning a logistic regression model on data distributed across multiple data holders. The protocol utilizes secret sharing and Lagrange encodings, and provides informational theoretic privacy against semi-honest adversaries. The protocol tolerates a customizable number of colluding parties.

__ Strengths__: The ideas presented in the paper are clearly justified (proofs in the Appendix) and seem to be correct, although I did not check all details. Key contributions of the protocol are the capability to distribute the computational cost across multiple parties, and the reduction in communication cost due to the use of Lagrange encodings instead of more communication-heavy MPC computations. The authors implemented two baseline protocols to compare against, and as far as I can tell the comparisons are meaningful and fair. The results are particularly powerful due to flexibility in the protocol parameters (T, K, N, r).

__ Weaknesses__: My biggest concerns is the the linear approximation to sigmoid. Does this result in essentially learning a linear regression model instead of logistic regression? If so, how does this scale to larger r? What happens to the communication cost?
I also feel like the empirical evaluation is somewhat lacking:
- The only scenario evaluated is one with 50 data providers. How does this scale to more or less?
- What exactly is the communication size? It seems that the communication time is dominating in all protocols, so it would be important to provide more details on that.
- For the baseline protocols, was a similar linear approximation of sigmoid used?

__ Correctness__: The claims and methods seem correct and justified, but the empirical evaluation is too narrow to give a clear picture of how well this method works beyond the small examples presented.

__ Clarity__: The paper is clear and easy to understand for people with a background in cryptography.

__ Relation to Prior Work__: Prior work is certainly discussed, but I'm not sure if the main differences to prior work are made sufficiently clear for a broader audience to understand.

__ Reproducibility__: Yes

__ Additional Feedback__: - The graphs in Figure 4 are kind of hard to read because they are so small and the graphs are all clustered together. Maybe a table would be better to represent these results.
- There is a typo in the caption of Figure 4a: it says "plain" instead of "plane".

__ Summary and Contributions__: This paper proposed a fully-decentralized training framework that achieves scalability and privacy-protection. The key idea is to securely encode the individual datasets to distribute the computation load effectively across many parties and to perform the training computations as well as the model updates in a distributed manner on the securely encoded data. Convergence analysis and experimentation are provided.

__ Strengths__: This paper is generally well-organised and well-written.

__ Weaknesses__: 1. The major concern of the reviewer is on the novelty. All the adopted techniques are well-known, this paper just combines previous techniques into one framework.
2. FL with MPC is especially susceptible to poisoning attacks as the individual updates cannot be inspected.
3. Lack of theoretic support on the statement "Our framework can provide strong privacy guarantees against colluding parties with unbounded computational power".
4. Lack of comparison with the most recent MPC protocols, all the compared protocols were proposed 10 years ago.
5. Figure 3 is weird, training time of CodedPrivateML even decreases with the increasing number of of clients N, which seems counterfactual, any explanation here?
6. Implementation on simple logistic regression model cannot fully validate the scalability of CodedPrivateML. It is expected to see how CodedPrivateML performs on more complex DNN models.

__ Correctness__: Probably correct

__ Clarity__: yes

__ Relation to Prior Work__: Lack of comparison with the most recent MPC protocols, all the compared protocols were proposed 10 years ago.

__ Reproducibility__: Yes

__ Additional Feedback__:

__ Summary and Contributions__: The authors present distributed privacy-preserving logistic regression training using packed Shamir secret sharing. The degree of distribution and number of corrupted parties can be set independently, thus allowing exploring a space of computation cost and security considerations. Benchmarks are given in comparison to using plain Shamir secret sharing.

__ Strengths__: Correctness and security are evident. The application of packed Shamir secret sharing to distribute the training computation is interesting.

__ Weaknesses__: There isn't much novelty in terms of the components.
Post rebuttal: I'm satisfied with the rebuttal, and I agree in particular with the authors' point that there isn't much recent work to compare with.

__ Correctness__: I cannot find any faults.

__ Clarity__: The paper is very well written in terms of grammar style, but there is potential for improvements in terms of clarity: Section 3 spells out Lagrange interpolation several times. I think the paper would benefit from a general introduction of the Secret sharing scheme followed by a more abstract description of the protocol.

__ Relation to Prior Work__: The authors attribute their main technique to Yu et al. (AISTATS '19). However, it was first proposed by Frankling and Yung (STOC '92), and subsequent works have established the name packed secret sharing (Damgard et al., Eurocrypt '10). I find the term Lagrange coded computing confusing.

__ Reproducibility__: Yes

__ Additional Feedback__: Have the authors considered using pseudo-random secret sharing (Damgard et al., TCC '06) to generate the random parameters Z_k instead of a "crypto-service provider)?

__ Summary and Contributions__: The paper aims at proposing a fast encryption based method for training a logistic regression model privately, on data from multiple parties.

__ Strengths__: The problem of training ML models securely and privately is a very relevant problem, and I commend the authors for addressing it and trying to provide speedup.

__ Weaknesses__: The paper claims 16X speedup over a baseline from 1988, and 8X over a baseline from 2008. I wonder if there are any more recent works to compare with? I know that for deep learning there is constant improvement in the speed of encryption based methods, however I am not entirely familiar with the existing work for algorithms with machine learning. For example you could maybe compare to [25] and [26] in the paper.

__ Correctness__: The paper seems solid in terms of experimentation and reasoning. However I am not an expert in encryption.

__ Clarity__: yes

__ Relation to Prior Work__: The paper has cited related work, but does not very well explain the difference between their contributions and prior work.

__ Reproducibility__: Yes

__ Additional Feedback__: I would like to see comparison to more recent work, and also I would like to see a break down of memory consumption and communication costs, since communication is also a bottleneck in these systems.
___________
After rebuttal:
The rebuttal addressed my concern, and I have updated my score accordingly.