In the context of network architecture search, this paper studies the role of parameter sharing in trading off search efficiency against architecture discrimination. It does so by proposing (1) a new parameter-sharing scheme, based on a single shared underlying weight matrix and learnt basis functions; (2) a new metric to track the degree of sharing, based on the covariance of sampled weight matrices; and (3) a heuristic that gradually anneals the degree of sharing during training. The authors present extensive experimental validation on competitive datasets, along with a careful analysis of the proposed method.

Reviewers generally like the direction of the paper, and in particular I note [R1]'s enthusiasm for laying the groundwork for studying the role of parameter sharing in architecture search. While I note the objections raised by some reviewers regarding the weak results, the paper does a thorough job analyzing the impact of parameter sharing, both through empirical results and through ablative analyses. I am thus happy to push for acceptance at this point.

Note that I do share some reservations with [R2] regarding how APS integrates into the full network architecture search pipeline, and with [R3] regarding the limited discussion of, and comparison to, Meta-Pruning. I sincerely hope the authors will address these points before publication.

Detailed feedback:
* Please include APS-[O,I] baselines for the ImageNet results.