Submitted by Assigned_Reviewer_5
Q1: Comments to author(s).
First provide a summary of the paper, and then address the following
criteria: Quality, clarity, originality and significance. (For detailed
reviewing guidelines, see
http://nips.cc/PaperInformation/ReviewerInstructions)
Summary: The authors present a recurrent neuronal
network, modeled as a continuous-time dynamical system, that can solve
constraint satisfaction problems.
Since the authors did not
completely define their recurrent neuronal network, it is hard to
understand, at least for me. For example, v(t) and u(t) appear without
definition in line 107. The single WTA network described in line 151 and
Figure 1(a) is not mathematically defined. The synaptic connections of the
coupled WTA circuit described in lines 158-159 and Figure 2 are not
defined; the authors just say that they asymmetrically coupled a number
of WTA circuits. Regarding lines 178-180, I could not understand how the
binary value is encoded in the state of the oscillatory network. WA and WB
appear without definition in line 191. The variables X1 and X2 appear
without definition in line 199. Are X1 and X2 synaptic weights or the
names of two WTA circuits?
It is well known that neural network
models tend to behave chaotically when the synaptic connections
change dynamically, for example under short-term synaptic depression. I
suppose that this is the reason why the present model seems to be
stochastic even though it is defined as a deterministic system.
I also think that the system size of the present model is too small (ten
units with 9 constraints). For example, more than 10^3 variables are usually
discussed for the 3-SAT problem. I have to say that the examples in the
present paper are not sufficient in the framework of CSPs.
Q2: Please summarize your review in 1-2 sentences
I could not understand the contents of this paper, because
the description seems to be incomplete, as described in Q1.

Submitted by Assigned_Reviewer_7
Q1: Comments to author(s).
First provide a summary of the paper, and then address the following
criteria: Quality, clarity, originality and significance. (For detailed
reviewing guidelines, see
http://nips.cc/PaperInformation/ReviewerInstructions)
The authors propose a new approach for solving
constraint satisfaction problems (CSPs) in a neural architecture. While
previous attempts in this direction have largely relied on stochastic
neuron models to prevent a neural network from getting stuck in local
optima (e.g. through the use of Boltzmann machines), the proposed
architecture is capable of finding optimal solutions to problems using
purely deterministic network dynamics. This is achieved by coupling
oscillator modules such that the strength of their mutual interaction
depends on their phase difference. Although the paper does not provide a
rigorous theoretical explanation for it, these phase differences are
observed to vary irregularly in simulations, thus providing sufficient
exploratory drive to escape local optima. In small simulations of a CSP
with ten binary variables and nine ternary constraints, the network is
observed to find either of the two correct solutions (which satisfy all
constraints) in each trial. Interestingly, exploration is driven purely by
violated constraints and thus, in contrast to stochastic approaches to
solving CSPs, the network will stay in the global optimum once it is found
(without the need for morphing into a greedy search as, for example, in
simulated annealing). If no solution exists which satisfies all
constraints, the network behavior resembles a sampling process from a Gibbs
distribution with a linear energy term. Unfortunately, a theoretical
explanation of this behavior could not be provided by the authors.
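For concreteness (the notation here is my own, not the paper's): writing
E(x) for the number of violated constraints in network state x, the time
spent in state x was observed to resemble sampling from

    p(x) \propto \exp(-c \, E(x))

for some constant c > 0, i.e. a Boltzmann factor whose energy is linear in
the number of violations.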
The presented approach to solving CSPs in a deterministic neural
architecture is, to the best of my knowledge, original and novel. It
provides both an innovative approach for solving CSPs in neuromorphic
hardware and a fresh perspective on oscillations and synchronization
between oscillating modules (and lack thereof in the form of irregularly
varying phase differences) in neuroscience.
The applications are
non-trivial, although too few and too small to provide insight into the
scalability of the method. On the positive side, the problems were chosen
to be both illustrative and difficult in the sense that the energy
landscape was designed to possess several local optima and more than one
global optimum (especially in Fig. 3). On the negative side, the state
spaces of the considered problems are very small (2^10 or 2^11). For such
state spaces simple exhaustive search works well and is a much easier
option. I would suspect that larger simulations would show that the
algorithm still performs well, within what one may expect from a parallel
iterative algorithm. Unfortunately the paper does not address scalability,
and so the utility of the approach for larger problems remains unclear. At
least a discussion of scalability is warranted and would be highly useful
for the understanding of the method.
In terms of clarity and
quality of figures and text the paper rates highly, yet no theoretical
arguments were given for the observed behavior of the network. The paper
is well organized, the model is for the most part clearly defined, the
chosen examples are illustrative, the figures convey a clear message, and
the explanations of the network dynamics are insightful. The lack of
theoretical arguments is an obvious shortcoming of the paper. Furthermore,
I'm missing a specification of the parameters used in the simulations
(parameters in Eqs. (1) and (2)). Were the same parameters used in the
simulations for Figs. 2, 3 and 4?
The paper is potentially
significant, as it provides a new and relatively simple neural
architecture for solving CSPs with deterministic neurons and has a large
potential for neuromorphic applications. Furthermore, from the perspective
of neuroscience, the potential relation between oscillations and the
search for “consistent states” in the brain is intriguing and certainly
worth pursuing.
Q2: Please summarize your review in
1-2 sentences
This is an interesting and well written paper that
provides an innovative model for solving CSPs in neural networks with
oscillations, offering potential applications in neuromorphic engineering
and a new perspective on oscillations in neuroscience. Two shortcomings of
the paper are the lack of a theoretical treatment, and the lack of
simulations concerning (or at least a discussion of)
scalability.

Submitted by Assigned_Reviewer_8
Q1: Comments to author(s).
First provide a summary of the paper, and then address the following
criteria: Quality, clarity, originality and significance. (For detailed
reviewing guidelines, see
http://nips.cc/PaperInformation/ReviewerInstructions)
Overview:
The paper studies a neuro-dynamic
system that is able to solve constraint satisfaction problems (CSP).
The dynamic equations for neural interaction and synaptic
weight changes are relatively straightforward (equations 1 and 2),
while the architectures of the networks are relatively complex. The
elementary computational units are winner-take-all (WTA) type networks
that oscillate. It is stated that a crucial property of the elementary
circuits is their sensitivity to input only at specific points of
their oscillatory period(s). If the connectivity between such WTA
units is chosen to reflect a CSP, it is stated that the dynamics
finally settle into a state that represents a solution to the CSP.
Simulations of the neuro-dynamics for small and intermediately large
CSPs empirically show that the neuro-dynamics can find solutions, and
that it can find approximately optimal solutions if there are no fully
consistent variable assignments.
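To make the setup concrete for readers, here is a minimal sketch of a
generic soft WTA rate network of the kind the paper appears to build on
(my own illustration: the paper's actual equations (1)-(2), parameter
values, and the mechanism that makes the units oscillate are not
reproduced in this review, so all details below are assumptions):

    import numpy as np

    def wta_step(x, inp, dt=0.01, tau=0.02, alpha=1.2, beta=2.0):
        # One Euler step of tau * dx/dt = -x + relu(inp + alpha*x - beta*sum(x)):
        # alpha is recurrent self-excitation, beta a shared inhibitory feedback.
        drive = inp + alpha * x - beta * x.sum()
        return x + (dt / tau) * (-x + np.maximum(drive, 0.0))

    # Two competing excitatory units receiving slightly different inputs.
    x = np.zeros(2)
    inp = np.array([1.0, 1.1])
    for _ in range(2000):
        x = wta_step(x, inp)
    print(x)  # the unit with the slightly larger input ends up dominating

The sketch only illustrates the competition and amplification property
mentioned below; it does not oscillate.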
Evaluation:
The
paper is generally well written and the main points become clear. The
question of what computations neural circuits perform and how they are
performed is certainly important and interesting. There is also
evidence for WTA mechanisms that implement a possible form of
competition between neural units. Also, the effectiveness of input
during certain points of the phase is an interesting property, because
small input differences can be amplified (and responses can be
normalized). Even if weakly coupled to an input, such networks can be
very sensitive.
While the general setup is clear, I was missing a
more extensive discussion about the parameters of the network. Usually
such parameters have to be tuned. It does not become clear how
difficult this is for the suggested system. Also a discussion of
related neuro-dynamic approaches is missing. Neural dynamics that
become sensitive to input in a critical period of their oscillation have
been described before, sometimes in the form of sensitivity to initial
states. Such networks all have WTA characteristics and at least some
should be discussed (see, e.g., work by O'Reilly, 2001; Spratling and
Johnson, 2002; Lucke, Malsburg, 2004). The difference from the system
discussed here is the goal: here it is finding solutions to CSPs,
whereas the other networks address component finding in data. Finding CSP
solutions is a well-established problem in computer science, but it
is much less clear why biological networks should be optimized
for such a problem - please discuss. Still, it is interesting from a
technical point of view, and I agree that implementations of inherently
parallel neuro-dynamic approaches via analog VLSI are very
interesting. In this respect please discuss two questions, however:
(1) Isn't the complexity of the problem mapped into the complexity of
the network architecture? (2) Can one be confident that the problem
solving dynamics do not break down if the problem sizes are scaled up?
What about the total number of variables?
The behavior of the
network is interesting, including the behavior for not fully
satisfiable problems. The empirical finding of exponential scaling is
not uninteresting but the sampling analogy is taken too far for my
taste. The energy function could also be used for the satisfiable
problem (Sec. 3.1). But if the network remains in a solution there, it
is not spending time proportional to exp(-cE). Additionally, the
authors themselves say that the network will probably not explore
high-energy states. So there is only a coarse analogy here as far as I
can see.
Also the criticism of other approaches is too general. Of
course there are many approaches on the technical side that have
similar properties. Also, it is not clear how computationally costly
the mechanisms to avoid local optima really are for the studied
dynamics - the time to find optimal solutions may scale exponentially
with problem size, or the architecture may.
In response to author
feedback:
The feedback has clarified most of my questions. I agree
that scalability will be exponential in the worst case. What would be
interesting, though, is how the average time to find a solution scales
with problem size. This should be a point for the revised discussion.
I also assume that more care will be taken regarding the overly broad
criticism of other approaches, that it will be discussed why CSP solving
is interesting for biological systems, and that the analogy to sampling
will be qualified.
Given these improvements, I remain happy with my initial score.
Q2: Please summarize your review in 1-2
sentences
Interesting neural dynamic approach to solving
constraint satisfaction problems. May ultimately result in
state-of-the-art approaches in terms of energy cost per CSP
solution if implemented on VLSI.

Submitted by Assigned_Reviewer_9
Q1: Comments to author(s).
First provide a summary of the paper, and then address the following
criteria: Quality, clarity, originality and significance. (For detailed
reviewing guidelines, see
http://nips.cc/PaperInformation/ReviewerInstructions)
Summary: The authors propose a neural network model
whose deterministic dynamics implement a search and ultimately global
solution to a constraint satisfaction problem (CSP) when a solution
exists. The fundamental computational unit of the network is a
multi-stable oscillator that represents a binary variable. Multiple network
modules are connected to represent multiple variables, and additional
variables are defined as operations on these variables. When the modules
are connected in such a way as to implement a constraint (a desired operation
on the variables), and the network dynamics are such that the constraint is
violated, the phase difference between different oscillators
(variables) shifts so as to alter the state variables that
cause the constraint violation. Once the relevant state variables have
flipped, synaptic plasticity will cause this new state to become stable
and persist, leading to a network state that is consistent with all of the
desired constraints. When large simulations were conducted, the phase
relationship between the oscillators was seen to vary irregularly, which
resulted in sufficient exploratory drive for the network to escape local
optima and search for solutions consistent with all of the desired
constraints. With this property, the network has no need for the
often-employed stochastic neuron models to escape local optima. Simulations were
run showing that the network can find one of two globally optimal
solutions to a 10 variable, 9 constraint problem.
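As a loose intuition for the phase-difference mechanism summarized above,
consider the following toy model (my own construction for illustration,
not the authors' network): two phase oscillators whose interaction depends
on their phase difference, with the coupling pulling the difference toward
a value that stands in for a "consistent" state:

    import numpy as np

    # Toy model (illustrative assumption, not the authors' network): two phase
    # oscillators whose coupling depends on their phase difference. The coupling
    # pulls the difference toward `target`, standing in for the constraint-driven
    # phase shifts described in the summary above.
    dt, steps = 0.01, 5000
    w = np.array([1.00, 1.03])    # slightly detuned natural frequencies
    theta = np.array([0.0, 2.0])  # initial phases
    k, target = 0.4, np.pi        # coupling strength; "consistent" phase difference

    for _ in range(steps):
        d = theta[1] - theta[0]
        coupling = k * np.sin(target - d)  # phase-difference-dependent interaction
        theta += dt * (w + np.array([-coupling, coupling]))

    print((theta[1] - theta[0]) % (2 * np.pi))  # settles close to `target`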
Review: The
paper is generally clear and well written. The figures are mostly
well-chosen (although figure 4 doesn't seem to add much) and support the
text. Network simulation examples are illustrative and provide the reader
with insight as to how the network elements operate and how the
oscillatory dynamics give rise to the plasticity which leads to a state
that satisfies all the constraints. To the best of my knowledge, this
approach is novel and the work posits a unique role for network
oscillations. Furthermore, the model has the desirable property that,
since search for a consistent solution is driven purely by violated
constraints, external input can be evaluated by the network in real-time
without the need for the system to switch between exploratory and greedy
optimization modes. This makes the proposed model relevant to biological
neural networks, which can rapidly assimilate external input with internal
network state without global parameter alterations.
The largest
shortcoming of the paper is that no explanation is provided as to why the
network exhibits the exploratory phase variation that gives rise to its
CSP-solving ability in large network simulations. Considering the fact that
the main result of the paper is a direct consequence of this network
phenomenon, some discussion of the origin of this phase variability would drastically
improve the submission. Section 3.2 only weakly supports the main results
of the paper and could be omitted to make room for a more elaborate
discussion or analysis of the phase variability. Additionally, no mention
is made as to the sensitivity of the network to parameters. A priori, most
network models require careful parameter tuning so that the network
dynamics are stable and exhibit a desired property. I imagine the same
would be true for the proposed model, specifically to achieve the state of
spontaneous phase shifting responsible for the network explorations.
Alternatively, perhaps the rich, phase-varying dynamics are relatively
easy to achieve without careful parameter tuning due to the large size of
the network, similar to a chaotic neural network state, in which case such
a result would itself be interesting. Lastly, the scale of the examples solved
by the network - although useful both to illustrate how the network
operates and to prove that it can solve CSPs - does not provide any insight
into how the network's ability to solve CSPs scales with problem size. A
more complete scaling analysis would be helpful to understand how the
proposed network would perform on larger optimizations, or if the
oscillatory dynamics would even perform the same function at a much larger
scale.
Q2: Please summarize your review in 1-2 sentences
This is an interesting, original model for solving
CSPs in neural networks which employs a novel function for oscillatory
dynamics. A rigorous analysis of the origin of the phase variability that
gives rise to the network's ability to solve CSPs, as well as of the
network's parameter sensitivity, would greatly improve the paper.
Q1: Author rebuttal: Please respond to any concerns raised in the reviews. There are
no constraints on how you want to argue your case, except for the fact
that your text should be limited to a maximum of 6000 characters. Note
however that reviewers and area chairs are very busy and may not read long
vague rebuttals. It is in your own interest to be concise and to the
point.
We thank the reviewers for their comments.
Regarding the issues raised by reviewer_5:
1) "v(t) and u(t) appear without definition in line 107." v(t) and u(t)
are defined in the text directly following the equation in line 107,
i.e. in lines 140-142.
2) "The single WTA network described in line 151 and
Figure 1 (a) is not mathematically defined." The WTA network from
Figure 1(a) is well described in [15], which is the source given for the
paragraph in lines 82 - 90. The mathematical definition further follows
directly from the definition of its components, given in equation (1). All
synaptic weights within a single WTA are static.
3)"Synaptic
connections of the coupled WTA circuit described in lines 158-159 and
Figure2 are not defined" The synaptic connections in figure 2(a)
follow the same notation as those in figure 1(b): Black lines indicate
fixed weight connections. Orange lines indicate plastic connections that
follow the plasticity rule given in equation (2). All synaptic connections
are shown in Fig. 1b and Fig. 2a. In Fig. 2a, synaptic connections within
the same variable/oscillator are shown in light colors to avoid cluttering
the figure.
4) "I could not understand how the binary value is encoded." The
paragraph starting at line 177 describes how the value of
the binary variable is encoded. We mention in several places that variable
values are encoded in the identity of the limit cycle attractor chosen by
the network representing the variable.
5) "WA and WB appear
without definition in line 191" WA and WB are introduced in figure
1(b) and (d) (ca. line 110) and equation (2). They are the strengths of
two plastic weights. Their effect is further described in lines 180-183.
6) "The variables X1 and X2 appear without definition in line 199"
In line 198 we refer the reader to figure 2(a), which illustrates X1 and
X2. X1 and X2 are just generic variables, and we mention in line 196 that a
variable is represented by the network in Fig. 1b.
Regarding the comments of the other three reviewers:
7) "State spaces of the considered problems are very small." In the
problem in Section 3.1, an intermediate variable with cardinality 4 was
introduced for each constraint; see Fig. 2c. The value of this
intermediate variable can become inconsistent with the values of the 3
problem variables connected to it. There are thus 2^10*4^9, or roughly
268 million, possible states for the variables/oscillators. This is the
actual size of the space explored by the network.
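(Spelled out: 2^10 * 4^9 = 1,024 * 262,144 = 268,435,456 = 2^28 possible states.)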
8) Scalability issues: CSPs are in general NP-hard.
Having the variables in the problem update their values concurrently,
as in the proposed network, does not change the fact that the time to
solution will grow (in the worst case) exponentially with problem size.
The size of the network grows only linearly with problem size.
9) "Lack of theoretical treatment",
"origin of phase variability": We recognize this as a shortcoming of
the paper. A work similar to ours, [Hopfield and Tank, 1985], does not
provide theoretical arguments for why the proposed Hopfield network is
able to escape local optima to find good solutions to the travelling
salesman problem, yet this work had significant impact. Many complex
systems are analyzed using simulations, and we have provided a qualitative
explanation for why the network behaves as it does. Phase variability is
an emergent feature of the dynamics, similar to chaos. Like chaos in
complex systems, which in many cases is established using numerical
simulations, irregularity in phase relations is a phenomenon that we
observe in simulations. We are currently in the process of studying the
phase variability analytically and plan to describe our preliminary
results directly at NIPS and eventually in future publications.