Neural Network Computation by In Vitro Transcriptional Circuits

Part of Advances in Neural Information Processing Systems 17 (NIPS 2004)

Bibtex Metadata Paper

Authors

Jongmin Kim, John Hopfield, Erik Winfree

Abstract

The structural similarity of neural networks and genetic regulatory net- works to digital circuits, and hence to each other, was noted from the very beginning of their study [1, 2]. In this work, we propose a simple biochemical system whose architecture mimics that of genetic regula- tion and whose components allow for in vitro implementation of arbi- trary circuits. We use only two enzymes in addition to DNA and RNA molecules: RNA polymerase (RNAP) and ribonuclease (RNase). We develop a rate equation for in vitro transcriptional networks, and de- rive a correspondence with general neural network rate equations [3]. As proof-of-principle demonstrations, an associative memory task and a feedforward network computation are shown by simulation. A difference between the neural network and biochemical models is also highlighted: global coupling of rate equations through enzyme saturation can lead to global feedback regulation, thus allowing a simple network without explicit mutual inhibition to perform the winner-take-all computation. Thus, the full complexity of the cell is not necessary for biochemical computation: a wide range of functional behaviors can be achieved with a small set of biochemical components.

1 Introduction

Biological organisms possess an enormous repertoire of genetic responses to everchang- ing combinations of cellular and environmental signals. Characterizing and decoding the connectivity of the genetic regulatory networks that govern these responses is a major chal- lenge of the post-genome era [4]. Understanding the operation of biological networks is in- tricately intertwined with the ability to create sophisticated biochemical networks de novo. Recent work developing synthetic genetic regulatory networks has focused on engineered circuits in bacteria wherein protein signals are produced and degraded [5, 6]. Although remarkable, such network implementations in bacteria have many unknown and uncontrol- lable parameters.

We propose a biochemical model system a simplified analog of genetic regulatory circuits that provides well-defined connectivity and uses nucleic acid species as fuel and signals that control the network. Our goal is to establish an explicit model to guide the laboratory construction of synthetic biomolecular systems in which every component is known and

    RNAP                       transcript



                                                   RNase



   inhibitor

(A) activator DNA switch (B)

Figure 1: (A) The components of an in vitro circuit. The switch template (blue) is shown with the activator (red) attached. The dotted box indicates the promoter sequence and the downstream direction. (B) The correspondence between a neural network and an in vitro biochemical network. Neuron activity corresponds to RNA transcript concentration, while synaptic connections correspond to DNA switches with specified input and output.

where quantitative predictions can be tested. Only two enzymes are used in addition to syn- thetic DNA templates: RNA polymerase, which recognizes a specific promoter sequence in double-stranded DNA and transcribes the downstream DNA to produce an RNA tran- script, and ribonuclease, which degrades RNA but not DNA. In this system, RNA transcript concentrations are taken as signals. Synthetic DNA templates may assume two different conformations with different transcription efficiency: ON or OFF. Upon interaction with a RNA transcript of the appropriate sequence, the DNA template switches between differ- ent conformations like a gene regulated by transcription factors. The connectivity which RNA transcripts regulate which DNA templates is dictated by WatsonCrick base-pairing rules and is easy to program. The network computation is powered by rNTP that drives the synthesis of RNA signals by RNAP, while RNase forces transient signals to decay. With a few assumptions, we find that this stripped-down analog of genetic regulatory networks is mathematically equivalent to recurrent neural networks, confirming that a wide range of programmable dynamical behaviors is attainable.

2 Construction of the transcriptional network

The DNA transcriptional switch. The elementary unit of our networks will be a DNA switch, which serves the role of a gene in a genetic regulatory circuit. The basic require- ments for a DNA switch are to have separate input and output domains, to transcribe poorly by itself [7], and to transcribe efficiently when an activator is bound to it. A possible mech- anism of activation is the complementation of an incomplete promoter region, allowing more favorable binding of RNAP to the DNA template. Figure 1A illustrates our proposed design for DNA transcriptional switches and circuits. We model a single DNA switch with the following binding reactions:

                                              A  +  I        AI

                OFF                           D  + A         DA            ON

                ON                           DA  +  I        D +  AI       OFF

where D (blue) is a DNA template with an incomplete promoter region, A (red) is an activator that complements the incomplete promoter region, and I (green) is an inhibitor complementary to A. Thus, I can bind free A. Furthermore, activator A contains a "toe- hold" region [8] that overhangs past the end of D, allowing inhibitor I to strip off A from the DA complex. D is considered OFF and DA is considered ON, based on their efficiency as templates for transcription. This set of binding reactions provides a means to choose the threshold of the sigmoidal activation function, as will be explained later.

RNAP and RNase drive changes in RNA transcript concentration; their activity is modeled using a first-order approximation for enzyme kinetics. For the moment, we assume that the input species (activator and inhibitor) are held at constant levels by external control.

               By RNA polymerase                                                                                  By RNase                            k                                                                                                   k                    DA p                             DA + R                                                                                   R d                           k                                                                                                                        D            p                            D + R where 0 <  < 1 due to lack of activation and  represents the complete degradation of RNA products by RNase. kd and kp are set by the concentration of enzymes.

In general, a set of chemical reactions obeying mass action have dynamics described by

                                               d[Xi] =                                k                    [X                              )                                                         dt                                                          j ]r                                                                                                                           j (p                                                                                                                                     i - ri                                                                                                         j

where k is the rate constant, ri is the stoichiometry of species Xi as a reactant (typically 0 or 1), and pi is the stoichiometry of Xi as a product in reaction . Analysis of our system is greatly simplified by the assumption that the binding reactions are fast and go to completion. We define Dtot as the sum of free and bound species:Dtot = [D] + [DA]. Similarly, Itot = [I]+[AI] and Atot = [A]+[DA]+[AI]. Then, [DA] depends on Dtot and , where = Atot - Itot. Because I can scavenge A whether the latter is free or bound to D, A can activate D only when > 0. The amount of [DA] is proportional to when 0 < < Dtot, as shown in Figure 2A. It is convenient to represent this nonlinearity using a piecewise-linear approximation of a sigmoidal function, specifically, (x) = |x+1|-|x-1| . 2 Thus, we can represent [DA] using and a rescaled : [DA] = 1 Dtot(1 + ( ^ )), where 2 ^ = 2 Dtot - 1 is called the signal activity. At steady-state, kd[R] = kp[DA] + kp[D]; thus, 1 k [R] = p Dtot((1 2 k - )(^) + 1 + ) . d If we consider the activator concentration as an input and the steady-state transcript con- centration as an output, then the (presumed constant) inhibitor concentration, I tot, sets the threshold, and the function assumes a sigmoidal shape (Fig. 2D). Adjusting the amount of template, Dtot, sets the magnitude of the output signal and the width of the transition re- gion (Fig. 2C). We can adjust the width of the transition region independent of the threshold such that a step function would be achieved in the limit. Thus, we have a sigmoidal func- tion with an adjustable threshold, without reliance on cooperative binding of transcription factors as is common in biological systems [9].

Networks of transcriptional switches. The input domain of a DNA switch is upstream of the promoter region; the output domain is downstream of the promoter region. This separation of domains allows us to design DNA switches that have any desired connectivity.

         [ DA ]                                               (x)                    tot               D                                                            1                                          [ R ]                                              [ R ]



                                                            -1                   1              x                                      tot                                 D                                                                                  -1                                                                                                                                                      tot                                    tot      (A)                                         (B)                                                          (C)                               A                (D)                   A

Figure 2: (A) [DA] as a function of . (B) The sigmoid (x). (C,D) [R] as a function of Atot for three values of Dtot and Itot, respectively.

We assume that distinct signals in the network are represented as distinct RNA sequences that have negligible crosstalk (undesired binding of two molecules representing different signals). The set of legitimate binding reactions is as follows:

                                                                            j                                                                           A  +  Ij                                     j                                                                                                                 A  I j

             OFF                                                      D ij                  j               D ij             j                                                                                                                             A                                                            ON                                                                                                                                                                                   + A

             ON                                                       ij                                                                      D               j                                                                                 A + I j                         D ij +                j                                                                                                                                  A  I j                                                  OFF

where Dij is the DNA template that has the jth input domain and ith output domain, the activator Aj complements the incomplete promoter region of Dij, and the inhibitor Ij is complementary to Aj. Note that Ij can strip off Aj from the DijAj complex, thus imposing a sharp threshold as before. Again, we assume fast and complete binding reactions.

The set of enzyme reactions for the transcriptional network is as follows:

                By RNA polymerase                                                                                                           By RNase                                    k                                                                                                                  k                     D                   p                                                                                                                  d                           ij Aj                             k  Dij Aj + Ai                                               if sij = 1                                           Ij                                                                                                                                                              k                     D              p                                                                                                                            d                           ij  Dij + Ai                                                                                                         Aj                                    k                                                                                                                                                                                                                                                                                              k                     D                   p                                                                                                                                 d                           ij Aj                             k  Dij Aj + Ii                                               if sij = -1                                          AjIj                                                                                                                                                                             k                     D              p                                                                                                                                           d                           ij  Dij + Ii                                                                                                         DijAj  Dij where sij  {+1, -1} indicates whether switch ij will produce an activator or an inhibitor. This notation reflects that the production of Ii is equivalent to the consumption of Ai. The change of RNA concentrations over time is easy to express with i = Atot                                                                                                                                                                                     i     - Itot                                                                                                                                                                                                  i     :

                                    di =                                                                   s                                         dt              -kd  i + kp                                                ij ([Dij Aj ] + [Dij ]) .                                                             (1)                                                                                                       j

Network equivalence. We show next that the time evolution of this biochemical network model is equivalent to that of a general Hopfield neural network model [3]:

                                                         dx                                                                   i =                                          w                                                               dt            -xi +                                     ij (xj ) + i .                                                                      (2)                                                                                                            j

Equation 1 can be rewritten to use the same nonlinear activation function defined earlier. Let ^ i = 2i Dtot - 1 be a rescaled difference between activator and inhibitor concentrations, i where Dtot i is the load on Ai, i.e., the total concentration of all switches that bind to Ai: Dtot i = Dtot j ji and Dtot ij = [DijAj] + [Dij]. Then, we can derive the following rate equation, where ^ i plays the role of unit i's activity xi:

1 d ^ i k Dtot k Dtot = p (1 ij ( ^ p (1 + )s ij k -^i + - )sij j )+ ij - 1 . d dt k Dtot k Dtot j d i j d i (3) Given the set of constants describing an arbitrary transcriptional network, the constants for an equivalent neural network can be obtained immediately by comparing Equations 2 and 3. The time constant is the inverse of the RNase degradation rate: fast turnover of RNA molecules leads to fast response of the network. The synaptic weight wij is proportional to the concentration of switch template ij, attenuated by the load on Ai. However, the thresh- old i is dependent on the weights, perhaps implying a lack of generality. To implement an arbitrary neural network, we must introduce two new types of switches to the transcrip- tional network. To achieve arbitrary thresholds, we introduce bias switches DiB which

have no input domain and thus produce outputs constitutively; this adds an adjustable con- stant to the right hand side of Equation 3. To balance the load on Ai, we add null switches D0i which bind to Ai but have no output domain; this allows us to ensure that all Dtot i are equal. Consequently, given any neural network with weights wij and thresholds i, we can specify concentrations Dtot ij such that the biochemical network has identical dynamics, for some .

MichaelisMenten enzyme reactions. Next, we explore the validity of our assumption that enzyme kinetics are first-order reactions. A basic but more realistic model is the MichaelisMenten mechanism [10], in which the enzyme and substrate bind to form an enzyme-substrate complex. For example, if E is RNAP,

                                                k+                    k                                   E + D                                        cat                                            ij Aj          EDijAj  E + DijAj + Ii/Ai .                                                     k- An important ramification of MichaelisMenten reactions is that there is competition for the enzyme by the substrates, because the concentration of available enzymes is reduced as they bind to substrates, leading to saturation when the enzyme concentration is limit- ing. Using the steady-state assumption for MichaelisMenten reactions, we establish the following relations to the rate constants of first-order reactions:

                 Etot    k                                     Etot               k                                  Etot               k      k                            cat                                                      cat                            d                       d,cat           p =                                                                                              k                                                 (4)                  1 + L  K                                kp =                                                  d =                                                     M                                1 + L  K                                            1 + L               K                                                                                              M                                   d                d,M

where kcat and KM = (k- + kcat)/k+ are the catalytic constant (enzyme's speed) and Michaelis constant (enzyme's affinity to target) of RNAP for the ON state switch, kcat and KM are for the OFF state switch, and kd,cat and Kd,M are the constants of RNase. Etot [D and Etot ij Aj ] + d are the concentrations of RNAP and RNase, respectively. L = i,j KM [Dij ] is the load on RNAP and L [Aj ]+[Ij ]+[Aj Ij ]+[Dij Aj ] is the load on i,j K d = i,j K M d,M RNase (i.e., the total concentration of binding targets divided by the Michaelis constants of the enzymes), both of which may be time varying. To make the first-order approximation valid, we must keep L and Ld constant. Introduction of a new type of switch with different Michaelis constants can make L constant by balancing the load on the enzyme. A scheme to keep Ld constant is not obvious, so we set reaction conditions such that Ld 1.

3 Example computations by transcriptional networks

Feed-forward networks. We first consider a feed-forward network to compute f (x, y, z) = xyz + yz + x. From the Boolean circuit shown in Figure 3A, we can construct an equivalent neural network. We label units 1 through 6: units 1, 2, 3 correspond to inputs x, y, z whereas units 4, 5, 6 are computation units. Using the conversion rule discussed in the network equivalence section, we can calculate the parameters of the transcriptional network. Under the first-order approximation of Equation 3, the simulation result is exact (Fig. 3C). For comparison, we also explicitly simulated mass action dynamics for the full set of chemical equations with the MichaelisMenten enzyme reactions, using biologically plausible rate constants and with Etot and Etot d calculated from Equation 4 using estimated values of L and Ld. The full model performs the correct calculation of f for all eight 3-bit inputs, although the magnitude of signals is exaggerated due to an underestimate of RNase load (Fig. 3C).

Associative memories. Figure 4A shows three 4-by-4 patterns to be memorized in a con- tinuous neural network [3]. We chose orthogonal patterns because a 16 neuron network has limited capacity. Our training algorithm is gradient descent combined with the perceptron learning rule. After training, the parameters of the neural network are converted to the parameters of the transcriptional network as previously described. Starting from a random

                                                                                                                                              6

                                                              x           x                                                                                        1                                                                                  -1                                                               4                                                                                               1                                                                                         -2                                                        2                                                       f                     1                                                                   y                                                1                            i^ 0           y                                                                 -1                                           f                                                                                               1               2                                                                                                                                                 -2                                                                        1                -1                                                                                                                                                 -4   (A) z                                                    (B) z            1                                                 (C)                   0    200       400       600      800      1000                                                                                                                                                                    time(sec)

Figure 3: (A,B) A Boolean circuit and a neural network to compute f (x, y, z) = xyz+ yz+ x. (C) The activity of computation units (first-order approximation: solid lines; Michaelis- Menten reaction: dotted lines) for x=True=1, y=False=-1, z=True=1. 3

                                                                                                                                    2

                                                                                                                                    1

                                                                                                                                  i^ 0

                                                                                                                                  -1

                                                                                                                                  -2    (A)                                                                                                                  (B)           -30                200              400                600                                                                                                                                                                 time(sec)

Figure 4: (A) The three patterns to be memorized. (B) Time-course for the transcriptional network recovery of the third pattern. (odd columns: blue lines, even columns: red lines)

initial state, a typical response of the transcriptional network (with the first-order approx- imation of Equation 3) is shown in Figure 4B. Thus, our in vitro transcriptional networks can support complex sets of stable steady-states.

A winner-take-all network. Instead of trying to compensate for the saturation phenomena of MichaelisMenten reactions, we can make use of it for computation. As an example, consider the winner-take-all computation [11], which is commonly implemented as a neu- ral network with O(N 2) mutually inhibitory connections (Fig. 5A), but which can also be implemented as an electrical circuit with O(N ) interconnections by using a single global inhibitory feedback gate [12]. In a biochemical system, a limited global resource, such as RNAP, can act to regulate all the DNA switches and thus similarly produce global inhibi- tion. This effect is exploited by the simple transcriptional network shown in Figure 5B, in which the output from each DNA switch activates the same DNA switch itself, and mutual inhibition is achieved by competition for RNAP. Specifically, we have switch templates Dii with fixed thresholds set by Ii, and Dii produces Ai as its output RNA. With the instant binding assumption, we then derive the following equation:

           dAtot                  i                Etot         k                                  Etot                  k                               k                            =       d                d,cat Atot                                               cat [D                            cat [D                 dt              -1 + L                      i         +                                                iiAi] +                                  ii]             .                  (5)                                           d    Kd,M                              1 + L                  KM                             KM

The production rate of Ai depends on Atot i and on L, while the degradation rate of Ai depends on Atot i and on Ld, as shown in Figure 6A. For a winner-take-all network, an ON state switch draws more RNAP than an OFF state switch (because of the smaller Michaelis constant for the ON state). Thus, if the other switches are turned OFF, the load on RNAP (L) becomes small, leading to faster production of the remaining ON switches. When the production rate curve and the degradation rate curve have three intersections, bistability is achieved such that the switches remain ON or OFF, depending on their current state.

Consider n equivalent switches starting with initial activator concentrations above the threshold, and with the highest concentration at least above the rest (as a percentage). Analysis indicates that a less leaky system (small ) and sufficient differences in initial activator concentrations (large ) can guarantee the existence of a unique winner. Simula- tions of a 10-switch winner-take-all network confirm this analysis, although we do not see perfect behavior (Fig. 6B). Figure 6C shows a time-course of a unique winner situation. Switches get turned OFF one by one whenever the activator level approaches the threshold, until only one switch remains ON.

                                                                  -1

                                                                  -1                                                                 -1           -1                                                       0.5             0.5                     0.5                      0.5                0.5           0.5                                                                 -1           -1


                                       (A)             1           1                            1         (B)       1                  1             1

Figure 5: (A) A 3-unit WTA network with explicit mutual inhibition. (B) An equivalent biochemical network.

                                                                                          30                                   1                                                                                                                                                                                                                                                                                                                                                                                                                                    3                     tot                     i             d A                                                                               25                                   0.8                                         2.5

          dt                                       L : low                                20                                                                                2                                                                                                                                   0.6                        /   M]                                                                                               15                                                                                         1/                                                                                    i 1.5                                                                                                                                    0.4                                  [A                                                       L : high                                10                                                                                 1                                                                                                                                    0.2                                                                                                5                                                                               0.5

                                                                                                                               0                             tot     tot     tot       tot                                                5       10          15                                                  0      (A)                    i                             I       i +D                                     I       ii      A i                (B)                                    (%)                               (C)                              0              5000     10000    15000                                                                                                                                                                                                    time(sec)

Figure 6: For WTA networks: (A) Production rates (solid lines) for two different L's, compared to a linear degradation rate (dotted line). (B) Empirical probability of correct output as a function of and . (C) Time-course with = 0.33% and = 0.04.

Similarly, we can consider a k-WTA network where k winners persist. If we set the pa- rameters appropriately such that k winners are stable but k + 1 winners are unstable, the simulation result recovered k winners most of the time. Even a single k-WTA gate can provide impressive computational power [13].