{"title": "A Connectionist Model for Constructive Modal Reasoning", "book": "Advances in Neural Information Processing Systems", "page_first": 403, "page_last": 410, "abstract": "", "full_text": "A Connectionist Model for Constructive Modal Reasoning\n\nArtur S. d’Avila Garcez\nDepartment of Computing, City University London\nLondon EC1V 0HB, UK\naag@soi.city.ac.uk\n\nLuís C. Lamb\nInstitute of Informatics, Federal University of Rio Grande do Sul\nPorto Alegre RS, 91501-970, Brazil\nLuisLamb@acm.org\n\nDov M. Gabbay\nDepartment of Computer Science, King’s College London\nStrand, London, WC2R 2LS, UK\ndg@dcs.kcl.ac.uk\n\nAbstract\n\nWe present a new connectionist model for constructive, intuitionistic modal reasoning. We use ensembles of neural networks to represent intuitionistic modal theories, and show that for each intuitionistic modal program there exists a corresponding neural network ensemble that computes the program. This provides a massively parallel model for intuitionistic modal reasoning, and sets the scene for integrated reasoning, knowledge representation, and learning of intuitionistic theories in neural networks, since the networks in the ensemble can be trained by examples using standard neural learning algorithms.\n\n1 Introduction\n\nAutomated reasoning and learning theory have been the subject of intensive investigation since the early developments in computer science [14]. However, while (machine) learning has focused mainly on quantitative and connectionist approaches [16], the reasoning component of intelligent systems has been developed mainly by formalisms of classical and non-classical logics [7, 9].
More recently, the recognition of the need for systems that integrate reasoning and learning into the same foundation, and the evolution of the fields of cognitive and neural computation, have led to a number of proposals that attempt to integrate reasoning and learning [1, 3, 12, 13, 15].\n\nWe claim that an effective integration of reasoning and learning can be obtained by neural-symbolic learning systems [3, 4]. Such systems concern the application of problem-specific symbolic knowledge within the neurocomputing paradigm. By integrating logic and neural networks, they may provide (i) a sound logical characterisation of a connectionist system, (ii) a connectionist (parallel) implementation of a logic, or (iii) a hybrid learning system bringing together advantages from connectionism and symbolic reasoning.\n\nIntuitionistic logical systems have been advocated by many as providing adequate logical foundations for computation (see [2] for a survey). We argue, therefore, that intuitionism could also play an important part in neural computation. In this paper, we follow the research path outlined in [4, 5], and develop a computational model for integrated reasoning, representation, and learning of intuitionistic modal knowledge. We concentrate on reasoning and knowledge representation issues, which set the scene for connectionist intuitionistic learning, since effective knowledge representation should precede learning [15].
Still, we base the representation on standard, simple neural network architectures, aiming at future work on experimental learning within the model proposed here.\n\nA key contribution of this paper is the proposal to shift the notion of logical implication (and negation) in neural networks from the standard notion of implication as a partial function from input to output (and of negation as failure to activate a neuron) to an intuitionistic notion which, as we will see, can be implemented in neural networks if we make use of network ensembles. We claim that the intuitionistic interpretation introduced here will make sense for a number of problems in neural computation, in the same way that intuitionistic logic is more appropriate than classical logic in a number of computational settings. We will start by illustrating the proposed computational model in an appropriate constructive reasoning, distributed knowledge representation scenario, namely the wise men puzzle [7]. Then, we will show how ensembles of Connectionist Inductive Learning and Logic Programming (C-ILP) networks [3] can compute intuitionistic modal knowledge. The networks are set up by an Intuitionistic Modal Algorithm introduced in this paper. A proof that the algorithm produces a neural network ensemble that computes a semantics of its associated intuitionistic modal theory is then given. Furthermore, the networks in the ensemble are kept simple and in a modular structure, and may be trained from examples with the use of standard learning algorithms such as backpropagation [11].\n\nIn Section 2, we present the basic concepts of intuitionistic reasoning used in the paper. In Section 3, we motivate the proposed model using the wise men puzzle. In Section 4, we introduce the Intuitionistic Modal Algorithm, which translates intuitionistic modal theories into neural network ensembles, and prove that the ensemble computes a semantics of the theory.
Section 5 concludes the paper and discusses directions for future work.\n\n2 Background\n\nIn this section, we present some basic concepts of artificial neural networks and intuitionistic programs used throughout the paper. We concentrate on ensembles of single hidden layer feedforward networks, and on recurrent networks typically with feedback only from the output to the input layer. Feedback is used with the sole purpose of denoting that the output of a neuron should serve as the input of another neuron when we run the network, i.e. the weight of any feedback connection is fixed at 1. We use bipolar semi-linear activation functions h(x) = 2/(1 + e^(−βx)) − 1 with inputs in {−1, 1}. Throughout, we will use 1 to denote truth-value true, and −1 to denote truth-value false.\n\nIntuitionistic logic was originally developed by Brouwer, and later by Heyting and Kolmogorov [2]. In intuitionistic logics, a statement that there exists a proof of a proposition x is only made if there is a constructive method of the proof of x. One of the consequences of Brouwer’s ideas is the rejection of the law of the excluded middle, namely α ∨ ¬α, since one cannot always state that there is a proof of α or of its negation, as accepted in classical logic and in (classical) mathematics.
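As a concrete illustration, the bipolar semi-linear activation h(x) = 2/(1 + e^(−βx)) − 1 used above, together with the reading of activations as truth-values, can be sketched in Python (a minimal sketch; the function and variable names are ours, not the paper’s):

```python
import math

def h(x, beta=1.0):
    # Bipolar semi-linear activation: maps any input potential into (-1, 1).
    return 2.0 / (1.0 + math.exp(-beta * x)) - 1.0

def truth_value(activation, a_min=0.5):
    # A neuron is read as true above Amin, as false below -Amin,
    # and as undefined in between (Amin is in (0, 1)).
    if activation > a_min:
        return True
    if activation < -a_min:
        return False
    return None
```

Note that h(0) = 0 and that h(x) approaches 1 (true) for large positive inputs and −1 (false) for large negative ones.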
The development of these ideas and applications in mathematics has led to developments in constructive mathematics and has influenced several lines of research on logic and computing science [2].\n\nAn intuitionistic modal language L includes propositional letters (atoms) p, q, r, ..., the connectives ¬, ∧, an intuitionistic implication ⇒, and the necessity (□) and possibility (♦) modal operators, where an atom will be necessarily true in a possible world if it is true in every world that is related to this possible world, while it will be possibly true if it is true in some world related to this world. Formally, we interpret the language as follows, where formulas are denoted by α, β, γ, ...\n\nDefinition 1 (Kripke Models for Intuitionistic Modal Logic) Let L be an intuitionistic language. A model for L is a tuple M = ⟨Ω, R, v⟩ where Ω is a set of worlds, v is a mapping that assigns to each ω ∈ Ω a subset of the atoms of L, and R is a reflexive, transitive, binary relation over Ω, such that: (a) (M, ω) ⊨ p iff p ∈ v(ω) (for atom p); (b) (M, ω) ⊨ ¬α iff for all ω′ such that R(ω, ω′), (M, ω′) ⊭ α; (c) (M, ω) ⊨ α ∧ β iff (M, ω) ⊨ α and (M, ω) ⊨ β; (d) (M, ω) ⊨ α ⇒ β iff for all ω′ with R(ω, ω′) we have (M, ω′) ⊨ β whenever we have (M, ω′) ⊨ α; (e) (M, ω) ⊨ □α iff for all ω′ ∈ Ω, if R(ω, ω′) then (M, ω′) ⊨ α; (f) (M, ω) ⊨ ♦α iff there exists ω′ ∈ Ω such that R(ω, ω′) and (M, ω′) ⊨ α.\n\nWe now define labelled intuitionistic programs as sets of intuitionistic rules, where each rule is labelled by the world at which it holds, similarly to
Gabbay’s Labelled Deductive Systems [8].\n\nDefinition 2 (Labelled Intuitionistic Program) A Labelled Intuitionistic Program is a finite set of rules C of the form ωi : A1, ..., An ⇒ A0 (where “,” abbreviates “∧”, as usual), and a finite set of relations R between worlds ωi (1 ≤ i ≤ m) in C, where Ak (0 ≤ k ≤ n) are atoms and ωi is a label representing a world in which the associated rule holds.\n\nTo deal with intuitionistic negation, we adopt the approach of [10], as follows. We rename any negative literal ¬A as an atom A′ not present originally in the language. This form of renaming allows our definition of labelled intuitionistic programs above to consider atoms only. For example, given A1, ..., A′k, ..., An ⇒ A0, where A′k is a renaming of ¬Ak, an interpretation that assigns true to A′k represents that ¬Ak is true; it does not represent that Ak is false. Following Definition 1 (intuitionistic negation), A′ will be true in a world ωi if and only if A does not hold in every world ωj such that R(ωi, ωj).\n\nFinally, we extend labelled intuitionistic programs to include modalities.\n\nDefinition 3 (Labelled Intuitionistic Modal Program) A modal atom is of the form MA where M ∈ {□, ♦} and A is an atom. A Labelled Intuitionistic Modal Program is a finite set of rules C of the form ωi : MA1, ..., MAn ⇒ MA0, where MAk (0 ≤ k ≤ n) are modal atoms and ωi is a label representing a world in which the associated rule holds, and a finite set of (accessibility) relations R between worlds ωi (1 ≤ i ≤ m) in C.\n\n3 Motivating Scenario\n\nIn this section, we consider an archetypal testbed for distributed knowledge representation, namely the wise men puzzle [7], and model it intuitionistically in a neural network ensemble.
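To make Definitions 1 to 3 concrete, a labelled intuitionistic modal program and the intuitionistic reading of a renamed negative literal can be sketched as follows (an illustrative encoding of ours, not part of the paper’s formalism; a renamed literal ¬A is written with a "~" prefix inside the atom name):

```python
# Rules indexed by world label: each rule is (body_atoms, head_atom),
# with "K~p2" standing for the renamed literal K(not p2).
rules = {
    "t1": [(("K~p2", "K~p3"), "Kp1")],
    "t2": [(("K~p2", "K~p3"), "Kp1")],  # intuitionistic implication persists
}
# Accessibility relation R: reflexive and transitive over the worlds.
R = {("t1", "t1"), ("t2", "t2"), ("t1", "t2")}

def renamed_negation_holds(atom, world, valuation, R):
    """The renaming A' of (not A) holds at `world` iff A fails in every
    R-successor of `world` (clause (b) of Definition 1)."""
    successors = [w2 for (w1, w2) in R if w1 == world]
    return all(atom not in valuation.get(w, set()) for w in successors)
```

For instance, under a valuation in which Kp2 is never derived at any world, the renamed literal holds at t1, because Kp2 fails at both R-successors of t1.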
Our aim is to illustrate the combination of neural networks and intuitionistic modal reasoning. The formalisation of our computational model will be given in Section 4.\n\nA certain king wishes to test his three wise men. He arranges them in a circle so that they can see and hear each other. They are all perceptive, truthful and intelligent, and this is common knowledge in the group. It is also common knowledge among them that there are three red hats and two white hats, and five hats in total. The king places a hat on the head of each wise man in a way that they are not able to see the colour of their own hats, and then asks each one whether they know the colour of the hats on their heads.\n\nThe puzzle illustrates a situation in which intuitionistic implication and intuitionistic negation occur. Knowledge evolves in time, with the current knowledge persisting in time. For example, at the first round it is known that there are at most two white hats on the wise men’s heads. Then, if the wise men get to a second round, it becomes known that there is at most one white hat on their heads.1 This new knowledge subsumes the previous knowledge, which in turn persists. This means that if A ⇒ B is true at a world t1 then A ⇒ B will be true at a world t2 that is related to t1 (intuitionistic implication). Now, in any situation in which a wise man knows that his hat is red, this knowledge, constructed with the use of sound reasoning processes, cannot be refuted. In other words, in this puzzle, if ¬A is true at world t1 then A cannot be true at a world t2 that is related to t1 (intuitionistic negation).\n\nWe model the wise men puzzle by constructing the relative knowledge of each wise man along time points. This allows us to explicitly represent the relativistic notion of knowledge, which is a principle of intuitionistic reasoning.
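The classical core of this hat reasoning can be checked with a small brute-force sketch (ours, for illustration only; it is not the connectionist encoding developed below):

```python
from itertools import product

# True = red, False = white; only two white hats exist, so any
# configuration with three white hats is impossible.
CONFIGS = [c for c in product([True, False], repeat=3) if c.count(False) <= 2]

def knows_own_hat(i, actual, candidates):
    # Wise man i knows his colour iff every candidate configuration
    # consistent with the two hats he sees agrees on his own hat.
    seen = tuple(actual[j] for j in range(3) if j != i)
    consistent = [c for c in candidates
                  if tuple(c[j] for j in range(3) if j != i) == seen]
    return len({c[i] for c in consistent}) == 1
```

A wise man who sees two white hats knows immediately that his own hat is red, while one who sees two red hats learns nothing in the first round; eliminating, round by round, the configurations in which someone would already have spoken mirrors the growth of knowledge from t1 to t3 described above.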
For simplicity, we refer to wise man 1 (respectively, 2 and 3) as agent 1 (respectively, 2 and 3). The resulting model is a two-dimensional network ensemble (agents × time), containing three networks in each dimension. In addition to pi, denoting the fact that wise man i wears a red hat, to model each agent’s individual knowledge we need to use a modality Kj, j ∈ {1, 2, 3}, which represents the relative notion of knowledge at each time point t1, t2, t3. Thus, Kjpi denotes the fact that agent j knows that agent i wears a red hat. The K modality above corresponds to the □ modality in intuitionistic modal reasoning, as customary in the logics of knowledge [7], and as exemplified below.\n\nFirst, we model the fact that each agent knows the colour of the others’ hats. For example, if wise man 3 wears a red hat (neuron p3 is active) then wise man 1 knows that wise man 3 wears a red hat (neuron Kp3 is active for wise man 1). We then need to model the reasoning process of each wise man. In this example, let us consider the case in which neurons p1 and p3 are active. For agent 1, we have the rule t1 : K1¬p2 ∧ K1¬p3 ⇒ K1p1, which states that agent 1 can deduce that he is wearing a red hat if he knows that the other agents are both wearing white hats. Analogous rules exist for agents 2 and 3. As before, the implication is intuitionistic, so that it persists at t2 and t3, as depicted in Figure 1 for wise man 1 (represented via hidden neuron h1 in each network). In addition, according to the philosophy of intuitionistic negation, we may only conclude that agent 1 knows ¬p2 if in every world envisaged by agent 1, p2 is not derived. This is illustrated with the use of dotted lines in Figure 1, in which, e.g., if neuron Kp2 is not active at t3 then neuron K¬p2 will be active at t2.
As a result, the network ensemble will never derive p2 (as one should expect), and thus it will derive K1¬p2 and K3¬p2.2\n\n1 This is because if there were two white hats on their heads, one of them would have known (and have said), in the first round, that his hat was red, for he would have been seeing the other two with white hats.\n\n2 To complete the formalisation of the problem, the following rules should also hold at t2 (and at t3): K1¬p2 ⇒ K1p1 and K1¬p3 ⇒ K1p1. Analogous rules exist for agents 2 and 3.\n\nFigure 1: Wise men puzzle: Intuitionistic negation and implication. (The figure shows the three networks for wise man 1 at points t1, t2 and t3, each with neurons Kp1, Kp2, Kp3, K¬p2, K¬p3 and hidden neurons h1 to h5.)\n\n4 Connectionist Intuitionistic Modal Reasoning\n\nThe wise men puzzle example of Section 3 shows that simple, single-hidden-layer neural networks can be combined in a modular structure where each network represents a possible world in the Kripke structure of Definition 1. The way that the networks should then be inter-connected can be defined by following a semantics for ⇒ and ¬, and for □ and ♦, from intuitionistic logic. In this section, we see how exactly we construct a network ensemble given an intuitionistic modal program.
We introduce a translation algorithm, which takes the program as input and produces the ensemble as output, by setting the initial architecture, set of weights, and thresholds of the networks according to a Kripke semantics for the program. We then prove that the translation is correct, and thus that the network ensemble can be used to compute the logical consequences of the program in parallel.\n\nBefore we present the algorithm, let us illustrate informally how ⇒, ¬, □, and ♦ are represented in the ensemble. We follow the key idea behind Connectionist Modal Logics (CML) to represent Kripke models in neural networks [6]. Each possible world is represented by a single hidden layer neural network. In each network, input and output neurons represent atoms or modal atoms of the form A, ¬A, □A, or ♦A, while each hidden neuron encodes a rule. For example, in Figure 1, hidden neuron h1 encodes a rule of the form A ∧ B ⇒ C. Thresholds and weights must be such that the hidden layer computes a logical and of the input layer, while the output layer computes a logical or of the hidden layer.3 Furthermore, in each network, each output neuron is connected to its corresponding input neuron with a weight fixed at 1.0 (as depicted in Figure 1 for K¬p2 and K¬p3), so that chains of the form A ⇒ B and B ⇒ C can be represented and computed. This basically characterises C-ILP networks [3]. Now, in CML, we allow for an ensemble of C-ILP networks, each network representing knowledge in a (learnable) possible world. In addition, we allow for a number of fixed feedforward and feedback connections to occur among different networks in the ensemble, as shown in Figure 1.
3 For example, if A ∧ B ⇒ D and C ⇒ D then a hidden neuron h1 is used to connect A and B to D, and a hidden neuron h2 is used to connect C to D, such that if h1 or h2 is activated then D is activated.\n\nThese are defined as follows: in the case of □, if neuron □A is activated (true) in network (world) ωi then A must be activated in every network ωj that is related to ωi (this is analogous to the situation in which we activate K1p3 and K2p3 whenever p3 is active). Dually, if A is active in every ωj then □A must be activated in ωi (this is done with the use of feedback connections and a hidden neuron that computes a logical and, as detailed in the algorithm below). In the case of ♦, if ♦A is activated in network ωi then A must be activated in at least one network ωj that is related to ωi (we do this by choosing an arbitrary ωj to make A active). Dually, if A is activated in any ωj that is related to ωi then ♦A must be activated in ωi (this is done with the use of a hidden neuron that computes a logical or, also as detailed in the algorithm below). Now, in the case of ⇒, according to the semantics of intuitionistic implication, ωi : A ⇒ B and R(ωi, ωj) imply ωj : A ⇒ B. We implement this by copying the neural representation of A ⇒ B from ωi to ωj, as done via h1 in Figure 1. Finally, in the case of ¬, we need to make sure that ¬A is activated in ωi if, for every ωj such that R(ωi, ωj), A is not active in ωj.
This is implemented with the use of negative weights (to account for the fact that the non-activation of a neuron needs to activate another neuron), as depicted in Figure 1 (dashed arrows), and detailed in the algorithm below.\n\nWe are now in a position to introduce the Intuitionistic Modal Algorithm. Let P = {P1, ..., Pn} be a labelled intuitionistic modal program with rules of the form ωi : MA1, ..., MAk ⇒ MA0, where each Aj (0 ≤ j ≤ k) is an atom and M ∈ {□, ♦}, 1 ≤ i ≤ n. Let N = {N1, ..., Nn} be a neural network ensemble with each network Ni corresponding to program Pi. Let q denote the number of rules occurring in P. Consider that the atoms of Pi are numbered from 1 to ηi such that the input and output layers of Ni are vectors of length ηi, where the j-th neuron represents the j-th atom of Pi. In addition, let Amin denote the minimum activation for a neuron to be considered active (or true), Amin ∈ (0, 1); for each rule rl in each program Pi, let kl denote the number of atoms in the body of rule rl, and let µl denote the number of rules in Pi with the same consequent as rl (including rl). Let MAXrl(kl, µl) denote the greater of kl and µl for rule rl, and let MAXP(k1, ..., kq, µ1, ..., µq) denote the greatest of k1, ..., kq, µ1, ..., µq for program P. Below, we use k as a shorthand for k1, ..., kq, and µ as a shorthand for µ1, ..., µq. The equations in the algorithm come from the proof of Theorem 1, given in the sequel.\n\nIntuitionistic Modal Algorithm\n\n1. Rename each modal atom MAj as a new atom not occurring in P, of the form A^□_j if M = □, or A^♦_j if M = ♦;\n\n2. For each rule rl of the form A1, ..., Ak ⇒ A0 in Pi (1 ≤ i ≤ n) such that R(ωi, ωj), do: add a rule A1, ..., Ak ⇒ A0 to Pj (1 ≤ j ≤ n).\n\n3.
Calculate Amin > (MAXP(k, µ, n) − 1) / (MAXP(k, µ, n) + 1);\n\n4. Calculate W ≥ (2/β) · (ln(1 + Amin) − ln(1 − Amin)) / (MAXP(k, µ) · (Amin − 1) + Amin + 1);\n\n5. For each rule rl of the form A1, ..., Ak ⇒ A0 (k ≥ 0) in Pi (1 ≤ i ≤ n), do:\n\n(a) Add a neuron Nl to the hidden layer of neural network Ni associated with Pi; (b) Connect each neuron Ai (1 ≤ i ≤ k) in the input layer of Ni to Nl and set the connection weight to W; (c) Connect Nl to neuron A0 in the output layer of Ni and set the connection weight to W; (d) Set the threshold θl of Nl to θl = ((1 + Amin) · (kl − 1)/2)W; (e) Set the threshold θA0 of A0 in the output layer of Ni to θA0 = ((1 + Amin) · (1 − µl)/2)W. (f) For each atom of the form A′ in rl, do:\n\n(i) Add a hidden neuron NA′ to Ni; (ii) Set the step function s(x) as the activation function of NA′;4 (iii) Set the threshold θA′ of NA′ such that n − (1 + Amin) < θA′ < nAmin; (iv) For each\n\n4 Any hidden neuron created to encode negation (such as h4 in Figure 1) shall have a non-linear activation function s(x) = y, where y = 1 if x > 0, and y = 0 otherwise. Such neurons encode (meta-level) knowledge about negation, while the other hidden neurons encode (object-level) knowledge about the problem domain. The former are not expected to be trained by examples and, as a result, the use of the step function will simplify the algorithm.
The latter are to be trained, and therefore require a differentiable, semi-linear activation function.\n\nnetwork Nj corresponding to program Pj (1 ≤ j ≤ n) in P such that R(ωi, ωj), do: Connect the output neuron A of Nj to the hidden neuron NA′ of Ni and set the connection weight to −1; and Connect the hidden neuron NA′ of Ni to the output neuron A′ of Ni and set the connection weight to W^I such that W^I > h^−1(Amin) + µA′·W + θA′.\n\n6. For each output neuron A^♦_j in network Ni, do:\n\n(a) Add a hidden neuron A^M_j and an output neuron Aj to an arbitrary network Nz such that R(ωi, ωz); (b) Set the step function s(x) as the activation function of A^M_j, and set the semi-linear function h(x) as the activation function of Aj; (c) Connect A^♦_j in Ni to A^M_j and set the connection weight to 1; (d) Set the threshold θM of A^M_j such that −1 < θM < Amin; (e) Set the threshold θAj of Aj in Nz such that θAj = ((1 + Amin) · (1 − µAj)/2)W; (f) Connect A^M_j to Aj in Nz and set the connection weight to W^M > h^−1(Amin) + µAj·W + θAj.\n\n7. For each output neuron A^□_j in network Ni, do:\n\n(a) Add a hidden neuron A^M_j to each Nu (1 ≤ u ≤ n) such that R(ωi, ωu), and add an output neuron Aj to Nu if Aj ∉ Nu; (b) Set the step function s(x) as the activation function of A^M_j, and set the semi-linear function h(x) as the activation function of Aj; (c) Connect A^□_j in Ni to A^M_j and set the connection weight to 1; (d) Set the threshold θM of A^M_j such that −1 < θM < Amin; (e) Set the threshold θAj of Aj in each Nu such that θAj = ((1 + Amin) · (1 − µAj)/2)W; (f) Connect A^M_j to Aj in Nu and set the connection weight to W^M > h^−1(Amin) + µAj·W + θAj.\n\n8.
For each output neuron Aj in network Nu such that R(ωi, ωu), do:\n\n(a) Add a hidden neuron A^∨_j to Ni; (b) Set the step function s(x) as the activation function of A^∨_j; (c) For each output neuron A^♦_j in Ni, do:\n\n(i) Connect Aj in Nu to A^∨_j and set the connection weight to 1; (ii) Set the threshold θ∨ of A^∨_j such that −nAmin < θ∨ < Amin − (n − 1); (iii) Connect A^∨_j to A^♦_j in Ni and set the connection weight to W^M > h^−1(Amin) + µAj·W + θAj.\n\n9. For each output neuron Aj in network Nu such that R(ωi, ωu), do:\n\n(a) Add a hidden neuron A^∧_j to Ni; (b) Set the step function s(x) as the activation function of A^∧_j; (c) For each output neuron A^□_j in Ni, do:\n\n(i) Connect Aj in Nu to A^∧_j and set the connection weight to 1; (ii) Set the threshold θ∧ of A^∧_j such that n − (1 + Amin) < θ∧ < nAmin; (iii) Connect A^∧_j to A^□_j in Ni and set the connection weight to W^M > h^−1(Amin) + µAj·W + θAj.\n\nFinally, we prove that N is equivalent to P.\n\nTheorem 1 (Correctness of Intuitionistic Modal Algorithm) For any intuitionistic modal program P there exists an ensemble of neural networks N such that N computes the intuitionistic modal semantics of P.\n\nProof The algorithm to build each individual network in the ensemble is that of C-ILP, which we know is provably correct [3]. The algorithm to include modalities is that of CML, which is also provably correct [6]. We need to consider the case when modalities and intuitionistic negation are to be encoded together. Consider an output neuron A0 with neurons M (encoding modalities) and neurons n (encoding negation) among its predecessors in a network’s hidden layer. There are four cases to consider.
(i) Neither neurons M nor neurons n are activated: since the activation function of neurons M and n is the step function, their activation is zero, and thus this case reduces to C-ILP. (ii) Only neurons M are activated: from the algorithm above, A0 will also be activated (with minimum input potential W^M + ς, where ς ∈ R). (iii) Only neurons n are activated: as before, A0 will also be activated (now with minimum input potential W^I + ς). (iv) Both neurons M and neurons n are activated: the input potential of A0 is at least W^M + W^I + ς. Since W^M > 0 and W^I > 0, and since the activation function of A0, h(x), is monotonically increasing, A0 will be activated whenever both M and n neurons are activated. This completes the proof.\n\n5 Concluding Remarks\n\nIn this paper, we have presented a new model of computation that integrates neural networks and constructive, intuitionistic modal reasoning. We have defined labelled intuitionistic modal programs, presented an algorithm to translate the intuitionistic theories into ensembles of C-ILP neural networks, and shown that the ensembles compute a semantics of the corresponding theories. As a result, each ensemble can be seen as a new massively parallel model for the computation of intuitionistic modal logic. In addition, since each network can be trained efficiently using, e.g., backpropagation, one can adapt the network ensemble by training possible world representations from examples. Work along these lines has been done in [4, 5], where learning experiments in possible worlds settings were investigated. As future work, we shall consider learning experiments based on the constructive model introduced in this paper.
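For concreteness, the weight and threshold equations of steps 3 to 5 of the algorithm above can be evaluated numerically as follows (a minimal sketch under assumed values: a program whose largest rule body has two atoms, at most one rule per consequent, two networks, and β = 1):

```python
import math

k_max, mu_max, n_networks, beta = 2, 1, 2, 1.0
max_p = max(k_max, mu_max, n_networks)   # MAXP(k, mu, n)

# Step 3: Amin > (MAXP - 1)/(MAXP + 1); pick a value slightly above the bound.
a_min = (max_p - 1) / (max_p + 1) + 0.1

# Step 4: W >= (2/beta)*(ln(1+Amin) - ln(1-Amin)) / (MAXP(k,mu)*(Amin-1) + Amin + 1).
max_km = max(k_max, mu_max)              # MAXP(k, mu)
W = (2.0 / beta) * (math.log(1 + a_min) - math.log(1 - a_min)) / (
    max_km * (a_min - 1) + a_min + 1)

# Steps 5(d)-(e): thresholds for a rule with k_l body atoms whose head
# is shared by mu_l rules.
k_l, mu_l = 2, 1
theta_hidden = ((1 + a_min) * (k_l - 1) / 2) * W
theta_output = ((1 + a_min) * (1 - mu_l) / 2) * W
```

With these values W is positive and the output threshold is zero, so the hidden neuron only fires when both body atoms are active above Amin: it computes a logical and of its inputs, while the output neuron computes a logical or of the hidden layer.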
Extensions of this work also include the study of how to represent other non-classical logics, such as branching-time temporal logics and conditional logics of normality, which are relevant for cognitive and neural computation.\n\nAcknowledgments\n\nArtur Garcez is partly supported by the Nuffield Foundation and The Royal Society. Luis Lamb is partly supported by the Brazilian Research Council CNPq and by the CAPES and FAPERGS foundations.\n\nReferences\n\n[1] A. Browne and R. Sun. Connectionist inference models. Neural Networks, 14(10):1331–1355, 2001.\n\n[2] D. van Dalen. Intuitionistic logic. In D. M. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, volume 5. Kluwer, 2nd edition, 2002.\n\n[3] A. S. d’Avila Garcez, K. Broda, and D. M. Gabbay. Neural-Symbolic Learning Systems: Foundations and Applications. Perspectives in Neural Computing. Springer-Verlag, 2002.\n\n[4] A. S. d’Avila Garcez and L. C. Lamb. Reasoning about time and knowledge in neural-symbolic learning systems. In Advances in Neural Information Processing Systems 16, Proceedings of NIPS 2003, pages 921–928, Vancouver, Canada, 2004. MIT Press.\n\n[5] A. S. d’Avila Garcez, L. C. Lamb, K. Broda, and D. M. Gabbay. Applying connectionist modal logics to distributed knowledge representation problems. International Journal on Artificial Intelligence Tools, 13(1):115–139, 2004.\n\n[6] A. S. d’Avila Garcez, L. C. Lamb, and D. M. Gabbay. Connectionist modal logics. Theoretical Computer Science. Forthcoming.\n\n[7] R. Fagin, J. Halpern, Y. Moses, and M. Vardi. Reasoning about Knowledge. MIT Press, 1995.\n\n[8] D. M. Gabbay. Labelled Deductive Systems. Clarendon Press, Oxford, 1996.\n\n[9] D. M. Gabbay, C. Hogger, and J. A. Robinson, editors. Handbook of Logic in Artificial Intelligence and Logic Programming, volumes 1–5, Oxford, 1994–1999. Clarendon Press.\n\n[10] M. Gelfond and V. Lifschitz.
Classical negation in logic programs and disjunctive databases. New Generation Computing, 9:365–385, 1991.\n\n[11] D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning representations by back-propagating errors. Nature, 323:533–536, 1986.\n\n[12] L. Shastri. Advances in SHRUTI: a neurally motivated model of relational knowledge representation and rapid inference using temporal synchrony. Applied Intelligence, 11:79–108, 1999.\n\n[13] G. G. Towell and J. W. Shavlik. Knowledge-based artificial neural networks. Artificial Intelligence, 70(1):119–165, 1994.\n\n[14] A. M. Turing. Computing machinery and intelligence. Mind, 59:433–460, 1950.\n\n[15] L. G. Valiant. Robust logics. Artificial Intelligence, 117:231–253, 2000.\n\n[16] V. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, 1995.\n", "award": [], "sourceid": 2931, "authors": [{"given_name": "Artur", "family_name": "Garcez", "institution": null}, {"given_name": "Luis", "family_name": "Lamb", "institution": null}, {"given_name": "Dov", "family_name": "Gabbay", "institution": null}]}