SPREADING ACTIVATION OVER 
DISTRIBUTED MICROFEATURES 
James Hendler 
Department of Computer Science
University of Maryland 
College Park, MD 20742 
ABSTRACT 
One attempt at explaining human inferencing is that of spreading
activation, particularly in the structured connectionist paradigm.
This has resulted in the building of systems with semantically
nameable nodes which perform inferencing by examining the patterns
of activation spread. In this paper we demonstrate that simple
structured network inferencing can be performed by passing
activation over the weights learned by a distributed algorithm.
Thus, an account is provided which explains a well-behaved
relationship between structured and distributed connectionist
approaches.
INTRODUCTION 
A primary difference between the neural networks of 20 years ago and the
current generation of connectionist models is the addition of mechanisms which
permit the system to create an internal representation. These subsymbolic,
semantically unnameable features, which are induced by connectionist learning
algorithms, have been discussed as being of import both in structured and
distributed connectionist networks (cf. Feldman and Ballard, 1982; Rumelhart and
McClelland, 1986). The fact that network learning algorithms can create these
microfeatures is not, however, enough in itself to account for how cognition
works. Most of what we call intelligent thought derives from being able to
reason about the relation between objects, to hypothesize about events and things,
etc. If we are to do cognitive modeling we must complete the story by explaining
how networks can reason in the way that humans (or other intelligent beings) do.
One attempt at explaining such reasoning is that of spreading activation in the
structured connectionist and marker-passing (cf. Charniak, 1983; Hendler, 1987)
The author is also affiliated with the Institute for Advanced Computer Studies and the
Systems Research Center at the University of Maryland. Funding for this work was provided in part by
Office of Naval Research Grant N00014-88-K-0560.
approaches. In these systems semantically nameable nodes permit an energy
spread, and reasoning about the world is accounted for by looking at either
stable configurations of the activation (the structured connectionist approach) or
at the paths found by examining intersections among the nodes (the
marker-passing technique). In this paper we will demonstrate that simple
structured-network-like inferencing can be performed by passing activation over
the weights learned by a distributed algorithm. Thus, an account is provided
which explains a well-behaved relationship between structured and distributed
connectionist approaches.
THE SPREADING ACTIVATION MODEL 
In this paper we will demonstrate that local connectionist-like networks can be
built by spreading activation over the microfeatures learned by a distributed
network. To show this, we start with a simple example which demonstrates the
activation spreading mechanism used. The particular network we will use in this
example is a 6-3-8 three-layer network trained by the back-propagation learning
algorithm. The training set used is shown in table 1. The weights between the
output nodes and hidden units which are learned by the network (after learning
to the 90% level for a typical run) are shown in figure 1.
TABLE 1. Training Set for Example 1.

Input      Output
Pattern    Pattern
000000     10000000
000011     01000000
001100     00100000
001111     00010000
110000     00001000
110011     00000100
111100     00000010
111111     00000001
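
The training step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the paper does not give the unit type, learning rate, initialization, number of epochs, or whether bias terms were used, so the sigmoid units, the bias terms, the per-pattern updates, and the names `train`, `forward`, and `total_error` are all choices made for this sketch.

```python
import math
import random

# Training set from Table 1: six-bit inputs, one-hot eight-bit targets.
INPUTS = ["000000", "000011", "001100", "001111",
          "110000", "110011", "111100", "111111"]
X = [[int(b) for b in s] for s in INPUTS]
T = [[1.0 if j == i else 0.0 for j in range(8)] for i in range(8)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_ih, b_h, w_ho, b_o):
    """Return (hidden activations, output activations) for one input."""
    h = [sigmoid(sum(w_ih[k][i] * x[i] for i in range(6)) + b_h[k])
         for k in range(3)]
    o = [sigmoid(sum(w_ho[j][k] * h[k] for k in range(3)) + b_o[j])
         for j in range(8)]
    return h, o


def total_error(params):
    """Summed squared error over the whole training set."""
    return sum((t_j - o_j) ** 2
               for x, t in zip(X, T)
               for t_j, o_j in zip(t, forward(x, *params)[1]))


def train(epochs=2000, lr=0.5, seed=0):
    """Plain back-propagation; all hyperparameters are assumptions."""
    rng = random.Random(seed)
    w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(6)] for _ in range(3)]
    b_h = [0.0] * 3
    w_ho = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(8)]
    b_o = [0.0] * 8
    for _ in range(epochs):
        for x, t in zip(X, T):
            h, o = forward(x, w_ih, b_h, w_ho, b_o)
            # Error signals for output and hidden units (computed before
            # any weight is changed).
            d_o = [(o[j] - t[j]) * o[j] * (1 - o[j]) for j in range(8)]
            d_h = [h[k] * (1 - h[k]) * sum(d_o[j] * w_ho[j][k] for j in range(8))
                   for k in range(3)]
            for j in range(8):
                for k in range(3):
                    w_ho[j][k] -= lr * d_o[j] * h[k]
                b_o[j] -= lr * d_o[j]
            for k in range(3):
                for i in range(6):
                    w_ih[k][i] -= lr * d_h[k] * x[i]
                b_h[k] -= lr * d_h[k]
    return w_ih, b_h, w_ho, b_o
```

The third element returned by `train()` is the 8 x 3 matrix of weights between output nodes and hidden units, which is the quantity the activation spreading operates on. The specific values it produces will differ from those printed in the paper.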
Weights
        h1      h2      h3
n1    -4.98    4.40   -2.82
n2    -6.99   -4.99   -2.23
n3    -6.11    3.49    0.30
n4    -6.37   -4.68    2.53
n5     4.36    3.73   -5.09
n6     4.38   -5.97   -3.67
n7     0.89    1.07    3.32
n8     3.88   -6.95    1.88

Figure 1. Weights Learned by Back Propagation
To understand how the activation spreads, let us examine what occurs when
activation is started at node n1 with a weight of 1. This activation strength is
divided by the outbranching of the node and then multiplied by the weight of
each link to the hidden units. Thus activation flows from n1 to h1 with a
strength of 1/3 x Weight(n1,h1). A similar computation is made to each of the
other hidden units. This activation now spreads to each of the other output
nodes in turn, again divided by the outbranching (here 8). Thus, n2 would gain
activation of

    Activation(h1) x Weight(n2,h1)/8 +
    Activation(h2) x Weight(n2,h2)/8 +
    Activation(h3) x Weight(n2,h3)/8

or .80 from n1.
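
The two-step computation above can be written out as a short sketch. The weights are copied from Figure 1; the function names `hidden_activation` and `received_activation` are hypothetical and chosen only for this illustration.

```python
# Output-to-hidden weights as printed in Figure 1
# (rows n1..n8, columns h1..h3).
WEIGHTS = [
    [-4.98,  4.40, -2.82],  # n1
    [-6.99, -4.99, -2.23],  # n2
    [-6.11,  3.49,  0.30],  # n3
    [-6.37, -4.68,  2.53],  # n4
    [ 4.36,  3.73, -5.09],  # n5
    [ 4.38, -5.97, -3.67],  # n6
    [ 0.89,  1.07,  3.32],  # n7
    [ 3.88, -6.95,  1.88],  # n8
]


def hidden_activation(source, strength=1.0, weights=WEIGHTS):
    """Activation reaching each hidden unit from one output node."""
    out_branching = len(weights[source])   # 3 links to the hidden units
    return [strength / out_branching * w for w in weights[source]]


def received_activation(source, target, weights=WEIGHTS):
    """Activation that `target` gains when `source` is activated."""
    hidden = hidden_activation(source, weights=weights)
    out_branching = len(weights)           # each hidden unit links to 8 outputs
    return sum(a * w / out_branching
               for a, w in zip(hidden, weights[target]))


n1_to_n2 = received_activation(0, 1)       # about 0.80, as in the text
```

Indices are zero-based, so `received_activation(0, 1)` is the spread from n1 to n2.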
Table 2 shows a graph of the activation spread between the output units. The
table, which is symmetric, can thus be read as showing the output at each of the
other units when an activation strength of 1 is placed at the named node.
Looking at the table we see that the highest activation occurs among nodes which
share the most features of the input (i.e. same value and position) while the
lowest is seen among those patterns sharing the fewest features.
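
The symmetry noted above follows from the form of the computation: the spread from node i to node j is the dot product of their weight rows divided by 3 x 8. A minimal sketch, assuming the weights as printed in Figure 1 and using a hypothetical helper `spread_matrix`, is:

```python
# Output-to-hidden weights as printed in Figure 1
# (rows n1..n8, columns h1..h3).
WEIGHTS = [
    [-4.98,  4.40, -2.82],
    [-6.99, -4.99, -2.23],
    [-6.11,  3.49,  0.30],
    [-6.37, -4.68,  2.53],
    [ 4.36,  3.73, -5.09],
    [ 4.38, -5.97, -3.67],
    [ 0.89,  1.07,  3.32],
    [ 3.88, -6.95,  1.88],
]


def spread_matrix(weights):
    """Pairwise activation spread between output nodes.

    Entry [i][j] is sum_k W[i][k] * W[j][k] / (n_hidden * n_output).
    Diagonal entries are set to None, mirroring the '*' in Table 2.
    """
    n_out, n_hid = len(weights), len(weights[0])
    scale = n_hid * n_out
    return [[None if i == j else
             sum(weights[i][k] * weights[j][k] for k in range(n_hid)) / scale
             for j in range(n_out)]
            for i in range(n_out)]


M = spread_matrix(WEIGHTS)

# For n1 (input 000000), find the node that receives the least activation.
row_n1 = [(value, j) for j, value in enumerate(M[0]) if value is not None]
least_similar = min(row_n1)[1]   # zero-based index of the node
```

With these weights the least activated node for n1 is n8, whose input pattern 111111 shares no bit with 000000.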
However, as well as having this property, table 2 can be seen as providing a
matrix which specifies the weights between the output nodes if viewed as a
structured network. That is, n1 is connected to n2 by a strength of +.80, to n3 by a
strength of +1.03, etc. Thus, by using this technique distributed representations
can be turned into connectivity weights for structured networks. When
non-orthogonal weights are used, the same activation-spreading algorithm produces a
structured network which can be used for more complex inferencing than can the
distributed network alone.
We demonstrate this by a simple, and again contrived, example. This example is
motivated by Gary Cottrell's structured model for word sense disambiguation
(Cottrell, 1985). Cottrell, using weights derived by hand, demonstrated that a
structured connectionist network could distinguish both word-sense and case-slot
assignments for ambiguous lexical items. Presented with the sentence "John
threw the fight," the system would activate a node for one meaning of "throw";
presented with "John threw the ball" it would come up with another. The nodes
of Cottrell's network included words (John, Threw, etc.), word senses (John1,
Propel, etc.) and case-slots (TAGT (agent of the throw), PAGT (agent of the
Propel), etc.).
TABLE 2. Activation Spread in 6-3-8 Network.

        n1      n2      n3      n4      n5      n6      n7      n8
n1       *     .80    1.03     .17     .38   -1.57    -.38   -2.30
n2     .80      *     1.02    2.60   -1.57     .31    -.79     .14
n3    1.03    1.02      *      .97    -.63   -2.03    -.03   -1.97
n4     .17    2.60     .97      *    -2.42    -.38    -.09     .52
n5     .38   -1.57    -.63   -2.42      *      .64    -.38    -.77
n6   -1.57     .31   -2.03    -.38     .64      *     -.60    2.14
n7    -.38    -.79    -.03    -.09    -.38    -.60      *      .09
n8   -2.30     .14   -1.97     .52    -.77    2.14     .09      *
To duplicate Gary's network via training, we presented a 3-layer backprop
network with a training set in which distributed patterns, very loosely corresponding
to a "dictionary" of word encodings [1], were associated with a vector representing
each of the individual nodes which would be represented in Cottrell's system, but
with no structure. Thus, each element in the training set is
[1] Which in any realistic system would some day be replaced by actual signal processing
outputs or other representations of actual word pronunciation forms.
a 16 bit vector (representing a four word sentence, each word as a 4 bit pattern),
associated with another 16 bit vector representing the nodes

    Bob1 John1 propel throw fight1 ball1 pagt pobj tagt tobj bob john threw the
    fight ball
For this example, the system was trained on the encodings of the four sentences

    John threw the ball
    John threw the fight
    Bob threw the ball
    Bob threw the fight

with the output set high for those objects in the second vector which were
appropriately associated, as shown in Table 3.
'I'ABIE :3. Training Set for Example '2. 
Input, Out,put, 
Pal,tern }>at fern 
0110 0001 0101 0010 
01100001 0101 1010 
1001 OOOl 010! OOlO 
1001 0001 OlOl lOlO 
1001100011101110 
1010011100101101 
0101100011011110 
0110011100011 101 
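
The construction of these training pairs can be sketched as follows. The 4-bit word codes are not listed in the paper; the dictionary `WORD_CODES` below is an assumption, inferred by reading each row of Table 3 as an input paired with the output on the same row and matching the output bits against the node list. `SENSE_NODES` and the helper names are likewise hypothetical.

```python
# Output nodes in the order given in the text.
NODES = ["Bob1", "John1", "propel", "throw", "fight1", "ball1",
         "pagt", "pobj", "tagt", "tobj",
         "bob", "john", "threw", "the", "fight", "ball"]

# Assumed 4-bit word codes, inferred by aligning Table 3 with NODES.
WORD_CODES = {"bob": "0110", "john": "1001", "threw": "0001",
              "the": "0101", "fight": "0010", "ball": "1010"}

# Assumed sense and case-slot nodes that accompany each object word.
SENSE_NODES = {"fight": ["throw", "fight1", "tagt", "tobj"],
               "ball": ["propel", "ball1", "pagt", "pobj"]}


def encode_input(sentence):
    """Concatenate the 4-bit codes of the four words."""
    return "".join(WORD_CODES[w] for w in sentence.split())


def encode_target(sentence):
    """Set high every node associated with the sentence."""
    words = sentence.split()
    subject, obj = words[0], words[-1]
    active = set(words) | {subject.capitalize() + "1"} | set(SENSE_NODES[obj])
    return "".join("1" if node in active else "0" for node in NODES)


pairs = [(encode_input(s), encode_target(s))
         for s in ["bob threw the fight", "bob threw the ball",
                   "john threw the fight", "john threw the ball"]]
```

Under these assumptions the four pairs are the four rows of Table 3, in the order in which the table prints them.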
Upon completion of the learning, the activation spreading algorithm was used to
derive a table of connectivity weights between the output units as shown in table
4.
These weights were then transferred into a local connectionist simulator and a
very simple activation spreading model was used to examine the results. When
we run the simulator, using the activation spreading over learned weights,
exactly the results produced by Cottrell's network are seen. Thus:

    Activation from the nodes corresponding to john, throw, the, and fight
    causes a positive activation at the node for "Throw" and a negative
    activation at the node for "Propel."

while

    Activation from john throw the ball spreads positively to "Propel" and
    not to "Throw."

Further, other effects which are also predicted by Cottrell's model are seen:

    Activation at TAGT and TOBJ spreads positive activation to Throw
    and not to Propel.

and

    Activation at PAGT and POBJ causes a spread to Propel but not to
    Throw.
TABLE 4. Connectivity Weights for Example 2.

  ***  -0.12   0.01  -0.01  -0.01   0.01   0.01   0.01  -0.01  -0.01   0.12  -0.12  -0.03  -0.03  -0.01   0.00
-0.12    ***  -0.01   0.01   0.01  -0.01  -0.01  -0.01   0.01   0.01  -0.12   0.12   0.03   0.03   0.01  -0.01
 0.01  -0.01    ***  -0.04  -0.04   0.04   0.04   0.05  -0.05  -0.04   0.01  -0.01  -0.02  -0.02  -0.04   0.04
-0.01   0.01  -0.04    ***   0.04  -0.04  -0.04  -0.05   0.05   0.04  -0.01   0.01   0.02   0.02   0.04  -0.04
-0.01   0.01  -0.04   0.04    ***  -0.04  -0.05  -0.05   0.05   0.04  -0.01   0.01   0.02   0.02   0.04  -0.04
 0.01  -0.01   0.04  -0.04  -0.04    ***   0.05   0.05  -0.05  -0.04   0.01  -0.01  -0.02  -0.02  -0.04   0.04
 0.01  -0.01   0.04  -0.04  -0.05   0.05    ***   0.05  -0.05  -0.05   0.01  -0.00  -0.02  -0.02  -0.04   0.05
 0.01  -0.01   0.05  -0.05  -0.05   0.05   0.05    ***  -0.05  -0.05   0.01  -0.01  -0.02  -0.02  -0.04   0.05
-0.01   0.01  -0.05   0.05   0.05  -0.05  -0.05  -0.05    ***   0.05  -0.01   0.01   0.02   0.03   0.04  -0.05
-0.01   0.01  -0.04   0.04   0.04  -0.04  -0.05  -0.05   0.05    ***  -0.01   0.01   0.02   0.02   0.04  -0.05
 0.12  -0.12   0.01  -0.01  -0.01   0.01   0.01   0.01  -0.01  -0.01    ***  -0.12  -0.03  -0.03  -0.01   0.01
-0.12   0.12  -0.01   0.01   0.01  -0.01  -0.00  -0.01   0.01   0.01  -0.12    ***   0.03   0.03   0.01  -0.00
-0.03   0.03  -0.02   0.02   0.02  -0.02  -0.02  -0.02   0.02   0.02  -0.03   0.03    ***   0.20   0.02  -0.02
-0.03   0.03  -0.02   0.02   0.02  -0.02  -0.02  -0.02   0.03   0.02   0.03   0.03   0.20    ***   0.02  -0.02
 0.01   0.01  -0.04   0.04   0.04  -0.04  -0.04  -0.04   0.04   0.04  -0.01   0.01   0.02   0.02    ***  -0.04
 0.00  -0.01   0.04  -0.04  -0.04   0.04   0.05   0.05   0.05   0.05   0.01  -0.00   0.02   0.02  -0.04    ***
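
The case-slot effects reported above can be illustrated with a single step of spreading over the relevant entries of Table 4. This is a sketch under assumptions: the paper does not describe the simulator's update rule, so a one-step weighted sum is used here, and the rows and columns of Table 4 are assumed to follow the node order given in the text. The names `TO_SENSES` and `spread` are hypothetical.

```python
# Excerpt of Table 4 as printed: weights from each case-slot node to the
# two sense nodes for "threw". Row and column positions are assumed to
# follow the node order
# Bob1 John1 propel throw fight1 ball1 pagt pobj tagt tobj ...
TO_SENSES = {
    #        (to propel, to throw)
    "pagt": (0.04, -0.04),
    "pobj": (0.05, -0.05),
    "tagt": (-0.05, 0.05),
    "tobj": (-0.04, 0.04),
}


def spread(active_nodes, strength=1.0):
    """One-step spread: sum the weighted activation reaching each sense."""
    propel = sum(strength * TO_SENSES[n][0] for n in active_nodes)
    throw = sum(strength * TO_SENSES[n][1] for n in active_nodes)
    return {"propel": propel, "throw": throw}


throw_slots = spread(["tagt", "tobj"])    # favors "throw"
propel_slots = spread(["pagt", "pobj"])   # favors "propel"
```

Activating TAGT and TOBJ sends positive activation to Throw and negative activation to Propel, and activating PAGT and POBJ does the reverse, in line with the effects described in the text.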
We believe that results like this one may argue that structured networks are
integrally linked to distributed networks in that distributed network learning
techniques may provide a fundamental basis for explaining the cognitive
development of structured networks. In addition, we see that simple inferential
reasoning can be produced using purely connectionist models.
CONCLUDING REMARKS 
We have attempted to show that a model using an activation spreading variant
can be used to take learned connectionist models and perform some limited forms
of inferencing upon them. Further, we have argued that this technique may
provide a computational model in which structured networks can be learned and
that structured networks provide the inferencing capabilities missing in purely
distributed models. However, before we can truly further this claim, significant
work remains to be done. We must extend and explore such models, particularly
examining whether these types of techniques can be extended to handle the
complexity that can be found in real-world problems and serious cognitive models.
In particular we are beginning an examination of two crucial issues: First,
will the technique described above work for realistic problems? In particular, can
the inferencing be designed to impact on the recognition by the distributed
network? If so, one could see, for example, a speech recognition program coupled to
a system like Cottrell's natural language system, providing a handle for a text
understanding system. Similarly such a technique might allow the integration of
top-down and bottom-up processing for vision and other such signal processing
tasks.
Secondly, we wish to see if more complex spreading activation models could be
hooked to this type of model. Could networks such as those proposed by Shastri
(1985), Diederich (1985), and Pollack and Waltz (1982), which provide complex
inferencing but require more structure than simply weights between units, be
abstracted out of the learned weights? Two particular areas currently being
pursued by the author, for example, focus on active inhibition models for
determining whether portions of the network can be suppressed to provide more complex
inferencing and the learning of structures given temporally ordered information.
References 
Charniak, E. Passing markers: A theory of contextual influence in language
comprehension. Cognitive Science, 7(3), 1983, 171-190.
Cottrell, G.W. A Connectionist Approach to Word-Sense Disambiguation. Doctoral
Dissertation, Computer Science Dept., University of Rochester, 1985.
Diederich, J. Parallelverarbeitung in netzwerk-basierten Systemen. PhD
Dissertation, Dept. of Linguistics, University of Bielefeld, 1985.
Feldman, J.A. and Ballard, D.H. Connectionist models and their properties.
Cognitive Science, 6, 1982, 205-254.
Hendler, J.A. Integrating Marker-passing and Problem Solving: A spreading
activation approach to improved choice in planning. Lawrence Erlbaum Associates,
N.J., November, 1987.
Pollack, J.B. and Waltz, D.L. Natural Language Processing using spreading
activation and lateral inhibition. Proceedings of the Fourth International
Conference of the Cognitive Science Society, 1982, 50-53.
Rumelhart, D.E. and McClelland, J.L. (eds.) Parallel Distributed Processing.
Cambridge, Ma.: MIT Press.
Shastri, L. Evidential Reasoning in Semantic Networks: A formal theory and its
parallel implementation. Doctoral Dissertation, Computer Science Department,
University of Rochester, Sept., 1985.
