REFLEXIVE ASSOCIATIVE MEMORIES
Hendricus G. Loos 
Laguna Research Laboratory, Fallbrook, CA 92028-9765 
ABSTRACT
In the synchronous discrete model, the average memory capacity of 
bidirectional associative memories (BAMs) is compared with that of 
Hopfield memories, by means of a calculation of the percentage of good 
recall for 100 random BAMs of dimension 64x64, for different numbers 
of stored vectors. The memory capacity is found to be much smaller than 
the Kosko upper bound, which is the lesser of the two dimensions of the 
BAM. On the average, a 64x64 BAM has about 68 % of the capacity of the 
corresponding Hopfield memory with the same number of neurons. Ortho- 
normal coding of the BAM increases the effective storage capacity by 
only 25 %. The memory capacity limitations are due to spurious stable 
states, which arise in BAMs in much the same way as in Hopfield 
memories. Occurrence of spurious stable states can be avoided by 
replacing the thresholding in the backlayer of the BAM by another 
nonlinear process, here called "Dominant Label Selection" (DLS). The 
simplest DLS is the winner-take-all net, which gives a fault-sensitive 
memory. Fault tolerance can be improved by the use of an orthogonal or 
unitary transformation. An optical application of the latter is a Fourier 
transform, which is implemented simply by a lens. 
INTRODUCTION 
A reflexive associative memory, also called bidirectional associa- 
tive memory, is a two-layer neural net with bidirectional connections 
between the layers. This architecture is implied by Dana Anderson's 
optical resonator 1, and by similar configurations 2,3. Bart Kosko 4 coined
the name "Bidirectional Associative Memory" (BAM), and investigated 
several basic properties 4-6. We are here concerned with the memory 
capacity of the BAM, with the relation between BAMs and Hopfield 
memories 7, and with certain variations on the BAM. 
American Institute of Physics 1988 
BAM STRUCTURE
We will use the discrete model in which the state of a layer of
neurons is described by a bipolar vector. The Dirac notation 8 will be
used, in which |> and <| denote respectively column and row vectors. <a|
and |a> are each other's transposes, <a|b> is a scalar product, and
|a><b| is an outer product. As depicted in Fig. 1, the BAM has two layers
of neurons, a front layer of N neurons with state vector |f>, and a back
layer of P neurons with state vector |b>. The bidirectional connections
between the layers allow signal flow in two directions. The forward
stroke gives |b> = s(B|f>), where B is the connection matrix, and s( ) is
a threshold function, operating at zero. The back stroke results in an
upgraded front state <f'| = s(<b|B), which also may be written as
|f'> = s(B^T|b>), where the superscript T denotes transposition. We
consider the synchronous model, where all neurons of a layer are updated
simultaneously, but the front and back layers are updated at different
times. The BAM action is shown in Fig. 2.

[Fig. 1. BAM structure: front layer of N neurons with state vector f,
back layer of P neurons with state vector b, connected by the forward
stroke and the back stroke.]

The forward stroke entails taking scalar products between a front
state vector |f> and the rows of B, and entering the thresholded results
as elements of the back state vector |b>. In the back stroke we take
scalar products of |b> with column vectors of B, and enter the
thresholded results as elements of an upgraded state vector |f'>. In
contrast, the action of an autoassociative memory is shown in Fig. 3.

[Fig. 2. BAM action: thresholding and reflection at the front layer f
and at the back layer b.]

[Fig. 3. Autoassociative memory action: thresholding and feedback.]
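The two strokes can be sketched in a few lines of Python. This is a minimal illustration, not code from the original work: the function names, the tie-break s(0) = +1, and the small 3x4 example are assumptions made here for concreteness.

```python
def s(x):
    """Threshold at zero; the tie s(0) = +1 is an assumption of this sketch."""
    return 1 if x >= 0 else -1

def forward_stroke(B, f):
    """|b> = s(B|f>): scalar products of |f> with the rows of B."""
    return [s(sum(bij * fj for bij, fj in zip(row, f))) for row in B]

def back_stroke(B, b):
    """|f'> = s(B^T|b>): scalar products of |b> with the columns of B."""
    return [s(sum(B[i][j] * b[i] for i in range(len(B))))
            for j in range(len(B[0]))]

def recall(B, f, max_iter=24):
    """Alternate back and forward strokes until both layers stop changing."""
    b = forward_stroke(B, f)
    for _ in range(max_iter):
        f_new = back_stroke(B, b)
        b_new = forward_stroke(B, f_new)
        if f_new == f and b_new == b:
            break
        f, b = f_new, b_new
    return f, b

# One stored pair: front vector d (N = 4), back vector c (P = 3).
d = [1, -1, 1, -1]
c = [1, 1, -1]
B = [[ci * dj for dj in d] for ci in c]   # B = |c><d|, a P x N matrix

noisy = [1, -1, 1, 1]                     # d with its last component flipped
f, b = recall(B, noisy)
```

With a single stored pair the forward stroke yields a positive multiple of |c> before thresholding, so the noisy input settles on the stored pair after one reflection.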
The BAM may also be described as an autoassociative memory 5 by
concatenating the front and back vectors into a single state vector
|v> = |f,b>, and by taking the (N+P)x(N+P) connection matrix as shown in
Fig. 4. This autoassociative memory has the same number of neurons as our
BAM, viz. N+P. The BAM operation where initially only the front state is
specified may be obtained with the corresponding autoassociative memory
by initially specifying |b> as zero, and by arranging the thresholding
operation such that s(0) does not alter the state vector component.

[Fig. 4. BAM as autoassociative memory: the diagonal blocks of the
connection matrix are zero.]

For a Hopfield memory 7 the connection matrix is

$$ H = \Big( \sum_{m=1}^{M} |m\rangle\langle m| \Big) - M I , \qquad (1) $$
where |m>, m=1 to M, are stored vectors, and I is the identity matrix.
Writing the (N+P)-dimensional vectors |m> as concatenations |d_m,c_m>,
(1) takes the form

$$ H = \Big( \sum_{m=1}^{M} \big( |d_m\rangle\langle d_m| + |c_m\rangle\langle c_m| + |d_m\rangle\langle c_m| + |c_m\rangle\langle d_m| \big) \Big) - M I , \qquad (2) $$

with proper block placing of submatrices understood. Writing

$$ K = \sum_{m=1}^{M} |c_m\rangle\langle d_m| , \qquad (3) $$

$$ H_d = \Big( \sum_{m=1}^{M} |d_m\rangle\langle d_m| \Big) - M I , \qquad H_c = \Big( \sum_{m=1}^{M} |c_m\rangle\langle c_m| \Big) - M I , \qquad (4) $$

where the I are identities in appropriate subspaces, the Hopfield matrix
H may be partitioned as shown in Fig. 5. K is just the BAM matrix given
by Kosko 5, and previously used by Kohonen 9 for linear heteroassociative
memories. Comparison of Figs. 4 and 5 shows that in the synchronous
discrete model the BAM with connection matrix (3) is equivalent to a
Hopfield memory in which the diagonal blocks H_d and H_c have been
deleted. Since the Hopfield memory is robust, this "pruning" may not
much affect the associative recall of stored vectors, if M is small;
however, on the average, pruning will not improve the memory capacity.
It follows that, on the average, a discrete synchronous BAM with matrix
(3) can at best have the capacity of a Hopfield memory with the same
number of neurons.
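The block structure can be checked numerically. The sketch below is an illustration only; the helper names, the small dimensions, and the random seed are assumptions. It builds the Hopfield matrix (1) from concatenated vectors and confirms that its off-diagonal blocks are the BAM matrix (3) and its transpose.

```python
import random

def hopfield_matrix(vectors):
    """H = (sum_m |m><m|) - M*I, as in (1)."""
    n = len(vectors[0])
    H = [[sum(v[i] * v[j] for v in vectors) for j in range(n)] for i in range(n)]
    for i in range(n):
        H[i][i] -= len(vectors)
    return H

def bam_matrix(data, labels):
    """K = sum_m |c_m><d_m|, as in (3); K has P rows and N columns."""
    P, N = len(labels[0]), len(data[0])
    return [[sum(c[i] * d[j] for d, c in zip(data, labels)) for j in range(N)]
            for i in range(P)]

def bam_connection_fraction(N, P):
    """Fraction of the (N+P)^2 Hopfield connections that the BAM keeps."""
    return (N * P) / ((N + P) ** 2)

rng = random.Random(0)                    # illustrative seed
N, P, M = 6, 4, 3                         # illustrative sizes
data = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(M)]
labels = [[rng.choice((-1, 1)) for _ in range(P)] for _ in range(M)]

H = hopfield_matrix([d + c for d, c in zip(data, labels)])  # |m> = |d_m, c_m>
K = bam_matrix(data, labels)

lower_left = [row[:N] for row in H[N:]]   # block coupling front to back
upper_right = [row[N:] for row in H[:N]]  # block coupling back to front
K_T = [list(col) for col in zip(*K)]
```

Here `lower_left` equals K and `upper_right` equals its transpose, so setting the two diagonal blocks of H to zero leaves exactly the BAM connections. For N = P the function `bam_connection_fraction` returns 1/4.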
We have performed computations of the average memory capacity
for 64x64 BAMs and for corresponding 128x128 Hopfield memories.
Monte Carlo calculations were done for 100 memories, each of which
stores M random bipolar vectors. The straight recall of all these vectors
was checked, allowing for 24 iterations. For the BAMs, the iterations
were started with a forward stroke in which one of the stored vectors
|d_m> was used as input. The percentage of good recall and its standard
deviation were calculated. The results plotted in Fig. 6 show that the
square BAM has about 68% of the capacity of the corresponding Hopfield
memory. Although the total number of neurons is the same, the BAM only
needs 1/4 of the number of connections of the Hopfield memory. The
storage capacity found is much smaller than the Kosko 6 upper bound,
which is min(N,P).

[Fig. 5. Partitioned Hopfield matrix.]

[Fig. 6. Percentage of good recall versus M, the number of stored
vectors.]
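A Monte Carlo estimate of this kind can be sketched as follows. This is not the program used for Fig. 6; it is a small illustration under stated assumptions: a memory is counted as "good" only if every stored pair is recovered exactly when its data vector is used as input, s(0) is taken as +1, and the default sizes are kept small so that plain Python runs quickly.

```python
import random

def s(x):
    return 1 if x >= 0 else -1            # tie-break s(0) = +1 is an assumption

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def recalls_pair(K, K_T, d, c, max_iter=24):
    """Start with a forward stroke from |d>; succeed if the BAM rests on (d, c)."""
    f = d
    b = [s(y) for y in matvec(K, f)]
    for _ in range(max_iter):
        f_new = [s(y) for y in matvec(K_T, b)]
        b_new = [s(y) for y in matvec(K, f_new)]
        if f_new == f and b_new == b:
            break
        f, b = f_new, b_new
    return f == d and b == c

def good_recall_percentage(N, P, M, trials=20, seed=0):
    """Percentage of random BAMs that recall all M stored pairs."""
    rng = random.Random(seed)
    good = 0
    for _ in range(trials):
        data = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(M)]
        labels = [[rng.choice((-1, 1)) for _ in range(P)] for _ in range(M)]
        K = [[sum(c[i] * d[j] for d, c in zip(data, labels)) for j in range(N)]
             for i in range(P)]
        K_T = [list(col) for col in zip(*K)]
        if all(recalls_pair(K, K_T, d, c) for d, c in zip(data, labels)):
            good += 1
    return 100.0 * good / trials
```

Calling `good_recall_percentage(64, 64, M, trials=100)` for a range of M would mimic the setup described above, at a noticeably longer run time in pure Python.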
CODED BAM 
So far, we have considered both front and back states to be used for
data. There is another use of the BAM in which only front states are used
as data, and the back states are seen as providing a code, label, or
pointer for the front state. Such use was anticipated in our expression
(3) for the BAM matrix which stores data vectors |d_m> and their labels
or codes |c_m>. For a square BAM, such an arrangement cuts the
information contained in a single stored data vector in half. However,
the freedom of choosing the labels |c_m> may perhaps be put to good use.
Part of the problem of spurious stable states, which plagues BAMs as
well as Hopfield memories as they are loaded up, is due to the lack of
orthogonality of the stored vectors. In the coded BAM we have the
opportunity to remove part of this problem by choosing the labels as
orthonormal. Such labels have been used previously by Kohonen 9 in linear
heteroassociative memories. The question whether memory capacity can
be improved in this manner was explored by taking 64x64 BAMs in which
the labels are chosen as Hadamard vectors. The latter are bipolar vectors
with Euclidean norm sqrt(P), which form an orthogonal set (orthonormal
after scaling by 1/sqrt(P)). These vectors are rows of a PxP Hadamard
matrix; for a discussion see Harwit and Sloane 10. The storage capacity
of such Hadamard-coded BAMs was calculated as a function of the number M
of stored vectors for 100 cases for each value of M, in the manner
discussed before. The percentage of good recall and its standard
deviation are shown in Fig. 6. It is seen that the Hadamard coding gives
about a factor 2.5 in M, compared to the ordinary 64x64 BAM. However,
the coded BAM has only half the stored data vector dimension. Accounting
for this factor 2 reduction of data vector dimension, the effective
storage capacity advantage obtained by Hadamard coding comes to only
25 % (2.5/2 = 1.25).
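For illustration, Hadamard labels can be generated as below. The text does not say which Hadamard matrix was used; the Sylvester construction shown here is an assumption, and it only covers dimensions that are powers of two (64 among them).

```python
def hadamard(P):
    """Sylvester construction of a P x P Hadamard matrix; P must be a power of two."""
    if P < 1 or P & (P - 1):
        raise ValueError("this sketch only handles powers of two")
    H = [[1]]
    while len(H) < P:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

H64 = hadamard(64)

# Rows are bipolar, mutually orthogonal, and have squared norm P = 64.
rows_orthogonal = all(
    dot(H64[i], H64[j]) == (64 if i == j else 0)
    for i in range(64) for j in range(64)
)

# Factor 2.5 in M, but only half of each stored pair carries data.
effective_gain = 2.5 / 2
```

Any M <= 64 rows of `H64` can then serve as the labels |c_m> in the connection matrix (3).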
HALF BAM WITH HADAMARD CODING
For the coded BAM there is the option of deleting the threshold
operation in the front layer. The resulting architecture may be called
"half BAM". In the half BAM, thresholding is only done on the labels, and
consequently, the data may be taken as analog vectors. Although such an
arrangement diminishes the robustness of the memory somewhat, there
are applications of interest. We have calculated the percentage of good
recall for 100 cases, and found that giving up the data thresholding cuts
the storage capacity of the Hadamard-coded BAM by about 60 %.
SELECTIVE REFLEXIVE MEMORY 
The memory capacity limitations shown in Fig. 6 are due to the
occurrence of spurious states when the memories are loaded up.
Consider a discrete BAM with stored data vectors |m>, m=1 to M,
orthonormal labels |c_m>, and the connection matrix

$$ K = \sum_{m=1}^{M} |c_m\rangle\langle m| . \qquad (5) $$

For an input data vector |v> which is closest to the stored data vector
|1>, one has in the forward stroke

$$ |b\rangle = s\Big( c\,|c_1\rangle + \sum_{m=2}^{M} a_m |c_m\rangle \Big) , \qquad (6) $$

where

$$ c = \langle 1|v\rangle , \qquad a_m = \langle m|v\rangle . \qquad (7) $$

Although a_m < c for m ≠ 1, for some vector component the sum

$$ \sum_{m=2}^{M} a_m |c_m\rangle $$

may accumulate to such a large value as to affect the thresholded result
|b>. The problem would be avoided if the thresholding operation s( ) in
the back layer of the BAM were to be replaced by another nonlinear
operation which selects, from the linear combination

$$ c\,|c_1\rangle + \sum_{m=2}^{M} a_m |c_m\rangle \qquad (8) $$

the dominant label |c_1>. The hypothetical device which performs this
operation is here called the "Dominant Label Selector" (DLS) 11, and we
call the resulting memory architecture "Selective Reflexive Memory"
(SRM). With the back state selected as the dominant label |c_1>, the back
stroke gives <f'| = s(<c_1|K) = s(P<1|) = <1|, by the orthogonality of
the labels |c_m>. It follows 11 that the SRM gives perfect associative
recall of the nearest stored data vector, for any number of vectors
stored. Of course, the linear independence of the P-dimensional label
vectors |c_m>, m=1 to M, requires P >= M.
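The recall argument can be traced numerically with an idealized DLS. The sketch below is an assumption-laden illustration: the DLS is hypothetical in the text, and here it is simply emulated by projecting the linear combination (8) onto every label and keeping the largest coefficient. The label generator, the sizes, and the seed are likewise chosen for illustration only.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hadamard_labels(P, M):
    """First M rows of a P x P Hadamard matrix (P a power of two, assumed)."""
    return [[1 - 2 * (bin(m & i).count("1") % 2) for i in range(P)]
            for m in range(M)]

def srm_recall(data, labels, v):
    P, N = len(labels[0]), len(v)
    # Connection matrix (5): K = sum_m |c_m><m|, with P rows and N columns.
    K = [[sum(c[i] * d[j] for d, c in zip(data, labels)) for j in range(N)]
         for i in range(P)]
    # Forward stroke without thresholding: the linear combination (8).
    y = [dot(row, v) for row in K]
    # Emulated DLS (assumption): the label with the largest coefficient wins.
    winner = max(range(len(labels)), key=lambda m: dot(labels[m], y))
    c = labels[winner]
    # Back stroke: s(<c|K) = s(P<m|) = <m| for the winning index m.
    f = [1 if sum(c[i] * K[i][j] for i in range(P)) >= 0 else -1
         for j in range(N)]
    return winner, f

rng = random.Random(1)                    # illustrative seed
N, P, M = 16, 8, 8                        # memory loaded up to M = P
data = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(M)]
labels = hadamard_labels(P, M)

v = data[3][:]
v[0] = -v[0]                              # probe: a stored vector with one bit flipped
winner, f = srm_recall(data, labels, v)
nearest = max(range(M), key=lambda m: dot(data[m], v))
```

For bipolar vectors the largest scalar product <m|v> marks the stored vector at the smallest Hamming distance, so `winner` coincides with `nearest` and `f` reproduces that stored vector exactly, even with the memory fully loaded.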
The DLS must select, from a linear combination of orthonormal
labels, the dominant label. A trivial case is obtained by choosing the
labels |c_m> as basis vectors |u_m>, which have all components zero
except for the m-th component, which is unity. With this choice of
labels, the DLS may be taken as a winner-take-all net W, as shown in
Fig. 7. This case appears to be included in Adaptive Resonance Theory
(ART) 12 as a special simplified case. A relationship between the
ordinary BAM and ART was pointed out by Kosko 5. As in ART, there is
considerable fault sensitivity in this memory, because the stored data
vectors appear in the connection matrix as rows.

[Fig. 7. Simplest reflexive memory with DLS: front layer f, back layer
b, and winner-take-all net.]

[Fig. 8. Selective reflexive memory: orthogonal transformation followed
by a winner-take-all net.]
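A minimal sketch of this simplest case follows, with made-up 6-component data vectors. It also illustrates the fault sensitivity: because each stored vector is one row of K, a single broken connection (modeled here, as an assumption, by setting one matrix element to zero) erases one component of the recalled vector.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def winner_take_all(y):
    """Return the basis vector |u_w> for the largest input y_w."""
    w = max(range(len(y)), key=lambda i: y[i])
    return [1 if i == w else 0 for i in range(len(y))]

def back_linear(K, b):
    """<b|K before thresholding."""
    return [sum(b[i] * K[i][j] for i in range(len(K))) for j in range(len(K[0]))]

# Illustrative stored data vectors; with basis-vector labels, K = sum_m |u_m><m|
# simply has the stored vectors as its rows.
stored = [[1, 1, -1, -1, 1, -1],
          [-1, 1, 1, -1, -1, 1],
          [1, -1, 1, 1, -1, -1]]
K = [row[:] for row in stored]

v = [1, 1, -1, -1, 1, 1]                  # stored[0] with its last bit flipped
b = winner_take_all([dot(row, v) for row in K])
recalled = back_linear(K, b)

# Fault: one connection in the row of the winning vector is broken.
K_damaged = [row[:] for row in K]
K_damaged[0][2] = 0
b_damaged = winner_take_all([dot(row, v) for row in K_damaged])
damaged = back_linear(K_damaged, b_damaged)
```

The intact memory returns `stored[0]`; in the damaged memory the same cell still wins, but component 2 of the output is zero, so that bit of the stored vector is lost.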
A memory with better fault tolerance may be obtained by using
orthogonal labels other than basis vectors. The DLS can then be taken as
an orthogonal transformation G followed by a winner-take-all net, as
shown in Fig. 8. G is to be chosen such that it transforms the labels
|c_m> into vectors proportional to the basis vectors |u_m>. This can
always be done by taking

$$ G = \sum_{p=1}^{P} |u_p\rangle\langle c_p| , \qquad (9) $$

where the |c_p>, p=1 to P, form a complete orthonormal set which
contains the labels |c_m>, m=1 to M. The neurons in the DLS serve as
grandmother cells. Once a single winning cell has been activated, i.e.,
the state of the layer is a single basis vector, say |u_1>, this vector
must be passed back, after application of the transformation G^-1, such
as to produce the label |c_1> at the back of the BAM. Since G is
orthogonal, we have G^-1 = G^T, so that the required inverse
transformation may be accomplished simply by sending the basis vector
back through the transformer; this gives

$$ \langle u_1| G = \sum_{p=1}^{P} \langle u_1|u_p\rangle\langle c_p| = \langle c_1| , \qquad (10) $$

as required.
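Equations (9) and (10) can be followed on a 4-dimensional example. The numbers below are illustrative assumptions. The labels are left as unnormalized bipolar Hadamard rows, so the matrix G used here is orthogonal only up to the factor P = 4, which does not change which cell wins.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# G = sum_p |u_p><c_p|: row p of G is the label <c_p| (here a 4 x 4 Hadamard matrix).
G = [[1, 1, 1, 1],
     [1, -1, 1, -1],
     [1, 1, -1, -1],
     [1, -1, -1, 1]]
labels = G[:3]                            # M = 3 labels in use

# A linear combination as in (8), with dominant coefficient c = 5.
coeffs = [5, 2, -1]
y = [sum(a * c[i] for a, c in zip(coeffs, labels)) for i in range(4)]

# The transformation maps each label onto a multiple of a basis vector.
Gy = [dot(row, y) for row in G]

# Winner-take-all: a single grandmother cell fires.
w = max(range(4), key=lambda p: Gy[p])
u = [1 if p == w else 0 for p in range(4)]

# Sending |u_w> back through the transformer, <u_w|G, returns the label.
selected = [sum(u[p] * G[p][i] for p in range(4)) for i in range(4)]
```

Here `Gy` is `[20, 8, -4, 0]`, i.e. P times the coefficients of the linear combination, and `selected` equals the dominant label `labels[0]`.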
HALF SRM 
The SRM may be modified by deleting the thresholding operation in
the front layer. The front neurons then have a linear output, which is
reflected back through the SRM, as shown in Fig. 9. In this case, the
stored data vectors and the input data vectors may be taken as analog
vectors, but we require all the stored vectors to have the same norm.
The action of the SRM proceeds in the same way as described above,
except that we now require the orthonormal labels to have unit norm. It
follows that, just like the full SRM, the half SRM gives perfect
associative recall to the nearest stored vector, for any number of
stored vectors up to the dimension P of the labels. The latter
condition is due to the fact that a P-dimensional vector space can at
most contain P orthonormal vectors.

[Fig. 9. Half SRM with linear neurons in front layer, orthogonal
transformation, and winner-take-all net.]
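A small numerical sketch of the half SRM with analog data follows. The stored vectors, the probe, and the emulation of the DLS by projection onto the labels are assumptions made for illustration. All stored vectors have norm 5, and the labels are Hadamard rows scaled to unit norm.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Unit-norm orthonormal labels: rows of a 4 x 4 Hadamard matrix scaled by 1/2.
labels = [[0.5, 0.5, 0.5, 0.5],
          [0.5, -0.5, 0.5, -0.5],
          [0.5, 0.5, -0.5, -0.5]]

# Analog stored vectors, all of Euclidean norm 5 (illustrative values).
stored = [[3.0, 4.0, 0.0],
          [0.0, 3.0, 4.0],
          [4.0, 0.0, 3.0]]

P, N = 4, 3
K = [[sum(c[i] * d[j] for d, c in zip(stored, labels)) for j in range(N)]
     for i in range(P)]

def half_srm_recall(v):
    y = [dot(row, v) for row in K]                       # linear back layer
    # Emulated DLS (assumption): largest coefficient <m|v> wins.
    winner = max(range(len(labels)), key=lambda m: dot(labels[m], y))
    c = labels[winner]
    # Linear front layer: <c|K = <m| because the labels have unit norm.
    f = [sum(c[i] * K[i][j] for i in range(P)) for j in range(N)]
    return winner, f

winner, f = half_srm_recall([2.9, 4.2, 0.1])             # noisy version of stored[0]
```

Because all stored vectors share one norm, the largest scalar product with the input identifies the nearest stored vector in Euclidean distance, and the linear back stroke returns it without any thresholding.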
In the SRM the output transform G is introduced in order to improve
the fault tolerance of the connection matrix K. This is accomplished at
the cost of some fault sensitivity of G, the extent of which needs to be
investigated. In this regard it is noted that in certain optical
implementations of reflexive memories, such as Dana Anderson's
resonator 1 and similar configurations 2,3, the transformation G is a
Fourier transform, which is implemented simply as a lens. Such an
implementation is quite insensitive to the common semiconductor damage
mechanisms.
EQUIVALENT AUTOASSOCIATIVE MEMORIES
Concatenation of the front and back state vectors allows description
of the SRMs in terms of autoassociative memories. For the SRM
which uses basis vectors as labels the corresponding autoassociative
memory is shown in Fig. 10. This connection matrix structure was also
proposed by Guest et al. 13. The winner-take-all net W needs to be
given time to settle on a basis vector state before the state |b> can
influence the front state |f>. This may perhaps be achieved by arranging
the W network to have a thresholding and feedback which are fast
compared with that of the K network. An alternate method may be to equip
the W network with an output gate which is opened only after the W net
has settled. These arrangements present a complication and cause a
delay, which in some applications may be inappropriate, and in others
may be acceptable in a trade between speed and memory density.

[Fig. 10. Equivalent autoassociative memory.]
For the SRM with output transformer and orthonormal labels other
than basis vectors, a corresponding autoassociative memory may be
composed as shown in Fig. 11. An output gate in the w layer is chosen as
the device which prevents the back stroke through the BAM from taking
place before the winner-take-all net has settled. The same effect may
perhaps be achieved by choosing different response times for the neuron
layers f and w. These matters require investigation. Unless the output
transform G is already required for other reasons, as in some optical
resonators, the DLS with output transform is clumsy. It would be far
better to combine the transformer G and the net W into a single network.
To find such a DLS should be considered a challenge.

[Fig. 11. Autoassociative memory equivalent to SRM with transform:
layer f thresholded, layer b linear, layer w thresholded with output
gate, and feedback.]

[Fig. 12. Structure of SRM: front layer f, BAM connections, linear back
layer b, orthogonal transformation, winner-take-all net with output
layer w, and output gate.]
The work was partly supported by the Defense Advanced Research
Projects Agency, ARPA order 5916, through Contract DAAH01-86-C-0968
with the U.S. Army Missile Command.
REFERENCES 
1. D. Z. Anderson, "Coherent optical eigenstate memory", Opt. Lett. 11, 
56 (1986). 
2. B. H. Soffer, G. J. Dunning, Y. Owechko, and E. Marom, "Associative
holographic memory with feedback using phase-conjugate mirrors", Opt.
Lett. 11, 118 (1986).
3. A. Yariv and S. K. Kwong, "Associative memories based on message-
bearing optical modes in phase-conjugate resonators", Opt. Lett. 11,
186 (1986).
4. B. Kosko, "Adaptive Cognitive Processing", NSF Workshop for Neural 
Networks and Neuromorphic Systems, Boston, Mass., Oct. &-8, 1986. 
5. B. Kosko, "Bidirectional Associative Memories", IEEE Trans. SMC, In 
press, 1987. 
6. B. Kosko, "Adaptive Bidirectional Associative Memories", Appl. Opt., 
In press, 1987. 
7. J. J. Hopfield, "Neural networks and physical systems with emergent
collective computational abilities", Proc. Natl. Acad. Sci. USA 79, 2554
(1982).
8. P. A. M. Dirac, THE PRINCIPLES OF QUANTUM MECHANICS, Oxford, 1958.
9. T. Kohonen, "Correlation Matrix Memories", Helsinki University of
Technology Report TKK-F-A 130, 1970.
10. M. Harwit and N. J. A. Sloane, HADAMARD TRANSFORM OPTICS,
Academic Press, New York, 1979.
11. H. G. Loos, "Adaptive Stochastic Content-Addressable Memory", Final
Report, ARPA Order 5916, Contract DAAH01-86-C-0968, March 1987.
12. G. A. Carpenter and S. Grossberg, "A Massively Parallel Architecture 
for a Self-Organizing Neural Pattern Recognition Machine", Computer 
Vision, Graphics, and Image Processing, 37, 54 (1987). 
13. R. D. TeKolste and C. C. Guest, "Optical Cohen-Grossberg System 
with All-Optical Feedback", IEEE First Annual International Conference 
on Neural Networks, San Diego, June 21-24, 1987. 
