702 Obradovic and Parberry 
Analog Neural Networks of Limited Precision I: 
Computing with Multilinear Threshold Functions 
(Preliminary Version) 
Zoran Obradovic and Ian Parberry 
Department of Computer Science, 
Penn State University, 
University Park, Pa. 16802. 
ABSTRACT 
Experimental evidence has shown analog neural networks to be ex- 
tremely fault-tolerant; in particular, their performance does not ap- 
pear to be significantly impaired when precision is limited. Analog 
neurons with limited precision essentially compute k-ary weighted 
multilinear threshold functions, which divide R n into k regions with 
k-1 hyperplanes. The behaviour of k-ary neural networks is investi- 
gated. There is no canonical set of threshold values for k>3, 
although they exist for binary and ternary neural networks. The 
weights can be made integers of only O ((z +k)log (z +k)) bits, where 
z is the number of processors, without increasing hardware or run- 
ning time. The weights can be made +1 while increasing running 
time by a constant multiple and hardware by a small polynomial in z 
and k. Binary neurons can be used if the running time is allowed to 
increase by a larger constant multiple and the hardware is allowed to 
increase by a slightly larger polynomial in z and k. Any symmetric 
k-ary function can be computed in constant depth and size 
0 (n'-/(k-2)!), and any k-ary function can be computed in constant 
depth and size O (nk"). The alternating neural networks of Olafsson 
and Abu-Mostafa, and the quantized neural networks of Fleisher are 
closely related to this model. 
Analog Neural Networks of Limited Precision I 703 
1 INTRODUCTION 
Neural networks are typically circuits constructed from processing units which com- 
pute simple functions of the form f (w x,...,w, ):R" -->S where So_R, wiR for l__n, 
and 
f (Wl,...,W.)(Xl, .... 
i=1 
for some output function g:R-->S. There are two choices for the set S which are 
currently popular in the literature. The first is the discrete model, with S=B (where B 
denotes the Boolean set {0,1}). In this case, g is typically a linear threshold function 
g(x)=l iff x_>O, and f is called a weighted linear threshold function. The second is 
the analog model, with S=[0,1] (where [0,1] denotes {reRI0_r<l}). In this case, g 
is typically a monotone increasing function, such as the sigmoid function 
g(x)=(l+c-X) - for some constant c e R. The analog neural network model is popular 
because it is easy to construct processors with the required characteristics using a few 
transistors. The digital model is popular because its behaviour is easy to analyze. 
Experimental evidence indicates that analog neural networks can produce accurate 
computations when the precision of their components is limited. Consider what actu- 
ally happens to the analog model when the precision is limited. Suppose the neurons 
can take on k distinct excitation values (for example, by restricting the number of di- 
gits in their binary or decimal expansions). Then S is isomorphic to Zk={0,...,k-1}. 
We will show that g is essentially the multilinear threshold function 
g (h,h2,...,hk_O:R-->Z defined by 
g (x)=i iff hi <_x <hi+. 
Here and throughout this paper, we will assume that h<h2<...<h_, and for conveni- 
ence define h0=-oo and h=oo. We will call f a k-ary weighted multilinear threshold 
function when g is a multilinear threshold function. 
We will study neural networks constructed from k-ary multilinear threshold functions. 
We will call these k-ary neural networks, in order to distinguish them from the stan- 
dard 2-ary or binary neural network. We are particularly concerned with the resources 
of time, size (number of processors), and weight (sum of all the weights) of k-ary 
neural networks when used in accordance with the classical computational paradigm. 
The reader is referred to (Parberry, 1990) for similar results on binary neural networks. 
A companion paper (Obradovic & Parberry, 1989b) deals with learning on k-ary neur- 
al networks. A more detailed version of this paper appears in (Obradovic & Parberry, 
1989a). 
2 A K-ARY NEURAL NETWORK MODEL 
A k-ary neural network is a weighted graph M=(V,E ,w ,h), where V is a set of pro- 
cessors and E cVxV is a set of connections between processors. Function 
w:VxV-->R assign weights to interconnections and h:V-->R-assign a set of k-1 
thresholds to each of the processors. We assume that if (u,v)&E, w (u,v)=0. The 
size of M is defined to be the number of processors, and the weight of M is 
704 Obradovic and Parberry 
Iw (u,v)l. 
,vV 
The processors of a k-ary neural network are relatively limited in computing power. 
A k-ary function is a function f :Z-->Zk. Let F denote the set of all n-input k-ary 
functions. Define O:R'*--->F by O'(w ,...,wn ,h ,...,hk_0:R-->Z, where 
O(w ,h )=i 
i=l 
The set of k-ary weighted multilinear threshold functions is the union, over all n  N, 
of the range of OF. Each processor of a k-ary neural network can compute a k-ary 
weighted multilinear threshold function of its inputs. 
Each processor can be in one of k states, 0 through k-1. Initially, the input proces- 
sors of M are placed into states which encode the input. If processor v was updated 
during interval t, its state at time t-1 was i and output was j, then at time t its state 
will be j. A k-ary neural network computes by having the processors change state un- 
til a stable configuration is reached. The output of M are the states of the output pro- 
cessors after a stable state has been reached. A neural network M 2 is said to be f (t)- 
equivalent to M iff for all inputs x, for every computation of M on input x which 
terminates in time t there is a computation of M 2 on input x which terminates in time 
f(t) with the same output. A neural network M2 is said to be equivalent to M iff it 
is t-equivalent to it. 
3 ANALOG NEURAL NETWORKS 
Let f be a function with range [0,1]. Any limited-precision device which purports to 
compute f must actually compute some function with range the k rational values 
R={i/k-lli e Z,0<_/<k } (for some keN). This is sufficient for all practical purposes 
provided k is large enough. Since Rk is isomorphic to Z, we will formally define 
the limited precision variant of f to be the function f:X-->Z defined by 
f  (x )=round (f (x).(k-1)), where round:R-->N is the natural rounding function defined 
by round(x)=n iff n-0.5.x<n+0.5. 
Theorem 3.1: Letf(w,...,w,,):R"-->[O,1] where wieR for ln, be defined by 
f (w,...,w,)(x,... ,x,g(wixi) 
i=l 
where g :R-->[0,1] is monotone increasing and invertible. Then f (w 1,...,w,)k :R n -->Z 
is a k-ary weighted multilinear threshold function. 
Proof: It is easy to verify that f(wi,...,w,)=O(wi,...,w,,hi,...,h_O, where 
hi=g-i((2i-1)/2(k-1)). [] 
Thus we see that analog neural networks with limited precision are essentially k-ary 
neural networks. 
Analog Neural Networks of Limited Precision I 705 
4 CANONICAL THRESHOLDS 
Binary neural networks have the advantage that all thresholds can be taken equal to 
zero (see, for example, Theorem 4.3.1 of Parberry, 1990). A similar result holds for 
ternary neural networks. 
Theorem 4.1: For every n-input ternary weighted multilinear threshold function there 
is an equivalent (n+l)-input ternary weighted multilinear threshold function with 
threshold values equal to zero and one. 
Proof: Suppose w=(w,...,wn)eR n, h,h2eR. 
h l<h2. Define =(1 ..... n+l) R ' +l by 
,+i=-h/(h2-hO. It can be demonstrated by 
Z, 
Without loss of generality assume 
i=wi/(h2-h) for 1.(d._n, and 
a simple case analysis that for all 
O'(w ,h ,h 2)(x )=O'+ ( ,0,1)(x  ,...,x, ,1). 
The choice of threshold values in Theorem 4.1 was arbitrary. Unfortunately there is 
no canonical set of thresholds for k >3. 
Theorem 4.2: For every k>3, n>2, m_>0, h,...,hk_eR, there exists an n-input k-ary 
weighted multilinear threshold function 
such that for all (n +m)-input k-ary weighted multilinear threshold functions 
O+m( , . . , ,,h  
 ,...,h_0.Zt -->Zk 
and y 1,...,Ym  R, there exists x=(x,...,x,,) Z such that 
O(w b...,wn ,t ,...,t-O(x):O (w  ..... n+a ,h ,...,h_0(x b...,x, ,y b...,Ya). 
Proof (Sketch): Suppose that t l,...,t/_l R is a canonical set of thresholds, and w.l.o.g. 
assume n=2. Let h=(h,...,h,_O, where h=h2=2, h3--4, hi=5 for 4<i<k, and 
f =O2(1,1,h ). 
By hypothesis there exist w  ,. .. ,w, +2 and y=(y ,...,y,) R" such that for all x  Z 2, 
f (x)=-Of+2(w 1,...,w +2,t 1,...,t-l)(x ,y ). 
Let S =EWi+2Yi. 
i=l 
Since f (1,0)--0, f (0,1)=0, f (2,1)=2, f (1,2)=2, it follows that 
2(w +w 2+S)<t l+t 3. 
(1) 
Since f (2,0)=2, f (1,1)=2, and f (0,2)=2, it follows that 
706 Obradovic and Parberry 
W i+W 2+5 ?t 2. (2) 
Inequalities (1) and (2) imply that 
2t2<tl+t3. (3) 
By similar arguments from g=O2(1,1,1,3,3,4,...,4) we can conclude that 
2t2>tl+t3. (4) 
But (4) contradicts (3). [] 
5 NETWORKS OF BOUNDED WEIGHT 
Although our model allows each weight to take on an infinite number of possible 
values, there are only a finite number of threshold functions (since there are only a 
finite number of k-ary functions) with a fixed number of inputs. Thus the number of 
n-input threshold functions is bounded above by some function in n and k. In fact, 
something stronger can be shown. All weights can be made integral, and 
0 ((n +k)log (n +k)) bits are sufficient to describe each one. 
Theorem 5.1: For every k-ary neural network M x of size z there exists an equivalent 
k-ary neural network M 2 of size z and weight ((k-1)/2)Z (z + l) (z+k)/2+O) with integer 
weights. 
Proof (Sketch): It is sufficient to prove that for every weighted threshold function 
f(wi,...,w,,hi,...,hk_i):Z-->Z for some nN, there is an equivalent weighted thres- 
hold function g(w,...,w,h,...,h_i) such that Iwi*l<((k-1)/2)n(n+l) (n')/2+(O for 
l<i..n. By extending the techniques used by Muroga, Toda and Takasu (1961) in the 
binary case, we see that the weights are bounded above by the maximum determinant 
of a matrix of dimension n +k-1 over Z. [] 
Thus if k is bounded above by a polynomial in n, we are guaranteed of being able to 
describe the weights using a polynomial number of bits. 
6 THRESHOLD CIRCUITS 
A k-ary neural network with weights drawn from {+1} is said to have unit weights. A 
unit-weight directed acyclic k-ary neural network is called a k-ary threshold circuit. 
A k-ary threshold circuit can be divided into layers, with each layer receiving inputs 
only from the layers above it. The depth of a k-ary threshold circuit is defined to be 
the number of layers. The weight is equal to the number of edges, which is bounded 
above by the square of the size. Despite the apparent handicap of limited weights, k- 
ary threshold circuits are surprisingly powerful. 
Much interest has focussed on the computation of symmetric functions by neural net- 
works, motivated by the fact that the visual system appears to be able to recognize ob- 
jects regardless of their position on the retina. A function f :ZZ is called sym- 
metric if its output remains the same no matter how the input is permuted. 
Analog Neural Networks of Limited Precision I 707 
Theorem 6.1: Any symmetric k-ary function on n inputs can be computed by a k-ary 
threshold circuit of depth 6 and size (n+l)'-/(k-2)!+ 0 (kn). 
Proof: Omitted. [] 
It has been noted many times that neural networks can compute any Boolean function 
in constant depth. The same is true of k-ary neural networks, although both results 
appear to require exponential size for many interesting functions. 
Theorem 6.2: Any k-ary function of n inputs can be computed by a k-ary threshold 
circuit with size (2n+l)k"+k+l and depth 4. 
Proof: Similar to that for k=2 (see Chandra et. al., 1984; Parberry, 1990). [] 
The interesting problem remaining is to determine which functions require exponential 
size to achieve constant depth, and which can be computed in polynomial size and 
constant depth. We will now consider the problem of adding integers represented in 
k-ary notation. 
Theorem 6.3: The sum of two k-ary integers of size n can be computed by a k-ary 
threshold circuit with size 0 (n 2) and depth 5. 
Proof: First compute the carry of x and y in quadratic size and depth 3 using the stan- 
dard elementary school algorithm. Then the its position of the result can be computed 
from the i th position of the operands and a carry propagated in that position in con- 
stant size and depth 2. [] 
Theorem 6.4: The sum of n k-ar k, integers of size n can be computed by a k-ary 
threshold circuit with size 0 (n3+kn ) and constant depth. 
Proof: Similar to the proof for k =2 using Theorem 6.3 (see Chandra et. al., 1984; Par- 
berry, 1990). [] 
Theorem 6.5: For every k-ary neural network M x of size z there exists an 0 (t)- 
equivalent unit-weight k-ary neural network M2 of size 0 ((z +k)41og3(z +k)). 
Proof: By Theorem 5.1 we can bound all weights to have size O((z+k)log(z+k)) in 
binary notation. By Theorem 6.4 we can replace every processor with non-unit 
weights by a threshold circuit of size 0 ((z +k)31og3(z+k)) and constant depth. [] 
Theorem 6.5 implies that we can assume unit weights by increasing the size by a po- 
lynomial and the running time by only a constant multiple provided the number of 
logic levels is bounded above by a polynomial in the size of the network. The 
number of thresholds can also be reduced to one if the size is increased by a larger 
polynomial: 
Theorem 6.6: For every k-ary neural network M of size z there exists an 0 (t)- 
equivalent unit-weight binary neural network M2 of size O(z4k4)(log z + log k) 3 
which outputs the binary encoding of the required result. 
Proof: Similar to the proof of Theorem 6.5. [] 
This result is primarily of theoretical interest. Binary neural networks appear simpler, 
and hence more desirable than analog neural networks. However, analog neural net- 
works are actually more desirable since they are easier to build. With this in mind, 
Theorem 6.6 simply serves as a limit to the functions that an analog neural network 
708 Obradovic and Parberry 
can be expected to compute efficiently. We are more concerned with constructing a 
model of the computational abilities of neural networks, rather than a model of their 
implementation details. 
7 NONMONOTONE MULTILINEAR NEURAL NETWORKS 
Olafsson and Abu-Mostafa (1988) study information capacity of functions 
f (w ,...,w,):R ' -->B for w i  R, 1</n, where 
f (w,...,w,)(x ..... x,)=g (5"wixi) 
i=l 
and g is the alternating threshold function g(h,h2,...,hk_):R-->B for some monotone 
increasing hi,R, l_./<k, defined by g(x)=O if hva.x<h2+ for some O_i<n/2. We 
will call f an alternating weighted multilinear threshold function, and a neural net- 
work constructed from functions of this form alternating multilinear neural networks. 
Alternating multilinear neural networks are closely related to k-ary neural networks: 
Theorem 7.1 : For every k-ary neural network of size z and weight w there is an 
equivalent alternating multilinear neural network of size z log k and weight 
(k-1)w log (k-l) which produces the output of the former in binary notation. 
Proof (Sketch): Each k-ary gate is replaced by log k gates which together essentially 
perform a "binary search" to determine each bit of the k-ary gate. Weights which in- 
crease exponentially are used to provide the correct output value. [] 
Theorem 7.2: For every alternating multilinear neural network of size z and weight 
w there is a 3t-equivalent k-ary neural network of size 4z and weight w+4z. 
Proof (Sketch): Without loss of generality, assume k is odd. Each alternating gate is 
replaced by a k-ary gate with identical weights and thresholds. The output of this gate 
goes with weight one to a k-ary gate with thresholds 1,3,5,...,k-1 and with weight 
minus one to a k-ary gate with thresholds -(k-I),...,-3,-1. The output of these gates 
goes to a binary gate with threshold k. [] 
Both k-ary and alternating multilinear neural networks are a special case of nonmono- 
tone multilinear neural networks, where g:R-->R is the defined by g(x)=ci iff 
hi..x<hi+, for some monotone increasing hi,R, l<i<k, and Co .... ,ck_Zk. Non- 
monotone neural networks correspond to analog neural networks whose output func- 
tion is not necessarily monotone nondecreasing. Many of the restfit of this paper, in- 
cluding Theorems 5.1, 6.5, and 6.6, also apply to nonmonotone neural networks. The 
size, weight and running time of many of the upper-bounds can also be improved by a 
small amount by using nonmonotone neural networks instead of k-ary ones. The de- 
tails are left to the interested reader. 
8 MULTILINEAR HOPFIELD NETWORKS 
A multilinear version of the Hopfield network called the quantized neural network has 
been studied by Fleisher (1987). Using the terminology of (Parberry, 1990), a quan- 
tized neural network is a simple symmetric k-ary neural network (that is, its intercon- 
nection pattern is an undirected graph without self-loops) with the additional property 
that all processors have an identical set of thresholds. Although the latter assumption 
Analog Neural Networks of Limited Precision I 709 
is reasonable for binary neural networks (see, for example, Theorem 4.3.1 of Parberry, 
1990), and ternary neural networks (Theorem 4.1), it is not necessarily so for k-ary 
neural networks with k>3 (Theorem 4.2). However, it is easy to extend Fleisher's 
main result to give the following: 
Theorem 8.1: Any productive sequential computation of a simple symmetric k-ary 
neural network will converge. 
9 CONCLUSION 
It has been shown that analog neural networks with limited precision are essentially 
k-ary neural networks. If k is limited to a polynomial, then polynomial size, constant 
depth k-ary neural networks are equivalent to polynomial size, constant depth binary 
neural networks. Nonetheless, the savings in time (at most a constant multiple) and 
hardware (at most a polynomial) arising from using k-ary neural networks rather than 
binary ones can be quite significant. We do not suggest that one should actually con- 
struct binary or k-ary neural networks. Analog neural networks can be constructed by 
exploiting the analog behaviour of transistors, rather than using extra hardware to inhi- 
bit it. Rather, we suggest that k-ary neural networks are a tool for reasoning about the 
behaviour of analog neural networks. 
Acknowledgements 
The financial support of the Air Force Office of Scientific Research, Air Force Sys- 
tems Command, USAF, under grant numbers AFOSR 87-0400 and AFOSR 89-0168 
and NSF grant CCR-8801659 to Ian Parberry is gratefully acknowledged. 
References 
Chandra A. K., Stockmeyer L. J. and Vishkin U., (1984) "Constant depth reducibility," 
SIAM J. Comput., vol. 13, no. 2, pp. 423-439. 
Fleisher M., (1987) "The Hopfield model with multi-level neurons," Proc. IEEE 
Conference on Neural Information Processing Systems, pp. 278-289, Denver, CO. 
Muroga S., Toda I. and Takasu S., (1961) "Theory of majority decision elements," J. 
Franklin Inst., vol. 271., pp. 376-418. 
Obradovic Z. and Parberry I., (1989a) "Analog neural networks of limited precision I: 
Computing with multilinear threshold functions (preliminary version)," Technical Re- 
port CS-89-14, Dept. of Computer Science, Penn. State Univ. 
Obradovic Z. and Parberry I., (1989b) "Analog neural networks of limited precision II: 
Learning with multilinear threshold functions (preliminary version)," Technical Report 
CS-89-15, Dept. of Computer Science, Penn. State Univ. 
Olafsson S. and Abu-Mostafa Y. S., (1988) "The capacity of multilevel threshold func- 
tions," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 10, no. 2, pp. 
277-281. 
Parberry I., (To Appear in 1990) "A Primer on the Complexity Theory of Neural Net- 
works," in A Sourcebook of Formal Methods in Artificial Intelligence, ed. R. Banerji, 
North-Holland. 
