
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 29, NO. 2, APRIL 1999

of the equalizer, first, the equalizer was trained with 5000 symbols to obtain optimum weight values. Then, a subsequent 1000 symbols were transmitted and the outputs of the equalizer were obtained for plotting the eye patterns. The signal constellation of 1000 noisy received signal samples (before equalization), along with the eye patterns of 1000 symbols at an SNR of 15 dB for CH = 6 with the nonlinear model (NL = 1), is shown in Fig. 10. It may be seen that channel equalization using ANN’s is more effective than with the linear LMS-based equalizer. Of the three ANN structures, the FLANN-based equalizer provides the most effective equalization for both linear and nonlinear channel models. Similar observations were made for all six channels with the linear and the three nonlinear models studied.


VIII. CONCLUSION

Considering channel equalization as a multicategory classification problem, it is shown that ANN-based equalizers provide a substantial improvement in terms of convergence rate, MSE floor level and BER. In a linear equalizer the performance degrades drastically with increasing EVR, especially when the channel is nonlinear. In the ANN-based equalizer, however, the performance degradation with increasing EVR is shown to be less severe. We have introduced a novel FLANN-based equalizer structure for adaptive channel equalization of nonlinear channels. Because of its single-layer structure, the FLANN offers advantages over the MLP. The performance of the PPN and the MLP is found to be similar in most of the experiments, but the single-layer PPN structure is preferable to the MLP as it has lower computational complexity. Of the three ANN equalizer structures, the FLANN is found to perform best in terms of MSE level, convergence rate, BER and computational complexity for linear as well as nonlinear channel models over a wide range of SNR and EVR variations. Because of its computational advantages, the FLANN may be used in other signal processing applications.

REFERENCES

[1] S. Chen, G. J. Gibson, and C. F. N. Cowan, “Adaptive channel equalization using a polynomial perceptron structure,” Proc. Inst. Elect. Eng., vol. 137, pt. 1, pp. 257–264, Oct. 1990.
[2] S. Chen, G. J. Gibson, C. F. N. Cowan, and P. M. Grant, “Adaptive equalization of finite nonlinear channels using multilayer perceptrons,” Signal Process., vol. 20, pp. 107–119, 1990.
[3] W. S. Gan, J. J. Soraghan, and T. S. Durrani, “A new functional-link based equaliser,” Electron. Lett., vol. 28, pp. 1643–1645, Aug. 1992.
[4] G. J. Gibson, S. Siu, and C. F. N. Cowan, “The application of nonlinear structures to the reconstruction of binary signals,” IEEE Trans. Signal Processing, vol. 39, pp. 1877–1884, Aug. 1991.
[5] S. Haykin, Adaptive Filter Theory, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 1991.
[6] M. Meyer and G. Pfeiffer, “Multilayer perceptron based decision feedback equalizers for channels with intersymbol interference,” Proc. IEE, vol. 140, pt. 1, pp. 420–424, Dec. 1993.
[7] Y.-H. Pao, Adaptive Pattern Recognition and Neural Networks. Reading, MA: Addison-Wesley, 1989.
[8] Y.-H. Pao, G.-H. Park, and D. J. Sobajic, “Learning and generalization characteristics of random vector functional-link net,” Neurocomputation, vol. 6, pp. 163–180, 1994.
[9] J. C. Patra, “Some studies on artificial neural networks for signal processing applications,” Ph.D. dissertation, Indian Inst. Technol., Kharagpur, Dec. 1996.
[10] J. C. Patra and R. N. Pal, “A functional link artificial neural network for adaptive channel equalization,” Signal Process., vol. 43, pp. 181–195, May 1995.

[11] J. C. Patra and R. N. Pal, “Functional link neural network-based adaptive equalization of nonlinear channels with QAM signal,” in Proc. IEEE Int. Conf. Systems, Man, Cybernetics, Vancouver, B.C., Canada, Oct. 1995, pp. 2081–2086.
[12] N. Sadegh, “A perceptron based neural network for identification and control of nonlinear systems,” IEEE Trans. Neural Networks, vol. 4, pp. 982–988, Nov. 1993.
[13] S. Siu, G. J. Gibson, and C. F. N. Cowan, “Decision feedback equalization using neural network structures and performance comparison with standard architecture,” Proc. Inst. Elect. Eng., vol. 137, pt. 1, pp. 221–225, Aug. 1990.
[14] J. T. Tou and R. C. Gonzalez, Pattern Recognition Principles. Reading, MA: Addison-Wesley, 1981.
[15] B. Widrow and M. A. Lehr, “30 years of adaptive neural networks: perceptron, madaline and backpropagation,” Proc. IEEE, vol. 78, pp. 1415–1442, Sept. 1990.
[16] Z. Xiang, G. Bi, and T.-L. Ngoc, “Polynomial perceptrons and their applications to fading channel equalization and co-channel interference suppression,” IEEE Trans. Signal Processing, vol. 42, pp. 2470–2479, Sept. 1994.
[17] S. S. Yang and C. S. Tseng, “An orthonormal neural network for function approximation,” IEEE Trans. Syst., Man, Cybern. B, vol. 26, pp. 779–785, Oct. 1996.

A Novel Artificial Neural Network for Sorting

T. Tambouratzis

Abstract: An artificial neural network (ANN) is employed for sorting a sequence of real elements in monotonic (descending or ascending) order. Although inspired by harmony theory (HT), whereby the same construction as for the HT ANN is followed, the proposed ANN differs in the mode of operation, namely the obliteration of the consensus (harmony) function, the circumvention of simulated annealing as a means of settling to a solution, the simplification of the activation updating of the nodes of the upper layer, the clamping of the nodes of the lower layer, the gradual shrinking of the ANN and the use of an automatic termination criterion. The creation of the sorted sequence is progressive, whereby at most as many network updates are required as there are elements in the sequence. Ties between elements are resolved by simultaneous activation of the corresponding nodes. Finally, the min and max problems are solved in a single network update.

Index Terms: Artificial neural networks, harmony theory, sorting.

I. INTRODUCTION

Sorting constitutes a basic operation of many computing tasks, including VLSI design, digital signal processing, network communications, database management and data processing, for which it has been estimated that sorting operations account for over 25% of the total processing time. The significance of sorting is reflected in the multitude of sorting techniques that have been proposed over the previous decades. The main aim of these techniques, be they serial or parallel in construction and operation, is the minimization of the time and storage requirements of the sorting operation. Instructive overviews of serial and parallel sorting techniques are given in [1] and [2], respectively. Representative serial sorting techniques appear in [3]–[9], while, more recently, parallel sorting techniques have been proposed in [10]–[18].

In this work, a novel artificial neural network (ANN), characterized by simplicity and transparency in construction as well as during operation, is proposed for sorting a sequence of $N$ real elements $\{n_i\}$, $i = 1, 2, \ldots, N$. The same ANN can sort the elements in descending or in ascending order, and ties between the elements are accommodated. The creation of the sorted sequence is progressive, whereby at most $N$ node activation updates are required; additionally, an automatic termination criterion is utilized. The proposed ANN is derived from harmony theory (HT) [19], whereby the same construction as for the standard HT ANN is followed. However, its mode of operation is a modified version of that of the HT ANN in terms of the clamping of the nodes of the lower layer, the simplification of the activation updating of the nodes of the upper layer, the circumvention of the time-consuming process of simulated annealing as a means of settling to a solution and the obliteration of the consensus (harmony) function.

This paper is organized as follows: Section II describes the structure of the HT ANN, while Section III introduces the modifications leading to the ANN proposed for sorting. The general results, as well as a comparison with three recent and distinct ANN sorting techniques, appear in Section IV. Section V concludes the paper.

Manuscript received June 19, 1998; revised November 8, 1998. This paper was recommended by Associate Editor R. Popp. The author is with the Institute of Nuclear Technology - Radiation Protection, NCSR “Demokritos,” Aghia Paraskevi 153 10, Athens, Greece (e-mail: [email protected]). Publisher Item Identifier S 1083-4419(99)02296-7.

1083–4419/99$10.00 © 1999 IEEE

Fig. 1. The general structure of the HT ANN.

II. HARMONY THEORY ARTIFICIAL NEURAL NETWORKS (HT ANN’s)

The ANN proposed for sorting is derived from the harmony theory (HT) ANN [19]. The main characteristics of the HT ANN, shown in Fig. 1, are as follows.

Relating to construction

1) The nodes assume binary activation values and are partitioned into two groups, each of which is arranged in a separate layer. The nodes of the lower layer assume activation values of $\{-1, +1\}$ and represent the elements of the problem to be solved, while those of the upper layer assume activation values of $\{0, +1\}$ and represent the constraints between the elements.

2) HT is furnished with a training phase whereby the connections and the accompanying weights evolve over time; the alternative of direct construction also exists. In both cases, the connections are symmetric and only allowed between nodes of different layers. The weights of the connections serve to enforce the constraints between the elements of the problem.

Relating to the mode of operation

3) The HT ANN state is described by the activation values of its nodes; these values collectively represent the current problem state and, consequently, the proposed solution. The consensus (harmony) function associated with the HT ANN assigns a harmony value to every state corresponding to its fitness in constituting a solution of the problem; the states where the harmony function is maximized (harmony maxima states) correspond to the optimal solutions. The harmony value is raised when the activation values assigned to the HT ANN nodes (for more details on the harmony function the reader is referred to [19] and [21]):
a) maximize the number of active nodes of the upper layer,
b) render the active nodes of the upper layer compatible (i.e., the signs of the nonzero weights connecting all the active nodes of the upper layer with the same node of the lower layer must be identical; this must hold for all the nodes of the lower layer), and
c) render the nodes of the lower layer compatible (i.e., the signs of the activation values of the nodes of the lower layer and the signs of the weights of the connected active nodes of the upper layer must coincide).

4) The HT ANN employs simulated annealing in order to settle (reach a solution). During simulated annealing, the HT ANN performs a number of network updates (stochastic updates of the activation values of its nodes) which collectively aim at maximizing harmony. Each update is executed at a specific value of the temperature parameter $T$, which is decremented before the next update. The stochastic nature of the updates causes nonmonotonic ascent in terms of harmony which, in turn, ensures that the network settles at a global harmony maximum state (optimal solution).

5) During simulated annealing, each network update is performed in two passes for a specific value of the temperature parameter $T$. The first pass comprises the stochastic update of the activation values of the nodes of the upper layer, while the second pass comprises the stochastic update of the activation values of the nodes of the lower layer. For the first network update only, an initial random assignment of $\{-1, +1\}$ activation values to the nodes of the lower layer is made. Assuming that the HT ANN consists of $x$ and $y$ nodes in the upper and lower layer, respectively, the first pass is described by (1)–(3). The total input $I_i$ ($i = 1, 2, \ldots, x$) from all the nodes of the lower layer to each node of the upper layer is calculated:

$$I_i = \sum_{j=1}^{y} w_{ij}\,\mathrm{actl}_j - k \tag{1}$$

where $w_{ij}$ denotes the weight of the connection between the $i$th node of the upper layer and the $j$th node of the lower layer ($j = 1, 2, \ldots, y$); $\mathrm{actl}_j$ denotes the current activation value of the $j$th node of the lower layer and $k$ denotes the threshold parameter of HT. $k$ is crucial in guiding the HT ANN toward a state of maximum harmony: it specifies the required degree of agreement between the connections of a node of the upper layer and the activation values of the connected nodes of the lower layer in order for the node of the upper layer to raise harmony by becoming active (for more details the reader is referred to [19] and [21]). $I_i$ is processed through a sigmoid nonlinearity, producing a quantity $P$ between 0 and 1

$$P = \frac{1}{1 + e^{-I/T}} \tag{2}$$

where $I = I_i$. The new activation value $\mathrm{actu}_i$ is produced by comparing $P$ with a random number $R$ in the range $[0, +1]$

$$\mathrm{actu}_i = \begin{cases} +1, & \text{if } P > R \\ 0, & \text{otherwise.} \end{cases} \tag{3}$$
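The first pass in (1)–(3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `upper_layer_pass`, the list-of-rows weight layout and the overflow guard are hypothetical choices made here, not part of the paper.

```python
import math
import random


def upper_layer_pass(weights, actl, k, T, rng=random):
    """One stochastic first pass: update every upper-layer node per (1)-(3).

    weights[i][j] is w_ij, actl[j] is the lower-layer activation (-1 or +1),
    k is the HT threshold and T the current temperature (assumed T > 0).
    """
    actu = []
    for w_row in weights:
        # (1): total input from the lower layer minus the threshold k
        total = sum(w * a for w, a in zip(w_row, actl)) - k
        # (2): sigmoid at temperature T; the guard avoids float overflow
        z = -total / T
        p = 0.0 if z > 700 else 1.0 / (1.0 + math.exp(z))
        # (3): compare P with a uniform random number R
        actu.append(1 if p > rng.random() else 0)
    return actu
```

At a low temperature the sigmoid is sharp, so a node whose input clearly exceeds the threshold becomes active almost surely, while a node with clearly negative input stays inactive.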

The second pass is described by (2), (4), and (5). The total input $I_j$ ($j = 1, 2, \ldots, y$) from all the nodes of the upper layer to each node of the lower layer is calculated:

$$I_j = 2 \sum_{i=1}^{x} w_{ij}\,\mathrm{actu}_i \tag{4}$$

which is again processed as shown in (2) for $I = I_j$. The new activation value $\mathrm{actl}_j$ is given by

$$\mathrm{actl}_j = \begin{cases} +1, & \text{if } P > R \\ -1, & \text{otherwise.} \end{cases} \tag{5}$$
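A matching sketch of the second pass, (4), (2) and (5), is given below. As before, the names `sigmoid` and `lower_layer_pass` and the data layout are assumptions made for illustration only.

```python
import math
import random


def sigmoid(total, T):
    """Equation (2), with a guard against float overflow (assumes T > 0)."""
    z = -total / T
    return 0.0 if z > 700 else 1.0 / (1.0 + math.exp(z))


def lower_layer_pass(weights, actu, T, rng=random):
    """One stochastic second pass: update every lower-layer node per (4), (5).

    weights[i][j] is w_ij and actu[i] is the upper-layer activation (0 or +1).
    """
    n_upper = len(weights)
    n_lower = len(weights[0])
    actl = []
    for j in range(n_lower):
        # (4): twice the weighted sum over the upper-layer nodes
        total = 2 * sum(weights[i][j] * actu[i] for i in range(n_upper))
        p = sigmoid(total, T)
        # (5): lower-layer nodes take +1 or -1, never 0
        actl.append(1 if p > rng.random() else -1)
    return actl
```

Note that no threshold appears in this pass, and that an inactive upper-layer node (activation 0) contributes nothing to the input of the lower-layer nodes it is connected to.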

The comparisons in (3) and (5) are responsible for the stochastic nature of the activation updates of the nodes of both layers. The interested reader is referred to [21] for more details on the effect of $T$ on the sharpness of the sigmoid in (2) and, subsequently, on the monotonicity of the ascent in terms of harmony. Directly constructed HT ANN’s have been employed in constraint-propagation tasks (electrical circuit operation, scene analysis via line labeling) [19], [20], in optimization tasks (map coloring, the n-queens problem, graph planarization, satellite broadcasting) [21]–[26] and, more recently, in clustering tasks [27]. It has been demonstrated that, for appropriate values of the HT parameters, an optimal solution of the problem is invariably settled upon. Additionally, if more than one optimal solution exists, each of them is settled upon with roughly equal probability.

$$w_{ij} = \begin{cases} +1, & \text{if } n_i > n_j \\ -1, & \text{if } n_i < n_j \\ A, & \text{if } n_i = n_j \end{cases} \qquad (i, j = 1, 2, \ldots, N) \tag{6}$$

where the value of $A$ depends on the desired order of sorting.

the first update determines the min and max elements of the sequence.

5′) Each network update comprises a single deterministic pass of activation. The activation values of the nodes of the lower layer $\mathrm{actl}_j$ ($j = 1, 2, \ldots, N$) are uniformly clamped to $+1$ or $-1$ depending on whether the desired order of sorting is descending or ascending, respectively. Owing to clamping, the second pass of activation is obliterated. Concerning the first pass, (2) and (3) are substituted by

[Equation (8): deterministic update of $\mathrm{actu}_i$; the body of the equation is not recoverable from the source.]

where the total input

$$I_i = \mathrm{Comp}_i - k_m \tag{9}$$

is decomposed into a constant term $\mathrm{Comp}_i$ ($i = 1, 2, \ldots, N$) and the variable term $k_m$. $\mathrm{Comp}_i$ equals

$$\mathrm{Comp}_i = \sum_{j=1}^{N} w_{ij}\,\mathrm{actl}_j = \begin{cases} \sum_{j=1}^{N} w_{ij}, & \text{if descending order} \\ -\sum_{j=1}^{N} w_{ij}, & \text{if ascending order} \end{cases} \tag{10}$$

while $k_m$ ($m = 1, 2, \ldots, N$) stands for the progressively decreasing threshold parameter employed at the network updates and is given by

$$k_m = N + 1 - 2m \tag{11}$$

such that the required degree of agreement in order for a node of the upper layer to become active is reduced at each update. As a result of (8)–(11), the nodes of the upper layer assume activation values of $+1$ in the order in which the corresponding elements $n_i$ appear in the sorted sequence. In the event of ties between $l$ elements at a particular value of $k_m$, these elements are added to the sorted sequence at the same network update, while the threshold parameter jumps $l$ consecutive values for the next network update ($k_{m+l}$). If $M$ distinct elements appear in the given sequence, exactly $M$ network updates are required for producing the entire sorted sequence.
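The update schedule described by (6) and (9)–(11) can be illustrated with a short Python sketch. Two details are not recoverable from the text above and are therefore assumptions made here: the tie weight is taken as $A = +1$ for descending and $A = -1$ for ascending order, and a node is taken to become active when $I_i > 0$. The function name `ann_sort` is hypothetical, and removing already-active nodes from further updates is one reading of the "gradual shrinking" mentioned in the abstract.

```python
def ann_sort(elements, descending=True):
    """Return the tie groups in sorted order, one group per network update."""
    n = len(elements)
    clamp = 1 if descending else -1   # actl_j clamped to +1 / -1 (from the text)
    a = clamp                         # ASSUMPTION: tie weight A = +1 / -1

    def weight(i, j):                 # connection weight w_ij, as in (6)
        if elements[i] > elements[j]:
            return 1
        if elements[i] < elements[j]:
            return -1
        return a

    # Constant term Comp_i of (10): sum_j w_ij * actl_j with all actl_j equal.
    comp = [clamp * sum(weight(i, j) for j in range(n)) for i in range(n)]

    groups = []
    remaining = list(range(n))
    m = 1
    while remaining and m <= n:
        k_m = n + 1 - 2 * m           # threshold schedule (11)
        # ASSUMPTION: a node becomes active when I_i = Comp_i - k_m > 0, see (9).
        active = [i for i in remaining if comp[i] - k_m > 0]
        if active:
            groups.append([elements[i] for i in active])
            remaining = [i for i in remaining if i not in active]
        m += max(len(active), 1)      # l tied elements skip l threshold values
    return groups
```

For the sequence 2, 5, 2, 7 in descending order, $N = 4$ and the constant terms are $\mathrm{Comp} = (0, 2, 0, 4)$. The thresholds $k_1 = 3$, $k_2 = 1$ and $k_3 = -1$ activate 7, then 5, then both copies of 2 together, so three updates suffice for the three distinct values, and the first update alone yields the maximum.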