Medical Engineering & Physics 21 (1999) 329–341 www.elsevier.com/locate/medengphy

Automatic synthesis of synergies for control of reaching: hierarchical clustering

Milan Jovović a, Slavica Jonić a,b, Dejan Popović a,b,*

a Faculty of Electrical Engineering, University of Belgrade, Bulevar Revolucije 73, 11000 Belgrade, Yugoslavia
b Center for Sensory Motor Interaction, Frederik Bajers Vej 7, D-3, Aalborg University, Aalborg 9220, Denmark

Received 2 December 1998; accepted 2 July 1999

Abstract

In this paper we describe a novel method for determining synergies between joint motions in reaching movements by hierarchical clustering. A set of recorded elbow and shoulder trajectories is used in a learning algorithm to determine the relationships between angular velocities at the elbow and shoulder joints. The learning algorithm is based on optimal criteria for obtaining the hierarchy of descriptions of movement trajectories. We show that this method finds complex synergism between optimal joint trajectories for a given set of data and angular velocities at the shoulder and elbow joints. Three other machine learning techniques (ML) are used for comparison with our method of hierarchical clustering of trajectories. These MLs are: (1) radial basis functions (RBF), (2) inductive learning (IL), and (3) adaptive-network-based fuzzy inference system (ANFIS). Better error characteristics were obtained using the method of hierarchical clustering than with the other techniques. The advantage of hierarchical clustering over the other MLs is that it integrates the spatial and temporal elements of reaching movements. Determination and analysis of spatio-temporal events of movement trajectories is a useful tool in designing control systems for functional electrical stimulation (FES) assisted manipulation. © 1999 IPEM. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Clustering; Machine learning; Motor control; Reaching; Rehabilitation; Synergy

* Corresponding author. Tel.: +45-96358726; fax: +45-98154008. E-mail address: [email protected] (D. Popović)

1. Introduction

The spectrum of functional motions that a human can master during a lifetime is impressive. Most of these movements are learned in early childhood, but the repertoire can be extended on a daily basis if required. According to present knowledge this is possible because each functional motion relies upon perceptuo-motor coordination, which involves three major components: the intake of sensory information, the internal coding of this information into a format that is adequate for driving a motor action, and the generation of the movements themselves. This description is consistent with the findings of several scientists on the organization of motor control with respect to the activity of neural cells within the premotor and motor cortex (e.g. [1,2]).

The neurophysiological studies of Georgopoulos et al. [1–3] are of special interest for understanding the control of reaching movements. These studies show that single nerve cells in the motor cortex are excited prior to movements in some directions and inhibited prior to movements in other directions, and therefore have some predictive value. These results suggest that the motor cortex is concerned with the general planning of the direction of movement, rather than with the details of the load or of the muscles that will have to be activated to produce a desired end point. Reaching movements are produced when the decision has been taken to approach a target. Clearly, these signals must be transformed to produce the right movements under various conditions, but the mechanisms underlying these transformations remain unknown. Some aspects must arise from the pattern of anatomical connections from the motor cortex to the spinal cord, and others from the variety of inputs that impinge on the motor neurons from sensory pathways and from other brain centres involved in the control of movement. Biological systems control the direction of movements at different levels of complexity, from accurately planned movements to reflexes [2].

1350-4533/99/$ - see front matter. © 1999 IPEM. Published by Elsevier Science Ltd. All rights reserved. PII: S1350-4533(99)00058-2


Some subjects with tetraplegia retain control of their shoulder movements and need assistance in controlling the elbow joint in order to direct the hand properly while reaching. In able-bodied subjects these movements are fully automatic, and feedforward control ensures the synergistic activity of many muscles. A synergistic control paradigm between joint rotations was proposed in [4] for assisting reaching tasks by functional electrical stimulation (FES) of the elbow extensor muscles. In that paper a linear form of covariation of the angular velocities of the shoulder and elbow joints is used to simplify the design of the control system for reaching. The relation between the angular velocities of the joints during reaching movements is, however, highly nonlinear.

The aim of this research is to investigate the use of machine learning techniques (MLs) in determining the synergistic behaviour between the joints. In this paper, hierarchical clustering of joint trajectories is proposed as a method for acquiring knowledge based on spatio-temporal features that represent the nonlinear relationship between the angular velocities at the shoulder and elbow joints. The hierarchy of descriptions of movement trajectories forms the rule base of a nonanalytical control system for reaching. The advantage of this method over the other MLs we tried is that it integrates the spatial and temporal elements of reaching movements.

Neural network architectures have been used [5–7] to represent and generate aiming movements of a limb. Massone and Bizzi [5] used three-layer sequential networks to learn a sensory-motor transformation that drives aiming movements. This model produces time trajectories of a limb from a starting posture toward targets specified by sensory stimuli. The network is adjusted to the bell-shaped velocity profile of the hand along the trajectories. They showed that the same task could not be learned by a linear network. A modular neural network architecture was used in [6] to learn a piecewise control strategy in a robot motion control task. The plant's parameter space is adaptively partitioned into a number of regions, and a different network learns a control law in each region. The authors showed that the performance of the modular architecture is superior to that of a single network in satisfying the requirements of nonlinear control system design. Generation of muscle stimulation patterns for the control of arm movements by neural networks was presented by Lan et al. [7]. They showed, by comparison, that a feedforward network combined with recurrent feedback and input time delays can most effectively capture the optimal temporal profiles of muscle stimulation. Only single-joint movements were considered in that paper.

In our work, we deal with the relationship between the angular velocities at the shoulder and elbow joints in order to control the reaching task. Experiments were performed to determine the synergies between the angular velocities at the joints by recording the angular coordinates while subjects performed reaching movements. Information from the history of the shoulder angular velocity was used as input to the three MLs: (1) RBF with an orthogonal least squares learning algorithm [8]; (2) IL based on minimizing entropy [9,10]; and (3) ANFIS with subtractive clustering for the initial identification of the fuzzy inference system and a neural network for tuning the fuzzy rules so obtained [11]. We used the MLs to map the change of the angular velocity at the elbow joint on output. The hierarchical clustering technique, on the other hand, uses the prediction information in the learning algorithm differently. The scale of computation of the spatio-temporal features of the joint angular velocity map is selected hierarchically, starting from the highest scale and gradually decreasing the scale of computation. We describe the method of hierarchical clustering in the next section. The techniques of RBF, IL and ANFIS are described elsewhere [12].

2. Method

In this work we consider the problem of the formation of trajectories from the standpoint of the complexity of the control scheme. Covariation of the set of trajectories is considered locally, in the time domain (time windows), and globally, by computing the joint variation of the set of angular velocities a′ and b′, where a′ is the angular velocity at the shoulder joint and b′ is the angular velocity at the elbow joint, as shown in Fig. 1. In the process, a two-dimensional set of curves is quantized in the time and amplitude domains by the method of hierarchical clustering, producing a hierarchy of descriptions of the trajectories [a′(t) b′(t)].

Fig. 1. Set up for recording joint angles a and b (from [4] with permission).


2.1. Probabilistic basis of technique of clustering

Clustering techniques are applied in many problems (such as pattern recognition, learning, source coding, image and signal processing [13,14]) where a priori knowledge about the distribution of the data is not available. Simply stated, the goal is to partition a given data set into categories such that the data points in each category are as compact as possible. This tool is widely used for analysing multi-dimensional data in diverse disciplines such as biology, social science and astronomy. The clustering problem is usually made mathematically precise by defining a cost (energy) function to be minimized.

In this paper a probabilistic framework is constructed, based on the principle of maximum entropy [15]. The algorithm is derived by analogy with the model of free energy, originally used in statistical mechanics for modelling complex systems. The deterministic annealing procedure is introduced by controlling the Lagrange multiplier t=1/T, which is inversely proportional to temperature in the physical analogy. The cost function so defined is nonconvex (as is the case with practically all useful cost functions [16]), which makes the clustering task a nonconvex optimization problem.

We used the model of free energy in processing the system of curves [a′(t) b′(t)]. In the algorithm we use the term group window for the time window over which the average levels of the amplitudes a′ and b′ are computed; the pair of average levels we call the group vector. The selection of groups is based on probabilistic considerations, where every data point inside a group window belongs to the group. We start from the largest scale, taking the whole time interval of the set of curves [a′(t) b′(t)] as the window W. The estimation of the group vector is formalized through the following algorithm. A free energy F is associated with the window W:

$$F(t) = -\frac{1}{t}\log Z,$$

where

$$Z = \sum_{W} e^{-t z^{2}}$$

and

$$z^{2} = (a'(w)-A)^{2} + (b'(w)-B)^{2}.$$

The vector [A B] is the group vector; in this case it represents the average values of the data [a′(w) b′(w)] in the given time window of computation W. The free energy is also an error function for the given distortion measure z². Minimizing the free energy with respect to the group vector [A B] is equivalent to minimizing the distortion of the signal represented by the vector [A B] inside the time interval W. The function Z is called the partition function. For small values of the scale parameter (t→0) the partition function Z is averaged out, because the measure of distortion is multiplied by a small weighting factor. As the scale parameter is monotonically increased, the partition function is shaped by the increasingly dominant energy factors tz². For a given scale t, we estimate the group vector by minimizing the free energy (finding the equilibrium point of the system of equations):

$$\begin{aligned}
\frac{\partial F}{\partial A} &= \sum_{W} -2\,(a'(w)-A)\,P = 0,\\
\frac{\partial F}{\partial B} &= \sum_{W} -2\,(b'(w)-B)\,P = 0,
\end{aligned} \tag{1}$$

where

$$P = \frac{e^{-t z^{2}}}{Z}.$$

The function P is the Gibbs probability distribution. The parameter t in P controls the degree of association of the data points with the group vector: as t increases, the association of a data point with the group vector falls off more sharply with its distance from the group vector. The system of Eq. (1) gives a fixed-point iteration:

$$\begin{aligned}
A &= \sum_{W} a'(w)\,P = g_{1}(t,A,B),\\
B &= \sum_{W} b'(w)\,P = g_{2}(t,A,B).
\end{aligned} \tag{2}$$
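The fixed-point iteration of Eq. (2) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the function names, the starting point (the plain average), the stopping tolerance and the sample values are all assumptions made for illustration. It uses only the definitions of z², Z and P given above.

```python
import math

def gibbs_weights(points, group, t):
    """Return the Gibbs weights P(w) = exp(-t z^2) / Z for every sample in the window."""
    A, B = group
    z2 = [(a - A) ** 2 + (b - B) ** 2 for a, b in points]
    # Shift by the smallest distortion before exponentiating (numerical
    # stability only); the common factor cancels in the ratio exp(.)/Z.
    z2_min = min(z2)
    e = [math.exp(-t * (z - z2_min)) for z in z2]
    Z = sum(e)
    return [x / Z for x in e]

def estimate_group_vector(points, t, tol=1e-10, max_iter=1000):
    """Iterate A = sum a' P, B = sum b' P (Eq. (2)) until the update is below tol."""
    n = len(points)
    # Assumed starting point: the plain average, which solves Eq. (2) exactly at t = 0.
    A = sum(a for a, _ in points) / n
    B = sum(b for _, b in points) / n
    for _ in range(max_iter):
        P = gibbs_weights(points, (A, B), t)
        A_new = sum(a * p for (a, _), p in zip(points, P))
        B_new = sum(b * p for (_, b), p in zip(points, P))
        if abs(A_new - A) + abs(B_new - B) < tol:
            return A_new, B_new
        A, B = A_new, B_new
    return A, B

# Hypothetical window of (shoulder, elbow) angular velocity samples.
window = [(1.0, 2.0), (2.0, 2.5), (3.0, 4.0), (4.0, 3.5)]
A0, B0 = estimate_group_vector(window, t=0.0)
```

At t=0 all weights are equal, so the group vector is the ordinary average of the window; for t>0 the samples closest to the current group vector receive the largest weights.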

The convergence characteristics of the map in Eq. (2) are equivalent to those of gradient descent,

$$Y \rightarrow Y - \frac{\partial F}{\partial Y}$$

for Y=[A B]. We check the stability of the map by computing its first-order derivatives:

$$1 - \frac{\partial^{2} F}{\partial Y^{2}} < 1. \tag{3}$$

From Eq. (3) it follows that for the map in Eq. (2) to be stable the Hessian of the free energy has to be positive definite. Note also that as long as the Hessian of the free energy is positive definite, we can apply the convex estimation of the group vector. The point in the scale at which the Hessian of the free energy loses its positive definiteness indicates the point in the iteration at which the nonconvex component becomes dominant in the estimation procedure. It is natural then that the group window splits in two along the principal component corresponding to the maximal singular value of the scatter matrix of the map (Eq. (2)), defined as:

$$S = \begin{bmatrix} \dfrac{\partial g_{1}}{\partial A} & \dfrac{\partial g_{1}}{\partial B} \\[2ex] \dfrac{\partial g_{2}}{\partial A} & \dfrac{\partial g_{2}}{\partial B} \end{bmatrix}.$$

The algorithm proceeds as follows: if the map becomes unstable (the Hessian of the free energy is no longer positive definite) we split the group window in two, along the principal component that corresponds to the maximal singular value of the scatter matrix. If the Hessian of the free energy is positive definite at the equilibrium point, we proceed by defining the variance function:

$$V = \sum_{W} z^{2} P.$$

We also have

$$\begin{aligned}
\frac{\partial V}{\partial A} &= (tV-1)\frac{\partial F}{\partial A} + 2t\sum_{W}(a'(w)-A)\,z^{2}P,\\
\frac{\partial V}{\partial B} &= (tV-1)\frac{\partial F}{\partial B} + 2t\sum_{W}(b'(w)-B)\,z^{2}P.
\end{aligned}$$

For ∂F/∂A=∂F/∂B=0, we move away from the equilibrium point in the direction of the maximal decrease of the variance function:

$$\begin{aligned}
dA &= -\frac{\partial V}{\partial A} = -2t\sum_{W}(a'(w)-A)\,z^{2}P,\\
dB &= -\frac{\partial V}{\partial B} = -2t\sum_{W}(b'(w)-B)\,z^{2}P,
\end{aligned}$$

and we update the scale parameter t as:

$$\Delta F' = dA^{2} + dB^{2}, \qquad \Delta t = \min_{W} \Delta F' > 0.$$

We continue next with the convex minimization of the free energy and the estimation of the group vectors. In the case that a new pair of group windows is created by splitting the original window in two, the estimation of the group vectors for these windows continues with separately defined maps as in Eq. (2).

Let us now analyse the entries of the scatter matrix of the map in Eq. (2). After a few lines of derivation we have

$$\frac{\partial g_{1}}{\partial A} = 2t\sum_{W}(a'(w)-A)^{2}P, \qquad \frac{\partial g_{2}}{\partial B} = 2t\sum_{W}(b'(w)-B)^{2}P,$$

and

$$\frac{\partial g_{1}}{\partial B} = 2t\sum_{W}(a'(w)-A)(b'(w)-B)P,$$

and, by the symmetry of the partial derivatives,

$$\frac{\partial g_{2}}{\partial A} = \frac{\partial g_{1}}{\partial B}.$$

The scatter matrix is, therefore, of the following form:

$$S = 2t\begin{bmatrix} \sum_{W}(a'(w)-A)^{2}P & \sum_{W}(a'(w)-A)(b'(w)-B)P \\[1ex] \sum_{W}(a'(w)-A)(b'(w)-B)P & \sum_{W}(b'(w)-B)^{2}P \end{bmatrix}.$$

Computationally, we check the stability of the map in Eq. (2) by evaluating the singular values of the scatter matrix S:

$$\max|\mathrm{sv}(S)| < 1. \tag{4}$$

The group for which the above condition is not satisfied goes through the process of phase transition. We split that group window in two according to the following rule: for every point w in the original window W, we decide which of the newly created windows the point will belong to by summing the projections of the error vectors along the principal component vector corresponding to the maximum singular value of the scatter matrix S in the neighbourhood of that point:

$$\sum_{i=-1}^{+1} e^{-t i^{2}}\,[e_{A}\ e_{B}]\,[E_{A}\ E_{B}]^{T} P = \begin{cases} > 0, & w \in W_{1} \\ < 0, & w \in W_{2}. \end{cases} \tag{5}$$

In the formula above, the vector [E_A E_B] is the principal component vector corresponding to the maximum singular value of the scatter matrix S, and [e_A e_B]=[(a′(w)−A) (b′(w)−B)] is the point error vector. Also note that t here plays the role of the temporal
extent of integration, and we use it consistently to build the discriminant function. The process of selection of groups therefore expresses a compromise between the accuracy of the computation of the group vectors and the density of the sampling by the group windows. In regions where the data covary more, the sampling uses smaller group windows; in relatively flat regions the group windows are larger.

3. Results

For recording the angular trajectories of the shoulder and elbow joints we used flexible goniometers (Penny & Giles, UK). The goniometers were connected through an A/D conversion board [17] and an RS-232 link to a PC. We recorded reaching movements in the horizontal plane for different starting and target positions of the hand. Before the recording session, the elbow goniometer was calibrated at full extension of the elbow (b=180°) and at b=90°. The shoulder goniometer was calibrated at full extension (a=0°) and at a=90°, as in Fig. 1. The angular trajectories [a(t) b(t)] of the shoulder and elbow joints were filtered and fitted polynomially before the first derivative was taken. The angular velocity trajectories that share the same starting and target positions of the hand were scaled to the same time interval. The bell-shaped profiles of the angular velocities were scaled such that the product of the maximum of the velocity profile and the time duration is kept constant. Fig. 2 shows the set of eight trajectories of angular velocities at the shoulder and elbow joints after the scaling operation.

We start the computation of the amplitude levels of the set of curves with the scale parameter t=0 and monotonically increase the scale parameter up to the point at which the process reaches the critical value of the scale parameter and new windows of computation are formed. Fig. 3 shows the angular velocity trajectories at the shoulder and elbow joints and the amplitude levels of the signal on reaching the critical value of the scale parameter tc=0.52×10⁻⁴. The original window is divided in two according to the rule given in Eq. (5); two group windows are thus formed after the phase transition at tc=0.52×10⁻⁴. One of the group windows is discontinuous, as can be seen in Fig. 4. The computation of the amplitude levels of the signal is continued up to the point at which one of the group windows reaches a phase transition at tc=1.24×10⁻⁴ and thus forms two new windows from the original group window.
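The stability test of Eq. (4), which triggers each of the phase transitions reported here, can be sketched numerically. The code below is an illustration under stated assumptions rather than the authors' implementation: the function names and sample values are hypothetical, and the singular values are obtained in closed form because S is a symmetric 2×2 positive semidefinite matrix, so its singular values coincide with its eigenvalues.

```python
import math

def scatter_matrix(points, group, t):
    """Entries (S11, S12, S22) of S = 2t * sum_W e e^T P, with S21 = S12."""
    A, B = group
    z2 = [(a - A) ** 2 + (b - B) ** 2 for a, b in points]
    z2_min = min(z2)  # shift for numerical stability; cancels in exp(.)/Z
    e = [math.exp(-t * (z - z2_min)) for z in z2]
    Z = sum(e)
    saa = sab = sbb = 0.0
    for (a, b), x in zip(points, e):
        p = x / Z  # Gibbs weight of this sample
        saa += (a - A) ** 2 * p
        sab += (a - A) * (b - B) * p
        sbb += (b - B) ** 2 * p
    return (2 * t * saa, 2 * t * sab, 2 * t * sbb)

def largest_singular_pair(S):
    """Largest singular value of the symmetric 2x2 matrix S and its unit direction."""
    s11, s12, s22 = S
    mean = 0.5 * (s11 + s22)
    radius = math.hypot(0.5 * (s11 - s22), s12)
    lam = mean + radius  # larger eigenvalue; equals max |sv(S)| for this S
    if s12 != 0.0:
        vx, vy = lam - s22, s12  # solves (S - lam I) v = 0
    else:
        vx, vy = (1.0, 0.0) if s11 >= s22 else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return lam, (vx / norm, vy / norm)

def is_stable(points, group, t):
    """Eq. (4): the map is treated as stable while max |sv(S)| < 1."""
    lam, _ = largest_singular_pair(scatter_matrix(points, group, t))
    return lam < 1.0

# Hypothetical window of (shoulder, elbow) angular velocity samples.
window = [(1.0, 2.0), (2.0, 2.5), (3.0, 4.0), (4.0, 3.5)]
diagonal = [(-1.0, -1.0), (1.0, 1.0)]
lam, direction = largest_singular_pair(scatter_matrix(diagonal, (0.0, 0.0), 0.1))
```

The returned direction is the principal component along which the window would be split; the assignment of individual samples to the two new windows by Eq. (5) is not reproduced here.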

Fig. 2. Set of angular velocity trajectories of the shoulder and elbow joints for the same starting and target positions of the hand scaled to the same time interval.

Three group windows are used for the computation of the amplitude levels of the signal after one of the groups passes through the phase transition at tc=1.24×10⁻⁴. The value tc=2.77×10⁻⁴ marks the phase transition of one of the groups and the formation of four group windows (Fig. 5). Fig. 6 shows the computation of the amplitude levels of the signal with four group windows on reaching the critical value tc=2.96×10⁻⁴ for one of the group windows. In the hierarchy of scale computation, five group windows are formed after the phase transition at tc=2.96×10⁻⁴, up to the critical value tc=3.82×10⁻⁴ (Fig. 7). The optimization of the set of joint angular velocity trajectories with the amplitude levels of the signal is represented by the eight group windows shown in Fig. 8. One of the group windows, represented by the amplitude levels [A B]=[27.83 31.28], is discontinuous.

For the error function of the computation we use the logarithmic function of free energy adjusted to the zero level. The zero level adjustment coincides with the case when all data points are equal to the group vector:

$$E = -\frac{1}{t}\,(\log Z + \log N),$$

where N is the number of data points. Fig. 9 shows the error of computation by hierarchical clustering.

Fig. 3. Set of angular velocity trajectories at the shoulder and elbow joints and the amplitude levels of the signal while reaching the critical value of the scale parameter tc=0.52×10⁻⁴.

Fig. 4. Set of angular velocity trajectories at the shoulder and elbow joints and the amplitude levels of the signal while reaching the critical value of the scale parameter tc=1.24×10⁻⁴.

Fig. 5. Set of angular velocity trajectories of the shoulder and elbow joints and the amplitude levels of the signal while reaching the critical value of the scale parameter tc=2.77×10⁻⁴.

Fig. 6. Set of angular velocity trajectories of the shoulder and elbow joints and the amplitude levels of the signal while reaching the critical value of the scale parameter tc=2.96×10⁻⁴.

3.1. Comparison of results of the machine learning techniques for the control of reaching

The results of mapping the reaching movements by hierarchical clustering are shown in Fig. 10 for four different starting and target positions of the hand. The trajectories describe the movement of the hand from the starting position to the target and back to the starting position. Each row shows the trajectories and the results of mapping for the same starting and target positions of the hand in the workspace. The first column shows the trajectories of the angular velocity at the shoulder joint and the second column the trajectories of the change of angular velocity at the elbow joint. The results of clustering are shown as trajectories quantized with 8 cluster groups, which gives 8 time intervals and 2×8 amplitude levels.

Fig. 11 shows the results of mapping the reaching movements by RBF in comparison with the results of the clustering technique. As inputs of the RBF we used two signals of the angular velocity of the shoulder joint, advanced in time by 50 and 100 ms with respect to the signal of the change of the angular velocity at the elbow joint, which is mapped on output. The continuous output signal is uniformly quantized with 8 amplitude levels for comparison with the results of the clustering technique. The spread constant of the radial basis functions was chosen to be 20, and the network had 967 neurons in hidden layers. We trained the network using a sequence built from two arbitrarily chosen movements from each of the four sets shown in Fig. 10. Testing was done for all the movements. In Fig. 11 the movements from the first set are marked #1, from the second #2, and from the fourth #3. The results of mapping the movements from the third set are not shown because the set did not contain enough movements to test the trained network with movements not seen during training. The first two rows show the results of testing with movements seen during training. The last two rows show the results of testing the network with movements not seen during training. The diagrams in Fig. 11 show the largest mapping error with the RBF technique. For this reason they appear overstretched in comparison with those shown with the results of the clustering technique in Fig. 10.

The same input and output signals are used for mapping the reaching movements by IL and ANFIS as for the RBF. In the IL technique the input signal was divided into 10 fixed levels (potentials) for applying the algorithm of minimal entropy. For the ANFIS technique, the radius of the clusters was chosen to be 0.5. The comparisons of the results of IL and ANFIS with the results of the clustering technique are shown in Figs. 12 and 13, respectively. The continuous output signal of ANFIS is quantized in the same way as for RBF, and IL was set to give 8 discrete levels on its output. All parameters that had to be chosen for RBF, IL and ANFIS were chosen after numerous and tedious trials to obtain the best possible pattern mapping.
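The uniform 8-level quantization applied to the continuous RBF and ANFIS outputs can be illustrated as follows. This is only a sketch: the paper does not state how the bins were placed, so the choice of equal-width bins between the minimum and maximum of the signal, the use of bin centres as output levels, and the function name are assumptions.

```python
def quantize_uniform(signal, levels=8):
    """Map each sample to the centre of one of `levels` equal-width bins."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return list(signal)  # constant signal: nothing to quantize
    step = (hi - lo) / levels
    out = []
    for x in signal:
        # The maximum sample would fall into bin `levels`; clamp it into the top bin.
        k = min(int((x - lo) / step), levels - 1)
        out.append(lo + (k + 0.5) * step)
    return out

# Hypothetical continuous network output.
quantized = quantize_uniform([0.0, 0.5, 3.9, 4.0, 8.0])
```

Unlike the hierarchical clustering, this quantizer fixes the amplitude levels in advance and does not adapt the time intervals to the data.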


Fig. 7. Set of angular velocity trajectories of the shoulder and elbow joints and the amplitude levels of the signal while reaching the critical value of the scale parameter tc=3.82×10⁻⁴.

Fig. 8. Set of angular velocity trajectories of the shoulder and elbow joints and the amplitude levels of the signal while reaching the critical value of the scale parameter tc=5.1×10⁻⁴.

4. Discussion

In Popović [4] a nonanalytical hierarchical control system for reaching with FES was proposed. With this system it is possible to control the movements of the elbow joint of tetraplegic subjects who retain control of their shoulder. The central parameter of this control system is the scaling factor C of the angular velocities of the shoulder and elbow joints:

$$a' = C\,b'$$

It is shown in [18] that the parameter C depends only on the starting and target positions of the hand and therefore expresses an initial value that has to be known before the initiation of the movement. This work is focused on the design of the second level of this hierarchical control system, namely the design of the look-up table which contains the desired values of angular velocity and the change of angular velocity at the elbow joint and as a result gives the parameter of electrical stimulation of the elbow extensor muscles on the output. The aim is to direct the hand properly during reaching movements, together with the desired velocity profile.

Fig. 9. The error of computation by hierarchical clustering.

Fig. 10. Results of mapping the reaching movements from four sets of movements by hierarchical clustering. The signals are normalized in amplitude: 3.59 rad/s (shoulder), 6.88 rad/s (elbow) for set #1; 2.22 rad/s (shoulder), 7.77 rad/s (elbow) for set #2; 8.85 rad/s (shoulder), 5.92 rad/s (elbow) for set #3; 6.18 rad/s (shoulder), 7.13 rad/s (elbow) for set #4.

The method of hierarchical clustering used in this work is based on the computation of covariation of the data for every step in the hierarchy of the scales of computation. Deterministic annealing is used with the aim of uniting the best characteristics of stochastic and deterministic optimization techniques. On the one hand, it is deterministic, which means it does not optimize the system of curves by generating random moves. On the other hand, it is still an annealing method, which means it aims at the global minimum instead of going to the nearest local minimum. As in the method of stochastic relaxation, this method makes incremental progress in the optimization of the curves for every incremental change of the scale parameter (temperature).

The algorithm of deterministic annealing incorporates the level of noise in the energy function. This effectively gives a family of energy functions indexed by the temperature parameter, which represents the level of noise. The family of energy functions is convex at high temperature and becomes nonconvex as the temperature decreases. For t=0 (T→∞) the energy function is convex and the global minimum can easily be located with a conventional method of convex optimization. Methods of deterministic annealing use conventional methods of convex optimization to follow the global minimum located at high temperature through every step of the temperature cooling schedule. In the method of hierarchical clustering developed in this work, convex optimization of the error function is achieved by the adaptive selection of time intervals of computation. The scale parameter t controls the association of the data with the amplitude levels of the signal. The error of computation is smaller in flat regions of the system of curves than in regions where the curves covary more. As a result, the selected group windows are larger in flat regions than in regions of stronger variation of the data. This can be seen in Figs. 3–8, which show the process of hierarchical clustering of the data.

Fig. 11. Comparison of results of mapping the reaching movements by RBF and hierarchical clustering (dotted line: desired signal; dashed line: clustering; solid line: RBF).

The process of phase transitions in the algorithm is represented by the sequence of formation of new group windows at the critical values of the scale parameter. The group window of the group for which the condition of phase transition is reached is divided into two newly created group windows. In this way we achieve a minimax optimization of the energy function. During temperature cooling two effects occur at the same time. On the one hand, with cooling we compute the group vectors progressively more accurately. On the other hand, the group vector maps become progressively more unstable with the increase of the nonconvex component of the energy function. The annealing mechanism we use is based on the variance function for the given window of computation. If the change of the variance with respect to the group vector is larger, we need a larger change in the scale of computation. If the change of the variance with respect to the group vector is smaller, we make a more precise computation of the group vectors with a smaller change in the scale of computation.
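The variance-driven annealing step described above can be sketched with the expressions for dA, dB and ΔF′ from Section 2.1. The code is a hedged illustration: the function names and sample data are hypothetical, and reading the update rule as "take the smallest positive ΔF′ over the current group windows" is an interpretation of the printed formula, not something the paper states explicitly.

```python
import math

def variance_step(points, group, t):
    """Return (dA, dB) = -(dV/dA, dV/dB) at an equilibrium point, with V = sum_W z^2 P."""
    A, B = group
    z2 = [(a - A) ** 2 + (b - B) ** 2 for a, b in points]
    z2_min = min(z2)  # shift for numerical stability; cancels in exp(.)/Z
    e = [math.exp(-t * (z - z2_min)) for z in z2]
    Z = sum(e)
    dA = dB = 0.0
    for (a, b), z, x in zip(points, z2, e):
        p = x / Z  # Gibbs weight of this sample
        dA -= 2 * t * (a - A) * z * p
        dB -= 2 * t * (b - B) * z * p
    return dA, dB

def scale_increment(windows, groups, t):
    """Assumed reading of the update: smallest positive dA^2 + dB^2 over all windows."""
    steps = []
    for pts, grp in zip(windows, groups):
        dA, dB = variance_step(pts, grp, t)
        dF = dA * dA + dB * dB
        if dF > 0.0:
            steps.append(dF)
    return min(steps) if steps else None

# Hypothetical windows: one symmetric about its group vector, one skewed.
symmetric = [(-1.0, 0.0), (1.0, 0.0)]
skewed = [(-1.0, 0.0), (2.0, 0.0)]
dA, dB = variance_step(skewed, (0.0, 0.0), 0.1)
dt = scale_increment([symmetric, skewed], [(0.0, 0.0), (0.0, 0.0)], 0.1)
```

A window whose samples are distributed symmetrically about its group vector contributes no step, while a skewed window yields a larger variance gradient and hence a larger change of scale, which matches the qualitative behaviour described in the text.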


Fig. 12. Comparison of results of mapping the reaching movements by IL and hierarchical clustering (dotted line: desired signal; dashed line: clustering; solid line: IL).

In the process of phase transition, the group window is divided in two according to the rule given in Eq. (2). Amplitude levels in the newly formed windows can significantly differ from those in the original widow. Group vector maps compute amplitude levels in the domain of the newly created windows of computation. Originally the group window is divided along the principal component vector corresponding to the maximum singular vector of the scatter matrix. The amplitude levels of the signal computed in the newly formed windows are located on the opposite sides of the amplitude levels that correspond to the joint window. This effect can be seen in Figs. 3–7 that show the hierarchy of formation of the groups (group windows and corresponding amplitude levels). In formula (2) the zero point on the direction of the maximum principal component corresponds to the group vector of the original group. Fig. 9 shows the error function of computation. During the process of phase transitions new groups are formed and new group vectors are mapped from the data.

The error function shows regions of linear decrease during the annealing of the energy function, as well as points of abrupt decrease that correspond to the phase transitions. The newly formed windows are smaller than the original window, and the corresponding group vector maps become more effective on the new windows of computation, which is reflected in the abrupt decrease in the error of computation. Comparisons of the results of mapping the reaching movements by hierarchical clustering with those of RBF, IL, and ANFIS are shown in Figs. 11–13, respectively. Better error characteristics are obtained with hierarchical clustering than with the other techniques. The error of mapping by hierarchical clustering is uniformly distributed over the whole time interval of movement, which can best be seen in Fig. 10. This is because the group parameters in this technique are obtained by computing global information about the covariation of the angular velocity of the shoulder and the change of angular velocity of the elbow joint, through the hierarchical formation of groups. The error of mapping with RBF and ANFIS is most pronounced in the border regions of the time interval of movement. This is due to the so-called limited aperture problem of these techniques: the limited receptive field of the radial basis functions and the local character of the clustering algorithm used in ANFIS are the limiting factors in mapping the data at the borders of the time interval of movement. IL showed the worst mapping error, even though it is based on an algorithm that computes the error signal globally, over the whole time interval.

Fig. 13. Comparison of results of mapping the reaching movements by ANFIS and hierarchical clustering. (• • • desired signal; - - - clustering; ——— ANFIS).

However, we did not make a numerical comparison of the mappings of RBF, IL, and ANFIS with that of the clustering technique, because the techniques use the input signal differently and their outputs are interpreted differently. Namely, the most important characteristic of hierarchical clustering is that it is a quantization technique in both the amplitude and time domains. We use this fact in building the rule-based nonanalytical control system for reaching. Quantization of trajectories is obtained by computing the joint covariation of the data. For the 8 groups selected, this means that we can generate trajectories with 24 optimally computed parameters (8 time intervals and 2×8 amplitude levels). This is the minimum number of parameters, and it is the result we use in building the look-up table of the hierarchical control system for reaching [4]. Hierarchical clustering gives a hierarchy of descriptions of movement trajectories, which also makes it possible to include an error feedback mechanism in a hierarchical control algorithm.
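The parameter count can be checked with a short sketch, which also shows one way in which such a quantized description could be expanded into sampled trajectories. It assumes that each amplitude level is held constant over its time interval; the interval lengths and amplitude levels are hypothetical and are not taken from the recorded data or from the look-up table of [4].

```python
def parameter_count(n_groups, n_signals=2):
    """One time interval per group plus one amplitude level per signal per group."""
    return n_groups + n_signals * n_groups

def build_trajectory(interval_lengths, amplitude_levels):
    """Expand (interval, level) pairs into a piecewise-constant sampled trajectory."""
    trajectory = []
    for length, level in zip(interval_lengths, amplitude_levels):
        # Hold the level constant for the whole interval (assumption).
        trajectory.extend([level] * length)
    return trajectory

# Hypothetical values: 8 interval lengths in samples and 8 (shoulder, elbow) levels.
interval_lengths = [5, 10, 10, 15, 15, 10, 10, 5]
amplitude_levels = [(10.0, 5.0), (30.0, 20.0), (55.0, 40.0), (70.0, 60.0),
                    (60.0, 65.0), (40.0, 45.0), (20.0, 25.0), (5.0, 10.0)]
trajectory = build_trajectory(interval_lengths, amplitude_levels)
print(parameter_count(len(interval_lengths)), len(trajectory))
```

With 8 groups and 2 signals the count is 8 + 2×8 = 24, in agreement with the figure given in the text.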

References

[1] Georgopoulos AP, Kalaska JF, Caminiti R, Massey JT. On the relations between the direction of two-dimensional arm movements and cell discharge in primate motor cortex. J Neurosci 1982;2:1527–37.
[2] Georgopoulos AP, Ashe J, Smyrnis N, Taira M. The motor cortex and the coding of force. Science 1992;256:1692–5.
[3] Georgopoulos AP, Caminiti R, Kalaska J, Massey JT. Spatial coding of movement: a hypothesis concerning the coding of movement direction by motor cortical populations. Exp Brain Res Suppl 1983;7:327–36.
[4] Popović D, Popović M. Tuning of a nonanalytical hierarchical control system for reaching with FES. IEEE Trans on BME 1998;40:1024–31.
[5] Massone L, Bizzi E. A neural network model for limb trajectory formation. Biol Cybern 1989;61:417–25.
[6] Jacobs RA, Jordan MI. Learning piecewise control strategies in a modular neural network architecture. IEEE Trans on Sys Man and Cybern 1993;23:337–45.
[7] Feng H-Q, Crago PE, Lan N. Neural network generation of muscle stimulation patterns for control of arm movements. IEEE Trans on Rehab Eng 1994;2:213–24.
[8] Chen S, Cowan CFN, Grant PM. Orthogonal least squares learning algorithm for radial basis function networks. IEEE Trans Neural Net 1991;2:302–9.
[9] Pitas I, Milos E, Venetsanopoulos AN. A minimum entropy approach to rule learning from examples. IEEE Trans Sys Man and Cybern 1992;22:621–35.
[10] Nikolić Z, Popović D. Control of locomotion: minimum entropy algorithm for capturing knowledge. J Autom Control 1996;6:81–94.
[11] Jang JS. ANFIS: adaptive-network-based fuzzy inference systems. IEEE Trans Sys Man and Cybern 1993;23:665–85.
[12] Jonić S, Janković T, Gajić V, Popović D. Three machine learning techniques for automatic determination of rules to control locomotion. IEEE Trans on BME 1999;46:300–10.
[13] Duda RO, Hart PE. Pattern classification and scene analysis. New York: Wiley, 1973.
[14] Jain AK, Dubes RC. Algorithms for clustering data. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[15] Jaynes ET. Information theory and statistical mechanics. In: Rosenkrantz RD, editor. Papers on probability, statistics and statistical physics. Dordrecht, The Netherlands: Kluwer, 1989.
[16] Wiggins S. Introduction to applied nonlinear dynamical systems and chaos. New York: Springer-Verlag, 1990.
[17] Popović M, Tepavac D. A portable 8 channel gait kinematics recording unit. In: Proc. of the XIV IEEE EMBS, Paris, 1992.
[18] Popović M. New method to control reaching movements in humans after spinal cord injury, Ph.D. thesis. University of Belgrade: Faculty of Electrical Engineering, 1995.