
Mobile Networks and Applications https://doi.org/10.1007/s11036-018-1008-0

Gesture Recognition Based on Kinect and sEMG Signal Fusion

Ying Sun 1,2 & Cuiqiao Li 1 & Gongfa Li 1,2 & Guozhang Jiang 1,2 & Du Jiang 2 & Honghai Liu 3 & Zhigao Zheng 4 & Wanneng Shu 5

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Abstract A weighted fusion method based on D-S evidence theory is proposed for decision making, aimed at the shortcomings of D-S evidence theory in trust distribution, data processing and precision. Gesture recognition methods based on Kinect and on sEMG signals are established. Weighted D-S evidence theory is then used to fuse the Kinect and sEMG signals, and simulation experiments are carried out for each method. The simulation results show that, compared with the other experimental methods, the decision fusion method based on weighted D-S evidence theory achieves higher utilization efficiency and a higher recognition rate.

Keywords Gesture recognition . D-S evidence theory . sEMG . Kinect . Signal fusion

1 Introduction

With the development of science and technology, human-computer interaction technology is gradually maturing. As a new type of human-computer interaction, gesture recognition has gradually become a focus of research [1]. For human-computer interaction, Microsoft released the Kinect device. It can not only capture gesture video, but also provide depth images and gray images that reflect the distance from each pixel to the lens, which gives it considerable research value. However, it has some limitations, such as the limited viewing angle of the camera and a small scanning range, so it is difficult to separate the hand from the background in complex environments [2]. In medical research, gesture recognition is achieved mainly by detecting bioelectrical signals of the human body, with the electromyogram (EMG) as the main subject [3]. EMG signals are generally collected by electrodes. Detection is harmless to the human body and the method is simple. However, surface electrodes pick up noise together with the signal, which makes the collected data more complex. Therefore, the signal must be processed during acquisition and analysis. Liu Yarui proposed a method of hand gesture recognition based on pseudo-Jacobi-Fourier moments. The method uses the Kinect depth sensor: the depth image is built from the depth information, and the corresponding color image is then obtained [4].

* Gongfa Li [email protected]
* Zhigao Zheng [email protected]
Ying Sun [email protected]
Cuiqiao Li [email protected]
Guozhang Jiang [email protected]
Du Jiang [email protected]
Honghai Liu [email protected]
Wanneng Shu [email protected]

1 Key Laboratory of Metallurgical Equipment and Control Technology of Ministry of Education, Wuhan University of Science and Technology, Wuhan 430081, China

2 Hubei Key Laboratory of Mechanical Transmission and Manufacturing Engineering, Wuhan University of Science and Technology, Wuhan 430081, China

3 School of Computing, University of Portsmouth, Portsmouth PO1 3HE, UK

4 School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China

5 College of Computer Science, South-Central University for Nationalities, Wuhan 430074, China


The defects of the above methods can be mitigated by newer information processing technology. As traditional single-sensor acquisition systems find it increasingly difficult to meet complex practical applications, multi-sensor fusion has attracted a great deal of research [5–7]. At present, multi-sensor information fusion technology is widely used in industrial production, intelligent production, medical physiological detection, artificial intelligence, human-computer interaction and other fields. The information collected by different sensors is analyzed and processed, and then fused by computer so that more comprehensive information can be obtained. Compared with single-sensor data processing, the information obtained from multiple sensors is multidimensional and comprehensive, which gives the acquisition system good stability. The temporal and spatial information in the sensor data can be better utilized, which increases the credibility of data acquisition and enhances the identification ability of the system [8, 9]. Information is collected by several sensors at the same time and processed jointly to eliminate errors in the system, so that a more reasonable evaluation of the detected object can be made and the correct judgment reached. In this paper, according to the different characteristics of sEMG signals and Kinect input, and taking into account their synchronous acquisition, the data collected by the two devices are fused. By combining the sEMG signal and the Kinect image signal, a more efficient gesture identification scheme is sought. Target recognition using multiple sensors has the following main advantages [10–12]:

(1) The temporal and spatial coverage of the identification system is extended, and the survivability of the identification system is improved.
(2) The advantages of each sensor can be fully exploited to improve the target recognition rate.
(3) Multiple sensors resist interference better than a single sensor, which reduces or eliminates deception and interference from non-target objects.
(4) The stability of the identification system is improved, and the validity, reliability and fault tolerance of the identification results are greatly improved.

2 Identification of upper limb movements based on Kinect

The color camera and depth sensor of Kinect are used to collect information. Many researchers have studied how to identify human movements in depth images. For example, filtering algorithms have been proposed to track human skeletons in depth images obtained with Kinect. However, the Kinect depth sensor alone cannot identify human gestures effectively and accurately. Compared with the depth sensor, the Kinect color camera provides high-precision image information and is well suited to gesture recognition in complex environments. There are many kinds of hand gestures. This paper focuses on the identification of discrete gesture movements. According to the available experimental equipment and conditions, and to the requirements of the recognition methods based on Kinect, sEMG signals and data fusion, a standard set of static gestures is predefined, as shown in Fig. 1. The gestures are divided into three types: hand movements, wrist movements and finger movements.

(1) Two hand movements: Hand Open (HO) and Hand Close (HC)
(2) Four wrist movements: Wrist Flexion (WF), Wrist Extension (WE), Wrist Pronation (WP) and Wrist Supination (WS)
(3) Four finger movements: Finger-Index (FI), Finger-Middle (FM), Finger-Ring (FR) and Finger-Little (FL)

The classification of these 10 gestures is verified by simulation for the Kinect-based method, the sEMG-based method and the data fusion method.

2.1 Graph segmentation and feature extraction for upper limb movements

The color images collected by Kinect are stored mainly in RGB space [13–15]. However, the RGB model is not suitable for all skin color models. Therefore, the RGB data are converted into HSV space before further processing [19–21]. According to the skin color segmentation algorithm, different people have basically the same value of the hue component H, that is, skin tones of different ethnic groups have similar hue, whereas the saturation component S differs noticeably. The HSV model can therefore be used for skin color segmentation [16–18]. The purpose of modeling upper limb movements is to find parameters that can represent different gestures, so that human upper limb movements can be classified more accurately. Concrete methods include the Fourier transform, the characteristic line method [22–24] and histogram-based data models [25, 26]. A gesture contour is made up of the set of pixels on the boundary of a gesture. These pixels contain a lot of useless information, such as redundancy and noise. Therefore, the image information needs to be processed when extracting the gesture contour [27–29]. The useless information is eliminated by polygonal approximation: the Douglas-Peucker (D-P) algorithm [30–32] is used to approximate the gesture contour by a polygon.
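As an illustration of the polygonal approximation step, the following is a minimal sketch of the Douglas-Peucker recursion on an open polyline of (x, y) points. The function names, the tolerance `epsilon` and the use of the distance to the chord segment are assumptions of this sketch and are not taken from the paper.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Simplify a polyline, dropping points closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= epsilon:
        return [first, last]          # every interior point is redundant
    # Keep the farthest point and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right


# Hypothetical noisy L-shaped contour fragment.
contour = [(0, 0), (1, 0.05), (2, 0), (2.05, 1), (2, 2)]
simplified = douglas_peucker(contour, epsilon=0.2)
```

A larger `epsilon` removes more boundary points. For a closed gesture contour, one common choice is to split the contour at two far-apart points and simplify each half.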

Fig. 1 Gesture sets

2.2 Classification and identification of upper limb movements

The sparse representation classification algorithm was proposed to solve the problem of face recognition. When the number of training samples is large, this method is computationally expensive. In real-time human-computer interaction, the processing time should be as short as possible. The KNN algorithm has good classification performance and does not require an explicit mathematical model. Its classification performance is stable and its classification effect is remarkable. In this paper, the KNN algorithm is used for classification [33, 34]. The acquired image is processed by skin segmentation, and the gesture contour is then segmented from the background image. The predefined gesture Finger-Index is taken as an example of gesture segmentation, as shown in Fig. 2. The image is captured at a resolution of 640 × 480. Because this resolution yields too many sampling points, sampling is carried out in two steps. In the first step, the sampling parameter is set to 8, which reduces the number of sampling points to 400. In the second step, the sampling parameter is set in the range 8 to 18, and 20 sets of experiments are carried out. 100 gesture images are collected by Kinect, of which 50 are used as training samples and the remaining 50 as test samples for silhouette segmentation of gestures. The D-P algorithm is applied and the Fourier descriptor of the resulting polygon is then computed. The first 8 descriptor values form the feature vector. A Bayes classifier can select appropriate feature values to classify the gestures and obtain the classification results [35–37]. Figure 3 shows the sample distribution of the wrist movements and finger movements. The three coordinate axes represent the values of three descriptors. These six kinds of gestures have good separability in feature space. However, individual samples of some actions overlap. These samples have some effect on the training of the classifiers. The final recognition rates are shown in Table 1.
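The feature extraction and classification described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the normalization of the Fourier descriptors (dropping the zeroth coefficient and dividing magnitudes by the first harmonic), the value k = 3 and all sample data are assumptions made here.

```python
import cmath
import math
from collections import Counter


def fourier_descriptors(contour, n_desc=8):
    """First n_desc normalized Fourier descriptor magnitudes of a closed contour.

    The contour is a list of (x, y) points read as complex numbers x + iy.
    Skipping coefficient 0 removes translation, dividing by |Z_1| removes
    scale, and taking magnitudes removes rotation and start-point effects.
    """
    n = len(contour)
    z = [complex(x, y) for x, y in contour]

    def coeff(k):
        return sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))

    scale = abs(coeff(1))
    if scale == 0.0:
        raise ValueError("degenerate contour: first harmonic is zero")
    return [abs(coeff(k)) / scale for k in range(2, 2 + n_desc)]


def knn_predict(train_x, train_y, query, k=3):
    """Label of the majority among the k nearest training vectors."""
    ranked = sorted(zip(train_x, train_y), key=lambda s: math.dist(s[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 12-point polygon standing in for a simplified gesture contour.
shape = [(0, 0), (2, 0), (4, 0), (4, 1), (3, 1), (2, 1),
         (2, 2), (2, 3), (1, 3), (1, 2), (1, 1), (0, 1)]
features = fourier_descriptors(shape)

# Hypothetical two-dimensional training vectors for two gesture labels.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["WF", "WF", "WF", "FI", "FI", "FI"]
label = knn_predict(train_x, train_y, [0.2, 0.1])
```

In practice the 8-dimensional descriptor vectors of the training images would replace `train_x`, and `k` would be tuned on held-out samples.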

Fig. 2 Segmentation image of gesture FI


Table 1 Pattern recognition rate based on Kinect

Gesture     HO    HC    WF    WE    WP    WS    FI    FM    FR    FL    ARR
Rate (%)    82    81    82    78    79    77    80    83    80    82    80.4

ARR represents average recognition rate

Fig. 3 Feature distribution of the three Fourier descriptors for six gesture movements

3 Human upper limb action recognition based on sEMG

3.1 Effective segment detection of EMG signal

With appropriately chosen parameters, a moving average of the mean square value of each signal is used to determine the starting and ending points of the corresponding action. The specific process is as follows:

(1) The collected raw signals are summed and averaged. The square of the processed signal is calculated, and the window size of the moving average is set to 60.
(2) A predefined amplitude is used as the threshold, and the processed signal is compared with it. Signals above the threshold are retained, and signals below the threshold are eliminated.
(3) The valid segment is determined, and signal data that are not in the active segment, or whose effective duration is too short, are eliminated.

3.2 Eigenvalue extraction of sEMG

Many feature extraction methods, such as AR model coefficients, have been widely used by many scholars [38, 39]. Time domain methods are simple to calculate. They can extract feature values during sEMG signal acquisition and meet real-time requirements [40]. The Mean Absolute Value (MAV) of the signal amplitude and the Auto Regressive (AR) model coefficients are selected as the features extracted from the sEMG of gesture movements. The sEMG signal can be regarded as an output signal, and the signal segment of a whole muscle movement can be viewed as the response to a special noise input. The sEMG signal can therefore be processed by a parametric model. The EMG signal has both deterministic and stochastic properties: the specific noise expresses its randomness, and the parameters of the model express its determinism. The AR model is used to analyze the sEMG signal, and the improved Burg algorithm is used to solve for the parameters of the AR model [41, 42].

3.3 Pattern recognition of upper limb movements

Data acquisition based on sEMG signals is carried out for the three predefined types of gesture, that is, hand movements, wrist movements and finger movements. The test subjects were healthy men with no history of muscle problems. The subjects had long experience of taking part in sEMG signal extraction experiments, so their movements were neither deformed nor non-standard, which eliminates this source of experimental error. During the same time period, 10 data acquisition experiments were carried out, and each movement was collected 20 times in each experiment. During the acquisition of EMG signals, the duration of each hand gesture is between 400 and 1200 ms. In this experiment, the feature values extracted from the sEMG signals are the MAV and the AR coefficients of the signals. They are extracted from the signals after band-pass filtering and effective segment detection. The features extracted from each signal are one MAV and three AR model coefficients. Therefore, from the sampled signal data, the MAV and AR model coefficients, two kinds of feature values, form twenty-four dimensional feature vectors. Figure 4 shows the gesture movements Hand Open, Hand Close, Wrist Flexion, Wrist Extension, Finger-Ring and Finger-Little: it is the feature distribution map of the MAV of the sEMG signal and the AR model coefficients after one acquisition process. As the figure shows, the gestures are well distinguishable. However, there is some mutual interference between the feature values of the two movements in which the thumb approaches the ring finger and the little finger. The cause lies in the arrangement of the electrodes: some of the muscles under the electrodes are clearly active in both movements, so the feature values of the two gestures interfere with each other in that region. From the above experiments, we can see that time-domain parameters such as the MAV and the AR model coefficients represent sEMG signals well and therefore yield higher accuracy in gesture classification. It is thus feasible to use these two kinds of feature values to describe the gestures in the experiment. Figure 4 shows clearly that different gestures differ distinctly in the same features, which can be used for gesture classification and recognition. Feature extraction of the gesture motions is carried out with the MAV and AR model described above. sEMG signals were collected from the participants, and 200 sets of feature values were obtained for each gesture, giving 2000 sets in total for each participant. For each action of each individual, 100 sets of feature values were selected as training samples, and the other 100 sets were retained as test samples.

Fig. 4 Feature distribution of six kinds of gestures

Fig. 5 Gesture action segmentation combining Kinect and sEMG

After the feature values are extracted from the sEMG signal, action recognition is performed. A linear Bayes classifier is selected as the classifier. Since there are ten predefined gestures, the prior probability of each class in the Bayes classifier is set to 1/10. The gesture motions are processed according to the training requirements of the classifier. The mean of the training samples is obtained, and the mean of each gesture is taken as the center of its data cluster. The test samples are substituted into the discriminant function, and are then classified and identified according to the decision rule. The recognition rates of the ten actions are shown in Table 2.

Table 2 Pattern recognition rate based on sEMG signals

Gesture     HO    HC    WF    WE    WP    WS    FI    FM    FR    FL    ARR
Rate (%)    67    66    60    63    64    62    60    58    61    62    62.3

ARR represents average recognition rate
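The effective segment detection of Section 3.1 and the MAV and AR features of Section 3.2 can be sketched as below. This is a hedged illustration only: the threshold, the minimum segment length and the function names are assumptions, the input is taken to be a single (or channel-averaged) signal, and the standard Burg recursion is used in place of the improved Burg algorithm, whose details the paper does not give.

```python
def moving_average_energy(signal, window=60):
    """Moving average of the squared signal over the last `window` samples."""
    squared = [s * s for s in signal]
    energy, running = [], 0.0
    for i, value in enumerate(squared):
        running += value
        if i >= window:
            running -= squared[i - window]
        energy.append(running / min(i + 1, window))
    return energy


def active_segments(signal, threshold, window=60, min_length=100):
    """(start, end) sample ranges whose smoothed energy exceeds the threshold."""
    energy = moving_average_energy(signal, window)
    segments, start = [], None
    for i, e in enumerate(energy):
        if e > threshold and start is None:
            start = i                      # energy rises above the threshold
        elif e <= threshold and start is not None:
            if i - start >= min_length:    # drop segments that are too short
                segments.append((start, i))
            start = None
    if start is not None and len(energy) - start >= min_length:
        segments.append((start, len(energy)))
    return segments


def mean_absolute_value(segment):
    """MAV feature: mean of the absolute sample values."""
    return sum(abs(s) for s in segment) / len(segment)


def burg_ar(segment, order=3):
    """AR coefficients a_1..a_p from the standard Burg recursion.

    Sign convention: x[n] + a_1 x[n-1] + ... + a_p x[n-p] = e[n].
    """
    forward = list(segment[1:])
    backward = list(segment[:-1])
    coeffs = [1.0]
    for _ in range(order):
        num = -2.0 * sum(f * b for f, b in zip(forward, backward))
        den = sum(f * f for f in forward) + sum(b * b for b in backward)
        k = num / den                      # reflection coefficient
        padded = coeffs + [0.0]
        coeffs = [c + k * r for c, r in zip(padded, reversed(padded))]
        forward, backward = (
            [f + k * b for f, b in zip(forward[1:], backward[1:])],
            [b + k * f for f, b in zip(forward[:-1], backward[:-1])],
        )
    return coeffs[1:]


def channel_features(segment, order=3):
    """One MAV plus `order` AR coefficients for one signal segment."""
    return [mean_absolute_value(segment)] + burg_ar(segment, order)
```

Calling `channel_features` on the active segment of each channel and concatenating the results gives a feature vector with four values per channel, which matches the one MAV and three AR coefficients per signal described above.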

4 Identification of upper limb movements based on information fusion

According to the multi-sensor fusion method and the ten gestures predefined in the experiment, each complete gesture movement is divided into stages: the start of the gesture, the duration of the movement, the relaxation of the hand, and the time to prepare for the next movement. The signal is cut to the effective segment, and the sEMG signal and the Kinect image signal are obtained simultaneously. The middle frame of the video of the effective gesture motion is extracted as the gesture image for subsequent processing. The specific process of gesture segmentation based on the combination of Kinect and sEMG signals is shown in Fig. 5. The sEMG signals from the surface electrodes and the image signals from the Kinect are acquired synchronously. Each is then processed by its own method: the sEMG signal goes through a series of processing steps to give the required feature vector, and the Kinect gesture signal goes through a series of processing steps in which the frame is extracted and its features are obtained. Finally, the different features acquired by the two detection methods are fused under the guidance of fusion theory to provide the data for motion recognition, and the final recognition result is obtained by the data fusion method.

(Fig. 5 diagram labels: ready, start, finish, recovery, end; active segment detection; active section; video sequence; frame extraction; first frame to frame n)
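The middle-frame selection can be illustrated with a short sketch. It assumes that the sEMG recording and the video start at the same instant, and the sampling rate of 1000 Hz and frame rate of 30 frames per second in the example are hypothetical values, not figures from the paper.

```python
def middle_frame_index(start_sample, end_sample, emg_rate_hz, video_fps):
    """Index of the video frame closest to the midpoint of an active sEMG segment."""
    mid_time_s = 0.5 * (start_sample + end_sample) / emg_rate_hz
    return int(round(mid_time_s * video_fps))


# Hypothetical active segment from sample 1000 to sample 2000.
frame = middle_frame_index(1000, 2000, emg_rate_hz=1000, video_fps=30)
```

If the two streams do not start together, the known time offset between them would have to be added to `mid_time_s` before converting to a frame index.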

Fig. 6 Belief interval (diagram labels: support interval of A, uncertainty interval of trust in A, non-support interval of A, Bel(A), Pls(A))

4.1 D-S evidence theory

D-S evidence theory [43–46] works within an identification framework θ. All possible mutually independent identification results or hypotheses about a proposition are defined within this framework. The set of all subsets of θ is called the power set of θ, denoted Ω(θ). In gesture recognition, if the class of a sample may be identified as x, y or z, the 'identification framework' and 'power set' are defined as follows:

$$\theta = \{x, y, z\} \tag{1}$$

$$\Omega(\theta) = \{\phi, \{x\}, \{y\}, \{z\}, \{x, y\}, \{x, z\}, \{y, z\}, \{x, y, z\}\} \tag{2}$$

Here ϕ refers to the situation in which the answer is none of x, y or z and may be of some other type. The subset {x, y} means that the answer could be x or y, and similarly for the other subsets. When θ has N elements, Ω(θ) has 2^N elements, of which 2^N − 1 are non-empty. For the defined 'recognition framework' θ, a function m on the power set is a mapping from Ω(θ) to [0, 1] that satisfies two conditions:

$$m : \Omega(\theta) \to [0, 1] \tag{3}$$

$$m(\phi) = 0, \qquad \sum_{A \subseteq \theta} m(A) = 1 \tag{4}$$

Here m is the basic probability assignment function. When A = θ, m(A) is the part of the belief that cannot be assigned to any smaller subset, that is, the uncertainty. When A is a subset of θ and m(A) ≠ 0, A is called a focal element of m. If θ is a 'recognition framework' and m : Ω(θ) → [0, 1] is a basic probability assignment function on it, the belief function Bel is defined as:

$$\mathrm{Bel} : \Omega(\theta) \to [0, 1], \qquad \mathrm{Bel}(A) = \sum_{B \subseteq A} m(B) \tag{5}$$

The plausibility function Pls is defined as:

$$\mathrm{Pls} : \Omega(\theta) \to [0, 1], \qquad \mathrm{Pls}(A) = 1 - \mathrm{Bel}(\bar{A}) = \sum_{B \cap A \neq \phi} m(B) \tag{6}$$

The belief function Bel(A) measures the degree to which the evidence definitely supports proposition A. The plausibility function Pls(A) measures the degree to which the evidence does not refute A, that is, the degree to which A may be true. If Bel(A) and Pls(A) are known, the belief interval of A is [Bel(A), Pls(A)], as shown in Fig. 6. This interval represents the uncertainty about the probability of A. The lower bound is the belief of the proposition, the minimal probability that the proposition holds based on the direct evidence of the sensor. The upper bound is the plausibility of the proposition, which combines the belief in the proposition with its potential possibility. The interval thus shows how much of the evidence supports the proposition, how much is unknown about the proposition, and how much refutes it. Suppose that Bel1 and Bel2 are belief functions from two types of sensors on the same 'recognition framework' θ, and that m1 and m2 are the corresponding basic probability assignment functions, with focal elements A1, A2, …, An and B1, B2, …, Bn respectively. According to the D-S orthogonal sum rule, the combined basic probability assignment function m : Ω(θ) → [0, 1] is given by:

$$m(C) = \begin{cases} 0, & C = \phi \\ \dfrac{\sum_{A_i \cap B_j = C} m_1(A_i)\, m_2(B_j)}{1 - K}, & C \neq \phi \end{cases} \tag{7}$$

$$K = \sum_{A_i \cap B_j = \phi} m_1(A_i)\, m_2(B_j) \tag{8}$$

Fig. 7 Flow chart of identification based on D-S evidence theory (diagram labels: Kinect and EMG sensor, each giving the proof range of a proposition; D-S combination law; logic decision; gesture class)

Fig. 8 Recognition rate of each recognition method under ten types of actions (legend: sEMG, Kinect, D-S, Weighted D-S; axis ticks 0 to 100 and 1 to 10)

It is important to select weights when the results are fused. The trust placed in each sensor is usually based on the data accuracy of that sensor under the same conditions. To account for the differences between sensor inputs, D-S evidence fusion is improved by weighting. By training on the sample data, the classification accuracy Pik of each sensor for hand gesture recognition can be calculated. Here Pik is the classification accuracy of sensor k for gesture class i. Since two sensors are used in this experiment, k takes the value 1 or 2, while i and j index the gesture classes.

$$m(C) = \begin{cases} 0, & C = \phi \\ \dfrac{\sum_{A_i \cap B_j = C} P_{i1}\, m_1(A_i)\, P_{j2}\, m_2(B_j)}{1 - K}, & C \neq \phi \end{cases} \tag{9}$$

where

$$K = \sum_{A_i \cap B_j = \phi} P_{i1}\, m_1(A_i)\, P_{j2}\, m_2(B_j) \tag{10}$$
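The combination rules of Eqs. (7) to (10), together with the belief and plausibility of Eqs. (5) and (6), can be sketched as follows. Mass functions are stored as dictionaries keyed by `frozenset`. All numerical masses are hypothetical, the example weights simply reuse the HO and HC recognition rates of Tables 1 and 2, and the final rescaling of the weighted result is an assumption of this sketch, since the weighted masses do not in general sum to 1.

```python
from itertools import product


def combine(m1, m2, w1=None, w2=None):
    """Combine two mass functions given as {frozenset: mass} dictionaries.

    Without weights this is the orthogonal sum of Eqs. (7) and (8).
    With weights {frozenset: P}, each product is scaled as in Eqs. (9) and (10).
    Returns (fused masses, conflict K).
    """
    w1, w2 = w1 or {}, w2 or {}
    fused, conflict = {}, 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        term = w1.get(a, 1.0) * mass_a * w2.get(b, 1.0) * mass_b
        common = a & b
        if common:
            fused[common] = fused.get(common, 0.0) + term
        else:
            conflict += term               # mass that falls on the empty set
    if not fused:
        raise ValueError("total conflict: the sources cannot be combined")
    fused = {c: v / (1.0 - conflict) for c, v in fused.items()}
    if w1 or w2:
        # Assumption of this sketch: rescale so the weighted masses sum to 1.
        total = sum(fused.values())
        fused = {c: v / total for c, v in fused.items()}
    return fused, conflict


def belief(m, a):
    """Bel(A): total mass of the focal elements contained in A, Eq. (5)."""
    return sum(v for b, v in m.items() if b <= a)


def plausibility(m, a):
    """Pls(A): total mass of the focal elements that intersect A, Eq. (6)."""
    return sum(v for b, v in m.items() if b & a)


# Hypothetical two-class frame and sensor outputs.
X, Y = frozenset({"x"}), frozenset({"y"})
THETA = X | Y
m_kinect = {X: 0.8, THETA: 0.2}
m_semg = {X: 0.6, Y: 0.3, THETA: 0.1}

fused, K = combine(m_kinect, m_semg)
interval_x = (belief(fused, X), plausibility(fused, X))

# Per-class accuracies used as weights (illustrative values only).
w_kinect = {X: 0.82, Y: 0.81}
w_semg = {X: 0.67, Y: 0.66}
weighted, K_weighted = combine(m_kinect, m_semg, w_kinect, w_semg)
decision = max(weighted, key=weighted.get)
```

The final decision here simply picks the subset with the largest fused mass. Other decision rules, for example based on the belief interval, are equally possible.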

Based on weighted D-S evidence theory, the process of human upper limb motion identification from sEMG signals and Kinect is the same as the general data fusion process. In practical applications, different types of sensor data differ greatly in precision, so the data collected by different sensors should be given different degrees of trust. The surface electrodes and the Kinect acquire the sEMG signal and the image signal at the same time. After sensor data processing, the feature values of the two kinds of sensors are obtained. Weighted D-S evidence fusion assigns a weight to each sensor to express the degree of trust in sensors of different precision. The results of feature fusion are analyzed logically, and the classification results are finally obtained. The specific flow chart is shown in Fig. 7.

To verify the effectiveness of the fusion method described in this paper for human upper limb motion recognition with Kinect and sEMG signals, the decision fusion algorithm based on D-S evidence theory and the fusion algorithm based on weighted D-S evidence theory are both tested experimentally. The experimental results are compared to determine the method with the higher recognition rate. The ten predefined gestures are collected and compared. The signals are collected in 5 groups of 20 gesture repetitions each, giving a total of 100 Kinect image signals and sEMG signals. Figure 8 shows the recognition rates obtained for hand gesture recognition with a single sensor and with multi-sensor data fusion. The following conclusions can be drawn from Fig. 8.

(1) Gesture recognition based on sEMG signals alone is clearly less effective, and its recognition rate differs markedly from that of the Kinect-based single-sensor method. Analysis shows that the reason is that a gesture involves several hand muscles. Different gestures may activate some of the same muscles, so the sEMG signals overlap and gestures are misclassified.
(2) Hand gesture classification using multi-sensor fusion has a better recognition rate. Compared with single-sensor recognition, most gestures are recognized well with multi-sensor information fusion.
(3) Weighted D-S evidence theory, which optimizes and improves traditional D-S evidence theory, has a good recognition effect. Gestures for which both single sensors have a low recognition rate also have a relatively low recognition rate under weighted D-S evidence theory, which indicates that imperfect data acquisition by the individual sensors remains a problem in multi-sensor fusion.

Table 3 shows the average recognition rate of the predefined hand gestures for ten testers under the four identification methods. According to the table, the decision fusion methods based on D-S evidence theory and weighted D-S evidence theory give better classification results. Their correct classification rate is higher than that of both single-sensor methods, which validates the effectiveness of human upper limb action recognition based on sEMG and Kinect signal fusion.

Table 3 Comparison of average recognition rates based on single sensor and two fusion algorithms

Gesture         HO    HC    WF    WE    WP    WS    FI    FM    FR    FL    ARR
                (%)   (%)   (%)   (%)   (%)   (%)   (%)   (%)   (%)   (%)   (%)
sEMG            67    66    60    63    64    62    60    58    61    62    62.3
Kinect          82    81    82    78    79    77    80    83    80    82    80.4
D-S             88    87    85    84    82    84    83    85    83    82    84.3
Weighted D-S    89    88    88    87    89    86    85    86    84    86    86.8

ARR represents average recognition rate
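As a check on the ARR column, the average recognition rate of each method can be recomputed from the ten per-gesture rates in Table 3. The variable and function names below are chosen here for illustration.

```python
# Per-gesture recognition rates (%) from Table 3, in the order
# HO, HC, WF, WE, WP, WS, FI, FM, FR, FL.
rates = {
    "sEMG": [67, 66, 60, 63, 64, 62, 60, 58, 61, 62],
    "Kinect": [82, 81, 82, 78, 79, 77, 80, 83, 80, 82],
    "D-S": [88, 87, 85, 84, 82, 84, 83, 85, 83, 82],
    "Weighted D-S": [89, 88, 88, 87, 89, 86, 85, 86, 84, 86],
}


def average_recognition_rate(values):
    """Arithmetic mean of the per-gesture recognition rates."""
    return sum(values) / len(values)


arr = {method: average_recognition_rate(v) for method, v in rates.items()}
gain = arr["Weighted D-S"] - arr["D-S"]

for method, value in arr.items():
    print(f"{method}: {value:.1f} %")
```

The means reproduce the ARR values in the table (62.3, 80.4, 84.3 and 86.8), so weighted D-S fusion improves on plain D-S fusion by 2.5 percentage points on average.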

5 Conclusions

Based on the theory of multi-sensor data fusion, a gesture recognition method that fuses sEMG signals and Kinect data is proposed. Using the effective segment of the EMG signal, the synchronously collected Kinect image is extracted, giving two sets of data. The feature values are then fused, D-S evidence theory is used to recognize human upper limb actions, and the recognition rate is obtained. By improving D-S evidence theory, weighted D-S evidence theory is proposed, and gesture recognition is tested in simulation experiments. The simulation results verify the effectiveness of weighted D-S evidence theory for human upper limb action recognition based on Kinect and sEMG signal fusion.

9. 10.

11.

12.

13.

14. 15. 16.

17.

18.

19.

20. Acknowledgments This work was supported by grants of National Natural Science Foundation of China (Grant No. 51575407, 51575338, 61273106, 51575412 and 61603420) and the Grants of National Defense Pre-Research Foundation of Wuhan University of Science and Technology (GF201705).

21.

22.

References 23. 1.

2. 3.

4.

5.

6.

7.

8.

Guan R, Xu XM, Luo YY (2013) A computer vision-based gesture detection and recognition technique. Computer Applications and Software 30(1):155–159 Li J, Gu D, Liu F (2015) Finger and hand tracking based on kinect depth information. Computer Applications and Software 32(3):79–83 Fang YF, Liu HH, Li GF, Zhu XY (2015) A multichannel surface EMG system for hand motion recognition. International Journal of Humanoid Robotics 12(2):1–11 Liu Y, Yang WL (2016) Hand gesture recognition based on Kinect and Pseudo-Jacobi-Fourier moments. Transducer and Microsystem Technologies 35(7):48–54 Lee HK, Shin SG, Kwon DS (2017) Design of emergency braking algorithm for pedestrian protection based on multi-sensor fusion. Int J Automot Technol 18(6):1067–1076 Belmonte-Hernandez A, Hernandez-Penaloza G, Alvarez F, Conti G (2017) Adaptive Fingerprinting in Multi-Sensor Fusion for Accurate Indoor Tracking. IEEE Sensors J 17(15):4983–4998 Das S, Barani S, Wagh S, Sonavane SS (2017) Extending lifetime of wireless sensor networks using multi-sensor data fusion. Sadhana-Academy Proceedings in Engineering Sciences 42(7): 1083–1090 Fu H, Du XK (2005) Multi-sensor optimum fusion based on the bayes estimation. Control Theory and Applications 4(24):10–12

24.

25.

26.

27.

28.

29.

Wadaa A, Olariu S, Wilson L (2005) Training a wireless sensor network. Mobile Networks and Applications 10(1–2):151–168 Yang P, Chen X, Li Y (2012) A sign language recognition method based on multi-sensor information. Space Medicine and Medical Engineering 25(4):276–281 Wang WH, Chen X, Yang P (2010) Chinese sign language recognition based on multiple sensors information detection and fusion. Chin J Biomed Eng 29(5):665–671 Zhao AF, Pei D, Wang QZ (2014) Gesture recognition fused with multi-information in complex environment. Computer Engineering and Applications 50(5):180–184 Chen DS, Li GF, Sun Y, Jiang GZ, Kong JY, Li JH, Liu HH (2017) Fusion hand gesture segmentation and extraction based on CMOS sensor and 3d sensor. Int J Wirel Mob Comput 12(3):305–312 Ju ZJ, Liu HH (2011) A unified fuzzy framework for human-hand motion recognition. IEEE Trans Fuzzy Syst 19(5):901–913 Ju ZJ, Liu HH (2010) Recognizing hand grasp and manipulation through empirical copula. Int J Soc Robot 2(3):321–328 Chen DS, Li GF, Sun Y, Kong JY, Jiang GZ, Tang H, Ju ZJ, Yu H, Liu HH (2017) An interactive image segmentation method in hand gesture recognition. Sensors 17(2):253 Liao YJ, Sun Y, Li GF, Kong JY, Jiang GZ, Jiang D, Cai HB, Ju ZJ, Yu H, Liu HH (2017) Simultaneous calibration: a joint optimization approach for multiple kinect and external cameras. Sensors 17(7):1491 Miao W, Li GF, Jiang GZ, Fang YF, Ju ZJ, Liu HH (2015) Optimal grasp planning of multi-fingered robotic hands: a review. Applied and Computationnal Mathematics 14(3):238–247 Krishna RV, Maziar L, Joohee K (2014) Real-time refinement of kinect depth maps using multi-resolution anisotropic diffusion. Mobile Networks and Applications 19(3):414–425 Li GF, Liu J, Jiang GZ, Liu HH (2015) Numerical simulation of temperature field and thermal stress field in the new type of ladle with the nanometer adiabatic material. 
Advances in Mechanical Engineering 7(4):1–13 Li GF, Gu YS, Kong JY, Jiang GZ, Xie LX, Wu ZH, Li Z, He Y, Gao P (2013) Intelligent control of air compressor production process. Applied Mathematics & Information Sciences 7(3):1051–1058 Alon J, Athitsos V (2009) A unified framework for gesture recognition and spatiotemporal gesture segmentation. IEEE Transactions on Pattem Analysis and Machine Intelligence 31(9):1685–1699 Li GF, Qu PX, Kong JY, Jiang GZ, Xie LX, Gao P, Wu ZH, He Y (2013) Coke oven intelligent integrated control system. Applied Mathematics & Information Sciences 7(3):1043–1050 Li GF, Miao W, Jiang GZ, Fang YF, Ju ZJ, Liu HH (2015) Intelligent control model and its simulation of flue temperature in coke oven. Discrete and Continuous Dynamical Systems - Series S (DCDS-S) 8(6):1223–1237 Binh N, Enolkida S, Ejima T (2005). Real-time hand tracking and gesture recognition system. International Conference on Graphic, Vision and Image Processing, pp362–368 Li GF, Kong JY, Jiang GZ, Xie LX, Jiang ZG, Zhao G (2012) Airfuel ratio intelligent control in coke oven combustion process. Information-An International Interdisciplinary Journal 15(11): 4487–4494 Miao W, Li GF, Sun Y, Jiang GZ, Kong JY (2016) Gesture recognition based on sparse representation. Int J Wireless and Mobile Computing 11(4):348–356 Li B, Sun Y, Li GF, Kong JY, Jiang GZ, Jiang D, Tao B, Xu S, Liu HH (2017) Gesture recognition based on modified adaptive orthogonal matching pursuit algorithm. Clust Comput. https://doi.org/10. 1007/s10586-017-1231-7 He Y, Li GF, Liao YJ, Sun Y, Kong JY, Jiang GZ, Jiang D, Tao B, Xu S, Liu HH (2017) Gesture recognition based on an improved local sparse representation classification algorithm. Clust Comput. https://doi.org/10.1007/s10586-017-1237-1

30. Chen Z, Sun SK (2010) A Zernike moment phase-based descriptor for local image representation and matching. IEEE Trans Image Process 19(1):205–219
31. Chen DS, Li GF, Jiang GZ, Fang YF, Ju ZJ, Liu HH (2015) Intelligent computational control of multi-fingered dexterous robotic hand. Journal of Computational & Theoretical Nanoscience 12(12):6126–6132
32. Ding WL, Li GF, Jiang GZ, Fang YF, Ju ZJ, Liu HH (2015) Intelligent computation in grasping control of dexterous robot hand. Journal of Computational & Theoretical Nanoscience 12(12):6096–6099
33. Liu YQ, Wang XH, Yan K (2016) Hand gesture recognition based on concentric circular scan lines and weighted K-nearest neighbor algorithm. Multimedia Tools and Applications, pp 1–15
34. Hugo JE, Eduardo FM, Enrique SL (2016) A naïve Bayes baseline for early gesture recognition. Pattern Recogn Lett 73:91–99
35. Kim J, Wagner J (2008) Bi-channel sensor fusion for automatic sign language recognition. 8th IEEE International Conference on Automatic Face & Gesture Recognition, pp 647–652
36. Li GF, Liu Z, Jiang GZ, Xiong HG, Liu HH (2015) Numerical simulation of the influence factors for rotary kiln in temperature field and stress field and the structure optimization. Advances in Mechanical Engineering 7(6):1–15
37. Li GF, Qu PX, Kong JY, Jiang GZ, Xie LX, Wu ZH, Gao P, He Y (2013) Influence of working lining parameters on temperature and stress field of ladle. Applied Mathematics & Information Sciences 7(2):439–448
38. Vidal C, Jedynak B (2010) Learning to match: deriving optimal template-matching algorithms from probabilistic image models. Int J Comput Vis 88(2):189–213

39. Guo WC, Sheng XJ, Liu HG, Zhu XY (2017) Toward an enhanced human-machine interface for upper-limb prosthesis control with combined EMG and NIRS signals. IEEE Transactions on Human-Machine Systems 47(4):564–575
40. Rebecca W, Hussam M, Haydn M, Jiang XQ (2016) Burg algorithm for enhancing measurement performance in wavelength scanning interferometry. Surface Topography: Metrology and Properties 4:1–8
41. Ding WL, Li GF, Sun Y, Jiang GZ, Kong JY (2017) D-S evidential theory on sEMG signal recognition. Int J Comput Sci Math 8(2):138–145
42. Liu S, Fu W, He L et al (2017) Distribution of primary additional errors in fractal encoding method. Multimedia Tools and Applications 76(4):5787–5802
43. Li Z, Li GF, Jiang GZ, Fang YF, Ju ZJ, Liu HH (2015) Intelligent computation of grasping and manipulation for multi-fingered robotic hands. Journal of Computational & Theoretical Nanoscience 12(12):6192–6197
44. Huang Z, Ren FJ (2017) Facial expression recognition based on multi-regional D-S evidences theory fusion. IEEJ Trans Electr Electron Eng 12(2):251–261
45. Pan Z, Liu S (2017) A review of visual moving target tracking. Multimedia Tools and Applications 76(16):16989–17018
46. Liu S, Pan Z, Fu W, Cheng X (2017) Fractal generation method based on asymptote family of generalized Mandelbrot set and its application. Journal of Nonlinear Sciences and Applications 10(3):1148–1161
47. Li GF, Tang H, Sun Y, Kong JY, Jiang GZ, Jiang D, Tao B, Xu S, Liu HH (2017) Hand gesture recognition based on convolution neural network. Clust Comput. https://doi.org/10.1007/s10586-017-1435-x