SICE Annual Conference 2007 Sept. 17-20, 2007, Kagawa University, Japan

Voice Controlled Intelligent Wheelchair

Masato Nishimori, Takeshi Saitoh and Ryosuke Konishi
Tottori University, Tottori, Japan
E-mail: {saitoh, konishi}@ele.tottori-u.ac.jp

Abstract: In order to assist physically handicapped persons, we developed a voice controlled wheelchair. The user controls the wheelchair with voice commands, such as "susume (run forward)" in Japanese. A grammar-based recognition parser named "Julian" is used in our system. Three types of commands are provided: basic reaction commands, short moving reaction commands, and verification commands. We carried out speech recognition experiments with Julian and obtained successful recognition rates of 98.3% for the movement commands and 97.0% for the verification commands. A running experiment with three persons was carried out in a campus room, and the utility of our system is shown.

Keywords: voice control, intelligent wheelchair, speech recognition.

1. INTRODUCTION

Recently, the number of elderly and physically handicapped persons who use a wheelchair is increasing. However, only two types of wheelchairs are in wide use: hand-operated wheelchairs and joystick-operated electric wheelchairs. The former requires muscular strength and the latter requires skill, so it is difficult for many elderly and handicapped persons to use these interfaces.

Several wheelchair interfaces have been proposed: voice [1], [2], direction of the face [3], eye gaze [4], electromyography (EMG) signals from the neck and arm muscles [5], EMG signals from the face muscles [6], a wireless tongue-palate contact pressure sensor [7], an eye-control method based on electrooculography (EOG) [8], and electroencephalography (EEG) [9]. These methods make it easier for a physically handicapped person to operate a wheelchair. However, although a person moves his face and the direction of his eyes unconsciously, using these movements consciously as an interface feels unnatural. Operation by EMG, EOG, or EEG signals requires skill, and the contact-type sensors needed to measure these signals place a mental burden on the user. Voice, on the other hand, is a natural means of communication and one of the easiest interfaces. This research develops a voice controlled wheelchair that uses voice commands as its interface. The basic modes of the wheelchair are 1) running until the next command is input, 2) turning or rotating, and 3) stopping. In addition to these modes, to allow use in narrow spaces, we provide four commands that move a short distance. Moreover, we use two types of verification commands (acceptance and rejection) to prevent wrong reactions caused by misrecognition. Running experiments were carried out with several persons in a campus corridor and room, and the utility of our system is shown.

2. VOICE CONTROLLED INTELLIGENT WHEELCHAIR

2.1 Voice command and wheelchair reaction

We defined voice commands to control the wheelchair. The voice commands consist of nine reaction commands and five verification commands. The reaction commands consist of five basic reaction commands and four short moving reaction commands, which move a short distance. The commands and their respective reactions are shown in table 1. In this table, ① to ⑤ are the basic reaction commands, ⑥ to ⑨ are the short moving reaction commands, and Ⓐ and Ⓑ are the verification commands. The two commands "migi" and "hidari" cause a 90 degree rotation in place, without any forward or backward movement. When one of these commands is input while running, the system turns 90 degrees and then moves forward after the turn. Thus, the five basic reaction commands achieve seven reactions (modes). In our system, the short moving reactions, such as running forward about 30 cm and rotating right about 30 degrees, were decided empirically, and they can be changed according to the user's needs. After a voice command is input, the recognition process is carried out and the recognition result is shown on the laptop display. To avoid a wrong reaction caused by misrecognition, the user must then input a verification command to check the result. When an acceptance word ("OK" or "yes") is input, the system considers the recognition result correct and performs the given movement. Conversely, when a rejection word ("torikeshi", "no", or "cancel") is input, or no voice is input for a while, the system considers the recognition result incorrect.
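The distinction between rotating in place and turning while running is easiest to see in code. The following is a minimal sketch of our own, not the authors' implementation; the mode names are hypothetical and merely illustrate how five basic commands yield seven reactions.

    def next_mode(current_mode: str, command: str) -> str:
        """Return the new mode after a recognized basic reaction command."""
        if command == "tomare":
            return "stop"
        if command == "susume":
            return "run_forward"
        if command == "sagare":
            return "run_backward"
        if command in ("migi", "hidari"):
            side = "right" if command == "migi" else "left"
            if current_mode == "run_forward":
                # Input while running: turn 90 degrees, then resume forward.
                return f"turn_{side}_then_forward"
            # Input at rest: rotate 90 degrees in place.
            return f"rotate_{side}"
        raise ValueError(f"unknown basic reaction command: {command!r}")

    # Example: "migi" while running forward yields a turn followed by forward
    # running, whereas "migi" at rest is a rotation in place.
    assert next_mode("run_forward", "migi") == "turn_right_then_forward"
    assert next_mode("stop", "migi") == "rotate_right"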

2.2 System configuration

Our system is based on the commercial electric wheelchair NEO-P1 by Nissin Medical Industries Co., together with a headset microphone and a laptop. The overview and the construction of our system are shown in figure 1 and figure 2, respectively. The user can choose between two types of control, voice and button. Since the main topic of this paper is the voice controlled system, button control is not described here. The laptop is the main computer of the system, and the recognition process is executed on it. Control signals are sent from the laptop to a PIC microcontroller, which generates the motor control signals that drive the wheelchair. We use a grammar-based recognition parser named "Julian". This is open-source software developed by Kyoto University, Nara Institute of Science and Technology, and others [10]. Julian is a version of Julius, an open-source large vocabulary continuous speech recognition engine, that performs grammar-based recognition. The grammar describes the possible syntax, or patterns of words, of a specific task; when a speech input is given, Julian searches for the most likely word sequence under the constraints of the given grammar. In our system, the forward running speed is set at 1.8 km/h and the backward speed at 1.4 km/h.
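The paper does not reproduce the task grammar itself. As an illustration only, a grammar for this command set could be written in Julian's grammar/voca file format roughly as follows; the file names, category names, and phoneme transcriptions are our assumptions and would have to match the acoustic model actually used.

    # command.grammar: every utterance is one command between silences
    S       : NS_B COMMAND NS_E
    COMMAND : REACTION
    COMMAND : VERIFY

    # command.voca: words per category, with illustrative phoneme strings
    % REACTION
    tomare          t o m a r e
    susume          s u s u m e
    sagare          s a g a r e
    migi            m i g i
    hidari          h i d a r i
    sukoshi-susume  s u k o sh i s u s u m e
    sukoshi-sagare  s u k o sh i s a g a r e
    sukoshi-migi    s u k o sh i m i g i
    sukoshi-hidari  s u k o sh i h i d a r i
    % VERIFY
    OK              o: k e:
    yes             i e s u
    torikeshi       t o r i k e sh i
    no              n o:
    cancel          ky a N s e r u
    % NS_B
    <s>             silB
    % NS_E
    </s>            silE

With such a pair, Julian searches only among these fourteen words, which is what makes the high recognition rates reported in section 3.1 plausible for this task.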


Table 1 Voice command and reaction.

  No.  command                   reaction (mode)
  ①   tomare                    stop
  ②   susume                    run forward
  ③   sagare                    run backward
  ④   migi                      turn right / rotate right
  ⑤   hidari                    turn left / rotate left
  ⑥   sukoshi-susume            run forward about 30 cm
  ⑦   sukoshi-sagare            run backward about 30 cm
  ⑧   sukoshi-migi              rotate right about 30 degrees
  ⑨   sukoshi-hidari            rotate left about 30 degrees
  Ⓐ   OK / yes                  acceptance command
  Ⓑ   torikeshi / no / cancel   rejection command

Fig. 3 Control algorithm (flowchart: input reaction command → recognition → show result word; a stop command reacts immediately; otherwise a verification command must be input within three seconds → recognition → react on acceptance).

2.3 Control algorithm

To control the wheelchair by voice, the user inputs a reaction command followed by a verification command, which prevents wrong reactions caused by misrecognition. The control algorithm is shown in figure 3, and the details are as follows (a code sketch of this loop is given after the list):
1. The user inputs a reaction command.
2. The input voice data is recognized and the resulting word is displayed.
3. If the resulting word is the stop command, go to 7.
4. The user inputs a verification command. If no verification command is input within three seconds, the system considers the recognition to have failed and goes to 1.
5. The input verification voice data is recognized.
6. If the resulting word is a rejection command, go to 1.
7. The control system sends the target control signal to the wheelchair, and the wheelchair reacts. Go to 1.
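The following sketch of ours restates steps 1-7 as a loop. The three helper functions are placeholders of our own, standing in for the Julian recognizer and the laptop-to-PIC link; they are not the authors' API.

    from typing import Optional

    ACCEPT = {"OK", "yes"}
    REJECT = {"torikeshi", "no", "cancel"}

    def recognize_reaction() -> str:
        """Placeholder for Julian: return the recognized reaction command word."""
        return input("reaction command> ").strip()

    def recognize_verification(timeout_s: float) -> Optional[str]:
        """Placeholder for Julian: return a verification word, or None when
        nothing is spoken within timeout_s seconds (three seconds in the paper)."""
        word = input("verification command> ").strip()
        return word or None

    def send_control_signal(command: str) -> None:
        """Placeholder for the laptop-to-PIC link that drives the motors."""
        print(f"wheelchair reacts: {command}")

    def control_loop() -> None:
        while True:
            command = recognize_reaction()          # steps 1-2: recognize and
            print(f"recognized: {command}")         # display the result word
            if command == "tomare":                 # step 3: stop skips the check
                send_control_signal(command)        # step 7
                continue
            word = recognize_verification(3.0)      # steps 4-5
            if word in ACCEPT:                      # acceptance word ("OK"/"yes")
                send_control_signal(command)        # step 7: wheelchair reacts
            # rejection word, timeout, or anything else: back to step 1 (step 6)

Note that the stop command is deliberately exempt from verification, trading a small misrecognition risk for an immediate stop.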

Fig. 1 System overview.


2.4 Main Window

The user operates the control system while riding the wheelchair, so a window that is simple and easy to understand is desirable. The window is therefore designed with many illustrations and little text. Our window is shown in figure 4. Each illustration and button in the main window has the following function:
1. Audio wave: the input audio signal is displayed in real time.
2. Mode illustration: the recognition result is displayed as an illustration of the corresponding mode. The nine mode illustrations are shown in figure 5.
3. Start button: when this button is pushed, input of the audio signal starts.
4. Quit button: when this button is pushed, the control system quits.
5. Control buttons: when one of these buttons is pushed, the wheelchair moves; that is, the user can control the system without voice commands. The present mode is highlighted.

Fig. 2 Voice controlled intelligent wheelchair (block diagram: voice commands from the headset microphone and button input enter the laptop control system, which performs reaction and verification command recognition and sends control signals through the PIC to the left and right motors).


3. EXPERIMENT

3.1 Recognition experiment

To evaluate the recognition performance of Julian, we carried out a speech recognition test with 15 students. The target words were the nine reaction commands and five verification commands shown in table 1. The experiment was carried out in a laboratory room, with some voices of other people in the surrounding recording environment. As a result, we obtained successful recognition rates of 98.3% for the reaction commands and 97.0% for the verification commands.

3.2 Stopping experiment

An action of this system executes until the next command is given; for example, the wheelchair goes straight until a stop or turn command is input. We therefore carried out the following three experiments to verify the stopping behavior of our system. An overview of these experiments is shown in figure 6.

ex.1 While running straight, the user inputs the stop command so as to stop at the target position.
ex.2 While running straight, the user inputs the stop command at the moment the target position is reached.
ex.3 From a stop, the user makes the wheelchair go straight and immediately inputs the stop command.

The stopping experiment was carried out with five persons, each tested five times, and the average value and standard deviation were calculated. The experimental results are shown in figure 7. For ex.1, the average error was 11.7 cm, and we found that the error decreased as the number of trials increased. From the results of ex.2 and ex.3, a braking distance of about 2 m occurred between the input of the stop command and the actual stop. This is because about two seconds are needed for the recognition processing and the display of the result, during which the wheelchair continues running at 1.8 km/h. This braking distance follows from the specifications of the voice controlled wheelchair, though it can be expected to decrease with a higher-performance laptop and a lower running speed. By comparison, the braking distance was about 50 cm when button operation was used instead of voice input.
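As a rough cross-check of these numbers (our arithmetic, not given in the paper):

\[
v = 1.8\ \mathrm{km/h} = 0.5\ \mathrm{m/s}, \qquad
d = v\,t = 0.5\ \mathrm{m/s} \times 2\ \mathrm{s} = 1\ \mathrm{m},
\]

so the two seconds of recognition processing and display alone account for about 1 m of the observed 2 m; the remainder is presumably due to the duration of the stop utterance itself and the mechanical stopping of the wheelchair.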

Fig. 4 Main window (regions numbered 1-5 correspond to the functions listed in section 2.4).

Fig. 5 Mode illustrations (one illustration for each of the commands ① tomare through ⑨ sukoshi-hidari).

3.3 Running experiment in the corridor

The next experiment was a running experiment, carried out with the same five persons as in the previous experiment. The running place was a campus corridor about 2 m wide, with one obstacle placed in the corridor. We set two courses, A and B, as shown in figure 8; the total distances of courses A and B were about 16 m and 13 m, respectively. The running time, the number of basic reaction commands, and the number of short moving reaction commands are shown in table 2. Because course B requires more fine adjustment of the moving distance than course A, more short moving reaction commands were input on course B. Although person E is one of the authors and operated the system with practice, every person completed the course in roughly the same time.



3.4 Running experiment in the room

The next experiment was carried out with three persons. The running place was a room on campus. We set a course whose start point is near the door and whose destination is one of the seats. The distance between the desks was about 160 cm and the width of the wheelchair was about 55 cm, so this was a running experiment on a narrow course. To evaluate the performance of our system, we tested not only voice operation but also key operation on the laptop. The running time, the running distance, and the number of reactions are shown in table 3. Running scenes of person C are shown in figure 10. Unlike the previous experiments in the corridor, the running time under voice control differed from user to user; person C is one of the authors and operated the system with practice. With key operation, on the other hand, the running time was almost the same for every person. Although there were individual differences, we confirmed that our system can run even through a narrow space such as a room.



Fig. 6 Overview of stopping test (ex.1-ex.3: start line, stop line, and the points where "susume" and "tomare" are input).

Fig. 7 Result error bars of stopping experiments ((a) ex.1, (b) ex.2, (c) ex.3: stopping error in cm for persons A-E and overall).

Fig. 8 Test courses in the corridor (courses A and B, with an obstacle; coordinates in cm).

Table 2 Experimental results of running experiment in the corridor.

(a) Course A
  person   running time [s]   number of basic commands   number of short moving commands
  A        166.14             12                         7
  B        141.80             14                         3
  C        208.83             16                         2
  D        217.26             16                         6
  E        127.23             11                         4
  average  172.30             13.8                       4.4

(b) Course B
  person   running time [s]   number of basic commands   number of short moving commands
  A        163.66             22                         8
  B        177.43             16                         8
  C        158.68             17                         6
  D        185.46             9                          15
  E        156.33             14                         10
  average  168.30             15.6                       9.4

Fig. 9 The course and result path in the room (paths of persons A, B, and C from the start near the door to the goal; coordinates in cm).

Table 3 Experimental results of running experiment in the room.

(a) Voice control
  person   running time [s]   running distance [cm]   number of input commands
  A        143.95             1120                    8
  B        256.87             984                     30
  C        49.43              959                     7
  average  150.08             1021                    15

(b) Key control
  person   running time [s]   running distance [cm]   number of input commands
  A        34.15              1042                    7
  B        49.70              947                     19
  C        38.51              983                     9
  average  40.79              991                     11.7

Fig. 10 Running scenes ((a)-(f)).

4. CONCLUSION

This paper developed a voice controlled wheelchair. Three types of commands, the basic reaction commands, the short moving reaction commands, and the verification commands, are provided in our system. Speech recognition uses the open-source software Julian. We obtained successful recognition rates of 98.3% for the nine reaction commands and 97.0% for the five verification commands. Running experiments with several persons were carried out in the campus corridor and room, and the utility of our system was shown. Our system does not yet include any countermeasures such as obstacle avoidance; future work is to improve the system toward practical use.

REFERENCES

[1] G. Pires and U. Nunes. A wheelchair steered through voice commands and assisted by a reactive fuzzy-logic controller. Journal of Intelligent and Robotic Systems, 34(3):301–314, 2002.
[2] K. Komiya, Y. Nakajima, M. Hashiba, K. Kagekawa, and K. Kurosu. Test of running a powered-wheelchair by voice commands in a narrow space. JSME Journal (C), 69(688):210–217, 2003. (in Japanese).
[3] L. M. Bergasa, M. Mazo, A. Gardel, R. Barea, and L. Boquete. Commands generation by face movements applied to the guidance of a wheelchair for handicapped people. In ICPR, volume 4, pages 4660–4663, 2000.
[4] Y. Matsumoto, T. Ino, and T. Ogasawara. Development of intelligent wheelchair system with face and gaze based interface. In IEEE Int. Workshop on Robot and Human Communication, pages 262–267, 2001.
[5] K. Choi, M. Sato, and Y. Koike. Consideration of the embodiment of a new, human-centered interface. IEICE Trans. Inf. & Syst., E89-D(6):1826–1833, 2006.
[6] K. H. Kim, H. K. Kim, J. S. Kim, W. Son, and S. Y. Lee. A biosignal-based human interface controlling a power-wheelchair for people with motor disabilities. ETRI Journal, 28(1):111–114, 2006.
[7] Y. Ichinose, M. Wakumoto, K. Honda, T. Azuma, and J. Satou. Human interface using a wireless tongue-palate contact pressure sensor system and its application to the control of an electric wheelchair. IEICE Trans. Inf. & Syst., J86-D-II(2):364–367, 2003.
[8] R. Barea, L. Boquete, M. Mazo, and E. Lopez. Wheelchair guidance strategies using EOG. Journal of Intelligent and Robotic Systems, 34(3):279–299, 2002.
[9] B. Rebsamen, C. L. Teo, Q. Zeng, M. H. Ang Jr., E. Burdet, C. Guan, H. Zhang, and C. Laugier. Controlling a wheelchair indoors using thought. IEEE Intelligent Systems, 22(2):18–24, 2007.
[10] T. Kawahara and A. Lee. Open-source speech recognition software Julius. Journal of Japanese Society for Artificial Intelligence, 20(1):41–49, 2005. (in Japanese).