UNIVERSITY OF CALIFORNIA, SAN DIEGO

P3 or not P3: Toward a Better P300 BCI

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Cognitive Science

by

Brendan Allison

Committee in charge:

Professor Jaime A. Pineda, Chair
Professor John Batali
Professor Jeffrey Elman
Professor Steven Hillyard
Professor Tzyy-Ping Jung

2003

Copyright Brendan Allison, 2003. All rights reserved.

The dissertation of Brendan Allison is approved, and it is acceptable in quality and form for publication on microfilm:

Chair

University of California, San Diego

2003


TABLE OF CONTENTS

SIGNATURE PAGE ..... iii
TABLE OF CONTENTS ..... iv
LIST OF FIGURES ..... viii
LIST OF TABLES ..... xiii
ACKNOWLEDGEMENTS ..... xiv
CURRICULUM VITAE ..... xviii
ABSTRACT ..... xxvii
CHAPTER 1: INTRODUCTION ..... 1
CHAPTER 2: NEUROBIOLOGY OF BCIs ..... 9
  2.1. Why Are BCIs Based on Electrical Measures of Brain Activity? ..... 9
  2.2. Why Does Electrical Activity Reflect Cognitive Activity? ..... 12
  2.3. EEG Recording and Signal Processing ..... 13
    2.3.1. Macroelectrodes ..... 13
    2.3.2. Microelectrodes ..... 17
    2.3.3. Early Signal Processing ..... 20
    2.3.4. Artifact Removal ..... 22
    2.3.5. Free Running EEG (FREEG) ..... 24
    2.3.6. Event Related Potentials (ERPs) ..... 26
  2.4. How Can EEGs Be Useful? ..... 29
    2.4.1. What Can We Learn About the Mind and Brain from EEGs? ..... 29
    2.4.2. How Can EEGs Be Useful in BCIs? ..... 33
    2.4.3. What Can't EEGs Tell Us? ..... 37
  2.5. What Have EEGs Told Us About Cognition? ..... 40
    2.5.1. ERPs: P300s ..... 40


    2.5.2. Event Related Potentials: Slow Cortical Potentials (SCPs) ..... 47
    2.5.3. ERPs: Readiness Potentials (RPs) ..... 48
    2.5.4. Event Related Potentials: Steady State Visual Evoked Potentials (SSVEPs) ..... 51
    2.5.5. Free Running EEGs: Various Spectra and Alertness ..... 52
    2.5.6. Free Running EEGs: Mu Rhythms and Movement ..... 53
CHAPTER 3: CURRENT STATUS OF BCI RESEARCH ..... 56
  3.1. What is a BCI? ..... 56
  3.2. Purpose and Relative Merits of BCIs ..... 59
  3.3. Prototypical BCIs ..... 63
    3.3.1. P300 BCIs ..... 64
    3.3.2. Visual Evoked Potential (VEP) BCIs ..... 88
    3.3.3. Slow Cortical Potential (SCP) BCIs: The Thought Translation Device (TTD) ..... 92
    3.3.4. Mu BCIs ..... 95
    3.3.5. Mental task BCIs ..... 105
    3.3.6. Implanted BCIs ..... 108
  3.4. What isn't a BCI? ..... 118
    3.4.1. Alertness / workload monitors ..... 120
    3.4.2. Brainwave Fingerprinting ..... 123
    3.4.3. Chapin (2002) ..... 125
  3.5. Summary of Issues ..... 128
    3.5.1. BCIs and Cognition ..... 128
    3.5.2. BCIs and language ..... 131
    3.5.3. BCIs and Information Throughput ..... 133
    3.5.4. BCIs and Other Factors ..... 143


  3.6. Future directions ..... 145
CHAPTER 4: EFFECTS OF SOA AND FLASH PATTERN MANIPULATIONS ON ERPs, PERFORMANCE, AND PREFERENCE AND IMPLICATIONS FOR A BRAIN COMPUTER INTERFACE (BCI) SYSTEM ..... 148
  4.1. Introduction ..... 148
  4.2. Methods ..... 154
    4.2.1. Subjects ..... 154
    4.2.2. EEG Recording ..... 154
    4.2.3. Experimental Paradigm ..... 155
    4.2.4. Data Analysis ..... 156
  4.3. Results and Discussion ..... 157
    4.3.1. Behavioral Results: Counting accuracy ..... 157
    4.3.2. Behavioral Results: Subjective Report ..... 161
    4.3.3. Electrophysiology ..... 163
    4.3.4. General Discussion ..... 177
CHAPTER 5: ERPs EVOKED BY DIFFERENT MATRIX SIZES: IMPLICATIONS FOR A BRAIN COMPUTER INTERFACE (BCI) SYSTEM ..... 207
  5.1. Introduction ..... 208
  5.2. Methods ..... 209
  5.3. Results ..... 212
  5.4. Discussion ..... 216
  Acknowledgments ..... 219
  References ..... 219
CHAPTER 6: INDEPENDENT COMPONENT ANALYSIS (ICA) AND ITS POTENTIAL VALUE IN A P300 BCI SYSTEM ..... 222
  6.1. Introduction ..... 222


  6.2. Methods ..... 228
  6.3. Results ..... 228
  6.4. Discussion ..... 237
CHAPTER 7: CONCLUSIONS ..... 275
APPENDIX: THE EFFECTS OF SELF-MOVEMENT, OBSERVATION, AND IMAGINATION ON Mu RHYTHMS AND READINESS POTENTIALS (RPs): TOWARDS A BRAIN-COMPUTER INTERFACE (BCI) ..... 284
GLOSSARY OF BCI TERMS ..... 295
REFERENCES ..... 301


LIST OF FIGURES

Chapter 2
Figure 2-1: The Utah Intracranial Electrode Array (UIEA) ..... 20
Figure 2-2: The cone electrode ..... 20
Figure 2-3: Examples of neuronal populations whose activity would be difficult or impossible to detect with a scalp electrode ..... 38
Figure 2-4: Readiness potentials ..... 50
Figure 2-5: Frequency, time, and power for missed and hit targets ..... 52

Chapter 3
Figure 3-1: The display used in Farwell and Donchin (1988) ..... 65
Figure 3-2: Grand averaged responses to attended (target) and unattended (nontarget) flashes from Farwell and Donchin (1988) ..... 67
Figure 3-3: Figure and legend from Farwell and Donchin (1988) ..... 70
Figure 3-4: The display used in Donchin et al. (2000) ..... 74
Figure 3-5: Accuracy and bitrate for able bodied and healthy subjects with and without the DWT ..... 76
Figure 3-6: Averaged ERPs evoked by red, green, and yellow lights in the virtual stoplight study ..... 80
Figure 3-7: Grand averaged ERPs for all subjects in the VR condition ..... 83
Figure 3-8: The display used in Kaiser et al. (2001) ..... 94
Figure 3-9: The display used in Wolpaw et al. (1991) ..... 97
Figure 3-10: Accuracy and hit rate from Wolpaw et al. (1991) ..... 98


Figure 3-11: This BCI enables a rat to control a robot arm. The switch (D) that operates the robot arm (B) can be controlled by the rat pressing the bar, or by neural activity (C) recorded from electrodes in the brain (A) ..... 114
Figure 3-12: (A) shows the apparatus the monkeys used while operating the BCI. (B) shows the eight locations on the corners of an imaginary cube where a target may appear ..... 116
Figure 3-13: Drawings representing the open loop condition ..... 117
Figure 3-14: Relationship between amplitude in theta and alpha bands and error rate. Top graph, site Cz; bottom graph, site POz ..... 121
Figure 3-15: The sensor headset developed by Advanced Brain Monitoring (ABM) ..... 123
Figure 3-16: Examples of rat navigation using brain microstimulation ..... 128
Figure 3-17: Relationship between number of possible choices (cognemes), accuracy, choices per minute, and bitrate ..... 135

Chapter 4
Figure 4-1: The 8 x 8 grid used in the first study ..... 152
Figure 4-2: Three of the eight row flashes used in the "single flash" condition ..... 153
Figure 4-3: Three of the eight column flashes used in the "single flash" condition ..... 153
Figure 4-4: The three row flashes used in the "multiple flash" condition ..... 153
Figure 4-5: The three column flashes used in the "multiple flash" condition ..... 154

Figure 4-6: Counting accuracy as a function of trial position ..... 160
Figure 4-7: Interaction of flash type and SOA ..... 161
Figure 4-8: Relationship between SOA, flash type, and P300 amplitude ..... 170
Figure 4-9: Grand average ERPs evoked in the single - 500 SOA bin ..... 185
Figure 4-10: Grand average ERPs evoked in the multiple 17% - 500 SOA bin ..... 186


Figure 4-11: Grand average ERPs evoked in the multiple 33% - 500 SOA bin ..... 187
Figure 4-12: Grand average ERPs evoked in the multiple 50% - 500 SOA bin ..... 188
Figure 4-13: Grand average ERPs evoked in the single - 250 SOA bin ..... 189
Figure 4-14: Grand average ERPs evoked in the multiple 17% - 250 SOA bin ..... 190
Figure 4-15: Grand average ERPs evoked in the multiple 33% - 250 SOA bin ..... 191
Figure 4-16: Grand average ERPs evoked in the multiple 50% - 250 SOA bin ..... 192
Figure 4-17: Grand average ERPs evoked in the single - 125 SOA bin ..... 193
Figure 4-18: Grand average ERPs evoked in the multiple 17% - 125 SOA bin ..... 194
Figure 4-19: Grand average ERPs evoked in the multiple 33% - 125 SOA bin ..... 195
Figure 4-20: Grand average ERPs evoked in the multiple 50% - 125 SOA bin ..... 196
Figure 4-21: Average ERPs evoked by about 35 flashes in subject JM ..... 197
Figure 4-22: Two single trials from the averages in figure 4-21 ..... 198
Figure 4-23: Two single trials from the averages in figure 4-21 ..... 199
Figure 4-24: Two single trials from the averages in figure 4-21 ..... 200
Figure 4-25: Two single trials from the averages in figure 4-21 ..... 201
Figure 4-26: Two splotch patterns currently being used in a follow up study ..... 202
Figure 4-27: Two other splotch patterns currently being used in a follow up study ..... 203

Chapter 5

Figure 5-1: The three matrices used in the third study ..... 208
Figure 5-2: Grand average ERP responses across all subjects over midline sites ..... 209


Chapter 6
Figure 6-1: Graphical representation of the blind source separation problem and ICA ..... 220
Figure 6-2: Component scalp projections and ERPs evoked by attended (target) flashes for subject NJ ..... 227
Figure 6-3: Component scalp projections and ERPs evoked by ignored (nontarget) flashes for subject NJ ..... 228
Figure 6-4: Component scalp projections and ERPs evoked by attended (target) flashes for subject CC ..... 230
Figure 6-5: The component properties of component 1 in MJ's attended trials ..... 231
Figure 6-6: Component scalp maps and ERPs for subject BS ..... 237
Figure 6-7: Component scalp maps and ERPs for subject BS ..... 238
Figure 6-8: Component scalp maps and ERPs for subject CC ..... 239
Figure 6-9: Component scalp maps and ERPs for subject CC ..... 240
Figure 6-10: Component scalp maps and ERPs for subject EW ..... 241
Figure 6-11: Component scalp maps and ERPs for subject EW ..... 242
Figure 6-12: Component scalp maps for subject JC ..... 243
Figure 6-13: Component scalp maps and ERPs for subject JC ..... 244
Figure 6-14: Component scalp maps and ERPs for subject JS ..... 245
Figure 6-15: Component scalp maps and ERPs for subject JS ..... 246
Figure 6-16: Component scalp maps and ERPs for subject MJ ..... 247
Figure 6-17: Component scalp maps and ERPs for subject MJ ..... 248
Figure 6-18: Component scalp maps and ERPs for subject MT ..... 249
Figure 6-19: Component scalp maps and ERPs for subject MT ..... 250


Figure 6-20: Component scalp maps and ERPs for subject PK ..... 251
Figure 6-21: Component scalp maps and ERPs for subject PK ..... 252
Figure 6-22: The component properties of component 1 in NJ's attended trials ..... 253
Figure 6-23: The component properties of component 1 in JS's ignored trials ..... 254
Figure 6-24: The component properties of component 1 in JS's ignored trials ..... 255
Figure 6-25: Component 8 from BS's attended trials ..... 256
Figure 6-26: Component 11 from BS's attended trials ..... 257
Figure 6-27: The component properties of component 6 in BS's attended trials ..... 258
Figure 6-28: The component properties of component 9 in BS's attended trials ..... 259
Figure 6-29: The component properties of component 6 in BS's ignored trials ..... 260
Figure 6-30: The component properties of component 8 in BS's ignored trials ..... 261
Figure 6-31: The component properties of component 11 in JS's attended trials ..... 262
Figure 6-32: The component properties of component 10 in JS's attended trials ..... 263
Figure 6-33: The component properties of component 9 in JS's ignored trials ..... 264
Figure 6-34: The component properties of component 6 in JS's ignored trials ..... 265

Figure 6-35: Component 13 from JS's attended trials ..... 266
Figure 6-36: Component 12 from JS's attended trials ..... 267
Figure 6-37: Component 14 from JS's attended trials ..... 268
Figure 6-38: Component 11 from JS's attended trials ..... 269
Figure 6-39: Component 13 from PK's attended trials ..... 270
Figure 6-40: Component 11 from PK's ignored trials ..... 271


LIST OF TABLES

Chapter 3
Table 3-1: Performance of different classification approaches for each of five subjects in the offline version of the virtual stoplight study. From Bayliss, 2001 ..... 81
Table 3-2: Classification accuracy for two subjects using the virtual stoplight BCI ..... 81

Chapter 4
Table 4-1: Flash type x SOA interaction for counting accuracy ..... 162
Table 4-2: P2, N2, and P300 amplitude and attention for different flash bins ..... 166
Table 4-3: P2 and P300 amplitude and electrode for different flash bins ..... 167
Table 4-4: N1, P2, N2, and P300 amplitude and SOA for different flash bins ..... 169
Table 4-5: P300 latency and SOA for different flash bins ..... 171
Table 4-6: N1, N2, and P300 latency and Electrode for different flash bins ..... 173
Table 4-7: N1 latency and Attention for different flash bins ..... 175

Chapter 5
Table 5-1: Size x Attention interaction for P300 amplitude ..... 211
Table 5-2: Size x Site interaction for P300 amplitude and latency ..... 212
Table 5-3: Site x Attention interaction for P300 amplitude and latency ..... 213


ACKNOWLEDGEMENTS

First and foremost, I would like to thank the beach. Long may it wave. I would like to thank my dissertation committee, both for their roles as committee members as well as for their instruction and support before I even formed the committee. John Batali was not only a perspicacious committee member; he also taught me how to program during my undergrad time here. Jeff Elman provided many helpful comments on my dissertation, and also taught me about neural networks, proper experiment design, and ethics and survival skills in academia (and taught me well, too; I survived). Jaime Pineda is largely responsible for my decision to work with EEGs in the first place, and supported my decision to research BCIs at a time when the field was unknown. Steve Hillyard provided excellent feedback on my dissertation as well as every scientific project I have pursued at UCSD. Without Tzyy-Ping Jung's help, I might have never understood ICA in the first place, let alone been able to use it in a dissertation.

Many others in the department deserve mention as well, beginning with those in the lab. David Leland, the other grad student in our lab, has provided voluble advice and friendship on both academic and other matters. He and I share a bond of suffering through adversity that only other grad students can fully appreciate. Andrey Vankov provided technical assistance on this dissertation and has always been a warm and supportive friend. Alex Kadner and his toddling son Julius brightened the lab, and my friend Victor Wang was a paragon of patience and perseverance. Vic!

I also wish to thank the many outstanding undergrads who have worked in the lab over the years. My first research assistant, Matt Casserino, was involved in my early hypnosis work. Ben Chi, Puneet Khattar, and Sarah Threlfall helped with the P300 BCI work described here. Aimee Arnoldussen, Gabe Castillo, Bryan Gazaui, Raj Ratwani, Jason Silver, Dave Silverman, and Ellen Stuart were involved with mu BCI work, and Chris Martinez and Jennifer Bowen did great work on our web page. Finally, I would like to thank all of my subjects, also called neuronauts, who endured an electrode gel hairstyle in the name of science.

Christine Johnson's Cognitive Science 17 class is what pulled me into the major in the first place. Chris is an outstanding and devoted teacher, and inspired me to TA and then teach Cog Sci 17 myself. John Polich has provided me excellent advice on my studies and is an outstanding role model, both personally and professionally. Scott Makeig and Arnaud Delorme have been very helpful with EEGLab and ICA. Rick Buxton, Eric Wong, Mark Geyer, and David Braff all provided outstanding mentoring during my laboratory rotations with them, and taught me about fMRIs.

I would also like to thank the rest of the cognitive science department - faculty, staff, postdocs, and students, past and present. A great deal of my time over the last 13 years has been spent at this department, with a fair amount of the remainder spent with friends from this and related departments. I feel grateful and deeply honored to have learned from and worked with those in this department, and very much enjoyed the time I spent at work and at play here.


I also wish to thank the many members of the BCI community, many of whom are friends and all of whom are enthusiastic, warm, and supportive. Larry Farwell and Manny Donchin wrote the article that inspired my dissertation, and have been in contact during my dissertation work. The Wolpaw lab, including Jon Wolpaw, Gerv Schalk, Theresa Vaughan, Dennis McFarland, and Bill Sarnacki, deserve thanks both for their kind correspondence and for their tireless and unselfish efforts to promote BCI research through conferences and BCI2000. Thanks also to Niels Birbaumer, Alan Gevins, Melody Moore, Dave Peterson,

and, of course, Mom and Dad!



NOTE ON CHAPTER FIVE AND APPENDIX

The material presented in chapter 5 and the appendix involved the dissertation author and a co-author. The dissertation author was the primary researcher and was responsible for all aspects of the study. The co-author, Jaime Pineda, supervised the research that forms the basis for chapter 5 and the appendix.

CURRICULUM VITAE

NAME: Brendan Allison
DATE OF BIRTH: February 11, 1973
CITIZENSHIP: United States
MARITAL STATUS: Single

PRESENT ADDRESS:
Cognitive Science Department 0515
University of California, San Diego
La Jolla, CA 92093-0515
(858) 534-9754 (lab)
(858) 534-1128 (fax)
email: [email protected]

EDUCATION

University of California, San Diego, B.S. in Cognitive Science, 1994
University of California, San Diego, M.S. in Cognitive Science, 1997
University of California, San Diego, Ph.D. in Cognitive Science, 2003

RESEARCH AND PROFESSIONAL EXPERIENCE

Winter 1993

Undergraduate Teaching Assistant Cognitive Science 17: Neurobiology of Cognition Instructor: Dr. Jaime Pineda Department of Cognitive Science, UCSD


1993 - 1994

Undergraduate Research Assistant Supervisor: Dr. Jaime Pineda Department of Cognitive Science, UCSD

Fall 1995 (ongoing)

Graduate Student Researcher
Dr. Jaime Pineda
Department of Cognitive Science, UCSD

Fall 1995

Graduate Teaching Assistant Cognitive Science 107A: Functional Neurobiology Instructor: Dr. Jaime Pineda Department of Cognitive Science, UCSD

Winter 1996

Graduate Teaching Assistant Cognitive Science 107C: Cognitive Neuroscience Instructor: Dr. David Zipser Department of Cognitive Science, UCSD

Spring 1996

Graduate Teaching Assistant Cognitive Science 17: Neurobiology of Cognition Instructor: Dr. Jaime Pineda Department of Cognitive Science, UCSD

Sum. 1996

Completed laboratory rotation Supervisor: Dr. Richard Buxton Department of Radiology, UCSD


Summer 1996

Completed laboratory rotation Supervisor: Dr. Mark Geyer Department of Psychiatry, UCSD

Fall 1996

Completed laboratory rotation Supervisor: Dr. Martin Sereno Department of Cognitive Science, UCSD

Fall 1996

Graduate Teaching Assistant Cognitive Science 101A: Experimental Approaches to Cognition Instructor: Dr. Christine Johnson Department of Cognitive Science, UCSD

Winter 1997

Graduate Teaching Assistant Cognitive Science 17: Neurobiology of Cognition Instructor: Dr. Jaime Pineda Department of Cognitive Science, UCSD

Sum. 1997

Course Co - instructor (with Dr. Jaime Pineda) Cognitive Science 17: Neurobiology of Cognition Department of Cognitive Science, UCSD

1997 - 1999

Senior Teaching Assistant Department of Cognitive Science, UCSD


Winter 1998

Graduate Teaching Assistant Cognitive Science 17: Neurobiology of Cognition Instructor: Dr. Andrea Chiba Department of Cognitive Science, UCSD

Fall 1998 (ongoing)

Brain Computer Interface Project Coordinator
Cognitive Neuroscience Laboratory
Department of Cognitive Science, UCSD

Sum. 1998

Course instructor (with Jaime Pineda) Introduction to Cognitive Neuroscience Department of Cognitive Science, UCSD

Sum. 1999

Course Instructor Cognitive Science 17: Neurobiology of Cognition Department of Cognitive Science, UCSD

Sum. 2000

Course Instructor Cognitive Science 17: Neurobiology of Cognition Department of Cognitive Science, UCSD

Winter 2000

Course Instructor (with David Leland) Cognitive Science 91: Undergraduate lecture series Department of Cognitive Science, UCSD


Spring 2001

Course Instructor Cognitive Science 91: Undergraduate lecture series Department of Cognitive Science, UCSD

Sum. 2001

Course Instructor Cognitive Science 17: Neurobiology of Cognition Department of Cognitive Science, UCSD

Fall 2001

Course Instructor Cognitive Science 17: Neurobiology of Cognition Department of Cognitive Science, UCSD

Winter 2002

Course Instructor Cognitive Science 91: SCANS Presents Department of Cognitive Science, UCSD

Spring 2002

Course Instructor Cognitive Science 91: SCANS Presents Department of Cognitive Science, UCSD

Fall 2002

Course Instructor Cognitive Science 91: SCANS Presents Department of Cognitive Science, UCSD


Winter 2003

Teaching Assistant Cognitive Science 1: Introduction to Cognitive Science Instructor: Dr. Mary Boyle Department of Cognitive Science, UCSD

Spring 2003

Teaching Assistant Cognitive Science 107C: Cognitive Neuroscience Instructor: Dr. Andrea Chiba Department of Cognitive Science, UCSD

HONORS AND PROFESSIONAL ACTIVITIES

Regents Scholar, 1990 - 1994
Freshman Honors Program, 1990 - 1991
Undergraduate Honors Program, 1992 - 1994
B.S. awarded with high honors, 1994
Awarded "Teaching Assistant of the Year", 1996
UCSD Graduate Fellow, 2001
Awarded Dean's Travel Fellowship, 2001
Awarded Dean's Travel Fellowship, 2002

PROFESSIONAL AFFILIATIONS

Society for Neuroscience
Cognitive Neuroscience Society
Society for Psychophysical Research

MAJOR RESEARCH INTERESTS

- The integration of anatomical, electrophysiological, and behavioral approaches to the study of cognition
- Neural mechanisms of endogenous event-related potentials
- Behavioral and electrophysiological effects of attention and distraction
- Hypnosis, meditation, and altered consciousness and effects on attention
- Prosthetic brain computer interface (BCI) systems based on EEG
- The mu rhythm and its use in BCI systems
- Learning and side effects of long term use of BCI systems
- Single trial analysis of EEG data and applications to BCI systems

FUNDING

None (funded by teaching)

REFERENCES

Dr. Jaime Pineda
Cognitive Neuroscience Laboratory
Department of Cognitive Science
University of California, San Diego
La Jolla, CA 92093
(858) 534-9754

Dr. Andrey Vankov
Cognitive Neuroscience Laboratory
Department of Cognitive Science
University of California, San Diego
La Jolla, CA 92093
(858) 822-2488

Dr. John Polich
Cognitive Electrophysiology Laboratory
Dept. of Neuropharmacology, TPC-10
The Scripps Research Institute
La Jolla, CA 92037
(858) 784-8176

PUBLICATIONS

1. Allison, B.Z., Vankov, A., Overton, J., Cassarino, M., and Pineda, J.A. Selective attention to tactile stimuli during hypnosis and waking conditions. Soc. Neurosci. Abstr., 23(2): 1585, 1997.
2. Allison, B.Z., Vankov, A., Hughes, J.L., Arnoldussen, A., and Pineda, J.A. Near real time recognition of EEG changes associated with movement. Soc. Neurosci. Abstr., 24(1): 435, 1998.
3. Allison, B.Z., Vankov, A., Overton, J., and Pineda, J.A. Selective attention to tactile stimuli during hypnosis and waking conditions. J. Cog. Neurosci., Suppl S: 135, 1998.
4. Allison, B.Z., Vankov, A., and Pineda, J.A. EEGs and ERPs associated with real and imagined movement of single limbs and combinations of limbs and applications to brain computer interface (BCI) systems. Soc. Neurosci. Abstr., Vol. 25, Program No. 568.8, 1999.
5. Pineda, J.A., Allison, B.Z., and Vankov, A. Self-movement, observation, and imagination effects on Mu rhythm and readiness potentials (RPs): Towards a brain computer interface (BCI). In: From Basic Motor Functions to Functional Recovery. Eds. N. Gantchev and G. N. Gantchev, p. 159-164, 1999.
6. Allison, B.Z., Vankov, A., and Pineda, J.A. Mu rhythm BCIs. Presentation at McDonnell-Pew retreat, 1999.
7. Allison, B.Z., Vankov, A., and Pineda, J.A. Rapid recognition of EEG changes preceding voluntary movements. J. Cog. Neurosci., Suppl S: 29, 1999.
8. Allison, B.Z., Vankov, A., and Pineda, J.A. Mu rhythm and readiness potentials (RPs) and applications toward a brain computer interface (BCI). Invited presentation, Brain Computer Interface Conference, June 1999.
9. Pineda, J.A., Allison, B.Z., and Vankov, A. Self-movement, observation, and imagination effects on Mu rhythm and readiness potentials (RPs): Towards a brain-computer interface (BCI). IEEE Transactions on Rehabilitation Engineering, 8(2): 219-222, 2000.
10. Allison, B.Z., Vankov, A., Obayashi, J., and Pineda, J.A. ERPs in response to different display parameters and implications for brain computer interface systems. Soc. Neurosci. Abstr., Vol. 26, Program No. 839.5, 2000.
11. Allison, B.Z., Vankov, A., and Pineda, J.A. The need for speed: designing faster BCI systems. Presentation at McDonnell-Pew retreat, 2000.
12. Allison, B.Z., Vankov, A., and Pineda, J.A. Think and Spell. J. Cog. Neurosci., Suppl S: 148, 2001.
13. Allison, B.Z., Vankov, A., and Pineda, J.A. Toward a faster, better BCI. Invited presentation, Neuroinformatics conference, 2001.
14. Allison, B.Z., Vankov, A., and Pineda, J.A. Soc. Neurosci. Abstr., Vol. 27, Program No. 741.13, 2001.
15. Allison, B.Z., Vankov, A., and Pineda, J.A. Improving a P300 BCI. J. Cog. Neurosci., Suppl S: 150, 2002.
16. Allison, B.Z. and Pineda, J.A. ERPs in response to different display parameters. Poster presented at BCI 2002 Conference, Rensselaerville, New York.
17. Allison, B.Z. and Pineda, J.A. ERPs Evoked by Different Matrix Sizes: Implications for a Brain Computer Interface (BCI) System. In press, IEEE.
18. Allison, B.Z. and Pineda, J.A. Effects of SOA and Flash Pattern Manipulations on ERPs, Performance, and Preference and Implications for a Brain Computer Interface (BCI) System. In preparation.


ABSTRACT OF THE DISSERTATION

P3 or not P3: Toward a Better P300 BCI

by

Brendan Allison
Doctor of Philosophy in Cognitive Science
University of California, San Diego, 2003
Professor Jaime A. Pineda, Chair

A brain computer interface (BCI) is a realtime communication system designed to allow users to voluntarily send messages or commands without sending them through the brain's normal output pathways. BCI users send information by engaging in discrete mental tasks that produce distinct EEG signatures. These tasks, called cognemes, form the basis of a BCI language. In P300 BCIs, users view a display containing several stimuli, one of which is the target. Stimuli are flashed sequentially, and users count target flashes, thereby conveying one of two cognemes (/flash attended/ or /flash ignored/). Only attended flashes produce robust P300s, enabling target identification via the EEG. While BCIs can benefit disabled and healthy users, they are too slow to be practical for most situations. There are three avenues toward improved BCI information throughput. First, users could generate more cognemes per minute. Second, more information could be derived from each cogneme. Third, the EEG patterns associated with each cogneme could be more accurately discriminated.


This dissertation explored four manipulations in three studies designed to improve information throughput in a P300 BCI. In the first study, two flash patterns (single versus multiple flashes) and three stimulus onset asynchronies were manipulated. Results show that the multiple flashes approach derives more information from each cogneme and can create more robust differences in the ERPs. Faster flashes enabled more cognemes per minute, but produced less discriminable ERPs. Some subjects found the multiple flash condition at the fastest SOA too difficult, but this may change with practice. The second study varied the number of elements in the BCI's vocabulary (16, 64, or 144 elements). Smaller vocabularies require less time to identify a target, but often require spelling out phrases not present in the vocabulary. Results showed that larger grids produced more distinct ERP differences. The third study applied independent component analysis (ICA) to data preprocessing. ICA rejected artifacts and isolated ERP components that varied with attention from those that did not. This would effectively speed the recognition and classification of patterns of brain activity. Therefore, the novel display and preprocessing approaches explored in this dissertation can substantially increase BCI information throughput.

CHAPTER 1: INTRODUCTION

One of the most important and distinguishing aspects of humans is the ability to communicate. Communication between people is richer and more complex than any other form of communication, and plays a vital role in any relationship. Similarly, as artificial devices become more complicated and play a rapidly waxing role in everyday life, communicating effectively with them becomes increasingly important.

It is impossible to directly convey thoughts, emotions, or concepts between people. Instead, these must be translated into verbal or written statements, gesticulations, facial gestures, drawings, or other recognizable expressions. Though not typically regarded as such, much of human anatomy is designed to act as a natural interface, allowing people to convey ideas from one brain to another. Verbal and written messages are typically sent using the mouth and throat or the hands and are received by the ears or eyes, all of which are mediated by extensive processing mechanisms in the brain.

While communication between humans has been extensively developed and studied, communication between people and devices – especially sophisticated electronic systems – is relatively embryonic. Only 60 years ago, state-of-the-art computing systems like ENIAC or UNIVAC required punch cards for communication. An efficient interface is no less important than the device itself; imagine trying to use a modern computer with punch cards. Modern means of interfacing with a computer, such as the keyboard and mouse, are vastly superior, but remain nonintuitive and are being continually developed. Sophisticated computer tools such as real-time graphics, multimedia, and ubiquitous computing, combined with future developments in 3-D representation, are creating a complex computational environment in which information overload is common.


In such an environment, the usual modes of communicating with a computer, such as the keyboard and mouse, are very slow and inefficient. The trend toward solving this problem has involved the development of automated task managers, better visualization tools, and more intuitive interfaces that recognize innate human skills, such as handwriting, gesture, and speech. As brain science and computer technologies mature, it is inevitable that the ultimate intuitive interface will involve direct communication between the user's brain and a computer: a brain computer interface (BCI). This safe and non-invasive method of communication requires wearing a small, headset-like device that detects the brain's electrical activity and communicates it to a computer or other electronic device. This method has many potential advantages over current input modes. This dissertation describes and explores specific ways to improve BCI communication.

A BCI is a realtime communication system designed to allow a user to voluntarily send messages or commands without sending them through the brain's natural output pathways. These systems can improve people's ability to convey information via two general avenues. First, they may restore some communication ability to severely disabled individuals, who are sometimes unable to communicate any other way. Second, healthy users may find BCIs an appealing means of supplementing or even replacing other interfaces for a variety of reasons, discussed in chapter 3.

BCIs are not currently in everyday use, and this is not likely to change in the near future. Their main drawback is their very poor information transfer rate. A typical speaker or skilled typist can easily send information at more than 100 words per minute, while the best BCIs currently available allow only a few words per minute.

BCIs have other drawbacks as well; for example, they are more expensive than most other interfaces, require preparation to use, are not supported by common software, and may produce fears of invasive mind reading. However, growing attention to BCI research, as well as ongoing developments in relevant fields such as cognitive neuroscience, pattern recognition, electronics, computing, and brain imaging, provides grounds for optimism regarding the future of BCIs.

While BCIs will not become commonplace among everyday users for many years, they could be quite prevalent someday. BCIs are not at all a new notion in science fiction; movies such as The Matrix, Firefox, and Strange Days, as well as books such as Neuromancer and indeed the entire cyberpunk genre, are predicated on the notion of a BCI-enabled society of the future. As ongoing developments continue in both BCI research and the related fields mentioned above (all of which are important facets of cognitive science), BCIs might become appealing, popular, useful, and even essential.

What might a BCI-enabled society look like? The utopian and/or dystopian views of science fiction are unlikely. Instead, BCIs will become increasingly embedded in other devices, blending with other electronic devices until they are as common and mundane as a wristwatch. Context dependent BCIs are likely, in which function depends on setting. A simple BCI with only three degrees of freedom could serve as a remote control while the user is watching TV, a mouse while using a computer, a speed dialer while holding (or thinking of) a phone, and so on. Using a BCI, like using any other interface, may first be an exciting novelty for a tech-obsessed minority, but will eventually become a mundane part of everyday life.
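To make the gap in information transfer rate concrete, the bits-per-selection figure commonly used in the BCI literature can be computed from the number of available choices and the classification accuracy. The sketch below is a minimal illustration; the accuracies, selection times, and choice counts (a 36-character speller and a three-choice controller like the one imagined above) are assumed values, not measurements from this dissertation.

```python
import math

def bits_per_selection(n_choices: int, accuracy: float) -> float:
    """Information per selection (bits), assuming all choices are equally
    likely and errors are spread evenly over the remaining choices."""
    if accuracy >= 1.0:
        return math.log2(n_choices)
    if accuracy <= 1.0 / n_choices:
        return 0.0  # performance at or below chance carries no information
    return (math.log2(n_choices)
            + accuracy * math.log2(accuracy)
            + (1 - accuracy) * math.log2((1 - accuracy) / (n_choices - 1)))

# Illustrative assumptions: a 36-item speller at 80% accuracy with one
# selection every 15 s, versus a 3-choice controller at 95% accuracy
# with one selection every 3 s.
for n, acc, sec_per_selection in [(36, 0.80, 15.0), (3, 0.95, 3.0)]:
    bits = bits_per_selection(n, acc)
    print(f"N={n:2d}, P={acc:.2f}: {bits:.2f} bits/selection, "
          f"{bits * 60.0 / sec_per_selection:.1f} bits/min")
```

Even under these fairly generous assumptions, the hypothetical speller delivers only on the order of a dozen bits per minute, which is consistent with the few-words-per-minute figure quoted above.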

Concerns that BCIs will bring about a new era of involuntary invasive monitoring among the general populace merit discussion for two reasons. First, most BCI researchers do not want such a world to come about as a result of their work; second, concerns about BCI abuse may impede BCI development. Such concerns are unfounded, and will remain so in the near future, as it is not possible to glean very much information about someone's loyalties, thoughts, bad habits, etc. from the EEG. However, if cognitive neuroscience and brain imaging continue to develop at a rapid pace, this will change. Even if it does become possible to extract information from EEG data contrary to the users' wishes, individuals would have to choose to send raw EEG over the Internet or other media that could be intercepted by the government or another monitoring entity. This is unlikely; modern BCIs process the EEG using local mechanisms, and there are few situations in which it would be preferable to send one's brainwaves to a remote location for processing. Even if one did so, users could and should choose to send a semi-processed EEG signal containing only the information a BCI needs to function, or the signal could be encrypted. People today are generally cautious about sending credit card numbers, passwords, or other potentially damaging information via the Internet; it is difficult to imagine why anyone would haphazardly send raw EEG data if they knew it could be used against them.

However, BCI privacy issues are a distant concern given the much more immediate and practical problems with BCIs. The main obstacle to ongoing BCI development is information throughput.

This dissertation explores four avenues toward improving BCI information throughput. The first study explores both SOA and "flash patterns," a novel approach toward stimulus presentation discussed further in chapter 4. The second study explores matrix size (in effect, the size of the BCI's vocabulary), discussed further in chapter 5. The third study explores a preprocessing approach, ICA, which is useful in isolating EEG activity of interest from noise, discussed further in chapter 6. This dissertation is organized into the following chapters:

1. Introduction
2. Neurobiology of BCIs
3. Current Status of BCI Research
4. Study 1: Effects of SOA and Flash Pattern Manipulations on ERPs, Performance, and Preference, and Implications for a Brain Computer Interface (BCI) System
5. Study 2: ERPs Evoked by Different Matrix Sizes: Implications for a Brain Computer Interface (BCI) System
6. Study 3: Independent Component Analysis (ICA) and its Potential Value in a P300 BCI System
7. Conclusion

Appendix: Additional BCI Work
BCI Glossary
Works Cited

Chapters 2 and 3 review and comment on the literature relevant to BCIs. Chapter 2 provides an overview of EEG recording, including basic signal processing and data analysis, the advantages and drawbacks of EEGs relative to other imaging approaches, and some EEG signals used in BCIs.

Chapter 3 discusses BCIs themselves, including an assessment of BCIs versus other interfaces, a thorough review of existing BCIs and some similar systems with critical and comparative commentary, and avenues toward improving BCIs, including the studies of this dissertation.

Chapter 4 describes the first study, which examines two factors, SOA and flash pattern, and their effects on three measures: EEGs, performance, and subjective report. In any BCI, faster event presentation allows users to send more messages or commands per minute. However, faster event presentation also yields less distinct EEG patterns, and may be impossible or prohibitively difficult for users. This tradeoff is explored using three different SOAs: 125, 250, and 500 ms. BCI information throughput can also be improved by deriving more information from each flash. This may be done by changing the display such that fewer events are required to identify which of several stimuli the user wishes to convey. The first study introduces a new approach toward flashing letters, called the multiple flashes approach, which can identify a target character with fewer events than the conventional single flash approach. The multiple flashes approach also exhibits tradeoffs with signal robustness and difficulty.
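The display factors just described can be put in back-of-the-envelope terms. The sketch below assumes an 8 x 8 grid, ten repetitions of the flash sequence per selection, and perfect classification, and compares a conventional row/column scheme against an idealized scheme in which each flash lights half of the items; the multiple flash patterns actually used in this study (17%, 33%, or 50% of items per flash) differ in detail, so these numbers only illustrate why flashing groups of items and shortening the SOA both raise the ceiling on selection rate.

```python
import math

def selections_per_minute(flashes_per_sequence: int, soa_s: float,
                          repetitions: int) -> float:
    """Upper bound on selection rate: one selection requires `repetitions`
    passes through the flash sequence, with one flash every `soa_s` seconds
    (pauses between selections are ignored)."""
    return 60.0 / (flashes_per_sequence * repetitions * soa_s)

n_items = 64                                      # 8 x 8 grid
row_col_flashes = 2 * int(math.sqrt(n_items))     # 8 rows + 8 columns = 16
half_set_flashes = math.ceil(math.log2(n_items))  # 6 half-set flashes can, in
                                                  # principle, single out 1 of 64

# The three SOAs from this study; ten repetitions per selection is assumed.
for soa in (0.500, 0.250, 0.125):
    for name, flashes in [("single (row/column)", row_col_flashes),
                          ("multiple (half the items)", half_set_flashes)]:
        rate = selections_per_minute(flashes, soa, repetitions=10)
        print(f"SOA {soa * 1000:3.0f} ms, {name:25s}: {rate:5.2f} selections/min")
```

The sketch deliberately ignores the countervailing costs examined in this chapter: shorter SOAs and denser flash patterns also yield less discriminable ERPs and a harder counting task, so realized throughput gains are smaller than this upper bound.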

What set size is best for a BCI? BCIs with large vocabularies may be desirable: if a BCI's vocabulary is so small that users cannot find the item they wish to send, they must circumlocute, spell out a word or phrase, or otherwise spend more time than would otherwise be necessary. However, larger vocabularies also increase the time needed to identify each element in the message. The second study of this dissertation, presented in chapter 5, explores this tradeoff. Subjects viewed three different grids containing 16, 64, or 144 elements each. As in the first study, three dependent variables are explored: EEG measures, performance, and subjective report.

BCI information throughput is also critically dependent on the software used to classify EEG patterns. Better pattern classification software can improve information throughput in two ways. First, fewer errors are made, reducing time wasted on corrections. Second, less signal averaging is required, meaning that fewer events are needed to convey interest in a particular element of the BCI's vocabulary. Chapter 6 presents the third study of this dissertation, in which a preprocessing approach of potential value to BCIs, independent component analysis (ICA), is explored. Conventional data analysis approaches, which rely on measures such as peak amplitude and latency, are vulnerable to noise and may miss important information, especially information present in single trials. Noise is a relatively minor problem in laboratory EEG recording, in which subjects are instructed to remain still and noisy trials can be removed. It is a very serious problem in practical BCIs, in which users move more frequently and the removal of a noisy trial means that a user must repeat herself. ICA can isolate independent contributions to the EEG signal seen at different sites, which can eliminate noise and separate components that vary with attention from those that do not.

Chapter 7 presents a summary of the studies and their contributions to the field, suggestions for future directions, and concluding comments. The appendix describes additional BCI studies the author conducted. While those studies were not part of the dissertation, they are presented nonetheless due to their relevance to BCIs. The glossary contains a brief review of terminology used in this dissertation, and the references list the works cited in this dissertation.
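As a rough illustration of the ICA preprocessing step described in the overview of chapter 6, the sketch below unmixes a multichannel recording into independent components, flags components that track an EOG channel, and reconstructs the EEG without them. It uses scikit-learn's FastICA on synthetic placeholder data; the channel count, sampling rate, and the simple EOG-correlation criterion are assumptions for illustration, not the tools or criteria used in the dissertation's own analyses.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Assumed shapes: 32 EEG channels plus one EOG channel, 60 s at 256 Hz.
rng = np.random.default_rng(0)
fs, n_eeg = 256, 32
eeg = rng.standard_normal((60 * fs, n_eeg))   # placeholder for real EEG (uV)
eog = rng.standard_normal(60 * fs)            # placeholder for real EOG (uV)

# Unmix the EEG into statistically independent components.
ica = FastICA(n_components=n_eeg, random_state=0)
sources = ica.fit_transform(eeg)              # shape: (samples, components)

# Flag components that correlate strongly with the EOG channel (an assumed
# criterion; in practice scalp maps and spectra are also inspected).
corrs = np.array([abs(np.corrcoef(sources[:, k], eog)[0, 1])
                  for k in range(sources.shape[1])])
artifact = corrs > 0.3

# Zero the flagged components and project back into channel space.
cleaned_sources = sources.copy()
cleaned_sources[:, artifact] = 0.0
eeg_cleaned = ica.inverse_transform(cleaned_sources)
print(f"removed {int(artifact.sum())} of {n_eeg} components")
```

Because the unmixing matrix, once estimated, is just a linear transform, the same decomposition can be applied to incoming samples with negligible delay, which is what makes this kind of preprocessing compatible with online BCI use.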


CHAPTER 2: NEUROBIOLOGY OF BCIs

2.1. Why Are BCIs Based on Electrical Measures of Brain Activity?

There are many ways to image brain activity, and different imaging techniques may yield different information because they utilize different correlates of neural activity. EEGs measure the brain's electrical activity, while MEGs measure magnetic activity related to neural function. Functional MRI, PET, and SPECT measure changes in the brain's blood flow, since brain regions may require more blood as they become more active.

EEGs offer five crucial advantages over other functional imaging approaches as the basis for a BCI. First, it is currently impossible to measure changes in magnetic activity or blood flow without expensive and bulky equipment. This factor alone makes all functional imaging approaches other than EEGs impractical for a BCI designed to be used frequently. Scalp EEGs can be recorded with inexpensive and portable equipment. While the initial surgery to implant an intracranial monitoring system requires a hospital, EEGs can then be read for years without additional equipment. Second, preparing a subject for other recording approaches often requires considerable time, risk, and at least one highly trained technician. PET and SPECT both require the injection of radioactive material. This adds considerably to preparation time and creates enough risk to the subject that neither technique could be used frequently enough to allow for consistent BCI use. MEGs and fMRIs are safe, and require less preparation time than PET or SPECT, but nonetheless require a technician.

A subject can be prepped for a laboratory EEG recording session in minutes. The individual who preps the subject needs very little training, and subjects can easily learn to prep themselves. Portable EEG systems are in their infancy, yet some have already been developed that allow subjects to prep themselves in less than one minute (e.g., Pineda et al., 2003; Berka et al., 2003). Intracranial EEGs can also be recorded with very little preparation time after they are implanted. Third, only EEGs and MEGs can measure brain function continuously and analyze it in realtime. A typical EEG recording session produces comparatively small datasets that can be analyzed by an artificial system running on a modest office computer with no noticeable delay. PET, SPECT, and fMRI can only take one snapshot of the brain every several seconds, and at least a few more seconds are required to process that data. While ongoing advancements in processor speed may soon allow any functional brain image to be processed in realtime, the limitations on sampling rate are much more entrenched; that very low sampling rate would result in a very slow BCI. Fourth, because EEGs have been the predominant noninvasive methodology for studying brain function for decades, the relationship between EEGs and brain function is much better documented. For example, the EEG changes resulting from movement imagery are known, and many BCIs utilize these changes as a control signal. These movement-related changes in brain activity are much harder to pick up with other approaches, partly because of the low sampling rate of most approaches but also because of the relatively poor understanding of how movement imagery is reflected in fMRI, PET, SPECT, or MEG. Fifth, other approaches place additional constraints on the subject. All of them require the subject to remain still throughout the imaging session, while modern EEGs do not. fMRI scanners are extremely loud, making it very difficult to hear and concentrate on other tasks. All MRIs also require a very powerful magnetic field, making them unfeasible for many users and creating additional complications for any recording session.

Given these drawbacks, why are other imaging approaches used at all? The main drawback of scalp recorded electrophysiological signals is that they have poor spatial resolution. In fact, it is impossible to determine exactly which populations of neurons are responsible for any one set of electrical recordings on the scalp. Kutas and Dale (1997) note that "... the so-called 'inverse problem' of determining the locations, orientations and time-courses of the set of dipoles producing the electric and magnetic recordings is ill posed, i.e. it has no unique solution. In other words, there are, in general, infinitely many distributions of dipoles inside the brain which are consistent with any set of electric and/or magnetic recordings (Nunez 1981, Sarvas 1987)." The task of determining the exact sources of a signal seen on the scalp requires the use of other techniques with superior spatial resolution. Intracranial EEGs have excellent spatial resolution, but are only usable in extreme circumstances due to the need for surgery to implant the necessary microelectrode(s).

Despite their limitations, EEGs have proven to be a very useful and practical tool in experimental research. Compared to other means of detecting neural activity, such as PET or fMRI, EEG recording is fast, easy, safe, inexpensive, and noninvasive; it requires fairly little subject preparation and very little equipment (all of which is portable), can be done on almost anyone, and can be performed in a wide variety of noisy settings.

Furthermore, interpreting such data is fairly easy given the abundance of relevant research. These constraints are all important for a practical BCI. It is thus not surprising that all published work to date involving a BCI is based on electrophysiological data. There is simply no other imaging technique that is practical as a component of a widely used BCI. In keeping with precedents in the literature (Vidal, 1973; Wolpaw and McFarland, 1994), the term BCI shall hereafter refer specifically to interfaces based on reading and interpreting electrophysiological data unless otherwise noted.
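The non-uniqueness behind the inverse problem quoted above can be demonstrated with a toy linear forward model: when more candidate sources than sensors are modeled, the source-to-scalp mapping has a non-trivial null space, so two quite different source configurations produce numerically identical scalp recordings. The random matrix below merely stands in for a real lead field; the numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sensors, n_sources = 8, 20          # more modeled sources than scalp sensors
lead_field = rng.standard_normal((n_sensors, n_sources))  # toy forward model

sources_a = rng.standard_normal(n_sources)

# Any vector in the null space of the lead field can be added to the sources
# without changing the scalp data at all.
_, _, vt = np.linalg.svd(lead_field)
null_vector = vt[-1]                  # maps to (numerically) zero scalp signal
sources_b = sources_a + 5.0 * null_vector

scalp_a = lead_field @ sources_a
scalp_b = lead_field @ sources_b
print("largest difference at any sensor:", np.max(np.abs(scalp_a - scalp_b)))
print("difference between source patterns:", np.linalg.norm(sources_a - sources_b))
```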

2.2. Why Does Electrical Activity Reflect Cognitive Activity?

The type of brain cell that is primarily responsible for storing and transmitting information is the neuron. All neural communication involves the flow of various ions, such as sodium, potassium, chloride, or calcium, into and out of each neuron. Since these ions carry an electric charge, all neural communication involves changes in the electrical potential inside the neuron relative to the outside, and these changes can be detected at a distance through a variety of means (Kutas and Dale, 1997; Kalat, 1998).

Why is this electrical activity of interest? Most neuroscientists agree that mental and neural activities are necessarily correlated. That is, every thought, emotion, and action we experience is associated with different patterns of activity in networks of interconnected neurons. Thus, a fundamental assumption underlying EEG work is that any mental experience, even if unconscious, has a corresponding electrical signature in the brain that is theoretically detectable.

Unfortunately, acquiring, interpreting, and using the electrophysiological signatures of mental activity are not easy tasks. The brain contains roughly a hundred billion neurons, each of which forms synapses with an average of 10,000 other neurons. Hence, meaningful cognitive activity involves the participation of large, distributed networks of millions of neurons working together. It is this electrical signature, viewed from the scalp, which is of interest to EEG and BCI researchers.

Of course, no BCI can translate all of the brain's electrical activity into messages or commands. Instead, BCIs attempt to classify portions of the user's EEG into one of two or more distinct categories. Each category reflects a distinct mental activity a user may perform in order to communicate via a BCI. For example, a P300 BCI determines whether a user was attending to a recently presented event. Other mental activities a user may perform while using a P300 BCI, such as imagined movement, are meaningless to it. A mu BCI can recognize the EEG patterns associated with some movement imagery, and translate these into messages or commands, but cannot utilize selective attention as an input signal. Hence, different BCIs allow different vocabularies, depending on the mental activities they can discriminate via the EEG. The term "cogneme" shall refer to the smallest amount of mental activity capable of producing a difference in meaning in a BCI. Cognemes are discussed further in section 3.2.
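As a deliberately simplified example of classifying a stretch of EEG into one of two categories (the two cognemes of a P300 BCI), the sketch below reduces each single-trial epoch to one feature, the mean amplitude in a late post-stimulus window at an assumed parietal channel, and trains a linear discriminant to separate attended from ignored flashes. The window, channel, sampling rate, and synthetic data are assumptions for illustration; they are not the methods of the studies reported in later chapters.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

fs = 256                                      # assumed sampling rate (Hz)
win = slice(int(0.30 * fs), int(0.50 * fs))   # 300-500 ms post-flash window

def p300_feature(epoch: np.ndarray) -> float:
    """Mean amplitude of a single-channel epoch in the P300 window."""
    return float(epoch[win].mean())

# Synthetic stand-ins for 1 s single-trial epochs at a parietal site:
rng = np.random.default_rng(2)
attended = rng.standard_normal((80, fs)) + 2.0 * np.hanning(fs)  # crude "P300"
ignored = rng.standard_normal((80, fs))

X = np.array([[p300_feature(e)] for e in np.vstack([attended, ignored])])
y = np.array([1] * len(attended) + [0] * len(ignored))

clf = LinearDiscriminantAnalysis().fit(X, y)
print("training accuracy on synthetic data:", clf.score(X, y))
```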

2.3: EEG Recording and Signal Processing 2.3.1 Macroelectrodes

14 A macroelectrode is a flat or cone-shaped metal disk. It is between 5 and 10 mm wide and typically made of gold, tin, silver, or silver chloride. While macroelectrodes can be permanently affixed to the scalp or even placed inside it, most macroelectrodes are scalp electrodes meant for short duration recording sessions. Scalp macroelectrodes are safe, non-toxic, and can be used by anyone. Macroelectrodes implanted on the surface of cortex can be used safely for much longer periods. Data recorded by these cortical macroelectrodes are called ECoGs. Since electrodes measure the electrical difference between an active and a neutral reference site, it is necessary to have two or more electrodes to measure activity. The minimum number of electrodes for a BCI, therefore, is two. When preparing subjects for recording, it is necessary to first prepare (“prep”) them in order to ensure a clean signal. In clinical and basic research, the area under each electrode is first abraded. This removes the top layer of skin cells, which are dead and conduct electricity poorly, and reduces the noise generated by changes in electrodermal activity, called the galvanic skin response or GSR. Since the skin is only lightly scratched, this process is not painful. Next, a conductive gel is placed between the electrode and the skin, and it is often necessary to fine-tune the electrode-scalp connection by manipulating and adding gel while recording the electrical resistance between the scalp and electrode. When the resistance drops below an acceptable threshold (typically 5 kiloOhms), the electrode is ready for recording. This “prepping” process is substantially faster and easier than prepping subjects for most other means of imaging the brain. New sensor technologies may make this even easier. Semi-dry electrodes utilize a different type of gel than conventional electrodes.

15 Because semi-dry gel is much more viscous than conventional electrode gel, and is prepackaged in the electrode, the process of fine-tuning the connection with the scalp is not necessary. This reduces prep time and eliminates the need for conventional gel. An even more promising development is the dry electrode, which requires no gel at all. Dry and semi-dry electrodes are not widely used at present because they are noisier than other electrodes, but ongoing research will likely address this problem (Taheri et al. 1994). EEG signals are often recorded from an electrode cap containing an array of electrodes. These caps may contain as many as 256 recording sites, though typical caps use 16, 32, 64, or 128 sites. High-density arrays can yield more information, especially about the spatial distribution of the electrical activity. However, they are not feasible for practical BCIs and for many scientific projects, as they require more time to prepare for each subject, produce more data and hence require more processing resources. In addition to scalp electrodes, experimenters often record from one or two electrodes near the eyes to detect the electrical activity associated with eye movements (called the electrooculogram or EOG), as well as an electrode to serve as a ground. The activity produced by eye movements and blinks can be one to two orders of magnitude greater than an EEG signal. Data containing excess artifact from eye movements or other activities such as fidgeting or swallowing is sometimes thrown out as “artifact,” though it is also possible to remove the noise from the data. A BCI could conceivably use EOG information to supplement EEG activity and provide additional degrees of freedom, though this is not yet common.

16 Another issue in electrode placement is how the electrodes are referenced. In a bipolar montage1, electrodes are grouped in non-overlapping pairs, and potentials are recorded between each pair. Hence, a bipolar montage meant to record 16 waveforms would require 32 electrodes. In a monopolar montage, each electrode is paired with a common reference. This common reference may be a single electrode placed at a “neutral” site (such as the mastoid or earlobe) whose activity is presumed to be unaffected by brain activity, or it may be an average of two or more electrodes. For example, the studies in this dissertation utilized a linked mastoid reference, in which activity at each electrode is compared to the mean of two electrodes, one on each mastoid. This approach is superior to the use of a single mastoid reference because it can more reliably detect asymmetric potentials. Some studies reference each electrode to more than two electrodes. Each electrode may be referenced to its neighbors, and it is also possible to refer each electrode to the average of all electrodes, called the “common average reference” or CAR technique. Most cognitive ERP research utilizes a monopolar reference. The data obtained in nearly all of these studies were recorded in shielded rooms in academic laboratories and are relatively free of external sources of noise. BCIs, however, must often operate in noisy environments, where electrical devices can create considerable interference. This is why some BCIs use bipolar differential recording, in which the activity at one site is measured relative to another nearby scalp site. Since both scalp sites tend to pick up

1 Technically, all electrode montages are bipolar, as they detect an electrical difference between two points. However, the definitions of “bipolar” and “monopolar” used here are very widely used in the EEG literature.

about the same noise, most noise is cancelled out, and the remaining signal reflects the difference in activity between two scalp sites.
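To make the referencing schemes described above concrete, the following sketch (in Python with NumPy, purely for illustration; the array names, channel counts, and simulated data are hypothetical) shows how recorded voltages might be re-referenced offline to a linked mastoid reference, a common average reference, or a bipolar derivation.

import numpy as np

# Hypothetical recording: 8 scalp channels plus left/right mastoid channels,
# all measured against the same amplifier reference, sampled at 256 Hz.
fs = 256
n_scalp, n_samples = 8, fs * 10
rng = np.random.default_rng(0)
scalp = rng.normal(0.0, 10.0, (n_scalp, n_samples))      # microvolts
left_mastoid = rng.normal(0.0, 5.0, n_samples)
right_mastoid = rng.normal(0.0, 5.0, n_samples)

# Linked mastoid reference: subtract the mean of the two mastoid channels
# from every scalp channel.
linked_mastoid = (left_mastoid + right_mastoid) / 2.0
eeg_lm = scalp - linked_mastoid

# Common average reference (CAR): subtract the mean of all scalp channels
# at each time point from every channel.
eeg_car = scalp - scalp.mean(axis=0, keepdims=True)

# Bipolar derivation: difference between two nearby sites, which cancels
# noise common to both electrodes.
bipolar = scalp[0] - scalp[1]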

2.3.2 Microelectrodes It is practically impossible to detect the electrical activity of a single neuron without placing an electrode in or very near it. Electrodes designed to monitor one or few neurons inside the brain are called microelectrodes (Carlson, 2002). Microelectrodes designed for acute recording may be made of glass. Those meant for use with BCIs must be capable of chronic recording, and are typically made of tungsten wires insulated with varnish. The tip of a microelectrode is typically sharpened to a point no larger than one micron. This approach is a popular one in animal studies, but cannot ethically be done with humans except when medically necessary, because placing the electrode in the brain causes damage to surrounding tissue and creates the risk of infection. Depending on the nature of the microelectrode and its placement, microelectrodes may record from a single neuron (called single unit recording) or more than one neuron (called multiple unit recording). Since most mental activity is assumed to involve large numbers of neurons, it may seem that information from one or few neurons would create a sampling problem. In fact, studies have shown that recordings from a very small subset of neurons may indeed be enough to estimate the activity of a large population. For instance, voluntary movements require the coordinated activity of millions of neurons in the primary motor cortex. However, each of these neurons sends more or less the same information, and thus recordings from few neurons can reveal information such as the direction and force

of movement (Georgopoulis et al., 1986; Schmidt et al., 1988; Schwartz, 1993; Donoghue and Sanes, 1994; Heetderks and Schmidt, 1995; Nicolelis et al., 1998; Chapin et al., Wessberg et al., 2000). BCIs based on both types of recording from motor areas (microelectrode and macroelectrode) have been built (eg, Chapin et al. 1999, Pineda et al. 2003). Microelectrodes offer a few key advantages over macroelectrodes. Because a microelectrode can be placed at a precise location in the brain, it does not suffer from poor spatial resolution. In fact, the spatial resolution of an implanted microelectrode is far better than any noninvasive functional imaging approach. Similarly, since the electrode is near or inside a target neuron, the signal smearing that occurs as an electrical signal passes through the meninges, CSF, skull, and scalp does not occur, resulting in a cleaner signal. Microelectrodes are less susceptible to some types of noise (such as EM interference from an AC current or monitor) than macroelectrodes. Finally, since current technology does not allow noninvasive measurement of the activity of one or a few neurons, microelectrodes can provide information about brain function invisible to most other approaches. Microelectrodes have long been used in neurosurgery to help diagnose and treat a number of disorders such as epilepsy and Parkinsonism (eg, Jasper and Penfield, 1949; Walter and Crow, 1964). While microelectrodes are only implanted when surgically necessary, surgeons soon began to perform other experiments with intracranial systems in patients who required them (eg Penfield and Jasper, 1964; Ojemann, 1983). Microelectrodes have contributed significantly to research in many areas, including language (eg, Gogolitsin and Kropotov, 1981; Heit et al., 1990; Abdullaev and

Melnichuk, 1997), error detection (eg Bechtereva, 1971, 1978; Gemba et al., 1986; Weitzman et al., 1988), and motor function (eg Georgopoulis et al., 1986; Taylor et al., 2002; for review of microelectrode research, see Bechtereva and Abdullaev, 2000). They also serve as the basis for the implanted BCI systems discussed in section 3.6. Two novel types of microelectrodes have been used in BCI systems. One is the Utah Intracranial Electrode Array (UIEA), which contains 100 penetrating silicon electrodes, each between 1 and 1.5 mm long, arranged in a 10 x 10 array against a square silicon backing. It is placed on the surface of cortex, with the needles penetrating into the brain (Jones et al., 1992). This was shown to be capable of recording neural signals (Nordhausen et al., 1994), and a later study (Maynard et al., 1997) suggested that the information acquired by the UIEA could be used to record motor signals of value to a motor prosthesis. After the UIEA was shown to be capable of chronic recording in cats (Rousche and Normann, 1998), it was used chronically in human BCIs (Donoghue 2002, Serruya et al. 2002). The UIEA is also capable of electrically stimulating neurons (Rousche and Normann, 1999), suggesting it could also be of value for writing information to the brain. Systems designed to write information to the brain, or “CBIs,” are discussed in chapter 3.

Figure 2-1: The Utah Intracranial Electrode Array (UIEA)

A second type is the cone electrode developed by Kennedy and colleagues, which consists of an insulated gold wire inside a hollow glass cone. Before the electrode is placed inside the brain, a sciatic nerve is placed inside the cone. This encourages cortical neurites from adjacent neurons to grow into the cone. This growth activity often makes stable recordings difficult for several weeks after implantation, but the cone can then record successfully for years. The cone electrode was tested in rats (Kennedy, 1989), then monkeys (Kennedy et al. 1992a; Kennedy et al. 1992b; Kennedy and Bakay, 1997), and has been used successfully in human BCIs (Kennedy and Bakay 1998, Kennedy et al. 2000; Kennedy et al. 2002).

Figure 2-2: The cone electrode

2.3.3 Early Signal Processing

The electrical information acquired from microelectrodes or macroelectrodes is first amplified, filtered, and converted from an analog to a digital signal. EEG signals are typically on the order of 100 microvolts and are amplified by a factor of 5,000 – 10,000. Filtering consists of removing frequencies that are not of interest to the experimenter.

There are four types of filters. A lowpass filter allows low frequency information to pass, but removes higher frequencies. A highpass filter allows high frequencies and removes lower ones. A bandpass filter, essentially a combination of lowpass and highpass filters, allows only a band of frequencies to pass, and a notch filter removes all frequencies in a certain range. The most common notch filter is 60 Hz, as AC current is 60 Hz and can cause significant noise in EEG recordings. Filtering is a crucial step in noise reduction, since certain types of artifact occur at known frequencies and cognitive activity very rarely occurs outside of the 3-40 Hz range. Typical filter settings include a lowpass filter of 100 Hz, a highpass filter of .01 Hz, and a notch filter at 60 Hz. Data must also be digitized so they can be analyzed using a digital computer. This is typically done with an analog to digital board (ADB), although ongoing advances in microelectronics have resulted in small and inexpensive chips capable of digitizing EEG data. Though A/D conversion equipment can easily sample on the order of GHz, EEG data is rarely sampled above 512 Hz because the brain simply does not operate fast enough to justify higher frequencies. Sampling too fast provides unnecessary data that could be interpolated if necessary from adjacent data points. Sampling too slowly creates the risk of missing certain frequencies. The Nyquist sampling theorem states that the sampling rate must be greater than twice the maximum frequency of interest in the signal. A common sampling rate for cognitive tasks is 256 Hz. This rate is well above double the maximum frequencies generated by cognitive activity, yet slow enough to avoid extraneous data. Some BCIs (eg, Farwell and Donchin 1988) later resample the data at much lower frequencies in order to accentuate slower components like the P300 and reduce the contributions of faster components. While this resampling

did highlight the P300, the information lost from earlier components may have been useful as well. Another issue in A/D conversion is the bit depth. The greater the number of bits used to represent each sample, the greater the resolution. EEG studies typically use at least 8 bits, allowing resolution of up to 256 distinct points. Sampling with 12 or 16 bits is more common, and 24 bit sampling is sometimes seen with high density arrays. The process of amplifying, filtering, and digitizing the data is a simple, rapid, and automatic one; no human intervention is required. Most subsequent stages of EEG processing require more complex decision-making, and either a human or artificially intelligent system is necessary. These later stages of EEG signal processing may be as important a challenge to BCI designers as the task of classifying the resulting EEG data.
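As an illustration of the filtering and sampling considerations described above, the sketch below applies a bandpass and a 60 Hz notch filter to a simulated signal digitized at 256 Hz using SciPy. The passband used here (roughly 0.5 to 40 Hz) is narrower than the recording filters mentioned in the text and is chosen only for numerical convenience; the simulated signal and all parameter values are illustrative assumptions.

import numpy as np
from scipy.signal import butter, iirnotch, filtfilt

fs = 256                        # sampling rate (Hz), comfortably above Nyquist for EEG
t = np.arange(0, 10, 1.0 / fs)  # 10 s of simulated data
# Simulated EEG: 10 Hz alpha plus 60 Hz line noise and broadband noise.
x = (20 * np.sin(2 * np.pi * 10 * t)
     + 15 * np.sin(2 * np.pi * 60 * t)
     + np.random.default_rng(0).normal(0, 5, t.size))

# Bandpass filter: keep roughly 0.5 - 40 Hz, the range where most
# cognitively relevant activity occurs.
b_bp, a_bp = butter(4, [0.5, 40.0], btype='bandpass', fs=fs)
x_bp = filtfilt(b_bp, a_bp, x)

# Notch filter at 60 Hz to remove AC line noise.
b_n, a_n = iirnotch(60.0, Q=30.0, fs=fs)
x_clean = filtfilt(b_n, a_n, x_bp)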

2.3.4. Artifact Removal

Subjects in EEG experiments will blink, squirm, and glance about, as would be expected of anyone asked to sit in a chair for a long time and engage in what is often a repetitive task. Unfortunately, these movements may introduce periods of electrical noise or artifacts that may be difficult to discriminate from neural activity. Hence, a necessary stage in EEG processing is artifact rejection. Artifacts can dramatically alter the signal recorded at all scalp sites, especially those closest to the source of the noise. The experimenter may also wish to reject portions of the EEG record for other reasons, such as the subject becoming drowsy or failing to perform a task properly. Hence, it is often necessary to view the EEG data, on either a monitor or printout, and manually determine which regions of the data contain excessive artifacts. This is where signal processing

becomes relatively complicated. Determining what is considered artifact, how much artifact is excessive, and removing artifacts from real data require a skilled human or a sophisticated artificial system. There are two general methods for removing artifact from the EEG record. The simplest is to determine which time periods of data contain artifact and remove all data recorded during that period from further analysis. This method is fast, easy to implement or even automate, reliable, and still in use by modern researchers. Though it results in the loss of a certain portion of the data recorded, this is often mitigated by recording for longer periods or larger numbers of trials. However, this method of artifact removal is not acceptable for many experiments. Studies may involve unique subject groups, such as children, or specific tasks that produce excessive artifact, or the data corrupted by artifacts may also contain data of interest. The levels of artifact produced under laboratory recording conditions are much lower than those that would be seen otherwise. A BCI that removed all data containing artifact might be left with too little clean data to be of practical use. More complex means of artifact removal exist. For example, if it is assumed that the measured signal at any one electrode site is equal to the data recorded from the brain (signal) plus some contribution of artifact (noise), it is possible to estimate the contributions of each and subtract the noise. The “clean” or artifact-free data may then be processed further. Early approaches to the task of subtracting artifact from the EEG record, either online or after data collection, were met with limited success (Hillyard and Galambos 1970; Girton and Kamiya 1973; Gevins et al. 1977). However, as a reliable means of

removing artifact from the EEG record and leaving clean data would be of tremendous value, there has been an ample amount of research toward this goal (Barlow and Remond 1981; Verleger 1982; Gratton et al. 1983; Jervis et al. 1985, 1988, 1989; de Beer, 1994; Berg and Sherg, 1994; Huotilainen et al., 1995; Karhunen, 1996; Makeig et al., 1996; Vigario, 1997; Jung et al., 1997, 1998a, 1998b). Many of these newer approaches involve techniques including independent component analysis, neural networks, Kohonen maps, and other methods which were either unavailable or much less well known during the early days of EEG signal processing. Techniques for online removal of EEG artifact without rejecting other data are now prevalent in EEG processing software. However, this software is typically used with data recorded under carefully controlled conditions in a shielded environment. The challenge of quickly and reliably removing artifacts from the EEG record in the noisy environments encountered outside of a laboratory setting remains a difficult but necessary one for BCI designers.
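A minimal sketch of the simplest rejection strategy described above, dropping any epoch whose peak-to-peak amplitude exceeds a fixed threshold, is given below. The 100 microvolt threshold, array shapes, and simulated data are assumptions made for illustration; real systems may use channel-specific thresholds or the more sophisticated subtraction methods discussed above.

import numpy as np

def reject_artifacts(epochs, threshold_uv=100.0):
    # epochs: array of shape (n_epochs, n_channels, n_samples), in microvolts.
    # Keep only epochs whose peak-to-peak amplitude stays below the threshold
    # on every channel.
    peak_to_peak = epochs.max(axis=2) - epochs.min(axis=2)   # (n_epochs, n_channels)
    keep = (peak_to_peak < threshold_uv).all(axis=1)
    return epochs[keep], keep

# Hypothetical usage: 100 epochs, 8 channels, 1 s of data at 256 Hz.
rng = np.random.default_rng(0)
epochs = rng.normal(0, 15, (100, 8, 256))
epochs[::10] += 200            # contaminate every tenth epoch with a large artifact
clean, keep = reject_artifacts(epochs)
print(f"kept {keep.sum()} of {len(keep)} epochs")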

2.3.5. Free Running EEG (FREEG)

The next stages of EEG processing depend on the nature of the experiment and the type of signal desired. EEGs may be analyzed in two ways: as free running EEG or as Event Related Potentials (ERPs). In keeping with precedents in the literature, the term “EEG” shall refer to any EEG activity – free running EEGs or event related potentials. Free running EEGs are sometimes shortened to “FREEGs.” Numerous studies have found that the EEG record contains many regular oscillations that are believed to reflect synchronized rhythmic activity in a group of neurons. These oscillations can be categorized according to their frequency and location. For example, waves with a

25 frequency of 8-13 Hz over the occipital lobe are called alpha waves and appear to reflect “idling” in the visual cortex, while the same frequency oscillations over the posterior frontal lobe are called mu waves and reflect idling activity in motor areas. Rhythmic synchronized EEG activity appears to be most prevalent in areas that are not active at a particular time. Some FREEG components are thus referred to as “idling rhythms” over a specific area. FREEGs can now be divided into several categories, according to the frequency and location of the wave (Coles et al., 1986; Kutas and Dale, 1997). These frequency ranges are not equally studied by all researchers, and there is no universal accord on exactly which frequencies belong in which category. However, most research articles explicitly define the frequency ranges associated with each label, and rarely deviate by more than 1 Hz from the labels described below. Some categories of FREEG activity are:

delta activity (0.5 – 4 Hz) reflects very slow, high amplitude oscillations of neural populations and may be seen in many brain regions. In healthy subjects, delta waves are only seen during deep sleep.

theta activity (4-8 Hz) also appears distributed across the brain. It is associated with very relaxed or deep meditative states and is prevalent during some sleep stages.

alpha activity (8-13 Hz) is most prevalent over the occipital and parietal areas. It is typically associated with relaxed wakefulness and appears in lighter sleep stages, such as stages 1 and 2. Alpha activity has been shown to increase over the occipital lobe when a subject’s eyes are closed. As the occipital lobe is primarily responsible for vision, this observation is consistent with the view that alpha activity represents an idling rhythm over visual processing areas.

mu activity (8-13 Hz) is most evident over the posterior frontal lobe, the region which contains motor processing areas. The mu rhythm is most pronounced when subjects are not moving, and becomes desynchronized when subjects observe or think about movement (Pineda et al., 1998), again consistent with the view that rhythmic synchronized activity reflects an idling rhythm.

beta activity (13-40 Hz) usually consists of irregular, very small amplitude waves which are most prevalent over frontal regions when the subject is alert and engaged in a task requiring mental activity. This observation seems inconsistent with the “idling rhythm” hypothesis, but given the high frequency, erratic nature, and low amplitude of beta waves, they seem to reflect less synchronized activity than the other FREEG types described here.
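The bands listed above can be quantified by estimating spectral power within each frequency range. The sketch below does so with Welch’s method from SciPy; the band boundaries follow the approximate definitions given in the text, and the simulated signal and function names are illustrative.

import numpy as np
from scipy.signal import welch

BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 40)}   # mu shares the 8-13 Hz range but is defined
                             # by its scalp location over motor areas

def band_power(signal, fs=256):
    # Average spectral power in each band for a single-channel signal.
    freqs, psd = welch(signal, fs=fs, nperseg=fs * 2)
    return {name: psd[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}

# Hypothetical example: strong 10 Hz activity should dominate the alpha band.
fs = 256
t = np.arange(0, 30, 1.0 / fs)
x = 20 * np.sin(2 * np.pi * 10 * t) + np.random.default_rng(0).normal(0, 5, t.size)
print(band_power(x, fs))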

2.3.6. Event Related Potentials (ERPs)

Experimenters are often interested in the brain’s electrical activity that precedes or follows a specific event: an event related potential or ERP. This event may be the appearance of a word on a monitor, a tone presented over earphones, the subject pressing a button, or a myriad of other events. Hence, the rhythmic firing patterns that characterize free running EEGs are considered noise. ERPs are assumed to reflect the sum of a series of independent or semi-independent components that are uniquely affected by different task contingencies.

27 ERPs provide a different type of information about mental activity than FREEGs. They reflect the interruption of idling rhythms by sensory, motor, or cognitive events. By studying differences in the brain’s information processing associated with these events, researchers can approach a variety of questions. For example, when do different regions of the brain first respond to an event? What behaviors or cognitive changes are associated with these neural changes? How do ERPs change as the subject’s cognitive state or expectations change? For the purposes of a practical BCI, detection of these changes in a minimum number of trials (ideally, within single trials) is essential. Very few experiments analyze the results of single trial ERPs. As the brain is a very stochastic system, the data from any isolated ERP often contain far too much noise to be very informative through classical pattern recognition methods. Instead, ERPs are generally elicited dozens or hundreds of times from each subject and averaged together. The process of averaging together a large number of epochs causes most extraneous activity (i.e., irrelevant to the event) to cancel itself out, elucidating the electrophysiological signal of interest. If enough time locked trials are averaged together, even minute signals may become apparent. The noise is reduced proportional to the square root of the number of trials averaged (Coles et al., 1986). Many nested averages may be created and analyzed. If three types of stimuli are presented to two subject groups under two conditions, there may be an average ERP for all subjects in the first group responding to the first stimulus type in the first condition, another average for the second group’s response in the first stimulus and condition type, and so forth. The experimenter may then wish to average together across conditions, and

28 then across groups, depending on the questions being posed in the experiment. Averages of averages, called grand averages, are common. Of course, signal processing techniques requiring many trials are not an option for a real time BCI based on ERP measures. Instead, such a BCI must very quickly recognize patterns of interest from a continuous stream of data. The simple fact that the time periods related to a certain event are not extracted before beginning averaging, as is normal in conventional ERP experiments, creates another problem. A BCI dependent on a user’s responses to specific events must be capable of synchronizing the EEG recording software with the presentation software; otherwise, responses to specific stimuli cannot be determined accurately. After the experimenter has created the appropriate averages, the components are scored before statistical analysis. Components are usually scored according to peak amplitude and latency. To determine peak amplitude, one must simply find the highest (most positive) or lowest (most negative) point in a certain time period. Latency refers to the amount of time between the time locked event under investigation and the peak amplitude. Many ERP waveforms are characterized by a letter followed by a number. The letter refers to whether the wave is mostly positive (P) or negative (N) during the time window under investigation, and the number refers to the latency. For example, as will be seen later, one commonly studied ERP component is the P300, so named because it reflects a positive change that generally peaks about (or shortly after) 300 milliseconds after a stimulus is presented. As with free running EEG scoring, there may be many different amplitudes and latencies for any one ERP component, as the experimenter may

be interested in analyzing several time windows separately. For example, the peak amplitude and latency of the P300 usually refer to the most positive point between 250 and 600 ms after stimulus presentation. Free running EEGs reflect ongoing oscillations at specific power spectra independent of any stimuli. ERPs instead contain distinct, nonrepeating components time-locked to specific stimuli. An intermediate means of analyzing EEG data is via event related spectral perturbation (ERSP), changes in ongoing oscillations in response to specific stimuli. These changes are also called event related synchronization (ERS) or event related desynchronization (ERD). The former term refers to increases in overall power due to an event, and the latter to decreases in power. These changes are most commonly used in the Graz BCI, discussed in section 3.4.2 of the following chapter.
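To make the averaging and scoring steps concrete, the sketch below averages a set of time-locked epochs and then scores P300 peak amplitude and latency as the most positive point between 250 and 600 ms, as described above. The simulated data, window, and variable names are illustrative only.

import numpy as np

fs = 256
epoch_len = int(0.8 * fs)                 # 0 - 800 ms after stimulus onset
times_ms = np.arange(epoch_len) * 1000.0 / fs

def average_erp(epochs):
    # Average time-locked epochs; noise falls roughly with sqrt(n_epochs).
    return epochs.mean(axis=0)

def score_p300(erp, lo_ms=250, hi_ms=600):
    # Peak amplitude (most positive point) and latency within the window.
    window = (times_ms >= lo_ms) & (times_ms <= hi_ms)
    idx = np.argmax(np.where(window, erp, -np.inf))
    return erp[idx], times_ms[idx]

# Hypothetical single-channel epochs: a small P300-like bump buried in noise.
rng = np.random.default_rng(0)
p300_shape = 5.0 * np.exp(-((times_ms - 350) ** 2) / (2 * 60 ** 2))
epochs = p300_shape + rng.normal(0, 20, (80, epoch_len))

erp = average_erp(epochs)
amp, lat = score_p300(erp)
print(f"P300 peak {amp:.1f} uV at {lat:.0f} ms")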

2.4. How Can EEGs Be Useful?

This question can be explored along two perspectives. The first, which has guided nearly all EEG research, is pure research. The second is applied research. Investigators who use EEG signals are rarely, if ever, interested only in the electrophysiological differences between groups or experimental conditions. Rather, they use these as a tool to elucidate larger questions about how the mind and brain work. That is, they try to fit the results into an existing or new model or theory, in which the EEG record is recognized as a byproduct of neural activity rather than a control signal.

2.4.1 What Can We Learn About the Mind and Brain from EEGs?

30 As noted above, EEGs have several practical advantages over other recording techniques. The greatest strength of EEGs is their temporal resolution. Unlike any other imaging approach except MEGs, they provide information on a millisecond-bymillisecond basis, and thus are an excellent way to study the time course of mental processes. Changes in ERP component latency between different groups and conditions can be assumed, for example, to reflect changes in stimulus processing. Furthermore, the magnitude of an ERP signal or free running EEG pattern presumably indicates the strength and extent of synchronous activation from the responsible neural generators. If old and young subject groups are each given the same task, and the task elicits a similar waveform in both groups with a smaller peak amplitude in the older population, it can be inferred that the task might be producing less coordinated neural activity in the older subjects. Many other reasonable conclusions can also be drawn from these basic assumptions. For example, if two different tasks produce ERP waveforms with similar temporal and spatial characteristics, it may be because both tasks involve similar mental processes. Though BCIs based on measurements of frequency, amplitude, latency, and power have been successful, it is very likely that BCIs utilizing more sophisticated preprocessing and discrimination techniques, such as ICA, neural networks, and Markov models, will soon become prevalent. This not only benefits future BCIs, as noted above, but makes it possible for future EEG studies to reveal more about the brain. Another driving force behind more sophisticated analysis techniques is the need to address noise. ERPs in the context of pure research are typically recorded from subjects in a room shielded from sonic and electromagnetic interference, during a short time period, with the subjects being asked to minimize extraneous movement and avoid a variety of behaviors (such as drinking alcohol, sleep deprivation, or not eating), as well as other constraints. If one or more of these constraints is violated, the task of determining whether variance in

31 the EEG signal stems from changes in the testing condition or extraneous factors (often called “confounds“) using only these four simple measures becomes increasingly difficult. In their 1997 review titled Electrical and magnetic readings of mental functions, Kutas and Dale write: [In many cognitive ERP studies], an undue emphasis was placed on looking at the ERP waveform, specifically at the largest effects on peaks and troughs that could readily be discerned with the eye. In the previous section we detailed why this approach may be problematic. However, whatever might have been missed, the effects that have been reported tended to be the largest, most reliable and undeniably real; thus, each must be explained by any viable theory of the function under study. Moreover, despite the technical differences between the various brain imaging techniques, it is our belief that much time can be saved by using this history to guide contemporary research in brain imaging for cognitive purposes (p. 214-5).

Indeed, many of the experiments were performed when more complex means of analyzing EEGs were not available or considered reliable. While any well designed EEG experiment helps clarify how the brain functions differently under different circumstances, some of the experiments presented were performed for the primary purpose of detecting confounds (such as recent eating or loss of sleep) to warn future experimenters so that they could design experiments more effectively. It is important to bear these confounds in mind when considering the design of any BCI system.

32 Nevertheless, any source of variance considered a confound to one experimenter may provide valuable information to another. Furthermore, as pattern recognition techniques continue to develop, it will become increasingly easy to identify the unique electrophysiological changes introduced by any single source of variance. For example, as will be seen shortly, many factors can reduce P300 amplitude, including fatigue, changes in food, nicotine, and alcohol consumption, aging, and inattention to stimuli (Polich, 1998). Based on classical analysis techniques, it is impossible to determine whether a reduction in P300 amplitude is due to any one of these factors. If experimenters are interested in designing a BCI to detect changes in attentiveness to stimuli (hence viewing all other sources of P300 amplitude variance as confounds), they must ensure that none of these, nor other factors which may confound the data, vary significantly.

However, it is highly probable that each of these sources of P300 amplitude variance possesses its own unique electrophysiological signature that will be identified as pattern recognition techniques develop further. If the unique electrophysiological characteristics of any confounding variable can be identified and recognized as distinct from the electrophysiological characteristics of the variable of interest, then that confounding variable is no longer a confound. Instead, it is simply providing additional information that may be of value to the design of a BCI. Sources of EEG variance are not only challenges for BCI designers, but also opportunities to greatly enhance our understanding of exactly how changes in both the environment and the cognitive, emotional, perceptual, and physiological state of a subject can affect the mind and brain.

2.4.2 How Can EEGs be Useful in BCIs?

The question of how EEGs could be useful can be interpreted from another perspective as well, that of applied science. A BCI designer, like a research scientist, is interested in the relationship between EEGs and cognition. However, a BCI designer is less concerned about more global ramifications for human information processing and brain function. An EEG study considered “useful” to a pure scientist because it elucidates the nature and time course of information processing may be of no interest to a BCI designer if the study involved EEGs and/or cognemes that could not be utilized in a practical BCI. There are several criteria a BCI designer should consider when deciding what cognemes and resulting EEGs are best suited to BCIs:

1) Discriminability: In all BCIs, a user repeatedly generates one of at least two cognemes, each of which can yield distinct EEGs. Hence, cognemes that produce substantially different EEGs are preferable to cognemes that produce similar EEGs. A more robust difference between EEGs provides more information for the pattern recognition system used and thereby improves accuracy.

2) Universality: Since most BCIs are designed to be used by severely disabled individuals, certain types of activity common in academic studies, such as pressing a button, are not possible. Other tasks may not be appropriate for specific groups; for example, a P300 BCI may not function well for individuals with ADD or difficulty recognizing and attending to object location (as might occur after damage to posterior parietal areas). Universality could also mean that all individuals should be able to voluntarily generate a similar EEG pattern. If the same cogneme generates widely different EEGs in different subjects, this creates a more difficult problem for the pattern recognition approach, which must be capable of adapting to each individual user.

3) Side effects: In a BCI, a user engages in specific mental activity for the purpose of generating specific EEG patterns, sometimes for several hours per day. During training, users learn how to accentuate the EEG characteristics unique to each cogneme. This process may produce side effects. While this issue has not yet been explored, it is very important. Cognemes that can be generated without side effects – or that produce desirable side effects – are preferable to those that yield negative side effects.

4) Presence in alert, waking users: Some types of EEG activity cannot be easily generated in BCIs. Delta activity, for example, is only seen during sleep in healthy adults. Theta activity can only be generated during sleep or meditative states.

5) Relationship to cognitive factors: BCIs rely on the fact that users can voluntarily alter their EEG activity through mental activity. Thus, EEGs that cannot be easily modulated via voluntary mental activity are not useful in BCIs. The auditory brainstem response (ABR) is an example of an easily recognized component present in all healthy adults. However, like most early components of the ERP, it varies only with stimulus characteristics and not cognitive factors. While it is useful for diagnosing hearing deficits, it cannot reveal information about a user’s tonic or phasic mental state.

6) Attentional requirements: As described in the following chapters, the cognemes used in some BCIs require the user’s complete attention, while others do not. Similarly, some cognemes seem to be very fatiguing for the user, while others are not. BCI designers should try to utilize cognemes that users can perform easily and effortlessly over extended periods.

7) Training requirements: Some BCIs utilize EEG components that cannot be easily generated by an untrained user. Days or even months of practice may be necessary before a user can use such BCIs effectively. Other BCIs (such as P300 BCIs) that can be used with no training may be preferable.

8) User preferences: Users may exhibit a preference and facility for eliciting certain cognemes independent of the other criteria described here. This may not be important for casual BCI users, but individuals who rely on a BCI as their sole means of communication may consider this the most important criterion.

9) Intuitiveness: Most BCIs, like most other interfaces, exhibit no natural mapping between the message or command sent and the activity needed to generate it. If a BCI user wanted to spell “cat,” she would not think of a dog, nor anything remotely catty. It is possible that BCIs using more literal, direct, natural mappings between cognemes and outputs would require less training and be easier to use than other BCIs. This is discussed further in the following chapter.

The first criterion, discriminability, is dependent on the pattern recognition approach used. Most BCI work is grounded on the assumption that a difference between two brainwaves that is apparent to the pattern recognition system usually used to discriminate different EEGs – the human brain – is also apparent to an artificial pattern recognition system. This assumption is generally valid, especially for simpler pattern recognition systems. For example, the difference between large and small P300s (or high and low amplitude mu waves) is easily recognized by humans or simple systems. However, it is becoming increasingly clear that more sophisticated preprocessing and pattern recognition approaches can identify EEG differences not apparent to the human eye. Preprocessing approaches like ICA may highlight components not apparent to the naked eye. Similarly, it is relatively difficult for humans to notice changes in alpha and beta activity that become apparent after a FFT, or subtleties of temporal and spatial distribution that a neural network may find informative. The fact remains that obvious differences between EEGs such as amplitude are highly informative. Yet there is a need for much more research on less obvious aspects of the EEG. This could improve the performance of a conventional BCI as well as enable

BCIs utilizing new cognemes that were previously ignored because they did not produce EEG differences apparent to the naked eye.
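As a deliberately simple illustration of the discriminability criterion, the sketch below places a threshold between the mean single-trial amplitudes of two hypothetical cognemes (attended versus unattended stimuli) and measures classification accuracy. The feature, distributions, and classifier are assumptions chosen for clarity; practical BCIs may rely on richer features and methods such as ICA or neural networks.

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical single feature per trial: mean amplitude 250-600 ms at one site.
target_amp    = rng.normal(8.0, 3.0, 200)   # attended (oddball) trials
nontarget_amp = rng.normal(2.0, 3.0, 200)   # unattended trials

# Simple threshold classifier placed midway between the class means.
threshold = (target_amp.mean() + nontarget_amp.mean()) / 2.0

hits            = (target_amp > threshold).mean()
correct_rejects = (nontarget_amp <= threshold).mean()
accuracy = (hits + correct_rejects) / 2.0
print(f"threshold {threshold:.1f} uV, accuracy {accuracy:.2f}")

# The farther apart the two distributions are (better discriminability),
# the higher the accuracy any classifier can reach.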

2.4.3 What Can’t EEGs Tell Us? A few prerequisites must be met for the activity of any network of neurons to be visible to an electrode on the scalp. First, the neurons must generate most of their electrical signals, or field potentials, along a specific axis oriented perpendicular to the scalp. Neurons parallel to the scalp do not produce electrical activity visible outside that region of the scalp. Some types of neurons, such as radially symmetric neurons (shown in the figure) have dendritic extensions in all directions. Thus, though the neuron may be very active, the net electrical charge visible to an outside recording device will be zero (de No and Honrubia, 1965). Second, the neuronal dendrites must be aligned in parallel and point in the same direction so that their field potentials summate to create a signal that is detectable at a distance. Randomly oriented neurons, such as those shown in the figure below, would not produce a signal recordable at a distance. The activity at any one synapse may be cancelled out by activity at a synapse oriented in the opposite direction. Fortunately, many types of neurons in the cortex (such as pyramidal cells) are aligned and oriented in one direction perpendicular to the surface. A third constraint is that the neurons should fire in near synchrony. If a line of neurons fires sporadically or asynchronously, then the resulting electrical signal will be a slow change in voltage over an extended time if it is detected at all, rather than a sudden and sharp electrical difference easy to detect from a surface electrode. This is a serious

38 constraint. The meninges surrounding the brain and the scalp contain numerous layers of different tissues, each of which may smear or dampen the electrical signal in different ways (i.e., low spatial filtering). Thus, electrical activity substantial enough to be apparent to an intracranial electrode may be invisible to a scalp electrode. Furthermore, even when a signal is visible outside the scalp, it is difficult to determine which brain region created it because of this smearing. This is why EEG recordings have poor spatial resolution.

Figure 2-3: Examples of neuronal populations whose activity would be difficult or impossible to detect with a scalp electrode.

39 Fourth, the electrical activity produced by each neuron needs to have the same electrical sign. If half of the neurons experience a negative change in voltage with respect to the surrounding fluid, and the other half experience a positive voltage change, the net effect visible at a distance will be zero. Two further constraints on what mental or cognemic activity can be read with modern imaging approaches are relevant. First, most of what is generally perceived as mental activity- such as memories, concepts, and emotions- depends on populations of neurons dispersed throughout the brain. Many of these populations are invisible to scalp electrodes; worse, they are distributed so widely that they cannot easily be isolated from other neural activity. Neurons and neuronal populations exhibit remarkable multitasking, and the neural activity of interest must be somehow distinguishable from other neural activity. As scientists have a very poor idea how neurons represent and communicate memories, concepts, and emotions, it is difficult to know what electrical or other activity to seek. Another very serious limitation relevant to all imaging approaches is that the brain’s representation of memories and concepts differs widely between individuals. Subject A may have the same sleep stage EEGs as subject B, since the underlying physiology and sleep stages are the same, and thus it is easy to isolate and interpret sleep related EEGs in either subject. But finding a consistent and universal bioelectric correlate of the words “tree” or “justice,” or a memory of one’s 7th birthday, is unlikely without significant technological developments. There are constraints other than those listed above, and it may seem that the task of deriving any useful information from an electrode on the scalp is a daunting one.

Indeed, the overwhelming majority of neural communication is practically invisible to a scalp electrode. The information provided by electrodes represents a small portion of neural (and thus cognitive) activity. Furthermore, this section has only presented the problems inherent in finding a discernible voltage change. What to do with that information presents numerous additional challenges that will be addressed shortly.

2.5. What Have EEGs Told Us About Cognition?

A primary criterion in determining which papers to review was their potential contribution to a BCI. Hence, many well-studied ERP brainwaves, such as the auditory brainstem response (ABR), N1, and Pmp, have been ignored. No existing BCI systems use these brainwaves. In contrast, there are several published articles describing BCIs based on P300s, and the known sources of P300 variability could be utilized in a BCI in a myriad of ways.

2.5.1 ERPs: P300s

While the P300 may appear in response to a wide variety of stimuli, it is typically elicited in what is called the “oddball” task. In this task, the subject is presented with two types of stimuli, one of which is frequent. The infrequent or oddball event is made the target of the subject’s attention, perhaps by asking the subject to push a button each time it appears, or keep a mental count of it. In this paradigm, the target and background stimuli elicit similar responses, but the magnitude of response to targets is larger and the difference is considered the P300. Other paradigms have been used to evoke a P300,

including the absence of an expected stimulus, an indication that the P300 reflects mainly endogenous processes. Debate continues as to the exact functional significance of the P300. The debate is complicated considerably by the fact that the P300 does not originate in any one region of the brain (Ruchkin et al., 1990; Johnson, 1986, 1998a, 1993), nor do all P300s look the same. Differences in the nature of the subject and the eliciting stimuli can produce a wide variety of P300-like waves that may well reflect significantly different neural processes. Two subtypes of the P300, called the P3a and P3b, have been recognized for decades (eg, Courchesne et al. 1975; Squires et al. 1975; for review see Polich 1988). More recent work using independent component analysis (ICA) has shown that the P300 may consist of numerous components that reflect stimulus, cognitive, and motor response characteristics (Westerfield et al. 1999, Jung et al. 2000). What matters most to BCI designers is that the P300 does not simply reflect sensory processing of incoming stimuli, but rather reflects higher cognitive processes that may be involved in evaluating the stimulus and updating one’s view of the world (reviewed in Verleger 1997; Polich 1998). Hence, individuals can voluntarily generate different P300 responses to similar stimuli simply by altering their evaluation of them. P300s vary predictably with any factor that affects a subject’s cognitive evaluation of stimuli, such as:

(1) The nature of the stimuli and the ability to detect them;

(2) The processing of the stimuli;

(3) The cognitive state at the time stimuli are presented; and

(4) The overall ability to process stimuli once detected.

2.5.1.1 Stimuli and their detection

42 One critical determinant of P300 amplitude is information transmission. If a subject is unable to detect or categorize a stimulus event, as may happen if it is too faint or presented too quickly, P300 amplitude is reduced (Johnson, 1986). It is thus vital that any stimuli used in BCIs be easily discriminable by the user. Since some users are worse at signal detection than others, BCIs should be capable of presenting slower, brighter, or otherwise more discernible stimuli for such users. Why might some users be better at signal detection than others? Visual deficits, aging, fatigue, and other factors might make it difficult to detect stimuli, especially at faster presentation speeds. Another factor not appreciated in the literature is the subject’s familiarity with fast displays. The rapidly growing use of video games, action films, and other rapidly changing displays raises the question of whether those who frequently view such displays might be better at processing rapidly presented stimuli. That issue is explored in the first study of this dissertation. Another well-known determinant of P300 amplitude is stimulus probability, whether local (Ford et al., 1982; Duncan-Johnson et al. 1984; Polich et al., 1990) or global (Johnson, 1986, 1988; Donchin and Coles, 1988; Polich, 1998). These studies all show that targets that are more probable evoke smaller P300s. A very recent study (Gonsalvez and Polich, 2002) noted that target-to-target interval, or TTI, might be a more useful means of gauging P300 amplitude. Since more probable targets occur more frequently, their TTI is shorter than less probable targets if the overall rate of stimulus presentation is held constant. The ramification for P300 BCIs may seem obvious: minimize probability – or, maximize TTI. While this can increase P300 amplitude, it also means that target stimuli are presented less frequently, reducing overall speed. This tradeoff is explored in the first study of this dissertation and is discussed further in the following chapter.
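The tradeoff between target probability and target-to-target interval can be illustrated with a short simulation: generating randomized oddball sequences at several target probabilities while holding the overall stimulus rate constant, then measuring the mean TTI. The stimulus onset asynchrony and probabilities below are arbitrary illustrative values.

import numpy as np

def mean_tti(p_target, n_stimuli=10000, soa_s=0.5, seed=0):
    # Mean target-to-target interval (seconds) for a random oddball sequence
    # with target probability p_target and fixed stimulus onset asynchrony soa_s.
    rng = np.random.default_rng(seed)
    is_target = rng.random(n_stimuli) < p_target
    target_times = np.flatnonzero(is_target) * soa_s
    return np.diff(target_times).mean()

for p in (0.05, 0.10, 0.20, 0.50):
    print(f"p(target)={p:.2f}  mean TTI ~ {mean_tti(p):.1f} s")

# Lower target probability yields a longer TTI (and typically a larger P300),
# but each target is also presented less often, reducing overall speed.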

2.5.1.2 Stimulus Processing

Aside from discriminability and probability, the relationship between stimulus characteristics and the P300 is relatively unimportant in a BCI. What would be the value of a P300 BCI that could reveal information about a recently presented stimulus? Since P300 BCI systems present all stimuli used to generate P300s, there is no reason to explore stimulus characteristics via the EEG. One observation that supports the view that the P300 reflects stimulus evaluation rather than simply stimulus recognition is the repeated finding that a subject’s expectations or beliefs about a stimulus affect the P300 produced. Stimuli that are more salient produce larger amplitude P300s than less meaningful events (eg, Sutton et al. 1965; Johnson, 1986, 1988; Ruchkin et al., 1990). Any stimulus can easily be made more salient than others simply by asking the subject to attend to it. One early and very robust finding is that the P300 is larger for attended than unattended events (e.g. Wilkinson and Morlock, 1967; Hillyard et al., 1971; Picton and Hillyard, 1974; Squires et al., 1975; Neville and Lawson, 1987; Heinze et al., 1989, Luck et al., 1990; Katayama and Polich, 1997). A classic and often replicated task that has shown this is the Posner paradigm. In this paradigm, subjects are asked to respond to a stimulus that may appear in any of several locations. Before the stimulus appears, subjects are cued to a certain region, causing them to direct attention there. In trials in which the cue more reliably predicts the location of the stimulus (and thus the stimulus appears in an attended region), called valid trials, P300 amplitude is higher than in invalid trials, in which the subject’s attention was directed away from the region in which the stimulus later appeared. Furthermore, if the subject is given no cue regarding the location of the upcoming stimulus, the resulting P300 amplitude is smaller than it is during valid trials but larger than invalid trials.

44 All P300 BCIs depend on the fact that the P300 varies predictably with selective attention. Furthermore, because P300 amplitude is reduced if subjects accurately anticipate the location of a stimulus, P300 BCIs randomize the presentation of events2. P300 amplitude can also be increased if temporal uncertainty is high; that is, if subjects cannot anticipate the exact time a stimulus will occur. The studies in this dissertation varied ISI as well as stimulus location, making it impossible for subjects to accurately predict either one. A related finding is that P300 amplitude is larger for remembered items (Karis et al. 1984; Fabiani et al. 1986; Fabiani et al., 1990; Noldy et al., 1990). If a list of words to be remembered is presented to a subject, words which are later recalled show a higher P300 amplitude than words which are later forgotten. This may be useful in designing devices to help people learn new information or to anticipate memory failure. The P300 memory effect appears even if the subject is told to try to hide this effect from EEG recording. For example, Allen and Iacono (1997) asked subjects to remember a list of words. The subjects were then presented a new list containing some of the previously seen or old words mixed with new words. Some subjects were asked to lie about whether they had seen the words on the new list, while other subjects were given the same instructions and told they would be given a cash reward if their brainwaves did not reveal that they were lying. Subjects in the latter condition had higher P300 amplitude to words on the new list than the first group of subjects. The authors suggested that the latter group had a larger P300 because the stimuli were especially salient to them, as they were

associated with a financial reward. As the authors note, this finding could be very useful in situations in which reliable deception detection is necessary. Similar findings have been reported by others (Farwell and Donchin, 1991; Rosenfeld et al., 1991; Farwell and Richardson, 1993; Farwell and Smith, 2001) and serve as the basis for “brainwave fingerprinting,” a BCI-like paradigm described further in the following chapter.

2 Farwell and Donchin (1988) acknowledge that presenting stimuli in a predictable order would reduce P300 amplitude, but might cause the appearance of a different EEG component, the contingent negative variation or CNV. The CNV might also be useful in a BCI under these circumstances. Would the information gained from the CNV be sufficiently informative to justify the reduction in P300 amplitude? This interesting question has never been explored.

2.5.1.3 Current Mental State

Since the P300 is reflective of a subject’s cognitive processing of stimuli, it follows that temporary variations in a subject’s mental state, such as those induced by alertness, biorhythms, eating, or substance use would produce predictable changes in the P300 (for review, see Polich and Kok, 1995; Polich, 1998). Numerous experiments have supported this hypothesis. Individuals who are tired through lack of sleep or recent exercise show a longer P300 latency than normal subjects do. Many species exhibit cyclic variations in activity, including a 90 minute ultradian cycle, circadian patterns such as the sleep/wake cycle, and circannual or seasonal fluctuations. All of these have been found to influence P300 measures in humans, though menstrual cycles do not. Subjects who have eaten recently show a higher amplitude and shorter latency than those who have not. Recent nicotine consumption affects both behavioral and P300 measures in some tasks (Houlihan et al., 1996; Pineda et al., 1998). Caffeine, alcohol, and other substances have also been shown to influence the P300 (Lorist et al., 1994; Sommer et al., 1993; Callaway et al., 1983). While much of the above work was motivated by the desire to avoid confounds produced by such temporary fluctuations in an individual’s state, the resulting knowledge about P300 variability could be useful in BCI design.

2.5.1.4 Overall Mental State

A fourth category of factors that affect the P300 consists of aspects of a subject that cannot be expected to vary significantly over a short time period (e.g., several days or weeks). This category includes many factors related to a subject’s medical condition, such as dementia, psychological disorders, or lesions, and it is of particular interest to people interested in designing BCIs for individuals suffering from these disorders. P300 latency has been shown to increase along with a reduction in P3 amplitude in elderly subjects and individuals with senile dementia (Goodin et al., 1978a, 1978b). Polich et al. (1986) showed that the increase in P300 latency is correlated with severity of impairment. One experiment of note is from Polich et al. (1990a), who found that P300 measures could discriminate between normal aged subjects and aged subjects suffering from early stages of Alzheimer’s disease. As detecting Alzheimer’s disease is difficult in its early stages, this observation may be of considerable value in designing diagnostic systems (Polich and Hoffman, 1997a). The wide variety of brain lesions produces a myriad of potential electrophysiological changes (for review, see Knight and Scabini, 1998). Theoretically, ERP measures could be useful in diagnosing the type, location, and severity of lesions, though they will probably be supplementary to methods with better spatial resolution such as MRIs.

As noted above, the P300 varies predictably after a subject uses substances like alcohol or nicotine. Individuals who are at risk of alcoholism also show numerous electrophysiological effects, including P300 differences (for review, see Porjesz and Begleiter, 1998). The P300 is different in persons addicted to alcohol (Begleiter and Porjesz, 1995) and nicotine (Pineda et al., 1998), even if they have not recently consumed any drugs.

Since individuals with psychiatric disorders have altered cognitive

processes, it should be possible to detect the presence and severity of a psychiatric disorder using EEG measures. The P300 has been shown to differ in people suffering from obsessive compulsive disorder (Marks 1997), schizophrenia (Ford et al., 1994; Rao et al., (1995); Turetsky et al., 1998a, 1998b), and depression (Bruder et al., 1995). Numerous other disorders, including epilepsy, Parkinson’s disease, and multiple sclerosis, can affect P300 measures (e.g. Pulvermuller et al., 1996; Naganuma et al., 1998). These factors cannot be easily manipulated by a user or BCI designer. Nonetheless, they are important for two reasons. First, they may create meaningful exclusion criteria for P300 BCIs; individuals unable to generate robust and reliable P300s may need to use a different type of BCI. Second, any P300 BCI used by an individual with a condition likely to produce unusual P300s, such as senility or depression, should account for the user’s condition and its likely effects on the P300.

2.5.2. Event Related Potentials: Slow Cortical Potentials (SCPs)

SCPs refer to a variety of different EEG components, including the contingent negative variation (CNV), stimulus preceding negativity (SPN), post imperative negative variation (PINV), and readiness potential (RP). The contingent negative variation (CNV) is a slow negative shift in the EEG seen during the interval between two stimuli. The first stimulus must signal that a second stimulus will occur after a delay; hence, the second stimulus is contingent on the first (Walter et al., 1964; Brunia et al., 1995; McCallum and Curry, 1994). The CNV is believed to reflect at least two processes: an orienting response to the first (or warning) stimulus and expectation of the second (or imperative) stimulus. Many authors (e.g., Damen and Brunia, 1994) refer to the latter, or SPN, as the "true" CNV. This is reasonable, as the orienting response is not contingent, while the expectation is. The CNV typically returns to baseline after the imperative stimulus, or even crosses baseline and develops into a late positive component. However, it may also remain negative after the imperative stimulus; this ongoing negativity is called the PINV. PINVs have never been used in BCIs. While RPs are considered SCPs, they are addressed below in a separate section because BCIs utilizing SCPs are functionally distinct from BCIs utilizing RPs. The Thought Translation Device (TTD) developed by Birbaumer and colleagues utilizes SCPs and is described in chapter 3, section 3.3. Some versions of the Graz BCI developed by Pfurtscheller and colleagues utilize RPs and are discussed in chapter 3, section 3.4.

2.5.3. ERPs: Readiness Potentials (RPs)

Most ERPs reflect neural activity after stimulus presentation. However, it is also possible to explore the brain's behavior prior to an event. This would be fruitless unless

the event were somehow initiated, or at least anticipated, by the subject. The event under investigation would also have to be detectable to an outside observer, as it is crucial to be able to define the millisecond in which the event occurs. The brain activity preceding a thought or emotion is very difficult to isolate, though it could be useful to a BCI. The brain is responsible for movement, and it is easy to determine exactly when a movement is initiated. The first reports of a change in EEG recordings preceding a movement were published over 30 years ago (Kornhuber and Deecke, 1964, 1965). This change in potential, which reflects a subject's readiness to move and is therefore called the readiness potential (RP, also referred to as the Bereitschaftspotential or BP), appears as a negative voltage deflection beginning over half a second before a movement is initiated (Kristeva et al., 1978; Praamstra et al., 1995; Dirnberger et al., 1997). As with the P300, the RP was found to consist of multiple subcomponents (Deecke et al., 1976, 1984; Kutas and Donchin, 1980; Hackley and Miller, 1995). Libet (1985) suggested that the first of these might reflect an unconscious decision to act, while the later component reflected conscious awareness of the upcoming movement. Deecke (1987) challenged this view by arguing for distinct neural sources for the two components. Specifically, Deecke et al. suggested that initial RP activity stems from the SMA, while the later activity is generated by the primary motor cortex. The readiness potential can be used to identify a user's intent to move well before the movement occurs. In fact, the first "BCI"3 described in the literature allowed a user

3 Whether Walter’s system is the first BCI depends on the definition of a BCI. According to the definitions used by most authors, it is. However, the definition used in this thesis specifies that the control must be voluntary, and this is not true of Walter’s system.

to control a slide projector via electrodes implanted in the primary motor cortex. Subjects believed that they were advancing the slide projector by pressing a button, and were surprised when told that the button they pressed was not plugged into the projector. The system could also advance slides before the button press occurred. In this condition, subjects reported that they were just about to press a button when the slide carousel advanced (Walter, 1964). Thus, it is possible to build a BCI that can send messages or commands with less delay than any other interface, because all other interfaces depend on a motor signal triggering movement. While Walter's system allowed users to send one command4, readiness potential activity can reveal additional information such as which limb will be moved (Kalcher et al. 1996; Allison et al. 1999; Pineda et al. 2000).

4 Walter's system utilized two cognemes: "/I'm going to press the button/" or "/I'm not/"


Figure 2-4: Readiness potentials.

2.5.4 Event Related Potentials: Steady State Visual Evoked Potentials (SSVEPs)

It has long been established that any stimulus in the visual field that flickers at a specific frequency can cause neurons in visual areas to fire at the same frequency. These neural oscillations are called SSVEPs, also known as Steady State Visual Evoked Responses or SSVERs (Regan 1966; Creutzfeldt and Kundt 1967; Ciganek 1967). This effect is enhanced by attending to the flickering stimulus (Regan 1989; Morgan et al. 1996; Muller and Hillyard 1998a; Muller and Hillyard 1998b; Muller and Hillyard 2000). This suggests that users can indicate their interest in a specific stimulus by choosing to attend to or ignore it, thus providing the basis for a BCI. SSVEP BCIs are similar to P300 BCIs in that both allow a user to send information by voluntarily modulating attention, though SSVEP and P300 BCIs use different types of stimuli. SSVEP based BCIs are discussed in chapter 3, section 3.2.
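To make the idea concrete, the sketch below shows one simple way such a system could estimate which of two flickering stimuli a user is attending: compare spectral power at the candidate flicker frequencies and pick the larger. This is only an illustrative Python sketch; the sampling rate, frequencies, and function names are assumptions rather than a description of any system discussed in this dissertation.

import numpy as np
from scipy.signal import welch

def detect_attended_frequency(eeg, fs=256.0, candidate_freqs=(8.0, 13.0), halfwidth=0.5):
    """Guess which flickering stimulus is attended by comparing power spectral
    density near each candidate flicker frequency (all values are illustrative)."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))  # roughly 0.5 Hz resolution
    scores = [psd[(freqs >= f - halfwidth) & (freqs <= f + halfwidth)].mean()
              for f in candidate_freqs]
    return int(np.argmax(scores)), scores

# Synthetic check: 4 s of noise plus a 13 Hz component standing in for an SSVEP.
t = np.arange(0, 4, 1 / 256.0)
fake_eeg = 0.5 * np.sin(2 * np.pi * 13.0 * t) + np.random.randn(t.size)
choice, scores = detect_attended_frequency(fake_eeg)  # choice should be 1 (the 13 Hz stimulus)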

2.5.5. Free Running EEGs: Various Spectra and Alertness

No one can remain vigilant for very long. Ample research has confirmed that individuals presented with a monotonous task over an extended time period will become drowsy and make errors that are correlated with changes in EEG activity (e.g., Oken and Salinsky, 1992; Makeig and Inlow, 1993; Jung et al., 1997; Klimesch et al., 1998). Due largely to recent progress, it is becoming possible to identify the electrophysiological differences between an alert operator likely to perform well and the same operator when drowsy and prone to errors. Makeig and Jung (1995) asked subjects to perform a monotonous task requiring them to detect auditory and visual signals over five 30 minute periods. As expected, subjects performed very well at the beginning of each session, but soon became less attentive and made errors. The two graphs below show ten subjects' averaged EEG twenty seconds before and after each target. The vertical axis of both graphs shows the relative amplitude of the EEG spectra. The horizontal axes show the EEG spectra from 0-50 Hz and the time to or since each stimulus was presented. The left graph shows the EEGs associated with missed targets or attentional lapses, while the right graph shows the responses to correctly detected targets.

Figure 2-5: Frequency, time, and power for missed and hit targets at site Cz

By subtracting the left graph from the right one, the authors were able to create a similar graph showing the hit minus lapse difference. These results confirmed that, shortly before missed targets, subjects showed electrophysiological changes associated with drowsiness, including increases in alpha, delta, and theta activity. Furthermore, EEGs preceding missed targets show a relative decrease in beta activity, which is believed to reflect alertness. The authors also found that there were predictable changes in EEG activity following errors. In addition to the scientific value of this finding, it could prove valuable in incorporating an error correction system in a BCI. A more recent project found that EEG activity indicating errors could be recognized easily, which could "improve both the speed and accuracy of EEG based communication" (Schalk et al., 1998). A problem with analyzing alertness level based on EEG power spectra is that different subjects show changes in different spectra as they become less alert. This problem and a solution to it were explored in Jung (1997), in which the authors designed a system that could scan all relevant spectral frequencies for each subject and thus determine which frequency changes were correlated with alertness deficits for that subject.
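Although the studies above used more sophisticated analyses, the basic quantity involved (power in a spectral band, tracked over time) is simple to compute. The following sketch, with assumed band edges and a made-up ratio index, is meant only to illustrate how a BCI might monitor an operator's alertness from spectral power; it is not the method used in the work cited above.

import numpy as np
from scipy.signal import welch

BANDS = {"theta": (4.0, 8.0), "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}  # assumed edges

def band_powers(eeg, fs=256.0):
    """Mean power spectral density in each band for one window of single-channel EEG."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))
    return {name: psd[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}

def drowsiness_index(eeg, fs=256.0):
    """Crude alertness proxy: (theta + alpha) / beta, which should rise as slow
    activity increases and beta activity decreases in a drowsy operator."""
    p = band_powers(eeg, fs)
    return (p["theta"] + p["alpha"]) / max(p["beta"], 1e-12)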

2.5.6. Free running EEGs: Mu Rhythms and Movement

The mu rhythm, like the readiness potential described above, has been reliably found to fluctuate predictably as subjects consider or initiate movement. The mu rhythm consists of synchronized EEG activity of about 8-13 Hz over the sensorimotor cortex. It

is most pronounced when subjects are at rest and are not planning to initiate voluntary movement. At least a second before subjects initiate voluntary movement, the mu rhythm over the hemisphere contralateral to the body part being moved shows a decrease in amplitude and thus power. This attenuation becomes more symmetric over both hemispheres as subjects actually initiate the movement and remains until shortly after the movement is initiated. Mu activity returns to baseline levels within a second after movement is initiated and may briefly increase above baseline (Allison et al., 1998; Hughes et al., 1998). These activity dependent changes in mu have also been called Event Related Desynchronization (ERD) and Event Related Synchronization (ERS) by Pfurtscheller and his colleagues. While the mu rhythm has been studied primarily in right handed subjects, recent work has shown that left handed subjects show comparable but slightly different changes in mu activity preceding movement (Stancak and Pfurtscheller, 1996). What does this pattern of activity reflect? The regions of the cortex most responsible for motor activity are located just anterior to the central sulcus. This area includes the primary motor cortex (M1), supplementary motor area (SMA), and premotor area (PMA). The SMA and PMA are presumably responsible for contemplating and planning movement, while M1 is more concerned with actually initiating movement. Thus one interpretation of mu attenuation dynamics is that the first, asymmetric drop in mu power preceding voluntary movement reflects early planning of movement, while the subsequent bilateral drop in mu power reflects activity of M1 neurons necessary to trigger movement (e.g., Andres et al., 1998). In addition to simply detecting the consideration or initiation of movement, mu rhythm dynamics can also elucidate the strength and speed of movement. Two recent

studies (Stancak and Pfurtscheller, 1996a, 1996b) compared changes in mu activity surrounding "brisk versus slow" voluntary finger movements. More powerful finger movements, as reflected by higher amplitude EMG, produced a greater ERD in brisk movements, though this effect did not achieve significance for slow movements. The authors found that there was no other significant difference between these two conditions before movement, but that mu power increased more 1-2 seconds after brisk movements than slow movements. The mu rhythm thus has potential for BCIs for many reasons. It is present in nearly all adults, including many individuals with motor disabilities (Pfurtscheller 1989). It is easy to train in subjects while they are awake with eyes open (Kuhlman 1978a, Niedermeyer and Lopes da Silva 1987, Wolpaw et al. 1990). Since it can be affected by visual and imagined input (Pineda et al., 1998), it may be possible for users to learn to use a mu rhythm based BCI utilizing a variety of stimuli and cognitive strategies to affect mu activity. Being a FREEG measure, it does not suffer from the problems associated with ERP based BCIs. The pattern recognition required may be simple - detecting power changes alone could be (and has been) fruitful in BCI design. Finally, the mu rhythm can be modulated in either or both hemispheres (Wolpaw et al. 1994; Pineda et al., 2003).
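As a rough illustration of the ERD/ERS measures mentioned above, the sketch below computes the percentage change in mu-band power in a test window relative to a pre-movement baseline for one single-channel trial, following the usual convention that negative values indicate desynchronization. The filter settings, window boundaries, and sampling rate are placeholders, not the parameters used in the studies cited.

import numpy as np
from scipy.signal import butter, filtfilt

def mu_erd_percent(trial, fs=256.0, band=(8.0, 13.0), baseline=(0.0, 1.0), test=(3.0, 4.0)):
    """Percentage change in mu-band power in `test` relative to `baseline`
    (both given in seconds from trial onset); negative values mean a power drop (ERD)."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    power = filtfilt(b, a, np.asarray(trial, dtype=float)) ** 2

    def mean_power(window):
        start, stop = int(window[0] * fs), int(window[1] * fs)
        return power[start:stop].mean()

    reference = mean_power(baseline)
    return 100.0 * (mean_power(test) - reference) / reference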

CHAPTER 3: CURRENT STATUS OF BCI RESEARCH

3.1. What is a BCI?

"Scientists would rather use someone else's toothbrush than her terminology." - Helen Neville (1993)

There remains considerable disagreement in the BCI community regarding the exact definition of a BCI or even the acronym itself. This dissertation uses a novel definition consistent with the intent conveyed by most other authors' definitions5, and very similar to that provided in a recent review article (Wolpaw et al., 2002). A brain computer interface (BCI) is a realtime communication system designed to allow a user to voluntarily send messages or commands without sending them through the brain's natural output pathways. All BCIs have at least five components6: signal acquisition, feature extraction, translation algorithm, mapping to output device, and an operating protocol that may include software to present stimuli and/or feedback. The question of whether the acronym "BCI" should refer to a bidirectional interface or a one-way device (a brain-to-computer interface) is a difficult one. Arguing for the former definition, an "interface" is defined as a bidirectional system.

5 For examples of terms and definitions used to describe BCIs, please see appendix 2.
6 From Wolpaw et al. 2002; similar componentry is described in numerous articles.
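The five components listed in the definition above can be pictured as a simple processing loop. The sketch below is purely schematic: the class and parameter names are invented for illustration and do not correspond to any particular system described in this dissertation.

from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class SketchBCI:
    """Schematic of the five components named above; each callable is a placeholder."""
    acquire: Callable[[], Sequence[float]]                   # 1. signal acquisition
    extract: Callable[[Sequence[float]], Sequence[float]]    # 2. feature extraction
    translate: Callable[[Sequence[float]], str]              # 3. translation algorithm
    send_output: Callable[[str], None]                       # 4. mapping to an output device
    present_stimuli: Callable[[], None]                      # 5. operating protocol (stimuli/feedback)

    def step(self) -> None:
        """One pass through the loop: present stimuli, read EEG, emit a command."""
        self.present_stimuli()
        raw = self.acquire()
        features = self.extract(raw)
        command = self.translate(features)
        self.send_output(command)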


However, the latter view of BCIs is almost universally shared in the literature, while the term "CBI" has been used to refer to the complementary computer-to-brain interface (Konig and Verschure, 2002). The use of these two terms (BCI and CBI) to refer to devices that read and write to the brain, respectively, allows for an easy distinction between the two types of systems. Further, despite the dictionary definition of "interface," most interfaces in use today are in fact one-way devices that only read information from a user; examples include a keyboard, keypad, mouse, or voice recognition system. It seems premature at this time to challenge the conventional view of BCIs as one-way interfaces. The term may well change as more CBIs are developed, and a new term will have to be coined to describe a truly bidirectional interface. The phrase "messages or commands," whether from or to the brain, must be interpreted liberally. No BCI has yet been developed in which a message can be sent directly; it is not possible to think "hello" and have that word appear on the screen. Instead, users send messages by imagining changes in movement, redirecting their attention, or performing specific mental or cognitive tasks that the BCI interprets as a particular message. It is likely that a "literal" BCI that can directly interpret brain activity associated with specific messages or commands will eventually be developed. Such a BCI might be more intuitive and easy to use than existing "interpretive" BCIs. The literal/interpretive distinction is not currently used in the literature. A much more common approach has been to discriminate BCIs based on the category of neural activity involved in sending information (e.g., P300 BCI, mu BCI, SCP BCI). BCIs are also often categorized by whether they rely on spontaneous rhythms such as the mu (called "spontaneous BCIs") or ERPs such as the P300 (called "evoked BCIs"). While an

evoked BCI, such as a P300 BCI, could potentially utilize both ERPs and the free running EEG, none has yet been developed7. The spontaneous vs. evoked distinction is somewhat misleading. Some BCIs that use EEG rhythms conventionally regarded as spontaneous, such as the Wadsworth BCIs discussed below, nonetheless require the user to generate this activity at specific times in response to external prompts, and thus from the user's perspective are not spontaneous. It may be more useful to classify BCIs as asynchronous (or non-cue-based) and synchronous (or cue-based), depending on whether the user can send messages or commands freely or must pace himself according to prompts (Bischof and Pfurtscheller, 2003). Asynchronous BCIs have the advantage of allowing the user more flexibility, but often require substantial training and cannot utilize ERPs such as the P300, limiting their potential effectiveness. Another informative distinction has been between dependent and independent BCIs. "A dependent BCI does not use the brain's normal output pathways to carry the message, but activity in these pathways is needed to generate the brain activity (e.g. EEG) that does carry it…. In contrast, an independent BCI does not depend in any way on the brain's normal output pathways" (Wolpaw et al., 2002). The substantial majority of BCIs are of the independent type. Dependent BCIs may not be helpful to severely disabled individuals. Finally, there is no widely accepted protocol for naming and describing BCIs. Many are referred to by the laboratories that developed them (e.g., Wadsworth BCI, Graz BCI, Oxford Putney BCI) or by names given by those laboratories (e.g., Thought

7 For example, a P300 BCI might also look at alpha and beta activity.

Translation Device from the Birbaumer group, Direct Brain Interface from the Levine group, virtual stoplight BCI from Bayliss and Ballard's work). These generally refer not to specific BCIs but to whichever BCI is currently used by that laboratory. For example, the Wadsworth BCI once referred to the BCI described in Wolpaw and McFarland (1991), and later the system described in Wolpaw and McFarland (1994); it now refers to the BCI presently in use by the Wadsworth lab. Specific BCIs are usually referenced by the authors describing them (e.g., Farwell and Donchin BCI). If no name for a BCI is given in the literature, it is named here according to the author(s) most associated with its development and the year it was first described (e.g., Chapin, 1999).

3.2. Purpose and Relative Merits of BCIs

What advantages do BCIs have over conventional interfaces, which can currently provide much faster and more reliable information throughput without requiring the inconvenience of sensors on the head? To date, the overwhelming majority of BCI work has been directed toward developing assistive/augmentive communication devices for severely disabled individuals who are unable to use the brain’s normal output pathways to operate conventional interfaces. However, there are exceptions; BCI research by Calhoun, MacMillan, and Middendorf, for example, is targeted to Air Force pilots for control of an aircraft or to send other commands. Pineda et al. (2003) describes a BCI system that could be used to control a computer game. Commercial firms such as IBVA

and Cyberlink have developed BCIs for games and other entertainment applications such as creating music or art. A BCI need not necessarily replace a conventional interface. It may be used to provide a supplementary means of communication while the user takes advantage of other interfaces. This raises the question of whether the overall information throughput when using a BCI combined with conventional interfaces would in fact be higher than without the BCI. Presumably, using the BCI creates some distraction and thus hampers the use of a conventional system. This trade-off has not been explored in the literature. While some types of BCIs are exhausting, and new users find that BCIs occupy all of their attention, many experienced BCI users, or "neuronauts," have reported that using some BCIs requires little cognitive effort. Hence, it is possible that a well-defined task and interface could allow an experienced user to seamlessly operate a keyboard and a BCI. Healthy individuals may wish to use a BCI if their hands and voice are busy with other tasks. For example, an airline mechanic or soldier may be unable to convey a message with hands or voice since the hands may be busy and there may be excessive background noise.

Likewise, for a soldier in hostile territory, the situation may necessitate stillness and silence. At present, all conventional interfaces require overt motor activity, which can be detected by others. Hence, another potential advantage of BCIs is in terms of stealth or privacy. In situations in which secrecy is paramount, a BCI enables a user to encode a message without the possibility of eavesdropping. Instructions conveyed via a keyboard or mouse can be read simply by watching the necessary motor activity. Spoken words are easy to overhear; they also create noise, which interferes with neighbors' ability to use speech recognition systems. A BCI is ultimately the most natural and intuitive interface possible. A keyboard or mouse is a very nonintuitive way of conveying a message; there is nothing humans do in nature that is similar to typing on a keyboard. In contrast, there is no easier way to say "hello" or "please call the nurse" than to simply think a word or phrase with the intention of sending it. This effectively bypasses the algorithmic stage of goal planning described by Marr (Marr, 1980).

A literal BCI such as this has not yet been developed, and is not possible using the EEG, given the limitations on EEG recording presented in the chapter on the neural concomitants of BCIs. As neuroimaging and processing approaches are developed and refined, literal BCIs may become more feasible. While it seems that a literal BCI would offer substantial advantages over any "interpretive" interface, this may not be the case, as discussed later in this chapter. The hardware necessary to instantiate a BCI is continually shrinking, and BCIs may soon be a very practical and portable means of sending information. It is possible to design a BCI today that utilizes only a headband for both collection and processing; similarly, the necessary equipment could easily be incorporated in other products worn on the head such as glasses, headphones, a microphone, eye tracker, cap, hat, or helmet. This would enable a user to send information anytime, anywhere, without requiring additional, bulky equipment. However, this is only a theoretical advantage, as existing BCI systems are indeed bulkier and heavier than most other interfaces. BCIs can provide information that simply cannot be conveyed as clearly through any other interface, such as cognitive intent, alertness, emotion, etc. For example,

consider a futuristic BCI in which a user can imagine a new song, picture, or animation and have those sounds or images recorded for later presentation. Given the user's artistic skills, it may be difficult or impossible to translate this creative imagery to an outside medium, or doing so may take so long that some of the imagery is lost. Such BCIs don't currently exist, though the theoretical groundwork has been demonstrated8. Another example of a possible BCI that utilizes information unavailable through conventional interfaces would be an emotive BCI, which might present a different operating environment or different vocabulary based on the user's emotional state. The idea of "emotive computing" is not new, but the idea of gauging emotion via direct monitoring of brain activity is. Other means of artificially gauging emotion, such as those based on eye or facial activity, are not as accurate and rich as a BCI could be. While both of these examples are only theoretical, the BCI-like systems presented later in this chapter describe some systems currently in use that obtain information directly from the EEG in realtime that would be unavailable with any other interface. For some applications, a BCI may be faster than conventional interfaces. For example, it is possible to determine a user's intent to move from the EEG before that information is actually sent to the spinal cord (e.g., Allison et al. 1999, Pineda et al. 2000), and it may take 80-100 ms for a signal from the brain to produce motor activity (Kandel et al., 1992). This advantage is not as farfetched as it may first seem. Some of the implanted BCI systems described in this chapter were able to determine an animal's intent to move and trigger a response before the animal exhibited any physical movement. Such a

8 For example, Kadner et al. (2001) describes a means of estimating a perceived tone from the EEG; Desain (2002) describes a system which can distinguish between imagery of different musical works above chance.

system would be faster than any other interface, which must wait until a movement becomes overt. The information about movement planning may also be useful for error detection. Finally, individuals may choose to use a BCI despite the availability of superior interfaces simply because a BCI is a new, exotic, and exciting technology. Certain demographic groups, such as electronic gamers, are eager adopters of innovative electronic toys. How readily different consumer groups (i.e., patients, gamers, pilots, military) adopt BCIs depends not only on their cost and effectiveness, but also on the perception of their usefulness, convenience, appeal, and "coolness." Despite the many potential advantages of BCIs, their widespread use is not imminent. The bottom line remains that many of these advantages are only theoretical, while limitations remain very real. State of the art BCIs are slow, inaccurate, expensive, and uncomfortable. They are not supported by conventional software, and often require sensor technology that is inconvenient to use. The pace of research is increasing, and rapid progress is being made in addressing these limitations, so cautious optimism is appropriate. In the next few years, BCIs will become increasingly useful both to the severely disabled and to less disabled or even some healthy individuals in specific situations.

3.3. Prototypical BCIs


Wolpaw et al. (2002) divided BCIs into five categories based on the type of brain signal utilized and the area of the brain in which it was recorded: P300 evoked potentials, visual evoked potentials, slow cortical potentials, mu and beta rhythms and other activity from sensorimotor cortex, and cortical neuronal activity. That categorization is used here as well, with the fourth shortened to "mu BCIs." In addition, the fourth category is further divided into systems based on movement related imagery and "mental task" BCIs, in which the user imagines performing one of two or more discrete mental tasks such as typing, singing, or arithmetic. All of the systems described in this section are BCIs or are offline systems that are included because of their relevance to BCIs. P300 BCIs are discussed first and most extensively.

3.3.1 P300 BCIs

3.3.1.1. Farwell and Donchin (1988).

The first published report of a BCI based on P300 measures was Farwell and Donchin (1988). This paper described a system that allowed users to spell out words that were then sent to a voice synthesizer. The authors sought to explore two questions:

1) Could the P300 be used as a means of conveying information via a BCI? and 2) What would be the operating characteristics of such a system?

The second question was explored by manipulating three variables:

1) SOA9
2) The pattern recognition approach, and
3) Number of trials averaged before attempting to distinguish between trials that contained a P300 and those which did not

Four subjects were prepared for recording by placing an electrode over Pz, referred to linked mastoids. The EOG was recorded by two electrodes placed above and below the right eye (supra- and suborbital). Subjects were then shown a 6 x 6 grid containing the 26 letters of the alphabet and additional characters such as space or backspace, as seen in Fig. 3-1:

9 Throughout their paper, the term "ISI" is used, though "SOA" (Stimulus Onset Asynchrony) is the correct term. P. 513 defines ISI as "the time from the onset of the flash of one row or column to the onset of the flash of the next row or column." This defines SOA, not ISI. Thus, the term "SOA" is used here where "ISI" is used in their paper. Donchin's later paper describing a P300 BCI, Donchin et al. (2000), correctly uses the term "SOA" instead of "ISI."


Figure 3-1: The display used in Farwell and Donchin (1988)

They were asked to focus on a particular character, labeled the target character, and count the number of times it was intensified ("flashed") while ignoring other flashes. Next, one of the rows was flashed for 100 ms, followed by another row flash after a brief SOA, and so on until all six rows had been flashed. In order to increase uncertainty about the next flash, thereby leading to a larger P300, rows were flashed randomly rather than in sequence. The six columns were then flashed in a similar manner. A trial consisted of either six row or six column flashes. Subjects did not choose which letters to designate as the target; rather, they were asked to focus on the letters of the word "BRAIN." They would first count flashes of the letter "B" until prompted to move to the next letter, then count flashes of the "R" and so on. During most of the first session and all of the second, realtime feedback was not provided. Subjects continued counting each letter until "Choose one letter or command"

appeared on the display. An online system rejected any trials with excess artifact and triggered additional flashes to ensure that the number of recorded trials was always 30. Fig. 3-2 below shows the averaged responses to attended and unattended stimuli for two different SOAs.

Figure 3-2: Grand averaged responses to attended (target) and unattended (nontarget) flashes from Farwell and Donchin (1988)

As can be observed, all subjects exhibited a noteworthy difference between attended and unattended cells at both speeds. While it is encouraging that attended and unattended trials can be distinguished by the human eye in heavily averaged data such as these, a practical BCI must be able to make this discrimination based on far fewer trials using an artificial system. The authors compared the effectiveness of four different algorithms to determine whether each flash (which they refer to as a "subtrial") contained a P300. The algorithms were applied to the period between 0-600 ms following each flash onset. This resulted in a score that measured the size of the P300 evoked by each flash.

1) Stepwise Discriminant Analysis (SWDA): Using data recorded from the first two letters ("B" and "R"), the authors derived a template created by averaging together several trials including a P300. This training set was also used to create a discriminant function that measures the distance between each epoch and this template. The function provided a different weight for each data point, providing more weight to time points that were most critical. This function was then applied to the remaining data (the "analysis set"), providing a numeric value indicating each flash's distance from the template.
2) Peak picking: The "P300 window" was defined as the time range within which the average attended waveform in the training set for each subject was positive. P300 amplitude was defined as the difference between the lowest negative point preceding this P300 window and the highest positive point in this window.
3) Area: The area was the sum of all data points in the P300 window.
4) Covariance: The attended subtrials in the training set were averaged to create a P300 template. The covariance of each subtrial in the analysis set with this template was used to compute the score.
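The two simpler scoring rules above (peak picking and area), together with the row and column summation described in the next paragraph, can be written in a few lines. The sketch below assumes epochs are one-dimensional arrays and that the P300 window is given as sample indices; it is an illustration of the logic, not the authors' original code.

import numpy as np

def peak_pick_score(epoch, window):
    """Peak picking: highest point inside the P300 window minus the lowest point before it."""
    start, stop = window
    return epoch[start:stop].max() - epoch[:start].min()

def area_score(epoch, window):
    """Area: sum of all samples inside the P300 window."""
    start, stop = window
    return epoch[start:stop].sum()

def choose_cell(row_scores, col_scores):
    """Each cell's score is the sum of its row and column flash scores (see the next
    paragraph); the row and column indices of the highest-scoring cell are returned."""
    cells = np.add.outer(np.asarray(row_scores), np.asarray(col_scores))
    return np.unravel_index(np.argmax(cells), cells.shape)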

A score for each letter was computed by adding the scores of each row and column flash containing that letter. Thus, each of the 36 letters had a score indicating its probability of containing a P300. The scores for each letter were summed across a variable number of trials. If the cell earning the highest score contained the target letter, the trial was considered a "hit." A bootstrapping procedure was used to estimate the performance of each approach. First, 1000 sets of between 2-40 trials from each subject's analysis set were randomly chosen. The number of trials averaged together was systematically varied to explore the speed-accuracy tradeoff. Next, the four preprocessing algorithms were applied, and the number of hits attained by each was recorded. This made it possible to explore the performance of each algorithm for each subject at different SOAs and with different numbers of averaged trials. While no single algorithm was best for all subjects, SWDA was best for 3 of 4 subjects at the 125 ms SOA, and peak picking was best for 3 subjects at the 500 ms SOA. It was reasoned that SWDA, like covariance, is very sensitive to latency jitter, while peak picking is likely to suffer from false positives at faster SOAs, when P300s produced by several flashes may be present in the 600 ms analysis window. The authors noted that their system faced a tradeoff between four variables: SOA, number of trials averaged, scoring algorithm, and accuracy. Messages can be conveyed more quickly by reducing the SOA or averaging fewer trials, but this reduces accuracy. The SOA and number of trials were merged into a single variable, time, and this was plotted against accuracy for the four approaches as seen in Fig. 3-3. It is clear that no single approach is best in all circumstances:


Figure 3-3: Figure and legend from Farwell and Donchin (1988)

Hence, an ideal P300 BCI should be capable of allowing a dynamic SOA and utilizing a variety of approaches to allow the user to best achieve the desired accuracy in minimum time. BCIs should also account for factors other than performance, such as user comfort.

It is noteworthy that, while Donchin and Farwell were very concerned with minimizing the number of trials required, they were decidedly skeptical about single trial recognition.

The detection of the P300 clearly requires the application of signal averaging, which depends, of course, on the presentation of many stimuli. The effectiveness of this procedure as a communication channel depends on the degree to which the message can be communicated with a small number of trials using an efficient, cost-effective, on-line detector of the P300. It is necessary, therefore, to determine the smallest number of trials in which the system can make the detections at different levels of accuracy (p.513).

In fact, subsequent papers regarding P300 BCIs successfully achieved single trial classification. The authors did propose that such a system could be improved through a variety of means, including:

1) using a menu based system, in which elements in the matrix may call up another menu;
2) using words instead of letters as matrix elements (they noted that this produces a tradeoff between a much more limited vocabulary and much faster communication);
3) restricting the options presented based on the user's recent choices (for example, if a user spells "TH," the next message sent is very likely a vowel or backspace); and
4) incorporating additional components of the ERP, such as the CNV.

Interestingly, while these are all good suggestions, only the last one has been explored in a P300 system. Both "brainwave fingerprinting" (e.g., Farwell and Smith 2001) and this dissertation utilize non-P300 components.

3.3.1.2. Polikoff et al. (1995).

This study laid the groundwork for a BCI that would allow subjects to move a cursor around the screen using an oddball paradigm, the classic method of eliciting a P300 response. Three subjects were asked to fixate on a central fixation cross. Surrounding the central cross were four other crosses indicating compass positions (N, E, S, W). Subjects were asked to attend to one of the four compass directions and count the number of times it was briefly replaced with an asterisk. Each of these "asterisk flashes" lasted 250 ms, and the ISI was either 750 or 1000 ms. A session consisted of fifty complete sets. Two different stimulus sets were used. In the first, only the four compass directions were flashed, and thus the target probability was 0.25. In the second, a "null stimulus" was introduced in which no asterisk appeared, reducing target probability to 0.20. The authors utilized a simple online mechanism for rejecting eyeblinks. If the EOG signal exceeded a threshold, it was assumed that the subject blinked, and the resulting data would be corrupt. The element was flashed again, and a set was not considered complete until all four or five elements had been presented without blinks. Only two subjects participated in the 0.20 probability condition. EEG data were recorded from Fz, Cz, and Pz. EOG was recorded inferior and lateral to the right eye. The flash that produced the largest amplitude between 300 and 600 ms after flash onset was considered the target flash. The system performed above chance with single trial data. Performance was better in the lower probability condition (mean 54.02% correct for .2 probable targets; mean 40.85% for .25 probable targets). The authors assessed the impact of averaging by comparing single trial performance with averages of two or three trials. They found that, while averaging did improve performance, the improvement was not worth the extra time required. Performance with averages of three trials improved to only 60%. The authors expressed their intention of developing a realtime version of this system, in which the central fixation cross would move based on input from the user. However, such a study has not been reported. This is regrettable, as this would have been the first P300 BCI to utilize a dynamic display based on user input; such a system has still not been developed. Polikoff et al. (1995) was the first to explore the effects of manipulating target probability in the context of a P300 BCI, although only two probabilities were explored in only two subjects, and probability only differed by .05 (Polikoff et al. 1995, Polikoff 2002).
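The decision rule described above is simple enough to state directly: among the single-trial epochs evoked by the four (or five) flashes, pick the one with the largest peak between 300 and 600 ms. The sampling rate and array layout in the sketch below are assumptions for illustration, not details of Polikoff's implementation.

import numpy as np

def pick_target_flash(epochs, fs=250.0, window=(0.3, 0.6)):
    """epochs: array of shape (n_flashes, n_samples), each row one single-trial epoch
    time-locked to a flash. Returns the index of the flash with the largest peak
    amplitude in the 300-600 ms window."""
    start, stop = int(window[0] * fs), int(window[1] * fs)
    peaks = np.asarray(epochs)[:, start:stop].max(axis=1)
    return int(np.argmax(peaks))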

3.3.1.3. Donchin et al. (2000)10.

Donchin and colleagues explored the use of newer preprocessing and classification approaches in a BCI otherwise similar to the original P300 BCI described in Farwell and Donchin (1988). By applying the discriminant function to reconstructed

10 This work was described in Spencer et al. (1998), Spencer et al. (1999), and Donchin et al. (2000). As Donchin et al. (2000) is the most informative of these references, and is the only one to undergo peer review, it is considered the "main" reference for Donchin's second P300 BCI project.

ERPs associated with each of the individual cells rather than each row and column flash and incorporating an improved SWDA algorithm and discrete wavelet transform, the authors improved the offline performance of their system. However, the online version of the system performed relatively poorly. The study consisted of two conditions: offline and online. Ten healthy subjects and four disabled subjects (three of whom had complete paraplegia) participated in the offline component of the study. They viewed a display similar to that in Farwell and Donchin (1988), shown in Fig. 3-4:

Figure 3-4: The display used in Donchin et al. (2000)

Subjects were asked to attend to a target letter (in this case, the letter “P”) and count the number of times it was intensified (“flashed”) while ignoring flashes that did not contain the target. Flashes lasted for 100 ms, with an SOA of 125 ms; unlike their


previous study, only one SOA was used. Another difference was that rows or columns were flashed interchangeably, rather than first flashing all rows and then all columns. Thus, a "trial" was redefined as the time necessary to flash all 12 rows and columns. Subjects participated in blocks of 15 trials each. Data were recorded from Fz, Cz, Pz, O1, and O2, referenced to the left mastoid and re-referenced offline to linked mastoids. After the data were recorded in response to each flash, the ERPs evoked by each row and each column flash were averaged together, creating 36 ERP epochs for each trial. For example, to create an ERP epoch for the "P," the responses to individual flashes of the third column and fourth row were averaged together. To explore the relationship between accuracy and the number of trials required, a variable number of trials (between 2 and 40) were averaged together before being passed to the preprocessing algorithm. Two preprocessing algorithms were explored. In the first, the single-trial cell epochs from 0-600 ms after stimulus onset were filtered at 0-8 Hz and resampled at 50 Hz, then a SWDA approach similar to that described in Farwell and Donchin (1988) was applied. In the second, epochs from 0-640 ms after stimulus onset were filtered from 0-50 Hz and resampled at 50 Hz, and a discrete wavelet transform (DWT) was applied to the data before SWDA. The system's accuracy was then estimated using the same bootstrap approach as in Farwell and Donchin (1988). The two graphs and table shown in Fig. 3-5 illustrate the relationship between each preprocessing approach and accuracy for different numbers of trials for both healthy and disabled subjects:


76

Figure 3-5: Accuracy and bitrate for able bodied and disabled subjects with and without the DWT.
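For readers unfamiliar with this kind of pipeline, the first preprocessing approach described above (low-pass filtering the 0-600 ms epoch at 8 Hz and resampling to 50 Hz before the discriminant analysis) might look roughly like the sketch below. The original sampling rate and filter order are assumptions; this is not the authors' implementation.

import numpy as np
from scipy.signal import butter, filtfilt, resample

def preprocess_epoch(epoch, fs=200.0, lowpass_hz=8.0, target_fs=50.0):
    """Low-pass filter a single-cell epoch at about 8 Hz and resample it to 50 Hz;
    the result would be handed to a discriminant function (SWDA or similar)."""
    b, a = butter(4, lowpass_hz / (fs / 2), btype="low")
    smoothed = filtfilt(b, a, np.asarray(epoch, dtype=float))
    n_out = int(round(len(smoothed) * target_fs / fs))
    return resample(smoothed, n_out)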

Results indicated that this BCI performed faster than Farwell and Donchin (1988), and worked with disabled subjects. The DWT provided a moderate performance improvement. The authors present six reasons for the improvement:

1) In the new system, the discriminant analysis was applied to each individual cell, rather than row and column ERPs
2) Both systems used commercially available SWDA packages; the authors claim that the second system used a newer and better SWDA approach
3) The DWT was a useful preprocessing algorithm
4) Higher quality displays
5) Improved digitization hardware and software
6) Improved amplifiers

A seventh possible reason is that each flash could be either a row or column flash. In the first study, all rows were flashed, and then all columns were flashed. The newer approach increases uncertainty about which flash will come next, which may produce a slight increase in P300 amplitude. However, the system did not perform as well online. Five of the healthy subjects completed a second condition using an online version of the system. They were allowed to choose five different letters to designate as the target. The number of trials presented was equal to the number required to attain 90% accuracy for that subject, as determined by offline analysis of the results from the first condition. The online system only attained 56% accuracy – well above chance, but substantially below offline performance. There is no apparent reason for this performance difference, and the authors offered no explanation. Donchin et al. (2000) suggests possible avenues for improving a P300 BCI system. In addition to some of the ones mentioned previously in Farwell and Donchin (1988), they also suggested the use of a spell checker. This is an excellent point, since a P300 BCI used to spell may be especially robust to error. Spell checking systems are effective, and humans are very good at interpreting text that contains some errors.

As with Farwell and Donchin (1988), the authors recognize the importance of minimizing the number of trials required, but consider single trial recognition unfeasible. They argue that "… it is virtually impossible to visualize, or even detect numerically, the presence of an ERP in the epoch following a single event. The ERP is substantially smaller than the ongoing EEG activity; hence, detecting ERP's requires a method that extracts the ERP signal from the EEG 'noise.' … Much of the effort of developing the BCI consists of determining the smallest number of trials that must be averaged to ensure reliable detection." Again, P300 BCIs that operate on a single trial basis have in fact been developed; Polikoff (1995) showed this a few years prior to this study. While this later Donchin study did show that an offline version of the system could be used with disabled patients, there have not yet been any published reports confirming that an online P300 BCI would work with individuals with no motor control at all. Donchin has very recently begun work toward this (Donchin, 2002a). An improved version of his system using BCI2000 is being developed with collaborators at the Wadsworth Center in New York. Preliminary results with an offline version of his system from a single late stage ALS patient with only horizontal eye control were positive (Donchin, 2002b).

3.3.1.4. Bayliss (2001)11.

11 The data reported in Bayliss' thesis have also been presented in Bayliss and Ballard (1998), Bayliss and Ballard (1999), Bayliss and Ballard (2000a), Bayliss and Ballard (2000b), Bayliss and Ballard (2000c), and Bayliss and Ballard (2001).

The doctoral dissertation presented by Jessica Bayliss in 2001 describes two studies utilizing a P300 BCI. Like previous studies, Bayliss was very concerned about minimizing the number of trials required for accurate performance. Unlike previous work, Bayliss explored the importance of environment, with an emphasis on showing that a P300 BCI could be a flexible system capable of operating in very noisy environments with a more natural display and task parameters than a virtual keyboard used for spelling. In the first study, called the "virtual stoplight" experiment, five subjects sat in a modified go-cart wearing a head mounted display (HMD). The author had determined in a previous study (Bayliss and Ballard, 1998) that the electrical noise generated at scalp electrodes by an HMD and eye tracker was comparable to that created by the monitor of a laptop computer. A virtual town was shown on the HMD, and subjects drove through the town using the controls of the go-cart. They sometimes encountered traffic lights, which could be green, yellow, or red. They were required to stop at red lights only, making red lights infrequent and task relevant. Since subjects controlled their movement in the virtual town and quickly learned where stoplights were located, they had some control over when they would encounter a traffic light and thus when stimuli were presented. Data were recorded from 100 ms prior to 1 second after each light change. EEG data were recorded from Fz, Cz, CPz, Pz, P3, and P4 sites referred to linked mastoids. Lower and upper vertical EOG was also recorded. The averages illustrated in Fig. 3-6 show that a P300 was indeed evoked by red stop lights:


Figure 3-6: Averaged ERPs evoked by red, green, and yellow lights in the virtual stoplight study. From Bayliss, 2001

It appears that green lights also evoked a P300, perhaps because a green light is also task relevant (indicating that a driver can go forward). The P300 evoked by green lights appears to have a smaller peak and includes a more robust P2 and N2 component. Bayliss explored the effectiveness of four different signal processing algorithms to determine which would best discriminate between red light P300s and other signals. Three preprocessing approaches were applied to each subject's data from whichever site produced a maximal P300: ICA, a Kalman filter, and a robust Kalman filter. The data were classified using correlation with each of the preprocessing methods and correlation without preprocessing, resulting in four different approaches. The table below shows the offline performance of each of these approaches for each of the subjects' responses to red and yellow lights. ICA and both Kalman filters performed better than simple correlation, and the Kalman filters were slightly better than ICA.

Table 3-1: Performance of different classification approaches for each of five subjects in the offline version of the virtual stoplight study. From Bayliss, 2001.

Two of the subjects returned to participate in an online version of the study. The robust Kalman filter trained in the offline version was used to categorize responses. If a P300 was detected in response to a red light, a signal activated the brakes of the go-cart. As can be seen in the table below, performance was comparable to that seen in the offline system:

Table 3-2: Classification accuracy for two subjects using the virtual stoplight BCI. From Bayliss, 2001.

Nine subjects participated in another study described in Bayliss (2001), called the “virtual apartment” study. Electrodes were placed over Fz, Cz, Pz, P3, P4, and above and below the eye, referenced to linked mastoids. Subjects sat in a chair and viewed a virtual apartment. This apartment contained four items. Text appeared at the bottom of the

screen designating one of the items as the target. Next, a red bubble would flash near each item at the rate of one per second. The subject counted the number of times the target flashed, until the pattern recognition system had enough information to determine which of the items was the target, or 50 flashes were presented without successful categorization. Visual feedback was then provided. The items which could be controlled and the type of feedback given if the item were selected were:

1) A light - the room would become either lighter or darker, as if the light were toggled on or off.
2) A television - a picture would either appear or disappear.
3) A stereo - musical notes would appear or disappear above it.
4) A person - the word "BYE" would appear near him, and attending to the red dot near "BYE" would make him disappear. If absent, the word "HI" would appear, and selecting it would make the person appear.

After the goal was attained or 50 trials were presented without success, a new goal was chosen and the process repeated until 250 trials were presented. Subjects were first familiarized with the task by counting the number of times the light flashed. They then participated in three conditions presented in random order:

1) VR condition: Subjects wore an HMD in which the screen moved as the subject’s head moved, as typically occurs in HMDs.

2) MONITOR condition: The virtual apartment was presented on a monitor.
3) FIXED DISPLAY: Subjects wore an HMD, but the screen did not move as the subject's head moved.

The data obtained at Pz after each flash were analyzed online. Bayliss utilized a variable averaging approach. Once a single trial was presented, the system determined its correlation with P300 and non-P300 templates. If the system judged that it could confidently determine whether a P300 was present, it triggered a response. If not, a second trial was presented. Fig. 3-7 shows the grand averaged ERPs across all subjects in the VR condition over site Pz:

Figure 3-7: Grand averaged ERPs for all subjects in the VR condition
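The variable averaging scheme described above can be sketched as a loop that adds one single-trial epoch at a time to a running average, correlates the average with P300 and non-P300 templates, and stops as soon as one correlation clearly wins. The decision margin and return values below are assumptions for illustration, not Bayliss' actual classifier.

import numpy as np

def variable_average_decision(trials, p300_template, nonp300_template, margin=0.2):
    """Add one single-trial epoch at a time to a running average; stop as soon as the
    correlation with one template exceeds the other by `margin` (an assumed threshold).
    Returns ("p300" or "nonp300", number of trials used), or (None, len(trials)) if undecided."""
    running = np.zeros_like(np.asarray(p300_template, dtype=float))
    for n, trial in enumerate(trials, start=1):
        running += (np.asarray(trial, dtype=float) - running) / n   # incremental mean
        r_yes = np.corrcoef(running, p300_template)[0, 1]
        r_no = np.corrcoef(running, nonp300_template)[0, 1]
        if abs(r_yes - r_no) >= margin:
            return ("p300" if r_yes > r_no else "nonp300"), n
    return None, len(trials)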

No significant difference in ERP measures was found across the different conditions. There were only slight performance differences between the three conditions, with the best performance seen in the MONITOR condition and the worst performance in the FIXED DISPLAY condition. Most subjects stated that they liked the VR environment best, and all reported that their least favorite condition was the FIXED DISPLAY. Bayliss also explored changes in performance from the beginning to the end of the study. Six subjects showed some improvement over time, while the other three did not improve or performed worse at the end of the study. This is a very intriguing observation, since the P300 might be expected to habituate over time and thus yield worse performance. Bayliss noted that the performance improvement may simply be because subjects became more relaxed throughout the study and produced less muscle artifact. In addition, subjects in her study did not use a P300 BCI over an extended time period. Thus, the very important question of how a user's ERPs change over time in the context of a P300 BCI needs to be further explored. Bayliss' dissertation represents a significant contribution to the field. Her consideration of three factors (ERP measures, performance, and subjective report) is an improvement over prior P300 studies that did not consider subjects' preferences. The study shows that P300 BCIs can indeed be implemented with a variety of different displays and tasks in noisy environments, and the observation that different display setups (VR, MONITOR, or FIXED DISPLAY) have little effect is important for future designers concerned about the ideal display environment. The comparison of different preprocessing and pattern recognition approaches is informative and will hopefully inspire further exploration of this important topic. Finally, the clear demonstration that single trial recognition is possible in different environments is extremely encouraging.

3.3.1.5. Hilit (2003).

The last study utilizing a P300 BCI was reported by Hilit (2003). ERPs were recorded from three scalp sites in six subjects. Subjects viewed a 6 x 6 display similar to that used in Farwell and Donchin (1988). Rows and columns were individually flashed, and subjects were asked to count the number of times a target flashed while ignoring other events. Subjects participated in offline and then online conditions. Data recorded from the offline condition were categorized using two approaches. In the first, the SNR is improved by passing the data through a matched filter. The filter outputs features that are then sent to a maximum likelihood classifier that compares the features with known instances that contain and do not contain a P300. The second approach was identical to the first, except that ICA was applied to the data before the matched filter. Single trial recognition was rarely attained. The first approach achieved a mean rate of 4.2 signals/min at 90% accuracy, equal to 17.6 bits/minute. The second achieved a mean rate of 5.45 signals/min at 92.1% accuracy, or 23.75 bits/min. The same six subjects then participated in an online version of the study. Data were processed in realtime using the second approach. Hilit also used a variable averaging approach; if the system did not have enough information to make a decision, more trials were presented until a decision was possible. The online version achieved an overall rate of 4.5 signals/min at 79.5% accuracy, or 15.43 bits/min. Hilit's work did not use a novel display or pattern recognition approach, though it used a novel combination of them. It is the first BCI to use ICA and a variable averaging approach with a "virtual keyboard" display.
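The bits-per-minute figures quoted above appear to follow the information transfer rate formula popularized by Wolpaw and colleagues, which combines the number of possible selections and the accuracy of each selection. Under that assumption, the short snippet below reproduces the offline numbers reported here.

from math import log2

def bits_per_selection(n_choices, accuracy):
    """Wolpaw-style information per selection:
    log2(N) + P*log2(P) + (1 - P)*log2((1 - P) / (N - 1))."""
    p = accuracy
    return log2(n_choices) + p * log2(p) + (1 - p) * log2((1 - p) / (n_choices - 1))

# 4.2 selections/min at 90% accuracy over 36 choices -> about 17.6 bits/min
print(4.2 * bits_per_selection(36, 0.90))
# 5.45 selections/min at 92.1% accuracy -> about 23.8 bits/min, close to the 23.75 reported
print(5.45 * bits_per_selection(36, 0.921))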

3.3.1.6. Commentary on P300 BCI Systems

The studies described above provide substantial grounds for optimism about future P300 BCI development. A wide variety of different subjects, including disabled subjects, can generate the activity necessary to send messages or commands without training and with minimal exertion or discomfort. The P300 can be evoked via a myriad of different displays in various environments. New preprocessing and pattern recognition approaches show considerable promise in improving the speed/accuracy tradeoff of such systems. Of particular interest is the wide disagreement about the feasibility of single trial P300 recognition. While in some studies averaging was necessary, others showed that single trial classification well above chance was possible. A few factors may explain this. First, the high hit percentages seen in Polikoff et al. (1995) and Bayliss' work come at the expense of a limited vocabulary. Subjects in Polikoff's study could choose among only four or five options, and subjects in Bayliss' work had either three (in the virtual stoplight study) or four (in the virtual apartment) choices. In Donchin and Hilit's work, on the other hand, there were 36 options. Thus, while attaining a hit rate of only 30% would be about or barely above chance with fewer options, it would be well above chance if 36 options were present. Second, Polikoff and his colleagues, and Bayliss and her advisor, come from an engineering/computing background, and were likely better able to judge the best classification algorithms than Donchin, who has less experience with pattern recognition. Third, Farwell and Donchin (1988) expressed concern that more complex pattern recognition approaches may be unfeasible given the limited computing power available in 1988. Bayliss utilized an SGI Onyx workstation with substantially

more power. Hence, Donchin's pessimistic evaluation of single trial recognition may have been initially influenced by concerns that sophisticated algorithms would slow down the BCI. As processing power continues to improve, increasingly powerful algorithms may result in ongoing improvements to the speed/accuracy tradeoff. It is also important to note that, while Donchin's second study recorded from more than one EEG channel, only one channel was used when classifying data. Since the P300 can be seen at numerous sites, an algorithm designed to utilize information from multiple electrodes may be able to perform better than approaches using only one site. Not only can sampling from additional channels reduce signal variability, the additional sites provide a simple "backup" signal when classification based on one site is uncertain. The use of multiple sites also makes it possible to take advantage of the fact that the P300 has a distinct scalp topography. Thus, larger sampling matrices provide both spatial and temporal information about the P300. Finally, the use of the term "P300 BCI" is somewhat problematic for three reasons. First, all of the P300 BCIs to date utilized pattern recognition approaches that based their classifications, in part, on non-P300 data12. Earlier components such as the N1 and N2 may also vary with attention. To date, no one has studied the relative contributions of P300 and non-P300 components, as could be done (for example) by exploring the weights applied to each timepoint by the SWDA analysis used in Donchin's work. Second, as research into the P300 and its composition continues, it is becoming increasingly clear that the P300 can be divided into subcomponents (Makeig et al. 1999,

12 Polikoff's system is the only one that makes discriminations based on a time window narrow enough that non-P300 components are likely to be excluded.

Jung et al. 2001), some of which may be more or less informative to a BCI. This may make the term "P300 BCI" obsolete as more meaningful ways of interpreting the ERP differences following an ignored or attended event are discovered. Similarly, it has long been established that attended vs. ignored events can produce changes in rhythmic EEG activity (mainly alpha) in addition to ERP changes (e.g., Borjesz 1960, Klimesch et al. 1998, Makeig et al. 1999, Spencer and Polich 1999, Yordanova et al. 2001, Jung et al. 2001). Third, P300 BCIs could be confused with other systems that utilize the P300, such as the "brainwave fingerprinting" approach described by Farwell and discussed below. However, "P300 BCI" is an acceptable term for now. The ERP differences evoked by attentional shifts are more pronounced in the P300 than in any other component, no BCI has yet isolated and utilized specific subcomponents of the P300, and nobody seems to have confused P300 BCIs with brainwave fingerprinting. As P300 BCIs evolve with technology, the term probably will as well.
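To make the suggestion about examining classifier weights concrete, the following sketch (illustrative only, and not the analysis used in any of the cited studies) shows how the weights of a linear discriminant trained on multi-channel, multi-timepoint epochs could be reshaped to reveal which channels and latencies carry the most discriminative information. The dimensions and data are hypothetical placeholders.

```python
# Hypothetical sketch: inspect which channels and timepoints drive a linear
# "attended vs. ignored" classifier, in the spirit of the SWDA weight analysis
# suggested above. All names, shapes, and data are illustrative assumptions.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

n_epochs, n_channels, n_times = 600, 8, 100        # assumed dimensions
rng = np.random.default_rng(0)
epochs = rng.standard_normal((n_epochs, n_channels, n_times))  # stand-in ERP epochs
labels = rng.integers(0, 2, n_epochs)                          # 1 = attended, 0 = ignored

X = epochs.reshape(n_epochs, -1)                   # concatenate all channels and timepoints
lda = LinearDiscriminantAnalysis().fit(X, labels)

# Reshape the weight vector back to (channels x timepoints) to see where the
# classifier is "looking": large weights near 300 ms would suggest P300
# contributions, earlier weights would suggest N1/N2 or other components.
weights = lda.coef_.reshape(n_channels, n_times)
print(np.abs(weights).mean(axis=0))                # mean absolute weight per timepoint
```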

3.3.2. Visual Evoked Potential (VEP) BCIs

3.3.2.1. Vidal (1973)

The first scientific paper to describe the means for building a BCI was Vidal (1973). In it, Vidal introduced the term "BCI," presciently described several possible approaches toward BCI design, and argued that BCIs based on the EEG are feasible. He built the first operational system a few years later (Vidal, 1977). It consisted of four bipolar electrodes placed over occipital areas to detect visual evoked potential (VEP) changes. Subjects were shown a diamond containing a checkerboard pattern. A maze was

also displayed on the screen, and users were instructed to navigate through it by fixating on one of the four corners of the diamond and thus producing distinct activity patterns over occipital areas. SWDA was used to determine which of the four corners was attended. This approach was over 90% accurate.

3.3.2.2. Sutter (1992)

In the next study to use the VEP in a BCI, subjects were shown an 8 x 8 grid containing letters and words and asked to focus their gaze on a target. Different subsets of these letters would then oscillate. Each symbol belonged to many subsets, and each subset was presented several times. Thus, by considering the responses evoked by each subset, it was possible to determine which character was the target. Different subsets were differentiated by flashing in different colors (red or green) and at different frequencies (40-70 Hz). Users were first trained to use the system, and their responses to each subset were recorded and used to create a VEP template of the response to each subset. This template was then used in an online version. Over 70 normal and 20 disabled users were trained on the system for 10 minutes to 1 hour, and most could communicate at 10-12 words per minute. A moderately disabled patient with ALS was able to communicate at about the same rate using four electrodes implanted epidurally. One innovation described in this BCI is the menu-based system for stimulus display. If the word the user wished to convey appeared on the screen, it could be selected. If not, the user designated the first letter of the word as the target, and a new screen appeared containing words beginning with that letter. The system was programmed with about 700 common words. The elements in the matrix could be

modified by the user to allow letters, words, sentences, or control commands. This system was substantially more flexible than Vidal's and was the first to be used with a disabled patient (Sutter, 1992; Sutter and Tran, 1992).
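The logic of subset-coded target selection can be sketched as follows. This is not Sutter's actual algorithm; the subset membership scheme, template scoring, and all dimensions are illustrative assumptions that only convey the bookkeeping idea: each symbol belongs to several flashing subsets, each subset presentation is scored against that subset's stored VEP template, and scores accumulate over the subsets containing each symbol.

```python
# Minimal sketch of subset-coded target identification (illustrative, not
# Sutter's implementation). Each symbol belongs to several subsets; the symbol
# whose subsets collectively match their VEP templates best is selected.
import numpy as np

n_symbols, n_subsets = 64, 16                            # assumed 8 x 8 grid, 16 subsets
rng = np.random.default_rng(1)
membership = rng.integers(0, 2, (n_symbols, n_subsets))  # 1 if symbol is in subset

def subset_score(response, template):
    """Correlate one measured VEP epoch with the template stored during training."""
    return np.corrcoef(response, template)[0, 1]

templates = rng.standard_normal((n_subsets, 120))        # stand-in templates (120 samples)
responses = rng.standard_normal((n_subsets, 120))        # stand-in online responses

scores = np.array([subset_score(responses[j], templates[j]) for j in range(n_subsets)])
symbol_scores = membership @ scores                      # accumulate evidence per symbol
target = int(np.argmax(symbol_scores))
print("selected symbol index:", target)
```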

3.3.2.3. McMillan (1997)/Middendorf (2000)

The Alternative Control Technology (ACT) laboratory of the US Air Force has developed BCIs utilizing the steady state visual evoked potential or SSVEP (also called the SSVER). Subjects in these experiments were shown two vertical rectangles oscillating at 13.25 Hz at different phases. By choosing to attend to one of the rectangles, subjects could exert control within about one second. The system sampled twice a second, allowing the user to exert continuous control. One BCI allowed a user to control a functional electrical stimulator (FES) to flex or relax muscles in his knee. Another, operated inside a flight simulator, allowed pilots to roll the simulator left or right. The flight simulator BCI could determine the intended direction 80-95% of the time, and the FES system allowed 95.8% accuracy. While subjects could use both of these systems above chance without any training, performance improved over the course of 20-45 minute training sessions with feedback. The type of feedback (discrete or proportional) did not matter significantly. The authors noticed considerable inter-subject variability, and one of the subjects was not able to achieve control (McMillan et al., 1997; Jones et al. 1998). A third type of BCI explored by ACT presented two rectangles oscillating at different frequencies (23.42 and 17.56 Hz). During each trial, subjects looked at one of the rectangles for a few seconds. Subjects performed 200 trials without training. The BCI

could correctly determine the target rectangle with 92% accuracy with an average selection time of 2.1 seconds (Middendorf et al., 2000).
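A frequency-tagged SSVEP selection of this kind can, in principle, be classified by comparing spectral power at the two stimulus frequencies. The sketch below is illustrative only and is not the ACT laboratory's implementation; the sampling rate, bandwidth, and data are assumptions, while the two frequencies are the values reported above.

```python
# Minimal sketch (not the ACT laboratory's code): decide which of two
# frequency-tagged rectangles is attended by comparing FFT power at the two
# stimulus frequencies in an occipital EEG epoch.
import numpy as np

fs = 256                        # assumed sampling rate (Hz)
f1, f2 = 23.42, 17.56           # stimulus frequencies from Middendorf et al. (2000)

def band_power(signal, freq, bandwidth=0.5):
    """Spectral power within +/- bandwidth Hz of a target frequency."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    mask = np.abs(freqs - freq) <= bandwidth
    return spectrum[mask].sum()

def classify_epoch(eeg_epoch):
    """Return 1 if the f1-tagged rectangle appears attended, else 2."""
    return 1 if band_power(eeg_epoch, f1) > band_power(eeg_epoch, f2) else 2

epoch = np.random.default_rng(2).standard_normal(2 * fs)   # 2 s of stand-in occipital EEG
print(classify_epoch(epoch))
```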

3.3.2.4. Gao (2002)

A BCI described recently by Gao (2002) showed substantial improvements over previous work. In his system, subjects could discriminate among up to 40 different targets using voluntary changes in the SSVEP. Data were recorded from two active electrodes, preprocessed with ICA, and classified using an "approximate entropy" approach. Subjects attained a mean transfer rate of 27.15 bits/min, with one subject sending over 50 bits/min. This is one of the fastest communication rates described in the literature. The improvement over prior work is likely due to the use of active electrodes and much more sophisticated pattern recognition. The authors are working toward making the system more compact and portable (Gao, 2002). VEP BCIs have received relatively little attention in the literature, even though they allow high information transfer rates with little or no training. This is primarily because it is believed that these systems require the user to shift his gaze. Thus, they could not be operated by an individual without control of eye muscles; moreover, if a user is moving his eyes, this movement can be converted to a control signal using an eye-tracking device. This places VEP BCIs in the category of "dependent BCIs" (Wolpaw et al., 2002) in that they depend on the operation of peripheral nerves and muscles to function. All other BCIs developed so far are "independent BCIs" in that no muscle or peripheral nerve activity is required. However, the assumption that gaze shifting is needed to affect SSVER activity may be in error. Recent work by Hillyard and colleagues (e.g., Teder-

Salejarvi et al. 1999) showed that subjects could influence SSVER activity by shifting attention alone, without shifting their gaze. Whether SSVER BCIs that do not allow eye movements would be as effective as those presented to date remains an empirical question.
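For reference, transfer rates such as Gao's 27.15 bits/min are conventionally computed with the information transfer rate formula presented in Wolpaw et al. (2002). For $N$ possible targets, each equally likely to be the intended one, selected with accuracy $P$ (and errors distributed evenly over the remaining targets), the information conveyed per selection is

$$
B \;=\; \log_2 N \;+\; P \log_2 P \;+\; (1 - P)\,\log_2\!\left(\frac{1 - P}{N - 1}\right),
$$

and the rate in bits/min is $B$ multiplied by the number of selections per minute. The specific accuracies and selection times underlying the figures reported by Gao (2002) are not reproduced here.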

3.3.3. Slow Cortical Potential (SCP) BCIs: The Thought Translation Device (TTD)

Another type of EEG activity that can serve as a BCI control signal is the slow cortical potential (SCP). The SCP is a very low frequency component, with potential shifts often developing over 0.5-10 seconds. SCP BCIs have been the primary interest of Birbaumer and his colleagues at the University of Tubingen, who refer to their BCI as the Thought Translation Device (TTD) (e.g., Birbaumer et al. 1999). The first TTD that allowed patient communication was described in Kubler et al. (1999). Three subjects with total motor paralysis resulting from ALS were prepped with electrodes over the C3, Cz, and C4 sites. They were trained to move a cursor in one of four directions. Vertical movement was controlled by generating positive or negative SCPs at Cz, and horizontal movement was controlled by positive or negative SCP asymmetry between C3 and C4. Training sessions consisted of 100-200 trials, and trials lasted four seconds. The onset of each trial was conveyed by a low tone, and the subject had two seconds during the ensuing "active phase" to move the cursor toward one of the four corners indicated by an arrow. After two seconds, a high tone was conveyed to indicate the beginning of the "passive phase," which was used as a baseline for the next trial. The subject could also decide not to move the cursor if there was no substantial difference between activity recorded during the active phase and the previous passive phase.

Training continued until subjects were able to perform a task sequence involving three individual movements (e.g., "keep the ball in the center, again keep the ball in the center, then move it toward the bottom") with 70% accuracy. This typically required over 100 training sessions over a period of 3-5 months. Two subjects completed training, and the third was unable to achieve adequate control. In a later study, three of five ALS subjects attained 70% accuracy. Two-dimensional control was generally not possible; later iterations of the TTD allowed the user to send only one of two signals: either the SCP was higher than during the previous baseline phase, or it was lower. Achieving 70% accuracy on a three-task sequence has been used as a general training criterion in all subsequent TTDs. Once subjects are trained, they may be switched to one of several different interfaces to allow communication. A language support program (LSP) was developed by these investigators allowing users to spell using SCP control. In this task, the subject is shown the first half of the German alphabet and may either select or ignore it. If the user selects it, he is then allowed to choose between the first two quarters of the alphabet, and so on. If the user does not choose the first half of the alphabet during the active phase, a two second passive phase occurs and the second half is then presented (Birbaumer et al. 1999; Birbaumer et al. 2000; Perelmouter and Birbaumer 2000). Kaiser et al. (2001) described an alternate interface designed to be more useful to severely disabled users. It involved a three-layered menu structure, as shown in Fig. 8:


Figure 3-8: The display used in Kaiser et al. (2001)

The system presented options one at a time, beginning at the top of the first level. As with the LSP, users could either select an option, bringing up a new palette of choices, or ignore it until the next option appeared. If the user wanted a cushion behind his back, for example, he would ignore the first two options presented at the first level, then select “body/position.” The system would then present “foot,” followed by “face,” “position,” then “cushion.” The user would ignore the selections until “cushion,” then choose it and select the following choice, “cushion behind back.” The design and testing of interfaces customized to the needs of the severely disabled is a crucial issue in BCI development. While this work is an excellent start,

researchers need to explore different interfaces with a variety of subjects and different types of BCIs. Birbaumer and colleagues continue to actively develop the TTD. They have explored the areas affected by SCP training using fMRI, which may eventually lead to an improved TTD utilizing combined EEG/fMRI information. They have also developed a version of the TTD using BCI2000 (see below) and are exploring the use of ERPs and mu control in combination with SCPs. Perhaps the most exciting development is their use of transcranial magnetic stimulation (TMS) to improve training. Preliminary results with this approach in healthy subjects indicate that low frequency TMS can improve learning to generate SCP positivity, while high frequency TMS pulses improve learning to generate SCP negativity. This may provide the basis for the first bi-directional human interface capable of both reading from and writing to the brain (Karim et al., 2002; Hinterberger et al., 2002; Birbaumer et al., 2002; Birbaumer et al., 2003). The TTD has given several individuals who would otherwise be unable to communicate a link to the outside world. This is a significant accomplishment. Nonetheless, training requires many months, and communication rates are slow: users can only convey about a word every two minutes.
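The binary select/reject logic underlying LSP-style spelling can be sketched as follows. This is a simplified illustration, not the Tubingen group's code: it assumes that a selection is registered whenever the mean SCP amplitude in the active phase exceeds the preceding baseline by a fixed threshold, and it collapses the repeated presentation of alphabet halves into a single halving step per decision.

```python
# Minimal sketch (illustrative assumptions only) of SCP-driven dichotomous
# spelling: a "selection" occurs when the active-phase SCP shift exceeds the
# preceding passive-phase baseline by a threshold, and each selection halves
# the remaining set of letters.
import string

def scp_selects(active_mean_uV, baseline_mean_uV, threshold_uV=8.0):
    """Return True if the SCP shift relative to baseline counts as a selection."""
    return (active_mean_uV - baseline_mean_uV) >= threshold_uV

def spell_letter(get_scp_decision, alphabet=string.ascii_uppercase):
    """Repeatedly offer the first half of the remaining letters until one is left."""
    letters = list(alphabet)
    while len(letters) > 1:
        offered = letters[: len(letters) // 2]
        if get_scp_decision(offered):          # user selects the offered half...
            letters = offered
        else:                                  # ...or ignores it; keep the other half
            letters = letters[len(letters) // 2 :]
    return letters[0]

# Example with a scripted "user" who wants the letter G:
target = "G"
print(spell_letter(lambda offered: target in offered))   # -> G
```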

3.3.4. Mu BCIs

The previous chapter described how real and/or imagined movements could produce changes in EEG activity. Changes in the free running EEG associated with movement imagery affect the mu rhythm. When these changes are time locked to specific

events and the event causes desynchronization of the activity, the changes are referred to as Event Related Desynchronization (ERD). If an event causes synchronization, this is known as Event Related Synchronization (ERS). There have been more papers published on BCIs utilizing mu/beta changes or ERD/ERS than on any other type of BCI. Part of the reason is that they can be used by a wide variety of people to send messages or commands after several hours of training. They rely on easily generated signals that can be recognized with a simple approach such as a Fast Fourier Transform (FFT). It is thus not surprising that a myriad of BCIs based on EEG changes associated with movement have been developed in numerous studies spanning over a decade.
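The "simple approach" mentioned above amounts to estimating band power in the mu range from short EEG epochs. The sketch below is illustrative only; the sampling rate, window, and epoch length are assumptions rather than parameters from any of the studies discussed in this section.

```python
# Minimal sketch of FFT-based mu-band (8-12 Hz) power estimation for one EEG
# epoch. All parameters are illustrative assumptions.
import numpy as np

def mu_power(epoch, fs=256, band=(8.0, 12.0)):
    """Mean spectral power in the mu band for a 1-D EEG epoch."""
    epoch = epoch - epoch.mean()                     # remove DC offset
    windowed = epoch * np.hanning(len(epoch))        # taper to reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(epoch), d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return spectrum[mask].mean()

epoch = np.random.default_rng(3).standard_normal(256)   # 1 s of stand-in EEG
print(mu_power(epoch))
```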

3.3.4.1. Wolpaw (1986)

The first such BCI was proposed by Wolpaw et al. (1986), and a working version was described in Wolpaw et al. (1991). The objective of this BCI was to allow individuals with severe motor deficits to communicate by moving a cursor up or down on a monitor screen with changes in mu rhythm amplitude or power. EEG was recorded bipolarly with gold disk electrodes placed anterior and posterior to one electrode site (C3). Data were sampled at 384 Hz, and activity in the mu band (8-12 Hz) was translated into one of 76 cursor positions. Lower mu amplitudes translated to a cursor position near the bottom of the screen, and higher mu amplitudes produced upward cursor movement. Five subjects volunteered to participate in three 45-minute training sessions per week for two months. During each trial, subjects first saw a screen with a cursor in the

center and a target at either the top or bottom. They would then attempt to move the cursor toward the target by modulating their mu activity. After the cursor reached the top or bottom of the screen, there was a brief delay before the next trial (see Fig. 9).

Figure 3-9: The display used in Wolpaw et al. (1991)

The effectiveness of this training varied considerably across subjects. One of the five subjects failed to learn to move the cursor and left the study after three weeks. The remaining four learned to modulate their mu amplitude enough to achieve performance well above chance. The authors noted that their system could be improved by using two-dimensional cursor control based on mu rhythm changes at different scalp locations, more rapid cursor movement, and a more sophisticated interface design. All of these improvements were implemented in a later version (Wolpaw et al. 1994) in which EEGs were recorded from two bipolar channels, FC3/CP3 and FC4/CP4, and analyzed every 200 ms. These sites are located over the left and right

motor cortices, the presumed cortical origin of mu activity. Subjects were trained to modify their mu amplitude in both hemispheres to allow a cursor to move both vertically and horizontally. This BCI also used simple pattern recognition techniques. After exploring several different methods of translating left and right hemisphere mu activity into cursor movement, the authors simply based vertical cursor movement on the linear sum of each hemisphere's mu amplitude, while horizontal cursor movement was derived by subtracting left hemisphere mu amplitude from right hemisphere mu amplitude. As with the previous study (Wolpaw et al., 1991), all but one subject learned to modulate mu activity to produce performance well above chance after several weeks of training. Fig. 10 shows the system's speed and accuracy for each of the five subjects:

Figure 3-10: Accuracy and hit rate from Wolpaw et al. (1991)
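The sum/difference translation rule described above can be sketched in a few lines. This is illustrative only, not the Wadsworth group's code; the gains and offset would be tuned per user, and the values and names here are placeholders.

```python
# Minimal sketch of the translation rule described above: vertical cursor
# movement follows the sum of left- and right-hemisphere mu amplitudes, and
# horizontal movement follows their difference (right minus left). Gains and
# offset are placeholder assumptions.
def cursor_step(mu_left, mu_right, gain_v=1.0, gain_h=1.0, offset_v=0.0):
    dy = gain_v * ((mu_left + mu_right) - offset_v)   # sum controls vertical motion
    dx = gain_h * (mu_right - mu_left)                # difference controls horizontal motion
    return dx, dy

print(cursor_step(mu_left=4.2, mu_right=5.1))
```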

Despite the apparent advantage of a BCI that allows two degrees of freedom over a BCI allowing one, these authors have not pursued a two-dimensional mu BCI. They

found that such a BCI was harder to use and less accurate than their one-dimensional BCI, and was not popular with subjects (Wolpaw, 2002). The designers of these mu-based systems have remained very active. Their work has further improved the speed and accuracy of their Wadsworth BCI and clarified numerous issues of interest to BCI researchers. Their papers have explored the role of feedback, improved spatial filters, ensured that the appearance of mu changes was not due to background EMG activity, demonstrated their system's ability to function with a late stage ALS patient, explored improved interfaces to enable spelling or answering questions, and explored the value of using movement-related changes in beta activity as well as alpha (Miner et al., 1996; Wolpaw et al., 1997; Miner et al., 1998; Wolpaw et al., 1998a, 1998b; McFarland et al., 1997, 1998; Wolpaw et al., 1999; Wolpaw et al., 2000a; Wolpaw et al., 2000b; Wolpaw et al., 2002).

3.3.4.2. Pfurtscheller (1996)

A different research group, led by Pfurtscheller at the University of Graz, has also been actively exploring BCIs based on ERD/ERS rhythm changes. Kalcher et al. (1996) described a system called the "Graz BCI II" designed to use changes in mu activity to discriminate between three types of movement: left finger, right finger, or right toes. It is an online version of the "Graz BCI" described in Pfurtscheller et al. (1993). EEG was recorded bipolarly over three sites, C3-C3', Cz-Cz', and C4-C4', approximately the same areas used in Wolpaw and McFarland (1994). Four subjects participated in either four or five sessions. In the first three sessions, subjects completed numerous trials in which they were presented with an arrow pointing in one of three

directions, indicating which of the three movement types should be performed. The arrow remained for 1250 ms, and data were recorded during the last 1000 ms for online classification. Data from the first session were used to train the classifier. In the later sessions, about two seconds after the subject moved and the arrow disappeared, the BCI presented feedback indicating both which of the three movements the BCI believed the subject had performed and the BCI's confidence in that decision. The last one or two sessions (two subjects participated in a fifth session) had the same design as the first three, except that the subject was asked to imagine the movement rather than perform it. This BCI used a distinction sensitive learning vector quantizer (DSLVQ) algorithm implemented through neural networks (Kohonen, 1990; Flotzinger et al., 1992, 1994). This technique essentially creates idealized or prototypical vectors for each of the three movements. Whenever a subject initiates one of the three possible movements, the resulting new data (his or her ERP pattern associated with that movement) is compared to the three idealized vectors. The BCI then computes a weighted distance from the new data to each vector and guesses which of the three movements the subject performed. The system could also report its confidence in each of its categorization attempts. The Graz BCI II was able to discriminate between the three types of movements above chance for all subjects for both real and imagined movements. On average, it classified real movements with about 50% accuracy and imaginary movements with about 45% accuracy. Another important result from this study is that almost no training was necessary; the BCI could achieve good performance within the subject's first few sessions. This is in stark contrast to Wolpaw et al.'s New York BCI, in which subjects were trained

for many hours over several weeks before being tested on the BCI. The designers of the Graz BCI II provide the following explanation: "As the results did not improve over the course of the sessions, it cannot be said that the subjects have learned from biofeedback. In the New York BCI (Wolpaw et al., 1991, 1994; McFarland et al., 1993), however, where no clear guidelines were given as to which strategies the subjects should adopt for good performance, results were reported to improve over sessions. This indicates that the strategies required in the Graz paradigm (movement planning of the required body parts) are well defined and cannot be refined by biofeedback training" (Kalcher et al., 1996). The Graz BCI has since been improved in a variety of ways. The authors have primarily explored numerous approaches toward preprocessing and classification, such as the use of adaptive autoregression, common spatial filters, newer neural networks for DSLVQ, linear discriminant analysis, and improved frequency component selection. The use of common spatial filters produced especially low error rates, but came at a cost: the approach requires more electrodes than other approaches and is especially sensitive to electrode placement. Even without the use of common spatial filters, the Graz BCI can now attain above 90% accuracy when discriminating between left and right motor imagery. The Graz group has also studied the role of feedback, showing that realtime feedback is not essential but does improve performance. Three papers have described how the Graz BCI can be implemented on a Windows platform (Guger et al., 1999; Guger, 2001; Guger and Edlinger, 2002)13. Two studies explored the use of different interfaces with disabled subjects. One of them allowed a tetraplegic subject to control a

hand orthosis with almost 100% accuracy. Another study, conducted with Birbaumer and Kubler of the University of Tubingen, allowed a patient with severe cerebral palsy to spell using a virtual keyboard (Muller et al., 2002; Neuper et al., 2002).
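The core of LVQ-style classification used in the Graz BCI can be sketched as a nearest-prototype decision. The sketch below is illustrative only: it omits prototype learning and the distinction-sensitive feature weighting that true DSLVQ performs, and all dimensions and data are assumptions.

```python
# Minimal sketch of nearest-prototype (LVQ-style) classification: each movement
# class is represented by a prototype vector, and a new feature vector is
# assigned to the class whose prototype is closest under a (possibly
# feature-weighted) distance. Prototype training is omitted.
import numpy as np

def lvq_classify(features, prototypes, feature_weights=None):
    """features: (n_features,); prototypes: (n_classes, n_features)."""
    if feature_weights is None:
        feature_weights = np.ones(features.shape[0])
    dists = np.sqrt((feature_weights * (prototypes - features) ** 2).sum(axis=1))
    confidence = 1.0 - dists.min() / dists.sum()     # crude confidence measure
    return int(np.argmin(dists)), confidence

rng = np.random.default_rng(4)
prototypes = rng.standard_normal((3, 12))            # left finger / right finger / toes
sample = prototypes[1] + 0.1 * rng.standard_normal(12)
print(lvq_classify(sample, prototypes))               # -> class 1 and a confidence value
```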

3.3.4.3. Kostov and Polak (2000)

This work described a mu BCI similar to the Wadsworth BCI but which used different preprocessing and classification approaches. Several subjects were prepped for recording using up to 28 channels. The cap used contained "pre-gelled" electrodes, which utilized a much more viscous gel than typical electrode caps to reduce preparation time and discomfort. All subjects were trained in a paradigm similar to that used in the Wadsworth BCI: a cursor appeared at the center of the screen, and a target appeared at the top or bottom. The subject learned to control mu rhythm activity to direct the cursor toward the target. Two subjects also participated in a second condition, in which targets could also appear to the left or right of the screen and two-dimensional control was needed. Data from a subset of each subject's electrodes were preprocessed using autoregressive (AR) feature extraction, then classified using a neural network. Subjects attained accuracy near 100% in the one-dimensional movement condition. The two subjects who completed the two-dimensional condition also performed fairly well. In their final session, one subject hit targets 70% of the time, and the other attained 85% accuracy.

13 A system similar to the Graz BCI that operates under Windows can be purchased from a company founded by one of Pfurtscheller's colleagues, Christoph Guger (http://www.gtec.at/).

As expected, a post hoc FFT over sites C3 and C4 revealed that subjects showed more power around 10 Hz when moving the cursor up than down. This effect was reversed over sites P3 and P4. This confirms the view that the source of mu activity is beneath the central and parietal electrodes, in areas corresponding to somatosensory and motor cortex.
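Autoregressive feature extraction of the kind used here models each epoch as a weighted sum of its own recent samples and uses the fitted coefficients as features. The sketch below is illustrative only, not Kostov and Polak's code; the model order and epoch length are assumptions, and the coefficients are estimated with the Yule-Walker equations.

```python
# Minimal sketch of AR feature extraction: fit AR coefficients to a zero-mean
# EEG epoch via the Yule-Walker equations and use them as a feature vector.
# Model order and epoch length are illustrative assumptions.
import numpy as np

def ar_features(epoch, order=6):
    """Return AR coefficients a_1..a_order for a 1-D EEG epoch."""
    x = epoch - epoch.mean()
    n = len(x)
    # Biased autocorrelation estimates r[0..order]
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])  # Toeplitz
    return np.linalg.solve(R, r[1 : order + 1])      # Yule-Walker solution

epoch = np.random.default_rng(5).standard_normal(512)
print(ar_features(epoch))                            # feature vector for a classifier
```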

3.3.4.4. Pineda et al. (2003)

All previously described movement-based BCIs used a plain visual display during training and operation. A recent study sought to explore the effects of a much more complex and engaging environment - a first-person 3D shooter game - on subjects' ability to learn and execute mu control. EEG activity was recorded bipolarly from C3 and C4 electrode sites embedded in a headband. While more sophisticated and conventional electrode caps were available, the authors sought to show that control could be achieved using a more portable, practical, and fashionable14 apparatus such as a headband. Mu activity was extracted using the variable epoch Fourier transform (VEFD), an approach similar to an FFT that utilizes a sliding window to avoid ballistic cursor movements. As a comparison, some subjects learned to control mu activity in a simple environment, in which mu changes affected the height of a plain yellow bar. Other subjects were trained in the game environment, in which mu changes turned the character (and thus the image seen in the game environment) to the left or right. It was found that subjects could be trained to achieve reliable mu control using the virtual environment in approximately 3-6


hours. This was a faster learning rate than in the simple condition. Furthermore, subjects were able to exert mu control even in the distracting environment of a computer game (Pineda et al., 2003). The fact that experienced mu neuronauts could exert control in a computer game environment is not surprising given that cognitive research has shown that learning is most effective when at least four fundamental conditions are present:

1) active engagement; 2) frequent interactions; 3) feedback; and 4) connections to real world contexts (Roschelle et al., 2000). As subjects first try to learn mu control, they report that motor imagery is often crucial. Mu activity can be decreased by imagining movement, and increased by avoiding thoughts of movement. However, after approximately 6 hours of training, subjects report that motor imagery is no longer necessary, and that they can change mu activity at will. Mu control becomes a background skill requiring little or no attention, much like riding a bike. The suggestion that training can be improved using an immersive, complex environment complements the observation that the feedback provided in the Wadsworth BCI after training was unnecessary (Wolpaw et al., 1998). Taken together, these studies suggest that the nature of feedback is very important during training but unnecessary after the subject is trained. As noted in Wolpaw et al. (1998), feedback may even impair performance by distracting subjects. The relationship between distraction and performance in BCIs has not been well explored, and is discussed later in this chapter. As at least one subject in the San Diego BCI study reported clenching her teeth to exert control, some of the rapid learning seen using this BCI may have been due to the

14 Fashion is in the eye of the beholder.

use of non-EEG activity. Another possible drawback of the study is that the subjects were not playing a game so much as using a virtual environment. The subject's avatar in the game was never threatened or injured, there was no competition of any kind, no requirement to actually exert control, and no reward for performance. Hence, some of the conditions meant to increase learning, such as active engagement and frequent interactions, could be further developed in an environment in which the character could be injured and subjects were highly motivated by extra pay for good performance and/or the desire to beat a human or computer opponent.

3.3.5. Mental task BCIs

An emerging category of BCIs called "mental task" BCIs is quickly gaining acceptance. In this approach, subjects imagine performing one of several possible mental tasks in order to send a message. Wolpaw et al. (2002) discussed mental task BCIs along with mu BCIs for three reasons: both rely heavily on mu changes; most realtime systems described in the literature utilize at least one movement imagery task; and there are very few mental task BCIs. They are presented as a separate category here because they rely on explicit mental strategies, while mu BCIs do not.

3.3.5.1. Keirn and Aunon (1990)

The first paper to describe an offline system that laid the groundwork for a mental task BCI was Keirn and Aunon (1990). Five subjects were prepared with electrodes at C3, C4, P3, P4, O1, and O2. Subjects were asked to imagine performing one of five tasks: relaxation, multiplication, rotating a 3D figure, composing a letter to a friend, and

visualizing numbers written on a blackboard. Each subject performed these tasks with eyes open and eyes closed, creating ten experimental conditions. They were asked to perform each task for 10 seconds, and this was repeated five times in each of two sessions. Data were analyzed offline. The spectral density was first derived using a Fourier transform and classified with a Bayes quadratic classifier. This approach was able to identify which of the ten tasks was being imagined with at least 70% accuracy for all subjects. When asked to discriminate between task pairs, it could do so with 90% accuracy. The authors also explored the value of preprocessing the signal using autoregression with orders from 1-10. Accuracy improved as the order increased from 1 to 5, but higher orders did not significantly improve performance. The differences between conditions existed in the 8-12 Hz range (alpha and mu) as well as in theta, beta, and delta activity. Charles Anderson and colleagues adopted this approach in 1995. They have done considerable work exploring the best preprocessing and pattern recognition approaches and can discriminate between these five tasks with at least 90% accuracy. While Anderson has not yet built an online mental task BCI, he is currently developing one, and his work to date with offline systems should contribute significantly to his and others' mental task BCIs (Anderson et al. 1995; Anderson and Sijercic 1996; Anderson 1997; Anderson et al. 1998; Anderson and Peterson 2001; Anderson et al. 2002a; Anderson et al. 2002b).

3.3.5.2. Pfurtscheller (2001)

Pfurtscheller and his colleagues explored the use of a non-motor mental task as part of their Graz BCI. As is typical of their approach, during each trial subjects were prompted to imagine one of four movements (left hand, right hand, foot, and tongue) but might also be prompted to perform mental subtraction. Data from the alpha/mu and beta ranges were decomposed into frequency bands and then classified with a hidden Markov model (Obermaier et al. 2001). The system's accuracy was explored when discriminating between two and five different mental tasks. As might be expected, accuracy was best when discriminating among fewer classes. Two observations bear repeating. One is that the subjects used in the Graz mental task study had all been trained in a previous version of the Graz BCI (which used only left and right hand movement), but showed no preference for these two tasks. Second, no pair of tasks could be discriminated significantly more easily than any other. This latter result is consistent with observations in other mental task BCIs, and suggests that people who design and use mental task BCIs are relatively free to choose tasks according to whatever is most comfortable for the user, rather than according to which tasks might result in the best classification.

3.3.5.3. Penny et al. (2000)

Steve Roberts and his colleagues have developed a BCI called the "Oxford-Putney BCI system" able to discriminate between imagined movement and other imagined tasks. The first such system used autoregression and a Bayes classifier to discriminate between two task pairs: motor imagery vs. relaxation and motor imagery vs. math. In the motor imagery task, subjects were asked to imagine opening and closing

their dominant hand. In the math task, subjects were given a large number and asked to imagine repeatedly subtracting seven from it. Data were recorded from two electrodes placed near C3 and C4, and feedback was provided by cursor movement. The differential activity was found primarily in the mu band, as well as in theta and beta. The average accuracy over seven subjects was 87% (Penny et al., 2000). A later paper by the same group described an offline system that was able to discriminate between performance of three tasks: auditory imagination, imagined navigation, and imagined right hand movement. This approach could discriminate task pairs with about 80% accuracy and could determine which of the three tasks was imagined with about 75% accuracy (Sykacek et al., 2002). The papers by Roberts and his colleagues showed the viability of an online mental task BCI and further explored the question of which pattern recognition approaches are best.

3.3.6. Implanted BCIs

The BCI systems described thus far are all noninvasive; that is, they operate using electrodes placed on the scalp. As noted in the previous chapter, the EEG signal recorded from the scalp is distorted by various protective tissues covering the brain, as well as the skull, blood flow, CSF, and other components between the brain and scalp. Hence, implanting electrodes that bypass these tissues, often on the surface of cortex or within the brain itself, can provide a better signal-to-noise ratio than extracranial recordings. The electrophysiological information derived from electrodes in contact with the cortex is called the electrocorticogram or ECoG. A second advantage of recording directly from the cortex is that it enables recording from single neurons, or small

populations of them, which is not possible with extracranial EEG. It also enables monitoring neurons deep inside the brain whose activity would be effectively invisible to a scalp electrode. Further, since electrodes are implanted chronically, it is unnecessary for a user to be "prepped" prior to each session with the BCI. This advantage is particularly appealing to severely disabled individuals who are unable to prep themselves and may wish to use their BCI for several hours a day. Of course, the initial surgery required to implant the electrodes is a major procedure, and implanted electrodes should be examined periodically by a doctor. The surgical procedures required to implant electrodes in humans are only appropriate in extreme cases, such as in chronically and severely disabled individuals. Thus, most implanted BCI research has focused on providing a means by which motor information derived from electrodes in contact with motor areas of the brain can control an external device such as a keyboard or robot arm.

3.3.6.1. Huggins et al. (1997)

The first successfully implanted direct brain interface (DBI) system in humans was described in Huggins et al. (1997). That paper presented an offline system in which electrodes were implanted into various cortical areas in patients suffering from severe epilepsy15. The areas implanted varied among subjects, but focused primarily on motor and temporal areas. Subjects were asked to perform a variety of actions and say either "pah" or "tah" while ECoGs were recorded. This activity was cross-correlated with other periods in

15 The authors who developed and tested the DBI repeatedly emphasize that the decision of whether and where to implant the electrodes was guided solely by medical and not research factors.

which the subject performed the same activity. This approach was generally successful, and follow-up papers developed this offline system with more subjects. The system was able to accurately discriminate between the performance of six different tasks. Of 17 subjects, 13 attained a hit rate above 90%. The authors also included false positive rates and HF (hit minus false alarm) differences for each subject; eight attained an HF difference above 80% (Huggins et al. 1999; Levine et al. 1999; Levine et al. 2000). An online version of this system using simple feedback was also developed (Rohde et al. 1999; Rohde 2000; Huggins et al. 2002; Rohde et al. 2002). The system did produce some improvement in learning. However, the feedback was not based on the pattern recognition system's evaluation of the user's actions, but on "an indirect measure of the 'quality' of the EEG." The authors are developing an online version that uses the cross-correlation method's results to provide feedback and can operate based on imagined rather than actual movements.
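Template cross-correlation detection of this general kind can be sketched as follows. This is illustrative only, not the Michigan group's method: the template, threshold, and signal are placeholders, and the sketch simply declares a detection wherever the normalized cross-correlation between the signal and an averaged template exceeds a threshold.

```python
# Minimal sketch of template cross-correlation detection: slide an averaged
# event template across an ECoG trace and report where the normalized
# correlation exceeds a threshold. All parameters are illustrative assumptions.
import numpy as np

def detect_events(signal, template, threshold=0.6):
    """Return sample indices where normalized cross-correlation exceeds threshold."""
    t = (template - template.mean()) / (template.std() * len(template))
    hits = []
    for start in range(len(signal) - len(template) + 1):
        window = signal[start : start + len(template)]
        w = (window - window.mean()) / (window.std() + 1e-12)
        if np.dot(t, w) >= threshold:             # Pearson correlation with the template
            hits.append(start)
    return hits

rng = np.random.default_rng(6)
template = np.sin(np.linspace(0, np.pi, 100))     # stand-in averaged event template
signal = rng.standard_normal(2000)
signal[500:600] += 3 * template                   # embed one synthetic "event"
print(detect_events(signal, template))            # indices near sample 500
```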

3.3.6.2. Kennedy and Bakay (1998)

In this study, a BCI system was implanted into a single subject suffering from ALS. At the time of implantation, the subject had only limited control of eye muscles and could move the left corner of her mouth. The authors used fMRI to isolate an ideal area for implantation. Two neurotrophic electrodes were placed in the hand area of right motor cortex. As these electrodes are designed to encourage neurons to grow into them over time, it is unclear how many neurons were actually monitored by this device. Forty-five days after implantation, action potentials were stably recorded and the patient was trained to control a binary switch. This was done by asking the subject to engage in self-

generated mental imagery while providing visual and auditory feedback. The patient was able to send a correct signal 10 times on the first day of training with no errors. Good performance was also reported over the subsequent several weeks. Unfortunately, the patient died 77 days after implantation. A similar approach was later used with another subject, JR, who suffered a brainstem stroke that left him without motor control, aside from some facial muscle activity. Within two months of surgery, neural activity could be recorded as the patient moved his mouth and tongue. Five months after surgery, the patient could generate activity without any overt movements. This activity allowed JR to move a cursor over a single row of five icons. He could select an icon by holding the cursor over it for a specific duration. JR later recovered a small amount of motor control of his left neck, arm, or toe. The authors designed a new interface that allowed him to use the BCI to control horizontal cursor movement and EMG to control vertical movement. This was mapped onto a virtual keyboard presented on a monitor. JR was able to spell at about three words per minute with some errors. JR initially had to exhibit overt motor activity to control the BCI but learned to communicate without doing so. This raises the question of what JR was thinking as he became more experienced with the system. After JR learned to use the system without overt movements, an experimenter asked, "What were you thinking when moving the cursor?" JR replied, "NOTHING." Nonetheless, using the BCI appeared to be mentally taxing, and JR often needed a break. Since neural activity in motor areas typically produces EMG activity, and JR was still capable of controlling muscle activity from the implanted area, he may have

developed neurons in his motor cortex devoted only to controlling the cursor. The authors refer to this as "cursor cortex" (Kennedy et al., 2000). This is consistent with observations from the animal work presented below. The notion of "cursor cortex" is intriguing. It has been reported that individuals who use a mu or mental task BCI must initially devote their full attention to motor or other imagery, but do not need to do so after training. This suggests that subjects have developed neural mechanisms devoted to no other purpose than controlling a BCI. If so, this is an entirely new type of brain tissue worthy of further study. Which types of neurons, and what connections among them, are best suited to "cursor cortex"? Which chemical systems are most relevant? Can everyone develop cursor cortex? Are there any noteworthy side effects? The answers to these questions are likely to both improve future BCIs and enhance our understanding of cognitive neuroscience.

3.3.6.3. Chapin et al. (1999)

These investigators described a BCI in which a rat gained one-dimensional control over a robot arm. Six rats were first trained to press a lever that moved a robot arm to provide water. The sequence of movements made by the rat was recorded and analyzed, and it was found that depressing the lever required four distinct movements of the forelimb and paw. The authors then implanted electrode arrays in primary motor cortex (M1) and ventrolateral thalamus (VL), the region of the thalamus associated with limb movements (Kandel et al., 1992). This enabled simultaneous recording from between 21 and 46 individual neurons in these areas.

A simple discriminant function analysis was initially used to interpret the activity in these neuronal populations to predict each movement. While it had an overall accuracy of 82%, this approach was less effective at predicting the timing and force of each movement. In order to account for temporal as well as spatial information, the authors first used PCA to derive uncorrelated components from the activity of the neuronal populations being recorded. The first principal component was used to train a recurrent backpropagation neural network to predict lever movement (r = .86). Finally, this two-stage process was used to interpret the rats' brain activity in realtime. This was effective in four of the six rats, with two of them achieving 100% efficiency in controlling the robot arm without moving the lever. The authors also noted that the neuronal populations were active before the onset of any EMG activity, and that "over continued trials, the ability of the brain-derived signal to control the robot arm became increasingly independent of the forelimb movement, with which it was normally associated."
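The first stage of such a pipeline, reducing population activity to its first principal component, can be sketched briefly. This is illustrative only, not Chapin et al.'s code; the bin counts, neuron counts, and data are assumptions.

```python
# Minimal sketch of the first stage described above: project simultaneously
# recorded population activity onto its first principal component, which could
# then be fed to a predictor of lever position. Dimensions are assumptions.
import numpy as np

rng = np.random.default_rng(7)
population = rng.poisson(3.0, size=(1000, 32)).astype(float)   # time bins x neurons

centered = population - population.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)        # PCA via SVD
first_pc = centered @ vt[0]            # projection of each time bin onto the first PC
print(first_pc[:5])
```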

Figure 3-11: This BCI enables a rat to control a robot arm. The switch (D) that operates the robot arm (B) can be controlled by the rat pressing the bar, or by neural activity (C) recorded from electrodes in the brain (A).

While direct one-dimensional control of a robot arm could be of considerable value to the disabled, allowing more complex movements would be of greater benefit. A subsequent study (Wessberg et al., 2000) sought to explore whether a primate could effect complex three-dimensional movements using only brain activity. Two owl monkeys were implanted with electrodes in cortical areas. The first monkey had 96 microwire electrodes implanted in five different areas: left primary motor cortex, left and right dorsal premotor area, and left and right posterior parietal cortex. The second monkey had 32 electrodes implanted in left M1 and left dorsal premotor cortex. The monkeys were then trained on two tasks while activity was recorded. In the first task, monkeys moved a lever left or right in response to a visual cue. In the second, monkeys made three dimensional hand movements to reach for food. It was found that both linear and neural network algorithms could accurately predict limb movements in both tasks in realtime. Both monkeys were able to use brain activity to directly control one- and three-dimensional movements of a robot arm. This control worked in realtime even when the robot arm was at a remote location and information was transmitted via the Internet. Another positive result was that the second monkey could still exert control 24 months after the electrodes were implanted; this is consistent with human and primate studies, discussed later in this chapter, showing that neuroprosthetic systems using implanted electrodes continue to function safely and effectively for years after implantation.


3.3.6.4. Taylor, Tillery, and Schwartz (2002)

A recent study also showed that monkeys could direct 3D movements in realtime via electrodes chronically implanted in the brain. Taylor et al. (2002) placed electrodes in the left motor and premotor areas of two rhesus macaque monkeys such that about 18 single neurons were being monitored in each monkey. Each monkey's left arm was restrained, while a position sensor was placed over its right hand. The monkeys were shown a 3D virtual environment with two spheres: a yellow moving cursor and a blue stationary target. At the beginning of each trial, the monkey saw the yellow cursor appear at the center of an imaginary cube. The blue target appeared at one of the eight corners of the cube, and the monkey had 10-15 seconds to direct the yellow cursor to the target.

Figure 3-12: (A) shows the apparatus the monkeys used while operating the BCI. (B) shows the eight locations on the corners of an imaginary cube where a target may appear.


The monkeys were first trained to direct the cursor toward the target using hand motions. In this "open-loop" condition, the monkey's actual hand movements were used to direct the cursor, and neural activity associated with these movements was translated into a control signal offline. The authors determined that, had these trajectories calculated offline been used as a control signal, both monkeys would have hit their targets significantly above chance. However, performance would still have been poor, with only 27% of trials resulting in "hits." The monkeys were then tested in a "closed-loop" condition in which neural activity was translated into a control signal in realtime and used to direct the yellow cursor toward the target. Unlike the "open-loop" condition, monkeys had realtime feedback about cursor position derived directly from their neural activity. Both monkeys' performance improved significantly in the "closed-loop" condition, hitting the target about 49% of the time.

Figure 3-13: Drawings representing the open-loop condition, in which activity is recorded from cortical neurons to be studied offline, and the closed-loop condition, in which neural activity directs the movement of a cursor in realtime, providing the monkey with feedback.

One of the monkeys also participated in faster versions of the open- and closed-loop conditions, in which the cursor gain was increased and the time allowed to move the yellow cursor toward the target was decreased to 800 ms. The monkey still showed significantly better performance in the "closed-loop" trials, with a mean hit rate of 42% compared to 12% in the "open-loop" condition. The same monkey also participated in two different "closed-loop" movement tasks to determine whether its ability to directly control cursor movements could be generalized. In one task, the monkey had to move the cursor out to a peripheral target as before, but then had to move it back to the center to obtain a reward. In another task, six new cursor positions were added, one at the center of each face of the imaginary cube. The monkey was able to perform both tasks well above chance. Other noteworthy observations were reported. First, the monkeys' performance improved with daily practice; furthermore, there was some improvement within each day. Second, different translation parameters had to be used for each neuron. Third, the monkeys' neurons' tuning characteristics changed as their performance at the task improved, though they became more stable later in the study. Thus, the translation algorithm should be updated periodically to improve performance. Fourth, the monkeys' physical movements as measured by EMG decreased or disappeared as they learned the "closed-loop" control task.

3.3.6.5. Donoghue (2002)/Serruya et al. (2002)

In one study, the Utah Intracranial Electrode Array (UIEA) was implanted into the primary motor cortex of three macaque monkeys such that between 7 and 30 neurons

were being recorded. Monkeys were first trained to use a manipulandum to direct a cursor toward a target on a monitor. The target appeared at random and moved pseudorandomly (even if the monkey hit the target) to provide the monkey with an ongoing task. A linear filter customized to each neuron then transformed the monkey's neural activity into a fairly accurate reconstruction of hand movements; the reconstructed hand movements accounted for over 60% of the variance seen in the actual ones. One of the monkeys was then tested in a "closed-loop" version of a similar task. In this condition, the target appeared at a random location but remained stationary. The monkey first performed the "open-loop" version of the task for about one minute while activity was recorded so that preliminary linear filters could be constructed. The monkey was then switched to neural control without warning. The monkey was able to direct the cursor through the BCI immediately. Though EMG was not recorded, the authors observed that the monkey sometimes did not make any hand motions during the "closed-loop" condition. The online reconstructions were about 70% as accurate as the actual hand movements, and the monkey required slightly more time to acquire the targets. The task could not be performed if filter coefficients were randomly altered or switched between neurons, indicating that linear filters must indeed be customized to each neuron (Donoghue 2002; Serruya et al. 2002).
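Linear-filter decoding of this general kind can be sketched as a regression from recent firing rates to hand position. The sketch below is illustrative only, not the filters used in the studies above; the bin size, number of lags, unit count, and data are assumptions, and the weights are fit by ordinary least squares on an "open-loop" training segment.

```python
# Minimal sketch of linear-filter decoding: model hand position at each time
# bin as a weighted sum of the last few bins of every unit's firing rate, with
# weights fit by least squares. All dimensions and data are assumptions.
import numpy as np

def build_lagged(rates, n_lags=10):
    """Stack the last n_lags bins of every unit into one regressor row per bin."""
    n_bins, n_units = rates.shape
    rows = [rates[t - n_lags : t].ravel() for t in range(n_lags, n_bins)]
    return np.asarray(rows)

rng = np.random.default_rng(8)
rates = rng.poisson(2.0, size=(3000, 20)).astype(float)     # firing rates (bins x units)
hand_xy = rng.standard_normal((3000, 2)).cumsum(axis=0)     # stand-in hand trajectory

X = build_lagged(rates)                                     # (bins - lags) x (lags * units)
Y = hand_xy[10:]                                            # align targets with regressors
W, *_ = np.linalg.lstsq(X, Y, rcond=None)                   # fit filter weights offline

predicted = X @ W                                           # "closed-loop" step: multiply
print(predicted[:3])
```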

3.4. What isn't a BCI?

Given the general lack of consensus on what exactly constitutes a BCI, it is necessary to briefly describe some related systems and clarify why they are not

considered BCIs. Some of these systems come very close to the proposed definition and have influenced BCI development, and are thus discussed more fully here. Several devices currently on the market allow a user to send messages or commands via sensors placed on the head but convey information primarily through eye movements, facial gestures, tooth clenching, or other non-EEG activity. Systems which utilize information carried in the electrooculogram (EOG) and/or electromyogram (EMG) function well, and are sometimes combined with a BCI to allow a user increased degrees of freedom, but are not considered true BCIs because they use normal output pathways. Systems that use neurofeedback to encourage relaxation, treat disorders, or increase personal wellness are often incorrectly considered BCIs. In neurofeedback, users learn to voluntarily modulate activity in specific frequency bands such as alpha or beta in order to produce a desired effect such as relaxation or increased concentration (Robbins, 2000). While a neurofeedback system can be effective, it alone is not a BCI because it does not by itself allow communication. Some BCIs use neurofeedback as a means of training subjects on their use, but these systems are BCIs because they enable communication, not because they use neurofeedback. Other, more closely related systems read and translate information about brain function in realtime for non-medical applications. Examples include alertness or workload monitors and the brainwave fingerprinting approach described by Farwell (e.g., Farwell and Richardson, 1993). These systems do not meet the definition of a BCI in that they are not designed to allow the user to send messages or commands. Instead, they read

information that the user's brain generates as a byproduct of performing a task. The user is not expected to voluntarily modulate brain activity in order to convey information16.

3.4.1. Alertness / workload monitors

A group of San Diego-based researchers working in conjunction with the Navy described a device capable of detecting alertness deficits in attention-critical situations (Jung et al., 1997). The system's objective was to passively monitor subjects performing a simulated sonar detection task and provide a report of their alertness level. Subjects were asked to perform in at least three 28-minute sessions in which they were presented auditory and visual stimuli and asked to respond to both. This paradigm was expected to produce periodic alertness deficits. EEGs were recorded from two midline sites, Cz and POz. The authors trained the system on each subject's data, allowing customized pattern recognition parameters for each subject. This was considered necessary as previous work had shown that there is considerable inter-subject variability in power spectrum changes associated with alertness (e.g., Makeig and Inlow, 1993). Training was accomplished by applying PCA to the data followed by a feedforward neural network, which learned through backpropagation. The investigators found that some components of the EEG power spectrum did indeed fluctuate predictably with alertness, despite considerable inter-subject variability.

16 Although the user could choose to alter his brain activity to get a response from the system and thus send a message (for example, someone using an alertness monitor may choose to zone out to send the message "I'm bored"), the system is not designed to allow this sort of communication. This is why the definition used in this thesis includes the phrase "designed to allow a user to voluntarily send messages or commands."


Figure 3-14: Relationship between amplitude in theta and alpha bands and error rate. Top graph, site Cz; bottom graph, site POz.

As illustrated in Fig. 14, as subjects’ local error rate increased, so did their mean EEG power in two regions corresponding roughly to theta and alpha activity, around 4 and 14 Hz. The authors further found that their algorithm was able to estimate the local error rate better than previously published methods. While the BCI-like system discussed in this paper did not operate in real time, it is clear that this algorithm could be adapted to operate continuously. A second, commercially available realtime alertness monitoring system (developed by Advanced Brain Monitoring - www.b-alert.com) allows the user to wear a

cap with two semi-dry electrodes positioned at areas corresponding roughly to parieto-occipital and frontal midline sites. The resulting estimate of an individual's alertness is used to warn if an individual in an attention-critical situation (such as a truck driver, air traffic controller, or nuclear plant technician) becomes dangerously inattentive. While the electrode sites used are similar to those in Jung et al. (1997), the pattern recognition approach is different, though details are proprietary.

Figure 3-15: The sensor headset developed by Advanced Brain Monitoring (ABM). The rectangular apparatus over the back of the head is used to transmit EEG information via radio frequency.

Similar systems designed to monitor workload have been described (e.g., Pope et al., 2001; Gevins 2002). These systems can warn if a user is overworked and prone to

error, and could increase workload if the user is clearly capable of handling more tasks. The information gained from developing alertness and workload monitors may someday be useful in a BCI. For example, monitoring rapid changes in alertness could enhance a BCI requiring selective attention, such as a P300 BCI. Furthermore, any BCI should ideally account for the effects of fatigue and distraction on the EEG, which may require information learned through the development of these BCI-like systems.

3.4.2. Brainwave Fingerprinting

The P300 complex of an auditory or visual event-related potential is different for novel stimuli compared to stimuli a subject has previously seen. This aspect of the P300 is not under a user's voluntary control. Hence, it is possible to present images or other stimuli to subjects to determine whether they are telling the truth about having previously seen them. For example, consider an individual accused of murder who insists he has seen neither the victim nor the murder scene. If the individual is shown images of the victim and murder scene, and his ERP response indicates familiarity with the images, he is potentially lying about his unfamiliarity with the victim and scene. Larry Farwell, one of the authors of Farwell and Donchin (1988), which described the first P300 BCI, has been involved in exploring the use of the P300 and related components for lie detection. Farwell refers to the ERP activity following presentation of a stimulus as the MERMER (memory and encoding related multifaceted electroencephalographic response). The MERMER appears to be based primarily on the P300, though other components are incorporated into this index. According to published reports, the system was able to

discriminate FBI agents from non-agents with 100% accuracy (Farwell and Richardson, 1993). More recently published evidence describes numerous studies with various police departments and government agencies that demonstrate that the system is highly accurate (Farwell and Smith, 2001)17. Claims have been made that the approach is 100% accurate in over 200 tests (see Farwell's website at www.brainwavescience.com), and testimony based on brainwave fingerprinting has been accepted in court in Iowa. Brainwave fingerprinting is based on a solid theoretical foundation and, if used properly, may someday prove a valuable tool in determining an individual's guilt or innocence. However, brainwave fingerprinting remains highly controversial. Farwell's former advisor testified in the Iowa case that the approach is not yet ready for reliable deployment in forensic environments (Farwell 2002; Donchin 2002b). Other widely recognized EEG experts have argued that further development and testing is necessary. Another leading P300 researcher has stated that he is developing a means of training subjects to counteract Farwell's system, suggesting that the system is not reliable (Rosenfeld, 2001). These criticisms need to be addressed before brainwave fingerprinting can be used as a forensic investigative tool. In addition to scientific concerns, many in today's society may not be comfortable with this sort of "mind-reading" approach. Brainwave fingerprinting research has also further explored cognitive and emotive factors that affect the P300 complex, research that may someday be useful in a BCI. Also, like alertness and workload monitors and neurofeedback, it has drawn significant attention to the feasibility of deriving useful information from EEG activity in

17. These were published with the FBI; the second author of both of these studies was an FBI agent.

realtime. These systems have also encouraged the development of practical sensors specialized for non-laboratory conditions, which are crucial for effective BCIs.

3.4.3. Chapin (2002)

The only true “CBI” developed to date, in which commands can be sent directly to the brain, also merits discussion because of its importance in the field of communication between brains and computers. While other systems have affected brain activity in realtime, this is the first system that can send a specific command to a brain. Systems like these may soon become part of the first bi-directional BCI. In a typical operant learning paradigm, a subject learns to associate certain behavior(s) with reward or punishment (Skinner, 1938). A common example is training a rat to move through a maze by providing a food reward. It is believed that the reward is processed through brain areas including the medial forebrain bundle (MFB), an area crucial for learning (Kandel et al., 1992). The MFB is more active when a task is rewarded and less active when a behavior is punished, suggesting that MFB activity helps an organism learn adaptive motivated behavior, i.e., learn whether a behavior was beneficial and should be repeated if identical conditions arise. Thus, it should be possible to train an organism by directly altering MFB function in the absence of any other reward or punishment. In Talwar et al. (2002), rats were implanted with stimulating electrodes in the medial forebrain bundle (MFB) and in areas of the left and right primary somatosensory cortex (S1) that represent the whiskers. A backpack that could be controlled by a laptop computer was also mounted on the rat, allowing an experimenter to initiate a stimulus

pulse train from up to 500 m away. Rats were first trained to run through a simple figure-eight maze. Stimulation of left or right S1 served as a cue to turn left or right. When the rats turned correctly, they were administered MFB stimulation as a reward. The rats learned the maze successfully, and could even perform the same turns at the correct locations if the maze was removed and they were placed in an open environment. The rats were then trained to navigate through a more complex three-dimensional obstacle course, complete with a ladder, stairs, ramp, and narrow ledge. Stimulation of the MFB alone, in the absence of S1 stimulation, served as a signal to move forward without turning. The rats could navigate through the task without error (see Fig. 3-16). They could also be directed to climb or jump from surfaces, run through pipes and elevated areas, move through piles of concrete rubble, and move through open, well-lit areas. Rats tend to prefer dark, enclosed areas, and thus would hesitate at some points in the 3D maze. It was discovered that additional MFB stimulation was effective in encouraging the rats to engage in movements they would normally avoid. As the authors note, providing an MFB reward has advantages over a typical reward. An MFB reward “is relatively non-satiating, and animals need not initiate consummatory behaviors to receive rewards. As virtual cues and rewards are perceived within a body-centered frame of reference, they may facilitate learning independently of the external environment. It may also be possible to increase the ‘bandwidth’ of conditionable information by stimulating multiple brain sites, thereby increasing the variety of reactions that can be elicited.”

Figure 3-16: Examples of guided rat navigation using brain microstimulation. Sketches are constructed from digitized video recordings. Red dots indicate rat head positions at 1-s intervals; green dots indicate positions at which reward stimulations were administered to the medial forebrain bundle (MFB); blue arrows indicate positions at which right (R) and left (L) directional cues were issued; black arrows indicate positions 0.5 s after directional commands.

There may be useful applications of such CBI technology. Rats or other animals could be trained to engage in “search and/or rescue” operations in areas not suitable for humans. Examples include landmine detection, scouting in hostile terrain, exploring areas too small for humans, locating the extent and severity of radiation spills, etc. (Chapin 2002; Talwar et al., 2002).


3.5. Summary of Issues

3.5.1 BCIs and Cognition

What does a BCI user need to think about to operate the BCI? This is an important question for a variety of reasons. Some mental tasks may be easier or more intuitive for different people, and some mental tasks (such as movement imagery, common in mu BCIs) may not be possible for disabled users. Different mental strategies may affect the EEG, and hence finding the optimal strategy may improve performance. Aside from performance issues, a user may prefer a BCI that can be operated using simple tasks, perhaps in a distracting environment, to a BCI requiring all their attention. One striking observation is the lack of any natural mapping between mental activity and the desired outcome. In most BCIs, as with keyboards, the mental tasks a user performs in order to send information have nothing to do with the message itself. In a VEP BCI, for example, a user may spell by directing eye gaze; in a P300 BCI, a user may spell by counting or ignoring flashes. BCIs requiring the imagination of movement allow a more natural mapping between thought and outcome. Authors who work with mu BCIs (e.g., Allison et al. 2000; Wolpaw 2003) have reported that subjects initially imagine movement to move a cursor, but the motor imagery exhibits no clear mapping to the desired outcome. To move a cursor down, a user would imagine any kind of movement (not just downward); to move a cursor up, a user would avoid thinking of any movement. In one study (Pineda et al., 2003), each subject reported using a different type of movement imagery to move the

cursor. The imagery was often consistent with the subject’s background, that is, it was meaningful whole-body movement; for example, one subject from the tennis team would imagine himself playing tennis in order to influence mu activity. No subject reported that imagining upward or downward movement produced corresponding cursor movement. Some implanted systems may also map movement imagery to cursor movement; most of the implanted systems described above allow cursor movement, and do so via electrodes placed in motor cortex. Thus, users trying to move a cursor up may indeed be thinking about upward movement. However, this is difficult to verify, as many of these systems were used in nonhuman animals. One of the patients who uses an implanted system to control a cursor, JR, reported he was thinking of “nothing” while using the system. In addition, explicit mental strategies may be less relevant in experienced users. Many authors have noted that subjects initially claim BCI use is very distracting and requires complete attention to detail, regardless of the mental strategy used. However, as subjects become better trained, BCI use requires less attention and no longer requires the use of explicit mental strategies. Instead, BCI use becomes a background ability, much like riding a bicycle or other types of skill learning. For example, subjects who are first exposed to a mu BCI report that they must devote all of their attention to imagined movement in order to produce low amounts of mu, and must concentrate to avoid any thought of movement to produce high mu power. However, within a few hours of training, they report that movement imagery is no longer necessary (Pineda et al., 2003; Sarnacki, 2002; Wolpaw, 2002). Similarly, subjects using the Thought Translation Device or Anderson’s mental task BCI initially use intense mental imagery, but find that

less concentration is necessary as training progresses (Anderson, 2002; Parker, 2003; Birbaumer, 2003). Further research into the changes in mental activity and attentional demands seen as people use BCIs over extended periods is necessary, especially since the majority of BCI publications only examine the first few hours of a subject’s BCI use, whereas the typical BCI user is a veteran neuronaut. Hence, the important question of which mental strategies are best suited to BCIs requires further research. Different categories of BCIs require different strategies, many BCI users devise their own strategies, and explicit strategies may be less relevant as users become more experienced. While the notion of a literal mapping from thought to action seems appealing, such a BCI may not be the easiest or fastest. What would it take to build a literal BCI, in which a user need only think of a certain message or command to see it enacted? This question is largely a function of the domain of possible messages or commands a user wishes to send18. It may be possible to construct a rudimentary literal BCI today using a mental task BCI – the system might allow the user to activate or deactivate a music player by thinking of music, for example, but such a BCI would be of very little value. A more flexible literal BCI cannot be developed without research into the unique EEG characteristics seen in each individual while thinking of specific messages or commands. Such BCIs will probably need to be customized to each user, as different subjects exhibit different EEG activity.

18. An “unrestricted” literal BCI, in which a user can think of any message or command and have it enacted, is probably impossible and certainly undesirable.

Suppes and his colleagues have recently published several articles relevant to a potential literal BCI. In one study, subjects were shown or read words and asked to comprehend or internally repeat them. Using sixteen electrodes, the authors could identify the word being processed at rates well above chance, even when using only single-trial EEG (Suppes et al. 1997). Similar results were obtained in a later study, in which words and sentences were presented auditorily to two subjects (Suppes et al. 1998). Two subsequent studies showed that the EEG associated with imagination of sentences (Suppes et al. 1999a) and simple visual images (Suppes et al. 1999b) could be categorized well above chance, and that there was surprisingly little variation between subjects’ EEG. A later study showed that data from their previous studies could be represented using only a few sine waves (Suppes and Han, 2000). In all of these studies, categorization was based on the EEG evoked when a single word, sentence, or image was presented. This paradigm would not be useful in a BCI. It is not clear whether the EEG evoked when spontaneously thinking of a word, sentence, or image, or when focusing on one of many such elements, could be categorized with comparable accuracy. Unfortunately, none of these publications from Suppes and colleagues has undergone the traditional peer review process. Given the extraordinary nature of his claims, his work should be reviewed and replicated by other researchers.

3.5.2. BCIs and Language

BCIs offer a new means of sending messages. When new communication technologies are developed, terminology from a related domain is usually applicable. For example, written languages use the term “grapheme” to refer to the smallest amount of

information capable of affecting a message – a single letter. This term worked well when keyboards were invented, since they are simply another means of expressing written language in which the smallest amount of information that can be sent is a single letter. This is not the case with BCIs. In a BCI, a user conveys information by repeatedly engaging in mental activities that produce distinct patterns of EEG discriminable by an artificial pattern recognition system. Hence, the smallest amount of information capable of affecting a message is the smallest amount of mental activity capable of affecting a system’s decision as to what the user means to convey. The term “cogneme” shall refer to the smallest amount of mental activity capable of producing a difference in meaning in a BCI. In an event related BCI, a cogneme is the user’s response to each event. For example, in a P300 BCI, a cogneme refers to the user’s decision to ignore or count each flash. In a spontaneous BCI, a cogneme may instead refer to sustained mental activity for a specific period. Most extracranial BCIs today allow only two cognemes, but this is likely to change as it becomes possible to discriminate more distinct mental states based on the EEG. Just as a grapheme is meaningless outside of written language, a cogneme refers only to information sent via a BCI. An individual who engages in mental activity that might drive a BCI while not using one is not producing cognemes19, just as someone who thinks of writing a letter is not producing graphemes. Furthermore, a cogneme is not necessarily cognitive. As noted above, individuals may send messages through BCIs without any explicit thought.

19. Cognemes have nothing to do with the “language of thought” described by Fodor (1975).

In most BCIs, cognemes are combined to allow users to send single letters. In these systems, it may be appropriate to think of cognemes as sub-graphemic features, consistent with terminology seen in written languages. However, some BCIs cannot combine cognemes to form single letters, and instead only allow the user to send one of several simple messages (e.g., Kaiser et al. 2001). Here the analogy between BCI language and written or spoken language breaks down; while phonemes or graphemes are typically combined to form morphemes20, cognemes may be combined to form letters, digrams, morphemes, longer messages, or even nonlinguistic signals like musical phrases, icons, pictures, or whatever else is available in the BCI’s vocabulary. The term “element” shall refer to the smallest amount of information a user can convey to an outside agent using a BCI. Most current BCIs combine multiple cognemes to form an element. An example of an exception is the training routine for the TTD, in which a user can send a “yes” or “no” by briefly generating either high or low amplitude slow cortical potentials.

3.5.3. BCIs and Information Throughput

It is clear that no single BCI is best for all users in all environments. The decision about which BCI should be used depends not only on the BCI’s capabilities, but also on the needs, abilities, and desires of the user as well as the operating environment, cost, and other factors. However, it is possible to make some comments about specific facets of BCIs.

20. A morpheme is a meaningful linguistic unit consisting of a word, such as “dog,” or a word element, such as “-ed,” that cannot be divided into smaller meaningful parts.

One objective means of comparing the utility of BCIs is by their information throughput. It is possible to determine the number of bits per minute conveyed by a communication system based on the number of options available with each selection and the system’s accuracy:

Figure 3-17: Relationship between number of possible choices (cognemes), accuracy, choices per minute, and bitrate. Figure and legend from Wolpaw et al. (2000)
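For readers who want to reproduce the relationships plotted in this figure, the sketch below implements the standard bits-per-selection formula used by Wolpaw et al. (2000), which assumes N equally probable choices with errors spread evenly over the non-target choices. The function names and example values are illustrative only; the printed examples echo the two-choice 90% vs. 80% and four-choice 65% comparison quoted in the following paragraph.

    from math import log2

    def bits_per_selection(n_choices, accuracy):
        """Bits conveyed by one selection among n_choices options, assuming all
        choices are equally probable and errors are distributed evenly across
        the remaining (non-target) choices."""
        p = accuracy
        if p <= 0.0 or p >= 1.0:
            # Degenerate cases: perfect accuracy conveys log2(N) bits;
            # zero accuracy conveys nothing under this simple model.
            return log2(n_choices) if p >= 1.0 else 0.0
        return (log2(n_choices)
                + p * log2(p)
                + (1 - p) * log2((1 - p) / (n_choices - 1)))

    def bits_per_minute(n_choices, accuracy, selections_per_minute):
        return bits_per_selection(n_choices, accuracy) * selections_per_minute

    # A two-choice BCI at 90% accuracy conveys roughly twice as many bits per
    # selection as one at 80%, and roughly as many as a four-choice BCI at 65%.
    print(round(bits_per_selection(2, 0.90), 3))   # ~0.531
    print(round(bits_per_selection(2, 0.80), 3))   # ~0.278
    print(round(bits_per_selection(4, 0.65), 3))   # ~0.511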

Numerous BCI papers have used this formula, derived from Shannon and Weaver, to express performance in terms of bits per minute. Wolpaw et al. (2000) presented substantive commentary about this formula: “ … The information transfer rate of a BCI that can select between two choices with 90% accuracy is twice that of a BCI that can select between them with 80% accuracy, and equal to that of a BCI that can select between four possible choices with 65% accuracy. The enormous importance of accuracy, illustrated by the doubling in information transfer rate with improvement from 80% to 90% accuracy in a two-choice system, has not usually received appropriate recognition in BCI-related publications. While the effectiveness of each BCI system will depend in considerable part on the application to which it is applied, bit rate furnishes an objective measure for comparing different systems and for measuring improvements within systems.” The authors’ observation that this formula is only one means of gauging performance merits further attention. BCI designers also need to consider the application as well as the interface and user preferences. The formula itself may be somewhat misleading for two reasons. First, the formula assumes that all cognemes may be selected with equal probability; this is not always the case. If some cognemes are more likely than others, this could be used to influence a pattern recognition system. Second, the formula only accounts for hit and miss rates, and not the false alarm rate. As some authors have noted (e.g., Bayliss 2001; Millan 2002), in some BCIs a false alarm may be worse than a miss. For example, some BCIs described by Birbaumer and colleagues (e.g., Perelmouter and Birbaumer 2000; Kaiser et al. 2001) feature a multilevel menu system. The user is given a choice of selecting an option through SCP control, and if no selection is made, the next option is

presented. If no selections are made at that layer (a “miss”), the choices from that layer are presented again. Since there are few choices per layer, this results in only a slight delay. However, if a false alarm occurs, the user may be forced into one or more layers of menus containing unwanted choices, and (since “backspace” is not an option) will ultimately send an erroneous command that must be corrected by navigating through the whole menu system again. The many possible avenues toward improving information throughput in a BCI can be divided into three general categories:

1. More cognemes per minute
2. Fewer cognemes per message
   a. Fewer cognemes per element
      i. More bits per cogneme
      ii. Dynamic vocabulary selection
   b. Fewer elements per message
      i. Larger vocabulary
      ii. Personalized vocabulary
      iii. More informative elements
      iv. Error detection and correction
      v. Predictive grammar
   c. Allow more cognemes
3. Improved recognition of the distinct EEG states associated with each cogneme
   a. Improved preprocessing and pattern recognition
   b. Utilize cognemes with maximally discriminable EEG signatures
   c. Larger electrode montage
   d. Improved EEG sensors

The first category is the most straightforward. In the case of an evoked BCI, this means that more events must be presented each minute. A spontaneous BCI must be capable of identifying cognemes based on a smaller time window. Most avenues toward a faster BCI present tradeoffs with other means of improving throughput and/or create other drawbacks; presenting more cognemes per minute does both. When subjects generate more cognemes per minute, the BCI must be able to categorize them based on less data; otherwise, categorization errors will increase. In addition, faster event presentation may result in less robust EEGs during the shorter time window. In a P300 BCI, for example, the amplitude of the P300 is inversely proportional to the number of targets presented per second. It may appear that the ideal relationship between speed and accuracy can be easily determined using the formula above. However, the formula only shows how to maximize information throughput, not user comfort. A user may prefer a more accurate BCI to one that is faster because the latter allows more errors. Subjects are only capable of generating a finite number of cognemes per minute for any given BCI. Thus, subjects may find faster BCIs impossible or prohibitively difficult to use, especially over an extended period. The first study of this dissertation explores the advantages and

drawbacks of changing the rate of event presentation (and thus cogneme generation and ERP measures) in a P300 BCI. The second category is the broadest, and presents many unexplored possibilities for improvement. The first study of this dissertation explores an approach to reduce the number of cognemes required to select an element in a P300 BCI. This is done by increasing the number of bits conveyed with each flash. Doing so presents two possible drawbacks – less distinct EEGs and a more difficult task for the user. Another approach is to dynamically reduce the vocabulary based on previous selections. If a user spells “THR,” this does not provide enough information to complete the message, but it is possible to present the user with only the five vowels and backspace. If a BCI usually allows a user to select one of 64 elements, but requires him to choose only one of six in this example, the selection can be made with fewer cognemes. In a conventional P300 BCI, this could be done by presenting the user with a new vocabulary (a screen containing only six options) or by keeping the same vocabulary but devising a dynamic flash approach that only illuminates letters that could result in a meaningful word. The former option is not feasible, as it requires users to become familiar with a new display far more often than would be comfortable. The only drawback of dynamic vocabulary selection is that the frequent appearance of new screens or flash patterns may be disorienting. An excellent example of an advanced word completion system is the Dasher system. While the Dasher system is designed around eye tracking systems, its word completion routines could be useful in a BCI. A recent study showed that subjects using

the Dasher system could send more characters per minute with fewer errors than with a conventional on-screen keyboard (Ward et al., 2002). There are many ways to design a BCI to allow a user to send a message using fewer elements, but very few have been studied. One option is to allow a larger vocabulary, so that a user is more likely to find the element s/he wishes to convey. If the BCI’s vocabulary does not contain the element the user wishes to send, s/he may need to circumlocute or otherwise send more elements to convey the message. A larger vocabulary may be implemented in two ways. Most BCIs do not contain menus, and the user is stuck with one palette of elements. In such BCIs, a larger vocabulary can be attained only by increasing the size of each palette. The second study of this dissertation explores this possibility by comparing matrices of 16, 64, or 144 elements. The tradeoff of a larger vocabulary is that more cognemes are required to identify each element. Very large vocabularies may also increase training time as the user becomes familiar with the locations of hundreds or thousands of elements. In a menu selection BCI, in which the user may choose the palette of elements most appropriate to his message, a larger vocabulary could be implemented by allowing more palettes as well as larger palettes. For example, if an 8 x 8 matrix of options includes 54 letters and 10 selections that each open a new 8 x 8 matrix, the BCI has a vocabulary of 694 elements. However, if the same “starting screen” allows 44 letters and 20 menu navigation options, the BCI has a vocabulary of 1324 elements. The vocabulary could also be enlarged by allowing a menu structure with more than two layers. The drawback of menuing is that it is often necessary to first choose the desired menu and then choose the desired element, which may be slower than simply locating the desired

element without menu navigation. Menuing could be very helpful in BCIs, and it has not been well explored21. Most BCIs published to date present the same vocabulary to all users. While this is appropriate for laboratory testing, BCIs designed to be used as someone’s sole means of communication typically allow for a customized vocabulary. A vocabulary may be customized by the user or another person, or BCIs could include software to do so. There is no drawback to this approach, as long as the customization is done well. One mechanism capable of producing dramatic increases in BCI speed is the use of more informative elements. BCIs typically use single letters as elements. Thus, any message must be painstakingly spelled out. BCIs that allow elements such as digrams, prefixes or suffixes, common words, names, phrases, sentences, or pictures could greatly reduce the time needed to send a message. However, this approach requires a large vocabulary (probably including a menu system), may disorient users, and may increase the time needed to send some messages, as substantially more searching is required. Error correction mechanisms have been preliminarily explored. If someone sends a message or command that is incongruous with previous signals, such as an incorrect word, a grammatical error, or an impossible or meaningless command (such as requesting a nurse when one is already in the room), a BCI could automatically ignore it and perhaps suggest an alternative.
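To make the dynamic vocabulary and error screening ideas above concrete, the sketch below filters the letters offered after a partial selection so that only continuations consistent with a word list remain, and flags selections that produce an impossible prefix. The word list, function names, and the “THR” example are hypothetical stand-ins for a real lexicon or language model, not part of any system described in this dissertation.

    def allowed_next_letters(prefix, word_list):
        """Return the letters that can legally extend `prefix` into at least
        one word from `word_list` (a stand-in for a real dictionary)."""
        prefix = prefix.upper()
        letters = set()
        for word in word_list:
            word = word.upper()
            if word.startswith(prefix) and len(word) > len(prefix):
                letters.add(word[len(prefix)])
        return sorted(letters)

    def selection_is_plausible(prefix, letter, word_list):
        """A selection that yields a prefix matching no word is a candidate
        for automatic rejection or for prompting an alternative."""
        return letter.upper() in allowed_next_letters(prefix, word_list)

    # Hypothetical mini-dictionary; a deployed system would use a full lexicon.
    words = ["THREE", "THROUGH", "THRILL", "THROB", "THRUST", "THE", "THIS"]

    # After the user has spelled "THR", only vowels remain plausible, so the
    # BCI could offer a reduced palette (plus backspace).
    print(allowed_next_letters("THR", words))        # ['E', 'I', 'O', 'U']
    print(selection_is_plausible("THR", "E", words)) # True
    print(selection_is_plausible("THR", "K", words)) # False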

It is also possible to isolate brain activity associated with the perception of error, such as the error-related negativity (ERN). There may also be unique EEG activity associated with the perception of error in a mu BCI (e.g., Schalk et al. 2000).

21. Menuing was not explored in this thesis because it necessitates a system capable of identifying a target element in realtime, which was beyond the scope of this thesis.

Error correction reduces or eliminates the need to issue corrections manually, thereby reducing the number of elements per message. This is another approach with no apparent drawback. Predictive grammar, like dynamic vocabulary selection, has been proposed but not studied in BCIs. If a BCI can accurately guess the word, phrase, or sentence a user intends to send based on his initial selections, it could automatically complete it. There is no drawback to this approach either. A final means of reducing the number of cognemes needed per message is to allow more cognemes. Even a small increase in the number of cognemes could enable a substantial improvement in overall speed. The number of elements available equals the number of different cognemes raised to the power of the number of cognemes needed for each letter. Hence, a BCI allowing two different cognemes must obtain four of these binary cognemes to specify one of sixteen elements. If a user can choose between four cognemes, only two of these quaternary cognemes are needed to specify one of sixteen elements. Such a BCI may be more difficult to use than a BCI allowing two cognemes, but this has not been well explored. The much more serious obstacle to allowing more cognemes is that this requires a pattern recognition system capable of distinguishing more cognemes. If a BCI cannot discriminate among four cognemes with the same accuracy as one using binary cognemes, it may or may not be superior; the comparison can be made using the formula in the legend of Fig. 3-17 from Wolpaw et al. (2000). This issue has been explored in mu BCIs by Wolpaw and his colleagues, who first designed a BCI allowing two cognemes reflecting vertical cursor movement (Wolpaw et al. 1991) and later described a system allowing four cognemes capable of horizontal and

vertical movement. Though both BCIs worked, subjects found it more difficult to generate four unique cognemes. Wolpaw found that overall throughput was higher when only two cognemes were available, and thus all subsequent versions of the Wadsworth BCI have allowed only two cognemes (Wolpaw, 2002). The third category includes the “holy grail” of BCI research: improved categorization of cognemes. Substantial BCI research has been aimed at improving throughput via improved preprocessing and pattern classification. This approach has no drawback and is transparent to the user22. Despite the significant work already done, there remains no clear consensus on the best approach toward EEG categorization for any category of BCIs, and it is highly likely that ongoing research will continue to yield performance improvements. The third study of this dissertation explores a preprocessing approach, independent component analysis (ICA), to determine whether it could be helpful as a means of preprocessing P300 data in a BCI. A similar approach is to ensure that the cognemes used by a subject produce maximally discriminable EEG signatures. This issue has been best explored in papers describing “mental task” BCIs; Pfurtscheller and his colleagues showed that no pair of the five mental tasks they tested could be discriminated more accurately than any other pair (Obermaier et al., 2001). However, different categories of BCIs or different mental tasks may not share this characteristic.

22. It is possible that extremely complex means of categorizing cognemes could require so much computation as to produce a noticeable delay. This is not likely with conventional PCs, but BCIs with less powerful processors (such as very small ones designed to be highly portable) may not be capable of rapidly processing demanding algorithms. In this instance, simpler categorization may be advisable.
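The preprocessing approach mentioned above, independent component analysis, can be sketched in a few lines; the example below uses the FastICA implementation from scikit-learn on simulated data, and the channel count, sampling rate, and the rule for choosing which components to discard are all assumptions for illustration rather than the procedure used in the third study.

    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)

    # Placeholder "recording": 60 s of 15-channel data at 256 Hz, with samples
    # as rows and channels as columns, the orientation FastICA expects.
    n_channels, fs, seconds = 15, 256, 60
    eeg = rng.uniform(-50.0, 50.0, size=(fs * seconds, n_channels))

    # Decompose the recording into statistically independent components.
    ica = FastICA(n_components=n_channels, random_state=0, max_iter=500)
    sources = ica.fit_transform(eeg)        # shape: (n_samples, n_components)

    # In practice, components dominated by blinks or other artifacts would be
    # identified (e.g., by correlation with the EOG channel) and zeroed out.
    artifact_components = [0]               # placeholder choice, not a real criterion
    sources_clean = sources.copy()
    sources_clean[:, artifact_components] = 0.0

    # Project the remaining components back into channel space for averaging.
    eeg_clean = ica.inverse_transform(sources_clean)
    print(eeg_clean.shape)                  # (15360, 15)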

Acquiring more information about the user’s EEG by using more electrodes may enable better performance, but at the expense of increased cost, preparation time, and inconvenience. In the case of implanted systems, a larger electrode montage may increase the risk of infection. Similarly, improved electrodes capable of acquiring cleaner data contribute to EEGs that are more distinct. As electrode technologies improve, it is likely that the signal-to-noise ratio will improve without requiring increased preparation time. The best way to maximize the signal-to-noise ratio today is to prep the subject properly; otherwise, electrode impedances may be unnecessarily high. Active electrodes, which do some signal processing at the electrode, show considerable promise in providing cleaner signals. The tradeoff of using less noisy electrodes is that they may entail more cost, preparation time, or discomfort for the user. Dry electrodes, which do not require gel, are more comfortable for the user but provide a noisier signal. Improved electrodes should enable improved EEG BCIs, and ongoing advancements in other means of measuring brain function, such as MEG or fMRI, would eventually allow for BCIs utilizing additional means of recognizing the distinct mental signature associated with a cogneme.

3.5.4. BCIs and Other Factors

While information throughput is an important consideration, many other factors are also relevant when considering the best BCI for a particular user for a given task. Many BCI studies have utilized a healthy subject population using a BCI for a limited time in a laboratory setting with little or no attention to subjective report. A typical BCI user today is severely disabled, dependent on the BCI for long periods, and probably at

least as concerned with subjective factors such as comfort, fatigue, and ease of use as with the objective measure of information throughput. A variety of factors may prevent certain types of BCIs from being feasible for specific patients. Implanted BCIs are only viable for individuals with deficits severe enough to justify surgery. BCIs that present very fast flashes may not be feasible for epileptics. BCIs requiring motor imagery may not work with individuals paralyzed from birth, who may not be capable of motor imagery. Deficits in attention, language, and other abilities may also constrain the domain of available BCIs. Further research into the effects of long-term use is essential. Similarly, there has been no research yet on the side effects of chronic BCI use. If a user is creating different cognemes in the process of learning and using a BCI, this may well have side effects. Might individuals who use a mu BCI be better, or worse, at certain motor tasks?23 Would chronic use of a P300 BCI make people more or less distractible? The ideal BCI for a given user must function well after years of use without negative side effects. Another major consideration is the effort required to operate the BCI. The subject who reported thinking of “nothing” while using the Kennedy BCI nonetheless found the experience exhausting, and was not capable of sustained BCI use. Subjects using an SCP BCI have also described it as exhausting. However, this phenomenon has not been reported by users of mu, P300, or VEP BCIs. This important issue has not been explored. A BCI that can be used for extended periods without fatiguing the user is usually preferable to an exhausting BCI.

23. The aforementioned subject in the mu experiment who played tennis claimed that he played tennis better after a session of mu training.

To date, BCI researchers have generally assumed that the user is attending primarily or exclusively to the BCI. This may not always be the case. People may wish to use a BCI while watching television, talking with friends, writing theses, or using other types of interfaces, perhaps including other BCIs. No one has explored the extent to which BCI use limits the ability to attend to other tasks. Some groups are beginning to explore the use of “combined” BCIs utilizing more than one type of signal. BCIs may also be compared according to the training time required to attain acceptable performance. On this axis, evoked BCIs are superior, as they typically require no training at all. Spontaneous systems such as a mu BCI require hours of training, while an SCP BCI requires months of training. A closely related issue is that of the equipment and environment necessary for a BCI, which places evoked BCIs at a disadvantage. By definition, they require a user to respond to a specific external event. Hence, they require the user to utilize a means of generating these stimuli and to pace himself according to it. In theory, a spontaneous system could be used without any display or stimuli, though spontaneous BCIs usually present the user with feedback. A myriad of other factors are also important in evaluating BCIs. These include cost, risk (in considering implanted systems), the ease of modifying the BCI around the user’s needs, short- and long-term effects of BCI use, noise in different environments, computing power required, the extent to which trained personnel are needed to prepare someone for BCI use and, of course, ease of use and user preferences.

3.6. Future Directions

How can BCIs best be improved? Many possibilities are apparent from the previous discussion; for example, improving the speed-accuracy tradeoff, reducing training time, or improving ease of use. There also needs to be more attention to inter- and intra-subject differences in BCI training and use, as well as to the effects of long-term use. “Combined” BCIs that allow the user to send multiple control signals simultaneously are in their infancy, but are being actively pursued by many groups and will greatly advance the field. The prospect of improving training with TMS or other means such as novel instructions to subjects, drugs, different tasks or environments, and training parameters customized to each user is also justifiably receiving significant attention. Fortunately, the pace of BCI research is increasing. The ongoing improvements to BCIs, and their validation in a variety of environments with both healthy and disabled subjects, are drawing more researchers and, more importantly, more funds. The BCI conferences held in 1999 and 2002 facilitated numerous new directions and inter-laboratory and interdisciplinary collaborations. Ongoing developments in a myriad of relevant fields such as mathematics, electronics, and cognitive neuroscience also enable better BCIs. Another factor likely to accelerate BCI research is the availability of a universal platform called BCI 2000. This system, developed at the Wadsworth Center, allows BCI designers to customize all five components of a BCI. This makes it much easier to design a new BCI than ever before, and greatly facilitates comparisons of different components and collaborations between research groups. It also encourages the development of “combined” BCIs. BCI 2000, and its availability to academic researchers, constitutes one of the most important developments in the field of BCI research to date.

BCI research is also affected by public perception. If BCIs are viewed in a positive light, more people and funds will be available for public and private BCI development. If not, people will be less inclined to design, test, purchase, and use BCIs. Given the widespread portrayal in science fiction of BCIs as much more sophisticated devices than are currently possible (e.g., Star Trek, Firefox, Neuromancer, Strange Days, and The Matrix), some lay people may be disappointed by the relatively poor capabilities of modern BCIs. Worse, many are fearful of the possibility of involuntary mind-reading, especially those who lump BCIs together with “brainwave fingerprinting” or other seemingly Orwellian technologies. This is exacerbated by some examples of sloppy journalistic reporting. For these reasons, it is crucial that members of the BCI community behave responsibly in dealing with the media and prioritize publication in peer-reviewed journals over other avenues for presenting research. It is appropriate to close with a comment about the importance of interdisciplinary cooperation in BCI development: “Future progress hinges on attention to a number of crucial factors. These include: recognition that BCI development is an interdisciplinary problem, involving neurobiology, psychology, engineering, mathematics, computer science, and clinical rehabilitation …” (Wolpaw et al. 2002). Linguistics, communications, and HCI are also relevant. This combination of disciplines is quintessential cognitive science, and thus the burden must fall upon cognitive scientists to take the lead in addressing such issues. This dissertation is one step in that effort.

CHAPTER 4: EFFECTS OF SOA AND FLASH PATTERN MANIPULATIONS ON ERPs, PERFORMANCE, AND PREFERENCE AND IMPLICATIONS FOR A BRAIN COMPUTER INTERFACE (BCI) SYSTEM

4.1: Introduction

A brain computer interface (BCI) is a realtime communication system designed to allow users to voluntarily send messages or commands to an external device without sending them through the brain’s normal output pathways. While BCIs currently offer much slower information throughput than conventional interfaces, like keyboards or mice, they may be the only means of communication for severely disabled individuals unable to use interfaces requiring motor activity. Because BCIs are very slow, most BCI research to date is aimed at improving information throughput. There are two classes of BCIs described in the literature. Asynchronous or non-cue-based BCIs allow a user to send information independent of any external event (see Birbaumer et al. 2000). Synchronous or cue-based BCIs allow the user to send messages or commands by producing one of several different mental responses to discrete events. P300 BCIs are a category of synchronous BCIs in which the user conveys interest in the target by choosing to attend to the target while ignoring other stimulus events. Since attended stimuli yield larger P300s than ignored stimuli, P300 BCIs can determine which


of several recently presented events was the target by determining which one produced the largest P300 and communicating that stimulus to some output device. In both types of BCIs, a user communicates by repeatedly engaging in specific mental activities, each creating a distinct EEG signature. These mental activities form the building blocks of a BCI language. Analogous to the word “phoneme,” which describes the smallest meaningful sounds of a language, the term “cogneme” shall refer to the smallest amount of mental activity capable of producing a difference in meaning in a BCI. In a synchronous BCI, a cogneme is the user’s response to each event. For example, a P300 BCI utilizes two cognemes: “/attending to the flash/” and “/ignoring the flash/.” Four P300 BCI systems have been described in the published literature. In the first, users viewed a 6 x 6 grid containing English letters and other characters. Single rows or columns were sequentially flashed, and users were asked to count any flashes containing the target character while ignoring other row or column flashes (Farwell and Donchin, 1988). In Polikoff et al. (1995), users saw four letters (N, E, S, W), each of which represented a compass direction. Each letter was flashed in sequence, and users counted flashes of a target letter. Donchin and colleagues later used the same display as the original “Donchin BCI” and explored the use of a discrete wavelet transform as a preprocessing approach (Donchin et al., 2000). Bayliss (2001) described a P300 BCI in which the user’s vocabulary contained icons instead of letters. That study also utilized improved preprocessing and pattern recognition approaches, resulting in a substantial performance improvement over simpler approaches.
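A minimal sketch of the selection logic shared by these matrix spellers is given below: epochs time-locked to each flash are averaged by the row or column that flashed, and the row and column whose averaged waveform is largest within a P300 window are taken as the target. The array shapes, sampling rate, channel choice, and scoring window are illustrative assumptions, not the parameters of any of the systems cited above.

    import numpy as np

    def pick_target(epochs, flashed_unit, fs=256, window=(0.300, 0.500),
                    n_rows=6, n_cols=6):
        """Identify the target cell in a row/column P300 speller.

        epochs       : array (n_flashes, n_samples), one EEG channel (e.g., Pz)
                       per flash, time-locked to flash onset
        flashed_unit : array (n_flashes,), values 0..n_rows-1 for row flashes
                       and n_rows..n_rows+n_cols-1 for column flashes
        """
        start, stop = int(window[0] * fs), int(window[1] * fs)
        scores = np.zeros(n_rows + n_cols)
        for unit in range(n_rows + n_cols):
            unit_epochs = epochs[flashed_unit == unit]
            if len(unit_epochs) == 0:
                continue
            erp = unit_epochs.mean(axis=0)          # average over repetitions
            scores[unit] = erp[start:stop].max()    # peak amplitude in the P300 window
        row = int(np.argmax(scores[:n_rows]))
        col = int(np.argmax(scores[n_rows:]))
        return row, col

    # Toy usage with random data; real epochs would come from the EEG system.
    rng = np.random.default_rng(1)
    epochs = rng.standard_normal((120, 256))        # 120 flashes, 1 s at 256 Hz
    flashed_unit = rng.integers(0, 12, size=120)
    print(pick_target(epochs, flashed_unit))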

Indeed, any BCI is heavily dependent on the pattern recognition approach, as well as the quantity, quality, and informativeness of the EEG data it receives. There are three general avenues toward improving information throughput in a synchronous BCI:

1) Present stimuli more quickly, thus enabling the user to generate more cognemes per minute,
2) Require fewer cognemes to convey a message, as could be done by obtaining more information from each cogneme, and
3) Categorize cognemes more quickly and accurately, as might be done by creating more robust differences between the EEGs associated with each cogneme or by developing improved pattern classification algorithms.

The present study addresses each of these three avenues toward improving BCI information throughput using a system similar to the “Donchin P300 BCI.” The study explores the relationship between two independent variables, stimulus onset asynchrony (SOA) and flash pattern, and three dependent variables: EEG measures, performance, and subjective report.

P300 amplitude is inversely related to flash speed and target probability, both of which influence the target-to-target interval (TTI). Stimuli presented more quickly, whether due to a faster SOA or a higher target probability, may generate more cognemes per minute. However, they also produce a reduction in P300 amplitude, which results in less robust EEG differences between conditions. Hence, the SOA manipulation reflects a tradeoff between the first (stimulus speed) and third (signal robustness) avenues.

“Flash patterns” refer to the style and number of stimuli flashed at any one time. In Donchin’s BCI, only one row or column is flashed at any one time, called the “single flash” approach (see Figs. 4-1 to 4-3). This approach means that there is always a low probability of the target character flashing; thus, the TTI is long and P300 amplitude is large. However, it also means that a large number of flashes are needed to identify the target character. In an 8 x 8 grid containing 64 stimuli, sixteen flashes would be required to identify the target stimulus; eight flashes are needed to identify the target row, and eight more are needed to identify the target column. The present study compared the “single flash” approach to a novel “multiple flashes” approach, in which half of the stimuli on the screen are flashed at any one time (see Figs. 4-4 and 4-5). Since a P300 BCI allows one of two cognemes, the maximum amount of information available from each flash is one bit. Hence, identifying which of eight rows is the target could be possible using only three binary events (Attneave, 1959). Similarly, it should also be possible to recognize the target column using only three flashes. Thus, using the multiple flashes approach introduced in this study, only six events are required to identify the target character.
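To make the counting argument concrete, the sketch below shows one way such flash patterns could be constructed: each of three binary flash events illuminates the rows whose index has a 1 in a given bit position, so the user’s three attend/ignore responses spell out the binary code of the target row (and likewise for columns). This illustrates the information-theoretic point only; it is not necessarily the exact set of flash patterns used in this study.

    def flash_groups(n_rows=8):
        """For each bit position, the set of rows illuminated by that flash."""
        n_bits = (n_rows - 1).bit_length()           # 3 bits for 8 rows
        return [{r for r in range(n_rows) if (r >> bit) & 1}
                for bit in range(n_bits)]

    def decode_row(responses):
        """Recover the row index from the attend(1)/ignore(0) response to
        each of the binary flashes."""
        return sum(bit_value << bit for bit, bit_value in enumerate(responses))

    groups = flash_groups(8)
    target_row = 5                                   # binary 101
    responses = [1 if target_row in g else 0 for g in groups]
    print(responses)              # [1, 0, 1]
    print(decode_row(responses))  # 5

    # Three row flashes plus three column flashes therefore identify one of
    # 64 cells, versus sixteen single row/column flashes per selection.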

Figure 4-1: The 8 x 8 grid used in this study.


Figure 4-2: Three of the eight row flashes used in the “single flash” condition

Figure 4-3: Three of the eight column flashes used in the “single flash” condition

Figure 4-4: The three row flashes used in the “multiple flash” condition.


Figure 4-5: The three column flashes used in the “multiple flash” condition

P300 amplitude is inversely proportional to target probability, and the single flash approach ensures that target probability will always be low. Hence, flash patterns present a tradeoff between the second (number of cognemes) and third (signal robustness) avenues for improvement. Since the multiple flashes approach requires only six events to identify which of 64 grid elements is the target, as opposed to sixteen events for the single flash approach, fewer cognemes are necessary to send a message. However, the multiple flashes approach also increases the mean target probability, thus reducing TTI and hence P300 amplitude. While the primary objective of this study was to explore the relationship between SOA, flash patterns, and ERP measures, two other important variables were also examined. Because subjects were asked to count targets, it was possible to compare performance across different conditions, providing an objective measure of task performance. Counting accuracy may also be relevant to ERP measures. If subjects are not counting accurately, it may imply they did not see each individual target, and hence would not generate robust ERPs. Subjects were also given brief questionnaires before

and after the study to explore the potential role of background and lifestyle factors, as well as their subjective preferences for different conditions. Subjective reports are an important issue in BCIs, especially those meant for the severely disabled, who may need to use their BCI for several hours a day.

4.2: Methods

4.2.1 Subjects

Subjects were 13 undergraduate students at UC San Diego (6 female, age range 18-20 years, mean = 18.9, SD = 0.7).

All subjects were free of neurological or psychiatric disorders and were rested and alert. Subjects signed an informed consent form approved by the University’s Institutional Review Board and were awarded course credit for participation in the study. Subjects completed a brief questionnaire before EEG preparation and after the study (see below).

4.2.2 EEG Recording

EEG activity was recorded from the F3, Fz, F4, C3, Cz, C4, P3, Pz, P4, T3, T4, T5, T6, O1, and O2 sites of the International 10-20 System using Ag/AgCl electrodes prepositioned in a standard recording cap. Active electrode sites were referenced to linked mastoids with a forehead ground. The filter bandpass was 0.1-100 Hz. Eye activity was recorded by one electrode placed over the right orbit, filtered at 0.3-100 Hz. All impedances were kept below 5 kΩ, except the eye and forehead sites, which were below

10 kΩ. EEG data were sampled at 256 Hz and analyzed offline using the ADAPT system and SCAN 4.2.
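For readers who wish to approximate comparable filtering offline, the sketch below applies a zero-phase 0.1-100 Hz band-pass to a simulated recording sampled at 256 Hz using SciPy. The original recordings were band-pass filtered at acquisition, so this is only a rough software analogue, and the filter order and simulated data are arbitrary choices.

    import numpy as np
    from scipy.signal import butter, sosfiltfilt

    fs = 256.0                      # sampling rate used in this study (Hz)
    low, high = 0.1, 100.0          # band edges matching the recording bandpass

    # Fourth-order Butterworth band-pass, applied forward and backward
    # (zero phase) in second-order sections for numerical stability.
    sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")

    # Simulated 10 s, 15-channel recording (channels x samples).
    rng = np.random.default_rng(0)
    eeg = rng.standard_normal((15, int(10 * fs)))

    filtered = sosfiltfilt(sos, eeg, axis=-1)
    print(filtered.shape)           # (15, 2560)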

4.2.3 Experimental Paradigm

Following EEG prep, subjects were seated in a comfortable chair in an acoustic isolation chamber. They viewed a monitor containing an 8 x 8 matrix of green characters against a black background. The matrix occupied the central 9 cm of a monitor placed 95 cm from the subject, subtending the central 2.7 degrees of user-centered space. Elements in the matrix consisted of uppercase and lowercase English letters and twelve common symbols (see Figure 4-1). Before a trial began, subjects were visually cued as to which element was the target for that trial. Once the subject was ready, pseudorandomly selected rows or columns were flashed sequentially, with the constraint that the same row or column flash never occurred twice in sequence. A flash consisted of a brief color change from green to yellow lasting 100 ms. Subjects counted the number of times the target flashed while ignoring other flashes. Each trial consisted of about 240 total flashes. The total number of flashes was varied within 10% of 240 to ensure that the correct target count varied with each trial. Subjects participated in six blocks, each of which contained six trials. Each trial utilized a different target. Each of the six blocks utilized a different combination of SOA and flash pattern, and the order in which blocks were presented was determined randomly. Subjects were given a brief break after the third trial of each block and after each block. Two independent variables were manipulated across the blocks: mean SOA

(125, 250, and 500 ms) and flash pattern (single and multiple). The SOA between flashes varied randomly within 10% of the mean. In the single flash condition, target probability was always the same: one in eight, or 12.5%. The multiple flashes condition allowed for seven different target probabilities: 0% (none of the six flashes illuminated the target; this applied only to the “=” character on the bottom right); 17%; 33%; 50%; 67%; 83%; and 100% (all of the six flashes illuminated the target; this applied only to the “A” character on the top left). The six target letters in any one trial block were chosen a priori to ensure a good distribution of different target probabilities. The 0% probable and 100% probable stimuli were designated as targets less often than the other stimuli, as there was only one of each such stimulus.
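A sketch of how such a stimulus schedule might be generated is shown below: it draws a pseudorandom sequence of flashes with no immediate repeats, jitters each SOA within 10% of the block mean, and varies the total flash count within 10% of 240, as described above. The actual randomization routine used in the experiment is not specified here, so this should be read as an assumed reconstruction rather than the real stimulus code.

    import random

    def build_flash_schedule(mean_soa_ms, n_units=16, mean_flashes=240, seed=None):
        """Return a list of (flash_index, soa_ms) pairs for one trial.

        Here flash_index 0-7 denotes row flashes and 8-15 column flashes
        (single flash condition); the same scheme could index the six
        multiple-flash patterns instead.
        """
        rng = random.Random(seed)

        # Vary the total flash count within 10% of the mean so the correct
        # target count differs from trial to trial.
        n_flashes = rng.randint(int(mean_flashes * 0.9), int(mean_flashes * 1.1))

        schedule, previous = [], None
        for _ in range(n_flashes):
            # The same row or column flash never occurs twice in succession.
            unit = rng.randrange(n_units)
            while unit == previous:
                unit = rng.randrange(n_units)
            previous = unit

            # Jitter each SOA within 10% of the block's mean SOA.
            soa = rng.uniform(mean_soa_ms * 0.9, mean_soa_ms * 1.1)
            schedule.append((unit, soa))
        return schedule

    trial = build_flash_schedule(mean_soa_ms=250, seed=42)
    print(len(trial), trial[:3])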

4.2.4 Data Analysis

Data were sorted and averaged using the ADAPT software package. Each ERP consisted of the period from 100 ms before to 900 ms after the onset of each flash. ERPs were created in response to each flash, and any trials in which the voltage on any channel exceeded ±50 µV were rejected from further analysis. Approximately 15% of trials were thus rejected. Attended and ignored trials for each flash pattern were then averaged separately. Because the P300 and other ERP measures are affected by target probability, the ERPs evoked by targets of different probabilities in the multiple flashes condition were grouped in separate bins for later analysis. That is, ERPs evoked by 17% probable targets were analyzed separately from ERPs evoked by 33% probable targets, and so on. As a result, the single flash bin contained six times more trials than the bins for each multiple

flashes probability. To compensate for this, only one sixth of the single trial ERPs in the single flash condition was selected for statistical analysis and for display purposes. Figures 4-8 to 4-16 below show grand averaged responses for attended and ignored flashes across different bins. Data were scored using SCAN 4.2. The N1 was scored as the most negative peak from 140 to 200 ms after stimulus onset; the P2 was the most positive peak from 170 to 260 ms; the N2 was the most negative peak from 230 to 320 ms; and the P300 was the most positive peak from 300 to 500 ms. The scored data were then analyzed with SPSS 9.0 using three ANOVAs with the Bonferroni correction for degrees of freedom. The single flash condition was compared with the multiple 17%, multiple 33%, and multiple 50% conditions, respectively. Each analysis consisted of four factors: condition (single and multiple), SOA (125, 250, and 500 ms), attention (attend or ignore), and electrode (Fz, Cz, and Pz). Counting accuracy was examined using a three-factor ANOVA with Bonferroni correction. Factors consisted of trial position (whether the trial was the first, second, third, fourth, fifth, or sixth letter of the trial block), SOA (125, 250, or 500 ms), and flash type (single and multiple).
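A minimal sketch of this peak-scoring step is given below: it takes an averaged waveform spanning -100 to +900 ms around flash onset and returns the most negative or most positive value within each latency window listed above. The sampling rate matches the recording, but the variable names, example data, and the absence of any baseline correction are simplifying assumptions; the actual scoring was performed in SCAN 4.2.

    import numpy as np

    FS = 256                 # sampling rate (Hz)
    PRESTIM_MS = 100         # epoch starts 100 ms before flash onset

    # Scoring windows in ms after stimulus onset: (polarity, start, stop).
    WINDOWS = {
        "N1":   ("neg", 140, 200),
        "P2":   ("pos", 170, 260),
        "N2":   ("neg", 230, 320),
        "P300": ("pos", 300, 500),
    }

    def score_peaks(erp):
        """Score peak amplitudes from an averaged waveform spanning
        -100 to +900 ms around flash onset."""
        scores = {}
        for name, (polarity, start_ms, stop_ms) in WINDOWS.items():
            start = int((start_ms + PRESTIM_MS) * FS / 1000)
            stop = int((stop_ms + PRESTIM_MS) * FS / 1000)
            segment = erp[start:stop]
            scores[name] = segment.min() if polarity == "neg" else segment.max()
        return scores

    # Toy example: an averaged ERP of 256 samples covering -100 to +900 ms.
    rng = np.random.default_rng(3)
    erp = rng.standard_normal(int(FS * 1.0))
    print(score_peaks(erp))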

4.3 Results and Discussion

4.3.1 Behavioral Results: Counting Accuracy

As shown in Fig. 4-6, counting accuracy, in terms of the mean error rate, showed a statistically significant main effect of trial position (first letter = 5.6 ± 2.3%; second letter = 5.4 ± 1.1%; third letter = 8.2 ± 1.4%; fourth letter = 6.1 ± 2.0%; fifth letter = 7.9 ± 2.3%; sixth letter = 9.8 ± 2.6%; p = .018). Errors generally increased throughout the trial block; the improved performance on the fourth letter most likely occurred because subjects received a break after the third letter. This suggests that the increasing error rate is due to mild fatigue rather than exhaustion, as performance declined only within trial blocks and not throughout the study. It also suggests that subjects would benefit from frequent breaks, although BCI users with more experience and motivation than the subjects in this study would likely perform better over extended periods. Counting accuracy also declined with faster presentation speeds: the smallest error rate occurred at the slowest speed (500 ms SOA: 3.7 ± 1.4%), an intermediate error rate at 250 ms SOA (6.2 ± 2.2%), and the largest error rate at the fastest speed (125 ms SOA: 11.5 ± 2.8%; p = .037).


Figure 4-6: Counting accuracy (mean error rate, %) as a function of letter position within the trial block. Note that accuracy is worst just before a break.

Counting accuracy was better in the single flash condition (mean error rate = 4.3 ± 0.02%) than in the multiple flashes condition (10 ± 0.2%; p = .001). As illustrated in Fig. 4-7, the flash type x SOA interaction also reached statistical significance (p = .020; means and std. errors in Table 4-1 below). This interaction highlights two phenomena not apparent in the main effects. First, counting accuracy was excellent at the slowest SOA in both flash conditions. Second, faster flashes resulted in more serious penalties in the multiple flashes condition than in the single flash condition. While subjects could count well at all speeds in the single flash condition, they had more difficulty with faster speeds in the multiple flashes condition. As noted above, poor counting accuracy may occur because subjects did not see each individual flash. In this case, P300 amplitude would be reduced, as it occurs only in

response to detected targets. However, poor accuracy may also occur if subjects detect target flashes but lose count; in this case, P300 amplitude should remain large. In fact, four subjects reported losing count in the multiple-flash 125 ms SOA condition, and one of these subjects also reported this problem in the multiple-flash 250 ms SOA condition. If subjects lost count, they were instructed to continue noticing target flashes and estimate the count to the best of their abilities.

Figure 4-7: Interaction of flash type and SOA. Mean error rate (%) is shown for the single and multiple flash conditions at SOAs of 125, 250, and 500 ms.

Table 4-1: Flash type x SOA interaction for counting accuracy. The percentages under “Counting errors” reflect the percentage deviation between the subject’s count and the actual target count.

Flash type   SOA (ms)   Counting errors   Std. error
Single       500         0.9%             0.5%
             250         3.1%             2.2%
             125         4.0%             2.2%
Multiple     500         1.8%             1.0%
             250         9.4%             2.7%
             125        18.9%             3.6%

4.3.2: Behavioral Results: Subjective Report

Most of the subjects’ feedback presented via the exit questionnaires reflected observations also apparent from counting accuracy. All subjects stated that the faster speeds were more difficult and required more of their attention. Twelve of thirteen subjects reported the multiple-fast condition as more difficult and absorbing than the single-fast condition, and three subjects stated that they disliked the multiple-fast condition. It is, therefore, likely that subjects would perform better with fast displays after more experience with them. Eleven of thirteen subjects reported that targets on the sides and corners of the matrix were easier to detect than central targets, while the other two subjects voiced no preference. This preference for peripheral targets is likely due to flanking effects; central

targets were always surrounded by four nontarget distractors, while targets on a corner or side were surrounded by only two or three nontargets. This suggests that future P300 BCI designers should place commonly chosen letters on the sides and corners to maximize ease of use. Six of thirteen subjects reported that punctuation marks were easiest to detect, while one felt letters were easier to detect and the remaining six voiced no preference. This may have occurred because punctuation marks are more distinct icons than letters; hence, P300 BCIs should strive toward using distinct icons if possible. It can be assumed that even more visually distinct grid elements, such as pictures, words, or differently colored elements, would be easier to discriminate. Subjects were asked how tired they felt before and after the study. Seven subjects reported feeling more tired after the study than before it, while four voiced no change in fatigue. However, all subjects stated that they would be able to perform additional trials if sufficiently motivated. Thus, while the use of a P300 BCI in a non-distracting environment may be boring and even tiring (as suggested by decreasing counting accuracy within blocks), it is not so exhausting as to necessitate rest. This is an advantage over other BCIs, such as those used by the Kennedy or Birbaumer groups, where users often report the need for a break after using them (Kennedy et al., 2000; Parker, 2003). One interesting relationship between background and performance was apparent from the questionnaires. All subjects were asked how many hours per day they played computer games or otherwise actively used software with rapidly changing video displays. Eleven subjects reported playing less than one hour per day, and two subjects reported playing for more than three hours per day. These two subjects’ counting accuracy was better than the mean in all conditions, especially in the

fast conditions. To further explore this intriguing relationship, an additional assessment was performed in which five new subjects were presented with the multiple-fast condition and asked whether they had difficulty identifying individual flashes. Three of the five subjects (all of whom were experts with arcade-style computer games) reported that this was not difficult, while the other two subjects (both unfamiliar with computer games) reported this as being more difficult. This leads to two useful conclusions for future BCI designers. First, individuals with prior experience with rapidly changing displays may be better suited to BCIs using faster stimulus presentation. Second, users can probably be trained to recognize faster flashes. No study has yet explored the effects of long-term use of P300 BCI systems, but it is likely that most users could be trained to perform well with the multiple-fast condition. Thus, while some subjects in this study disliked and performed poorly in the multiple-fast condition, their preference and performance may well have changed had they used the BCI for longer periods.

4.3.3: Electrophysiology

4.3.3.1: P300 amplitude

P300 amplitude showed a statistically significant main effect of attention in all three comparisons (single vs. M17%: F(1,12) = 61.319, p = .000; single vs. M33%: F(1,12) = 95.223, p = .000; single vs. M50%: F(1,12) = 67.171, p = .000; means and std. errors in Table 4-2 below). The condition x attention interaction was also significant in all three comparisons (single vs. M17%: F(2,24) = 7.801, p = .016; single vs. M33%: F(2,24) = 10.380, p = .012; single vs. M50%: F(2,24) = 15.295, p = .004; means and std. errors in Table 4-2 below). Attend vs. ignore differences were most pronounced in the M17% condition, even though the S12.5% condition had a lower target probability. Thus, some other facet of the multiple flashes condition resulted in larger P300 responses, perhaps the increased stimulus and task complexity. Attend vs. ignore differences were less pronounced in the multiple condition at higher probabilities.

Table 4-2: P2, N2, and P300 amplitude and attention for different flash bins

Condition   Attention   P2 amp.   Std. error   N2 amp.   Std. error   P3 amp.   Std. error
S 12.5%     Attend      4.158     .492          .373     .764         5.382     .511
S 12.5%     Ignore      1.900     .416         -.853     .273         1.082     .314
M 17%       Attend      5.984     1.552         .629     1.137        8.099     1.840
M 17%       Ignore      1.265     .287         -1.312    .279          .799     .310
M 33%       Attend      4.653     .477          .717     .543         3.721     .273
M 33%       Ignore      2.128     .306         -.893     .201         1.557     .199
M 50%       Attend      3.693     .595          .602     .364         3.637     .612
M 50%       Ignore      2.311     .402         -.750     .348         2.411     .455

P300 amplitude showed a statistically significant condition x electrode interaction in two comparisons (single vs. M33%: F(2,24) = 7.561, p = .018; single vs. M50%: F(2,24) = 10.604, p = .008; means and std. errors in Table 4-3 below). In both the single and M17% conditions, the P300 shows a classic distribution, increasing in amplitude from anterior to posterior sites. However, at higher probabilities, the posterior P300 or P300b becomes smaller, while the anterior P300 does not, producing the opposite distribution. Thus, different probabilities produced different scalp distributions. This could be useful to a BCI uncertain whether a particular flash was attended or ignored, as discussed in “General Discussion” below.

Table 4-3: P2 and P300 amplitude and electrode for different flash bins

Condition      Electrode   P2 amplitude   Std. error   P3 amplitude   Std. error
Single 12.5%   Fz          3.026          .364         2.405          .287
Single 12.5%   Cz          3.402          .397         3.272          .347
Single 12.5%   Pz          2.560          .519         4.020          .407
M 17%          Fz          3.121          .705         3.659          .748
M 17%          Cz          3.916          .878         4.682          .945
M 17%          Pz          3.837          .928         5.007          .904
M 33%          Fz          3.446          .465         2.914          .366
M 33%          Cz          3.642          .390         2.640          .168
M 33%          Pz          3.082          .348         2.363          .274
M 50%          Fz          3.081          .482         3.285          .606
M 50%          Cz          3.121          .484         3.049          .499
M 50%          Pz          2.803          .554         2.738          .421

P300 amplitude varied significantly as a function of SOA in two of three comparisons, with the third comparison marginally significant (single vs. M17%: F(2,24) = 5.989, p = .031; single vs. M33%: F(2,24) = 3.812, p = .076; single vs. M50%: F(2,24) = 5.302, p = .040; means and std. errors in Table 4-4 below). In the single and M17% conditions, P300 amplitude varied as expected with SOA. That is, faster flashes produced smaller P300s. However, this effect was not apparent in the M33% and M50% bins, in which P300 amplitude was actually larger in response to faster flashes. The multiple flashes condition thus has a slight advantage over the single flash condition during fast speeds, in that a BCI using the multiple-fast condition may produce EEGs that are more robust.

Table 4-4: N1, P2, N2, and P300 amplitude and SOA for different flash bins. SE = std. error.

Condition   SOA   N1 amp.   SE      P2 amp.   SE       N2 amp.   SE       P3 amp.   SE
S 12.5%     500   -1.315    .394    3.754     .367      .470     .535     3.787     .364
S 12.5%     250   -1.459    .515    3.374     .529      .354     .512     3.471     .373
S 12.5%     125   -1.182    .375    1.959     .468     -1.544    .442     2.439     .478
M 17%       500    .231     .638    4.975     1.188     .719     1.565    5.908     1.572
M 17%       250   -1.870    .646    2.908     .609     -.851     1.064    4.654     1.323
M 17%       125   -.624     .846    2.990     .535     -.893     .756     2.787     .409
M 33%       500    .993     .568    4.835     .775      .976     .657     3.443     .512
M 33%       250   -1.897    .554    2.364     .388     -.741     .399     1.720     .380
M 33%       125   -.337     .218    2.972     .549     -.499     .358     2.755     .447
M 50%       500    .109     .858    4.679     .909     1.293     .835     4.621     .936
M 50%       250   -2.601    .393    1.386     .523     -1.550    .172     1.311     .525
M 50%       125   -.660     .150    2.940     .463      .0354    .450     3.139     .541

Figure 4-8: Relationship between SOA, flash type, and P300 amplitude. [Figure: P300 responses; mean amplitude (µV) as a function of SOA (125, 250, and 500 ms) for the Single, M17%, M33%, and M50% conditions.]

4.3.3.2: P300 latency

P300 latency varied significantly with SOA in all three comparisons (single vs. M17%: F(2,24) = 6.274, p = .026; single vs. M33%: F(2,24) = 7.986, p = .016; single vs. M50%: F(2,24) = 11.516, p = .006; means and std. errors in Table 4-5 below). On average, the P300 was typically about 40 ms slower in the fast condition than in the other conditions. P300 latency is often correlated with task difficulty, and subjects in this study indicated in their questionnaires that they perceived the fast condition as more difficult.

Table 4-5: P300 latency and SOA for different flash bins

Condition      SOA   P3 latency   Std. error
Single 12.5%   500   355.946      9.078
Single 12.5%   250   358.551      6.930
Single 12.5%   125   403.892      13.208
M 17%          500   368.359      10.780
M 17%          250   370.963      18.165
M 17%          125   394.010      8.112
M 33%          500   356.814      11.699
M 33%          250   337.862      11.737
M 33%          125   399.132      14.588
M 50%          500   360.576      10.408
M 50%          250   350.231      6.064
M 50%          125   404.123      11.297

P300 latency varied significantly with electrode in all three comparisons (single vs. M17%: F(2,24) = 7.441, p = .015; single vs. M33%: F(2,24) = 13.905, p = .004; single vs. M50%: F(2,24) = 10.683, p = .007; means and std. errors in Table 4-6 below). P300 latency was typically longest at parietal sites and shortest at frontal sites. While this distinct scalp distribution is widely reported in the literature, it has not yet been used to discriminate different P300s, as could be useful in a BCI.

Table 4-6: N1, N2, and P300 latency and Electrode for different flash bins. SE = std. error.

Condition   Electrode   N1 latency   SE      N2 latency   SE      P3 latency   SE
S 12.5%     Fz          154.238      3.240   287.123      4.305   357.364      9.209
S 12.5%     Cz          154.672      2.580   282.927      4.868   384.201      9.001
S 12.5%     Pz          158.940      3.388   274.174      4.888   376.823      8.772
M 17%       Fz          160.416      2.692   266.145      4.466   366.405      10.936
M 17%       Cz          153.385      4.706   267.707      2.811   379.166      11.540
M 17%       Pz          156.119      2.493   271.743      5.229   387.760      11.410
M 33%       Fz          155.931      4.058   270.948      7.581   351.245      5.444
M 33%       Cz          150.939      3.052   283.173      6.503   364.265      8.349
M 33%       Pz          164.901      2.206   285.778      4.595   378.299      7.343
M 50%       Fz          149.927      3.656   267.620      8.787   359.925      5.989
M 50%       Cz          149.710      4.173   280.497      6.067   371.355      6.852
M 50%       Pz          164.828      5.598   279.918      6.152   383.651      7.890

4.3.3.3: N1 amplitude

N1 amplitude was marginally significant as a function of SOA in two of the three comparisons (single vs. M33%: F(2,24) = 4.129, p = .065; single vs. M50%: F(2,24) = 4.251, p = .062; means and std. errors in Table 4-4 above). There was also a significant or marginally significant interaction between SOA and condition in all three comparisons (single vs. M17%: F(2,24) = 4.533, p = .048; single vs. M33%: F(2,24) = 5.336, p = .039; single vs. M50%: F(2,24) = 4.496, p = .055; means and std. errors in Table 4-4 above). N1 amplitude was always larger in the 250 SOA condition, particularly in the multiple flashes conditions.

4.3.3.4: N1 latency

N1 latency differed significantly with electrode in two of three comparisons, with the third marginally significant (single vs. M17%: F(2,24) = 7.986, p = .016; single vs. M33%: F(2,24) = 15.193, p = .003; single vs. M50%: F(2,24) = 3.923, p = .072; means and std. errors in Table 4-6 above). N1 latencies were generally longer toward posterior areas, as would be expected from the literature. This effect was slight in the single condition. It was barely visible in the multiple 17% condition, yet became more pronounced at higher probabilities. Hence, N1 latency exhibits a more distinct spatiotemporal distribution in the multiple flashes condition than in the single flash condition, which may be useful to a pattern recognition system. N1 latency differed significantly with attention in all three comparisons (single vs. M17%: F(1,12) = 20.380, p = .017; single vs. M33%: F(1,12) = 21.990, p = .002; single vs. M50%: F(1,12) = 9.374, p = .016; means and std. errors in Table 4-7 below). N1 latency was roughly 10 ms longer in the ignore condition in all four conditions studied; this is also to be expected from earlier N1 work.

Table 4-7: N1 latency and Attention for different flash bins

Condition      Attention   N1 latency   Std. error
Single 12.5%   Attend      150.038      4.112
Single 12.5%   Ignore      161.862      2.945
M 17%          Attend      148.436      3.098
M 17%          Ignore      164.843      5.040
M 33%          Attend      149.420      3.417
M 33%          Ignore      165.094      3.953
M 50%          Attend      151.397      3.161
M 50%          Ignore      158.246      4.829

4.3.3.5: P2 amplitude

P2 amplitude showed main effects of SOA, electrode, and attention in all three comparisons (SOA comparisons: single vs. M17%: F(2,24) = 7.090, p = .019; single vs. M33%: F(2,24) = 6.334, p = .026; single vs. M50%: F(2,24) = 5.954, p = .031; means and std. errors in Table 4-4 above. Electrode comparisons: single vs. M17%: F(2,24) = 6.008, p = .030; single vs. M33%: F(2,24) = 10.999, p = .007; single vs. M50%: F(2,24) = 4.766, p = .049; means and std. errors in Table 4-3 above. Attention comparisons: single vs. M17%: F(1,12) = 46.229, p = .000; single vs. M33%: F(1,12) = 61.932, p = .000; single vs. M50%: F(1,12) = 39.631, p = .000; means and std. errors in Table 4-2 above.) P2 amplitude generally decreased with faster SOAs, and was typically larger over frontal and central sites than over parietal sites. Mean P2 amplitude was larger in all three multiple flash bins than in the single flash bin, though this effect was not significant.

The condition x attention interaction was marginally significant in one of the three comparisons (single vs. M50%: F(2,24) = 5.269, p = .051; means and std. errors in Table 4-2 above). In all bins, P2 amplitude was larger to attended flashes than ignored flashes. This difference became less pronounced at higher probabilities, as attended P2 amplitude decreased and ignored P2 amplitude increased, similar to the results for P300 amplitude. In the M17% and M33% bins, the P2 amplitude difference was greater than in the single flash condition. Since the attend vs. ignore difference is more pronounced in the multiple 33% condition than in the single flash condition, despite the relatively low probability of single flashes (12.5%), P2 amplitude does not vary with probability alone. The increased P2 amplitude difference in the multiple flashes conditions is likely due to the increased task and stimulus complexity seen in that condition.

4.3.3.6: P2 latency

P2 latency showed few statistically significant effects. Though P2 latency was longer in ignored than attended trials in all bins, as would be expected from precedents in the literature, this effect was significant only in the single vs. M33% comparison (F(1,12) = 6.784, p = .031).

4.3.3.7: N2 amplitude

N2 amplitude varied significantly with SOA in two of the three comparisons (single vs. M33%: F(2,24) = 7.503, p = .018; single vs. M50%: F(2,24) = 5.876, p = .032; means and std. errors in Table 4-4 above). The SOA x condition interaction was also significant in two of three comparisons (single vs. M17%: p = .128; single vs. M33%: p = .038; single vs. M50%: p = .002; means and std. errors in Table 4-4 above). N2 amplitude varied significantly with attention in two of three comparisons (single vs. M33%: F(1,12) = 5.506, p = .047; single vs. M50%: F(1,12) = 6.019, p = .040; means and std. errors in Table 4-2 above). Ignored events produced more negative N2s than attended events. This may have occurred because the N2 overlaps two components known to be more positive in attended trials: the P2 and P300. It is conceivable that an approach such as ICA could separate the individual N2 effects from those of overlapping components.

4.3.3.8: N2 latency

The condition x electrode interaction was significant in two of three comparisons (single vs. M33%: F(2,24) = 9.332, p = .011; single vs. M50%: F(2,24) = 6.536, p = .025; means and std. errors in Table 4-6 above). N2 latency was generally longer at posterior sites in the multiple flashes condition, while the opposite trend appeared in the single flash condition. A BCI using either flash pattern could exploit the corresponding spatial distribution to discriminate a real N2 from artifact.

4.3.4: General Discussion

The purpose of this study was to explore two avenues toward improving information throughput in a synchronous BCI: stimulus speed (by varying SOA) and number of cognemes (by varying flash patterns). These manipulations affected signal robustness, subject performance, and subjective evaluation of the BCI. The study was conducted using a novel flash methodology, called the “multiple flashes” approach, at different SOAs, compared to the conventional “single flash” approach. The results of both counting accuracy and subjective report indicate that a few subjects found the multiple flashes approach difficult to use at faster speeds. These subjects were new to BCIs, and would probably have found the multiple flashes approach easier to use at fast speeds with practice.

The most important aspect of a P300 BCI is the difference between attended and ignored events. The more apparent this difference is to an artificial pattern recognition system, the greater the system’s accuracy in identifying targets. The present results provide strong support for the multiple flashes approach as the basis for a P300 BCI. Based on visual inspection and statistical results, the attend vs. ignore differences are more apparent after multiple flashes than single flashes with low target probabilities. This difference is less apparent in the multiple flashes condition with high target probabilities. Thus, the ERP results alone do not make a strong case for the superiority of either flash approach. However, because only six cognemes are needed per grid element in the multiple flash condition, compared to sixteen cognemes in the single flash condition, a BCI using the multiple flashes approach should be able to operate 2-3 times more quickly than a BCI using the single flash approach.
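The claimed speedup follows directly from the flash counts above. The back-of-the-envelope sketch below (not code from this dissertation) makes the arithmetic explicit; the number of repetitions averaged per selection (N_REPS) is a hypothetical placeholder, and the ratio between the two approaches does not depend on it.

```python
# A minimal sketch comparing raw selection-cycle times for the two flash approaches
# at the three SOAs used in this study. FLASHES_PER_CYCLE comes from the text
# (16 single-flash vs. 6 multiple-flash cognemes per selection); N_REPS is an
# assumed number of repetitions a classifier might average before deciding.

FLASHES_PER_CYCLE = {"single": 16, "multiple": 6}
SOAS_MS = [125, 250, 500]
N_REPS = 5  # assumed, for illustration only

for soa in SOAS_MS:
    t_single = FLASHES_PER_CYCLE["single"] * soa * N_REPS / 1000.0
    t_multiple = FLASHES_PER_CYCLE["multiple"] * soa * N_REPS / 1000.0
    print(f"SOA {soa} ms: single = {t_single:.1f} s/selection, "
          f"multiple = {t_multiple:.1f} s/selection, "
          f"speedup = {t_single / t_multiple:.2f}x")
```

Whatever repetition count is chosen, the ratio stays at 16/6, or roughly 2.7, which is the source of the "2-3 times" estimate above.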

Furthermore, the variable probability seen in the multiple flashes condition could be of value to a pattern recognition system. Consider the following situation: a user wishes to send the “%” icon. This icon is only illuminated in one of the six flashes (see Figs. 4 and 5); thus, the BCI’s pattern recognition system should expect a very robust attend vs. ignore difference. The flash containing the “%” icon produces a large difference, but another flash produces an indeterminate result. The BCI thus must decide whether the user attended to one of the six flashes only, meaning the “%” is the target, or whether the user attended to two of the flashes, meaning the “$” is the target. Because the ERPs evoked by 17% probable and 33% probable targets look distinctly different, the BCI could make that determination. This is not possible with the single flash approach, as the probability never changes and thus the ERPs evoked by different flashes all look very similar. Thus, the multiple flashes approach allows a new avenue for resolving uncertainty.

This approach also appears to accentuate early components across some probabilities, which may be useful to a pattern recognition system in two ways. A simple pattern recognition system may perform better simply because the P2 difference provides it with a longer period of discriminable activity. A more sophisticated approach specialized for detecting temporal changes (such as a Markov model) may also utilize the fact that a valid P300 should be preceded by a P2. These comments apply to single flashes as well; early components varied with attention in both flash patterns, though this was more pronounced with multiple flashes.
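One hedged sketch of how the disambiguation just described could be implemented is given below. It is purely illustrative, not the classifier used in this work: the templates, the correlation-based comparison, and the function names are hypothetical, and a real system would learn such templates from training data.

```python
import numpy as np

# Hypothetical sketch of the disambiguation logic described above. Flash A clearly
# contains a target response; flash B is ambiguous. Because 17%-probable and
# 33%-probable targets evoke distinguishable ERPs, flash B can be compared against
# an "ignored" template and an "attended at 33%" template; the better match decides
# whether the user attended one flash or two.

def correlate(erp, template):
    """Pearson correlation between an averaged ERP and a stored template (both 1-D)."""
    return float(np.corrcoef(erp, template)[0, 1])

def resolve_ambiguous_flash(erp_b, template_ignored, template_attended_33):
    """Return which icon to select: the icon unique to the clearly attended flash if
    flash B looks ignored, or the icon shared by both flashes if B looks like an
    attended 33%-probability flash."""
    if correlate(erp_b, template_attended_33) > correlate(erp_b, template_ignored):
        return "icon shared by both flashes"
    return "icon unique to the clearly attended flash"

# Toy usage with simulated 200-sample ERPs; real templates would come from training data.
rng = np.random.default_rng(0)
template_ignored = rng.normal(0, 1, 200)
template_attended_33 = template_ignored + np.concatenate(
    [np.zeros(120), np.hanning(60) * 5, np.zeros(20)])
noisy_b = template_attended_33 + rng.normal(0, 1, 200)
print(resolve_ambiguous_flash(noisy_b, template_ignored, template_attended_33))
```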

Consistent with earlier work, this study also showed that the P300 and other components that vary with attention exhibit a distinct spatial distribution. No BCI has yet accounted for this distribution in a pattern recognition system; in fact, most P300 BCIs base their classification on data from only one electrode site, Pz. While the P300 is typically largest over this site, it is apparent at other sites as well. A pattern recognition approach with prior knowledge of the spatial and temporal distribution of an attended vs. ignored ERP could perform substantially better than a simpler system that either examines only one site, or accounts for multiple sites individually without looking for patterns apparent across multiple sites. Further, an approach such as independent component analysis (ICA) may be effective in isolating early components from other activity. This may be especially useful for the N2, which overlaps the P2 and P300 and produced anomalous results when studied with classical analysis approaches. As noted above, the N2 should be more robust after attended flashes, and this appears to be the case when the grand averages are studied visually (see Fig. 9). However, the measured N2 peak was actually more positive after attended flashes.
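As a concrete illustration of the point about spatial distribution, the sketch below builds one feature vector per flash from mean amplitudes in several post-stimulus windows at Fz, Cz, and Pz and feeds it to a linear discriminant, so the anterior-to-posterior gradient is available to the classifier rather than a single Pz measurement. This is a hypothetical example, not the analysis used here: the window boundaries, channel layout, simulated data, and the use of scikit-learn's LDA are all assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

FS = 256                                           # samples per second, as recorded here
WINDOWS_MS = [(140, 200), (200, 300), (300, 500)]  # roughly N1/P2, N2, and P300 ranges
SITES = ["Fz", "Cz", "Pz"]                         # assumed channel order in each epoch

def features(epoch):
    """epoch: array of shape (3 channels, n_samples), time-locked to the flash."""
    feats = []
    for ch in range(epoch.shape[0]):
        for start_ms, end_ms in WINDOWS_MS:
            a, b = int(start_ms * FS / 1000), int(end_ms * FS / 1000)
            feats.append(epoch[ch, a:b].mean())
    return np.array(feats)

# Toy training data: 100 simulated epochs per class, 3 channels x 256 samples (1 s).
rng = np.random.default_rng(1)
ignored = rng.normal(0, 2, (100, 3, FS))
attended = ignored + 0.0
# Add a posterior-weighted "P300" from 300-500 ms to the attended class only.
attended[:, :, int(0.3 * FS):int(0.5 * FS)] += np.array([2, 3, 4])[None, :, None]
X = np.array([features(e) for e in np.concatenate([attended, ignored])])
y = np.array([1] * 100 + [0] * 100)
clf = LinearDiscriminantAnalysis().fit(X, y)
print("training accuracy:", clf.score(X, y))
```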

Another question addressed in this study is the importance of SOA. As expected, there exists a tradeoff between flash speed and ERP measures. Faster flashes allow more cognemes per minute, but produce less distinct ERPs. A few subjects had trouble with the multiple flashes approach at the fastest speed. The question of which SOA is ideal depends on two factors: the user’s preference and/or facility for fast flashes, and the pattern recognition approach used. One pattern recognition approach might perform best when given the large attend vs. ignore difference apparent after one or two slow flashes, while another might be better able to identify the smaller difference seen after four or eight fast flashes.

This leads to another essential consideration in P300 BCIs: signal averaging. The information throughput of any BCI is inversely proportional to the number of trials that must be averaged to attain an acceptable accuracy threshold. While all P300 BCIs to date have been able to identify targets based on single trials above chance, accuracy remains poor. The most recent implementation of the Donchin BCI (Donchin, 2003) utilizes averages of 15 trials in order to identify targets accurately. The challenge of single trial recognition is difficult to appreciate from the grand averages in Figures 8 through 16, each of which represents thousands of trials. While the attend vs. ignore differences are apparent to the untrained eye in these grand averages, these differences are harder to detect in single trials. Figure 4-21 shows averaged responses from a subject responding to one letter in the multiple flashes condition. The subject attended to the “v,” which is illuminated by only one of the six flashes. The top graph shows attended trials (responses to the flash pattern containing the “v”) and the bottom graph shows ignored trials (responses to one of the other flash patterns). Each average contains approximately 35 single trials. These two averages were chosen because the attend vs. ignore difference is obvious to the naked eye.

Figures 4-22 through 4-25 each display two sets of single trial data. In each figure, the top six graphs show an attended flash, and the bottom six graphs show an ignored flash. In Figures 4-22 and 4-23, it is relatively easy to recognize that the top graphs contain an attended ERP, while the bottom graphs do not. Both the P300 and earlier components are apparent in the top graphs only. The single trial P300 even shows a classic spatial distribution, with larger components over posterior sites, as apparent from the bottom right attended graph that displays only midline sites. In Figures 4-24 and 4-25, the attend vs. ignore differences are not apparent to the naked eye. In both figures, the P300 and early components are less obvious in the attended graphs, and the ignored graphs both exhibit activity that could be mistaken for a P300. These figures underscore the challenges inherent in single trial recognition. In particular, they highlight the serious problem of within-subject variability. These four single trials were all recorded within two minutes of each other, in response to the same letter, flash type, and SOA, yet show considerable variability.

One solution to this problem was presented in Bayliss (2001), who utilized a variable averaging approach. Her BCI first examines a single trial. If it determines that there is not enough information in that single trial for an accurate discrimination, it presents a second trial, averages it with the first, presents a third trial if necessary, and so on. While it is likely that P300 BCIs will eventually be capable of reliable single trial recognition as pattern recognition technology develops, the variable averaging approach represents a useful methodology until then. These figures also show two potential advantages of single trials. Figures 4-22 and 4-23 clearly show early component activity and contain substantial alpha activity, which does vary with attention and thus could provide useful information to a pattern recognition system. Neither of these is apparent from the averages in Figure 4-21. It is difficult to overstate the need for development of improved pattern recognition approaches for P300 BCIs, especially approaches that account for the known characteristics of ERPs evoked by attended vs. ignored events and perform effectively with a minimum number of trials.
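A schematic sketch of the variable-averaging idea appears below. It is not Bayliss's implementation; the correlation-with-template "classifier," the confidence threshold, and the trial source are stand-ins chosen only to make the control flow concrete.

```python
import numpy as np

# Fold trials into a running average until the classifier is confident enough to
# decide, up to a maximum number of trials. The classifier here is a placeholder.

def classify_with_variable_averaging(get_next_trial, attend_template,
                                     threshold=0.6, max_trials=15):
    """get_next_trial() returns one single-trial ERP (1-D array) each call. Stop as
    soon as the running average correlates strongly enough with the attended-ERP
    template, else give up after max_trials."""
    running_avg = None
    for n in range(1, max_trials + 1):
        trial = get_next_trial()
        # Incremental mean: avg_n = avg_{n-1} + (x_n - avg_{n-1}) / n
        running_avg = trial if running_avg is None else running_avg + (trial - running_avg) / n
        r = np.corrcoef(running_avg, attend_template)[0, 1]
        if r >= threshold:
            return "attended", n
    return "not attended (or undecided)", max_trials

# Toy usage: noisy single trials containing a template-like deflection.
rng = np.random.default_rng(2)
template = np.concatenate([np.zeros(80), np.hanning(60) * 4, np.zeros(60)])
label, trials_used = classify_with_variable_averaging(
    lambda: template + rng.normal(0, 4, template.size), template)
print(label, "after", trials_used, "trials")
```

Because averaging n independent trials improves the signal-to-noise ratio by roughly the square root of n, the loop typically stops well before the 15-trial ceiling when a genuine attended response is present.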

There exist many other avenues for improving the information throughput of P300 BCIs. One unexplored question is which mental task yields the largest signal. In all P300 BCIs, users have been instructed to attend to and count target flashes. It is possible that different instructions could affect ERP measures, and this would likely exhibit tradeoffs with both performance and subjective report. A follow-up study to explore this issue is currently being designed.

Another possible improvement currently being studied is the use of an alternative flash approach called splotches (see Figures 4-26 and 4-27). These flash patterns do not group flashed elements in rows or columns, as has been done in all P300 BCIs that flash more than one element at any one time. Splotches may offer three advantages. First, people are used to seeing information grouped in rows and columns; the splotch arrangement may appear more novel, producing an increase in the P300, especially the anterior P300. Second, the use of splotches reduces the number of flashed elements flanking any attended target; in the multiple flashes condition, a target is sometimes surrounded by eight flashing nontargets. Subjects in this study stated that they preferred targets that were not surrounded by flashing nontargets. Third, several subjects in pilot studies to date report that splotches were easier to detect at fast presentation speeds.

The results of the present study indicate that the multiple flashes approach has some advantages over the single flash approach, but the decline in P300 amplitude at higher probabilities is a concern. One possibility worth exploring is the use of an intermediate multiple flashes approach, in which between one and four columns are illuminated at once. This would require fewer flashes than the single flash approach, and may produce the early component enhancement seen in the multiple flashes condition in this study. More flashes would be needed than in the multiple approach used in this study, but target probability could be kept below 50%. This tradeoff may be worthwhile; like many other avenues toward improved BCI information throughput, this issue can only be addressed through further experimentation.
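For readers who want to experiment with splotch-like displays, the sketch below generates random flash groups that are not organized by rows or columns while preserving the one property any flash scheme must have: each grid element must receive a unique pattern of flashes so that the attended element can still be identified. The group size, flash count, and random-search construction are illustrative choices, not the patterns used in the pilot work described above.

```python
import numpy as np

# Generate a set of random "splotch" flashes over an 8 x 8 grid (64 elements) and
# accept it only if every element ends up with a unique on/off signature across the
# sequence and flashes at least once. Parameters are arbitrary illustrative values.

def make_splotch_sequence(n_elements=64, n_flashes=12, elements_per_flash=32,
                          seed=0, max_tries=1000):
    rng = np.random.default_rng(seed)
    for _ in range(max_tries):
        flashes = np.zeros((n_flashes, n_elements), dtype=bool)
        for f in range(n_flashes):
            flashes[f, rng.choice(n_elements, elements_per_flash, replace=False)] = True
        signatures = {tuple(row) for row in flashes.T}   # one signature per element
        if len(signatures) == n_elements and flashes.any(axis=0).all():
            return flashes
    raise RuntimeError("could not find a sequence with unique element signatures")

seq = make_splotch_sequence()
print(seq.shape, "flashes x elements;", seq.sum(axis=1), "elements lit per flash")
```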


Figure 4-9: Grand average ERPs evoked in the single – 500 SOA bin. Blue lines represent ERPs evoked by attended flashes, and red lines show ERPs evoked by ignored flashes. The y-axis reflects amplitude in microvolts, and the x-axis reflects time from stimulus onset in milliseconds.

Figure 4-10: Grand average ERPs evoked in the multiple 17% - 500 SOA bin. Blue lines represent ERPs evoked by attended flashes, and red lines show ERPs evoked by ignored flashes. The y-axis reflects amplitude in microvolts, and the x-axis reflects time from stimulus onset in milliseconds.

Figure 4-11: Grand average ERPs evoked in the multiple 33% - 500 SOA bin. Blue lines represent ERPs evoked by attended flashes, and red lines show ERPs evoked by ignored flashes. The y-axis reflects amplitude in microvolts, and the x-axis reflects time from stimulus onset in milliseconds.

Figure 4-12: Grand average ERPs evoked in the multiple 50% - 500 SOA bin. Blue lines represent ERPs evoked by attended flashes, and red lines show ERPs evoked by ignored flashes. The y-axis reflects amplitude in microvolts, and the x-axis reflects time from stimulus onset in milliseconds.


Figure 4-13: Grand average ERPs evoked in the single - 250 SOA bin. Blue lines represent ERPs evoked by attended flashes, and red lines show ERPs evoked by ignored flashes. The y-axis reflects amplitude in microvolts, and the x-axis reflects time from stimulus onset in milliseconds.

Figure 4-14: Grand average ERPs evoked in the multiple 17% - 250 SOA bin. Blue lines represent ERPs evoked by attended flashes, and red lines show ERPs evoked by ignored flashes. The y-axis reflects amplitude in microvolts, and the x-axis reflects time from stimulus onset in milliseconds.

Figure 4-15: Grand average ERPs evoked in the multiple 33% - 250 SOA bin. Blue lines represent ERPs evoked by attended flashes, and red lines show ERPs evoked by ignored flashes. The y-axis reflects amplitude in microvolts, and the x-axis reflects time from stimulus onset in milliseconds.

Figure 4-16: Grand average ERPs evoked in the multiple 50% - 250 SOA bin. Blue lines represent ERPs evoked by attended flashes, and red lines show ERPs evoked by ignored flashes. The y-axis reflects amplitude in microvolts, and the x-axis reflects time from stimulus onset in milliseconds.


Figure 4-17: Grand average ERPs evoked in the single - 125 SOA bin. Blue lines represent ERPs evoked by attended flashes, and red lines show ERPs evoked by ignored flashes. The y-axis reflects amplitude in microvolts, and the x-axis reflects time from stimulus onset in milliseconds.

Figure 4-18: Grand average ERPs evoked in the multiple 17% - 125 SOA bin. Blue lines represent ERPs evoked by attended flashes, and red lines show ERPs evoked by ignored flashes. The y-axis reflects amplitude in microvolts, and the x-axis reflects time from stimulus onset in milliseconds.

Figure 4-19: Grand average ERPs evoked in the multiple 33% - 125 SOA bin. Blue lines represent ERPs evoked by attended flashes, and red lines show ERPs evoked by ignored flashes. The y-axis reflects amplitude in microvolts, and the x-axis reflects time from stimulus onset in milliseconds.

Figure 4-20: Grand average ERPs evoked in the multiple 50% - 125 SOA bin. Blue lines represent ERPs evoked by attended flashes, and red lines show ERPs evoked by ignored flashes. The y-axis reflects amplitude in microvolts, and the x-axis reflects time from stimulus onset in milliseconds.


Figure 4-21: Average ERPs evoked by about 35 flashes in subject JM. The top graph shows responses to attended trials, and the bottom graph displays ignored trials.

Figure 4-22: Two single trials from the averages in Figure 4-21. The top graph shows the ERP to an attended trial, and the bottom graph displays an ignored ERP.

Figure 4-23: Two single trials from the averages in Figure 4-21. The top graph shows the ERP to an attended trial, and the bottom graph displays an ignored ERP.

Figure 4-24: Two single trials from the averages in Figure 4-21. The top graph shows the ERP to an attended trial, and the bottom graph displays an ignored ERP.

Figure 4-25: Two single trials from the averages in Figure 4-21. The top graph shows the ERP to an attended trial, and the bottom graph displays an ignored ERP.

Figure 4-26: Two splotch patterns currently being used in a follow-up study. These splotches each highlight eight grid elements, analogous to the single flash approach used in this study. Multicolored flashes are also being explored.

Figure 4-27: Two splotch patterns currently being used in a follow-up study. These splotches each highlight 32 grid elements, analogous to the multiple flash approach used in this study. Multicolored flashes are being explored and are most helpful at fast speeds, when subjects have trouble detecting individual flashes.

CHAPTER 5: ERPs EVOKED BY DIFFERENT MATRIX SIZES: IMPLICATIONS FOR A BRAIN COMPUTER INTERFACE (BCI) SYSTEM

Brendan Z. Allison and Jaime A. Pineda

Abstract— A brain computer interface (BCI) system may allow a user to communicate by selecting one of many options, often presented in a matrix. Larger matrices allow a larger vocabulary, but require more time for each selection. In this study, subjects were asked to perform a target detection task using matrices appropriate for a BCI. The study sought to explore the relationship between matrix size and EEG measures, target detection accuracy, and user preferences. Results indicated that larger matrices evoked a larger P300 amplitude, and that matrix size did not significantly affect performance or preferences.

Index Terms— attention, BCI, matrix, P300

5-1: Introduction

A brain computer interface (BCI) is a real-time communication system in which messages or commands sent by the user do not pass through the brain’s natural output pathways. Some BCIs, called evoked BCIs, depend on the user’s response to specific sensory events, while spontaneous BCIs do not [16]. One class of evoked BCIs explored in the literature has been called a “P300 BCI” [1]-[3],[5]. In the first such BCI described in the literature, subjects saw a 6 x 6 matrix containing English letters and other matrix elements. Subjects were asked to choose one of the elements and designate it as the target. Individual rows or columns were then flashed sequentially, and the user was asked to count the number of times the target was flashed while ignoring flashes which did not illuminate the target. Target flashes produced a robust P300 response, while nontarget flashes did not, as expected from prior work on the P300 [8],[14]; for review see [11]. It was, therefore, possible to determine which element of the matrix the subject intended as the target simply by determining which row flash and which column flash produced a large P300 [3],[5].

The decision by Donchin and his colleagues to use six rows and six columns was based on two factors. First, the probability of the target being flashed was 0.17, and such improbable events had been shown to produce robust P300s [4],[9],[10],[12]. Second, a six by six matrix allows for 36 matrix elements, a convenient number to represent the 26 English letters and 10 additional elements. However, it is entirely possible that a larger or smaller matrix would have been preferable. For instance, a larger matrix could allow for a much larger vocabulary, in which matrix elements might include groups of letters, words, phrases, symbols, or pictures. The drawback is that more time is required to isolate the target element since more flashes are required. A smaller matrix requires fewer flashes, but offers a limited vocabulary. A very small matrix, such as a 4 x 4, might be most useful as the first layer of a “menu selection” BCI, in which the matrix elements do not represent a message per se but instead allow the user to choose the type of information presented in a second matrix [15].

There have been no parametric studies in the BCI literature investigating the effect of matrix size. In one study, users were allowed to indicate their interest in specific electronic devices by attending to flashes of pictures of those devices. Stimuli were presented one at a time, rather than in a matrix, and afforded the user a very small number of choices [2]. Another study utilized an 8 x 8 matrix [1] but also did not include a comparison between different matrix sizes. The purpose of the current study was to explore the relationship between matrix size and three dependent variables: ERP measures (P300 and N100 amplitude/latency), counting accuracy, and subject preference as determined by a questionnaire.
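The vocabulary-versus-time tradeoff sketched above can be made concrete with a rough calculation. The snippet below assumes an idealized, errorless row/column speller in which one selection requires flashing every row and every column once, repeated a fixed number of times; the SOA and repetition count are assumptions, and accuracy differences across matrix sizes (the subject of this chapter) are deliberately ignored.

```python
import math

# Rough, errorless estimate of bits per selection and time per selection for the
# three matrix sizes used in this study. SOA_S and N_REPS are assumed values.

SOA_S = 0.5     # seconds between flashes, assumed
N_REPS = 10     # repetitions averaged per selection, assumed

for size in (4, 8, 12):
    n_elements = size * size
    bits_per_selection = math.log2(n_elements)      # ideal, errorless case
    flashes_per_selection = 2 * size * N_REPS       # every row and column, repeated
    seconds_per_selection = flashes_per_selection * SOA_S
    print(f"{size:2d} x {size:<2d}: {n_elements:3d} elements, "
          f"{bits_per_selection:4.1f} bits/selection, "
          f"{seconds_per_selection:5.1f} s/selection, "
          f"{60 * bits_per_selection / seconds_per_selection:4.1f} bits/min")
```

Under these simplifying assumptions, larger matrices carry more information per selection but take disproportionately longer, which is why the accuracy and ERP differences examined in this chapter matter for choosing a matrix size.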

5-2: Methods

Subjects were 15 undergraduate students at UC San Diego (8 female, age range 18-21 years, mean = 19.2, SD = 0.83). All subjects were free of neurological or psychiatric disorders and were rested and alert. Subjects signed an informed consent approved by the University’s Institutional Review Board and were awarded course credit for participation in the study.

EEG activity was recorded from Fz, Cz, and Pz sites of the International 10-20 system using Ag/AgCl electrodes referenced to linked mastoids with a forehead ground. The filter bandpass was 0.1-100 Hz. Eye activity was recorded by an electrode placed over the right orbit filtered at 0.3-100 Hz. All impedances were kept below 5 kΩ, except the eye and forehead sites, which were below 10 kΩ. All data were sampled at 256 Hz and analyzed offline using the ADAPT system and SCAN 4.2.

Following EEG preparation, subjects were seated in a comfortable chair in an acoustic isolation chamber. They viewed a monitor containing green letters against a black background. Each element in the matrix consisted of a pair of English letters, called a digram. Before a trial began, subjects were cued as to which digram was the target for that trial. Once the subject was ready, single rows or columns were flashed sequentially. During a flash, the digrams of the flashed row or column changed from green to yellow for 100 ms. The delay between flashes varied randomly between 450-550 ms. Subjects counted the number of times the target digram flashed while ignoring other flashes. Subjects were exposed to three conditions, each with a different matrix size: 4 x 4, 8 x 8, and 12 x 12 (see Fig. 5-1). Digrams were the same size in all grids; hence, the larger grids appeared larger overall than smaller grids. However, all matrices subtended less than eight degrees of visual angle. Subjects participated in five trials during each matrix size condition, with the order of the fifteen trials determined pseudorandomly. Trials were long enough to allow each row and each column to be flashed between 13 and 17 times, and thus each trial lasted about 1-3.5 minutes. Subjects were allowed a brief rest after each trial.

An ERP consisted of the period 100 ms before and 900 ms after each flash. Trials on which the EEG exceeded +/- 40 microvolts were rejected; approximately 10-15% of trials were rejected across the different conditions. The N100 was scored as the most negative peak between 140 and 200 ms, and the P300 as the most positive peak from 300-500 ms.
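A minimal Python sketch of the epoching, artifact rejection, and peak-scoring steps is given below for readers who want to reproduce the pipeline outside the SCAN/ADAPT environment actually used. The continuous-data layout, channel indexing, and event handling are assumptions; the numeric criteria follow the text.

```python
import numpy as np

FS = 256  # Hz, as reported

def epoch(eeg, flash_sample, pre_ms=100, post_ms=900):
    """eeg: (n_channels, n_samples) continuous data; returns one baseline-corrected epoch."""
    a = flash_sample - int(pre_ms * FS / 1000)
    b = flash_sample + int(post_ms * FS / 1000)
    ep = eeg[:, a:b].astype(float)
    baseline = ep[:, : int(pre_ms * FS / 1000)].mean(axis=1, keepdims=True)
    return ep - baseline

def keep(epoch_data, limit_uv=40.0):
    """Reject the epoch if any sample exceeds +/- 40 microvolts."""
    return np.abs(epoch_data).max() <= limit_uv

def score_peaks(channel, pre_ms=100):
    """Return (N100, P300) amplitudes from one channel of a baseline-corrected epoch."""
    def window(start_ms, end_ms):
        a = int((pre_ms + start_ms) * FS / 1000)
        b = int((pre_ms + end_ms) * FS / 1000)
        return channel[a:b]
    n100 = window(140, 200).min()   # most negative peak, 140-200 ms post-stimulus
    p300 = window(300, 500).max()   # most positive peak, 300-500 ms post-stimulus
    return n100, p300

# Toy usage on simulated data: three channels, one flash event at sample 1000.
rng = np.random.default_rng(5)
sim_eeg = rng.normal(0, 5, (3, 4000))
ep = epoch(sim_eeg, flash_sample=1000)
if keep(ep[2]):                      # channel index 2 standing in for Pz
    print("N100, P300 (uV):", score_peaks(ep[2]))
```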

Figure 5-1: The three matrices used in this study. Note that all digrams were the same size in the study, and thus the larger matrices appeared larger on the screen.

5-3: Results

A three-way ANOVA (matrix size x attention x site) was performed on the scored ERPs; only significant or marginally significant results are reported here. Figure 5-2 shows grand average ERPs along midline sites evoked by the different size matrices.

Figure 5-2: Grand average ERP responses across all subjects over midline sites. The left three graphs show responses to the 12 x 12 grid, the middle three graphs show ERPs evoked by the 8 x 8 grid, and the right three graphs show responses evoked by the 4 x 4 grid. Thick line = ignored flashes, thin line = attended flashes. The x-axis reflects time in milliseconds from stimulus onset, and the y-axis reflects amplitude in microvolts.

P300 amplitude showed a highly significant effect of attention (8.182 µV for attended flashes vs. 1.527 µV for ignored flashes, p = .000). There was a marginally significant effect of site in which posterior sites showed larger amplitudes (Fz: 4.230 µV, Cz: 5.004 µV, Pz: 5.331 µV, p = .058). There was a highly significant interaction of size and attention in which the largest differences between attended and ignored events occurred in the 12 x 12 condition (see Table 5-1; p = .007). There was also a significant interaction between size and site, with larger matrices producing larger posterior P300s (see Table 5-2; p = .039), and a highly significant interaction between attention and site; consistent with prior work, the attend vs. ignore difference was larger over posterior sites (see Table 5-3; p = .000).

P300 latency showed a main effect of matrix size, with the shortest latencies to the smallest matrix (12 x 12: 366.0 ms, 8 x 8: 370.4 ms, and 4 x 4: 348.7 ms, p = .047). It also showed a marginally significant main effect of site, with shorter latencies to anterior sites (Fz mean = 348.6 ms, Cz mean = 357.8 ms, Pz mean = 378.8 ms, p = .059). There was a marginally significant interaction between size and site (see Table 5-2; p = .057).

Attended flashes produced larger N100 amplitudes than ignored flashes (.990 µV for attended flashes vs. .008 µV for ignored flashes, p = .007), while N100 amplitude increased from anterior to posterior sites (Fz: .003 µV, Cz: .483 µV, and Pz: 1.151 µV, p = .024). There was also a significant attention by site interaction (see Table 5-3; p = .045). N100 latency showed significant effects of both attention and site but not matrix size. Attended flashes produced longer N100 latencies (145.9 ms for ignored flashes vs. 152.0 ms for attended flashes, p = .016). N100 latency was shortest over the Cz site (Fz: 151.5 ms, Cz: 146.1 ms, Pz: 149.3 ms, p = .008). However, there were no significant interactive effects.

Counting accuracy did not vary significantly with matrix size. The subjects’ target count was always within 10% of the correct count. Subjects did not voice a strong preference for any particular matrix size in questionnaires presented after the study. However, when asked if matrix elements near the sides and corners were easier to count, eight subjects answered yes, two answered no, and five voiced no preference.

Table 5-1: Size x Attention interaction for P300 amplitude

Size      Attention   Amplitude (µV)   SE
12 x 12   Attended    9.164            1.00
12 x 12   Ignored     1.164            .45
8 x 8     Attended    7.730            1.02
8 x 8     Ignored     1.536            .46
4 x 4     Attended    7.652            .99
4 x 4     Ignored     1.883            .50

Table 5-2: Size x Site interaction for P300 amplitude and latency

Size      Site   Amp. (µV)   SE     Latency (ms)   SE
12 x 12   Fz     4.401       .550   349.6          11.66
12 x 12   Cz     5.244       .643   369.8          8.55
12 x 12   Pz     5.847       .567   378.6          10.38
8 x 8     Fz     3.910       .720   358.9          11.96
8 x 8     Cz     4.815       .740   364.3          10.80
8 x 8     Pz     5.174       .649   388.2          9.49
4 x 4     Fz     4.379       .551   337.4          7.06
4 x 4     Cz     4.952       .668   339.3          5.77
4 x 4     Pz     4.973       .610   369.5          11.21

Table 5-3: Site x Attention interaction for N100 and P300 amplitude

Site   Attention   N100 Amp. (µV)   SE     P300 Amp. (µV)   SE
Fz     Attended     .460            .389   6.283            .87
Fz     Ignored     -.398            .336   2.176            1.09
Cz     Attended     .006            .308   8.841            1.06
Cz     Ignored     -1.027           .286   1.167            .42
Pz     Attended    -.754            .441   9.422            .22
Pz     Ignored     -1.547           .274   1.240            .33

5-4: Discussion

Like other “P300 BCIs” described in the literature, the BCI described here is critically dependent on the difference between attended and ignored events. The more this difference is apparent to the pattern recognition system used to discriminate attended from ignored events, the better the performance. Results from this study suggest that this difference can be maximized by using a larger matrix. While matrix size did not significantly affect N100 measures, and had no relevant effect on P300 latency (the attend vs. ignore latency difference did not differ significantly with matrix size), it did affect P300 amplitude. Since larger matrices produced larger differences in P300 amplitude for attended vs. ignored events than smaller matrices, a larger matrix would appear to be preferable to a smaller one.

All the effects reported are consistent with prior literature on the N100 and P300. The longer P300 latency and the increase in posterior P300 amplitude seen with larger matrices probably occurred because attending to a target in the larger matrix represented a more difficult task in a more complex and distracting environment than in a smaller matrix. The overall increase in the difference between attended and ignored P300 amplitude in larger matrices, which reflects both relatively larger P300s to attended flashes and smaller responses to ignored flashes, is probably due to the reduced target probability and increased nontarget probability of larger matrices. P300 amplitude is inversely proportional to target probability. The unexpected observation that the majority of subjects preferred peripheral targets can best be explained by the fact that targets in the center of a matrix were surrounded by four nontargets, while targets in the periphery were not. Thus, subjects may have found peripheral targets less distracting than central targets. This result suggests that designers of similar BCIs should place more frequently chosen elements near the periphery to improve ease of use.

While the ERP data presented here suggest that larger grids are preferable because they produce more robust P300 differences between attended and ignored events, future BCI designers should consider three other factors in deciding the optimal matrix size:

User factors: The overriding concerns in the design of any BCI are the needs and desires of the user. While subjects in this study did not voice any strong preference for any matrix size, an individual required to use such a system as the only means of communication over a period of years may be more selective. Factors such as attentional or visual deficits, distractions in the subject’s environment, the number of matrix elements the user prefers and preferences for menu based BCIs, and personal preference may outweigh ERP factors in choosing the best size.

Early components: The N100 showed a slight but significant difference due to attention. This difference, like other differences in early ERP components, varied less with attention than did the P300. Other changes in the display and task accentuate differences in earlier components such as the P200 and N200 in a similar BCI [1]. Therefore, it is possible that changes in matrix size in such a BCI would produce different changes in early components that would not support a larger matrix.

Novel preprocessing and pattern recognition approaches: It is likely that newer preprocessing techniques such as independent component analysis (ICA) will be able to uncover characteristics of the P300 not apparent to conventional analysis. For example, recent work [6],[7] showed that ICA can separate components which vary with attention from those which do not. Similarly, different pattern recognition approaches will be more or less sensitive to different aspects of the attend vs. ignore difference, and thus the P300 amplitude results discussed here may be more relevant to, for example, a neural network than a hidden Markov model. It is also possible that different preprocessing and/or pattern recognition approaches perform differently as a function of matrix size.

The information transfer rate of any BCI is largely dependent on two factors: the robustness of the EEG differences associated with each unique mental event and the effectiveness of a pattern recognition system at recognizing this difference. This study examined the first of these factors and found differences in the EEGs evoked by different matrix sizes. However, these differences may be more or less informative to different preprocessing and pattern recognition approaches. For example, a simple approach such as SWDA may be influenced primarily by a larger P300 amplitude, while approaches such as a Hidden Markov Model (HMM) or neural network may be more sensitive to early component differences, spatial distribution, or other factors. While the question of how EEGs differ across matrix sizes has been answered, it is not known which types of subsequent processing can best detect this difference. This is being addressed in a study in progress.

ACKNOWLEDGMENTS

We would like to thank Benjamin Chi and Drs. Andrey Vankov and John Polich for their contributions to the study.

REFERENCES

[1] Allison BZ, Vankov AV, and Pineda JA. Toward a faster, better BCI. Soc Neurosci Abstr 2001; 31: 741.13.
[2] Bayliss JD and Ballard DH. A virtual reality testbed for brain-computer interface research. IEEE Transactions on Rehabilitation Engineering 2000; 8: 188-190.
[3] Donchin E, Spencer KM, and Wijesinghe R. The mental prosthesis: assessing the speed of a P300-based brain-computer interface. IEEE Transactions on Rehabilitation Engineering 2000; 8: 174-179.
[4] Duncan-Johnson CC and Donchin E. The P300 component of the event related potential as an index of information processing. Biological Psychology 1982; 14: 1-52.
[5] Farwell LA and Donchin E. Talking off the top of your head: toward a mental prosthesis utilizing event related brain potentials. Electroencephalography and Clinical Neurophysiology 1988; 70: 510-523.
[6] Makeig S, Westerfield M, Jung TP, Covington J, Townsend J, Sejnowski TJ, and Courchesne E. Functionally independent components of the late positive event-related potential during visual spatial attention. J Neurosci 1999; 19: 2665-2680.
[7] Makeig S, Westerfield M, Jung T-P, Enghoff S, Townsend J, Courchesne E, and Sejnowski TJ. Dynamic brain sources of visual evoked responses. Science 2002; 295: 690-694.
[8] Johnson R. A triarchic model of P300 amplitude. Psychophysiology 1986; 23: 367-384.
[9] Polich J. Attention, probability, and task demands as determinants of P300 latency from auditory stimuli. Electroencephalography and Clinical Neurophysiology 1986; 63: 251-259.
[10] Polich J. Task difficulty, probability, and inter-stimulus interval as determinants of P300 from auditory stimuli. Electroencephalography and Clinical Neurophysiology 1987; 68: 311-320.
[11] Polich J. P300 clinical utility and control of variability. Journal of Clinical Neurophysiology 1998; 15: 14-33.
[12] Ruchkin DS, Sutton S, and Tueting P. Emitted and evoked P300 potentials and variation in stimulus probability. Psychophysiology 1975; 12: 591-595.
[13] Sutter EE. The brain response interface: communication through visually-induced electrical brain responses. Journal of Microcomputer Applications 1992; 15: 31-45.
[14] Sutton S, Braren M, Zubin J, and John ER. Evoked-potential correlates of stimulus uncertainty. Science 1965; 150: 1187-1188.
[15] Vaughan TM, McFarland DJ, Schalk G, Sarnacki WA, Robinson L, and Wolpaw JR. EEG-based brain-computer interface: development of a speller. Soc Neurosci Abstr 2001; 27: 167.
[16] Wolpaw JR, Birbaumer N, McFarland DJ, Pfurtscheller G, and Vaughan TM. Brain-computer interfaces for communication and control. Clinical Neurophysiology 2002; 113: 767-791.

NOTE ON CHAPTER FIVE

The material presented in chapter 5 has been accepted for publication in IEEE Transactions on Neural Systems and Rehabilitation Engineering, and is currently in press. The dissertation author was the primary researcher and was responsible for all aspects of the study. The co-author, Jaime Pineda, supervised the research that forms the basis for this chapter.

CHAPTER 6: INDEPENDENT COMPONENT ANALYSIS (ICA) AND ITS POTENTIAL VALUE IN A P300 BCI SYSTEM

6-1: Introduction

In any BCI, a user communicates by thinking specific cognemes. P300 BCIs utilize two cognemes: /attending to the flash/ or /ignoring the flash/. Each cogneme has a distinct EEG signature such that attended flashes tend to produce a large P300, as well as earlier ERP components, while ignored flashes are relatively flat. To be practical, BCIs must be able to distinguish these EEG signatures quickly and accurately. Unfortunately, classic approaches toward identifying salient elements in the EEG, such as the use of peak amplitude and latency, often miss important information. Independent component analysis (ICA) is a relatively new technique that may be useful to BCI researchers for two reasons. First, ICA can be used to remove unwanted artifacts (e.g., eye movements, alpha activity) and thus enhance the signal-to-noise ratio; second, it can decompose the EEG waveforms into independent components for further analysis. The result of applying ICA to EEG data in a BCI could be improved information throughput.

ICA was designed as an approach to the problem of blind source separation, a problem that arises in a myriad of fields. ICA assumes that the perceived signal at any recording site (such as a satellite, microphone, or electrode) consists of a combination of numerous overlapping sources. The locations of these sources are unknown, and the objective is to isolate the contribution of each of these independent sources based on the observed data at each site. The received signal at each site i, called x_i, consists of a linear combination of several sources, s_1 … s_n. No information is available about the location or nature of the sources, nor about the process by which the signals from each source are mixed to form the observed signal x via a mixing matrix called A; it is assumed only that the sources are combined linearly to produce the perceived signal, x = As.

Figure 6-1: Graphical representation of the blind source separation problem and ICA. The signals received by sensors (x) are a linear combination of the source signals (s) mixed by an unknown matrix A. ICA strives to find the inverse of matrix A, called W, which will allow the signal x to be translated back to the original source data.

ICA strives to reverse this process, identifying the contribution of the sources s from the perceived signal x. This is done by estimating the inverse of the mixing matrix A, called A^-1 or W, thereby allowing the original source signals to be reconstructed. Note that this technique does NOT solve the aforementioned “inverse problem” of determining the locations of the sources contributing to the signal.

The challenge facing the ICA algorithm is to determine the unmixing matrix W as accurately as possible. Several approaches to this problem have been proposed. One commonly used method is the information maximization approach (Bell and Sejnowski 1995), which was shown by several authors (Pearlmutter and Parra 1996; MacKay 1996; Cardoso 1997) to be identical to the previously proposed maximum likelihood estimation approach (Gaeta and Lacoume 1990; Pham 1992). This approach strives to maximize the mutual information that an output Y has about an input X. This is expressed mathematically as I(Y,X) = H(Y) - H(Y|X), where I refers to the shared information, H(Y) is the entropy of the output, and H(Y|X) represents the entropy of the output that did not stem from the input. Ideally, H(Y|X) is zero, as would be the case if no noise were present.

ICA is often compared to Principal Component Analysis (PCA), a simpler technique that remains of value in processing EEG. PCA strives to find orthogonal activation patterns, while ICA does not. It is often the case that researchers are interested in separating EEG patterns that are not very different (such as responses to similar shapes or tones, or minor changes in EEG across different recording sessions). PCA is not helpful in such cases, as the activations of interest are not orthogonal and are probably very similar. PCA also removes only second order dependencies; ICA aims to remove higher order dependencies as well.

ICA does have its limitations. Early ICA algorithms look for a linear transform giving full independence, but are not guaranteed to find it (if it even exists). Such an ICA transform is not even guaranteed to remove second order dependencies, whereas PCA is. However, more recent ICA approaches do remove at least second order dependencies.
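The blind source separation setup can be demonstrated on synthetic data in a few lines. In the sketch below, two known sources are mixed by a matrix A to give "recorded" signals x = As, and an ICA algorithm estimates the unmixing; FastICA (from scikit-learn) is used purely for convenience and is not the infomax/runica implementation discussed in this chapter.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Two synthetic sources mixed by an "unknown" matrix A; ICA recovers estimated
# sources (up to order and scale) and an estimate of the mixing matrix.

rng = np.random.default_rng(3)
t = np.linspace(0, 2, 512)
s = np.vstack([np.sin(2 * np.pi * 10 * t),                  # a 10 Hz "rhythm"
               np.sign(np.sin(2 * np.pi * 1.3 * t))])       # a slower square-wave "artifact"
s += 0.05 * rng.normal(size=s.shape)

A = np.array([[1.0, 0.6],    # mixing matrix (sources -> sensors), unknown to ICA
              [0.4, 1.0]])
x = A @ s                    # what the "electrodes" record

ica = FastICA(n_components=2, random_state=0)
recovered = ica.fit_transform(x.T).T        # estimated sources, shape (2, n_samples)
print("estimated mixing matrix:\n", ica.mixing_)
```

Comparing the recovered waveforms with s shows the separation, although the order and scaling of the components are arbitrary, a property of ICA that also matters when interpreting EEG decompositions.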

One fundamental assumption of ICA is that the brain sources are temporally independent and spatially overlapping. Makeig (1998) wrote:

Is the brain a collection of physically discrete neural networks which pass information to each other by (occasional) neural impulses, as we send letters or email to each other? Or is the brain a dynamically shifting collection of interpenetrating, distributed, and possibly transient neural networks that communicate via some form(s) of mass action? The first viewpoint is that of classical anatomy and physiology. The second is that of a slowly emerging dynamic systems perspective on neuroscience. ICA is a tool for discovering independent sources of spatial dependencies in multichannel data. It asks, and answers, a different problem than so-called "source localization." The two questions are complementary, hence the answers they produce may be complementary parts of "the whole story" of "how brains work." In truth, brain networks are most probably never wholly autonomous. They are neither physically wholly isolated from one another, nor do they act wholly independently! Attempts to solve the inverse problem (source localization) may assume the first (at some level). ICA assumes the second. Might decomposition algorithms be derived which mediate between these two extremes? Probably so, if some intermediate set of assumptions were developed!

This lack of agreement on how the brain operates poses a serious problem for any EEG pattern recognition technique. Another fundamental assumption necessary for ICA is that the mixing of different brain signals be linear. This is probably not the case; if nothing else, the skull itself provides considerable and uneven electrical resistance that can seriously and nonlinearly distort EEG data. ICA also has two problems that are magnified in small data sets. First, the number of independent components that ICA can isolate is at most equal to the number of sensors. This is not a problem for some recording experiments, in which at least 16 electrode sites are generally used. However, an objective of many BCIs is to minimize the number of electrodes required; many BCIs use only two electrodes, and since ICA can find at most one independent component per electrode, such a BCI would yield only two components. Second, in order for two basis vectors to be maximally independent (no shared information at all), they need to be of infinite length. When less data are available to ICA, the resulting components will have a significant amount of overlap.


6.2: Methods

Data collected from the second study of this dissertation were used as input for ICA decomposition. Those data were recorded from subjects viewing a matrix containing numerous digrams (letter pairs). Subjects were told to designate one digram as the target and count the number of times it flashed while ignoring other events. In each trial, subjects viewed approximately 360 flashes, about 30 of which illuminated the target (see Chapter 5 for a more detailed discussion of the methods used in the second study). Subjects participated in five trials, each of which had a different target digram.

ICA analysis was performed with Matlab 6p5 (Mathworks, Inc.) using EEGLab 4.08 (available from sccn.ucsd.edu/EEGLab). Each raw data file contained one trial. These files were loaded into EEGLab and epochs were created for target and nontarget flashes. Each epoch consisted of the period from 100 ms prior to 1 second following each flash, and the 100 ms prior to each flash was used to baseline that epoch. Next, all attended flashes from all five targets were grouped into one bin, and all ignored flashes were grouped into a second bin. This was done for all subjects, resulting in 24 bins (12 subjects with two bins each). The "runica" algorithm was then applied to each bin. This yielded 15 independent components for each bin, as 15 EEG sensors were used. These components are rank ordered according to how much variance they account for in the original signal.
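A minimal Matlab/EEGLab sketch of this pipeline is shown below. It assumes EEGLab is on the Matlab path and that each trial has been saved as an EEGLab dataset with 'target' and 'nontarget' event markers; the file name is hypothetical, and the function names follow current EEGLab releases, so they may differ slightly from those in version 4.08.

    % Sketch of the decomposition pipeline described above (EEGLab assumed).
    EEG = pop_loadset('filename', 'subjectNJ_trial1.set'); % one raw trial file (hypothetical name)
    EEG = pop_epoch(EEG, {'target'}, [-0.1 1.0]);  % 100 ms before to 1 s after each target flash
                                                   % (repeat with 'nontarget' for the ignored bin)
    EEG = pop_rmbase(EEG, [-100 0]);               % subtract the 100 ms pre-flash baseline
    EEG = pop_runica(EEG, 'icatype', 'runica');    % infomax ("runica") decomposition
    W = EEG.icaweights * EEG.icasphere;            % unmixing matrix (components x channels)
    A = pinv(W);                                   % mixing matrix: columns are component scalp maps

With 15 sensors this yields 15 components, which EEGLab orders by the amount of variance they account for.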

6.3: Results


Figures 6-2 and 6-3 show typical examples of component scalp projections and ERPs for the four components accounting for most of the variance in the data. Figure 6-2 reflects attended (target) trials, and Figure 6-3 shows nontarget (ignored) trials. The remaining data for all subjects are presented in Figures 6-6 through 6-21. The top image in these figures shows the scalp projections. This reflects the extent to which the ERP in the bottom image contributed to the signal seen at different sites. For example, the second component seen in target trials for subject NJ has a scalp projection indicating that it contributed most to attended ERPs at central and posterior sites. The ERP of this component, seen in the bottom image, indicates that the component made relatively little contribution during the first 200 ms following the flash, but then made a strong positive contribution for the following 400 ms, peaking at about 380 ms. Taken together, these two images demonstrate that the neural generators responsible for component 2 were most active in the time period and regions associated with a P300. In contrast, the ICA analysis for ignored trials shown in Figure 6-3 yields a component 2 that is clearly earlier in latency and has a frontal distribution. This is, therefore, most likely not a P300. On the other hand, component 3 shows the scalp distribution of a P300 at about the right latency but with much smaller magnitude.
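For concreteness, the sketch below shows how a component's scalp projection and ERP are obtained from the decomposition (a hedged example; the variable names continue the pipeline sketch in the Methods, and EEGLab's topoplot function is assumed):

    % Scalp map = a column of the mixing matrix; component ERP = that component's
    % activation averaged across epochs. The product A(ch,c) * compERP gives the
    % component's contribution to the average ERP at channel ch.
    W    = EEG.icaweights * EEG.icasphere;                  % unmixing matrix
    A    = pinv(W);                                         % columns are scalp maps
    acts = W * reshape(EEG.data, EEG.nbchan, []);           % component activations
    acts = reshape(acts, size(W,1), EEG.pnts, EEG.trials);  % comps x timepoints x epochs
    c    = 2;                                               % e.g., the P300-like component
    compERP = mean(squeeze(acts(c,:,:)), 2);                % component ERP
    figure; topoplot(A(:,c), EEG.chanlocs);                 % top image: scalp projection
    figure; plot(EEG.times, compERP);                       % bottom image: component ERP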



Figure 6-2: Component scalp projections and ERPs evoked by attended (target) flashes for subject NJ.


Figure 6-3: Component scalp projections and ERPs evoked by ignored (nontarget) flashes for subject NJ.

In some figures, a component map appears to be opposite in polarity to that component's ERP. For example, in Figure 6-4, component 1 appears as a predominantly negative (blue) P300 in the scalp maps, yet is positive in the component ERPs. This is because EEGLab 4.08 does not account for the polarity of each component when creating the scalp maps. Hence, the predominantly positive activity seen in component 1's ERPs is correct.
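This polarity ambiguity follows from the ICA model itself: flipping the sign of a component's scalp map together with the sign of its activation leaves the reconstructed data unchanged, so the sign assigned to either one alone is arbitrary. A toy numerical check in plain Matlab (synthetic values, not taken from the study's data):

    % x = A*s is unchanged if both the mixing matrix column and the corresponding
    % activation row are negated, so component polarity is not uniquely determined.
    A  = [1 0.5; 0.3 1; 0.8 0.2];    % 3 channels x 2 components (arbitrary values)
    s  = randn(2, 1000);             % component activation time courses
    A2 = A;  A2(:,1) = -A2(:,1);     % flip the sign of component 1's scalp map
    s2 = s;  s2(1,:) = -s2(1,:);     % and of component 1's activation
    max(abs(A*s - A2*s2), [], 2)     % zeros: the reconstructions are identical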


Figure 6-4: Component scalp projections and ERPs evoked by attended (target) flashes for subject CC.

Another type of figure presented here displays component properties (see Figure 6-5). These figures each contain three images. The top left image shows the map of the component's scalp projection, with the black dots indicating electrode sites. The top right image shows the voltage changes seen in the single trials comprising the average. Each horizontal line represents a single trial; positive voltages are shown in yellow and red, while negative voltages are shown in green and blue. The bottom graph shows a spectral decomposition of the component. It is clear from this figure that the P300 was apparent in nearly all of the single trials comprising this average.
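Figures of this kind can be reproduced with a single EEGLab call (a hedged sketch; the function name follows current EEGLab releases and may differ in version 4.08):

    % Plot one component's scalp map, single-trial ERP image, and activity spectrum;
    % the second argument of 0 requests a component rather than a channel.
    pop_prop(EEG, 0, 1);   % properties of component 1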

Figure 6-5: The component properties of component 1 in MJ's attended trials. P3-like activity is visible in most single trials during the period from about 250-500 ms following stimulus onset.

Another component that ranked very high in terms of variance accounted for in most subjects was a frontally distributed component, such as component 2 of both ignored and attended events in subject CC (Figures 6-8 and 6-9). This component typically appears in both attended and ignored trials. In some subjects it appears larger in ignored trials; in others, it appears larger in attended trials. It always peaks earlier in ignored trials. This component is probably associated with eyeblink, because it appears in a small minority of trials yet shows high amplitude when present (Figures 6-22 through 6-24). Thus, though it does vary with attention, this component would be of little value in a P300 BCI meant for individuals with limited motor control, and it should be removed from the data.

All component maps showed at least one component that appeared to originate from the left or right side of the scalp and then grow weaker as it traversed the head (e.g., components 8 and 11 of the attended component maps for BS). Figures 6-25 and 6-26 show the properties of these two components. They are most likely EMG artifact, since their spatial distribution is not consistent with any known brain rhythm and their temporal distribution shows erratic, high frequency activity that is consistent with EMG.

Each ICA map also contained at least one component reflecting lateralized mu (8-12 Hz) activity. The spatial component maps in Figures 6-6 through 6-21 indicate that this activity originates at central electrode sites to the left or right of the midline. This area roughly corresponds to the C3 or C4 electrode sites (for sources located to the left or right of center, respectively), which are the sites where mu activity is most clearly seen. In some subjects, mu activity from both hemispheres was apparent. Figures 6-27 through 6-30 show components 6 and 9 of subject BS's attended trials and components 6 and 8 of BS's ignored trials. Figures 6-31 through 6-34 show components 10 and 11 of subject JS's attended trials and components 6 and 9 of JS's ignored trials. The spectral breakdown seen in these eight figures is typical of mu activity: a strong peak around 12 Hz and a smaller peak around 24 Hz. Since mu activity tends to be strongest when the subject does not perform or imagine movements, and the task performed here did not involve movement, it is reasonable to expect moderate to strong mu activity. While the ERP images for some single trials appear to differ between target and nontarget mu, these differences are apparent in only a minority of single trials and thus would be of little value in determining whether a recently presented flash contained the target.

Finally, many subjects exhibited posterior alpha (8-12 Hz) activity. This component tends to be strongest over occipital sites and is located at or near the midline, as would be expected of occipital alpha activity. Some subjects had three distinct components that reflected alpha activity from left, midline, and right occipital areas, respectively. Figures 6-35 through 6-40 show properties of these posterior alpha components in attended and ignored trials for two subjects. In subject JS, attended trials contain more alpha activity than ignored trials; in subject PK, ignored trials contain more alpha activity. This could be used to discriminate attended from ignored events. In both subjects, ignored trials were relatively flat, while some attended trials showed changes of 3 or 4 µV from baseline. This observation could be of value in single trial categorization: if the posterior alpha component ever deviates from baseline by more than 3 µV, the evoking flash was likely a target flash. Although the differences in posterior alpha due to attention are not reliable across subjects, they do not need to be. Since a BCI using ICA as a preprocessing approach would presumably be trained on one subject's data, components must only be consistent within a subject, not across subjects, to be informative to a P300 BCI.
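A hedged sketch of this amplitude-threshold idea is given below. The component index and threshold are illustrative assumptions, not values established by this study, and acts is the component-activation array from the earlier sketch:

    % Label each single trial as a probable target if the posterior alpha
    % component ever deviates from its (zero-mean) baseline by more than a threshold.
    alphaComp = 13;                                   % index of a posterior alpha component
    thresh    = 3;                                    % deviation threshold (illustrative)
    alphaAct  = squeeze(acts(alphaComp, :, :));       % timepoints x single trials
    isTarget  = max(abs(alphaAct), [], 1) > thresh;   % one logical label per trial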

6.4: Discussion

ICA was successful in isolating EEG activity related to attention, EEG activity unrelated to attention, and non-EEG noise stemming from artifact. P300-like components and posterior alpha vary with attention. Lateralized mu activity is EEG activity that does not vary with attention. The components reflecting eyeblink and EMG are non-EEG sources of artifact. For the purposes of a P300 BCI, any activity that does not reliably vary between target and nontarget ERPs is of no value. Hence, ICA's ability to isolate components that vary with attention could be of significant value in a BCI.

Based on previous work, it was expected that the component varying most reliably and substantially with attention would be the P300. ICA identified a component in all subjects' attended trials with the spatial and temporal distribution characteristic of a P300. This was always a major component, often ranked first or second. Because of its distinctive spatial and temporal distribution, and the fact that it was always one of the top three components, this component could be easily recognized by an artificial pattern recognition system. Ignored trials also exhibited a component with a P300-like spatial distribution that accounted for much of the variance in the data. However, the associated component ERPs were much smaller in magnitude and peaked earlier than the attended P300 component. Hence, one straightforward approach to categorizing ICA data could involve finding the component with a P300-like map and then examining the latency and amplitude of its peak. If the peak is small and early, the flash was ignored; if large and late, it was attended.

One concern with ICA is that it requires a large number of electrodes to be maximally effective. While ICA was effective with the 15 sites used in this study, it is not clear whether it would function well with fewer electrodes. However, 15 electrodes are reasonable for a BCI. As sensor technologies develop and preparation becomes faster and easier, larger electrode montages may become more feasible, potentially enhancing the value of ICA preprocessing in P300 BCIs.

How could ICA actually work in a P300 BCI? The ICA unmixing matrix must first be trained. To explore the minimum number of trials needed to train ICA with 15-channel data, reduced raw data files containing fewer single trials were presented to the ICA algorithm for training. Thirty trials were usually enough for ICA training, and 50 trials were always sufficient. Fifty trials can be obtained in less than one minute, so training time should not be a problem for an ICA algorithm used in a P300 BCI. ICA must also be capable of categorizing new single trials based on its unmixing matrix. There is no reason why ICA would be unable to do this, and a follow-up study will explore the ideal parameters for single trial categorization. Finally, an AI system must recognize the different components and label them for a pattern recognition approach. This will also be explored in a follow-up study.

Another issue that needs to be explored is how ICA would function over extended periods, since users' EEGs should change with chronic use. Should the unmixing matrix be retrained after a specific number of trials? If so, how many? If the characteristics of these EEG changes over time are known, could this information be used to enhance ICA retraining? Can subjects be trained to produce EEG activity that can be more easily discriminated by ICA and subsequent pattern recognition systems? If so, how?

There are many different ICA training algorithms. This study utilized the most common one, runica, which implements the infomax approach. Other algorithms, such as extended infomax or fastICA, could perform better. It is also possible that additional preprocessing before or after ICA could be helpful. Similarly, it remains unclear which pattern classification approaches perform best with ICA-preprocessed data.

Though unanswered questions remain about the practical implementation of ICA in a P300 BCI, this study clearly shows that ICA could be a useful preprocessing methodology for P300 BCIs. By isolating components that vary with attention, it provides a cleaner signal to a pattern recognition system. This should improve accuracy, and may improve speed by enabling recognition based on fewer trials. Future work exploring the best parameters for both ICA and other facets of pattern classification software will likely result in further improvements to BCI performance.
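A hedged sketch of the single-trial categorization rule proposed above is shown below. The component index, analysis window, and thresholds are illustrative assumptions rather than values fitted in this study, and newEpoch, ampThresh, and latThresh are hypothetical names for quantities that would come from a training phase.

    % Project one new, baselined epoch (channels x timepoints) through the trained
    % unmixing matrix, then classify the flash from the P300-like component's peak.
    p3comp    = 2;                                     % P300-like component chosen from training data
    compTrace = W(p3comp, :) * newEpoch;               % that component's single-trial time course
    win       = EEG.times >= 250 & EEG.times <= 600;   % window where an attended P300 should peak
    tWin      = EEG.times(win);
    [pk, idx] = max(compTrace(win));
    attended  = (pk > ampThresh) && (tWin(idx) > latThresh);  % large and late peak -> target flash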


Figure 6-6: Component scalp maps and ERPs for subject BS. These show responses to attended (target) flashes.


Figure 6-7: Component scalp maps and ERPs for subject BS. These show responses to ignored (nontarget) flashes.


Figure 6-8: Component scalp maps and ERPs for subject CC. These show responses to attended (target) flashes.


Figure 6-9: Component scalp maps and ERPs for subject CC. These show responses to ignored (nontarget) flashes.


Figure 6-10: Component scalp maps and ERPs for subject EW. These show responses to attended (target) flashes.


Figure 6-11: Component scalp maps and ERPs for subject EW. These show responses to ignored (nontarget) flashes.


Figure 6-12: Component scalp maps for subject JC. These show responses to attended (target) flashes. (Component ERPs for subject JC’s target flashes are not available due to a technical problem.)


Figure 6-13: Component scalp maps and ERPs for subject JC. These show responses to ignored (nontarget) flashes.


Figure 6-14: Component scalp maps and ERPs for subject JS. These show responses to attended (target) flashes.


Figure 6-15: Component scalp maps and ERPs for subject JS. These show responses to ignored (nontarget) flashes.


Figure 6-16: Component scalp maps and ERPs for subject MJ. These show responses to attended (target) flashes.


Figure 6-17: Component scalp maps and ERPs for subject MJ. These show responses to ignored (nontarget) flashes.


Figure 6-18: Component scalp maps and ERPs for subject MT. These show responses to attended (target) flashes.


Figure 6-19: Component scalp maps and ERPs for subject MT. These show responses to ignored (nontarget) flashes.


Figure 6-20: Component scalp maps and ERPs for subject PK. These show responses to attended (target) flashes.


Figure 6-21: Component scalp maps and ERPs for subject PK. These show responses to ignored (nontarget) flashes.


Figure 6-22: The component properties of component 1 in NJ’s attended trials. Note that the component is active in a minority of single trials. This suggests that it is eyeblink.


Figure 6-23: The component properties of component 1 in JS’s ignored trials. Like the figure above, the component is active in a minority of single trials. This component probably also reflects eye activity.


Figure 6-24: The component properties of component 1 in JS's ignored trials. Like Figures 6-22 and 6-23, the component is active in a minority of single trials, suggesting that it reflects eye activity.


Figure 6-25: Component 8 from BS’s attended trials. This is EMG activity.


Figure 6-26: Component 11 from BS’s attended trials. This is EMG activity.


Figure 6-27: The component properties of component 6 in BS’s attended trials. This component reflects mu activity from the right hemisphere. The power spectrum peaks at about 12 Hz, consistent with mu activity. Another, smaller peak appears at about 24 Hz. This second peak is also typically seen with mu activity.


Figure 6-28: The component properties of component 9 in BS’s attended trials. This component reflects mu activity from the left hemisphere. The power spectrum peaks at about 12 Hz, consistent with mu activity. Another, smaller peak appears at about 24 Hz. This second peak is also typically seen with mu activity.


Figure 6-29: The component properties of component 6 in BS’s ignored trials, showing right mu activity. The power spectrum peaks at about 12 Hz, consistent with mu activity. Another, smaller peak appears at about 24 Hz, as is typically seen with mu activity.


Figure 6-30: The component properties of component 8 in BS’s ignored trials, showing left hemisphere mu activity. The power spectrum peaks at about 12 Hz, with another, smaller peak at about 24 Hz; both of these peaks are typically seen with mu activity.


Figure 6-31: The component properties of component 11 in JS’s attended trials, showing left hemisphere mu activity. The power spectrum peaks at about 12 Hz, with another, smaller peak at about 24 Hz; both of these peaks are typically seen with mu activity.


Figure 6-32: The component properties of component 10 in JS’s attended trials, showing right hemisphere mu activity. The power spectrum peaks at about 12 Hz, with another, smaller peak at about 24 Hz; both of these peaks are typically seen with mu activity.


Figure 6-33: The component properties of component 9 in JS’s ignored trials, showing left hemisphere mu activity. The power spectrum peaks at about 12 Hz, with another, smaller peak at about 24 Hz; both of these peaks are typically seen with mu activity.


Figure 6-34: The component properties of component 6 in JS’s ignored trials, showing right hemisphere mu activity. The power spectrum peaks at about 12 Hz, with another, smaller peak at about 24 Hz; both of these peaks are typically seen with mu activity.


Figure 6-35: Component 13 from JS’s attended trials, showing posterior alpha activity from the left occipital area.


Figure 6-36: Component 12 from JS’s attended trials, showing posterior alpha activity from the occipital midline area.


Figure 6-37: Component 14 from JS’s attended trials, showing posterior alpha activity from the right occipital area.


Figure 6-38: Component 11 from JS’s attended trials, showing posterior alpha activity from the occipital midline area.


Figure 6-39: Component 13 from PK’s attended trials, showing posterior alpha activity from the occipital midline area.


Figure 6-40: Component 11 from PK’s ignored trials, showing posterior alpha activity from the occipital midline area.

CHAPTER 7: CONCLUSIONS

Brain computer interface (BCI) systems provide an exciting and promising new venue for communication. If developed intelligently and used responsibly, BCIs could enable communication for severely disabled individuals and greatly facilitate it for the general populace. Communication with other people and with the growing number of electronic devices in modern life is an essential part of the human experience, and research directed toward improved BCIs is critical. Because BCIs are currently limited primarily by their speed and accuracy, research directed toward improving information throughput is extremely important. Such improvement can occur within three general avenues:

1) obtain more cognemes per unit of time; 2) require fewer cognemes per message; and 3) discriminate the EEGs evoked by each cogneme more rapidly and accurately.

This dissertation explored four manipulations relevant to BCI information throughput in three studies. The first study manipulated stimulus onset asynchronies (SOAs), thereby affecting the number of cognemes sent per minute. It also manipulated the flash patterns of stimulus presentation, thereby affecting the number of cognemes required to convey each element and hence a message. The second study manipulated set size, which affected the number of cognemes required to send a message. All three manipulations in the first two studies also affected EEG measures, as described below, which are relevant to the third avenue of improvement. The third study explored a preprocessing approach called independent component analysis (ICA) capable of highlighting the EEG differences associated with selective attention, which could enable more accurate discrimination of attended vs. ignored events.

Results from the first study show that the multiple flashes approach produces more distinct attend vs. ignore differences than the conventional single flash approach at lower probabilities, but less distinct differences at higher probabilities. Multiple flash ERPs also contain more substantive differences between early ERP components than the single flash approach. Since multiple flashes convey 267% more information than single flashes, the multiple flashes approach is preferable to the single flash approach overall. Because information throughput also depends on the pattern recognition system used, some approaches may be able to utilize the early component information seen in the multiple flashes condition better than others. Simple pattern recognition approaches that do not account for the temporal information present in the P300, such as those used in all P300 BCIs built to date, could not take maximum advantage of early component information. Attended ERPs, compared to ignored ERPs, also have a distinct spatial distribution and may show less alpha activity. Hence, another important implication of the first study is that sophisticated pattern recognition systems that take advantage of the unique information present in attended events could substantially improve BCI information throughput.

Another result of the first study is that some subjects found the multiple flashes condition difficult and unpleasant at the fast speed. This effect was not seen in individuals who played computer games, suggesting that training with fast displays can improve performance. This underscores the need to explore the effects of training and long-term use. All P300 BCI studies to date have examined subjects during their first hour of BCI use; this is analogous to drawing conclusions about typing based on subjects new to keyboards. Experienced subjects would probably be more comfortable with faster displays. Studies have also shown that P300 amplitude declines as people perform the same task repeatedly. It is not clear whether this would happen with chronic BCI use, in which subjects receive feedback and are trying to produce more discriminable P300s.

The first study also showed that subjects preferred grid elements located on the corners and sides rather than the center. This probably occurred because central targets are surrounded by nontargets, while peripheral targets are not. This suggests that future BCI designers should place commonly used elements at the sides and corners, and that flash patterns in which targets are flanked by few flashing nontargets may be preferable to the single and multiple flash approaches used in the first study. A follow-up study using splotches is currently being performed to explore this (see Figures 25 and 26 of Chapter 4).

The second study showed that the ERPs evoked by larger matrices exhibit more robust attend vs. ignore differences, for two reasons: ERPs evoked by targets in larger grids are more positive than attended ERPs in smaller grids, and ERPs evoked by nontargets in larger grids are more negative than nontarget ERPs in smaller grids. This difference is not dramatic, and subjects neither performed significantly better with, nor voiced a strong preference for, any grid size. Hence, while future BCI designers should generally utilize larger grids, other design factors or user preferences may outweigh the ERP differences seen across grid sizes.

The third study showed that ICA is an effective preprocessing technique for two reasons. First, ICA separated artifact from real data. Second, ICA isolated components that vary with attention from those that do not, thus providing a purer signal for a pattern classification system. ICA was effective in all subjects. A planned follow-up study will explore ICA in more depth and examine pattern classification systems based on ICA. Another issue with ICA requiring further attention is the integration of ICA data with later processes in a pattern recognition approach; AI software must be able to recognize which ICA components reflect the P300 and voluntary attention, and which do not.

The first two studies of this dissertation have significantly elucidated the relationship between display factors and EEG measures, performance, and subjective report. The third study demonstrated the value of ICA as a preprocessing technique for BCIs. These studies lay the groundwork for a faster, better BCI and have also created a large database of EEGs that can be used in future studies of optimal pattern classification approaches. The two most obvious next steps are to use these developments to build a P300 BCI and to use that BCI to identify which pattern recognition approaches function best in a BCI environment. Two new studies currently being designed will address these developments.

FUTURE WORK

Many future directions for improving information throughput in P300 BCIs also apply to other types of BCIs. A menuing system, for example, could allow for a larger vocabulary. Error correction mechanisms (based on spelling errors and/or error-related EEG activity) could reduce the need to issue corrections. Letter prediction software could reduce or eliminate the cognemic demands for predicted letters by either temporarily reducing the vocabulary to include only legal letters (thus reducing the number of events needed to choose from the smaller vocabulary) or filling in letters. Improved displays, including the use of different flash colors, could make it easier to recognize individual flashes at faster speeds; in a follow-up to the first study of this dissertation, letters were flashed yellow, red, purple, or orange instead of yellow only, and five of five pilot subjects reported that recognizing individual flashes at higher speeds was easier when multiple flash colors were used. One possibility already being explored by some labs is the development of a "combined" BCI, in which different types of BCIs utilizing different cognemes and resulting EEGs are combined. There is no reason why a P300 BCI could not be combined with, for example, a mu BCI. Such a system might theoretically outperform a P300 or mu BCI alone, but it is not yet known whether it would be overly taxing or unpleasant. A more extensive discussion of avenues toward improved BCI information throughput can be found in Chapter 3.

Although improving information throughput is of paramount importance in BCIs, there are many other facets of BCIs requiring improvement. Modern BCIs are not only slow, they are also expensive, somewhat clunky, distracting, and difficult to use. Preparation for a BCI session requires time and the use of unpopular gooey gel. The "interface" side of BCIs has not been well explored; most displays used in BCIs are simple and boring. Many laypeople today would not be inclined to use BCIs for other reasons, such as the need for unsightly gear on the scalp or concerns that BCIs could read information from the EEG that the user would rather keep private. Other means of improving BCIs include:

-- Improved sensors could provide a cleaner signal, thereby improving information throughput. They could also improve ease of use by reducing the number of electrodes required, reducing or eliminating preparation time, and eliminating the need for gel.

-- A better understanding of the factors that lead to differences both within and between subjects could result in improved pattern recognition approaches, and make it possible for users to easily find the type of BCI best for them. This could also lead to customized strategies and training, which could improve information throughput and ease of use as well as reduce training time.

-- There has been little attention to the best display parameters and computing environments for BCIs. BCIs with more engaging, informative, and easy to use displays could be used for longer periods with less discomfort. Another potential contribution of interface designers concerns the interaction between BCIs and other interfaces. BCIs could well be used in concert with an eye tracker, keyboard, mouse, voice recognition system, or even another BCI. How could two or more interfaces best be combined, both physically and operationally? How could users best be trained in these complex interfaces?

-- Even less attention has been paid to the fact that BCIs enable a new medium of communication and require a new language. What cognemes are best for BCIs? Are different users able to generate different cognemes more easily than others? Does acquisition of a BCI language exhibit parallels to other types of language learning? Do lesions to language areas impair BCI performance? What differences and similarities exist between cognemes and phonemes or graphemes? Answers to these questions could improve BCI operation and training, and are of considerable theoretical interest.

-- Advances in electronics are producing smaller, cheaper, more powerful chips. These not only result in smaller, cheaper, more powerful BCIs, but also make BCIs easier to conceal or integrate with clothing or other electronic devices. This may make BCIs more fashionable and readily accepted, in addition to allowing increasingly complex pattern recognition approaches to be used on smaller chips.

-- As BCIs become more powerful, flexible, easy to use, and common, they will inevitably draw increased media attention. Unless this media attention is very negative, publicity will further increase the pace of BCI research by increasing the demand for BCIs and encouraging more individuals to devote themselves to BCI improvement. This underscores the importance of responsible interaction with the media; in a worst case scenario, BCIs may be viewed in the same light as cloning, another therapeutic technique with considerable promise that raises much larger ethical questions. While a partial ban on BCI research is unlikely, the strong negative stigma surrounding cloning and its research may affect funding and researchers' enthusiasm for the field.

This long list of avenues toward BCI improvement, plus the ones described in Chapter 3, may seem to create an insurmountably large challenge. Fortunately, BCI research is an enterprise of waxing pith and moment. As of 1995, only four published articles described working BCIs; there are now dozens of BCI articles. Articles describing working BCIs have also begun to appear in the popular media in the last few years; magazines such as Scientific American and The New Yorker have portrayed BCIs in a positive light, and TV shows such as Saturday Night Live and ER have described BCI developments. While BCIs are not new in science fiction, they received no attention from the mainstream media until recently. The first BCI conference was held in 1999, and another in 2002. These highly successful conferences have produced numerous collaborations and new directions that are already resulting in new developments and publications. The introduction of BCI2000, a universal platform for BCI development available free for academic research, makes BCI development much easier and is already being used by several labs. All of these factors synergize with each other, drawing ever more attention, resources, and researchers toward BCI development.

BCI development involves many disciplines, such as engineering, cognitive neuroscience, linguistics, mathematics, clinical studies, signal processing, and interface design. It is striking that nearly all avenues toward building a better BCI involve facets of cognitive science, as is apparent from the discussion above and from the discussions in Chapters 2 and 3 of avenues toward improving BCI information throughput. This exemplifies the need for contributions from individuals with a variety of different backgrounds. While individuals from all relevant disciplines are encouraged to design and research better BCIs, those with backgrounds in cognitive science may be especially empowered to contribute. Hopefully, others in this department, as well as in this discipline and related disciplines, will find BCI research as important and rewarding as the author does.

APPENDIX: THE EFFECTS OF SELF-MOVEMENT, OBSERVATION, AND IMAGINATION ON MU RHYTHMS AND READINESS POTENTIALS (RPs): TOWARDS A BRAIN-COMPUTER INTERFACE (BCI)

J. A. Pineda, B.Z. Allison, and A. Vankov

Department of Cognitive Science University of California, San Diego La Jolla, CA 92093

Please send correspondence to: Jaime A. Pineda, Ph.D. Department of Cognitive Science 0515 University of California, San Diego La Jolla, CA 92093 (858) 534-7087 (office) [email protected]


ABSTRACT

Current movement-based BCIs utilize spontaneous EEG rhythms associated with movement, such as the mu rhythm, or responses time-locked to movements that are averaged across multiple trials, such as the Readiness Potential (RP), as control signals. In one study, we report that the mu rhythm is not only modulated by the expression of self-generated movement but also by the observation and imagination of movement. In another study, we show that simultaneous self-generated multiple limb movements exhibit properties distinct from those of single limb movements. Identification and classification of these signals with pattern recognition techniques provides the basis for the development of a practical BCI.

INTRODUCTION

The concept of a direct interface between the human brain and a sophisticated artificial system, such as a computer, is not a new one. In recent years, there have been advances in a number of fields that make the design and development of a practical brain computer interface (BCI) possible. Such a BCI would be capable of quickly and reliably extracting meaningful information from the human electroencephalogram (EEG) or other recordable electrical potentials, such as the electromyogram (EMG), electrocardiogram (EKG), etc. Over the past decade, several working BCI systems have been described in the literature [2, 3, 6, 7, 8]. These systems use a variety of data collection mechanisms, pattern recognition approaches, and interfaces, and require different types of cognitive activity on the part of the user.

One type of BCI that has been examined extensively derives information from a user's movements or the imagination of movement. Many of these movement-based BCIs recognize changes in the human mu rhythm, which is an EEG oscillation recorded in the 8-13 Hz range from the central region of the scalp overlying the sensorimotor cortices [4]. This rhythm is large when a subject is at rest, and is known to be blocked or attenuated by self-generated movement. Indeed, the mu wave is hypothesized to represent an "idling" rhythm of motor cortex that is interrupted when movement occurs. The free-running EEG shows characteristic changes in mu activity, which are unique for the movement of different limbs [9]. These findings have been and will continue to be useful in the construction of BCI systems.

The performance of a movement is also generally accompanied by a readiness potential (RP, also called Bereitschaftspotential or BP), which is most prevalent over cortical motor areas. A similar response can be elicited if the movement is imagined. The RP is a response time-locked to the movement event, or event related potential (ERP), that is extracted from the ongoing EEG using signal averaging techniques across a number of trials.

The primary goal of the two studies we report was to characterize mu and RP signals in simple, straightforward tasks. The recognition and discrimination of these signals could then provide a basis for the development of a practical BCI, one that would be useful to both normal and disabled individuals.

STUDY 1


In this study, we show that the mu rhythm is significantly attenuated by self-generated movement. Furthermore, some attenuation occurs when a subject observes the movement or imagines making the same, self-generated movement. According to Rizzolatti and colleagues, the responsiveness of the mu wave to visual input may be the human electrophysiologic analog of a population of neurons in area F5 of the monkey premotor cortex [1, 5]. These cells respond both when the monkey performs an action and when the monkey observes a similar action made by another monkey or by an experimenter. Other studies have reported that mu-like waves are blocked by thinking about moving [10]. The blocking of the mu rhythm by visual and imagery input may have implications for understanding movement-related responses and for the rehabilitation of movement-related neurological conditions.

METHODS

Subjects in this study were 17 healthy volunteers (10 men, 7 women, ranging in age from 19 to 58 years, with a mean of 27.7 years). Most subjects were students or employees at the University of California, San Diego (UCSD) and were naive to the purposes of the experiment. Only 10 subjects were used for statistical analysis because of problems with noise, such as movement artifact or excessive blinking. All subjects signed a consent form that was approved by the UCSD Human Subjects IRB committee.

EEG signals were recorded from 6 sites on an electrode cap placed over frontal (F3, F4), central (C3, C4), and occipital (O1, O2) areas according to the standard 10-20 International Electrode Placement System. Blinks and eye movements were monitored with an electrode in the bony orbit dorsolateral to the right eye. Trials contaminated with eye movement artifact were rejected and not included in the averages. EEG was amplified by a Grass model 7D polygraph using 7P5B pre-amplifiers with a bandpass of 1-35 Hz. For computerized data collection and analysis, the ADAPT (© A. Vankov, 1997) scientific software was used. EEG was digitized on-line for two minutes during each condition at a sampling rate of 256 Hz. All electrode sites showed impedance of less than 5 kOhms.

Subjects participated in four conditions: 1) rest, in which subjects sat in a comfortable chair inside an acoustic chamber with no particular task required; 2) self-generated movement, in which subjects were asked to oppose the thumb to the middle fingers of the right hand (making a "duck" movement); 3) observation, in which subjects watched a confederate of the experimenter perform the "duck" movement; and 4) imagination, in which subjects were instructed to imagine performing the self-generated "duck" movement without actually doing it. The confederate faced the subject, who was seated approximately four feet away, throughout all conditions of the experiment. The power spectrum was calculated for each second of the EEG, and mean power within the mu range (8-13 Hz) was calculated for each condition over the two minutes.
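A hedged Matlab sketch of this spectral measure is shown below (it is not the original ADAPT code; the data variable is a placeholder, and pwelch from the Signal Processing Toolbox is assumed):

    % Mean 8-13 Hz (mu band) power from one-second segments of a two-minute
    % recording sampled at 256 Hz, averaged over the whole condition.
    fs   = 256;                      % sampling rate (Hz)
    eeg  = randn(1, fs * 120);       % placeholder for one channel, two minutes of EEG
    nseg = floor(length(eeg) / fs);  % number of one-second segments
    muPower = zeros(1, nseg);
    for k = 1:nseg
        seg = eeg((k-1)*fs + 1 : k*fs);            % one second of data
        [pxx, f] = pwelch(seg, fs, [], [], fs);    % power spectral density of that second
        muPower(k) = mean(pxx(f >= 8 & f <= 13));  % mean power in the mu band
    end
    meanMuPower = mean(muPower);     % condition average over the two minutes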

RESULTS

The data were analyzed using a repeated measures analysis of variance (ANOVA) with factors of condition (4) and electrode site (7). During the rest condition, subjects exhibited significant power in the 8-13 Hz frequency range. This rhythm showed statistically significant changes during the various conditions (F(3,27) = 4.98, P