{"title": "Methods Towards Invasive Human Brain Computer Interfaces", "book": "Advances in Neural Information Processing Systems", "page_first": 737, "page_last": 744, "abstract": null, "full_text": "             Methods Towards Invasive Human\n                     Brain Computer Interfaces\n\n\n              Thomas Navin Lal1, Thilo Hinterberger2, Guido Widman3,\n                Michael Schroder4, Jeremy Hill1, Wolfgang Rosenstiel4,\n          Christian E. Elger3, Bernhard Scholkopf1 and Niels Birbaumer2,5\n\n         1 Max-Planck-Institute for Biological Cybernetics, Tubingen, Germany\n                       {navin,jez,bs}@tuebingen.mpg.de\n              2 Eberhard Karls University, Dept. of Medical Psychology and\n                       Behavioral Neurobiology, Tubingen, Germany\n     {thilo.hinterberger,niels.birbaumer}@uni-tuebingen.de\n            3 University of Bonn, Department of Epileptology, Bonn, Germany\n           {guido.widman,christian.elger}@ukb.uni-bonn.de\n         4 Eberhard Karls University, Dept. of Computer Engineering, Tubingen,\n   Germany {schroedm,rosenstiel}@informatik.uni-tuebingen.de\n             5 Center for Cognitive Neuroscience, University of Trento, Italy\n\n\n                                         Abstract\n\n         During the last ten years there has been growing interest in the develop-\n         ment of Brain Computer Interfaces (BCIs). The field has mainly been\n         driven by the needs of completely paralyzed patients to communicate.\n         With a few exceptions, most human BCIs are based on extracranial elec-\n         troencephalography (EEG). However, reported bit rates are still low. One\n         reason for this is the low signal-to-noise ratio of the EEG [16]. We are\n         currently investigating if BCIs based on electrocorticography (ECoG) are\n         a viable alternative. 
In this paper we present the method and examples\n         of intracranial EEG recordings of three epilepsy patients with electrode\n         grids placed on the motor cortex. The patients were asked to repeatedly\n         imagine movements of two kinds, e.g., tongue or finger movements.\n         We analyze the classifiability of the data using Support Vector Machines\n         (SVMs) [18, 21] and Recursive Channel Elimination (RCE) [11].\n\n\n1 Introduction\n\nCompletely paralyzed patients cannot communicate despite intact cognitive functions. The\ndisease Amyotrophic Lateral Sclerosis (ALS), for example, leads to complete paralysis of\nthe voluntary muscular system caused by the degeneration of the motor neurons. Birbaumer\net al. [1, 9] developed a Brain Computer Interface (BCI), called the Thought Translation\nDevice (TTD), which is used by several paralyzed patients. In order to use the interface,\npatients have to learn to voluntarily regulate their Slow Cortical Potentials (SCP). The system\nthen allows its users to write text on the screen of a computer or to surf the web. Although it\npresents a major breakthrough, the system has two disadvantages. Not all patients manage\nto control their SCP. Furthermore, the bit rate is quite low: a well-trained user requires\nabout 30 seconds to write one character.\n\nFigure 1: The left picture schematically shows the position of the 8x8 electrode grid of\npatient II. It was placed on the right hemisphere. As shown in the right picture, the electrodes\nare connected to the amplifier via cables that are passed through the skull.\n\nRecently there has been increasing interest in EEG-based BCIs in the machine learning\ncommunity. In contrast to the TTD, in many BCI-systems the computer learns rather than\nthe system's user [2, 5, 11]. Most such BCIs require a data collection phase during which\nthe subject repeatedly produces brain states of clearly separable locations. 
Machine learning\ntechniques like Support Vector Machines or Fisher Discriminant are applied to the data\nto derive a classifying function. This function can be used in online applications to identify\nthe different brain states produced by the subject.\nThe majority of BCIs are based on extracranial EEG-recordings during imagined limb\nmovements. We restrict ourselves to mentioning just a few publications [14, 15, 17, 22].\nMovement-related cortical potentials in humans on the basis of electrocorticographical data\nhave also been studied, e.g. by [20]. Very recently the first work describing BCIs based on\nelectrocorticographic recordings was published [6, 13]. Successful approaches have been\ndeveloped using BCIs based on single unit, multiunit or field potential recordings of\nprimates. Serruya et al. taught monkeys to control a cursor on the basis of potentials from 7-30\nmotor cortex neurons [19]. The BCI developed by [3] enables monkeys to reach and grasp\nusing a robot arm. Their system is based on recordings from frontoparietal cell ensembles.\nDriven by the success of BCIs for primates based on single unit or multiunit recordings,\nwe are currently developing a BCI-system that is based on ECoG recordings, as described\nin the present paper.\n\n2 Electrocorticography and Epilepsy\n\nAll patients presented suffer from focal epilepsy. The epileptic focus - the part of the\nbrain which is responsible for the seizures - is removed by resection. Prior to surgery, the\nepileptic focus has to be localized. In some complicated cases, this must be done by placing\nelectrodes onto the surface of the cortex as well as into deeper regions of the brain. The\nskull over the region of interest is removed, the electrodes are positioned and the incision is\nsutured. The electrodes are connected to a recording device via cables (cf. Figure 1). 
Over\na period of 5 to 14 days, ECoG is continuously recorded until the patient has had enough\nseizures to precisely localize the focus [10]. Prior to surgery, the parts of the cortex that are\ncovered by the electrodes are identified by electric stimulation of the electrodes.\nIn the current setup, the patients keep the electrode implants for one to two weeks. After\nthe implantation surgery, several days of recovery and follow-up examinations are needed.\nDue to the tight time constraints, it is therefore not possible to run long experiments.\nFurthermore, most of the patients cannot concentrate for a long period of time. Therefore only\na small amount of data could be collected.\n\nTable 1: Positions of implanted electrodes. All three patients had an electrode grid\nimplanted that partly covered the right or the left motor cortex.\n\n  patient   implanted electrodes           task                             trials\n\n  I         64-grid right hemisphere,      left vs. right hand              200\n            two 4-strip interhemisphere\n  II        64-grid right hemisphere       little left finger vs. tongue    150\n  III       20-grid central,               little right finger vs. tongue   100\n            four 16-strips frontal\n\n\n3 Experimental Situation and Data Acquisition\n\nThe experiments were performed in the Department of Epileptology of the University of\nBonn. We recorded ECoG data from three epileptic patients with a sampling rate of\n1000 Hz.\nThe electrode grids were placed on the cortex under the dura mater and covered the primary\nmotor and premotor area as well as the fronto-temporal region either of the right or\nleft hemisphere. The grid-sizes ranged from 20 to 64 electrodes. Furthermore, two of the\npatients had additional electrodes implanted on other parts of the cortex (cf. Table 1). 
The\nimagery tasks were chosen such that the involved parts of the brain\n\n     - were covered by the electrode grid, and\n\n     - were represented in spatially separate areas of the primary motor cortex.\n\nThe expected well-localized signal in motor-related tasks suggested discrimination tasks\nusing imagination of hand, little finger, or tongue movements.\nThe patients were seated in a bed facing a monitor and were asked to repeatedly imagine\ntwo different movements. At the beginning of each trial, a small fixation cross was displayed\nin the center of the screen. The 4-second imagination phase started with a cue that\nwas presented in the form of a picture showing either a tongue or a little finger for patients\nII and III. The cue for patient I was an arrow pointing left or right. There was a short break\nbetween the trials. The images which were used as a cue are shown in Figure 2.\n\n4 Preprocessing\n\nStarting half a second after the visualization of the task-cue, we extracted a window of\nlength 1.5 seconds from the data of each electrode. For every trial and every electrode we\nthus obtained an EEG sequence that consisted of 1500 samples. The linear trend was removed\nfrom every sequence. Following [8, 11, 15], we fitted a forward-backward autoregressive\nmodel of order three to each sequence. The concatenated model parameters of the channels\ntogether with the descriptor of the imagined task (i.e. +1, -1) form one training point. For\na given number n of EEG channels, a training point (x, y) is therefore a point in\nR^{3n} \u00d7 {-1, +1}.\n\n\n5 Channel Selection\n\nThe number of available training points is relatively small compared to the dimensionality\nof the data. The data of patient III, for example, consists of only 100 training points of\n\nFigure 2: The patients were asked to repeatedly imagine two different movements that are\nrepresented separately in the primary motor cortex, e.g. tongue and little finger movements. 
This\nfigure shows two stimuli that were used as a cue for imagery. The trial structure is shown\non the right. The imagination phase lasted four seconds. We extracted segments of 1.5\nseconds from the ECoG recordings for the analysis.\n\n\ndimension 252. This is a typical setting in which feature selection methods can improve\nclassification accuracy.\nLal et al. [11] recently introduced a feature selection method for the special case of EEG\ndata. Their method is based on Recursive Feature Elimination (RFE) [7]. RFE is a backward\nfeature selection method. Starting with the full data set, features are iteratively removed\nfrom the data until a stopping criterion is met. In each iteration a Support Vector\nMachine (SVM) is trained and its weight vector is analyzed. The feature that corresponds\nto the smallest weight vector entry is removed.\nRecursive Channel Elimination (RCE) [11] treats features that belong to the data of a channel\nin a consistent way. As in RFE, in every iteration one SVM is trained. The evaluation\ncriterion that determines which of the remaining channels will be removed is the mean of the\nweight vector entries that correspond to a channel's features. All features of the channel\nwith the smallest mean value are removed from the data. The output of RCE is a list of\nranked channels.\n\n6 Data Analysis\n\nTo begin with, we are interested in how well SVMs can learn from small ECoG data sets.\nFurthermore, we would like to understand how localized the classification-relevant information\nis, i.e. how many recording positions are necessary to obtain high classification\naccuracy. We compare how well SVMs can generalize given the data of different subsets\nof ECoG-channels:\n\n     (i) the complete data, i.e. all channels\n    (ii) the subset of channels suggested by RCE. 
In this setting we use the list of ranked\n         channels from RCE in the following way: For every l in the range of one to the\n         total number of channels, we calculate a 10-fold cross-validation error on the data\n         of the l best-ranked channels. We use the subset of channels which leads to the\n         lowest error estimate.\n    (iii) the two best-ranked channels by RCE. The underlying assumption used here is\n         that the classification-relevant information is extremely localized and that two\n         correctly chosen channels contain sufficient information for classification purposes.\n    (iv) two channels drawn at random.\n\nThroughout the paper we use linear SVMs. For regularization purposes we use a ridge on\nthe kernel matrix, which corresponds to a 2-norm penalty on the slack variables [4].\n\n[Figure 3 plot: ECoG traces of channels C1 to C4; vertical axis in \u00b5V, horizontal axis time in ms (0 to 3500).]\n\nFigure 3: This plot shows ECoG recordings from 4 channels while the patient was imagining\nmovements. The distance between two horizontal lines denotes 100 \u00b5V. The amplitude of\nthe recordings ranges roughly from -100 \u00b5V to +100 \u00b5V, which is on the order of five to\nten times the amplitude measured with extracranial EEG.\n\n\nTo evaluate the classification performance of an SVM that is trained on a specific subset\nof channels, we calculate its prediction error on a separate test set. We use a\ndouble-cross-validation scheme - the following procedure is repeated 50 times:\nWe randomly split the data into a training set (80%) and a test set (20%). 
Via 10-fold\ncross-validation on the training set we estimate all parameters for the different considered\nsubsets (i)-(iv):\n\n     (i) The ridge is estimated.\n    (ii) On the basis of the training set RCE suggests a subset of channels. We restrict the\n         training set as well as the test set to these channels. A ridge-value is then estimated\n         from the restricted training set.\n    (iii) We restrict the training set and the test set to the 2 best ranked channels by RCE.\n         The ridge is then estimated on the restricted training set.\n    (iv) The ridge is estimated.\n\nWe then train an SVM on the (restricted) training set using the estimated ridge. The trained\nmodel is tested on the (restricted) test set. For (i)-(iv) we obtain 50 test error estimates from\nthe 50 repetitions for each patient. Table 2 summarizes the results.\n\n7 Results\n\nThe results in Table 2 show that the generalization ability can be increased significantly by\nRCE. For patient I the error decreases from 38% to 24% when using the channel subsets\nsuggested by RCE. On average RCE selects channel subsets of size 5.8. For patient II the\nnumber of channels is reduced to one third but the channel selection process does not yield\nan increased accuracy. The error of 40% can be reduced to 23% for patient III using on\naverage 5 channels selected by RCE.\nFor patients I and III the choice of the best 2 ranked channels leads to a much lower error\nas well. 
The direct comparison of the results using the two best ranked channels to two\nrandomly chosen channels shows how well the RCE ranking method works: For patient\nIII the error drops from chance level for two random channels to 18% using the two\nbest-ranked channels.\nThe reason why there is such a big difference in performance for patient III when comparing\n(i) and (iii) might be that out of the 84 electrodes, only 20 are located over or close to\nthe motor cortex. RCE successfully identifies the important electrodes.\n\nTable 2: Classification Results. We compare the classification accuracy of SVMs trained\non the data of different channel subsets: (i) all ECoG-channels, (ii) the subset determined\nby Recursive Channel Elimination (RCE), (iii) the subset consisting of the two best ranked\nchannels by RCE and (iv) two randomly drawn channels. The mean errors of 50 repetitions\nare given along with the standard deviations. The test error can be reduced significantly by\nRCE for two of the three patients. Using the two best ranked channels by RCE also yields\ngood results for two patients. SVMs trained on two random channels show performance\nbetter than chance only for patient II.\n\n        all channels (i)              RCE cross-val. (ii)           RCE top 2 (iii)   random 2 (iv)\n pat    #channels   error             #channels   error             error             error\n\n  I     74          0.382 \u00b1 0.071     5.8         0.243 \u00b1 0.063     0.244 \u00b1 0.078     chance level\n  II    64          0.257 \u00b1 0.076     21.5        0.268 \u00b1 0.080     0.309 \u00b1 0.086     0.419 \u00b1 0.123\n III    84          0.4 \u00b1 0.1         5.0         0.233 \u00b1 0.13      0.175 \u00b1 0.078     chance level\n\n\nIn contrast to patient III, the electrodes of patient II are all more or less located close to\nthe motor cortex. This explains why data from two randomly drawn channels can yield\na classification rate better than chance. 
Furthermore, patient II had the fewest electrodes\nimplanted and thus the chance of randomly choosing an electrode close to an important\nlocation is higher than for the other two patients.\n\n8 Discussion\n\nWe recorded ECoG-data from three epilepsy patients during a motor imagery experiment.\nAlthough only few data were collected, the following conclusions can be drawn:\n\n     - The data of all three patients is reasonably well classifiable. The error rates range\n       from 17.5% to 23.3%. This is still high compared to the best error rates from BCIs\n       based on extracranial EEG, which are as low as 10% (e.g. [12]). Please note that\n       we used only 1.5 seconds of data from each trial and that very few training points\n       (100-200) were available. Furthermore, extracranial EEG has been studied and\n       developed for a number of years.\n     - Recursive Channel Elimination (RCE) shows very good performance. RCE successfully\n       identifies subsets of ECoG-channels that lead to good classification performance.\n       On average, RCE leads to a significantly improved classification rate\n       compared to a classifier that is based on the data of all available channels.\n     - Poor classification rates using two randomly drawn channels and high classification\n       rates using the two best-ranked channels by RCE suggest that classification-relevant\n       information is focused on small parts of the cortex and depends on the\n       location of the physiological function.\n     - The best ranked RCE-channels correspond well with the results from the electric\n       stimulation (cf. Figure 4).\n\n\n9 Ongoing Work and Further Research\n\nAlthough our preliminary results indicate that invasive Brain Computer Interfaces may\nbe feasible, a number of questions need to be investigated in further experiments. 
For\ninstance, it is still an open question whether the patients are able to adjust to a trained\nclassifier and whether the classifying function can be transferred from session to session.\nMoreover, experiments that are based on tasks different from motor imagery need to\nbe implemented and tested. It is quite conceivable that the tasks that have been found to\nwork well for extracranial EEG are not ideal for ECoG. Likewise, it is unclear whether our\npreprocessing and machine learning methods, originally developed for extracranial EEG\ndata, are well adapted to the different type of data that ECoG delivers.\n\nFigure 4: Electric stimulation of the implanted electrodes helps to identify the parts of the\ncortex that are covered by the electrode grid. This information is necessary for the surgery.\nThe red (solid) dots on the left picture mark the motor cortex of patient II as identified\nby the electric stimulation method. The positions marked with yellow crosses correspond\nto the epileptic focus. The red points on the right image are the best ranked channels by\nRecursive Channel Elimination (RCE). The RCE-channels correspond well to the results\nfrom the electric stimulation diagnosis.\n\nAcknowledgements\n\nThis work was supported in part by the Deutsche Forschungsgemeinschaft (SFB 550, B5\nand grant RO 1030/12), by the National Institutes of Health (D.31.03765.2), and by the\nIST Programme of the European Community, under the PASCAL Network of Excellence,\nIST-2002-506778. T.N.L. was supported by a grant from the Studienstiftung des deutschen\nVolkes. Special thanks go to Theresa Cooke.\n\nReferences\n\n [1] N. Birbaumer, N. Ghanayim, T. Hinterberger, I. Iversen, B. Kotchoubey, A. K\u00fcbler,\n     J. Perelmouter, E. Taub, and H. Flor. A spelling device for the paralysed. Nature,\n     398:297-298, 1999.\n [2] B. Blankertz, G. Curio, and K. 
M\u00fcller. Classifying single trial EEG: Towards brain\n     computer interfacing. In T.K. Leen, T.G. Dietterich, and V. Tresp, editors, Advances\n     in Neural Information Processing Systems, volume 14, Cambridge, MA, USA, 2001.\n     MIT Press.\n [3] J.M. Carmena, M.A. Lebedev, R.E. Crist, J.E. O'Doherty, D.M. Santucci, D. Dimitrov,\n     P.G. Patil, C.S. Henriquez, and M.A. Nicolelis. Learning to control a brain-machine\n     interface for reaching and grasping by primates. PLoS Biology, 1(2), 2003.\n [4] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20:273-297,\n     1995.\n [5] J. del R. Mill\u00e1n, F. Renkens, J. Mouri\u00f1o, and W. Gerstner. Noninvasive brain-actuated\n     control of a mobile robot by human EEG. IEEE Transactions on Biomedical Engineering.\n     Special Issue on Brain-Computer Interfaces, 51(6):1026-1033, June 2004.\n [6] B. Graimann, J. E. Huggins, S. P. Levine, and G. Pfurtscheller. Towards a direct brain\n     interface based on human subdural recordings and wavelet packet analysis.\n     IEEE Transactions on Biomedical Engineering, 51(6):954-962, 2004.\n [7] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik. Gene selection for cancer\n     classification using support vector machines. Journal of Machine Learning Research,\n     3:1439-1461, March 2003.\n [8] S. Haykin. Adaptive Filter Theory. Prentice-Hall International, Inc., Upper Saddle\n     River, NJ, USA, 1996.\n [9] T. Hinterberger, J. Kaiser, A. K\u00fcbler, N. Neumann, and N. Birbaumer. The Thought\n     Translation Device and its Applications to the Completely Paralyzed. In Diebner,\n     Druckrey, and Weibel, editors, Sciences of the Interfaces. Genista-Verlag, T\u00fcbingen,\n     2001.\n[10] J. Engel Jr. Presurgical evaluation protocols. In Surgical Treatment of the Epilepsies,\n     pages 740-742. Raven Press Ltd., New York, 2nd edition, 1993.\n[11] T.N. Lal, M. Schr\u00f6der, T. Hinterberger, J. Weston, M. Bogdan, N. Birbaumer, and\n     B. Sch\u00f6lkopf. 
Support Vector Channel Selection in BCI. IEEE Transactions on\n     Biomedical Engineering. Special Issue on Brain-Computer Interfaces,\n     51(6):1003-1010, June 2004.\n[12] S. Lemm, C. Sch\u00e4fer, and G. Curio. BCI Competition 2003 - Data Set III:\n     Probabilistic Modeling of Sensorimotor mu-Rhythms for Classification of Imaginary Hand\n     Movements. IEEE Transactions on Biomedical Engineering. Special Issue on\n     Brain-Computer Interfaces, 51(6):1077-1080, June 2004.\n[13] E. C. Leuthardt, G. Schalk, J. R. Wolpaw, J. G. Ojemann, and D. W. Moran. A\n     brain-computer interface using electrocorticographic signals in humans. Journal of\n     Neural Engineering, 1:63-71, 2004.\n[14] D.J. McFarland, L.M. McCane, S.V. David, and J.R. Wolpaw. Spatial filter selection\n     for EEG-based communication. Electroencephalography and Clinical\n     Neurophysiology, 103:386-394, 1997.\n[15] G. Pfurtscheller, C. Neuper, A. Schl\u00f6gl, and K. Lugger. Separability of EEG\n     signals recorded during right and left motor imagery using adaptive autoregressive\n     parameters. IEEE Transactions on Rehabilitation Engineering, 6(3):316-325, 1998.\n[16] J. Raethjen, M. Lindemann, M. D\u00fcmpelmann, R. Wenzelburger, H. Stolze, G. Pfister,\n     C. E. Elger, J. Timmer, and G. Deuschl. Corticomuscular coherence in the 6-15 Hz\n     band: is the cortex involved in the generation of physiologic tremor? Experimental\n     Brain Research, 142:32-40, 2002.\n[17] H. Ramoser, J. M\u00fcller-Gerking, and G. Pfurtscheller. Optimal spatial filtering of\n     single trial EEG during imagined hand movement. IEEE Transactions on Rehabilitation\n     Engineering, 8(4):441-446, 2000.\n[18] B. Sch\u00f6lkopf and A. Smola. Learning with Kernels. MIT Press, Cambridge, USA,\n     2002.\n[19] M.D. Serruya, N.G. Hatsopoulos, L. Paninski, M.R. Fellows, and J.P. Donoghue.\n     Instant neural control of a movement signal. Nature, 416:141-142, 2002.\n[20] C. Toro, G. Deuschl, R. Thatcher, S. Sato, C. Kufta, and M. 
Hallett. Event-related\n     desynchronization and movement-related cortical potentials on the ECoG and EEG.\n     Electroencephalography and Clinical Neurophysiology, 5:380-389, 1994.\n[21] V. N. Vapnik. Statistical Learning Theory. John Wiley and Sons, New York, USA,\n     1998.\n[22] J.R. Wolpaw and D.J. McFarland. Multichannel EEG-based brain-computer\n     communication. Electroencephalography and Clinical Neurophysiology, 90:444-449, 1994.\n", "award": [], "sourceid": 2662, "authors": [{"given_name": "Thomas", "family_name": "Lal", "institution": null}, {"given_name": "Thilo", "family_name": "Hinterberger", "institution": null}, {"given_name": "Guido", "family_name": "Widman", "institution": null}, {"given_name": "Michael", "family_name": "Schr\u00f6der", "institution": null}, {"given_name": "N.", "family_name": "Hill", "institution": null}, {"given_name": "Wolfgang", "family_name": "Rosenstiel", "institution": null}, {"given_name": "Christian", "family_name": "Elger", "institution": null}, {"given_name": "Niels", "family_name": "Birbaumer", "institution": null}, {"given_name": "Bernhard", "family_name": "Sch\u00f6lkopf", "institution": null}]}