{"title": "Kernel Feature Spaces and Nonlinear Blind Source Separation", "book": "Advances in Neural Information Processing Systems", "page_first": 761, "page_last": 768, "abstract": null, "full_text": "Kernel Feature Spaces and\n\nNonlinear Blind Source Separation\n\nStefan Harmeling1*, Andreas Ziehe1, Motoaki Kawanabe1, Klaus-Robert M\u00fcller1,2\n\n1Fraunhofer FIRST.IDA, Kekul\u00e9str. 7, 12489 Berlin, Germany\n\n2University of Potsdam, Department of Computer Science,\n\nAugust-Bebel-Strasse 89, 14482 Potsdam, Germany\n\n{harmeli,ziehe,kawanabe,klaus}@first.fhg.de\n\nAbstract\n\nIn kernel based learning the data is mapped to a kernel feature space of\na dimension that corresponds to the number of training data points. In\npractice, however, the data forms a smaller submanifold in feature space,\na fact that has been used e.g. by reduced set techniques for SVMs. We\npropose a new mathematical construction that makes it possible to adapt to the\nintrinsic dimension and to find an orthonormal basis of this submanifold.\nIn doing so, computations get much simpler and, more importantly, our\ntheoretical framework allows us to derive elegant kernelized blind source\nseparation (BSS) algorithms for arbitrary invertible nonlinear mixings.\nExperiments demonstrate the good performance and high computational\nefficiency of our kTDSEP algorithm for the problem of nonlinear BSS.\n\n1 Introduction\n\nIn a wide range of applications, kernel based learning machines, e.g. Support Vector\nMachines (e.g. [19, 6]), give excellent solutions. This holds both for problems of supervised\nand unsupervised learning (e.g. [3, 16, 12]). The general idea is to map the data xi (i =\n1, ..., T) into some kernel feature space F by some mapping \u03a6 : \u211d^n \u2192 F. 
Performing\na simple linear algorithm in F then corresponds to a nonlinear algorithm in input space.\nEssential ingredients of kernel based learning are (a) VC theory, which can provide a relation\nbetween the complexity of the function class in use and the generalization error, and (b) the\nfamous kernel trick\n\nk(x, y) = \u03a6(x) \u00b7 \u03a6(y),   (1)\n\nwhich allows scalar products to be computed efficiently. This trick is essential if e.g. F is\nan infinite dimensional space. Note that even though F might be infinite dimensional, the\nsubspace where the data lies is at most T-dimensional. However, the data typically\nforms an even smaller subspace in F (cf. also reduced set methods [15]). In this work\nwe therefore propose a new mathematical construction that allows us to adapt to the\nintrinsic dimension and to provide an orthonormal basis of this submanifold. Furthermore,\nthis makes computations much simpler and provides the basis for a new set of kernelized\nlearning algorithms.\n\n*To whom correspondence should be addressed.\n\nTo demonstrate the power of our new framework we will focus on the problem of nonlinear\nBSS [2, 18, 9, 10, 20, 11, 13, 14, 7, 17, 8] and provide an elegant kernel based algorithm\nfor arbitrary invertible nonlinearities. In nonlinear BSS we observe a mixed signal of the\nfollowing structure\n\nxt = f(st),   (2)\n\nwhere xt and st are n \u00d7 1 column vectors and f is a possibly nonlinear function from \u211d^n to\n\u211d^n. In the special case where f is an n \u00d7 n matrix we retrieve standard linear BSS (e.g. [8, 4]\nand references therein). Nonlinear BSS has so far been applied only to industrial pulp data\n[8], but a large class of applications where nonlinearities can occur in the mixing process\nis conceivable, e.g. in the fields of telecommunications, array processing, biomedical data\nanalysis (EEG, MEG, EMG, ...) and acoustic source separation. 
Most research has so far\ncentered on post-nonlinear models, i.e.\n\nxt = f(Ast),   (3)\n\nwhere A is a linear mixing matrix and f is a post-nonlinearity that operates componentwise.\nAlgorithmic solutions of eq.(3) have used e.g. self-organizing maps [13, 10], extensions of\nGTM [14], neural networks [2, 11] or ensemble learning [18] to unfold the nonlinearity\nf. Also a kernel based method was tried on very simple toy signals, however with some\nstability problems [7]. Note that all existing methods are of high computational cost and,\ndepending on the algorithm, are prone to run into local minima. In our contribution to the\ngeneral invertible nonlinear BSS case we apply a standard BSS technique [21, 1] (that\nrelies on temporal correlations) to mapped signals in feature space (cf. section 3). This is\nnot only mathematically elegant (cf. section 2), but proves to be a remarkably stable and\nefficient algorithm with high performance, as we will see in the experiments on nonlinear\nmixtures of toy and speech data (cf. section 4). Finally, a conclusion is given in section 5.\n\n2 Theory\n\nAn orthonormal basis for a subspace in F\n\nIn order to establish a linear problem in feature space that corresponds to some nonlinear\nproblem in input space we need to specify how to map inputs x1, ..., xT \u2208 \u211d^n into\nthe feature space F and how to handle its possibly high dimensionality. In addition to\nthe inputs, consider some further points v1, ..., vd \u2208 \u211d^n from the same space, that will\nlater generate a basis in F. Alternatively, we could use kernel PCA [16]. However, in\nthis paper we concentrate on a different method. Let us denote the mapped points by\n\u03a6_x := [\u03a6(x1) \u00b7\u00b7\u00b7 \u03a6(xT)] and \u03a6_v := [\u03a6(v1) \u00b7\u00b7\u00b7 \u03a6(vd)]. 
We assume that the columns of\n\u03a6_v constitute a basis of the column space1 of \u03a6_x, which we denote by\n\nspan(\u03a6_v) = span(\u03a6_x) and rank(\u03a6_v) = d.   (4)\n\nMoreover, \u03a6_v being a basis implies that the matrix \u03a6_v^\u22a4 \u03a6_v has full rank and its inverse\nexists. So, now we can define an orthonormal basis\n\n\u039e := \u03a6_v (\u03a6_v^\u22a4 \u03a6_v)^(\u22121/2)   (5)\n\nthe column space of which is identical to the column space of \u03a6_v. Consequently this basis\n\u039e enables us to parameterize all vectors that lie in the column space of \u03a6_x by some vectors\nin \u211d^d. For instance, for vectors \u2211_{i=1}^T \u03b1_{\u03a6,i} \u03a6(xi), which we write more compactly as \u03a6_x \u03b1_\u03a6,\nand \u03a6_x \u03b2_\u03a6 in the column space of \u03a6_x with \u03b1_\u03a6 and \u03b2_\u03a6 in \u211d^T there exist \u03b1_\u039e and \u03b2_\u039e in \u211d^d\nsuch that \u03a6_x \u03b1_\u03a6 = \u039e \u03b1_\u039e and \u03a6_x \u03b2_\u03a6 = \u039e \u03b2_\u039e. The orthonormality implies\n\n\u03b1_\u03a6^\u22a4 \u03a6_x^\u22a4 \u03a6_x \u03b2_\u03a6 = \u03b1_\u039e^\u22a4 \u039e^\u22a4 \u039e \u03b2_\u039e = \u03b1_\u039e^\u22a4 \u03b2_\u039e   (6)\n\n[Figure 1 panel labels: input space \u211d^n; feature space F; span(\u039e); parameter space \u211d^d]\n\nFigure 1: Input data are mapped to some submanifold of F which is in the span of some\nd-dimensional orthonormal basis \u039e. Therefore these mapped points can be parametrized in \u211d^d. The\nlinear directions in parameter space correspond to nonlinear directions in input space.\n\nwhich states the remarkable property that the dot product of two linear combinations of the\ncolumns of \u03a6_x in F coincides with the dot product in \u211d^d. By construction of \u039e (cf. eq.(5))\nthe column space of \u03a6_x is naturally isomorphic (as vector spaces) to \u211d^d. Moreover, this\nisomorphism is compatible with the two involved dot products as was shown in eq.(6). 
This\nimplies that all properties regarding angles and lengths can be taken back and forth between\nthe column space of \u03a6_x and \u211d^d. The space that is spanned by \u039e is called parameter space.\nFigure 1 pictures our intuition: Usually kernel methods parameterize the column space of\n\u03a6_x in terms of the mapped patterns {\u03a6(xi)}, which effectively corresponds to vectors in\n\u211d^T. The orthonormal basis from eq.(5), however, enables us to work in \u211d^d, i.e. in the span\nof \u039e, which is extremely valuable since d depends solely on the kernel function and the\ndimensionality of the input space. So d is independent of T.\n\nMapping inputs\n\nHaving established the machinery above, we will now show how to map the input data to\nthe right space. The expressions\n\n(\u03a6_v^\u22a4 \u03a6_v)_ij = \u03a6(vi)^\u22a4 \u03a6(vj) = k(vi, vj) with i, j = 1, ..., d\n\nare the entries of a real valued d \u00d7 d matrix \u03a6_v^\u22a4 \u03a6_v that can be effectively calculated using\nthe kernel trick; by construction of v1, ..., vd, it has full rank and is thus invertible.\nSimilarly we get\n\n(\u03a6_v^\u22a4 \u03a6_x)_ij = \u03a6(vi)^\u22a4 \u03a6(xj) = k(vi, xj) with i = 1, ..., d, j = 1, ..., T,\n\nwhich are the entries of the real valued d \u00d7 T matrix \u03a6_v^\u22a4 \u03a6_x. Using both matrices we\nfinally compute the parameter matrix\n\n\u03a8_x := \u039e^\u22a4 \u03a6_x = (\u03a6_v^\u22a4 \u03a6_v)^(\u22121/2) \u03a6_v^\u22a4 \u03a6_x   (7)\n\nwhich is also a real valued d \u00d7 T matrix; note that (\u03a6_v^\u22a4 \u03a6_v)^(\u22121/2) is symmetric. Regarding\ncomputational costs, we have to evaluate the kernel function O(d^2) + O(dT) times and\neq.(7) requires O(d^3) multiplications; again note that d is much smaller than T. 
Furthermore, storage requirements are lower as we do not have to hold the full T \u00d7 T kernel\nmatrix but only a d \u00d7 T matrix. Also, kernel based algorithms often require centering in\nF, which in our setting is equivalent to centering in \u211d^d. Fortunately the latter can be done\nquite cheaply.\n\n1The column space of \u03a6_x is the space that is spanned by the column vectors of \u03a6_x, written\nspan(\u03a6_x).\n\nChoosing vectors for the basis in F\n\nSo far we have assumed that we have points v1, ..., vd that fulfill eq.(4) and we presented\nthe beneficial properties of our construction. In fact, v1, ..., vd are roughly analogous\nto a reduced set in the support vector world [15]. Note however that often we can only\napproximately fulfill eq.(4), i.e.\n\nspan(\u03a6_v) \u2248 span(\u03a6_x).   (8)\n\nIn this case we strive for points that provide the best approximation.\n\nObviously d is finite since it is bounded by T, the number of inputs, and by the dimensionality\nof the feature space. Before formulating the algorithm we define the function rk(n)\nfor numbers n by the following process: randomly pick n points v1, ..., vn from the inputs\nand compute the rank of the corresponding n \u00d7 n matrix \u03a6_v^\u22a4 \u03a6_v. Repeating this random\nsampling process several times (e.g. 100 times) stabilizes this process in practice. Then we\ndenote by rk(n) the largest achieved rank; note that rk(n) \u2264 n. Using this definition we\ncan formulate a recipe to find d (the dimension of the subspace of F): (1) Start with a large\nd with rk(d) < d. (2) Decrement d by one as long as rk(d) < d holds. As soon as we\nhave rk(d) = d we have found d. Choose v1, ..., vd as the vectors that achieve rank d. As\nan alternative to random sampling we have also employed k-means clustering with similar\nresults.\n\n3 Nonlinear blind source separation\n\nTo demonstrate the use of the orthonormal basis in F, we formulate a new nonlinear BSS\nalgorithm based on TDSEP [21]. 
We start from a set of points v1, ..., vd that are provided\nby the algorithm from the last section such that eq.(4) holds. Next, we use eq.(7) to compute\n\n\u03a8_x[t] := \u039e^\u22a4 \u03a6(x[t]) = (\u03a6_v^\u22a4 \u03a6_v)^(\u22121/2) \u03a6_v^\u22a4 \u03a6(x[t]) \u2208 \u211d^d.\n\nHereby we have transformed the time signals x[t] from input space to parameter space signals\n\u03a8_x[t] (cf. Fig.1). Now we apply the usual TDSEP [21], which relies on simultaneous\ndiagonalisation techniques [5], to perform linear blind source separation on \u03a8_x[t] to obtain\nd linear directions of separated nonlinear components in input space. This new algorithm is\ndenoted as kTDSEP (kernel TDSEP); in short, kTDSEP is TDSEP on the parameter space\ndefined in Fig.1. Key to the success of our algorithm are the time correlations exploited\nby TDSEP; intuitively they provide the \u2018glue\u2019 that yields the coherence for the separated\nsignals. Note that for a linear kernel function the new algorithm performs linear BSS.\nTherefore linear BSS can be seen as a special case of our algorithm.\nNote that common kernel based algorithms which do not use the d-dimensional orthonormal\nbasis will run into computational problems. They need to hold and compute with a\nkernel matrix that is T \u00d7 T instead of d \u00d7 T with T \u226b d in BSS problems. A further\nproblem is that manipulating such a T \u00d7 T matrix can easily become unstable. 
Moreover\nBSS methods typically become infeasible for separation problems of dimension T.\n\nFigure 2: Scatterplot of x1 vs x2 for nonlinear mixing and demixing (upper left and right) and linear\ndemixing and true source signals (lower left and right). Note that the nonlinear unmixing agrees very\nnicely with the scatterplot of the true source signal.\n\n4 Experiments\n\nIn the first experiment the source signals s[t] = [s1[t] s2[t]]^\u22a4 are a sinusoidal and a sawtooth\nsignal with 2000 samples each. The nonlinearly mixed signals are defined as (cf. Fig.2\nupper left panel)\n\nx1[t] = exp(s1[t]) \u2212 exp(s2[t])\nx2[t] = exp(\u2212s1[t]) + exp(\u2212s2[t]).\n\nA dimension d = 22 of the manifold in feature space was obtained by kTDSEP using\na polynomial kernel k(x, y) = (x^\u22a4 y + 1)^6 by sampling from the inputs. The basis-generating\nvectors v1, ..., v22 are shown as big dots in the upper left panel of Figure\n2. Applying TDSEP to the 22 dimensional mapped signals \u03a8_x[t] we get 22 components\nin parameter space. A scatter plot of the two components that best match the source\nsignals is shown in the upper right panel of Figure 2. The lower left panel also shows for\ncomparison the two components that we obtained by applying linear TDSEP directly to the\nmixed signals x[t]. 
The plots clearly indicate that kTDSEP has unfolded the nonlinearity\nsuccessfully while the linear demixing algorithm failed.\n\nIn a second experiment, two speech signals (with 20000 samples, sampling rate 8 kHz)\nare nonlinearly mixed by\n\nx1[t] = s1[t] + s2^3[t]\nx2[t] = s1^3[t] + tanh(s2[t]).\n\nThis time we used a Gaussian RBF kernel k(x, y) = exp(\u2212|x \u2212 y|^2). kTDSEP identified\nd = 41 and used k-means clustering to obtain v1, ..., v41. These points are marked as\n\u2019+\u2019 in the left panel of Figure 4. An application of TDSEP to the 41 dimensional parameter\nspace yields nonlinear components whose projections to the input space are depicted in the\nright lower panel. We can see that linear TDSEP (right middle panel) failed and that the\ndirections of best matching kTDSEP components closely resemble the sources.\n\n        mixture         kTDSEP          TDSEP\n        x1      x2      u1      u2      u1      u2\ns1      0.56    0.72    0.89    0.07    0.09    0.72\ns2      0.63    0.46    0.04    0.86    0.31    0.55\n\nTable 3: Correlation coefficients for the signals shown in Fig.4.\n\nTo confirm this visual impression we calculated the correlation coefficients of the kTDSEP\nand TDSEP solutions with the source signals (cf. table 3). Clearly, kTDSEP outperforms the\nlinear TDSEP algorithm, which is of course what one expects.\n\n5 Conclusion\n\nOur work has two main contributions. First, we propose a new formulation in the field of\nkernel based learning methods that allows us to construct an orthonormal basis of the subspace\nof kernel feature space F where the data lies. This technique establishes a highly useful\n(scalar product preserving) isomorphism between the image of the data points in F and a\nd-dimensional space \u211d^d. Several interesting things follow: we can construct a new set of\nefficient kernel-based algorithms, e.g. a new and potentially more stable variant of kernel\nPCA [16]. 
Moreover, we can acquire knowledge about the intrinsic dimension of the data\nmanifold in F from the learning process.\nSecond, using our new formulation we tackle the problem of nonlinear BSS from the viewpoint\nof kernel based learning. The proposed kTDSEP algorithm allows us to unmix arbitrary\ninvertible nonlinear mixtures with low computational costs. Note that the important ingredients\nare the temporal correlations of the source signals used by TDSEP. Experiments on\ntoy and speech signals underline that an elegant solution has been found to a challenging\nproblem.\nApplications where nonlinearly mixed signals can occur are found e.g. in the fields of\ntelecommunications, array processing, biomedical data analysis (EEG, MEG, EMG, ...)\nand acoustic source separation. In fact, our algorithm would make it possible to provide a software-based\ncorrection of sensors that have nonlinear characteristics, e.g. due to manufacturing\nerrors. Clearly kTDSEP is only one algorithm that can perform nonlinear BSS; kernelizing\nother ICA algorithms can be done following our reasoning.\nAcknowledgements The authors thank Benjamin Blankertz, Gunnar R\u00e4tsch, Sebastian\nMika for valuable discussions. This work was partly supported by the EU project (IST-1999-14190\n\u2013 BLISS) and DFG (JA 379/9-1, MU 987/1-1).\n\nFigure 4: A highly nonlinear mixture of two speech signals: Scatterplot of x1 vs x2 and the waveforms of the true source signals (upper panel) in comparison to the\nbest matching linear and nonlinear separation results are shown in the middle and lower panel, respectively.\n\nReferences\n[1] A. Belouchrani, K. 
Abed Meraim, J.-F. Cardoso, and E. Moulines. A blind source separation technique based on second order statistics. IEEE Trans. on Signal Processing, 45(2):434\u2013444, 1997.\n\n[2] G. Burel. Blind separation of sources: a nonlinear neural algorithm. Neural Networks, 5(6):937\u2013947, 1992.\n\n[3] C.J.C. Burges. A tutorial on support vector machines for pattern recognition. Knowledge Discovery and Data Mining, 2(2):121\u2013167, 1998.\n\n[4] J.-F. Cardoso. Blind signal separation: statistical principles. Proceedings of the IEEE, 9(10):2009\u20132025, 1998.\n\n[5] J.-F. Cardoso and A. Souloumiac. Jacobi angles for simultaneous diagonalization. SIAM J.Mat.Anal.Appl., 17(1):161 ff., 1996.\n\n[6] N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines. Cambridge University Press, Cambridge, UK, 2000.\n\n[7] C. Fyfe and P. L. Lai. ICA using kernel canonical correlation analysis. In Proc. Int. Workshop on Independent Component Analysis and Blind Signal Separation (ICA2000), pages 279\u2013284, Helsinki, Finland, 2000.\n\n[8] A. Hyvarinen, J. Karhunen, and E. Oja. Independent Component Analysis. Wiley, 2001.\n\n[9] T.-W. Lee, B.U. Koehler, and R. Orglmeister. Blind source separation of nonlinear mixing models. In Neural Networks for Signal Processing VII, pages 406\u2013415. IEEE Press, 1997.\n\n[10] J. K. Lin, D. G. Grier, and J. D. Cowan. Faithful representation of separable distributions. Neural Computation, 9(6):1305\u20131320, 1997.\n\n[11] G. Marques and L. Almeida. Separation of nonlinear mixtures using pattern repulsion. In Proc. Int. Workshop on Independent Component Analysis and Signal Separation (ICA\u201999), pages 277\u2013282, Aussois, France, 1999.\n\n[12] K.-R. M\u00fcller, S. Mika, G. R\u00e4tsch, K. Tsuda, and B. Sch\u00f6lkopf. An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks, 12(2):181\u2013201, 2001.\n\n[13] P. Pajunen, A. Hyv\u00e4rinen, and J. Karhunen. 
Nonlinear blind source separation by self-organizing maps. In Proc. Int. Conf. on Neural Information Processing, pages 1207\u20131210, Hong Kong, 1996.\n\n[14] P. Pajunen and J. Karhunen. A maximum likelihood approach to nonlinear blind source separation. In Proceedings of the 1997 Int. Conf. on Artificial Neural Networks (ICANN\u201997), pages 541\u2013546, Lausanne, Switzerland, 1997.\n\n[15] B. Sch\u00f6lkopf, S. Mika, C.J.C. Burges, P. Knirsch, K.-R. M\u00fcller, G. R\u00e4tsch, and A.J. Smola. Input space vs. feature space in kernel-based methods. IEEE Transactions on Neural Networks, 10(5):1000\u20131017, September 1999.\n\n[16] B. Sch\u00f6lkopf, A.J. Smola, and K.-R. M\u00fcller. Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10:1299\u20131319, 1998.\n\n[17] A. Taleb and C. Jutten. Source separation in post-nonlinear mixtures. IEEE Trans. on Signal Processing, 47(10):2807\u20132820, 1999.\n\n[18] H. Valpola, X. Giannakopoulos, A. Honkela, and J. Karhunen. Nonlinear independent component analysis using ensemble learning: Experiments and discussion. In Proc. Int. Workshop on Independent Component Analysis and Blind Signal Separation (ICA2000), pages 351\u2013356, Helsinki, Finland, 2000.\n\n[19] V.N. Vapnik. The nature of statistical learning theory. Springer Verlag, New York, 1995.\n\n[20] H. H. Yang, S.-I. Amari, and A. Cichocki. Information-theoretic approach to blind separation of sources in non-linear mixture. Signal Processing, 64(3):291\u2013300, 1998.\n\n[21] A. Ziehe and K.-R. M\u00fcller. TDSEP\u2014an efficient algorithm for blind separation using time structure. In Proc. Int. Conf. 
on Artificial Neural Networks (ICANN\u201998), pages 675\u2013680, Sk\u00f6vde, Sweden, 1998.\n", "award": [], "sourceid": 2094, "authors": [{"given_name": "Stefan", "family_name": "Harmeling", "institution": null}, {"given_name": "Andreas", "family_name": "Ziehe", "institution": null}, {"given_name": "Motoaki", "family_name": "Kawanabe", "institution": null}, {"given_name": "Klaus-Robert", "family_name": "M\u00fcller", "institution": null}]}