{"title": "Constrained Independent Component Analysis", "book": "Advances in Neural Information Processing Systems", "page_first": 570, "page_last": 576, "abstract": null, "full_text": "Constrained Independent Component \n\nAnalysis \n\nWei Lu and Jagath C. Rajapakse \n\nSchool of Computer Engineering \n\nNanyang Technological University, Singapore 639798 \n\nemail: asjagath@ntu.edu.sg \n\nAbstract \n\nThe paper presents a novel technique of constrained independent \ncomponent analysis (CICA) to introduce constraints into the clas(cid:173)\nsical ICA and solve the constrained optimization problem by using \nLagrange multiplier methods. This paper shows that CICA can \nbe used to order the resulted independent components in a specific \nmanner and normalize the demixing matrix in the signal separation \nprocedure. It can systematically eliminate the ICA's indeterminacy \non permutation and dilation. The experiments demonstrate the use \nof CICA in ordering of independent components while providing \nnormalized demixing processes. \nKeywords: Independent component analysis, constrained indepen(cid:173)\ndent component analysis, constrained optimization, Lagrange mul(cid:173)\ntiplier methods \n\n1 \n\nIntroduction \n\nIndependent component analysis (ICA) is a technique to transform a multivari(cid:173)\nate random signal into a signal with components that are mutually independent \nin complete statistical sense [1]. There has been a growing interest in research for \nefficient realization of ICA neural networks (ICNNs). These neural algorithms pro(cid:173)\nvide adaptive solutions to satisfy independent conditions after the convergence of \nlearning [2, 3, 4]. \n\nHowever, ICA only defines the directions of independent components. The magni(cid:173)\ntudes of independent components and the norms of demixing matrix may still be \nvaried. Also the order of the resulted components is arbitrary. 
In general, ICA has such an inherent indeterminacy of dilation and permutation, which cannot be reduced further without additional assumptions or constraints [5]. Constrained independent component analysis (CICA) is therefore proposed as a way to obtain a unique ICA solution with desired characteristics of the output by introducing constraints: \n\n• To avoid arbitrary ordering of the output components: statistical measures provide indices by which to sort the components and to highlight the salient signals. \n\n• To produce unity transform operators: normalizing the demixing channels reduces the dilation effect on the resulting components and may recover the exact original sources. \n\nWith such conditions applied, the ICA problem becomes a constrained optimization problem. In the present paper, Lagrange multiplier methods are adopted to provide an adaptive solution to this problem, which can be implemented as an iteratively updating neural network, referred to as an ICNN. The next section briefly introduces the problem formulation and its analysis and solution by Lagrange multiplier methods. The basic concept of ICA is then stated, and Lagrange multiplier methods are utilized to develop a systematic approach to CICA. Simulations demonstrate the usefulness of the analytical results and indicate the improvements due to the constraints. \n\n2 Lagrange Multiplier Methods \n\nLagrange multiplier methods introduce Lagrange multipliers to solve a constrained optimization problem iteratively. A penalty parameter is also introduced so that the local convexity assumption holds at the solution. Lagrange multiplier methods can handle problems with both equality and inequality constraints. 
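As a small numerical illustration of how such an augmented-Lagrangian iteration behaves, the sketch below (our own toy example, not code from the paper) minimizes a quadratic objective under one inequality and one equality constraint, alternating primal gradient-descent steps with multiplier updates of the max{0, ·} form. The functions f, g, h and the values of the penalty parameter gamma and step size eta are hypothetical choices for demonstration only.

```python
import numpy as np

# Toy augmented-Lagrangian solver (a sketch under assumed f, g, h):
# minimize f(x) subject to g(x) <= 0 and h(x) = 0.

def f(x):       # objective: squared distance from (1, 1)
    return (x[0] - 1.0) ** 2 + (x[1] - 1.0) ** 2

def grad_f(x):
    return np.array([2.0 * (x[0] - 1.0), 2.0 * (x[1] - 1.0)])

def g(x):       # inequality constraint: x0 - 0.25 <= 0
    return x[0] - 0.25

def h(x):       # equality constraint: x0 + x1 - 1 = 0
    return x[0] + x[1] - 1.0

grad_g = np.array([1.0, 0.0])   # constant gradients of the linear constraints
grad_h = np.array([1.0, 1.0])

gamma, eta = 10.0, 0.01         # penalty parameter and primal step size
x = np.zeros(2)
mu, lam = 0.0, 0.0              # Lagrange multipliers for g and h

for outer in range(50):
    # inner loop: gradient descent on the augmented Lagrangian in x
    for inner in range(200):
        g_hat = mu + gamma * g(x)                    # shifted inequality term
        grad = grad_f(x) + (lam + gamma * h(x)) * grad_h
        if g_hat > 0:                                # inequality is active
            grad = grad + g_hat * grad_g
        x = x - eta * grad
    # dual (multiplier) updates; max{0, .} keeps mu non-negative
    mu = max(0.0, mu + gamma * g(x))
    lam = lam + gamma * h(x)

# x converges to the constrained optimum (0.25, 0.75)
```

At the solution the inequality constraint is active with mu > 0 and the equality constraint holds, so the iteration settles at a point satisfying the KKT conditions rather than at the unconstrained minimum (1, 1).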
\n\nThe constrained nonlinear optimization problems that Lagrange multiplier methods deal with take the following general form: \n\nminimize f(X), subject to g(X) ≤ 0, h(X) = 0 (1) \n\nwhere X is a matrix or a vector of the problem arguments, f(X) is an objective function, g(X) = [g1(X) ... gm(X)]ᵀ defines a set of m inequality constraints and h(X) = [h1(X) ... hn(X)]ᵀ defines a set of n equality constraints. Because Lagrangian methods cannot directly deal with the inequality constraints gi(X) ≤ 0, the inequality constraints are transformed into equality constraints by introducing a vector of slack variables z = [z1 ... zm]ᵀ, resulting in the equality constraints pi(X) = gi(X) + zi² = 0, i = 1, ..., m. \n\nBased on this transformation, the corresponding simplified augmented Lagrangian function for problem (1) is defined as: \n\nL(X, μ, λ) = f(X) + (1/2γ) Σi [max²{0, ĝi(X)} − μi²] + λᵀh(X) + (γ/2)‖h(X)‖² (2) \n\nwhere μ = [μ1 ... μm]ᵀ and λ = [λ1 ... λn]ᵀ are two sets of Lagrange multipliers, γ is the scalar penalty parameter, ĝi(X) equals μi + γ gi(X), ‖·‖ denotes the Euclidean norm, and (γ/2)‖·‖² is the penalty term ensuring that the local convexity assumption ∇²XX L > 0 holds at the solution. We use the augmented Lagrangian function in this paper because it gives wider applicability and provides better stability [6]. \n\nFor discrete problems, the change in the augmented Lagrangian function is defined as ΔX L(X, μ, λ) to achieve the saddle point in the discrete variable space. The iterative equations to solve the problem in eq.(2) are given as follows: \n\nX(k+1) = X(k) − ΔX L(X(k), μ(k), λ(k)) \nμ(k+1) = μ(k) + γ p(X(k)) = max{0, ĝ(X(k))} \nλ(k+1) = λ(k) + γ h(X(k)) (3) \n\nwhere k denotes the iteration index and ĝ(X(k)) = μ(k) + γ g(X(k)). \n\n3 Unconstrained ICA \n\nLet the time-varying input signal be x = (x1, x2, ...
, xN)ᵀ and the signal of interest, consisting of independent components (ICs), be c = (c1, c2, ..., cM)ᵀ, where generally M ≤ N. The signal x is considered to be a linear mixture of the independent components c: x = Ac, where A is an N × M mixing matrix with full column rank. \n\nThe goal of general ICA is to obtain a linear M × N demixing matrix W to recover the independent components c with minimal knowledge of A and c; normally M = N. The recovered components u are then given by u = Wx. \n\nIn the present paper, the contrast function used is the mutual information M of the output signal, defined in terms of the variables' entropies to measure independence: \n\nM(u) = Σi H(ui) − H(u) (4) \n\nwhere H(ui) is the marginal entropy of component ui and H(u) is the joint entropy of the output. M is non-negative and equals zero when the components are completely independent. \n\nWhile minimizing M, the learning equation for the demixing matrix W to perform ICA is given by [1]: \n\nΔW ∝ W⁻ᵀ + φ(u)xᵀ (5) \n\nwhere φ(·) denotes the vector of nonlinear functions induced by the activations of the output units. \n\nThe kurtosis of y1 is greater than 0 (super-Gaussian), that of y2 is 0.02 (≈ 0, Gaussian) and that of y3 is −1.27 (≪ 0, sub-Gaussian). The final performance index value of 0.28 and the output components' average SNR value of 15 dB show that all three independent components are well separated too. \n\n5.2 Demixing Matrix Normalization \n\nThree deterministic signals and one Gaussian noise signal were simulated in this experiment. All signals were independently generated with unit variance and mixed with a random mixing matrix. All input mixtures were preprocessed by a whitening process to have zero mean and unit variance. The signals were separated using both unconstrained ICA and constrained ICA, as given by eq.(5) and (16) respectively. \n\nTable 1 compares the resulting demixing matrices, their row norms, the variances of the separated components and the SNR values. \n\n              Demixing matrix W              Norms  Variance    SNR \nuncons.  y1    0.90   0.08  -0.12  -0.82      1.23     1.50     4.55 \nICA      y2   -0.06   1.11  -0.07   0.07      1.11     1.24    10.88 \n         y3    0.07   0.07   1.47  -0.09      1.47     2.17    21.58 \n         y4    1.04   0.08   0.04   1.16      1.56     2.43    16.60 \ncons.    y1    0.43   0.65  -0.02  -0.61      0.99     0.98     4.95 \nICA      y2   -0.37   0.91   0.05   0.20      1.01     1.02    13.94 \n         y3   -0.04   0.01   1.00  -0.04      1.00     1.00    25.04 \n         y4    0.65   0.07   0.02   0.76      1.00     1.00    22.56 \n\nTable 1: Comparison of the demixing matrix elements, row norms, output variances and resulting components' SNR values in ICA and in CICA with normalization. \n\nThe dilation effect can be seen from the differences among the components' variances, caused by the non-normalized demixing matrix in unconstrained ICA. The CICA algorithm with the normalization constraint normalized the rows of the demixing matrix and separated the components with variances remaining at unity. Therefore, the source signals are recovered exactly, without any dilation. The increase in the separated components' SNR values using CICA can also be seen in the table. The source components, input mixtures and components separated using normalization are given in Figure 2, which shows that the signals resulting from CICA match the source signals exactly in both waveform and amplitude. \n\nFigure 2: (a) Four deterministic source components with unit variances, (b) mixture inputs and (c) resulting components through the normalized demixing channel W. \n\n6 Conclusion \n\nWe have presented an approach to constrained ICA that uses Lagrange multiplier methods to eliminate the indeterminacies of permutation and dilation present in classical ICA. Our results provide a technique for systematically enhancing ICA's usability and performance using constraints that are not restricted to the conditions treated in this paper. 
More useful constraints can be considered in a similar manner to further improve the outputs of ICA in other practical applications. Simulation results demonstrate the accuracy and usefulness of the proposed algorithms. \n\nReferences \n\n[1] Jagath C. Rajapakse and Wei Lu. Unified approach to independent component networks. In Second International ICSC Symposium on Neural Computation (NC'2000), 2000. \n\n[2] A. Bell and T. Sejnowski. An information-maximization approach to blind separation and blind deconvolution. Neural Computation, 7(6):1129-1159, 1995. \n\n[3] S. Amari, A. Cichocki, and H. Yang. A new learning algorithm for blind signal separation. In Advances in Neural Information Processing Systems 8, 1996. \n\n[4] T.-W. Lee, M. Girolami, and T. Sejnowski. Independent component analysis using an extended infomax algorithm for mixed sub-Gaussian and super-Gaussian sources. Neural Computation, 11(2):409-433, 1999. \n\n[5] P. Comon. Independent component analysis: A new concept? Signal Processing, 36:287-314, 1994. \n\n[6] Dimitri P. Bertsekas. Constrained Optimization and Lagrange Multiplier Methods. New York: Academic Press, 1982. \n\n[7] A. Hyvärinen and E. Oja. Simple neuron models for independent component analysis. International Journal of Neural Systems, 7(6):671-687, December 1996. \n", "award": [], "sourceid": 1935, "authors": [{"given_name": "Wei", "family_name": "Lu", "institution": null}, {"given_name": "Jagath", "family_name": "Rajapakse", "institution": null}]}