{"title": "Using Bayesian Dynamical Systems for Motion Template Libraries", "book": "Advances in Neural Information Processing Systems", "page_first": 297, "page_last": 304, "abstract": "Motor primitives or motion templates have become an important concept for both modeling human motor control as well as generating robot behaviors using imitation learning. Recent impressive results range from humanoid robot movement generation to timing models of human motions. The automatic generation of skill libraries containing multiple motion templates is an important step in robot learning. Such a skill learning system needs to cluster similar movements together and represent each resulting motion template as a generative model which is subsequently used for the execution of the behavior by a robot system. In this paper, we show how human trajectories captured as multidimensional time-series can be clustered using Bayesian mixtures of linear Gaussian state-space models based on the similarity of their dynamics. The appropriate number of templates is automatically determined by enforcing a parsimonious parametrization. As the resulting model is intractable, we introduce a novel approximation method based on variational Bayes, which is especially designed to enable the use of efficient inference algorithms. 
On recorded human Balero movements, this method is not only capable of finding reasonable motion templates but also yields a generative model which works well in the execution of this complex task on a simulated anthropomorphic SARCOS arm.", "full_text": "Using Bayesian Dynamical Systems for Motion Template Libraries\n\nSilvia Chiappa, Jens Kober, Jan Peters\n\nMax-Planck Institute for Biological Cybernetics\nSpemannstra\u00dfe 38, 72076 T\u00fcbingen, Germany\n\n{silvia.chiappa,jens.kober,jan.peters}@tuebingen.mpg.de\n\nAbstract\n\nMotor primitives or motion templates have become an important concept for both modeling human motor control as well as generating robot behaviors using imitation learning. Recent impressive results range from humanoid robot movement generation to timing models of human motions. The automatic generation of skill libraries containing multiple motion templates is an important step in robot learning. Such a skill learning system needs to cluster similar movements together and represent each resulting motion template as a generative model which is subsequently used for the execution of the behavior by a robot system. In this paper, we show how human trajectories captured as multi-dimensional time-series can be clustered using Bayesian mixtures of linear Gaussian state-space models based on the similarity of their dynamics. The appropriate number of templates is automatically determined by enforcing a parsimonious parametrization. As the resulting model is intractable, we introduce a novel approximation method based on variational Bayes, which is especially designed to enable the use of efficient inference algorithms. On recorded human Balero movements, this method is not only capable of finding reasonable motion templates but also yields a generative model which works well in the execution of this complex task on a simulated anthropomorphic SARCOS arm.\n\n1 Introduction\n\nHumans demonstrate a variety and versatility of movements far beyond the reach of current anthropomorphic robots. It is widely believed that human motor control largely relies on a set of \u201cmental templates\u201d [1] better known as motor primitives or motion templates. This concept has gained increasing attention both in the human motor control literature [1, 2] as well as in robot imitation learning [3, 4]. The recent suggestion of Ijspeert et al. [3] to use dynamical systems as motor primitives has allowed this approach to scale in the domain of humanoid robot imitation learning and has yielded a variety of interesting applications as well as follow-up publications. However, up to now, the focus of motion template learning has largely been on single template acquisition and self-improvement. Future motor skill learning systems, on the other hand, need to be able to observe several different behaviors from human presenters and compile libraries of motion templates directly from these examples with as little predetermined structure as possible.\n\nAn important part of such a motor skill learning system is the clustering of many presented movements into different motion templates. Human trajectories are recorded as multi-dimensional time-series of joint angles as well as joint velocities using either a marker-based tracking setup (e.g., a VICON setup), a sensing suit (e.g., a SARCOS SenSuit) or a haptic interface (e.g., an anthropomorphic master arm). Inspired by Ijspeert et al. [3], we intend to use dynamical systems as generative models of the presented trajectories, i.e., as motion templates. 
Our goal is to cluster these multi-dimensional time-series automatically into a small number of motion templates without pre-labeling of the trajectories or assuming an a priori number of templates. Thus, the system has to discover the underlying motion templates, determine the number of templates as well as learn the underlying skill sufficiently well for robot application.\n\nIn principle, one could use a non-generative clustering approach (e.g., a type of K-means) with a method for selecting an appropriate number of clusters and, subsequently, fit a generative model to each cluster. Here we prefer to take a different approach in which the clustering and learning of the underlying time-series dynamics are performed at the same time. This way we aim at ensuring that each obtained cluster can be modeled well by its representative generative model.\n\nTo date the majority of the work on time-series clustering using generative models has focused on static mixture models. Clustering long or high-dimensional time-series is hard when approached with static models, such that collapsing the trajectories to a few relevant features is often required. This problem would be severe for a high-dimensional motor learning system where the data needs to be represented at high sampling rates in order to ensure the capturing of all relevant details for motor skill learning. In addition, it is difficult to ensure smoothness when the time-series display high variability and, therefore, to obtain accurate generative models with static approaches.\n\nA natural alternative is to use mixtures of temporal models which explicitly model the dynamics of the time-series. In this paper, we use Mixtures of Linear Gaussian State-Space Models (LGSSMs). LGSSMs are probabilistic temporal models which, despite their computational simplicity, can represent many natural dynamical processes [5]. 
As we will see later in this paper, LGSSMs are powerful enough to model our time-series sufficiently accurately.\n\nFor determining the number of clusters, most probabilistic approaches in the past used to train a separate model for each possible cluster configuration, and then select the one which would optimize the trade-off between accuracy and complexity, as measured for example by the Bayesian Information Criterion [6, 7]. The drawback of these approaches is that training many separate models can lead to a large computational overhead, such that heuristics are often needed to restrict the number of possible cluster configurations [7].\n\nA less computationally expensive alternative is offered by recent Bayesian approaches where the model parameters are treated as random variables and integrated out, yielding the marginal likelihood of the data. An appropriate prior distribution can be used to enforce a sparse representation, i.e., to select the smallest set of parameters that explains the data well by making the remaining parameters inactive. As a result, the structure selection can be achieved within the model, without the need to train and compare several separate models.\n\nAs a Bayesian treatment of the Mixtures of Linear Gaussian State-Space Models is intractable, we introduce a deterministic approximation based on variational Bayes. Importantly, our approximation is especially designed to enable the use of standard LGSSM inference methods for the hidden state variables, which has the advantage of minimizing numerical instabilities.\n\nAs a realistically difficult scenario in this first step towards large motor skill libraries, we have selected the game of dexterity Balero (also known as Ball-In-A-Cup or Kendama, see [8]) as an evaluation platform. Several substantially different types of movements exist for performing this task and humans tend to have a large variability in movement execution [9]. 
From a robotics point of view, Balero can be considered sufficiently complex as it involves movements in all major seven degrees of freedom of a human arm as well as an anthropomorphic robot arm. We are able to show that the presented method gives rise to a reasonable number of clusters representing quite distinct movements and that the resulting generative models can be used successfully as motion templates in physically realistic simulations.\n\nIn the remainder of the paper, we will proceed as follows. We will first introduce a generative approach for clustering and modeling multi-dimensional time-series with Bayesian Mixtures of LGSSMs and describe how this approach can be made tractable using a variational approximation. We will then show that the resulting model can be used to infer the motion templates underlying a set of human demonstrations, and give evidence that the generative model representing each motion template is sufficiently accurate for control in a mechanically plausible simulation of the SARCOS Master Arm.\n\n2 Bayesian Mixtures of Linear Gaussian State-Space Models\n\nOur goal is to model both human and robot movements in order to build motion template libraries. In this section, we describe our Bayesian modeling approach and discuss both the underlying assumptions as well as how the structure of the model is selected. As the resulting model is not tractable for analytical solution, we introduce an approximation method based on variational Bayes.\n\n2.1 Modeling Approach\n\nIn our Bayesian approach to Mixtures of Linear Gaussian State-Space Models (LGSSMs), we are given a set of N time-series^1 v^{1:N}_{1:T} of length T, for which we define the following marginal likelihood\n\np(v^{1:N}_{1:T} | \u02c6\u0398_{1:K}, \u03b3) = \u2211_{z^{1:N}} \u222b_{\u0398_{1:K}} p(v^{1:N}_{1:T} | z^{1:N}, \u0398_{1:K}) p(\u0398_{1:K} | \u02c6\u0398_{1:K}) \u222b_\u03c0 p(z^{1:N} | \u03c0) p(\u03c0 | \u03b3),\n\nwhere z^n \u2208 {1, . . . , K} indicates which of a set of K LGSSMs generated the sequence v^n_{1:T}. The parameters of LGSSM k are denoted by \u0398^k and have a prior distribution depending on hyperparameters \u02c6\u0398^k. The K-dimensional vector \u03c0 includes the prior probabilities of the time-series generation for each LGSSM and has prior distribution hyperparameter \u03b3.\n\nThe optimal hyperparameters are estimated by type-II maximum likelihood [10], i.e., by maximizing the marginal likelihood over \u02c6\u0398_{1:K} and \u03b3. Clustering can be performed by inferring the LGSSM that most likely generated the sequence v^n_{1:T}, i.e., by computing arg max_k p(z^n = k | v^{1:N}_{1:T}, \u02c6\u0398_{1:K}, \u03b3).\n\nModeling p(v^{1:N}_{1:T} | z^{1:N}, \u02c6\u0398_{1:K}). As a generative temporal model for each time-series, we employ a Linear Gaussian State-Space Model [5] that assumes that the observations v_{1:T}, with v_t \u2208 \u211c^V, are generated from a latent Markovian linear dynamical system with hidden states h_{1:T}, with h_t \u2208 \u211c^H, according to^2\n\nv_t = B h_t + \u03b7^v_t, \u03b7^v_t \u223c N(0_V, \u03a3_V), h_t = A h_{t-1} + \u03b7^h_t, \u03b7^h_t \u223c N(\u00b5_t, \u03a3_H). (1)\n\nStandard LGSSMs assume a zero-mean hidden-state noise (\u00b5_t \u2261 0_H). In our application the use of a time-dependent mean \u00b5_t \u2260 0_H leads to superior modeling accuracy. A probabilistic formulation of the LGSSM is given by\n\np(v_{1:T}, h_{1:T} | \u0398) = p(v_1 | h_1, \u0398) p(h_1 | \u0398) \u220f_{t=2}^T p(v_t | h_t, \u0398) p(h_t | h_{t-1}, \u0398),\n\nwith p(h_t | h_{t-1}, \u0398) = N(A h_{t-1} + \u00b5_t, \u03a3_H), p(h_1 | \u0398) = N(\u00b5_1, \u03a3), p(v_t | h_t, \u0398) = N(B h_t, \u03a3_V), and \u0398 = {A, B, \u03a3_H, \u03a3_V, \u00b5_{1:T}, \u03a3}. 
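The generative process of Equation (1) can be simulated directly. The following sketch is our own illustration, not code from the paper; all function and variable names are ours, and the handling of the time-dependent noise mean mu[t] is an assumption about how \u00b5_{2:T} would be indexed in practice.

```python
import numpy as np

def sample_lgssm(A, B, mu, Sigma_H, Sigma_V, mu1, Sigma1, T, rng=None):
    """Draw one trajectory v_{1:T} from the LGSSM of Equation (1).

    h_1 ~ N(mu1, Sigma1)
    h_t = A h_{t-1} + eta_t^h,  eta_t^h ~ N(mu[t], Sigma_H)   (t >= 2)
    v_t = B h_t     + eta_t^v,  eta_t^v ~ N(0,     Sigma_V)
    """
    rng = np.random.default_rng() if rng is None else rng
    V = B.shape[0]
    h = rng.multivariate_normal(mu1, Sigma1)  # initial hidden state h_1
    vs = np.empty((T, V))
    for t in range(T):
        if t > 0:  # hidden-state noise with time-dependent mean mu[t]
            h = A @ h + rng.multivariate_normal(mu[t], Sigma_H)
        vs[t] = B @ h + rng.multivariate_normal(np.zeros(V), Sigma_V)
    return vs
```

With the parameters of a learned cluster plugged in, repeated calls produce trajectories whose dynamics follow that cluster's template.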
Due to the simple structure of the model, performing inference, that is, computing quantities such as p(h_t | v_{1:T}, \u0398), can be efficiently achieved in O(T) operations.\n\nIn the presented Bayesian approach, we define a prior distribution p(\u0398 | \u02c6\u0398) over the parameters \u0398, where \u02c6\u0398 are the associated hyperparameters. More specifically, we define zero-mean Gaussians on the elements of A and on the columns of B by^3\n\np(A | \u03b1, \u03a3_H^{-1}) = \u220f_{i,j=1}^H \u03b1_{ij}^{1/2} / \u221a(2\u03c0 [\u03a3_H]_{ii}) e^{-(\u03b1_{ij}/2) [\u03a3_H^{-1}]_{ii} A_{ij}^2}, p(B | \u03b2, \u03a3_V^{-1}) = \u220f_{j=1}^H \u03b2_j^{V/2} / \u221a|2\u03c0\u03a3_V| e^{-(\u03b2_j/2) B_j^T \u03a3_V^{-1} B_j},\n\nwhere \u03b1 and \u03b2 are a set of hyperparameters which need to be optimized. We make the assumption that \u03a3_H^{-1}, \u03a3_V^{-1} and \u03a3^{-1} are diagonal and define Gamma distributions on them. For \u00b5_1 we define a zero-mean Gaussian prior, while we formally treat \u00b5_{2:T} as hyperparameters and determine their optimal values. These choices are made in order to render our Bayesian treatment feasible and to obtain a sparse parametrization, as discussed in more detail below.\n\n^1 v^{1:N}_{1:T} is a shorthand for v^1_1, . . . , v^1_T, . . . , v^N_1, . . . , v^N_T.\n^2 Here, N(m, S) denotes a Gaussian with mean m and covariance S, and 0_X denotes an X-dimensional zero vector. The initial latent state h_1 is drawn from N(\u00b5_1, \u03a3).\n^3 [X]_{ij} and X_j denote the ij-th element and the j-th column of matrix X respectively. The dependency of the priors on \u03a3_H and \u03a3_V is chosen specifically to render a variational implementation feasible.\n\nIn the resulting mixture model, we consider a set of K such Bayesian LGSSMs. 
The joint distribution over all sequences given the indicator variables and hyperparameters is defined as\n\np(v^{1:N}_{1:T} | z^{1:N}, \u02c6\u0398_{1:K}) = \u222b_{\u0398_{1:K}} { \u220f_{n=1}^N p(v^n_{1:T} | z^n, \u0398_{1:K}) } \u220f_{k=1}^K p(\u0398^k | \u02c6\u0398^k),\n\nwhere p(v^n_{1:T} | z^n = k, \u0398_{1:K}) \u2261 p(v^n_{1:T} | \u0398^k) denotes the probability of time-series v^n_{1:T} given that parameters \u0398^k have been employed to generate it.\n\nModeling p(z^{1:N} | \u03b3). As prior for \u03c0, we define a symmetric Dirichlet distribution\n\np(\u03c0 | \u03b3) = \u0393(\u03b3) / \u0393(\u03b3/K)^K \u220f_{k=1}^K \u03c0_k^{\u03b3/K-1},\n\nwhere \u0393(\u00b7) is the Gamma function and \u03b3 denotes a hyperparameter that needs to be optimized. This distribution is conjugate to the multinomial, which greatly simplifies our Bayesian treatment. To model the joint indicator variables, we define\n\np(z^{1:N} | \u03b3) = \u222b_\u03c0 { \u220f_{n=1}^N p(z^n | \u03c0) } p(\u03c0 | \u03b3), where p(z^n = k | \u03c0) \u2261 \u03c0_k.\n\nSuch a Bayesian approach favors simple model structures. In particular, the priors on A^k and B^k enforce a sparse parametrization since, during learning, many \u03b1^k_{ij} and \u03b2^k_j get close to infinity, whereby (the posterior distributions of) A^k_{ij} and B^k_j get close to zero (see [11] for an analysis of this pruning effect). This enables us to achieve structure selection within the model. Specifically, this approach ensures that unnecessary LGSSMs are pruned out from the model during training: for certain k, all elements of B^k are pruned out such that LGSSM k becomes inactive, i.e., p(z^n = k | v^{1:N}_{1:T}, \u02c6\u0398_{1:K}, \u03b3) = 0 for all n.\n\n2.2 Model Intractability and Approximate Solution\n\nThe Bayesian treatment of the model is non-trivial as the integration over the parameters \u0398_{1:K} and \u03c0 renders the computation of the required posterior distributions intractable. 
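Given per-cluster scores, the assignment rule arg max_k of Section 2.1 reduces to a normalised exponential over each sequence's scores. The sketch below is our own illustration: `scores` is an assumed stand-in for the variational log-terms of each sequence under each LGSSM, which are not computed here.

```python
import numpy as np

def responsibilities(scores):
    """Per-sequence cluster probabilities from unnormalised log-scores.

    scores[n, k] stands in for the log-terms of sequence n under
    LGSSM k (an assumption for illustration).  Each row is normalised
    with a log-sum-exp shift for numerical stability.
    """
    m = scores.max(axis=1, keepdims=True)
    q_z = np.exp(scores - m)
    q_z /= q_z.sum(axis=1, keepdims=True)
    return q_z

def assign(scores):
    """Hard cluster assignment arg max_k per sequence."""
    return responsibilities(scores).argmax(axis=1)
```

Inactive (pruned) LGSSMs would simply receive vanishing responsibility in every row of the returned matrix.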
This problem results from the coupling in the posterior distributions between the hidden state variables h^{1:N}_{1:T} and the parameters \u0398_{1:K}, as well as between the indicators z^{1:N} and \u03c0, \u0398_{1:K}. To deal with this intractability, we use a deterministic approximation method based on variational Bayes.\n\nVariational Approximation. In our variational approach we introduce a new distribution q and make the following approximation^4\n\np(z^{1:N}, h^{1:N}_{1:T}, \u0398_{1:K} | v^{1:N}_{1:T}, \u02c6\u0398_{1:K}, \u03b3) \u2248 q(h^{1:N}_{1:T} | z^{1:N}) q(z^{1:N}) q(\u0398_{1:K}). (2)\n\nThat is, we approximate the posterior distribution of the hidden variables of the model by one in which the hidden states are decoupled from the parameters given the indicator variables and in which the indicators are decoupled from the parameters.\n\nThe approximation is achieved with a variational expectation-maximization algorithm which minimizes the KL divergence between the right and left hand sides of Equation (2) or, equivalently, maximizes a tractable lower bound on the log-likelihood, log p(v^{1:N}_{1:T} | \u02c6\u0398_{1:K}, \u03b3) \u2265 F(\u02c6\u0398_{1:K}, \u03b3, q), with respect to q for fixed \u02c6\u0398_{1:K} and \u03b3 and vice-versa. Sequence v^n_{1:T} is then placed in the most likely LGSSM by computing arg max_k q(z^n = k).\n\n^4 Here, we describe a collapsed approximation over \u03c0 [13]. To simplify the notation, we omit conditioning on v^{1:N}_{1:T}, \u02c6\u0398_{1:K}, \u03b3 for the q distribution.\n\nFigure 1: This figure shows one of the Balero motion templates found by our clustering method, i.e., the cluster C2 in Figure 2. Here, a sideways movement with a subsequent catch is performed and the uppermost row illustrates this movement with a symbolic sketch. The middle row shows an execution of the movement generated with the LGSSM representing the cluster C2. 
The lowest row shows a recorded human movement which was attributed to cluster C2 by our method. Note that movements generated from LGSSMs representing other clusters differ significantly.\n\nResulting Updates. While the space does not suffice for a complete derivation, we will briefly sketch the updates for q. Additional details and the updates for the hyperparameters can be found in [12]. The updates consist of a parameter update, an indicator variable update and a latent state update. First, the approximate parameter posterior is given by\n\nq(\u0398^k) \u221d p(\u0398^k | \u02c6\u0398^k) e^{\u2211_{n=1}^N q(z^n = k) \u27e8log p(v^n_{1:T}, h^n_{1:T} | \u0398^k)\u27e9_{q(h^n_{1:T} | z^n = k)}},\n\nwhere \u27e8\u00b7\u27e9_q denotes expectation with respect to q. The specific choice for p(\u0398^k | \u02c6\u0398^k) makes the computation of this posterior relatively straightforward, since q(\u0398^k) is a distribution of the same type. Second, the approximate posterior over the indicator variables is given by\n\nq(z^n = k) \u221d e^{H_q(h^n_{1:T} | z^n = k)} e^{\u27e8log p(z^n = k | z^{\u00acn}, \u03b3)\u27e9_{\u220f_{m\u2260n} q(z^m)}} e^{\u27e8log p(v^n_{1:T}, h^n_{1:T} | \u0398^k)\u27e9_{q(h^n_{1:T} | z^n = k) q(\u0398^k)}},\n\nwhere H_q(x) denotes the entropy of the distribution q(x) and z^{\u00acn} includes all indicator variables except for z^n. Due to the choice of a Dirichlet prior, the term p(z^n = k | z^{\u00acn}, \u03b3) = \u222b_\u03c0 p(z^n = k | z^{\u00acn}, \u03c0) p(\u03c0 | \u03b3) can be determined analytically. However, the required average over this term is computationally expensive and, thus, we approximate it using a second order expansion [13]. The third and most challenging update is the one of the hidden states\n\nq(h^n_{1:T} | z^n = k) \u221d e^{\u27e8log p(v^n_{1:T}, h^n_{1:T} | \u0398^k)\u27e9_{q(\u0398^k)}}. (3)\n\nWhilst computing this joint density is relatively straightforward, the parameter and indicator variable updates require the non-trivial estimation of the posterior averages \u27e8h^n_t\u27e9 and \u27e8h^n_t h^n_{t-1}\u27e9 with respect to this distribution. Following a similar approach to the one proposed in [14] for the Bayesian LGSSM, we reformulate the rhs of Equation (3) as proportional to the distribution of an augmented LGSSM such that standard inference routines for the LGSSM can be used.\n\n3 Results\n\nIn this section we show that the model presented in Section 2 can be used effectively both for inferring the motion templates underlying a set of human trajectories and for approximating motion templates with dynamical systems. For doing so, we take the difficult task of Balero, also known as Ball-In-A-Cup or Kendama, and collect human executions of this task using a motion capture setup.\n\nFigure 2: In this figure, we show nine plots where each plot represents one cluster found by our method. Each of the five shown trajectories in the respective clusters represents a different recorded Balero movement. For better visualization, we do not show joint trajectories here but rather the trajectories of the cup, which have an easier physical interpretation and, additionally, reveal the differences between the isolated clusters. All axes show units in meters. 
We show that the presented model successfully extracts meaningful human motion templates underlying Balero, and that the movements generated by the model are successful in simulation of the Balero task on an anthropomorphic SARCOS arm.\n\n3.1 Data Generation of Balero Motions\n\nIn the Balero game of dexterity, a human is given a toy consisting of a cup with a ball attached by a string. The goal of the human is to toss the ball into the cup. Humans perform a wide variety of different movements in order to achieve this task [9]. For example, three very distinct movements are: (i) swing the hand slightly upwards to the side and then go back to catch the ball, (ii) hold the cup high and then move very fast to catch the ball, and (iii) jerk the cup upwards and catch the ball in a fast downwards movement. Whilst the difference between these three movements is significant and can be easily detected visually, there exist many other movements for which this is not the case.\n\nWe collected 124 different Balero trajectories where the subject was free to select the employed movement. For doing so, we used a VICON data collection system which samples the trajectories at 200 Hz to track both the cup as well as all seven major degrees of freedom of the human arm. For the evaluation of our method, we considered the seven joint angles of the human presenter as well as the corresponding seven estimated joint velocities.\n\nIn the lowest row of Figure 1, we show how the human motion is collected with a VICON motion tracking setup. As we will see later, this specific movement is assigned by our method to cluster C2, whose representative generative LGSSM can be used successfully for imitating this motion (middle row). 
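The joint velocities used alongside the 200 Hz joint-angle recordings must be estimated from the sampled positions. The paper does not specify its estimator; the following is a minimal central-difference sketch under that assumption, with names of our own choosing.

```python
import numpy as np

def estimate_velocities(angles, fs=200.0):
    """Estimate joint velocities from joint-angle samples.

    A simple stand-in for the (unspecified) velocity estimation:
    central differences at sampling rate `fs` (200 Hz here, matching
    the VICON setup).  `angles` has shape (T, D) in radians; the
    result has the same shape in rad/s.
    """
    return np.gradient(angles, 1.0 / fs, axis=0)
```

In practice one would likely low-pass filter the angles before differencing to suppress measurement noise.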
A sketch of the represented movement is shown in the top row of Figure 1.\n\n3.2 Clustering and Imitation of Motion Templates\n\nWe trained the variational method with different initial conditions, hidden dimension H = 35 and a number of clusters K which varied from 20 to 50 in order to avoid suboptimal results due to local maxima.\n\nThe resulting clustering contains nine active motion templates. These are plotted in Figure 2 where, instead of the 14-dimensional joint angles and velocities, we show the three-dimensional cup trajectories resulting from these joint movements, as it is easier for humans to make sense of Cartesian trajectories. Clusters C1, C2 and C3 are movements to the side which subsequently catch the ball. Here, C1 is a short jerk, C3 appears to have a circular movement similar to a jerky movement, while C2 uses a longer but smoother movement to induce kinetic energy in the ball. Motion templates C4 and C5 are dropping movements where the cup moves down fast for more than 1.2 m and then catches the ball. The template C5 is a smoother movement than C4 with a wider catching movement. For C6 and C7, we observe a significantly different movement where the cup is jerked upwards, dragging the ball in this direction, and then catches the ball on the way down. Clusters C8 and C9 exhibit the most interesting movement where the main motion is forward-backwards and the ball swings into the cup. In C8 this task is achieved by moving upwards at the same time, while in C9 there is little loss of height.\n\nFigure 3: (a) Time-series recorded from two executions of the Balero movement assigned by our model to cluster C1. In the first and second rows are plotted the positions and velocities respectively (for better visualization, each time-series component is plotted with its mean removed). (b) Two executions of the Balero movement generated by our trained model using the probability distributions of cluster C1.\n\nTo generate Balero movements with our trained model, we can use the recursive formulation of the LGSSM given by Equation (1) where, for each cluster k, A^k, B^k and \u00b5^k_1 are replaced by the mean values of their inferred Gaussian q distributions, while the noise covariances are replaced by the modes of their Gamma q distributions. The initial hidden state h_1 and the noise elements \u03b7^h_t and \u03b7^v_t are sampled from their respective q distributions, whilst the inferred optimal values are used for \u00b5^k_{2:T}.\n\nIn Figure 3 (a) we plotted two recorded executions of the Balero task assigned by our model to cluster C1. As we can see, the two executions have similar dynamics but also display some differences due to human variability in performing the same type of movement. In Figure 3 (b) we plotted two executions generated by our model using the learned distributions representing cluster C1. Our model can generate time-series with very similar dynamics to those of the recorded time-series.\n\nTo investigate the accuracy of the obtained motion templates, we used them for executing Balero movements on a simulated anthropomorphic SARCOS arm. Inspired by Miyamoto et al. 
[15], a small visual feedback term based on a Jacobian transpose method was activated when the ball was within 3 cm in order to ensure task fulfillment. We found that our motion templates are accurate enough to generate successful task executions. This can be seen in Figure 1 for cluster C2 (middle row) and in the video on the authors\u2019 website.\n\n4 Conclusions\n\nIn this paper, we addressed the problem of automatic generation of skill libraries for both robot learning and human motion analysis as an unsupervised time-series clustering and learning problem based on human trajectories. We have introduced a novel Bayesian temporal mixture model based on a variational approximation method which is especially designed to enable the use of efficient inference algorithms. We demonstrated that our model gives rise to a meaningful clustering of human executions of the difficult game of dexterity Balero and is able to generate time-series which are very close to the recorded ones. Finally, we have shown that the model can be used to obtain successful executions of the Balero movements on a physically realistic simulation of the SARCOS Master Arm.\n\n5 Acknowledgments\n\nThe authors would like to thank David Barber for useful discussions and Betty Mohler for help with data collection.\n\nReferences\n\n[1] T. Flash and B. Hochner. Motor primitives in vertebrates and invertebrates. Current Opinion in Neurobiology, 15(6):660\u2013666, 2005.\n\n[2] B. Williams, M. Toussaint, and A. Storkey. Modelling motion primitives and their timing in biologically executed movements. In Advances in Neural Information Processing Systems 20, pages 1609\u20131616, 2008.\n\n[3] A. Ijspeert, J. Nakanishi, and S. Schaal. Learning attractor landscapes for learning motor primitives. In Advances in Neural Information Processing Systems 15, pages 1547\u20131554, 2003.\n\n[4] S. Calinon, F. Guenter, and A. Billard. 
On learning, representing and generalizing a task in a humanoid robot. IEEE Transactions on Systems, Man and Cybernetics, Part B, 37(2):286\u2013298, 2007.\n\n[5] J. Durbin and S. J. Koopman. Time Series Analysis by State Space Methods. Oxford Univ. Press, 2001.\n\n[6] Y. Xiong and D.-Y. Yeung. Mixtures of ARMA models for model-based time series clustering. In Proceedings of the IEEE International Conference on Data Mining, pages 717\u2013720, 2002.\n\n[7] C. Li and G. Biswas. A Bayesian approach to temporal data clustering using hidden Markov models. In Proceedings of the International Conference on Machine Learning, pages 543\u2013550, 2000.\n\n[8] J. Kober, B. Mohler, and J. Peters. Learning perceptual coupling for motor primitives. In Proceedings of the International Conference on Intelligent Robots and Systems, pages 834\u2013839, 2008.\n\n[9] S. Fogel, J. Jacob, and C. Smith. Increased sleep spindle activity following simple motor procedural learning in humans. Actas de Fisiologia, 7(123), 2001.\n\n[10] D. J. C. MacKay. Information Theory, Inference and Learning Algorithms. Cambridge Univ. Press, 2003.\n\n[11] D. Wipf, J. Palmer, and B. Rao. Perspectives on sparse Bayesian learning. In Advances in Neural Information Processing Systems 16, 2004.\n\n[12] S. Chiappa and D. Barber. Dirichlet mixtures of Bayesian linear Gaussian state-space models: a variational approach. Technical Report no. 161, MPI for Biological Cybernetics, T\u00fcbingen, Germany, 2007.\n\n[13] K. Kurihara, M. Welling, and Y. W. Teh. Collapsed variational Dirichlet process mixture models. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 2796\u20132801, 2007.\n\n[14] D. Barber and S. Chiappa. Unified inference for variational Bayesian linear Gaussian state-space models. In Advances in Neural Information Processing Systems 19, pages 81\u201388, 2007.\n\n[15] H. Miyamoto, S. Schaal, F. Gandolfo, Y. Koike, R. Osu, E. Nakano, Y. Wada, and M. Kawato. A Kendama learning robot based on bi-directional theory. Neural Networks, 9(8):1281\u20131302, 1996.\n", "award": [], "sourceid": 536, "authors": [{"given_name": "Silvia", "family_name": "Chiappa", "institution": null}, {"given_name": "Jens", "family_name": "Kober", "institution": null}, {"given_name": "Jan", "family_name": "Peters", "institution": null}]}