{"title": "Learning to Optimize Tensor Programs", "book": "Advances in Neural Information Processing Systems", "page_first": 3389, "page_last": 3400, "abstract": "We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution are key enablers of effective deep learning systems. However, existing systems rely on manually optimized libraries such as cuDNN where only a narrow range of server class GPUs are well-supported. The reliance on hardware specific operator libraries limits the applicability of high-level graph optimizations and incurs significant engineering costs when deploying to new hardware targets. We use learning to remove this engineering burden. We learn domain specific statistical cost models to guide the search of tensor operator implementations over billions of possible program variants. We further accelerate the search by effective model transfer across workloads. Experimental results show that our framework delivers performance competitive with state-of-the-art hand-tuned libraries for low-power CPU, mobile GPU, and server-class GPU.", "full_text": "Learning to Optimize Tensor Programs\n\nTianqi Chen1 Lianmin Zheng2 Eddie Yan1 Ziheng Jiang1 Thierry Moreau1\n\nLuis Ceze1 Carlos Guestrin1 Arvind Krishnamurthy1\n\n1Paul G. Allen School of Computer Science & Engineering, University of Washington\n\n2Shanghai Jiao Tong University\n\nAbstract\n\nWe introduce a learning-based framework to optimize tensor programs for deep\nlearning workloads. Ef\ufb01cient implementations of tensor operators, such as matrix\nmultiplication and high dimensional convolution, are key enablers of effective\ndeep learning systems. 
However, current systems rely on manually optimized\nlibraries, e.g., cuDNN, that support only a narrow range of server-class GPUs.\nSuch reliance limits the applicability of high-level graph optimizations and incurs\nsignificant engineering costs when deploying to new hardware targets. We use\nlearning to remove this engineering burden. We learn domain-specific statistical\ncost models to guide the search of tensor operator implementations over billions\nof possible program variants. We further accelerate the search using effective\nmodel transfer across workloads. Experimental results show that our framework\ndelivers performance that is competitive with state-of-the-art hand-tuned libraries\nfor low-power CPUs, mobile GPUs, and server-class GPUs.\n\n1 Introduction\n\nDeep learning (DL) has become ubiquitous in our daily lives. DL models can now recognize\nimages [23], understand natural language [38], play games [27], and automate system decisions (e.g.,\ndevice placement [26] and indexing [21]). Tensor operators, such as matrix multiplication and high\ndimensional convolution, are basic building blocks of DL models. Scalable learning systems [1, 4, 8,\n2] rely on manually optimized, high-performance tensor operation libraries, such as cuDNN, that\nare optimized for a narrow range of hardware devices. To optimize a tensor operator, programmers\nmust choose from many implementations that are logically equivalent but differ dramatically in\nperformance due to differences in threading, memory reuse, pipelining and other hardware factors.\nSupporting diverse hardware back-ends therefore requires tremendous engineering effort. 
Even\non currently supported hardware, developing DL frameworks and models is limited by the set of\noptimized operators in libraries, preventing optimizations (such as operator fusion) that can produce\nunsupported operators.\nThis research explores the following question: can we use learning to alleviate this engineering\nburden and automatically optimize tensor operator programs for a given hardware platform? Our\naffirmative answer is based on statistical cost models we built that predict program run time using a\ngiven low-level program. These cost models, which guide our exploration of the space of possible\nprograms, use transferable representations that generalize across different workloads to accelerate\nsearch. We make the following contributions:\n\u2022 We formalize the new problem of learning to optimize tensor programs and summarize its key\ncharacteristics.\n\u2022 We propose a machine learning framework to solve this problem.\n\u2022 We further accelerate the optimization by 2\u00d7 to 10\u00d7 using transfer learning.\n\n32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montr\u00e9al, Canada.\n\nFigure 1: Sample problem. For a given tensor operator specification (C_ij = \u2211_k A_ki B_kj), there are multiple\npossible low-level program implementations, each with different choices of loop order, tiling size, and other\noptions. Each choice creates a logically equivalent program with different performance. Our problem is to\nexplore the space of programs to find the fastest implementation.\n\nWe provide a detailed empirical analysis of component design choices in this framework. 
Experimental results on real-world DL workloads show that our framework yields end-to-end performance\nimprovements ranging from 1.2\u00d7 to 3.8\u00d7 over existing frameworks.\n\n2 Problem Formalization\n\nWe begin by walking through the motivating example in Figure 1. To enable automatic code\ngeneration, we specify tensor operators using index expressions (e.g., C_ij = \u2211_k A_ki B_kj). Let E\ndenote the space of index expressions. The index expression leaves many low-level implementation\ndetails, such as loop order, memory scope, and threading, unspecified. As a result, we can generate\nmultiple variants of low-level code that are logically equivalent to the expression for a given e \u2208 E.\nWe use Se to denote the space of possible transformations (schedules) from e to low-level code. For\nan s \u2208 Se, let x = g(e, s) be the generated low-level code. Here, g represents a compiler framework\nthat generates low-level code from e, s. We are interested in minimizing f (x), which is the real run\ntime cost on the hardware. Importantly, we do not know an analytical expression for f (x) but can\nquery it by running experiments on the hardware. For a given tuple of (g, e, Se, f ), our problem can\nbe formalized as the following objective:\n\narg min_{s \u2208 Se} f (g(e, s))    (1)\n\nThis problem formalization is similar to that of traditional hyper-parameter optimization problems [34,\n33, 35, 13, 17, 25] but with several key differentiating characteristics:\nRelatively Low Experiment Cost. Traditionally, hyper-parameter optimization problems incur\na high cost to query f, viz., running experiments could take hours or days. However, the cost of\ncompiling and running a tensor program is a few seconds. This property requires that model training\nand inference be fast; otherwise, there is no benefit over profiling execution on real hardware. 
It also\nmeans that we can collect more training data during optimization.\nDomain-Specific Problem Structure. Most existing hyper-parameter optimization algorithms\ntreat the problem as a black box. As we optimize programs, we can leverage their rich structures to\nbuild effective models.\nLarge Quantity of Similar Operators. An end-to-end DL system must optimize tensor operator\nprograms for different input sizes, shapes, and data layout configurations. These tasks are similar and\ncan offer opportunities for transfer learning.\nWe describe two key prerequisites for automatic code generation that is competitive with hand-optimized\ncode. (1) We need to define an exhaustive search space Se that covers all hardware-aware\noptimizations in hand-tuned libraries. (2) We need to efficiently find an optimal schedule in Se.\nThere are many domain-specific languages (DSLs) for code generation [32, 36, 15, 37, 20, 30], each\nwith a different E, Se and g. Polyhedral models [5, 42, 41] are a popular choice for Se; they\nmodel the loop domains as integer linear constraints. An alternative approach originating from\nHalide [32] defines a schedule space using a set of transformation primitives. 
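As a rough illustration of what a schedule built from transformation primitives does, the sketch below writes out by hand the default loop nest for C[y][x] = sum over k of A[k][y] * B[k][x] and a tiled, reordered variant of it, then checks that every tile-size choice computes the same result. It is plain Python rather than any of the DSLs cited above; the function names matmul_default and matmul_tiled, the 8 x 8 problem size, and the tile sizes are assumptions made only for this example.

```python
import random


def matmul_default(A, B, n):
    # Default loop nest: C[y][x] = sum_k A[k][y] * B[k][x]
    C = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            for k in range(n):
                C[y][x] += A[k][y] * B[k][x]
    return C


def matmul_tiled(A, B, n, ty, tx):
    # Same computation after splitting y and x into (outer, inner) loops
    # and reordering the loops to (yo, xo, k, yi, xi).
    assert n % ty == 0 and n % tx == 0
    C = [[0] * n for _ in range(n)]
    for yo in range(n // ty):
        for xo in range(n // tx):
            for k in range(n):
                for yi in range(ty):
                    for xi in range(tx):
                        y, x = yo * ty + yi, xo * tx + xi
                        C[y][x] += A[k][y] * B[k][x]
    return C


random.seed(0)
n = 8
# Integer inputs keep the equality check exact.
A = [[random.randint(-3, 3) for _ in range(n)] for _ in range(n)]
B = [[random.randint(-3, 3) for _ in range(n)] for _ in range(n)]

reference = matmul_default(A, B, n)
tiled_results = {
    (ty, tx): matmul_tiled(A, B, n, ty, tx)
    for ty in (1, 2, 4, 8)
    for tx in (1, 2, 4, 8)
}
all_equal = all(result == reference for result in tiled_results.values())
```

Each (ty, tx) pair plays the role of one schedule s: the output is identical, and only the loop structure, and hence the run time on real hardware, differs.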
Improving Se is an\nimportant research direction that is beyond the scope of this paper; we pick a rich Se and focus on\nschedule optimization in the rest of the paper.\n\n[Figure 1 panels, recovered from the extracted figure text: compute expression e; default code x0; loop tiling, x1 = g(e, s1); tiling mapped to micro-kernel intrinsics, x2 = g(e, s2).]\n\nCompute expression e:\n  A = t.placeholder((1024, 1024))\n  B = t.placeholder((1024, 1024))\n  k = t.reduce_axis((0, 1024))\n  C = t.compute((1024, 1024), lambda y, x: t.sum(A[k, y] * B[k, x], axis=k))\n\nDefault code x0:\n  for y in range(1024):\n    for x in range(1024):\n      C[y][x] = 0\n      for k in range(1024):\n        C[y][x] += A[k][y] * B[k][x]\n\nSchedule s1 (loop tiling):\n  yo, xo, yi, xi = s[C].tile(y, x, ty, tx)\n  s[C].reorder(yo, xo, k, yi, xi)\n\nGenerated code x1 = g(e, s1):\n  for yo in range(1024 / ty):\n    for xo in range(1024 / tx):\n      C[yo*ty:yo*ty+ty][xo*tx:xo*tx+tx] = 0\n      for k in range(1024):\n        for yi in range(ty):\n          for xi in range(tx):\n            C[yo*ty+yi][xo*tx+xi] += A[k][yo*ty+yi] * B[k][xo*tx+xi]\n\nSchedule s2 (tiling, map to micro kernel intrinsics):\n  yo, xo, ko, yi, xi, ki = s[C].tile(y, x, k, 8, 8, 8)\n  s[C].tensorize(yi, intrin.gemm8x8)\n\nGenerated code x2 = g(e, s2):\n  for yo in range(128):\n    for xo in range(128):\n      intrin.fill_zero(C[yo*8:yo*8+8][xo*8:xo*8+8])\n      for ko in range(128):\n        intrin.fused_gemm8x8_add(\n          C[yo*8:yo*8+8][xo*8:xo*8+8],\n          A[ko*8:ko*8+8][yo*8:yo*8+8],\n          B[ko*8:ko*8+8][xo*8:xo*8+8])\n\nFigure 2: Framework for learning to optimize tensor programs.\n\nWe use primitives from an existing code generation framework [9] to form Se. Our search space\nincludes multi-level tiling on each loop axis, loop ordering, shared memory caching for GPUs, and\nannotations such as unrolling and vectorization. The search space size |Se| can be on the order of\nbillions of possible implementations for a single GPU operator. 
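To get a feel for how quickly such a space grows, the following sketch counts only the multi-level tiling choices for a loop nest. It assumes, purely for illustration, that each loop axis of a given extent is split into a fixed number of tile levels whose sizes multiply to the extent. The helper names count_tilings and tiling_space_size are hypothetical, and the count ignores loop ordering, caching, and annotation choices, so it is not the actual size of the space used in this work.

```python
from functools import lru_cache
from math import prod


@lru_cache(maxsize=None)
def count_tilings(extent, levels):
    # Number of ordered ways to write `extent` as a product of
    # `levels` positive integer tile sizes.
    if levels == 1:
        return 1
    return sum(
        count_tilings(extent // d, levels - 1)
        for d in range(1, extent + 1)
        if extent % d == 0
    )


def tiling_space_size(extents, levels):
    # Tiling choices on different axes are independent, so the counts multiply.
    return prod(count_tilings(n, levels) for n in extents)


# Three loop axes (y, x, k) of extent 1024, as in the matrix multiplication example.
size_4_level = tiling_space_size((1024, 1024, 1024), 4)
size_5_level = tiling_space_size((1024, 1024, 1024), 5)
```

Under these assumptions a single axis of extent 1024 admits 286 four-level splits, and three such axes with five-level tiling already exceed one billion combinations before any other schedule choice is taken into account.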
As we \ufb01nd in section 6 , our choice\nof Se can contain programs competitive with hand-optimized libraries.\n3 Learning to Optimize Tensor Programs\n\nWe propose a machine learning (ML)-based framework to solve this problem. Figure 2 presents\nthe framework and its modules. We build a statistical cost model \u02c6f (x) to estimate the cost of each\nlow-level program x. An exploration module proposes new schedule con\ufb01gurations to run on the\nhardware. The run time statistics are collected in a database D = {(ei, si, ci)}, which can in turn be\nused to update \u02c6f. We discuss module-speci\ufb01c design choices in the following subsections.\n\n3.1 Statistical Cost Model\n\nThe \ufb01rst statistical model we support is based on gradient boosted trees [11](GBTs). We extract\ndomain-speci\ufb01c features from a given low-level abstract syntax tree (AST) x. The features include\nloop structure information (e.g., memory access count and data reuse ratio) and generic annotations\n(e.g., vectorization, unrolling, thread binding). We use XGBoost [7], which has proven to be a strong\nfeature-based model in past problems. Our second model is a TreeGRU[39], which recursively\nencodes a low-level AST into an embedding vector. We map the embedding vector to a \ufb01nal predicted\ncost using a linear layer.\nGBT and TreeGRU represent two distinct ML approaches to problem resolution. Both are valuable,\nbut they offer different bene\ufb01ts. GBT relies on precise feature extraction and makes fast predictions\nusing CPUs. TreeGRU, the deep learning-based approach, is extensible and requires no feature\nengineering, but it lags in training and predictive speed. 
We apply batching to the TreeGRU model\nand use GPU acceleration to make training and prediction fast enough to be usable in our framework.\n\n3.2 Training Objective Function\n\nWe can choose from multiple objective functions to train a statistical cost model for a given collection\nof data D = {(e_i, s_i, c_i)}. A common choice is the regression loss function \u2211_i (\u02c6f(x_i) \u2212 c_i)^2, which\nencourages the model to predict cost accurately. On the other hand, as we care only about the relative\norder of program run times rather than their absolute values in the selection process, we can instead\nuse the following rank loss function [6]:\n\n\u2211_{i,j} log(1 + e^{\u2212sign(c_i \u2212 c_j)(\u02c6f(x_i) \u2212 \u02c6f(x_j))}).    (2)\n\nWe can use the prediction \u02c6f(x) to select the top-performing implementations.\n\n3.3 Exploration Module\n\nThe exploration module controls the search loop, which is summarized in Algorithm 1. At each\niteration, it must pick a batch of candidate programs based on \u02c6f(x) and query f(x) on real hardware.\nWe cannot simply enumerate the entire space of S_e and pick the top-b candidates due to the size\nof the search space. 
Instead, we use simulated annealing [19] with \u02c6f(x) as the energy function.\n\n[Figure 2 diagram labels: Expression, Schedule Space, Exploration Module, Cost Model, Hardware Environment, Code Generator, Objective Function; symbols: e, (e, s), \u02c6f(x), f(x), x = g(e, s), D, S_e; edge labels: experiment feedback, update, history data.]\n\nAlgorithm 1: Learning to Optimize Tensor Programs\nInput : Transformation space S_e\nOutput : Selected schedule configuration s\u2217\nD \u2190 \u2205\nwhile n_trials < max_n_trials do\n// Pick the next promising batch\nQ \u2190 run parallel simulated annealing to collect candidates in S_e using energy function \u02c6f\nS \u2190 run greedy submodular optimization to pick (1 \u2212 \u03b5)b-subset from Q by 
maximizing Equation 3\nS \u2190 S \u222a {randomly sample \u03b5b candidates}\n// Run measurement on hardware environment\nfor s in S do\nc \u2190 f(g(e, s)); D \u2190 D \u222a {(e, s, c)}\nend\n// Update cost model\nupdate \u02c6f using D\nn_trials \u2190 n_trials + b\nend\ns\u2217 \u2190 history best schedule configuration\n\nSpecifically, we use a batch of parallel Markov chains to improve the prediction throughput of the\nstatistical cost model. We select the top-performing batch of candidates to run on real hardware. The\ncollected performance data is used to update \u02c6f. We make the states of the Markov chains persistent\nacross \u02c6f updates. We also apply \u03b5-greedy selection, choosing \u03b5b candidates at random (e.g., \u03b5 = 0.05), to ensure\nexploration.\nDiversity-Aware Exploration. We consider both quality and diversity when selecting b candidates\nfor hardware evaluation. Assume that the schedule configuration s can be decomposed into m\ncomponents s = [s_1, s_2, \u00b7\u00b7\u00b7, s_m]. We maximize the following objective to select candidate set S from\nthe top \u03bbb candidates:\n\nL(S) = \u2212\u2211_{s\u2208S} \u02c6f(g(e, s)) + \u03b1 \u2211_{j=1}^{m} |\u222a_{s\u2208S} {s_j}|    (3)\n\nThe first term encourages us to pick candidates with low run time costs. The second term counts the\nnumber of different configuration components that are covered by S. L(S) is a submodular function,\nand we can apply the greedy algorithm [29, 22] to get an approximate solution.\nUncertainty Estimator. Bayesian optimization methods [34, 33, 35, 17] use acquisition functions\nother than the mean when an uncertainty estimate of \u02c6f is available. Typical choices include expected\nimprovement (EI) and upper confidence bound (UCB). We can use bootstrapping to get the model\u2019s\nuncertainty estimate and validate the effectiveness of these methods. 
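
As an illustration of the two acquisition functions named above, the following minimal Python sketch builds a bootstrap ensemble and derives EI and a confidence-bound score from its predictions. It is a sketch under stated assumptions, not the implementation evaluated in Section 6: the one-dimensional least-squares model fit_line, the other helper names, the sample data, and the parameter kappa are all hypothetical. Because cost is minimized here, the optimistic bound is written as the ensemble mean minus kappa times the ensemble standard deviation, and EI assumes a Gaussian predictive distribution.

```python
import math
import random
import statistics


def fit_line(points):
    # Least-squares fit of cost = slope * feature + intercept (toy stand-in
    # for the cost model; an assumption made for brevity).
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0.0:
        # Degenerate resample with one distinct feature value: constant model.
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / var_x
    return slope, mean_y - slope * mean_x


def bootstrap_models(data, n_models, rng):
    # One model per bootstrap resample (sampling with replacement).
    return [fit_line([rng.choice(data) for _ in data]) for _ in range(n_models)]


def predict_with_uncertainty(models, x):
    # Ensemble mean and standard deviation of the predicted cost at x.
    preds = [slope * x + intercept for slope, intercept in models]
    return statistics.mean(preds), statistics.pstdev(preds)


def optimistic_cost(mean, std, kappa=1.0):
    # Confidence-bound score for a cost that is minimized (lower is better).
    return mean - kappa * std


def expected_improvement(mean, std, best_cost):
    # Expected reduction below best_cost under a Gaussian predictive model.
    if std == 0.0:
        return max(best_cost - mean, 0.0)
    z = (best_cost - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best_cost - mean) * cdf + std * pdf


# Hypothetical measurements: (scalar feature, measured cost).
data = [(1.0, 5.2), (2.0, 4.1), (3.0, 3.3), (4.0, 2.0), (5.0, 1.4)]
models = bootstrap_models(data, n_models=20, rng=random.Random(0))
mean, std = predict_with_uncertainty(models, 6.0)
best = min(cost for _, cost in data)
print(optimistic_cost(mean, std), expected_improvement(mean, std, best))
```

Either score can replace the plain mean prediction when ranking candidates; with a zero ensemble spread, both reduce to comparisons on the mean alone.
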
As we will see in Section 6, considering uncertainty does not improve the search in our problem. However, the choice of\nacquisition function remains a worthy candidate for further exploration.\n\n4 Accelerating Optimization via Transfer Learning\n\nThus far, we have focused only on learning to optimize a single tensor operator workload. In practice,\nwe need to optimize for many tensor operators with different input shapes and data types. In a real-world\nsetting, the system collects historical data D\u2032 from previously seen workloads. We can apply\ntransfer learning to effectively use D\u2032 to speed up the optimization.\nThe key to transfer learning is to create a transferable representation that is invariant to the source\nand target domains. We can then share the cost model using the common representation across\ndomains. Different choices of representations may have different levels of invariance.\nA common practice in Bayesian optimization methods is to directly use configuration s as the model\u2019s\ninput. However, the search space specification can change for different workloads or when the\nuser specifies a new search space for the same workload. The configuration representation s is not\ninvariant to changes in the search space.\n\nFigure 3: Possible ways to encode the low-level loop AST.\n\nWorkload Name  H, W     IC, OC   K, S\nC1             224,224  3,64     7,2\nC2             56,56    64,64    3,1\nC3             56,56    64,64    1,1\nC4             56,56    64,128   3,2\nC5             56,56    64,128   1,2\nC6             28,28    128,128  3,1\nC7             28,28    128,256  3,2\nC8             28,28    128,256  1,2\nC9             14,14    256,256  3,1\nC10            14,14    256,512  3,2\nC11            14,14    256,512  1,2\nC12            7,7      512,512  3,1\n\nTable 1: Configurations of all conv2d operators in a single batch ResNet-18 inference. 
H, W denote height and\nwidth, IC input channels, OC output channels, K kernel size, and S stride size.\n\nOn the other hand, the low-level loop AST x (Figure 3a) is a shared representation of programs that\nis invariant to the search space. To leverage this invariance, our cost model \u02c6f(x) takes the low-level\nloop AST x as input. We also need to encode x into a vector space to perform prediction. The specific\nencoding of x can also result in different levels of invariance.\nContext Relation Features for GBT. We define context features at each loop level to represent\nloop characteristics. A simple representation of context features is a vector (e.g., in Figure 3b, where\neach loop has a row of features). Context features are informative but, crucially, cannot generalize\nacross different loop nest patterns; we define context relation features to overcome this issue.\nTo build context relation features, we treat context vectors as a bag of points and extract features\nthat model relations between feature axes. Formally, let Z be the context feature matrix\nsuch that Z_{ki} corresponds to the i-th feature of loop k. We define a set of log2-spaced constant\nthresholds \u03b2 = [\u03b2_1, \u03b2_2, \u00b7\u00b7\u00b7, \u03b2_m]. The relation feature between features i and j is defined\nas R^{(ij)}_t = max_{k \u2208 {k | Z_{kj} < \u03b2_t}} Z_{ki}. This encoding summarizes useful relations, such as loop count\nvs. touched memory size (related to the memory hierarchy of the access), that affect run time cost.\nContext Encoded TreeGRU. An invariant representation also exists for the neural-based model.\nFigure 3c shows a way to encode the program by learning an embedding vector for each identifier\nand summarizing the AST using TreeGRU. This model works well for modeling a single workload.\nHowever, the set of loop variables can change across different domains, and we do not have embeddings\nfor the new loop variables. 
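
Both the relation features defined above and the context-encoded TreeGRU consume the same per-loop context vectors, so a small worked example may help. The following Python sketch computes the relation feature R^{(ij)}_t for a tiny context feature matrix. The matrix Z, the thresholds, the function name relation_features, the choice to skip pairs with i = j, and the fallback empty_value for thresholds that no loop satisfies are all assumptions made for illustration; the text does not specify them.

```python
def relation_features(Z, betas, empty_value=0.0):
    # R[(i, j)][t] = max of Z[k][i] over loops k with Z[k][j] < betas[t].
    # Z holds one context vector (row) per loop level. The output size depends
    # only on the number of features and thresholds, not on the number of loops.
    n_features = len(Z[0])
    features = {}
    for i in range(n_features):
        for j in range(n_features):
            if i == j:
                continue  # assumption: only relations between distinct features
            row = []
            for beta in betas:
                selected = [loop[i] for loop in Z if loop[j] < beta]
                # empty_value is an assumed fallback when no loop qualifies.
                row.append(max(selected) if selected else empty_value)
            features[(i, j)] = row
    return features


# Hypothetical context matrix: column 0 = loop length, column 1 = touched memory.
Z = [[2, 4], [8, 32], [64, 256]]
betas = [8, 64, 512]  # log2-spaced thresholds: 2**3, 2**6, 2**9
R = relation_features(Z, betas)
print(R[(0, 1)])  # [2, 8, 64]
print(R[(1, 0)])  # [4, 32, 256]
```

Adding or removing a loop level changes the rows of Z but not the shape of the result, which is what lets a feature-based model compare programs with different loop nests.
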
We instead encode each loop variable using the context vector extracted\nfor GBT to summarize the AST (Figure 3d). We scatter each loop level\u2019s embedding h into m vectors\nusing the rule out_i = softmax(W^T h)_i h. Conceptually, the softmax classifies the loop level into a\nmemory slot in out. Then, we sum the scattered vectors of all loop levels to get the final embedding.\nOnce we have a transferable representation, we can use a simple transfer learning method by\ncombining a global model and an in-domain local model, as follows:\n\n\u02c6f(x) = \u02c6f^{(global)}(x) + \u02c6f^{(local)}(x).    (4)\n\nThe global model \u02c6f^{(global)}(x) is trained on D\u2032 using the invariant representation; it helps to make\neffective initial predictions before we have sufficient data to fit \u02c6f^{(local)}(x).\n\n5 Prior Work\n\nBlack box optimization (auto-tuning) is used in high-performance computing libraries such as\nATLAS [43] and FFTW [12]. Alternatively, a hardware-dependent cost model can be built to guide\nthe search [28, 5]. Polyhedral methods [5, 42] use integer linear programming to optimize cost.\nTensor Comprehensions [41] combine both approaches, using black-box optimization to choose\nparameters of thread blocks and polyhedral optimization to generate internal loops. Black-box\napproaches can require many experiment trials to explore a huge S_e. On the other hand, predefined\ncost models may not be sufficiently accurate to capture the complexity of modern hardware and must\nbe manually redefined for each new hardware target.\n\n[Figure 3 panels: (a) Low level AST of the loop nest for y in range(8): for x in range(8): C[y][x] = 0; for k in range(8): C[y][x] += A[k][y] * B[k][x]; (b) Loop context vectors (touched memory of C, A, B and outer loop length for loops y, x, k); (c) Vanilla TreeGRU; (d) Context Encoded TreeGRU with soft scatter and final embedding.]\n\nFigure 4: Statistical cost model vs. 
genetic algorithm (GA) and random search (Random) evaluated on NVIDIA\nTITAN X. \u2019Number of trials\u2019 corresponds to number of evaluations on the real hardware. We also conducted\ntwo hardware evaluations per trial in Random \u00d72 and GA \u00d72. Both the GBT- and TreeGRU-based models\nconverged faster and achieved better results than the black-box baselines.\n\nFigure 5: Rank vs. Regression objective function evaluated on NVIDIA TITAN X. The rank-based objective\neither outperformed or performed the same as the regression-based objective in presented results.\n\nPreviously, statistical cost models have been applied to optimize SAT solvers [17, 18]. We apply this\nidea to our problem and build a domain-speci\ufb01c cost model that enables effective transfer among\nworkloads. A recent trend is to use deep neural networks to perform program analysis [3, 10]. Our\nnew problem setting and experiment environment can serve as a testbed for unexplored research\nopportunities in related directions.\n\n6 Experiments\n\n6.1 Component Evaluations\n\nWe \ufb01rst evaluated each design choice in the framework. Component evaluations were based on\nconvolution workloads in ResNet-18 [14] for ImageNet classi\ufb01cation (Table 1). Due to space\nlimitations, we show component evaluation results only on representative workloads; the complete\nset of results is reported in the supplementary material. All methods compared in this subsection\nwere initialized with no historical data. Section 6.2 evaluates the transfer learning setting.\nImportance of Statistical Cost Model. Figure 4 compares the performance of the statistical cost\nmodel to black-box methods. Both the GBT and TreeGRU models outperformed the black-box\nmethods and found operators that were 2\u00d7 faster than those found with random searches. 
This\nresult is particularly interesting compared to prior results in hyper-parameter tuning [25], where\nmodel-based approaches were shown to work only as well as random searching. Our statistical\nmodels benefit from domain-specific modeling and help the framework find better configurations.\n\nChoice of Objective Function. We compared the two objective functions in Figure 5 on both types\nof models. In most cases, we found that using a rank-based objective was slightly better than using a\nregression-based one: the rank-based objective may have sidestepped the potentially challenging task\nof modeling absolute cost values. We chose rank as our default objective.\n\nImpact of Diversity-Aware Exploration. We evaluated the impact of the diversity-aware exploration\nobjective in Figure 6. Most of the workloads we evaluated showed no positive or negative\nimpact from diversity-based selection. However, diversity-aware exploration improved C6, which\nsuggests that the approach can be useful. We adopted the diversity-aware strategy since it\ncan be helpful, has no meaningful negative impact, and negligibly affects run time.\n\n[Figure 4 plots: TFLOPS vs. number of trials for workloads C1, C2, C5, and C6; legend: GBT, TreeGRU, GA, GA X2, Random, Random X2.]\n\n[Figure 5 plots: TFLOPS vs. number of trials for workloads C1, C2, C5, and C6; legend: GBT Rank, TreeGRU Rank, GBT Regression, TreeGRU Regression.]\n\nFigure 6: Impact of diversity-aware selection with different choices of \u03bb evaluated on NVIDIA TITAN X.\nDiversity-aware selection had no positive or negative impact on most of the evaluated workloads.\n\nFigure 7: Impact of uncertainty-aware acquisition functions evaluated on NVIDIA TITAN X. Uncertainty-aware\nacquisition functions yielded no improvements in our evaluations.\n\nFigure 8: Impact of transfer learning. 
Transfer-based models quickly found better solutions.\n\nImpact of Uncertainty Estimator. We evaluated the usefulness of uncertainty-aware acquisition\nfunctions in Figure 7. We measured uncertainty by training five models using\nbootstrapping. We used the regression objective in this setting, similar to its use in most Bayesian\noptimization methods. Results show that uncertainty estimation was not as important in our problem,\npossibly because our models were trained with more training samples than traditional hyper-parameter\noptimization problems.\n\n6.2 Transfer Learning Evaluations\n\nThe evaluations presented so far used no historical data. This subsection evaluates the improvements\nobtainable with transfer learning.\n\nImprovements by Transfer. We first evaluated general improvements made possible by transfer\nlearning. We randomly picked samples from D\u2032 collected from C1, C2, C3, C4, C5, C6 and used them\nto form the source domain (30000 samples in the TITAN X experiment and 20000 samples in the\nARM GPU and ARM A53 experiments). We then compared the performance of transfer-enabled\nmethods to learning from scratch for target workloads C7, C8, C9. Results are shown in Figure 8.\nOverall, using transfer learning yielded a 2\u00d7 to 10\u00d7 speedup. This approach is especially important\nfor real DL compilation systems, which continuously optimize incoming workloads.\n\nInvariant Representation and Domain Distance. As discussed in Section 4, different representations\nhave different levels of invariance. 
We used three scenarios to study the relationship between\ndomain distance and the invariance of feature representations: (1) running optimizations on only one\ntarget domain; (2) C1\u2013C6\u2192C7: C1\u2013C6 as source domain and C7 as target domain (transfer within same\noperator type); (3) C1\u2013C6\u2192Matmul-1024: C1\u2013C6 as source domain and matrix multiplication as\ntarget domain (transfer across operator types). Results (Figure 9) show the need for more invariance\nwhen domains are farther apart. Using our transferable feature representation, our model generalized\nacross different input shapes and operator types. We also ran a preliminary study on transfer from\nan ARM Mali GPU to an ARM Cortex-A53 (Figure 9d), showing that the proposed representation\nenabled transfer across devices. Developing an invariant feature representation poses a difficult\nproblem worthy of additional research.\n\n[Figure 6 plots: TFLOPS vs. number of trials for workloads C1, C2, C5, and C6; legend: \u03bb=1, \u03bb=2, \u03bb=4.]\n\n[Figure 7 plots: TFLOPS vs. number of trials for workloads C1, C2, C5, and C6; legend: Expected Improvement, Upper Confidence Bound, Mean.]\n\n[Figure 8 plots: TFLOPS vs. number of trials for C1\u2013C6\u2192C7, C1\u2013C6\u2192C8, C1\u2013C6\u2192C9, and C1\u2013C6\u2192Matmul-1024; legend: GBT-Transfer, TreeGRU-Transfer, GBT, TreeGRU.]\n\nFigure 9: Comparison of different representations in different transfer domain settings. The configuration-based\nmodel can be viewed as a typical Bayesian optimization approach (batched version of SMAC [17]). We found\nthat models using configuration space features worked well within a domain but were less useful across domains.\nThe flattened AST features worked well when transferring across convolution workloads but were not useful\nacross operator types. Context relation representation allowed effective transfer across operator types.\n\n(a) Optimization curves in wall clock time. (We set cuDNN v7, Tensorflow Lite and ARM ComputeLibrary\nv18.03 as the baselines for TITAN X, ARM A53 and ARM Mali-T860, respectively.)\n\n(b) NVIDIA TITAN X Single Op\n\n(c) ARM Cortex-A53 Single Op\n\nFigure 10: Single operator performance on the TITAN X and ARM CPU. (Additional ARM GPU (Mali) results\nare provided in the supplementary material.) We also included a weight pre-transformed Winograd kernel [24]\nfor 3 \u00d7 3 conv2d (AutoTVM PT). AutoTVM generated programs that were competitive with hardware-specific\nlibraries.\n\n6.3 End-to-End Evaluation\n\nThus far, our evaluation has focused on specific design choices in our framework. We now segue to\nthe natural follow-up question: can learning to optimize tensor programs improve real-world deep\nlearning systems on diverse hardware targets? We call our framework AutoTVM. We compared our\napproach to existing DL frameworks backed by highly engineered hardware-specific libraries on\ndiverse hardware back-ends: a server class GPU, an embedded CPU, and a mobile GPU. Note that\nAutoTVM performs optimization and code generation with no external operator library.\nWe first evaluated single-operator optimization against baselines that used hardware-specific\nlibraries. The baselines were: cuDNN v7 for the NVIDIA GPU, TFLite (commit: 7558b085) for\nthe Cortex-A53, and the ARM Compute Library (v18.03) for the ARM Mali GPU. 
We also included TensorComprehensions (commit: ef644ba) [41] as an additional baseline for the TITAN X.\u00b9\nTensorComprehensions used 2 random seeds \u00d7 25 generations \u00d7 200 population for each operator,\nand padding was removed (TC does not yet support padding). The results are shown in Figure 10.\nAutoTVM generated high-performance tensor programs across different hardware back-ends.\n\n[Figure 9 plots: TFLOPS vs. number of trials; panels: (a) TITAN X, C7 in domain; (b) TITAN X, C1\u2013C6\u2192C7; (c) TITAN X, C1\u2013C6\u2192Matmul-1024; (d) Mali GPU C1\u2013C6\u2192A53 CPU C7; legend: GBT on Configuration S, GBT on Flatten Loop Context x, GBT on Context Relation R, GBT No Transfer.]\n\n[Figure 10 plots: (a) relative speedup vs. time (second) for C7 on NVIDIA TITAN X, C8 on NVIDIA TITAN X, C7 on ARM Cortex-A53, and C7 on ARM Mali-T860; legend: Baseline, TensorComprehensions, Random, AutoTVM, AutoTVM Transfer; (b) relative speedup on C1\u2013C12; legend: cuDNN, TensorComprehensions, AutoTVM, AutoTVM PT; (c) relative speedup on C1\u2013C12; legend: Tensorflow Lite, AutoTVM.]\n\n(a) NVIDIA TITAN X End2End\n\n(b) ARM Cortex-A53 End2End\n\n(c) ARM Mali-T860 End2End\n\nFigure 11: End-to-end performance across back-ends.\u00b2 AutoTVM outperforms the baseline methods.\n\nFurther, we embedded our framework into an existing DL graph compiler stack and performed end-to-end\nworkload evaluation. We evaluated real world end-to-end DL inference workloads, including\nResNet [14], MobileNet [16], LSTM Language Model [44], Deep Q Network (DQN) [27], and Deep\nConvolutional Generative Adversarial Networks (DCGAN) [31]. Our baselines were: MXNet (v1.1),\nTensorflow (v1.7) for the GPU, TFLite (commit: 7558b085) for the Cortex A53, and ARM Compute\nLibrary (v18.03) for the ARM Mali GPU. Results are summarized in Figure 11. AutoTVM improved\nend-to-end performance by 1.2\u00d7 to 3.8\u00d7. 
These improvements were due to both tensor program\noptimization and operator fusion optimizations; the latter would be impossible if we used\nlibraries with a limited set of operators.\n\n[Figure 11 plots: time (ms); (a) ResNet-18, MobileNet, LSTM LM, DQN, DCGAN; legend: Tensorflow XLA, Tensorflow, MXNet, AutoTVM; (b) ResNet-18, MobileNet, DQN; legend: Tensorflow Lite, AutoTVM; (c) ResNet-18, MobileNet, DQN; legend: ARMComputeLib, AutoTVM.]\n\n\u00b9 According to personal communication [40], TC is not yet intended for use in compute-bound problems.\nHowever, it still provides a good reference baseline for inclusion in the comparison.\n\n\u00b2 DCGAN and LSTM were not reported on A53 and Mali because they are not yet supported by baseline\nsystems.\n\n7 Discussion and Conclusion\n\nWe presented AutoTVM: a machine learning-based framework that automatically optimizes the\nimplementation of tensor operators in deep learning systems. Our statistical cost model allows\neffective model sharing between workloads and speeds up the optimization process via model transfer.\nThe positive experimental results of this new approach show promise for DL deployment. Beyond\nour solution framework, the specific characteristics of this new problem make it an ideal testbed\nfor innovations in related areas, such as neural program modeling, Bayesian optimization, transfer\nlearning, and reinforcement learning. On the systems side, learning to optimize tensor programs can\nenable more fused operators, data layouts, and data types across diverse hardware back-ends, which is crucial\nto improving DL systems. Our framework can be found at https://tvm.ai.\n\nAcknowledgement\n\nWe would like to thank members of Sampa, SAMPL and Systems groups at the Allen School for their\nfeedback on the work and manuscript. This work was supported in part by a Google PhD Fellowship\nfor Tianqi Chen, ONR award #N00014-16-1-2795, NSF under grants CCF-1518703, CNS-1614717,\nand CCF-1723352, and gifts from Intel (under the CAPA program), Oracle, Huawei and anonymous\nsources.\n\nReferences\n\n[1] Martin Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin,\nSanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga,\nSherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin\nWicke, Yuan Yu, and Xiaoqiang Zheng. Tensorflow: A system for large-scale machine learning. In 12th\nUSENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pages 265\u2013283, 2016.\n\n[2] Amit Agarwal, Eldar Akchurin, Chris Basoglu, Guoguo Chen, Scott Cyphers, Jasha Droppo, Adam\nEversole, Brian Guenter, Mark Hillebrand, Ryan Hoens, Xuedong Huang, Zhiheng Huang, Vladimir\nIvanov, Alexey Kamenev, Philipp Kranen, Oleksii Kuchaiev, Wolfgang Manousek, Avner May, Bhaskar\nMitra, Olivier Nano, Gaizka Navarro, Alexey Orlov, Marko Padmilac, Hari Parthasarathi, Baolin Peng,\nAlexey Reznichenko, Frank Seide, Michael L. Seltzer, Malcolm Slaney, Andreas Stolcke, Yongqiang\nWang, Huaming Wang, Kaisheng Yao, Dong Yu, Yu Zhang, and Geoffrey Zweig. An introduction to\ncomputational networks and the computational network toolkit. Technical Report MSR-TR-2014-112,\nAugust 2014.\n\n[3] Miltiadis Allamanis, Marc Brockschmidt, and Mahmoud Khademi. Learning to represent programs with\ngraphs. In International Conference on Learning Representations, 2018.\n\n[4] Fr\u00e9d\u00e9ric Bastien, Pascal Lamblin, Razvan Pascanu, James Bergstra, Ian J. Goodfellow, Arnaud Bergeron,\nNicolas Bouchard, and Yoshua Bengio. 
Theano: new features and speed improvements. Deep Learning\nand Unsupervised Feature Learning NIPS 2012 Workshop, 2012.\n\n[5] Uday Bondhugula, Albert Hartono, J. Ramanujam, and P. Sadayappan. A practical automatic polyhedral\nparallelizer and locality optimizer. In Proceedings of the 29th ACM SIGPLAN Conference on Programming\nLanguage Design and Implementation, PLDI \u201908, pages 101\u2013113. ACM, 2008.\n\n[6] Chris Burges, Tal Shaked, Erin Renshaw, Ari Lazier, Matt Deeds, Nicole Hamilton, and Greg Hullender.\nLearning to rank using gradient descent. In Proceedings of the 22Nd International Conference on Machine\nLearning, ICML \u201905, pages 89\u201396, New York, NY, USA, 2005. ACM.\n\n[7] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the 22Nd\nACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD \u201916, pages\n785\u2013794, New York, NY, USA, 2016. ACM.\n\n[8] Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan\nZhang, , and Zheng Zhang. MXNet: A \ufb02exible and ef\ufb01cient machine learning library for heterogeneous\ndistributed systems. In Neural Information Processing Systems, Workshop on Machine Learning Systems\n(LearningSys\u201915), 2015.\n\n[9] Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen,\nLeyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. Tvm: An automated\nend-to-end optimizing compiler for deep learning. In 13th USENIX Symposium on Operating Systems\nDesign and Implementation (OSDI 18), 2018.\n\n[10] Xinyun Chen, Chang Liu, and Dawn Song. Tree-to-tree neural networks for program translation. CoRR,\n\nabs/1802.03691, 2018.\n\n[11] J.H. Friedman. Greedy function approximation: a gradient boosting machine. Annals of Statistics,\n\n29(5):1189\u20131232, 2001.\n\n[12] M. Frigo and S. G. Johnson. Fftw: an adaptive software architecture for the fft. 
In Acoustics, Speech and\nSignal Processing, 1998. Proceedings of the 1998 IEEE International Conference on, volume 3, pages\n1381\u20131384 vol.3, May 1998.\n\n[13] Daniel Golovin, Benjamin Solnik, Subhodeep Moitra, Greg Kochanski, John Karro, and D. Sculley. Google\nvizier: A service for black-box optimization. In Proceedings of the 23rd ACM SIGKDD International\nConference on Knowledge Discovery and Data Mining, KDD \u201917, pages 1487\u20131495. ACM, 2017.\n\n[14] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Identity mappings in deep residual networks.\n\narXiv preprint arXiv:1603.05027, 2016.\n\n[15] Troels Henriksen, Niels G. W. Serup, Martin Elsman, Fritz Henglein, and Cosmin E. Oancea. Futhark:\nPurely functional gpu-programming with nested parallelism and in-place array updates. In Proceedings of\nthe 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017,\npages 556\u2013571, New York, NY, USA, 2017. ACM.\n\n[16] Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand,\nMarco Andreetto, and Hartwig Adam. Mobilenets: Ef\ufb01cient convolutional neural networks for mobile\nvision applications. CoRR, abs/1704.04861, 2017.\n\n10\n\n\f[17] Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. Sequential model-based optimization for general\nalgorithm con\ufb01guration. In Proceedings of the 5th International Conference on Learning and Intelligent\nOptimization, LION\u201905, pages 507\u2013523, Berlin, Heidelberg, 2011. Springer-Verlag.\n\n[18] Frank Hutter, Lin Xu, Holger Hoos, and Kevin Leyton-Brown. Algorithm runtime prediction: Methods\nand evaluation (extended abstract). In Proceedings of the Twenty-Fourth International Joint Conference on\nArti\ufb01cial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, pages 4197\u20134201, 2015.\n\n[19] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. 
Optimization by simulated annealing. Science,\n220(4598):671\u2013680, 1983.\n\n[20] Fredrik Kjolstad, Shoaib Kamil, Stephen Chou, David Lugato, and Saman Amarasinghe. The tensor\nalgebra compiler. Proc. ACM Program. Lang., 1(OOPSLA):77:1\u201377:29, October 2017.\n\n[21] Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, and Neoklis Polyzotis. The case for learned index\nstructures. CoRR, abs/1712.01208, 2017.\n\n[22] Andreas Krause and Daniel Golovin. Submodular function maximization. In Tractability: Practical\nApproaches to Hard Problems. Cambridge University Press, February 2014.\n\n[23] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. Imagenet classification with deep convolutional\nneural networks. In Advances in Neural Information Processing Systems 25, pages 1097\u20131105. 2012.\n\n[24] Andrew Lavin and Scott Gray. Fast algorithms for convolutional neural networks. In 2016 IEEE Conference\non Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, pages\n4013\u20134021, 2016.\n\n[25] Lisha Li, Kevin G. Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, and Ameet Talwalkar. Efficient\nhyperparameter optimization and infinitely many armed bandits. CoRR, abs/1603.06560, 2016.\n\n[26] Azalia Mirhoseini, Hieu Pham, Quoc V. Le, Benoit Steiner, Rasmus Larsen, Yuefeng Zhou, Naveen Kumar,\nMohammad Norouzi, Samy Bengio, and Jeff Dean. Device placement optimization with reinforcement\nlearning. In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney,\nNSW, Australia, 6-11 August 2017, pages 2430\u20132439, 2017.\n\n[27] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare,\nAlex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. Human-level control through\ndeep reinforcement learning. 
Nature, 518(7540):529, 2015.\n\n[28] Ravi Teja Mullapudi, Andrew Adams, Dillon Sharlet, Jonathan Ragan-Kelley, and Kayvon Fatahalian.\nAutomatically scheduling halide image processing pipelines. ACM Trans. Graph., 35(4):83:1\u201383:11, July\n2016.\n\n[29] George L Nemhauser, Laurence A Wolsey, and Marshall L Fisher. An analysis of approximations for\n\nmaximizing submodular set functions\u2014i. Mathematical Programming, 14(1):265\u2013294, 1978.\n\n[30] Shoumik Palkar, James J. Thomas, Deepak Narayanan, Anil Shanbhag, Rahul Palamuttam, Holger Pirk,\nMalte Schwarzkopf, Saman P. Amarasinghe, Samuel Madden, and Matei Zaharia. Weld: Rethinking the\ninterface between data-intensive applications. CoRR, abs/1709.06416, 2017.\n\n[31] Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep\n\nconvolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.\n\n[32] Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Fr\u00e9do Durand, and Saman\nAmarasinghe. Halide: A language and compiler for optimizing parallelism, locality, and recomputation\nin image processing pipelines. In Proceedings of the 34th ACM SIGPLAN Conference on Programming\nLanguage Design and Implementation, PLDI \u201913, pages 519\u2013530, New York, NY, USA, 2013. ACM.\n\n[33] B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. de Freitas. Taking the human out of the loop: A\n\nreview of bayesian optimization. Proceedings of the IEEE, 104(1):148\u2013175, Jan 2016.\n\n[34] Jasper Snoek, Hugo Larochelle, and Ryan P. Adams. Practical bayesian optimization of machine learning\nalgorithms. In Proceedings of the 25th International Conference on Neural Information Processing Systems\n- Volume 2, NIPS\u201912, pages 2951\u20132959, USA, 2012.\n\n[35] Jasper Snoek, Oren Rippel, Kevin Swersky, Ryan Kiros, Nadathur Satish, Narayanan Sundaram, Md.\nMostofa Ali Patwary, Prabhat Prabhat, and Ryan P. Adams. 
Scalable bayesian optimization using deep\nneural networks. In Proceedings of the 32Nd International Conference on International Conference on\nMachine Learning - Volume 37, ICML\u201915, pages 2171\u20132180, 2015.\n\n11\n\n\f[36] Michel Steuwer, Toomas Remmelg, and Christophe Dubach. Lift: A functional data-parallel ir for\nhigh-performance gpu code generation. In Proceedings of the 2017 International Symposium on Code\nGeneration and Optimization, CGO \u201917, pages 74\u201385, Piscataway, NJ, USA, 2017. IEEE Press.\n\n[37] Arvind K. Sujeeth, HyoukJoong Lee, Kevin J. Brown, Hassan Cha\ufb01, Michael Wu, Anand R. Atreya, Kunle\nOlukotun, Tiark Rompf, and Martin Odersky. Optiml: An implicitly parallel domain-speci\ufb01c language for\nmachine learning. In Proceedings of the 28th International Conference on International Conference on\nMachine Learning, ICML\u201911, pages 609\u2013616, USA, 2011.\n\n[38] Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. Sequence to sequence learning with neural networks. In\nProceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2,\nNIPS\u201914, pages 3104\u20133112, Cambridge, MA, USA, 2014. MIT Press.\n\n[39] Kai Sheng Tai, Richard Socher, and Christopher D Manning. Improved semantic representations from\n\ntree-structured long short-term memory networks. arXiv preprint arXiv:1503.00075, 2015.\n\n[40] Nicolas Vasilache. personal communication.\n\n[41] Nicolas Vasilache, Oleksandr Zinenko, Theodoros Theodoridis, Priya Goyal, Zachary DeVito, William S.\nMoses, Sven Verdoolaege, Andrew Adams, and Albert Cohen. Tensor comprehensions: Framework-\nagnostic high-performance machine learning abstractions. CoRR, abs/1802.04730, 2018.\n\n[42] Sven Verdoolaege, Juan Carlos Juega, Albert Cohen, Jos\u00e9 Ignacio G\u00f3mez, Christian Tenllado, and Francky\nCatthoor. Polyhedral parallel code generation for cuda. ACM Trans. Archit. Code Optim., 9(4):54:1\u201354:23,\nJanuary 2013.\n\n[43] R. 
Clint Whaley and Jack J. Dongarra. Automatically tuned linear algebra software. In Proceedings of the\n1998 ACM/IEEE Conference on Supercomputing, SC \u201998, pages 1\u201327, Washington, DC, USA, 1998. IEEE\nComputer Society.\n\n[44] Wojciech Zaremba, Ilya Sutskever, and Oriol Vinyals. Recurrent neural network regularization. arXiv\n\npreprint arXiv:1409.2329, 2014.\n\n12\n\n\f", "award": [], "sourceid": 1717, "authors": [{"given_name": "Tianqi", "family_name": "Chen", "institution": "University of Washington"}, {"given_name": "Lianmin", "family_name": "Zheng", "institution": "Shanghai Jiaotong University"}, {"given_name": "Eddie", "family_name": "Yan", "institution": "university of washington"}, {"given_name": "Ziheng", "family_name": "Jiang", "institution": "Fudan University"}, {"given_name": "Thierry", "family_name": "Moreau", "institution": "university of washington"}, {"given_name": "Luis", "family_name": "Ceze", "institution": "University of Washington"}, {"given_name": "Carlos", "family_name": "Guestrin", "institution": "University of Washington"}, {"given_name": "Arvind", "family_name": "Krishnamurthy", "institution": "University of Washington"}]}