{"title": "A spatially varying two-sample recombinant coalescent, with applications to HIV escape response", "book": "Advances in Neural Information Processing Systems", "page_first": 193, "page_last": 200, "abstract": "Statistical evolutionary models provide an important mechanism for describing and understanding the escape response of a viral population under a particular therapy. We present a new hierarchical model that incorporates spatially varying mutation and recombination rates at the nucleotide level. It also maintains separate parameters for treatment and control groups, which allows us to estimate treatment effects explicitly. We use the model to investigate the sequence evolution of HIV populations exposed to a recently developed antisense gene therapy, as well as a more conventional drug therapy. The detection of biologically relevant and plausible signals in both therapy studies demonstrates the effectiveness of the method.", "full_text": "A spatially varying two-sample recombinant coalescent, with applications to HIV escape response\n\nAlexander Braunstein\nStatistics Department\nUniversity of Pennsylvania\nWharton School\nPhiladelphia, PA 19104\nbraunsf@wharton.upenn.edu\n\nZhi Wei\nComputer Science Department\nNew Jersey Institute of Technology\nNewark, NJ 07102\nzhiwei@njit.edu\n\nShane T. Jensen\nStatistics Department\nUniversity of Pennsylvania\nWharton School\nPhiladelphia, PA 19104\nstjensen@wharton.upenn.edu\n\nJon D. McAuliffe\nStatistics Department\nUniversity of Pennsylvania\nWharton School\nPhiladelphia, PA 19104\nmcjon@wharton.upenn.edu\n\nAbstract\n\nStatistical evolutionary models provide an important mechanism for describing\nand understanding the escape response of a viral population under a particular\ntherapy. We present a new hierarchical model that incorporates spatially varying\nmutation and recombination rates at the nucleotide level. 
It also maintains separate\nparameters for treatment and control groups, which allows us to estimate\ntreatment effects explicitly. We use the model to investigate the sequence evolution\nof HIV populations exposed to a recently developed antisense gene therapy,\nas well as a more conventional drug therapy. The detection of biologically relevant\nand plausible signals in both therapy studies demonstrates the effectiveness\nof the method.\n\n1 Introduction\n\nThe human immunodeficiency virus (HIV) has one of the highest levels of genetic variability yet\nobserved in nature. This variability stems from its unusual population dynamics: a high growth\nrate (\u223c10 billion new viral particles, or virions, per patient per day) combined with a replication\ncycle that involves frequent nucleotide mutations as well as recombination between different HIV\ngenomes that have infected the same cell.\nThe rapid evolution of HIV and other viruses gives rise to a so-called escape response when infected\ncells are subjected to therapy. Widespread availability of genome sequencing technology has had a\nprofound effect on the study of viral escape response. Increasingly, virologists are gathering two-sample\ndata sets of viral genome sequences: a control sample contains genomes from a set of virions\ngathered before therapy, and a treatment sample consists of genomes from the post-therapeutic viral\npopulation. 
HIV treatment samples gathered just days after the start of therapy can exhibit a\nsignificant escape response.\nUp to now, statistical analyses of two-sample viral sequence data sets have been mainly rudimentary.\nAs a representative example, [7] presents tabulated counts of mutation occurrences (relative to a\nreference wild-type sequence) in the control group and the treatment group, without attempting any\nstatistical inference.\nIn this paper we develop a model which allows for a detailed quantification of the escape response\npresent in a two-sample data set. The model incorporates mutation and recombination rate parameters\nwhich vary positionally along the viral genome, and which differ between the treatment and\ncontrol samples. We present a reversible-jump MCMC procedure for approximate posterior inference\nof these parameters. The resulting posterior distribution suggests specific regions of the genome\nwhere the treatment sample\u2019s evolutionary dynamics differ from the control\u2019s: this is the putative\nescape response. Thus, the model permits an analysis that can point the way to improvements of\ncurrent therapies and to the development of new therapeutic strategies for HIV and other viruses.\nIn the remainder of the paper, we first provide the details of our statistical model and inference\nprocedure. Then we illustrate the use of the model in two applications. The first study consists of a\ncontrol sample of viral sequences obtained from HIV-infected individuals before a drug treatment,\nand a corresponding post-treatment sample [9]. The second study set is an in vitro investigation of a\nnew gene therapy for HIV; it contains a control sample of untreated virions and a treatment sample\nof virions challenged with the therapy [7].\n\n2 Methods\n\nWe begin by briefly describing the standard statistical genetics framework for populations evolving\nunder mutation and recombination. 
Then we present a new Bayesian hierarchical model for two\ngroups of sequences, each group sampled from one of two related populations. We derive an MCMC\nprocedure for approximate posterior inference in the model; this procedure is implemented in the\nprogram PICOMAP. Our approach involves modifications and generalizations of the OMEGAMAP\nmethod [12], as we explain. In what follows, each \u201cindividual\u201d in a population is a sequence of L\nnucleotides (plus a gap symbol, used when sequences have insertions or deletions relative to each\nother). The positions along a sequence are called sites. An alignment is a matrix in which rows are\nsequences, columns are sites, and the (i, j)th entry is individual i\u2019s nucleotide at site j.\n\n2.1 The coalescent with recombination\n\nThe genome sequences in the control sample were drawn at random from a large population of\nsequences at a fixed point in time. We approximate the evolution of this population using the Wright-Fisher\nevolutionary model with recombination [3]. Similarly, the treatment sample sequences are\nviewed as randomly drawn from a Wright-Fisher recombining population, but governed by different\nevolutionary parameters.\nIn the basic Wright-Fisher model without recombination, a fixed-size population evolves in discrete,\nnonoverlapping generations. Each sequence in the gth generation is determined by randomly choosing\na sequence from the (g \u2212 1)th generation, mutating it at one position with probability u, and\nleaving it unchanged with probability 1 \u2212 u. Typically, many individuals in each generation share a\nparent from the previous generation.\nA key insight in statistical population genetics, due to Kingman [5], is the following. 
If we have\na small sample from a large Wright-Fisher population at a fixed time, and we want to do calculations\ninvolving the probability distribution over the sample\u2019s unknown ancestral history, it is highly\nuneconomical to \u201cwork forwards\u201d from older generations \u2013 most individuals will not be part of the\nsample\u2019s genealogy. Instead, we should follow the lineages of the sampled individuals backwards\nin time as they repeatedly coalesce at common ancestors, forming a tree rooted at the most recent\ncommon ancestor (MRCA) of the sample. Kingman showed that the continuous-time limit of the\nWright-Fisher model induces a simple distribution, called the coalescent process, on the topology\nand branch lengths of the resulting tree. Mutation events in the coalescent can be viewed as a separate\npoint process marking locations on the branches of a given coalescent tree. This point process\nis independent of the tree-generating coalescent process.\nRecombination, however, substantially complicates matters. The Wright-Fisher dynamics are extended\nto model recombination as follows. Choose one \u201cpaternal\u201d and one \u201cmaternal\u201d sequence\nfrom generation (g \u2212 1). With probability r, their child sequence in generation g is a recombinant:\na juncture between two adjacent sites is chosen uniformly at random, and the child is formed by\njoining the paternal sequence to the left of the juncture with the maternal sequence to the right. With\nprobability (1 \u2212 r), the child is a copy of just one of the two parents, possibly mutated as above.\nNow look backwards in time at the ancestors of a sample: we find both coalescence events, where\ntwo sequences merge into a common ancestor, and recombination events, where a single sequence\nsplits into the two parent sequences that formed it. Thus the genealogy is not a tree but a graph, the\nancestral recombination graph (ARG). 
The continuous-time limit of the Wright-Fisher model with\nrecombination induces a distribution over ARGs called the recombinant coalescent [4, 2].\nIn fact, the ARG is the union of L coalescent trees. A single site is never split by recombination, so\nwe can follow that site in the sample backwards in time through coalescence events to its MRCA. But\nrecombination causes the sample to have a possibly different ancestral tree (and different MRCA)\nat each site. The higher the rate of recombination (corresponding to the parameter r), the more\noften the tree changes along the alignment. For this reason, methods that estimate a fixed, global\nphylogeny are badly biased in samples from highly recombinant populations, like viruses [10].\nThe Wright-Fisher assumptions appear quite stylized. But experience has shown that the coalescent\nand the recombinant coalescent can give reasonable results when applied to samples from populations\nnot matching the Wright-Fisher model, such as populations of increasing size [3].\n\n2.2 A two-sample hierarchical recombinant coalescent\n\nWe now present the components of our new hierarchical model for a control sample and a treatment\nsample of nucleotide sequences drawn from two recombining populations. To our knowledge, this is\nthe first fully specified probabilistic model for such data. There are four parameter vectors of primary\ninterest in the model: a control-population mutation rate $\\mu^C$ which varies along the sequence, a\ncorresponding spatially varying treatment-population vector $\\mu^T$, and analogous recombination rate\nparameter vectors $\\rho^C$ and $\\rho^T$. 
(The $\\mu$ and $\\rho$ here correspond to the u and r mentioned above.)\nThe prior distribution on $\\mu^C$ and $\\mu^T$ takes the following hierarchical form:\n\n$$(B_\\mu, S_\\mu) \\mid q_\\mu \\sim \\mathrm{Blocks}(q_\\mu), \\qquad (1)$$\n$$\\log \\mu_i \\mid \\mu_0, \\sigma^2_{\\mu_0} \\sim N(\\log \\mu_0, \\sigma^2_{\\mu_0}), \\quad i = 1, \\ldots, B_\\mu, \\qquad (2)$$\n$$(\\log \\mu^C_i, \\log \\mu^T_i) \\mid \\mu_i, \\sigma^2_\\mu \\overset{\\mathrm{iid}}{\\sim} N(\\log \\mu_i, \\sigma^2_\\mu), \\quad i = 1, \\ldots, B_\\mu. \\qquad (3)$$\n\nThis prior is designed to give $\\mu^C$ and $\\mu^T$ a block structure: the Blocks distribution divides the L\nsequence positions into $B_\\mu$ adjacent subsequences, with the index of each subsequence\u2019s rightmost\nsite given by $S_\\mu = (S^1_\\mu, \\ldots, S^{B_\\mu}_\\mu)$, $1 \\le S^1_\\mu < \\cdots \\le S^{B_\\mu}_\\mu \\le L$. Under the Blocks distribution,\n$(B_\\mu - 1)$ is a $\\mathrm{Bin}(L - 1, q_\\mu)$ random variable, and given $B_\\mu$, the indexes $S_\\mu$ are a simple random\nsample without replacement from {1, . . . , L}. The sites in the ith block all mutate at the same rate\n$\\mu^C_i$ (in the control population) or $\\mu^T_i$ (in the treatment population). We lose no generality in sharing\nthe same block structure between the populations: two separate block structures can be replaced with\na single block structure formed from the union of their $S_\\mu$\u2019s. To generate the per-population mutation\nrates within a block, we first draw a lognormally distributed variable $\\mu_i$, which then furnishes the\nmean for the independent lognormal variables $\\mu^C_i$ and $\\mu^T_i$. The triples $(\\mu_i, \\mu^C_i, \\mu^T_i)$ are mutually\nindependent across blocks $i = 1, \\ldots, B_\\mu$.\nThe recombination rate parameters $(\\rho^C, \\rho^T)$ are independent of $(\\mu^C, \\mu^T)$ and have the same form of\nprior distribution (1)\u2013(3), mutatis mutandis. 
In our empirical analyses, we set the hyperparameters\n$q_\\mu$ and $q_\\rho$ to get prior means of 20 to 50 blocks; results were not sensitive to these settings. We\nput simple parametric distributions on the hyperparameters $\\mu_0$, $\\sigma^2_{\\mu_0}$, $\\sigma^2_\\mu$, and their $\\rho$ analogs, and\nincluded them in the sampling procedure.\nThe remaining component of the model is the likelihood of the two observed samples. Let $H^C$\nbe the alignment of control-sample sequences and $H^T$ the treatment-sample sequence alignment.\nConditional on all parameters, $H^C$ and $H^T$ are independent. Focus for a moment on $H^C$. Since we\nwish to view it as a sample from a Wright-Fisher recombining population, its likelihood corresponds\nto the probability, under the coalescent-with-recombination distribution, of the set of all ARGs that\ncould have generated $H^C$. However, using the nucleotide mutation model described below, even\nMonte Carlo approximation of this probability is computationally intractable [12].\nSo instead we approximate the true likelihood with a distribution called the \u201cproduct of approximate\nconditionals,\u201d or PAC [6]. PAC orders the K sequences in $H^C$ arbitrarily, then approximates their\nprobability as the product of probabilities from K hidden Markov models. The kth HMM evaluates\nthe probability that sequence k was produced by mutating and recombining sequences 1 through\nk \u2212 1. We thus obtain the final components of our hierarchical model:\n\n$$H^C \\mid \\mu^C, \\rho^C, \\eta \\sim \\mathrm{PAC}(\\mu^C, \\rho^C, \\eta), \\qquad (4)$$\n$$H^T \\mid \\mu^T, \\rho^T, \\eta \\sim \\mathrm{PAC}(\\mu^T, \\rho^T, \\eta). \\qquad (5)$$\n\nIn order to apply PAC, we must specify a nucleotide substitution model, that is, the probability that a\nnucleotide i mutates to a nucleotide j over evolutionary distance t. In the above, $\\eta$ parametrizes this\nmodel. 
For our analyses, we employed the well-known Felsenstein substitution model, augmented\nwith a fifth symbol to represent gaps [8]. For simplicity, we constructed fixed empirical estimates of\nthe Felsenstein parameters $\\eta$, in a standard way.\nTo incorporate the extended Felsenstein model in PAC, it is necessary to integrate evolutionary\ndistance out of the substitution process p(j | i, 2t), using the exponential distribution induced by the\ncoalescent on the evolutionary distance 2t between pairs of sampled individuals. It can be shown\nthat the required quantity is\n\n$$p(j \\mid i) = \\int p(j \\mid i, 2t)\\, p(t)\\, dt = \\left(1 - \\frac{k}{k + 2\\beta}\\right) \\pi_j + \\frac{k}{k + 2(\\alpha 1[i \\neq \\mathrm{gap}] + \\beta)}\\, 1[i = j] + \\left(\\frac{k}{k + 2\\beta} - \\frac{k}{k + 2(\\alpha + \\beta)}\\right) \\left(\\frac{\\pi_j}{\\pi_i + \\pi_j}\\right) 1[(i, j) \\in \\{(A, G), (C, T)\\}]. \\qquad (6)$$\n\nHere k is the number of sampled individuals, and $\\pi_i$, $\\pi_j$, $\\alpha$, and $\\beta$ are Felsenstein model parameters\n(the last two depending on the mutation rate at the site in question). 1[\u00b7] is the indicator function of\nthe predicate in brackets.\nThe blocking prior (1) and the use of PAC with spatially varying parameters are ideas drawn from\nOMEGAMAP [12]. But our approach differs in two significant respects. First, OMEGAMAP models\ncodons (nucleotide triplets, each encoding an amino acid of the protein sequence), not the nucleotides themselves. This is sometimes\nunsuitable. For example, in one of our empirical analyses, the treatment population receives\nRNA antisense gene therapy. The target of this therapy is the primary HIV genome sequence itself,\nnot its protein products. So we would expect the escape response to manifest at the nucleotide level,\nin the targeted region of the genome. Our model can capture this. 
Second, we perform simultaneous\nhierarchical inference about the control and treatment sample, which encourages the parameter estimates\nto differ between the samples only where strongly justified by the data. Using a one-sample\ntool like OMEGAMAP on each sample in isolation would tend to increase the number of artifactual\ndifferences between corresponding parameters in each sample.\n\n2.3 Inference\n\nThe posterior distribution of the parameters in our model cannot be calculated analytically. We\ntherefore employ a reversible-jump Metropolis-within-Gibbs sampling strategy to construct an approximate\nposterior. In such an approach, sets of parameters are iteratively sampled from their posterior\nconditional distributions, given the current values of all other parameters. Because the Blocks\nprior generates mutation and recombination parameters with piecewise-constant profiles along the\nsequence, we call our sampler implementation PICOMAP.\nThe sampler uses Metropolis-Hastings updates for the numerical values of parameters, and\nreversible-jump updates [1] to explore the blocking structures $(B_\\mu, S_\\mu)$ and $(B_\\rho, S_\\rho)$. The block\nupdates consider extending a block to the left or right, merging two adjacent blocks, and splitting a\nblock. They are similar to the updates (B2)-(B4) of [12], so we omit the details.\n\nFigure 1: Posterior estimate of the effect of enfuvirtide drug therapy on mutation rates. The blue line is\nthe posterior mean; the black lines are 95% highest-posterior-density (HPD) intervals.\n\nTo illustrate one of the parameter updates within a block, let $(\\mu^C_i, \\mu^T_i)$ be the current values of the\ncontrol and treatment mutation rates in block i. We sample proposal values\n\n$$\\log \\tilde\\mu^C_i \\sim N(\\log \\mu^C_i, \\tau^2), \\qquad (7)$$\n$$\\log \\tilde\\mu^T_i \\sim N(\\log \\mu^T_i, \\tau^2), \\qquad (8)$$\n\nwhere $\\tau^2$ is a manually configured tuning parameter for the proposal distribution. 
These proposals\nare accepted with probability\n\n$$\\frac{p(H^C \\mid \\tilde\\mu^C_i, \\theta)\\, p(H^T \\mid \\tilde\\mu^T_i, \\theta)}{p(H^C \\mid \\mu^C_i, \\theta)\\, p(H^T \\mid \\mu^T_i, \\theta)} \\cdot \\frac{p(\\tilde\\mu^C_i, \\tilde\\mu^T_i \\mid \\mu_i)}{p(\\mu^C_i, \\mu^T_i \\mid \\mu_i)}, \\qquad (9)$$\n\nwhere\n\n$$\\frac{p(\\tilde\\mu^C_i, \\tilde\\mu^T_i \\mid \\mu_i)}{p(\\mu^C_i, \\mu^T_i \\mid \\mu_i)} = \\frac{\\mu^T_i \\mu^C_i}{\\tilde\\mu^T_i \\tilde\\mu^C_i} \\cdot \\frac{\\exp\\{-((\\log \\tilde\\mu^T_i - \\log \\mu_i)^2 + (\\log \\tilde\\mu^C_i - \\log \\mu_i)^2)/2\\sigma^2\\}}{\\exp\\{-((\\log \\mu^T_i - \\log \\mu_i)^2 + (\\log \\mu^C_i - \\log \\mu_i)^2)/2\\sigma^2\\}}. \\qquad (10)$$\n\nHere $\\theta$ denotes the current values of all other model parameters. Notice that symmetry in the proposal\ndistribution causes that part of the MH acceptance ratio to cancel.\nThe PICOMAP sampler involves a number of other update formulas, which we do not describe here\ndue to space constraints.\n\n3 Results\n\nIn this section, we apply the PICOMAP methodology to HIV sequence data from two different studies.\nIn the first study, several HIV-infected patients were exposed to a drug-based therapy. In the\nsecond study, HIV was exposed in vitro to a novel antisense gene therapy. In both cases,\nour analysis extracts biologically relevant features of the evolutionary response of HIV to these\ntherapeutic challenges.\nFor each study we ran at least 8 chains to monitor convergence of the sampler. The chains converged\nwithout exception and were thinned accordingly, then combined for analysis. In the interest\nof brevity, we include only plots of the posterior treatment-effect estimates for both mutation and\nrecombination rates.\n\n3.1 Drug therapy study\n\nIn this study, five patients had blood samples taken both before and after treatment with the drug\nenfuvirtide, also known as Fuzeon or T-20 [11]. 
Sequences of the Envelope (Env) region of the HIV\ngenome were generated from each of these blood samples. Pooling across these patients, we have\n28 pre-exposure Env sequences which we label as the control sample, and 29 post-exposure Env\nsequences which we label as the treatment sample. We quantify the treatment effect of exposure\nto the drug by calculating the posterior mean and 95% highest-posterior-density (HPD) intervals of\nthe difference in recombination rates $\\rho^T - \\rho^C$ and mutation rates $\\mu^T - \\mu^C$ at each position of the\ngenomic sequence.\n\nFigure 2: Posterior estimate of the effect of enfuvirtide drug therapy on recombination rates. The blue\nline is the posterior mean; the black lines are 95% HPD intervals.\n\nThe very existence in the patient of a post-exposure HIV population indicates the evolution of\nsequence changes that have conferred resistance to the action of the drug enfuvirtide. In fact,\nresistance-conferring mutations are known a priori to occur at nucleotide locations 1639-1668 in\nthe Env sequence. Figure 1 shows the posterior estimate of the treatment effect on mutation rates\nover the length of the Env sequence. From nucleotide positions 1590-1700, the entire 95% HPD interval\nof the mutation rate treatment effect is above zero, which suggests our model is able to detect\nelevated levels of mutation in the resistance-conferring region, among individuals in the treatment\nsample.\nAnother preliminary observation from this study was that both the pre-exposure and post-exposure\nsequences are mixtures of several different HIV subtypes. Subtype identity is specified by the V3\nloop subsequence of the Env sequence, which corresponds to nucleotide positions 887-995. 
Since it\nis unlikely that resistance-conferring mutations developed independently in each subtype, we suspect\nthat the resistance-conferring mutations were passed to the different subtypes via recombination.\nRecombination is the primary means by which drug resistance is transferred in vivo between strains\nof HIV, so recombination at these locations involving drug-resistant strains would allow successful\ntransfer of the resistance-conferring mutations between types of HIV.\nFigure 2 shows the spatial posterior estimate of the treatment effect on recombination. We see\ntwo areas of increased recombination, one from nucleotide positions 1020-1170 and another from\nnucleotide positions 1900-2200. As an interesting side note, we see a marked decrease in mutation\nand recombination in the V3 loop that determines sequence specificity.\n\n3.2 Antisense gene therapy study\n\nIn the VIRxSYS antisense gene therapy study, we have two populations of wild type HIV in vitro.\nThe samples consist of 19 Env sequences from a control HIV population that was allowed to evolve\nneutrally in cell culture, along with 48 Env sequences sampled from an HIV population evolving in\ncell cultures that were transfected with the VIRxSYS antisense vector [7]. The antisense gene therapy\nvector targets nucleotide positions 1325-2249. Unlike drug therapy treatments, whose effect\ncan be nullified by just one or two well-placed mutations, a relatively large number of mutations\nare required to escape the effects of antisense gene therapy. 
We again quantify the treatment effect\nof exposure to the antisense vector by calculating the posterior mean and 95% HPD interval of the\ndifference in recombination rates $\\rho^T - \\rho^C$ and mutation rates $\\mu^T - \\mu^C$ at each position of the Env\nsequence.\n\nFigure 3: Posterior estimate of the effect of VIRxSYS antisense gene therapy on mutation rates.\nThe blue line is the posterior mean; the black lines are 95% HPD intervals.\n\nFigures 3 and 4 show the posterior estimate of the treatment\u2019s effect on mutation and recombination,\nrespectively. The most striking feature of the plots is the area of significantly elevated mutation in the\ntreatment sequences. The leftmost region of the highest plateau corresponds to nucleotide position\n1325, the 5\u2019 boundary of the antisense target region. This area of heightened mutation overlaps\nwith the target region for around 425 nucleotides in the 3\u2019 direction. We see fewer differences in\nthe recombination rate, suggesting that mutation is the primary mechanism of evolutionary response\nto the antisense vector. In fact, we estimate lower recombination rates in the target region of the\ntreatment sequences relative to the control sequences.\n\n4 Discussion\n\nWe have introduced a hierarchical model for the estimation of evolutionary escape response in a\npopulation exposed to therapeutic challenge. The escape response is quantified by mutation and\nrecombination rate parameters. Our method allows for spatial heterogeneity in these mutation and\nrecombination rates. It estimates differences between treatment and control sample parameters,\nwith parameter values encouraged to be similar between the two populations except where the data\nsuggest otherwise. 
We applied our procedure to sequence data from two different HIV therapy\nstudies, detecting evolutionary responses in both studies that are of biological interest and may be\nrelevant to the design of future HIV treatments.\nAlthough virological problems motivated the creation of our model, it applies more generally to two-sample\ndata sets of nucleic acid sequences drawn from any population. The model is particularly\nrelevant for populations in which the recombination rate is a substantial fraction of the mutation rate,\nsince simpler models which ignore recombination can produce seriously misleading results.\n\nAcknowledgements\n\nThis research was supported by a grant from the University of Pennsylvania Center for AIDS Research.\nThanks to Neelanjana Ray, Jessamina Harrison, Robert Doms, Matthew Stephens and Gwen\nBinder for helpful discussions.\n\nFigure 4: Posterior estimate of the effect of VIRxSYS antisense gene therapy on recombination\nrates. The blue line is the posterior mean; the black lines are 95% HPD intervals.\n\nReferences\n\n[1] P. J. Green. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82:711\u2013731, 1995.\n\n[2] R. C. Griffiths and P. Marjoram. An ancestral recombination graph. In Progress in Population Genetics and Human Evolution, pages 257\u2013270. Springer Verlag, 1997.\n\n[3] J. Hein, M. Schierup, and C. Wiuf. Gene Genealogies, Variation and Evolution: A Primer in Coalescent Theory. Oxford University Press, 2005.\n\n[4] R. R. Hudson. Properties of a neutral allele model with intragenic recombination. Theoretical Population Biology, 23:183\u2013201, 1983.\n\n[5] J. F. C. Kingman. The coalescent. Stochastic Processes and Their Applications, 13:235\u2013248, 1982.\n\n[6] N. Li and M. Stephens. Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics, 165:2213\u20132233, December 2003.\n\n[7] X. Lu, Q. 
Yu, G. Binder, Z. Chen, T. Slepushkina, J. Rossi, and B. Dropulic. Antisense-mediated inhibition of human immunodeficiency virus (HIV) replication by use of an HIV type 1-based vector results in severely attenuated mutants incapable of developing resistance. Journal of Virology, 78:7079\u20137088, 2004.\n\n[8] G. McGuire, M. Denham, and D. Balding. Models of sequence evolution for DNA sequences containing gaps. Molecular Biology and Evolution, 18(4):481\u2013490, 2001.\n\n[9] N. Ray, J. Harrison, L. Blackburn, J. Martin, S. Deeks, and R. Doms. Clinical resistance to enfuvirtide does not affect susceptibility of human immunodeficiency virus type 1 to other classes of entry inhibitors. Journal of Virology, 81:3240\u20133250, 2007.\n\n[10] M. H. Schierup and J. Hein. Consequences of recombination on traditional phylogenetic analysis. Genetics, 156:879\u2013891, 2000.\n\n[11] C. Wild, T. Greenwell, and T. Matthews. A synthetic peptide from HIV-1 gp41 is a potent inhibitor of virus mediated cell-cell fusion. AIDS Research and Human Retroviruses, 9:1051\u20131053, 1993.\n\n[12] D. Wilson and G. McVean. Estimating diversifying selection and functional constraint in the presence of recombination. Genetics, 172:1411\u20131425, 2006.\n", "award": [], "sourceid": 662, "authors": [{"given_name": "Alexander", "family_name": "Braunstein", "institution": null}, {"given_name": "Zhi", "family_name": "Wei", "institution": null}, {"given_name": "Shane", "family_name": "Jensen", "institution": null}, {"given_name": "Jon", "family_name": "Mcauliffe", "institution": null}]}