http://swrc.ontoware.org/ontology#Thesis
STOCHASTIC MODELS IN POPULATION GENETICS
en
間野 修平
マノ シュウヘイ
MANO Shuhei
総研大乙第189号
Stochastic models have played important roles in population genetics. They have provided a theoretical understanding of the evolutionary mechanisms that maintain genetic diversity within and between species. Following the line of Fisher (1930) and Wright (1945), from 1945 through the 1980s Kimura and his coworkers laid the foundations of evolutionary theory by developing stochastic models based on the diffusion process. Applying their theoretical predictions to the molecular data emerging at that time has revealed many important aspects of molecular evolution. The most significant prediction is probably the neutral hypothesis of molecular evolution, which was advocated by Kimura (1968). In the early 1980s, a stochastic model now called the coalescent model was introduced (Kingman, 1982; Tajima, 1983; Hudson, 1983). The coalescent process is a stochastic process describing the ancestors of a sample taken from a population evolving under the diffusion model. The coalescent model has provided a useful framework for the statistical analysis of a sample taken from a population; tests based on the model and efficient simulation schemes for generating samples evolving under arbitrary hypotheses have been developed. In this dissertation, the author presents several analytical results on stochastic models in population genetics obtained by the author and coworkers. These models cover various aspects, with special reference to multi-locus diffusion models (Chapters 2 and 3) and the relationship between the diffusion model and the coalescent model under selection (Chapters 4 and 5). These are central issues under development in current population genetics theory.<br /> In Chapter 2, the effects of random genetic drift on linkage disequilibrium are investigated in terms of two-locus diffusion models. 
An analytic expression for the conditional expectation of the transient gamete frequency, given that one of the two loci remains polymorphic, is obtained by calculating the moments of the distribution. Using this expression, a model in which linkage disequilibrium is introduced by a single mutant is investigated. Random genetic drift should have a large effect in this model, since the mutant is prone to disappear from the population. The conditional expectation of the gamete frequency, given that the locus with the mutant allele remains polymorphic, is presented. Its behavior is significantly different from the monotonic decrease observed in the deterministic model without random genetic drift. Then, the evolution of linkage disequilibrium among the founders of exponentially growing populations is investigated in terms of a time-inhomogeneous stochastic model, which is an extension of the diffusion approximation of the Wright-Fisher model. As a measure of linkage disequilibrium, the squared standard linkage deviation, which is defined as a ratio of the moments, is investigated. A system of ordinary differential equations that these moments obey is provided. In addition, by a perturbative series expansion in a growth parameter, an asymptotic formula for the squared standard linkage deviation after a large number of generations is obtained. According to the formula, the squared standard linkage deviation tends to 1/(4<i>Nc</i>), where <i>N</i> is the current size of the population and <i>c</i> is the recombination rate between the two loci. It does not depend on the initial effective size of the population, the growth rate, or the mutation rate. In exponentially growing populations, linkage disequilibrium will be asymptotically the same as that in a constant-size population whose effective size is the current size.<br /> In Chapter 3, evolutionary rates of duplicated genes under concerted evolution by gene conversion are investigated. 
The effects of directional natural selection and of bias in the conversion rate on the fixation of a single mutant at a locus, where the mutant spreads through a multigene family by gene conversion, are investigated. For directional selection, a model is assumed in which selection operates on the number of mutant alleles in a diploid individual. Because of gene conversion between loci, either the mutant or the wild-type allele will eventually fix at all the loci. An analytic expression for the fixation probability is obtained in terms of a two-locus diffusion model. For genic selection, the formula is given by the well-known formula for the single-locus problem, with the effective population size replaced by twice its value and the initial allele frequency replaced by the arithmetic mean of the initial frequencies at the two loci. The expression does not depend on the initial linkage disequilibrium, the recombination rate, or the conversion rate. According to simulations, this simple correspondence between the formula for a single locus and that for two loci continues to hold when the number of loci is larger than two. For biased gene conversion, an analytic expression for the fixation probability is obtained in terms of an n-locus diffusion model. With these formulas for the fixation probabilities, the effects of gene conversion on the rate of molecular evolution in a multigene family under concerted evolution are discussed. It is shown that selection and bias in the conversion rate operate more efficiently in a large multigene family.<br /> In Chapter 4, the ancestral selection graph, which is an analogue of the coalescent genealogy, is investigated. The number of ancestral particles of a sample of genes, traced backward in time, forms an ancestral process, which is a birth and death process with a quadratic death rate and a linear birth rate. 
An explicit form for the number of ancestral particles is obtained by using the density of the allele frequency in a diffusion model obtained by Kimura (1955). It is shown that fixation corresponds to convergence of the ancestral process to its stationary measure. The time to fixation of an allele is studied in terms of the ancestral process.<br /> In Chapter 5, an approximate sampling formula for the infinite allele model at the end of a selective sweep is obtained in terms of a weighted binomial mixture of the Ewens sampling formula. The approximate sampling formula is based on the hitchhiking model proposed by Maynard Smith and Haigh (1974). The formula gives a simple and useful framework for theoretical understanding of the allele frequency distribution at the end of a sweep. Using this approximate sampling formula, a new likelihood-based test to detect a recent selective sweep is presented. Although the test seems slightly less powerful than the test based on the frequency of the most common allele when the mutation rate (size of the neutral region) is low, it gives estimates of the selection coefficient and the position of the target of selection.
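As a point of reference for the Chapter 5 summary, the following is a minimal sketch of the standard (neutral) Ewens sampling formula, which the abstract names as the building block of the approximate formula. It is not the weighted binomial mixture derived in the dissertation, whose weights are not given here. The function name `ewens_probability`, the argument layout, and the symbol `theta` for the scaled mutation rate are assumptions made for illustration.

```python
from math import factorial


def ewens_probability(counts, theta):
    """Probability of an allelic partition under the neutral Ewens sampling formula.

    counts[j-1] is the number of allelic types that appear exactly j times
    in the sample (hypothetical layout chosen for this sketch).
    theta is the scaled mutation rate (assumed notation).
    """
    # Sample size n = sum over j of j * a_j.
    n = sum(j * a for j, a in enumerate(counts, start=1))

    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1).
    rising = 1.0
    for i in range(n):
        rising *= theta + i

    # n! / rising * product over j of (theta / j)^a_j / a_j!
    prob = factorial(n) / rising
    for j, a in enumerate(counts, start=1):
        prob *= (theta / j) ** a / factorial(a)
    return prob


if __name__ == "__main__":
    # All partitions of a sample of size 3; their probabilities should sum to 1.
    partitions = [[3, 0, 0], [1, 1, 0], [0, 0, 1]]
    print(sum(ewens_probability(p, theta=2.0) for p in partitions))
```

Under these assumptions, a mixture-based formula of the kind described above would combine such neutral probabilities over subsamples with binomial weights; the weights themselves depend on the hitchhiking model and are left unspecified in this sketch.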