Population Subdivision and Molecular Sequence Variation: Theory and Analysis of Drosophila ananassae Data

Genetics ◽  
2003 ◽  
Vol 165 (3) ◽  
pp. 1385-1395
Author(s):  
Claus Vogl ◽  
Aparup Das ◽  
Mark Beaumont ◽  
Sujata Mohanty ◽  
Wolfgang Stephan

Abstract Population subdivision complicates analysis of molecular variation. Even if neutrality is assumed, three evolutionary forces need to be considered: migration, mutation, and drift. Simplification can be achieved by assuming that the process of migration among and drift within subpopulations is occurring fast compared to mutation and drift in the entire population. This allows a two-step approach in the analysis: (i) analysis of population subdivision and (ii) analysis of molecular variation in the migrant pool. We model population subdivision using an infinite island model, where we allow the migration/drift parameter Θ to vary among populations. Thus, central and peripheral populations can be differentiated. For inference of Θ, we use a coalescence approach, implemented via a Markov chain Monte Carlo (MCMC) integration method that allows estimation of allele frequencies in the migrant pool. The second step of this approach (analysis of molecular variation in the migrant pool) uses the estimated allele frequencies in the migrant pool for the study of molecular variation. We apply this method to a Drosophila ananassae sequence data set. We find little indication of isolation by distance, but large differences in the migration parameter among populations. The population as a whole seems to be expanding. A population from Bogor (Java, Indonesia) shows the highest variation and seems closest to the species center.
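As a rough illustration of how a per-population migration/drift parameter Θ relates to differentiation, the sketch below assumes the textbook infinite-island result that subpopulation allele frequencies scatter around the migrant-pool frequency p as Beta(Θp, Θ(1 − p)), which gives F_ST = 1/(1 + Θ). The abstract does not state this parameterization, so it is an assumption here, and all function names are hypothetical. This is not the authors' MCMC method.

```python
import random


def fst_from_theta(theta):
    """Expected F_ST under the assumed relation F_ST = 1 / (1 + theta)."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    return 1.0 / (1.0 + theta)


def simulate_subpop_freqs(p_migrant, theta, n_subpops, rng):
    """Draw biallelic subpopulation frequencies from Beta(theta*p, theta*(1-p))."""
    a = theta * p_migrant
    b = theta * (1.0 - p_migrant)
    return [rng.betavariate(a, b) for _ in range(n_subpops)]


def empirical_fst(freqs, p_migrant):
    """Variance of subpopulation frequencies around p, scaled by p * (1 - p)."""
    var = sum((f - p_migrant) ** 2 for f in freqs) / len(freqs)
    return var / (p_migrant * (1.0 - p_migrant))


if __name__ == "__main__":
    rng = random.Random(0)
    for theta in (0.5, 4.0, 50.0):  # small theta: peripheral; large theta: central
        freqs = simulate_subpop_freqs(0.3, theta, 20000, rng)
        print(theta, fst_from_theta(theta), empirical_fst(freqs, 0.3))
```

Under this assumption, a small Θ corresponds to a strongly differentiated (peripheral) population and a large Θ to one that closely tracks the migrant pool.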

2013 ◽  
Vol 846-847 ◽  
pp. 1304-1307
Author(s):  
Ye Wang ◽  
Yan Jia ◽  
Lu Min Zhang

Mining partial orders from sequence data is an important data mining task with broad applications. Because partial order mining is an NP-hard problem, many efficient pruning algorithms have been proposed. In this paper, we improve a classical algorithm for discovering frequent closed partial orders (FCPO) from strings. For general sequences, we treat items that appear together as having an equal chance when calculating the detecting matrix used for pruning. Experimental evaluations on a real data set show that our algorithm can effectively mine FCPO from sequences.
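The abstract does not define its detecting matrix, so the following is only a generic sketch of the simplest ingredient of partial order mining: counting how many strings support each pairwise order "a before b" and keeping the pairs that reach a minimum support. It assumes each item appears at most once per string; the function names are hypothetical and this is not the paper's algorithm.

```python
from itertools import permutations


def pair_support(strings):
    """Count, for each ordered pair (a, b), the strings in which a precedes b."""
    counts = {}
    for s in strings:
        # Assumes no repeated items within a string.
        pos = {item: i for i, item in enumerate(s)}
        for a, b in permutations(pos, 2):
            if pos[a] < pos[b]:
                counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts


def frequent_pairs(strings, min_support):
    """Ordered pairs supported by at least min_support strings."""
    return {pair for pair, c in pair_support(strings).items() if c >= min_support}


if __name__ == "__main__":
    print(sorted(frequent_pairs(["abc", "acb", "bac"], min_support=2)))
```

Pairs that fail the support threshold cannot belong to any frequent partial order, which is the kind of fact a pruning step can exploit.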


Genome ◽  
2009 ◽  
Vol 52 (3) ◽  
pp. 217-221 ◽  
Author(s):  
Xia Shen ◽  
Bruce Walsh ◽  
Jing J. Li ◽  
Hong X. Pang ◽  
Wen J. Wang ◽  
...  

While many studies of cis-elements CArG bound by serum response factor (SRF) are in progress, little is known about the positional distribution of the functional CArG elements around the transcription start site (TSS) of genes that they influence. We use a validated CArG data set to calculate the distance distribution of functional CArG elements around the TSS. Distances between adjacent CArGs were also analyzed. We compare these distributions with those derived using a control set of randomly selected CArGs (that were not experimentally validated for function). Our results show that most functional CArG elements (108 of 152, 71%) exist upstream of the annotated TSS, with copy number increasing as one moves closer to the TSS. Moreover, the average number of the CArG elements in the CArG-containing genes is significantly more than that in the control genes. Our study extends earlier bioinformatic analyses of functional CArG elements and provides an application of comparative sequence data to the identification of transcription factor binding sites.
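The positional analysis described above can be sketched as follows, assuming each element is given as a genomic coordinate together with the TSS coordinate and strand of its gene. The coordinate convention (negative means upstream) and all names are illustrative assumptions, not the authors' pipeline.

```python
def signed_distance(element_pos, tss_pos, strand):
    """Distance from the TSS; negative values are upstream, positive downstream."""
    offset = element_pos - tss_pos
    return offset if strand == "+" else -offset


def upstream_fraction(distances):
    """Fraction of elements located upstream of the TSS."""
    upstream = sum(1 for d in distances if d < 0)
    return upstream / len(distances)


def histogram(distances, bin_width):
    """Count elements per distance bin, keyed by the bin's lower edge."""
    counts = {}
    for d in distances:
        start = (d // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical (element, TSS, strand) records, for illustration only.
    records = [(900, 1000, "+"), (1100, 1000, "-"), (1020, 1000, "+")]
    dists = [signed_distance(*r) for r in records]
    print(dists, upstream_fraction(dists), histogram(dists, 100))
```

With the counts reported in the abstract, 108 upstream elements out of 152 give a fraction of about 0.71.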


Author(s):  
Sara Fuentes-Soriano ◽  
Elizabeth A. Kellogg

Physarieae is a small tribe of herbaceous annual and woody perennial mustards that are mostly endemic to North America and whose members show considerable floral, fruit, and chromosomal variation. Building on a previous study of Physarieae based on morphology and ndhF plastid DNA, we reconstructed the evolutionary history of the tribe using new sequence data from two nuclear markers, and compared the new topologies against previously published cpDNA-based phylogenetic hypotheses. The novel analyses included ca. 420 new sequences of ITS and LUMINIDEPENDENS (LD) markers for 39 and 47 species, respectively, with sampling accounting for all seven genera of Physarieae, including nomenclatural type species, and 11 outgroup taxa. Maximum parsimony, maximum likelihood, and Bayesian analyses showed that these additional markers were largely consistent with the previous ndhF data that supported the monophyly of Physarieae and resolved two major clades within the tribe, i.e., DDNLS (Dithyrea, Dimorphocarpa, Nerisyrenia, Lyrocarpa, and Synthlipsis) and PP (Paysonia and Physaria). New analyses also increased internal resolution for some closely related species and lineages within both clades. The monophyly of Dithyrea and the sister relationship of Paysonia to Physaria were consistent in all trees, with the sister relationship of Nerisyrenia to Lyrocarpa supported by ndhF and ITS, and the positions of Dimorphocarpa and Synthlipsis shifted within the DDNLS clade depending on the employed data set. Finally, using the strong, new phylogenetic framework of combined cpDNA + nDNA data, we discussed standing hypotheses of trichome evolution in the tribe suggested by ndhF.


2012 ◽  
Vol 60 (1) ◽  
pp. 32 ◽  
Author(s):  
Laurence J. Clarke ◽  
Duncan I. Jardine ◽  
Margaret Byrne ◽  
Kelly Shepherd ◽  
Andrew J. Lowe

Atriplex sp. Yeelirrie Station (L. Trotter & A. Douglas LCH 25025) is a highly restricted, potentially new species of saltbush, known from only two sites ~30 km apart in central Western Australia. Knowledge of genetic structure within the species is required to inform conservation strategies as both populations occur within a palaeovalley that contains significant near-surface uranium mineralisation. We investigate the structure of genetic variation within populations and subpopulations of this taxon using nuclear microsatellites. Internal transcribed spacer sequence data places this new taxon within a clade of polyploid Atriplex species, and the maximum number of alleles per locus suggests it is hexaploid. The two populations possessed similar levels of genetic diversity, but exhibited a surprising level of genetic differentiation given their proximity. Significant isolation by distance over scales of less than 5 km suggests dispersal is highly restricted. In addition, the proportion of variation between the populations (12%) is similar to that among A. nummularia populations sampled at a continent-wide scale (several thousand kilometres), and only marginally less than that between distinct A. nummularia subspecies. Additional work is required to further clarify the exact taxonomic status of the two populations. We propose management recommendations for this potentially new species in light of its highly structured genetic variation.


mSystems ◽  
2018 ◽  
Vol 3 (3) ◽  
Author(s):  
Gabriel A. Al-Ghalith ◽  
Benjamin Hillmann ◽  
Kaiwei Ang ◽  
Robin Shields-Cutler ◽  
Dan Knights

ABSTRACT Next-generation sequencing technology is of great importance for many biological disciplines; however, due to technical and biological limitations, the short DNA sequences produced by modern sequencers require numerous quality control (QC) measures to reduce errors, remove technical contaminants, or merge paired-end reads together into longer or higher-quality contigs. Many tools for each step exist, but choosing the appropriate methods and usage parameters can be challenging because the parameterization of each step depends on the particularities of the sequencing technology used, the type of samples being analyzed, and the stochasticity of the instrumentation and sample preparation. Furthermore, end users may not know all of the relevant information about how their data were generated, such as the expected overlap for paired-end sequences or the type of adaptors used, to make informed choices. This increasing complexity and nuance demand a pipeline that combines existing steps together in a user-friendly way and, when possible, learns reasonable quality parameters from the data automatically. We propose a user-friendly quality control pipeline called SHI7 (canonically pronounced “shizen”), which aims to simplify quality control of short-read data for the end user by predicting presence and/or type of common sequencing adaptors, what quality scores to trim, whether the data set is shotgun or amplicon sequencing, whether reads are paired end or single end, and whether pairs are stitchable, including the expected amount of pair overlap. We hope that SHI7 will make it easier for all researchers, expert and novice alike, to follow reasonable practices for short-read data quality control. IMPORTANCE Quality control of high-throughput DNA sequencing data is an important but sometimes laborious task requiring background knowledge of the sequencing protocol used (such as adaptor type, sequencing technology, insert size/stitchability, paired-endedness, etc.). Quality control protocols typically require applying this background knowledge to selecting and executing numerous quality control steps with the appropriate parameters, which is especially difficult when working with public data or data from collaborators who use different protocols. We have created a streamlined quality control pipeline intended to substantially simplify the process of DNA quality control from raw machine output files to actionable sequence data. In contrast to other methods, our proposed pipeline is easy to install and use and attempts to learn the necessary parameters from the data automatically with a single command.
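To make "stitchable" concrete, the sketch below checks whether a read pair overlaps by comparing the 3' end of the forward read with the start of the reverse complement of the reverse read, and merges the pair when an acceptable overlap is found. It is a generic illustration and not SHI7's implementation; the function names, the minimum overlap of 10 bases, and the 10% mismatch tolerance are assumptions.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def best_overlap(read1, read2, min_overlap=10, max_mismatch_rate=0.1):
    """Longest acceptable overlap between read1's tail and revcomp(read2)'s head.

    Returns 0 when no overlap of at least min_overlap bases passes the
    mismatch threshold.
    """
    rc2 = reverse_complement(read2)
    for length in range(min(len(read1), len(rc2)), min_overlap - 1, -1):
        tail = read1[-length:]
        head = rc2[:length]
        mismatches = sum(1 for a, b in zip(tail, head) if a != b)
        if mismatches <= max_mismatch_rate * length:
            return length
    return 0


def stitch(read1, read2, **kwargs):
    """Merge a read pair into one sequence, or return None if not stitchable."""
    n = best_overlap(read1, read2, **kwargs)
    if n == 0:
        return None
    return read1 + reverse_complement(read2)[n:]


if __name__ == "__main__":
    fragment = "ACGTTGCAAGGCTTAACCGGTTACGATCGATCGGATCCA"  # made-up 39-base insert
    r1 = fragment[:25]
    r2 = reverse_complement(fragment[-25:])
    print(best_overlap(r1, r2), stitch(r1, r2) == fragment)
```

Collecting the overlap lengths over a sample of read pairs would give an estimate of the expected pair overlap that the abstract mentions as a parameter to be learned from the data.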


2006 ◽  
Vol 361 (1475) ◽  
pp. 2045-2053 ◽  
Author(s):  
Daniel Falush ◽  
Mia Torpdahl ◽  
Xavier Didelot ◽  
Donald F Conrad ◽  
Daniel J Wilson ◽  
...  

In bacteria, DNA sequence mismatches act as a barrier to recombination between distantly related organisms and can potentially promote the cohesion of species. We have performed computer simulations which show that the homology dependence of recombination can cause de novo speciation in a neutrally evolving population once a critical population size has been exceeded. Our model can explain the patterns of divergence and genetic exchange observed in the genus Salmonella, without invoking either natural selection or geographical population subdivision. If this model were validated, based on extensive sequence data, it would imply that the named subspecies of Salmonella enterica correspond to good biological species, making species boundaries objective. However, multilocus sequence typing data, analysed using several conventional tools, provide a misleading impression of relationships within S. enterica subspecies enterica and do not provide the resolution to establish whether new species are presently being formed.


The Auk ◽  
2007 ◽  
Vol 124 (1) ◽  
pp. 71-84 ◽  
Author(s):  
W. Andrew Cox ◽  
Rebecca T. Kimball ◽  
Edward L. Braun

Abstract The evolutionary relationship between the New World quail (Odontophoridae) and other groups of Galliformes has been an area of debate. In particular, the relationship between the New World quail and guineafowl (Numidinae) has been difficult to resolve. We analyzed >8 kb of DNA sequence data from 16 taxa that represent all major lineages of Galliformes to resolve the phylogenetic position of New World quail. A combined data set of eight nuclear loci and three mitochondrial regions analyzed with maximum parsimony, maximum likelihood, and Bayesian methods provides congruent and strong support for New World quail being basal members of a phasianid clade that excludes guineafowl. By contrast, the three mitochondrial regions exhibit modest incongruence with each other. This is reflected in the combined mitochondrial analyses that weakly support the Sibley-Ahlquist topology that placed the New World quail basal in relation to guineafowl and led to the placement of New World quail in its own family, sister to the Phasianidae. However, simulation-based topology tests using the mitochondrial data were unable to reject the topology suggested by our combined (mitochondrial and nuclear) data set. By contrast, similar tests using our most likely topology and our combined nuclear and mitochondrial data allow us to strongly reject the Sibley-Ahlquist topology and a topology based on morphological data that unites Old and New World quail. Phylogenetic Position of the New World Quail (Odontophoridae): Eight Nuclear Loci and Three Mitochondrial Regions Contradict Morphology and the Sibley-Ahlquist Phylogeny


2019 ◽  
Author(s):  
Maria Angenica Fulo Regilme ◽  
Megumi Sato ◽  
Tsutomu Tamura ◽  
Reiko Arai ◽  
Marcello Otake Sato ◽  
...  

Abstract Ixodid tick species such as Ixodes ovatus and Haemaphysalis flava are important vectors of tick-borne diseases in Japan. In this study, we used genetic structure at two mitochondrial loci (cox1, 16S rRNA gene) to infer gene flow patterns of I. ovatus and H. flava from Niigata Prefecture, Japan. Samples were collected in 29 (I. ovatus) and 17 (H. flava) sampling locations across Niigata Prefecture (12,584.18 km²). For I. ovatus, pairwise FST and analysis of molecular variance (AMOVA) of cox1 sequences indicated significant among-population differentiation. This was in contrast to H. flava, for which there were few cases of low significant pairwise differentiation. A Mantel test revealed isolation by distance and there was positive spatial autocorrelation of haplotypes in I. ovatus cox1 and 16S sequences, but non-significant results were observed in H. flava in both markers. We found three genetic groups (China 1, China 2 and Japan) in the cox1 I. ovatus tree. Newly sampled I. ovatus grouped together with a published I. ovatus sequence from northern Japan and were distinct from two other I. ovatus groups that were reported from southern China. The three genetic groups in our data set suggest the potential for cryptic species among the groups. While many factors can potentially account for the observed differences in genetic structure between the two species, including population persistence and large-scale patterns of range expansion, the differences in the mobility of hosts of tick immature stages (small mammals in I. ovatus; birds in H. flava) are possibly driving the observed patterns.
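For readers unfamiliar with the Mantel test used above to detect isolation by distance, here is a minimal permutation-based sketch. It correlates the upper triangles of a genetic and a geographic distance matrix and builds a null distribution by relabeling the sites in one matrix. The function names, the one-sided test, and the default of 999 permutations are assumptions, and this is not the authors' software.

```python
import random
from math import sqrt


def _upper(matrix, order):
    """Upper-triangle entries of a square matrix after relabeling by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, one-sided p-value) for a positive matrix correlation."""
    identity = list(range(len(genetic)))
    geo = _upper(geographic, identity)
    observed = _pearson(_upper(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel sites in the genetic matrix only
        if _pearson(_upper(genetic, order), geo) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


if __name__ == "__main__":
    sites = [0, 1, 2, 3, 4, 5]  # hypothetical positions along a transect
    geo = [[abs(a - b) for b in sites] for a in sites]
    gen = [[2 * d for d in row] for row in geo]  # perfectly distance-driven
    print(mantel(gen, geo))
```

Whole sites are permuted rather than individual matrix entries because pairwise distances that share a site are not independent.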

