Sanger sequence data

UQ eSpace ◽  
2021 ◽  
Author(s):  
Mohamed Cassim Mohamed Zakeel ◽  
Akinsanmi Olufemi ◽  
Geering Andrew


2020 ◽  
Author(s):  
Long-Fei Fu ◽  
Fang Wen ◽  
Olivier Maurin ◽  
Michele Rodda ◽  
Yi-Gang Wei ◽  
...  

Abstract: Pilea Lindl., with 933 published names, is the largest genus within the Urticaceae. Pilea was last monographed in 1869 and whilst the monophyly of the genus has been proposed by previous authors, this has been based on incomplete taxon sampling and the failure to resolve the position of key taxa. We aimed to generate a robust phylogeny for Pilea and allied genera that could provide a framework for testing the monophyly of Pilea, revising its delimitation and for answering broader scientific questions about this species-rich genus. To do so, we sought to sample taxa representative of previous infrageneric classifications and with anomalous inflorescences or flower configurations and to use the resulting phylogeny to evaluate the delimitation of Pilea and to establish an infrageneric classification. In addition, we included a representative of the Polynesian genus Haroldiella which, morphologically, is very similar to Pilea. Using Sanger sequence data from two plastid and one nuclear regions we constructed a phylogeny using Bayesian Inference, Maximum Likelihood and Maximum Parsimony approaches. We used our phylogeny to evaluate the informativeness of 19 morphological traits and applied both to delimit a monophyletic genus and infrageneric sections. Our results recovered Pilea as paraphyletic with respect to Lecanthus, a consequence of the recovery of a monophyletic clade comprising sections Achudemia and Smithiella, neither of which had been adequately sampled in previous studies. We also recovered Pilea as polyphyletic with respect to Haroldiella. We identified isomery between male and female flowers, flower part number and male sepal arrangement as being phylogenetically informative traits that can be used to delimit two genera, Achudemia, including section Smithiella, recovered as sister to Lecanthus, and Pilea, including Haroldiella, recovered as sister to both. 
On the basis of our evaluation of both morphological traits and phylogenetic relationships we propose a new infrageneric classification for the genus comprising seven sections, five of which we describe for the first time, § Trimeris Y.G.Wei & A.K.Monro, § Lecanthoides C.J.Chen, § Angulata L.F.Fu & Y.G.Wei, § Tetrameris C.J.Chen, § Verrucosa L.F.Fu & Y.G.Wei, § Plataniflora L.F.Fu & Y.G.Wei and § Leiocarpa L.F.Fu & Y.G.Wei. We also identify a trend of decreasing merism and fruit size, and increasing species-richness as Pilea diverges. In addition, we recover strong geographical structure within our phylogeny, sufficient to propose that Pilea originated in the IndoMalaya biogeographic domain.


2019 ◽  
Author(s):  
Xavier Turon ◽  
Adrià Antich ◽  
Creu Palacín ◽  
Kim Præbel ◽  
Owen Simon Wangensteen

Abstract: Metabarcoding is by now a well-established method for biodiversity assessment in terrestrial, freshwater and marine environments. Metabarcoding datasets are usually used for α- and β-diversity estimates, that is, interspecies (or inter-MOTU) patterns. However, the use of hypervariable metabarcoding markers may provide an enormous amount of intraspecies (intra-MOTU) information, mostly untapped so far. The use of cytochrome oxidase (COI) amplicons is gaining momentum in metabarcoding studies targeting eukaryote richness. COI has long been the marker of choice in population genetics and phylogeographic studies. Therefore, COI metabarcoding datasets may be used to study intraspecies patterns and phylogeographic features for hundreds of species simultaneously, opening a new field which we suggest naming metaphylogeography. The main challenge for the implementation of this approach is the separation of erroneous sequences from true intra-MOTU variation. Here, we develop a cleaning protocol based on changes in entropy of the different codon positions of the COI sequence, together with co-occurrence patterns of sequences. Using a dataset of community DNA from several benthic littoral communities in the Mediterranean and Atlantic seas, we first tested by simulation on a subset of sequences a two-step cleaning approach consisting of a denoising step followed by a minimal abundance filtering. The procedure was then applied to the whole dataset. We obtained a total of 563 MOTUs that were usable for phylogeographic inference. We used semiquantitative rank data instead of read abundances to perform AMOVAs and haplotype networks. Genetic variability was mainly concentrated within samples, but with an important between-seas component as well. There were inter-group differences in the amount of variability between and within communities in each sea. 
For two species the results could be compared with traditional Sanger sequence data available for the same zones, giving similar patterns. Our study shows that metabarcoding data can be used to infer intra- and interpopulation genetic variability of many species at a time, providing a new method with great potential for basic biogeography, connectivity and dispersal studies, and for the more applied fields of conservation genetics, invasion genetics, and design of protected areas.
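
The entropy-by-codon-position idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' cleaning protocol; it only shows how mean Shannon entropy might be computed separately for codon positions 1, 2 and 3. The function name `codon_position_entropy` and the toy sequences are hypothetical, and the sketch assumes in-frame, equal-length sequences with no gaps.

```python
import math
from collections import Counter


def codon_position_entropy(aligned_seqs):
    """Return the mean Shannon entropy (bits) for codon positions 1, 2 and 3.

    Assumption: all sequences have equal length and start in frame,
    so index 0 is codon position 1.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")

    totals = [0.0, 0.0, 0.0]  # summed entropy per codon position
    counts = [0, 0, 0]        # number of alignment columns per codon position
    for i in range(length):
        column = Counter(seq[i] for seq in aligned_seqs)
        n = sum(column.values())
        h = -sum((c / n) * math.log2(c / n) for c in column.values())
        totals[i % 3] += h
        counts[i % 3] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


# Hypothetical toy data: variation occurs only at third codon positions.
toy = ["ATGGCA", "ATAGCG", "ATGGCA", "ATAGCG"]
print(codon_position_entropy(toy))
```

In protein-coding markers such as COI, genuine variation tends to concentrate at third codon positions, so a shift in the relative entropy of the three positions is one plausible signal for separating sequencing errors from real variants.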


2019 ◽  
Vol 286 (1901) ◽  
pp. 20190079 ◽  
Author(s):  
Joanna M. Wolfe ◽  
Jesse W. Breinholt ◽  
Keith A. Crandall ◽  
Alan R. Lemmon ◽  
Emily Moriarty Lemmon ◽  
...  

Comprising over 15 000 living species, decapods (crabs, shrimp and lobsters) are the most instantly recognizable crustaceans, representing a considerable global food source. Although decapod systematics have received much study, limitations of morphological and Sanger sequence data have yet to produce a consensus for higher-level relationships. Here, we introduce a new anchored hybrid enrichment kit for decapod phylogenetics designed from genomic and transcriptomic sequences that we used to capture new high-throughput sequence data from 94 species, including 58 of 179 extant decapod families, and 11 of 12 major lineages. The enrichment kit yields 410 loci (greater than 86 000 bp) conserved across all lineages of Decapoda, more clade-specific molecular data than any prior study. Phylogenomic analyses recover a robust decapod tree of life strongly supporting the monophyly of all infraorders, and monophyly of each of the reptant, ‘lobster’ and ‘crab’ groups, with some results supporting pleocyemate monophyly. We show that crown decapods diverged in the Late Ordovician and most crown lineages diverged in the Triassic–Jurassic, highlighting a cryptic Palaeozoic history, and post-extinction diversification. New insights into decapod relationships provide a phylogenomic window into morphology and behaviour, and a basis to rapidly and cheaply expand sampling in this economically and ecologically significant invertebrate clade.


2020 ◽  
Author(s):  
Rogier Bodewes ◽  
Linda Reijnen ◽  
Jeroen Kerkhof ◽  
Jeroen Cremer ◽  
Dennis Schmitz ◽  
...  

Abstract: Mumps cases continue to occur, even in countries with a relatively high vaccination rate. The last major outbreaks of mumps in the Netherlands were from 2009-2012; thereafter, only small clusters and single cases were reported. Molecular epidemiology can provide insights into the circulation of mumps viruses. The aims of the present study were to analyze the molecular epidemiology of mumps viruses in the Netherlands in 2017-2019 and to elucidate whether complete genome sequencing adds to the molecular resolution of mumps viruses when compared to sequencing of the mumps SH gene and non-coding regions (SH+NCRs). To this end, Sanger sequence data from the SH+NCRs were analyzed from 82 mumps genotype G viruses. In addition, the complete genomes were obtained from 10 mumps virus isolates using next-generation sequencing. Analysis of SH+NCRs of mumps viruses revealed the presence of two major lineages in the Netherlands, which was confirmed by analysis of complete genomes. Comparison of molecular resolution obtained with SH+NCRs and complete genomes clearly indicated that additional molecular resolution can be obtained by analyzing complete genomes. In conclusion, analysis of SH+NCRs sequence data from recent mumps genotype G viruses indicates that mumps viruses continue to circulate in the Netherlands and surrounding countries. However, to understand exact transmission trees and to compare mumps viruses on a large geographic scale, analysis of complete genomes is a very useful approach.


2018 ◽  
Author(s):  
Joanna M. Wolfe ◽  
Jesse W. Breinholt ◽  
Keith A. Crandall ◽  
Alan R. Lemmon ◽  
Emily Moriarty Lemmon ◽  
...  

Abstract: Comprising over 15,000 living species, decapods (crabs, shrimp, and lobsters) are the most instantly recognizable crustaceans, representing a considerable global food source. Although decapod systematics have received much study, limitations of morphological and Sanger sequence data have yet to produce a consensus for higher-level relationships. Here we introduce a new anchored hybrid enrichment kit for decapod phylogenetics designed from genomic and transcriptomic sequences that we used to capture new high-throughput sequence data from 94 species, including 58 of 179 extant decapod families, and 11 of 12 major lineages. The enrichment kit yields 410 loci (>86,000 bp) conserved across all lineages of Decapoda, eight times more molecular data than any prior study. Phylogenomic analyses recover a robust decapod tree of life strongly supporting the monophyly of all infraorders, and monophyly of each of the reptant, ‘lobster’, and ‘crab’ groups, with some results supporting pleocyemate monophyly. We show that crown decapods diverged in the Late Ordovician and most crown lineages diverged in the Triassic-Jurassic, highlighting a cryptic Paleozoic history, and post-extinction diversification. New insights into decapod relationships provide a phylogenomic window into morphology and behavior, and a basis to rapidly and cheaply expand sampling in this economically and ecologically significant invertebrate clade.


2020 ◽  
Author(s):  
Jehangir Khan ◽  
Saber Gholizadeh ◽  
Meichun Zhang ◽  
Dongjing Zhang ◽  
Yu Wu ◽  
...  

Abstract Background: Anopheles stephensi Liston (1901) is the major malaria vector in Asia, and recently in some regions of Africa. This species includes three biological forms, namely “type”, “intermediate” and “mysorensis”, with varying degrees of vector competence for malaria parasites. To identify the biological form of our An. stephensi laboratory strain, we used the morphological features of eggs and several genetic markers, i.e. Obp1 (odorant binding protein), mitochondrial cytochrome oxidase subunits 1 and 2 (COI and COII), and the nuclear internal transcribed spacer 2 locus (ITS2). Methods: Eggs were collected from individual mosquitoes (n = 50) and the number of ridges was counted under a stereomicroscope. DNA was extracted from female mosquitoes. After amplifying fragments using different genetic markers (Obp1, COI, COII and ITS2), the PCR products were purified and sequenced using Sanger sequencing technology. Phylogenetic analysis was performed after aligning query sequences against the submitted sequences in GenBank using bioinformatics software. Results: The number of ridges on each egg float ranged from 12 to 13, which corresponds to the mysorensis form. Sequence analysis for COI, COII and ITS2 demonstrated 100%, 99.46% and 99.29% similarity of our strain with other Chinese, Indian and Iranian strains of An. stephensi. All the sequences of the Obp1 intron I region matched 100% with the previously submitted sequences for An. stephensi sibling C (mysorensis form) from Iran and Afghanistan. Conclusion: The current study describes in detail the morphological and molecular characteristics (sequence data) of the ‘mysorensis’ form of An. stephensi, which could be helpful in elucidating its classification and in differentiating it from other biotypes of the same species and from other similar anopheline species. Our findings confirmed OBP1 as the only genetic marker that successfully recognized our lab strain (mysorensis form/sibling C). 
Thus, using OBP1 alone may be very useful in similar studies in the future. Future studies should involve the development of appropriate and reliable molecular keys (for sibling species identification) complementary to morphological keys.


2018 ◽  
Vol 3 ◽  
pp. 108
Author(s):  
Elise Ruark ◽  
Esty Holt ◽  
Anthony Renwick ◽  
Márton Münz ◽  
Matthew Wakeling ◽  
...  

Evaluating, optimising and benchmarking of next generation sequencing (NGS) variant calling performance are essential requirements for clinical, commercial and academic NGS pipelines. Such assessments should be performed in a consistent, transparent and reproducible fashion, using independently, orthogonally generated data. Here we present ICR142 Benchmarker, a tool to generate outputs for assessing variant calling performance using the ICR142 NGS validation series, a dataset of exome sequence data from 142 samples together with Sanger sequence data at 704 sites. ICR142 Benchmarker provides summary and detailed information on the sensitivity, specificity and false detection rates of variant callers. ICR142 Benchmarker also automatically generates a single page report highlighting key performance metrics and how performance compares to widely-used open-source tools. We used ICR142 Benchmarker with VCF files outputted by GATK, OpEx and DeepVariant to create a benchmark for variant calling performance. This evaluation revealed pipeline-specific differences and shared challenges in variant calling, for example in detecting indels in short repeating sequence motifs. We next used ICR142 Benchmarker to perform regression testing with versions 0.5.2 and 0.6.1 of DeepVariant. This showed that v0.6.1 improves variant calling performance, but there was evidence of some minor changes in indel calling behaviour that may benefit from attention in future updates. The data also allowed us to evaluate filters to optimise DeepVariant calling, and we recommend using 30 as the QUAL threshold for base substitution calls when using DeepVariant v0.6.1. Finally, we used ICR142 Benchmarker with VCF files from two commercial variant calling providers to facilitate optimisation of their in-house pipelines and to provide transparent benchmarking of their performance. 
ICR142 Benchmarker consistently and transparently analyses variant calling performance based on the ICR142 NGS validation series, using the standard VCF input and outputting informative metrics to enable user understanding of pipeline performance. ICR142 Benchmarker is freely available at https://github.com/RahmanTeamDevelopment/ICR142_Benchmarker/releases.
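
The performance metrics named in the abstract above (sensitivity, specificity and false detection rate) can be illustrated with a short sketch. This is not code from ICR142 Benchmarker, and the tool's exact definitions are not given here; the sketch assumes the standard confusion-matrix definitions, and the function name `calling_metrics` and the counts are hypothetical.

```python
def calling_metrics(tp, fp, tn, fn):
    """Compute variant-calling metrics from confusion-matrix counts.

    Assumed (standard) definitions, not taken from the tool itself:
      sensitivity          = TP / (TP + FN)
      specificity          = TN / (TN + FP)
      false detection rate = FP / (TP + FP)
    A metric is NaN when its denominator is zero.
    """
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "false_detection_rate": fp / (tp + fp) if (tp + fp) else nan,
    }


# Hypothetical counts, for illustration only.
metrics = calling_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting the false detection rate alongside sensitivity matters for a validation set like this one, because a caller can reach high sensitivity simply by calling aggressively at difficult sites.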


F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 386 ◽  
Author(s):  
Elise Ruark ◽  
Anthony Renwick ◽  
Matthew Clarke ◽  
Katie Snape ◽  
Emma Ramsay ◽  
...  

To provide a useful community resource for orthogonal assessment of NGS analysis software, we present the ICR142 NGS validation series. The dataset includes high-quality exome sequence data from 142 samples together with Sanger sequence data at 730 sites; 409 sites with variants and 321 sites at which variants were called by an NGS analysis tool, but no variant is present in the corresponding Sanger sequence. The dataset includes 286 indel variants and 275 negative indel sites, and thus the ICR142 validation dataset is of particular utility in evaluating indel calling performance. The FASTQ files and Sanger sequence results can be accessed in the European Genome-phenome Archive under the accession number EGAS00001001332.


F1000Research ◽  
2018 ◽  
Vol 5 ◽  
pp. 386
Author(s):  
Elise Ruark ◽  
Anthony Renwick ◽  
Matthew Clarke ◽  
Katie Snape ◽  
Emma Ramsay ◽  
...  

To provide a useful community resource for orthogonal assessment of NGS analysis software, we present the ICR142 NGS validation series. The dataset includes high-quality exome sequence data from 142 samples together with Sanger sequence data at 704 sites; 416 sites with variants and 288 sites at which variants were called by an NGS analysis tool, but no variant is present in the corresponding Sanger sequence. The dataset includes 293 indel variants and 247 negative indel sites, and thus the ICR142 validation dataset is of particular utility in evaluating indel calling performance. The FASTQ files and Sanger sequence results can be accessed in the European Genome-phenome Archive under the accession number EGAS00001001332.
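
A dataset of this shape, with Sanger-confirmed variant sites and Sanger-negative sites, lends itself to a simple scoring scheme. The sketch below is one possible way to tally calls against such truth sets; it is not part of the ICR142 resource. The site keys `(sample, chromosome, position)`, the function name `score_calls` and all data values are hypothetical, and real indel comparison would also need variant normalisation, which is omitted here.

```python
def score_calls(positive_sites, negative_sites, called_sites):
    """Tally variant calls against Sanger-validated sites.

    positive_sites: sites where Sanger sequencing confirmed a variant
    negative_sites: sites where Sanger sequencing showed no variant
    called_sites:   sites at which the pipeline under test called a variant
    All sites are hashable keys, here (sample, chromosome, position).
    """
    positive_sites = set(positive_sites)
    negative_sites = set(negative_sites)
    called = set(called_sites)

    tp = len(positive_sites & called)  # confirmed variant, called
    fn = len(positive_sites - called)  # confirmed variant, missed
    fp = len(negative_sites & called)  # no variant present, but called
    tn = len(negative_sites - called)  # no variant present, not called
    return {"tp": tp, "fn": fn, "fp": fp, "tn": tn}


# Hypothetical truth sets and calls, for illustration only.
positives = [("S1", "chr1", 100), ("S1", "chr2", 250), ("S2", "chr1", 100)]
negatives = [("S1", "chr3", 75), ("S2", "chr4", 900)]
calls = [("S1", "chr1", 100), ("S2", "chr1", 100), ("S1", "chr3", 75)]

print(score_calls(positives, negatives, calls))
```

Calls at sites outside both truth sets are ignored in this sketch, since the Sanger data say nothing about them.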


2019 ◽  
Vol 44 (4) ◽  
pp. 930-942
Author(s):  
Geraldine A. Allen ◽  
Luc Brouillet ◽  
John C. Semple ◽  
Heidi J. Guest ◽  
Robert Underhill

Abstract—Doellingeria and Eucephalus form the earliest-diverging clade of the North American Astereae lineage. Phylogenetic analyses of both nuclear and plastid sequence data show that the Doellingeria-Eucephalus clade consists of two main subclades that differ from current circumscriptions of the two genera. Doellingeria is the sister group to E. elegans, and the Doellingeria + E. elegans subclade in turn is sister to the subclade containing all remaining species of Eucephalus. In the plastid phylogeny, the two subclades are deeply divergent, a pattern that is consistent with an ancient hybridization event involving ancestral species of the Doellingeria-Eucephalus clade and an ancestral taxon of a related North American or South American group. Divergence of the two Doellingeria-Eucephalus subclades may have occurred in association with northward migration from South American ancestors. We combine these two genera under the older of the two names, Doellingeria, and propose 12 new combinations (10 species and two varieties) for all species of Eucephalus.

