A flexible pipeline combining clustering and correction tools for prokaryotic and eukaryotic metabarcoding

2019 ◽  
Author(s):  
Miriam I. Brandt ◽  
Blandine Trouche ◽  
Laure Quintric ◽  
Patrick Wincker ◽  
Julie Poulain ◽  
...  

ABSTRACT Environmental metabarcoding is an increasingly popular tool for studying biodiversity in marine and terrestrial biomes. With sequencing costs decreasing, multiple-marker metabarcoding, spanning several branches of the tree of life, is becoming more accessible. However, bioinformatic approaches need to adjust to the diversity of taxonomic compartments targeted as well as to the specificities of each barcode gene. We built and tested a pipeline based on Illumina read correction with DADA2 that allows the analysis of metabarcoding data from prokaryotic (16S) and eukaryotic (18S, COI) life compartments. We implemented the option to cluster Amplicon Sequence Variants (ASVs) into Operational Taxonomic Units (OTUs) with swarm v2, a network-based clustering algorithm, and to further curate the ASVs/OTUs based on sequence similarity and co-occurrence rates using a recently developed algorithm, LULU. Finally, flexible taxonomic assignment was implemented via the Ribosomal Database Project (RDP) Bayesian classifier and BLAST. We validated this pipeline with ribosomal and mitochondrial markers using eukaryotic mock communities and 42 deep-sea sediment samples. The results show that ASVs, reflecting genetic diversity, may not be appropriate for alpha diversity estimation of organisms fitting the biological species concept. The results underline the advantages of clustering and LULU-curation for producing more reliable metazoan biodiversity inventories, and show that LULU is an effective tool for filtering metazoan molecular clusters, although the minimum identity threshold applied to co-occurring OTUs has to be increased for 18S. The comparison of BLAST and the RDP Classifier underlined the potential of the latter to deliver very good assignments, but highlighted the need for a concerted effort to build comprehensive, ecosystem-specific databases adapted to the studied communities.
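The curation step described above, merging a less abundant OTU into a similar, co-occurring and more abundant one, can be illustrated with a minimal Python sketch. This is not the LULU implementation: the function name `curate`, the toy abundance table, and the threshold values `min_identity` and `min_cooccurrence` are illustrative assumptions, and pairwise identities are supplied as a precomputed lookup rather than calculated from sequences.

```python
from typing import Dict, List, Tuple


def curate(
    table: Dict[str, List[int]],
    identity: Dict[Tuple[str, str], float],
    min_identity: float = 84.0,      # placeholder threshold (percent identity)
    min_cooccurrence: float = 0.95,  # placeholder threshold (fraction of samples)
) -> Dict[str, str]:
    """Map each OTU to the OTU it is merged into (itself if it is retained)."""
    # Visit OTUs from most to least abundant so potential parents are settled first.
    order = sorted(table, key=lambda otu: sum(table[otu]), reverse=True)
    parent_of: Dict[str, str] = {}
    for i, child in enumerate(order):
        parent_of[child] = child
        present = [s for s, n in enumerate(table[child]) if n > 0]
        if not present:
            continue
        for candidate in order[:i]:
            if parent_of[candidate] != candidate:
                continue  # only retained OTUs may act as parents
            pair = identity.get((child, candidate),
                                identity.get((candidate, child), 0.0))
            if pair < min_identity:
                continue
            # Fraction of the child's samples in which the candidate also occurs.
            shared = [s for s in present if table[candidate][s] > 0]
            cooccurrence = len(shared) / len(present)
            # The candidate must be at least as abundant wherever both occur.
            dominant = all(table[candidate][s] >= table[child][s] for s in shared)
            if cooccurrence >= min_cooccurrence and dominant:
                parent_of[child] = candidate
                break
    return parent_of


# Hypothetical read counts for three OTUs across three samples.
toy_table = {
    "otu_a": [100, 80, 0],
    "otu_b": [5, 3, 0],
    "otu_c": [0, 0, 40],
}
toy_identity = {("otu_b", "otu_a"): 97.0, ("otu_c", "otu_a"): 98.0}
print(curate(toy_table, toy_identity))
```

In this toy table, `otu_b` always co-occurs with the more abundant `otu_a` and is merged into it, whereas `otu_c` never co-occurs with `otu_a` and is retained despite its high identity. Raising `min_identity`, as the abstract suggests for 18S, makes merging more conservative.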

Plant Disease ◽  
2010 ◽  
Vol 94 (11) ◽  
pp. 1372-1372 ◽  
Author(s):  
M. Lembicz ◽  
K. Górzyńska ◽  
A. Leuchtmann

Agropyron repens (synonym Elymus repens, couch grass) is a species native to Europe and Asia. In Poland, it is a common weed of crop fields. In May 2008, we noticed for the first time symptoms of choke disease (caused by Epichloë spp.) on A. repens at two localities in central Poland. The localities, Pakość (52°47.531′N, 18°06.118′E) and Dulsk (52°45.329′N, 18°20.518′E), are located 16 km apart from each other. The following year, we confirmed the occurrence of choke disease on couch grass at these localities. Stromata were formed on reproductive stems that did not produce inflorescences. They ranged from 16 to 31 mm long and were covered with perithecia 520 to 560 × 160 to 250 μm at a density of 35 to 45 per mm². Asci measured 270 to 310 × 5.2 to 6.5 μm and ascospores were 225 to 275 × 1.5 to 1.7 μm (specimen deposited in ZT). Morphological characters match the original description of Epichloë bromicola (4). One strain was isolated from stromatal tissue and the partial DNA sequence of tubB including introns 1 to 3 was obtained as previously described (2). In a phylogenetic analysis, the isolate (GenBank Accession No. GU325782) grouped with Epichloë isolates from other Agropyron spp. from Poland (A. intermedium) and Japan (A. ciliare and A. tsukushiense) and with an isolate from a Roegneria sp. (from China). Experimental mating tests involving isolates from A. intermedium and a Roegneria sp. indicated that these isolates were sexually compatible with Epichloë bromicola from Bromus erectus. Similarly, E. yangsii was compatible with E. bromicola. This suggests that Epichloë isolates from Agropyron, Roegneria, and Bromus hosts form a common mating population, and implies that under a biological species concept the phylogenetic definition of E. bromicola has to be broadened. Epichloë on A. repens has been previously found in Poland (1), Germany (3), Hungary, and Romania (specimen deposited in herbarium of ETH Zurich, ZT) based on incidental records or on herbarium specimens that have been listed under E. typhina. Our study, based on morphology, tubB sequence similarity, and mating compatibility, suggests that the fungus infecting A. repens in Poland is E. bromicola. References: (1) I. Adamska. Acta Mycol. 36:31, 2001. (2) D. Brem and A. Leuchtmann. Evolution 57:37, 2003. (3) J. Kohlmeyer and E. Kohlmeyer. Mycologia 66:77, 1974. (4) A. Leuchtmann and C. L. Schardl. Mycol. Res. 102:1169, 1998.


PLoS ONE ◽  
2021 ◽  
Vol 16 (4) ◽  
pp. e0249113
Author(s):  
Paul N. Pearson ◽  
Luke Penny

Planktonic foraminifera are heterotrophic sexually reproducing marine protists with an exceptionally complete fossil record that provides unique insights into long-term patterns and processes of evolution. Populations often exhibit strong biases towards either right (dextral) or left (sinistral) shells. Deep-sea sediment cores spanning millions of years reveal that some species show large and often rapid fluctuations in their dominant coiling direction through time. This is useful for biostratigraphic correlation but further work is required to understand the population dynamical processes that drive these fluctuations. Here we address the case of coiling fluctuations in the planktonic foraminifer genus Pulleniatina based on new high-resolution counts from two recently recovered sediment cores from either side of the Indonesian through-flow in the tropical west Pacific and Indian Oceans (International Ocean Discovery Program Sites U1486 and U1483). We use single-specimen stable isotope analyses to show that dextral and sinistral shells from the same sediment samples can show significant differences in both carbon and oxygen isotopes, implying a degree of ecological separation between populations. In one case we detect a significant difference in size between dextral and sinistral specimens. We suggest that major fluctuations in coiling ratio are caused by cryptic populations replacing one another in competitive sweeps, a mode of evolution that is more often associated with asexual organisms than with the classical ‘biological species concept’.


2021 ◽  
Vol 102 (4) ◽  
Author(s):  
Yiyuan Li ◽  
Angela C. O’Donnell ◽  
Howard Ochman

Mosquito-borne arboviruses, including a diverse array of alphaviruses and flaviviruses, lead to hundreds of millions of human infections each year. Current methods for species-level classification of arboviruses adhere to guidelines prescribed by the International Committee on Taxonomy of Viruses (ICTV), and generally apply a polyphasic approach that might include information about viral vectors, hosts, geographical distribution, antigenicity, levels of DNA similarity, disease association and/or ecological characteristics. However, there is substantial variation in the criteria used to define viral species, which can lead to the establishment of artificial boundaries between species and inconsistencies when inferring their relatedness, variation and evolutionary history. In this study, we apply a single, uniform principle – that underlying the Biological Species Concept (BSC) – to define biological species of arboviruses based on recombination between genomes. Given that few recombination events have been documented in arboviruses, we investigate the incidence of recombination within and among major arboviral groups using an approach based on the ratio of homoplastic sites (recombinant alleles) to non-homoplastic sites (vertically transmitted alleles). This approach supports many ICTV-designations but also recognizes several cases in which a named species comprises multiple biological species. These findings demonstrate that this metric may be applied to all lifeforms, including viruses, and lead to more consistent and accurate delineation of viral species.


Author(s):  
Amanda Cicchino

Reproductive isolation is the hallmark of speciation as defined by the biological species concept. A species that is evolving towards reproductive isolation, but has not reached full isolation, is defined as an incipient species. One mechanism used by incipient species to further drive speciation is the use of mate recognition signals. The spring peeper, Pseudacris crucifer, is a North American frog that can be classified as an incipient species, as previous studies have found 6 distinct mitochondrial lineages within its range. Spring peepers use vocal signals for mate recognition and exhibit a female choice mating system where the males call to attract females. This study investigates the evolution of calling in spring peepers. Using calls from each lineage across the full range of spring peepers, I analyzed 11 different characteristics to determine whether the calls were different, and if so, which characteristics are being selected for. Preliminary evidence suggests that the calls between the lineages are distinct and that certain characteristics of the call are more heavily selected for than others. Full analysis on the data has not been completed at this time. This study will expand the understanding of the evolution of spring peepers, as well as offer insight into the role of mating systems on reproductive isolation.


Author(s):  
Hai-zhen Zhou ◽  
Jian Zhang ◽  
Qing-lei Sun

In this study, we report a Gram-stain-negative, orange-coloured, rod-shaped, motile and facultatively anaerobic bacterium named strain PB63T, which was isolated from deep-sea sediment from the Mariana Trench. Growth of PB63T occurred at 10–35 °C (optimum, 28 °C), pH 5.0–8.0 (optimum, 5.0–6.0) and with 0–7 % (w/v) NaCl (optimum, 2–3 %). The results of phylogenetic analysis based on 16S rRNA gene sequences indicated that PB63T represented a member of the genus Novosphingopyxis and was closely related to Novosphingopyxis baekryungensis DSM 16222T (97.9 % sequence similarity). PB63T showed tolerance to a variety of heavy metals, including Co2+, Zn2+, Mn2+ and Cu2+. The complete genome of PB63T was obtained, and many genes involved in heavy metal resistance were found. The genomic DNA G+C content of PB63T was 62.8 mol%. The predominant respiratory quinone of PB63T was ubiquinone-10 (Q-10). The polar lipids of PB63T contained diphosphatidylglycerol, phosphatidylglycerol, phosphatidylethanolamine, sphingoglycolipid, glycolipid, phosphatidylcholines and three unidentified lipids. The major fatty acids of PB63T included summed feature 8 (C18 : 1ω7c and/or C18 : 1ω6c), C14 : 0 2-OH, 11-methyl C18 : 1ω7c, C16 : 0, summed feature 3 (C16 : 1ω7c and/or C16 : 1ω6c) and C17 : 1ω6c. The results of phylogenetic, physiological, biochemical and morphological analyses indicated that strain PB63T represents a novel species of the genus Novosphingopyxis, and the name Novosphingopyxis iocasae sp. nov. is proposed with the type strain PB63T (=CCTCC AB 2019195T=JCM 34178T).


Author(s):  
Sanghoon Jun ◽  
Seungmin Rho ◽  
Eenjun Hwang

A typical music clip consists of one or more segments with different moods, and such mood information could be a crucial clue for determining the similarity between music clips. Typically, one representative mood is selected for each music clip for retrieval, recommendation or classification purposes, which often gives unsatisfactory results. In this paper, the authors propose a new music retrieval and recommendation scheme based on the mood sequence of music clips. The authors first divide each music clip into segments through beat structure analysis, then apply the k-medoids clustering algorithm to group all the segments into clusters with similar features. By assigning a unique mood symbol to each cluster, one can transform each music clip into a musical mood sequence. For music retrieval, the authors use the Smith-Waterman (SW) algorithm to measure the similarity between mood sequences. For music recommendation, however, user preferences are retrieved from a recent music playlist or from user interaction through the interface, which generates a music recommendation list based on mood sequence similarity. The authors demonstrate that the proposed scheme achieves excellent performance in terms of retrieval accuracy and user satisfaction in music recommendation.
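As an illustration of how Smith-Waterman local alignment can score the similarity of two mood sequences, here is a minimal Python sketch. The scoring values (`match`, `mismatch`, `gap`) and the single-letter mood symbols are assumptions chosen for the example; the abstract does not state the authors' scoring scheme.

```python
from typing import Sequence


def smith_waterman(
    a: Sequence[str],
    b: Sequence[str],
    match: int = 2,      # assumed reward for identical mood symbols
    mismatch: int = -1,  # assumed penalty for differing symbols
    gap: int = -1,       # assumed linear gap penalty
) -> int:
    """Return the best local alignment score between two symbol sequences."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            # Local alignment: a cell never drops below zero.
            score[i][j] = max(0, diagonal, up, left)
            best = max(best, score[i][j])
    return best


# Hypothetical mood sequences, one symbol per clip segment.
clip_1 = "AABCD"
clip_2 = "XABCY"
print(smith_waterman(clip_1, clip_2))  # shared run "ABC" scores 3 * 2 = 6
```

Because the alignment is local, two clips that share a run of moods score highly even when their openings and endings differ, which suits clips of unequal length.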


2018 ◽  
Vol 93 (2) ◽  
pp. 226-241 ◽  
Author(s):  
S.P. Stock ◽  
R. Campos-Herrera ◽  
F.E. El-Borai ◽  
L.W. Duncan

Abstract In this study, molecular (ribosomal sequence data), morphological and cross-hybridization properties were used to identify a new Steinernema sp. from Florida, USA. Molecular and morphological data provided evidence for placing the novel species into Clade V, or the ‘glaseri-group’ of Steinernema spp. Within this clade, analysis of sequence data of the rDNA genes, 28S and internal transcribed spacer (ITS), depicted the novel species as a distinctive entity and closely related to S. glaseri and S. cubanum. Additionally, cross-hybridization assays showed that the new species is unable to interbreed with either of the latter two species, reinforcing its uniqueness from a biological species concept standpoint. Key morphological diagnostic characters for S. khuongi n. sp. include the mean morphometric features of the third-stage infective juveniles: total body length (average: 1066 μm), tail length (average: 65 μm), location of the excretory pore (average: 80.5 μm) and the values of c (average: 16.4), D% (average: 60.5), E% (average: 126) and H% (average: 46.6). Additionally, males can be differentiated from S. glaseri and S. cubanum by the values of several ratios: D% (average: 68), E% (average: 323) and SW% (average: 120). The natural distribution of this species in Florida encompasses both natural areas and citrus groves, primarily in shallow groundwater ecoregions designated as ‘flatwoods’. The morphological, molecular, phylogenetic and ecological data associated with this nematode support its identity as a new species in the S. glaseri-group.


2020 ◽  
Vol 36 (18) ◽  
pp. 4699-4705
Author(s):  
Hamid Bagheri ◽  
Andrew J Severin ◽  
Hridesh Rajan

Abstract Motivation: As the cost of sequencing decreases, the amount of data being deposited into public repositories is increasing rapidly. Public databases rely on the user to provide metadata for each submission, which is prone to user error. Unfortunately, most public databases, such as non-redundant (NR), rely on user input and do not have methods for identifying errors in the provided metadata, leading to the potential for error propagation. Previous research on a small subset of the NR database analyzed misclassification based on sequence similarity. To the best of our knowledge, the amount of misclassification in the entire database has not been quantified. We propose a heuristic method to detect potentially misclassified taxonomic assignments in the NR database. We applied a curation technique and quality control to find the most probable taxonomic assignment. Our method incorporates provenance and frequency of each annotation from manually and computationally created databases and clustering information at 95% similarity. Results: We found more than two million potentially taxonomically misclassified proteins in the NR database. Using simulated data, we show a high precision of 97% and a recall of 87% for detecting taxonomically misclassified proteins. The proposed approach and findings could also be applied to other databases. Availability and implementation: Source code, dataset, documentation, Jupyter notebooks and Docker container are available at https://github.com/boalang/nr. Supplementary information: Supplementary data are available at Bioinformatics online.
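The frequency component of such a heuristic can be sketched in Python as follows. This is a simplified illustration, not the authors' method: it ignores annotation provenance, and the function name `flag_outliers`, the `min_share` threshold and the protein identifiers are hypothetical. It assumes proteins have already been grouped into similarity clusters and flags members whose taxon disagrees with a clearly dominant one.

```python
from collections import Counter
from typing import Dict, List


def flag_outliers(cluster: Dict[str, str], min_share: float = 0.8) -> List[str]:
    """Return IDs whose taxon differs from the dominant taxon of the cluster.

    `cluster` maps a protein ID to its annotated taxon. `min_share` is an
    assumed threshold: the dominant taxon must cover at least this fraction
    of the cluster before any member is flagged.
    """
    if not cluster:
        return []
    counts = Counter(cluster.values())
    top_taxon, top_count = counts.most_common(1)[0]
    if top_count / len(cluster) < min_share:
        return []  # no clear consensus, so nothing is flagged
    return sorted(pid for pid, taxon in cluster.items() if taxon != top_taxon)


# Hypothetical cluster of six proteins grouped at high sequence similarity.
example_cluster = {
    "prot_1": "Escherichia coli",
    "prot_2": "Escherichia coli",
    "prot_3": "Escherichia coli",
    "prot_4": "Escherichia coli",
    "prot_5": "Escherichia coli",
    "prot_6": "Homo sapiens",
}
print(flag_outliers(example_cluster))
```

Here `prot_6` is flagged because five of the six members agree on a different taxon. A cluster split evenly between two taxa yields no flags, reflecting that frequency alone cannot say which annotation is wrong.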


2014 ◽  
Vol 64 (Pt_2) ◽  
pp. 668-674 ◽  
Author(s):  
Xiaoyang Fan ◽  
Tong Yu ◽  
Zhao Li ◽  
Xiao-Hua Zhang

Three Gram-stain-negative, strictly aerobic, rod-shaped, yellow-pigmented bacteria with a single polar flagellum, designated strains XH031T, XH038-3 and XH80-1, were isolated from deep-sea sediment of the South Pacific Gyre (41° 51′ S 153° 6′ W) during the Integrated Ocean Drilling Program (IODP) Expedition 329. Phylogenetic analysis based on 16S rRNA gene sequences indicated that the isolates belonged to the genus Luteimonas and showed the highest 16S rRNA gene sequence similarity with Luteimonas aestuarii B9T (96.95 %), Luteimonas huabeiensis HB2T (96.93 %) and Xanthomonas cucurbitae LMG 690T (96.92 %). The DNA G+C contents of the three isolates were 70.2–73.9 mol%. The major fatty acids were iso-C15 : 0, iso-C16 : 0, iso-C11 : 0 and C16 : 0 10-methyl and/or iso-C17 : 1ω9c. The major respiratory quinone was ubiquinone-8 (Q-8). The major polar lipids were phosphatidylethanolamine, phosphatidylglycerol, diphosphatidylglycerol and one unknown phospholipid. On the basis of data from polyphasic analysis, the three isolates represent a novel species of the genus Luteimonas, for which the name Luteimonas abyssi sp. nov. is proposed. The type strain is XH031T (=DSM 25880T=CGMCC 1.12611T).


PLoS ONE ◽  
2013 ◽  
Vol 8 (6) ◽  
pp. e68267 ◽  
Author(s):  
Lélia Lagache ◽  
Jean-Benoist Leger ◽  
Jean-Jacques Daudin ◽  
Rémy J. Petit ◽  
Corinne Vacher
