quality filtering
Recently Published Documents


TOTAL DOCUMENTS

66
(FIVE YEARS 26)

H-INDEX

12
(FIVE YEARS 3)

2021 ◽  
Vol 12 ◽  
Author(s):  
Patrice Bonny ◽  
Julien Schaeffer ◽  
Alban Besnard ◽  
Marion Desdouits ◽  
Jean Justin Essia Ngang ◽  
...  

Many recent pandemics have been recognized as zoonotic viral diseases. While their origins frequently remain unknown, environmental contamination may play an important role in emergence. Thus, being able to describe the viral diversity in environmental samples contributes to understanding the key issues in zoonotic transmission. This work describes the use of a metagenomic approach to assess the diversity of eukaryotic RNA viruses in river clams and identify sequences from human or potentially zoonotic viruses. Clam samples collected over 2 years were first screened for the presence of norovirus to verify human contamination. Selected samples were analyzed using metagenomics, including a capture of sequences from viral families infecting vertebrates (VirCapSeq-VERT) before Illumina NovaSeq sequencing. The bioinformatics analysis included pooling of data from triplicates, quality filtering, elimination of bacterial and host sequences, and a deduplication step before de novo assembly. After taxonomic assignment, the viral fraction represented 0.8–15% of reads, with most sequences (68–87%) remaining unassigned. Yet, several mammalian RNA viruses were identified. Contigs identified as belonging to the Astroviridae were the most abundant, with some nearly complete genomes of bastrovirus identified. Picobirnaviridae sequences were related to strains infecting bats, and a few others to strains infecting humans or other hosts. Hepeviridae sequences were mostly related to strains detected in sponge samples but also to strains from swine samples. For Caliciviridae and Picornaviridae, most of the identified sequences were related to strains infecting bats, with a few sequences close to human norovirus, picornavirus, and genogroup V hepatitis A virus. Despite a need to improve the sensitivity of our method, this study describes a large diversity of RNA virus sequences from clam samples. Describing all viral contaminants in this type of food, and identifying the hosts of the viral sequences detected, may help to understand some zoonotic transmission events and alert health authorities to possible emergence.
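The read-level steps named in this abstract (pooling triplicates, quality filtering, deduplication) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the thresholds, the Phred+33 encoding, the exact-match deduplication rule, and the function names `mean_phred` and `filter_and_deduplicate` are all assumptions made for illustration, and host or bacterial read removal is left out.

```python
from statistics import mean


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    return mean(ord(ch) - offset for ch in quality)


def filter_and_deduplicate(reads, min_mean_q=20.0, min_length=50):
    """Keep reads that pass length and mean-quality thresholds (hypothetical
    values), then drop exact duplicate sequences, keeping the first copy."""
    seen = set()
    kept = []
    for name, seq, qual in reads:
        if len(seq) < min_length:
            continue  # too short
        if mean_phred(qual) < min_mean_q:
            continue  # low average base quality
        if seq in seen:
            continue  # exact duplicate of a read already kept
        seen.add(seq)
        kept.append((name, seq, qual))
    return kept


# Pool reads from three replicates (toy data), then filter once on the pool.
replicate_a = [("r1", "ACGT", "IIII")]
replicate_b = [("r2", "ACGT", "IIII"), ("r3", "TTTT", "!!!!")]
replicate_c = [("r4", "AC", "II")]
pooled = replicate_a + replicate_b + replicate_c

clean = filter_and_deduplicate(pooled, min_mean_q=20.0, min_length=4)
print([name for name, _, _ in clean])
```

In this toy run only `r1` survives: `r2` is an exact duplicate, `r3` has a mean quality of 0, and `r4` is shorter than the minimum length.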


2021 ◽  
Vol 17 (11) ◽  
pp. e1009581
Author(s):  
Michael S. Robeson ◽  
Devon R. O’Rourke ◽  
Benjamin D. Kaehler ◽  
Michal Ziemski ◽  
Matthew R. Dillon ◽  
...  

Nucleotide sequence and taxonomy reference databases are critical resources for widespread applications including marker-gene and metagenome sequencing for microbiome analysis, diet metabarcoding, and environmental DNA (eDNA) surveys. Reproducibly generating, managing, using, and evaluating nucleotide sequence and taxonomy reference databases creates a significant bottleneck for researchers aiming to generate custom sequence databases. Furthermore, database composition drastically influences results, and lack of standardization limits cross-study comparisons. To address these challenges, we developed RESCRIPt, a Python 3 software package and QIIME 2 plugin for reproducible generation and management of reference sequence taxonomy databases, including dedicated functions that streamline creating databases from popular sources, and functions for evaluating, comparing, and interactively exploring qualitative and quantitative characteristics across reference databases. To highlight the breadth and capabilities of RESCRIPt, we provide several examples for working with popular databases for microbiome profiling (SILVA, Greengenes, NCBI-RefSeq, GTDB), eDNA and diet metabarcoding surveys (BOLD, GenBank), as well as for genome comparison. We show that bigger is not always better, and reference databases with standardized taxonomies and those that focus on type strains have quantitative advantages, though may not be appropriate for all use cases. Most databases appear to benefit from some curation (quality filtering), though sequence clustering appears detrimental to database quality. Finally, we demonstrate the breadth and extensibility of RESCRIPt for reproducible workflows with a comparison of global hepatitis genomes. RESCRIPt provides tools to democratize the process of reference database acquisition and management, enabling researchers to reproducibly and transparently create reference materials for diverse research applications. 
RESCRIPt is released under a permissive BSD-3 license at https://github.com/bokulich-lab/RESCRIPt.


2021 ◽  
Vol 14 (9) ◽  
pp. 6249-6304
Author(s):  
Mahesh Kumar Sha ◽  
Bavo Langerock ◽  
Jean-François L. Blavier ◽  
Thomas Blumenstock ◽  
Tobias Borsdorff ◽  
...  

Abstract. The Sentinel-5 Precursor (S5P) mission with the TROPOspheric Monitoring Instrument (TROPOMI) on board has been measuring solar radiation backscattered by the Earth's atmosphere and surface since its launch on 13 October 2017. In this paper, we present for the first time the S5P operational methane (CH4) and carbon monoxide (CO) products' validation results covering a period of about 3 years using global Total Carbon Column Observing Network (TCCON) and Infrared Working Group of the Network for the Detection of Atmospheric Composition Change (NDACC-IRWG) network data, accounting for a priori alignment and smoothing uncertainties in the validation, and testing the sensitivity of validation results towards the application of advanced co-location criteria. We found that the S5P standard and bias-corrected CH4 data over land surface for the recommended quality filtering fulfil the mission requirements. The systematic difference of the bias-corrected total column-averaged dry air mole fraction of methane (XCH4) data with respect to TCCON data is -0.26±0.56 % in comparison to -0.68±0.74 % for the standard XCH4 data, with a correlation of 0.6 for most stations. The bias shows a seasonal dependence. We found that the S5P CO data over all surfaces for the recommended quality filtering generally fulfil the mission requirements, with a few exceptions, which are mostly due to co-location mismatches and limited availability of data. The systematic difference between the S5P total column-averaged dry air mole fraction of carbon monoxide (XCO) and the TCCON data is on average 9.22±3.45 % (standard TCCON XCO) and 2.45±3.38 % (unscaled TCCON XCO). We found that the systematic difference between the S5P CO column and NDACC CO column (excluding two outlier stations) is on average 6.5±3.54 %. We found a correlation of above 0.9 for most TCCON and NDACC stations. 
The study shows the high quality of S5P CH4 and CO data by validating the products against reference global TCCON and NDACC stations covering a wide range of latitudinal bands, atmospheric conditions and surface conditions.
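Figures of the form "mean ± spread in %" plus a correlation, as quoted in this abstract, can be produced from co-located satellite and reference values roughly as in the sketch below. The exact definitions used in the paper (per-station averaging, co-location criteria, smoothing corrections) are not given in the abstract, so the relative-difference formula and the function names here are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def relative_differences(satellite, reference):
    """Relative difference in percent, (satellite - reference) / reference."""
    return [100.0 * (s - r) / r for s, r in zip(satellite, reference)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def validation_summary(satellite, reference):
    """Return (mean bias %, sample standard deviation of bias %, correlation)."""
    diffs = relative_differences(satellite, reference)
    return mean(diffs), stdev(diffs), pearson(satellite, reference)


# Toy co-located pairs: the satellite reads 1 % high everywhere.
sat = [101.0, 202.0, 303.0]
ref = [100.0, 200.0, 300.0]
bias, spread, corr = validation_summary(sat, ref)
print(f"bias = {bias:.2f} +/- {spread:.2f} %, r = {corr:.2f}")
```

Quality filtering would be applied before this step, by dropping satellite soundings whose quality flag fails the recommended threshold.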


2021 ◽  
Author(s):  
Thomas E. Taylor ◽  
Christopher W. O'Dell ◽  
David Crisp ◽  
Akihiko Kuze ◽  
Hannakaisa Lindqvist ◽  
...  

Abstract. The Thermal And Near infrared Sensor for carbon Observation – Fourier Transform Spectrometer (TANSO-FTS) on the Japanese Greenhouse gases Observing SATellite (GOSAT) has been returning data since April 2009. The version 9 (v9) Atmospheric Carbon Observations from Space (ACOS) Level 2 Full Physics (L2FP) retrieval algorithm (Kiel et al., 2019) was used to derive estimates of carbon dioxide (CO2) dry air mole fraction (XCO2) from the TANSO-FTS measurements collected over its first eleven years of operation. The bias correction and quality filtering of the L2FP XCO2 product were evaluated using estimates derived from the Total Carbon Column Observing Network (TCCON) as well as values simulated from a suite of global atmospheric inverse modeling systems (models). In addition, the v9 ACOS GOSAT XCO2 results were compared with collocated XCO2 estimates derived from NASA's Orbiting Carbon Observatory-2 (OCO-2), using the version 10 (v10) ACOS L2FP algorithm. These tests indicate that the v9 ACOS GOSAT XCO2 product has improved throughput, scatter and bias, when compared to the earlier v7.3 ACOS GOSAT product, which extended through mid-2016. Of the 37 million (M) soundings collected by GOSAT through June 2020, approximately 20 % were selected for processing by the v9 L2FP algorithm after screening for clouds and other artifacts. After post-processing, 5.4 % of the soundings (2M out of 37M) were assigned a “good” XCO2 quality flag, as compared to 3.9 % in v7.3 (< 1M out of 24M). After quality filtering and bias correction, the differences in XCO2 between ACOS GOSAT v9 and both TCCON and models have a scatter (one sigma) of approximately 1 ppm for ocean-glint observations and 1 to 1.5 ppm for land observations. Similarly, global mean biases are less than approximately 0.2 ppm. Seasonal mean biases relative to the v10 OCO-2 XCO2 product are of order 0.1 ppm for observations over land. 
However, for ocean-glint observations, seasonal mean biases relative to OCO-2 range from 0.2 to 0.6 ppm, with substantial variation in time and latitude. The ACOS GOSAT v9 XCO2 data are available on the NASA Goddard Earth Science Data and Information Services Center (GES-DISC). The v9 ACOS Data User's Guide (DUG) describes best-use practices for the data. This dataset should be especially useful for studies of carbon cycle phenomena that span a full decade or more, and may serve as a useful complement to the shorter OCO-2 v10 dataset, which begins in September 2014.


2021 ◽  
Author(s):  
Thomas LaFramboise ◽  
Jakob Woerner ◽  
Yidi Huang ◽  
Stephan Hutter ◽  
Jesús Sánchez ◽  
...  

Abstract. Although recent work has characterized the microbiome in solid tumors, microbial content in hematological malignancies is not well-characterized. Here we analyzed existing deep DNA sequence data from the blood and bone marrow of 1,870 patients with myeloid malignancies, along with healthy controls, for bacterial, fungal, and viral content. After strict quality filtering, we find evidence for dysbiosis in disease cases, and distinct microbial signatures among diagnoses. In patients with low-risk myelodysplastic syndrome, we provide evidence that Epstein-Barr infection status refines risk stratification into more precise categories than the current standard. Motivated by these observations, we construct machine-learning classifiers that can discriminate among disease subtypes based solely on bacterial content. Our study highlights the potential of the circulating microbiome as a diagnostic and prognostic tool.


2021 ◽  
Vol 4 ◽  
Author(s):  
Florian Weinberger ◽  
Sophie Steinhagen ◽  
Rolf Karez ◽  
Guido Bonthond

Ulva-like green algae are notoriously difficult to distinguish due to their morphological variability and/or similarity. DNA barcoding approaches are therefore currently essential for their reliable identification. However, such approaches often fail when rare or inconspicuous species are to be detected in large mixed populations of Ulva species, for example, at early stages following the introduction of species into new habitats. We therefore developed a detection method based on next-generation DNA sequencing. The approach is suitable for the analysis of DNA traces in preserved water samples or in particles enriched by filtration from such samples. A new pair of primers was designed to amplify a 475 bp segment within the tufA marker gene. The primers were relatively group specific. 68.5% of all reads obtained after quality filtering represented the genus Ulva, 11.1% other Ulvophyceae, and only 20% other Chlorophyta, despite their relatively higher abundance in phytoplankton. The relatively short target amplicon still allows good differentiation of Ulvales and Ulothrichales at the species level. Using a database containing tufA sequences of 879 species - 281 of which were Ulvophyceae and 35 Ulva - we were able to detect mostly Ulvophyceae that had been previously detected in our study area in northern Germany using Sanger sequencing. However, the number of species detected at individual sites was generally higher than in previous studies, which could be due to drifting DNA: Analysis of samples collected at different distances from shore suggests that a sample collected at a given site may be influenced by Ulvophyceae within a radius of up to about 1 km in winter. In summer, this radius is reduced to less than 100 m, possibly due to the less frequent occurrence of strong wind events. 
Nonetheless, rare species may be detected with this new approach: At one site, an undescribed Blidingia species that was not previously known from our study area was repeatedly detected. Based on these findings, the species was searched for and found, and its identity confirmed by traditional tufA barcoding.


2021 ◽  
Vol 444 ◽  
pp. 109453
Author(s):  
Camille Van Eupen ◽  
Dirk Maes ◽  
Marc Herremans ◽  
Kristijn R.R. Swinnen ◽  
Ben Somers ◽  
...  

NeuroImage ◽  
2021 ◽  
Vol 227 ◽  
pp. 117657
Author(s):  
Nelson Gil ◽  
Michael L. Lipton ◽  
Roman Fleysher

2021 ◽  
Author(s):  
Jean-Romain Delaloye ◽  
David Vernez ◽  
Guillaume Suarez ◽  
Damien de Courten ◽  
Walter Zingg ◽  
...  

2021 ◽  
Vol 11 ◽  
Author(s):  
Zhenwu Luo ◽  
Alexander V. Alekseyenko ◽  
Elizabeth Ogunrinde ◽  
Min Li ◽  
Quan-Zhen Li ◽  
...  

The blood microbiome is important for investigating microbial-host interactions and their effects on systemic immune perturbations. However, this effort has met with major challenges due to low microbial biomass and background artifacts. In the current study, microbial 16S DNA sequencing was applied to analyze the plasma microbiome. We have developed a quality-filtering strategy to evaluate and exclude low levels of microbial sequences, potential contaminations, and artifacts from plasma microbial 16S DNA sequencing analyses. Furthermore, we have applied our technique in three cohorts, including tobacco smokers, HIV-infected individuals, and individuals with systemic lupus erythematosus (SLE), as well as corresponding controls. More than 97% of total sequence data was removed by the stringent quality-filtering strategy; the removed amplicon sequence variants (ASVs) were low levels of microbial sequences, contaminations, and artifacts. Specifically enriched pathobiont bacterial ASVs were identified in plasma from tobacco smokers, HIV-infected individuals, and individuals with SLE but not from control subjects. The associations between these ASVs and disease pathogenesis were demonstrated. The pathologic activities of some identified bacteria were further verified in vitro. We present a quality-filtering strategy to identify the pathogenesis-associated plasma microbiome. Our approach provides a method for studying the diagnosis of subclinical microbial infection as well as for understanding the roles of microbiome-host interaction in disease pathogenesis.
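The abstract does not state the filtering criteria, so the following Python sketch only illustrates the general idea of excluding low-level ASVs and likely contaminants from a count table. The thresholds, the use of negative-control counts as the contamination signal, and the names `filter_asvs` and `fraction_removed` are assumptions, not the authors' method.

```python
def filter_asvs(sample_counts, control_counts, min_reads=10, max_control_reads=0):
    """Keep ASVs with enough reads in the sample that are not seen above a
    tolerated level in negative controls (both thresholds are hypothetical)."""
    kept = {}
    for asv, count in sample_counts.items():
        if count < min_reads:
            continue  # low-level sequence, indistinguishable from noise
        if control_counts.get(asv, 0) > max_control_reads:
            continue  # also present in negative controls: likely contaminant
        kept[asv] = count
    return kept


def fraction_removed(before, after):
    """Fraction of total reads discarded by the filter."""
    total = sum(before.values())
    return 1.0 - sum(after.values()) / total if total else 0.0


# Toy plasma sample dominated by a reagent contaminant (asv3).
sample = {"asv1": 500, "asv2": 3, "asv3": 9000, "asv4": 40}
negative_control = {"asv3": 120}

kept = filter_asvs(sample, negative_control)
print(sorted(kept), f"{fraction_removed(sample, kept):.1%} of reads removed")
```

In low-biomass samples a filter of this kind can discard most of the reads, which is consistent in spirit with the large removed fraction reported above.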

