A comprehensive evaluation of single-end sequencing data analyses for environmental microbiome research

Author(s):  
Meganathan P. Ramakodi
2021 ◽  

Abstract Illumina sequencing platforms are widely used for amplicon-based environmental microbiome research. Analyses of amplicon data from environmental samples generated on the Illumina MiSeq platform show that the reverse (R2) reads in paired-end (PE) datasets have low quality towards the 3’ end, which reduces the sequencing depth of samples and ultimately the sample size, and may therefore alter the outcome. This study evaluates the usefulness of single-end (SE) sequencing data in microbiome research when an Illumina MiSeq PE dataset contains a significantly high number of low-quality reverse reads. Amplicon data (V1V3, V3V4, V4V5 and V6V8) from 128 environmental (soil) samples, downloaded from SRA, demonstrate the efficiency of SE sequencing data analyses in microbiome research. The SE datasets inferred a core microbiome structure comparable to that of the PE datasets. Notably, the forward (R1) datasets inferred a higher number of taxa than the PE datasets for most of the amplicon regions, except V3V4. Thus, analyzing SE sequencing data, especially R1 reads, in environmental microbiome studies could mitigate the sample-size problems caused by low-quality reverse reads. However, care must be taken when interpreting the microbiome structure, as a few taxa observed in the PE datasets were absent from the SE datasets. In conclusion, this study demonstrates that there are choices for analyzing amplicon data without needing to remove samples with low-quality reverse reads.


1986 ◽  
Vol 14 (4) ◽  
pp. 201-218 ◽  
Author(s):  
A. G. Veith

Abstract This four-part series of papers addresses the problem of systematic determination of the influence of several tire factors on tire treadwear. Both the main effect of each factor and some of their interactive effects are included. The program was also structured to evaluate the influence of some external-to-tire conditions on the relationship of tire factors to treadwear. Part I describes the experimental design used to evaluate the effects on treadwear of generic tire type, aspect ratio, tread pattern (groove or void level), type of pattern (straight rib or block), and tread compound. Construction procedures and precautions used to obtain a valid and functional test method are included. Two guiding principles to be used in the data analyses of Parts II and III are discussed. These are the fractional groove and void concept, to characterize tread pattern geometry, and a demonstration of the equivalence of wear rate for identical compounds on whole tread or multi-section tread tires.


2021 ◽  
Author(s):  
Gongjun Wang ◽  
Libin Sun ◽  
Shasha Wang ◽  
Jing Guo ◽  
Hui Li ◽  
...  

Abstract Background: Ferroptosis is a form of cell death involved in diverse physiological contexts. Increasing evidence suggests that there is a close regulatory relationship between ferroptosis and long noncoding RNAs (lncRNAs). Method: RNA-sequencing data from The Cancer Genome Atlas (TCGA) data resource and ferroptosis-related genes from the FerrDb (http://www.zhounan.org/ferrdb/) data resource were used to select differentially expressed lncRNAs. We performed univariate and multivariate Cox regression analyses on these differentially expressed lncRNAs to screen for independent predictive factors. Subsequently, we established two signatures for predicting overall survival (OS) and progression-free survival (PFS). Finally, experiments were conducted to verify the roles of LASTR in gastric cancer (GC). Results: We identified 12 differentially expressed lncRNAs linked with OS and 13 associated with PFS. Kaplan-Meier (K-M) analyses showed that the high-risk group was associated with a poor prognosis in stomach adenocarcinoma (STAD). The AUCs of the OS and PFS lncRNA signatures were 0.734 and 0.771, respectively, indicating their excellent efficacy in predicting STAD prognosis. Our experimental results showed that inhibition of LASTR suppressed tumor proliferation and migration in GC. Conclusion: This comprehensive evaluation of the ferroptosis-related lncRNA landscape in STAD unearthed novel lncRNAs related to carcinogenesis. In addition, we experimentally confirmed the effects of LASTR on proliferation, migration and ferroptosis. These results provide potential novel targets for tumor treatment and promote personalized medicine.


2018 ◽  
Author(s):  
Will P. M. Rowe ◽  
Anna Paola Carrieri ◽  
Cristina Alcon-Giner ◽  
Shabhonam Caim ◽  
Alex Shaw ◽  
...  

Abstract Motivation: The growth in publicly available microbiome data in recent years has yielded an invaluable resource for genomic research, allowing for the design of new studies, augmentation of novel datasets and reanalysis of published works. This vast amount of microbiome data, together with the widespread proliferation of microbiome research and the looming era of clinical metagenomics, means there is an urgent need to develop analytics that can process huge amounts of data in a short amount of time. To address this need, we propose a new method for the compact representation of microbiome sequencing data using similarity-preserving sketches of streaming k-mer spectra. These sketches allow for dissimilarity estimation, rapid microbiome catalogue searching, and classification of microbiome samples in near real-time. Results: We apply streaming histogram sketching to microbiome samples as a form of dimensionality reduction, creating a compressed ‘histosketch’ that can be used to efficiently represent microbiome k-mer spectra. Using public microbiome datasets, we show that histosketches can be clustered by sample type using pairwise Jaccard similarity estimation, consequently allowing for rapid microbiome similarity searches via a locality sensitive hashing indexing scheme. Furthermore, we show that histosketches can be used to train machine learning classifiers to accurately label microbiome samples. Specifically, using a collection of 108 novel microbiome samples from a cohort of premature neonates, we trained and tested a Random Forest Classifier that could accurately predict whether the neonate had received antibiotic treatment (95% accuracy, precision 97%) and could subsequently be used to classify microbiome data streams in less than 12 seconds. We provide our implementation, Histosketching Using Little K-mers (HULK), which can histosketch a typical 2GB microbiome in 50 seconds on a standard laptop using 4 cores, with the sketch occupying 3000 bytes of disk space. Availability: Our implementation (HULK) is written in Go and is available at: https://github.com/will-rowe/hulk (MIT License)
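As a rough illustration of the quantities the abstract above refers to, the following minimal sketch computes an exact k-mer spectrum for a sequence and the weighted Jaccard similarity between two spectra. It is not HULK's algorithm or API: histosketching approximates this kind of similarity from a small fixed-size sketch of a stream, whereas this example computes it exactly in memory. The function names `kmer_spectrum` and `weighted_jaccard` and the toy sequences are assumptions made for illustration only.

```python
from collections import Counter


def kmer_spectrum(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (hypothetical helper, not part of HULK)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def weighted_jaccard(a: Counter, b: Counter) -> float:
    """Weighted Jaccard similarity: sum of per-k-mer minima over sum of maxima."""
    keys = set(a) | set(b)
    numerator = sum(min(a[x], b[x]) for x in keys)    # missing k-mers count as 0
    denominator = sum(max(a[x], b[x]) for x in keys)
    # Two empty spectra are treated as identical by convention (an assumption).
    return numerator / denominator if denominator else 1.0


if __name__ == "__main__":
    # Toy sequences chosen only to exercise the functions.
    s1 = kmer_spectrum("ACGTACGT", 4)
    s2 = kmer_spectrum("ACGTTTTT", 4)
    print(weighted_jaccard(s1, s1))  # identical spectra
    print(weighted_jaccard(s1, s2))  # partially overlapping spectra
```

The exact computation grows with the number of distinct k-mers, which is the cost a fixed-size sketch is meant to avoid.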


2020 ◽  
Author(s):  
Stevenn Volant ◽  
Pierre Lechat ◽  
Perrine Woringer ◽  
Laurence Motreff ◽  
Christophe Malabat ◽  
...  

Abstract Background: Comparing the composition of microbial communities among groups of interest (e.g., patients vs healthy individuals) is a central aspect of microbiome research. It typically involves sequencing, data processing, statistical analysis and graphical representation of the detected signatures. Such an analysis normally relies on a set of different applications that require specific expertise for installation and data processing and, in some cases, programming skills. Results: Here, we present SHAMAN, an interactive web application we developed to facilitate the use of (i) a bioinformatic workflow for metataxonomic analysis, (ii) reliable statistical modelling and (iii) one of the largest panels of interactive visualizations among the options currently available. SHAMAN is specifically designed for non-expert users who may benefit from using an integrated version of the different analytic steps underlying a proper metagenomic analysis. The application is freely accessible at http://shaman.pasteur.fr/, and may also work as a standalone application with a Docker container (aghozlane/shaman), conda and R. The source code is written in R and is available at https://github.com/aghozlane/shaman. Using two datasets (a mock community sequencing and published 16S rRNA metagenomic data), we illustrate the strengths of SHAMAN in quickly performing a complete metataxonomic analysis. Conclusions: With SHAMAN, we aim to provide the scientific community with a platform that simplifies reproducible quantitative analysis of metagenomic data.


2019 ◽  
Author(s):  
Emmi Jokinen ◽  
Jani Huuhtanen ◽  
Satu Mustjoki ◽  
Markus Heinonen ◽  
Harri Lähdesmäki

T cell receptors (TCRs) can recognize various pathogens and consequently start immune responses. TCRs can be sequenced from individuals, and methods analyzing the specificity of the TCRs can help us better understand individuals’ immune status in different diseases. We have developed TCRGP, a novel Gaussian process method to predict if TCRs recognize certain epitopes. This method can utilize CDR sequences from TCRα and TCRβ chains and learn which CDRs are important in recognizing different epitopes. We have experimented with epitope-specific data against 29 epitopes and performed a comprehensive evaluation with existing prediction methods. On this data, TCRGP outperforms other state-of-the-art methods in epitope-specificity predictions. We also propose a novel analysis approach for combined single-cell RNA and TCRαβ (scRNA+TCRαβ) sequencing data by quantifying epitope-specific TCRs with TCRGP in phenotypes identified from scRNA-seq data. With this approach, we find HBV-epitope specific T cells and their transcriptomic states in hepatocellular carcinoma patients.


2017 ◽  
Author(s):  
Ben Langmead

Abstract Read alignment is the first step in most sequencing data analyses. Because a read’s point of origin can be ambiguous, aligners report a mapping quality: the probability the reported alignment is incorrect. Despite its importance, there is no established and general method for calculating mapping quality. We describe a framework for predicting mapping qualities that works by simulating a set of tandem reads, similar to the input reads in important ways, but for which the true point of origin is known. We implement this in an accurate and low-overhead tool called Qtip, which is compatible with popular aligners.
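The abstract above describes mapping quality as the probability that the reported alignment is incorrect. In SAM-format output this probability is conventionally stored on a Phred scale, MAPQ = -10 log10(p), rounded to the nearest integer. The sketch below shows only that conversion. It is not Qtip's prediction method, and the function names and the cap of 60 are assumptions for illustration (aligners differ in the maximum value they report).

```python
import math


def mapq_from_error_prob(p_wrong: float, cap: int = 60) -> int:
    """Convert P(alignment is wrong) to a Phred-scaled mapping quality.

    The cap of 60 is an illustrative assumption, not a universal rule.
    """
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability in [0, 1]")
    if p_wrong == 0.0:
        return cap  # log10(0) is undefined, so report the cap instead
    return min(cap, round(-10 * math.log10(p_wrong)))


def error_prob_from_mapq(mapq: int) -> float:
    """Invert the Phred scale: the probability implied by a mapping quality."""
    return 10 ** (-mapq / 10)


if __name__ == "__main__":
    for p in (0.5, 0.1, 0.001):
        print(p, mapq_from_error_prob(p))
    print(error_prob_from_mapq(20))
```

On this scale each 10-point increase in mapping quality corresponds to a tenfold drop in the estimated chance that the alignment is wrong.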


2021 ◽  
Vol 17 (3) ◽  
pp. e1008814
Author(s):  
Emmi Jokinen ◽  
Jani Huuhtanen ◽  
Satu Mustjoki ◽  
Markus Heinonen ◽  
Harri Lähdesmäki

The adaptive immune system uses T cell receptors (TCRs) to recognize pathogens and consequently to initiate immune responses. TCRs can be sequenced from individuals, and methods analyzing the specificity of the TCRs can help us better understand individuals’ immune status in different disorders. For this task, we have developed TCRGP, a novel Gaussian process method that predicts if TCRs recognize specified epitopes. TCRGP can utilize the amino acid sequences of the complementarity determining regions (CDRs) from TCRα and TCRβ chains and learn which CDRs are important in recognizing different epitopes. Our comprehensive evaluation with epitope-specific TCR sequencing data shows that TCRGP achieves, on average, higher prediction accuracy in terms of AUROC score than existing state-of-the-art methods in epitope-specificity predictions. We also propose a novel analysis approach for combined single-cell RNA and TCRαβ (scRNA+TCRαβ) sequencing data by quantifying epitope-specific TCRs with TCRGP, and we identify HBV-epitope specific T cells and their transcriptomic states in hepatocellular carcinoma patients.

