downstream analysis
Recently Published Documents

TOTAL DOCUMENTS: 542 (FIVE YEARS: 389)
H-INDEX: 20 (FIVE YEARS: 9)

2022 ◽  Vol 8 ◽  Author(s): Na Wang, Shuai Yuan, Cheng Fang, Xiao Hu, Yu-Sen Zhang, ...

Extracellular vesicles (EVs) are natural nanoparticles secreted by cells in the body and released into the extracellular environment. They are associated with various physiological and pathological processes and are considered carriers of intercellular information, so EVs can serve as an important liquid-biopsy marker for disease diagnosis and prognosis. EVs are widely present in various body fluids; among these, urine can be obtained in large amounts through non-invasive methods and has a small dynamic range of protein concentrations, making it well suited for studying EVs. However, most current isolation and detection of EVs still relies on traditional methods, which suffer from low purity, long processing times, and poor efficiency; more efficient and highly selective techniques are therefore urgently needed. Recently, inspired by the nanoscale size of EVs, platforms based on nanomaterials have been explored for the isolation and detection of EVs from body fluids. These newly developed nanotechnologies, with higher selectivity and sensitivity, greatly improve the precision with which target EVs are isolated from urine. This review focuses on the nanomaterials used for isolation and detection of urinary EVs, compares the advantages and disadvantages of traditional methods and nanomaterial-based platforms, and presents urinary EV-derived biomarkers for prostate cancer (PCa) diagnosis. We aim to provide a reference for researchers planning studies of nanomaterial-based platforms for identifying urinary EVs, and we summarize in detail the biomarkers obtained from downstream analysis of urinary EVs for the auxiliary diagnosis of PCa.


2022 ◽  Vol 23 (1) ◽  Author(s): Andrea Hita, Gilles Brocart, Ana Fernandez, Marc Rehmsmeier, Anna Alemany, ...

Abstract Background Total-RNA sequencing (total-RNA-seq) allows the simultaneous study of both the coding and the non-coding transcriptome. Yet, computational pipelines have traditionally focused on particular biotypes, making assumptions that are not fulfilled by total-RNA-seq datasets. Transcripts from distinct RNA biotypes vary in length, biogenesis, and function, can overlap in a genomic region, and may be present in the genome at high copy number. Consequently, reads from total-RNA-seq libraries may produce ambiguous genomic alignments, demanding flexible quantification approaches. Results Here we present Multi-Graph count (MGcount), a total-RNA-seq quantification tool combining two strategies for handling ambiguous alignments. First, MGcount assigns reads hierarchically to small-RNA and long-RNA features to account for length disparity when transcripts overlap in the same genomic position. Next, MGcount uses a graph-based approach to aggregate RNA products with similar sequences to which reads systematically multi-map. MGcount outputs a transcriptomic count matrix compatible with RNA-sequencing downstream analysis pipelines, at both bulk and single-cell resolution, together with the graphs that model repeated transcript structures for different biotypes. The software can be used as a Python module or as a single-file executable program. Conclusions MGcount is a flexible total-RNA-seq quantification tool that successfully integrates reads that align to multiple genomic locations or that overlap with multiple gene features. Its approach is suitable for the simultaneous estimation of protein-coding, long non-coding and small non-coding transcript concentrations, in both precursor and processed forms. Both source code and compiled software are available at https://github.com/hitaandrea/MGcount.
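
To make the graph-based idea concrete, here is a minimal sketch, assuming toy read-to-feature assignments and a networkx graph, of how features that systematically share multi-mapping reads can be collapsed into a single count unit. It illustrates the general strategy only and is not MGcount's implementation; the feature names and the 50% sharing threshold are invented for the example.

```python
# Conceptual sketch of graph-based aggregation of multi-mapping features.
# NOT MGcount's code; names and thresholds are illustrative assumptions.
from collections import Counter, defaultdict
import networkx as nx

# Toy input: for each read, the set of features its alignments hit.
read_assignments = {
    "read1": {"rRNA-copy1", "rRNA-copy2"},
    "read2": {"rRNA-copy1", "rRNA-copy2"},
    "read3": {"rRNA-copy2"},
    "read4": {"geneA"},
}

# Count how often each pair of features co-occurs in a read's alignments.
pair_counts = Counter()
feature_counts = Counter()
for features in read_assignments.values():
    for f in features:
        feature_counts[f] += 1
    feats = sorted(features)
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            pair_counts[(feats[i], feats[j])] += 1

# Connect features whose multi-mapping overlap is systematic
# (here: shared by >= 50% of the reads hitting either feature).
graph = nx.Graph()
graph.add_nodes_from(feature_counts)
for (a, b), shared in pair_counts.items():
    if shared / min(feature_counts[a], feature_counts[b]) >= 0.5:
        graph.add_edge(a, b)

# Each connected component becomes one aggregated count unit.
community_counts = defaultdict(int)
for component in nx.connected_components(graph):
    name = "+".join(sorted(component))
    for read_features in read_assignments.values():
        if read_features & component:
            community_counts[name] += 1

print(dict(community_counts))
```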


2022 ◽  Vol 23 (1) ◽  Author(s): Xuemin Dong, Shanshan Dong, Shengkai Pan, Xiangjiang Zhan

Abstract Background Understanding the transcriptome has become an essential step towards a full interpretation of the biological function of a cell, a tissue or even an organ. Many tools are available for processing and analysing transcriptome data or for visualizing analysis results. However, most existing tools are limited to data from a single sequencing platform, and only a few of them can handle more than one analysis module, which falls far short of users' requirements, especially for those without advanced programming skills. Hence, we still lack an open-source toolkit that enables both bioinformaticians and non-bioinformaticians to process and analyze large transcriptome datasets from different sequencing platforms and to visualize the results. Results We present a Linux-based toolkit, RNA-combine, that automatically performs quality assessment, downstream analysis and result visualization for transcriptome data generated on different sequencing platforms, including bulk RNA-seq (Illumina), single-cell RNA-seq (10x Genomics) and Iso-Seq (PacBio). In addition, this toolkit implements at least 10 more analysis modules than the other toolkits examined in this study. The source code of RNA-combine is available on GitHub: https://github.com/dongxuemin666/RNA-combine. Conclusion Our results suggest that RNA-combine is a reliable tool for transcriptome data processing and result interpretation for both bioinformaticians and non-bioinformaticians.


Author(s): Jiaci Chen, Peilong Li, Taiyi Zhang, Zhipeng Xu, Xiaowen Huang, ...

Exosomes, a nano-sized subtype of extracellular vesicles secreted by almost all living cells, are capable of transferring cell-specific constituents from the source cell to a recipient cell. Cumulative evidence has revealed that exosomes play an irreplaceable role in prognostic, diagnostic, and even therapeutic applications. A method that can efficiently provide intact and pure exosome samples is the first step toward both exosome-based liquid biopsies and therapeutics. Unfortunately, common exosome separation techniques suffer from complex operation, long processing times, large required sample volumes and low purity, posing significant challenges for exosomal downstream analysis. Efficient, simple, and affordable methods to isolate exosomes are therefore crucial to carrying out relevant research. In the last decade, emerging technologies, especially microfluidic chips, have offered superior strategies for exosome isolation and exhibited impressive performance. While many excellent reviews have surveyed various methods, a comprehensive review covering updated and improved methods for exosome isolation is still needed. Herein, we first give an overview of exosomal properties, biogenesis, contents, and functions. Then, we briefly outline the conventional technologies and discuss the challenges these technologies face in clinical applications. Finally, we review emerging exosome isolation strategies and large-scale GMP production of engineered exosomes to open up future perspectives on next-generation Exo-devices for cancer diagnosis and treatment.


2022 ◽  Vol 12 ◽  Author(s): Bingdong Liu, Liujing Huang, Zhihong Liu, Xiaohan Pan, Zongbing Cui, ...

Advances in next-generation sequencing (NGS) have revolutionized microbial studies in many fields, especially clinical investigation. Often described as a second human genome, the microbiota has been recognized as a new perspective for understanding the biological and pathological basis of various diseases. However, massive amounts of sequencing data remain a huge challenge to researchers, especially those unfamiliar with microbial data analysis. Mathematical algorithms and approaches introduced from other scientific fields bring a bewildering array of computational tools and demand considerable scripting experience. Moreover, large cohort studies with extensive metadata on subjects, including age, body mass index (BMI), gender and medical results, further aggravate this situation. Thus, efficient and convenient software for clinical microbiome data analysis is needed. The EasyMicroPlot (EMP) package aims to provide an easy-to-use, R-based microbial analysis tool that accomplishes the core tasks of metagenomic downstream analysis, incorporating the analysis and visualization approaches popular in clinical microbial studies. To illustrate how EMP works, 694 bio-samples from the Guangdong Gut Microbiome Project (GGMP) were selected and analyzed with the EMP package. Our analysis demonstrated the influence of dietary style on the gut microbiota and showed that the EMP package addresses the needs of this field in a powerful and convenient way.
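
For readers unfamiliar with what a core metagenomic downstream-analysis task looks like, the sketch below computes relative abundances and Shannon alpha diversity from a toy counts table. It is a generic illustration in Python, not the EMP R interface, and the sample and taxon names are made up for the example.

```python
# Generic example of a core metagenomic downstream-analysis step
# (relative abundance + Shannon alpha diversity); not the EMP R API.
import numpy as np
import pandas as pd

# Toy abundance table: rows = samples, columns = taxa (read counts).
counts = pd.DataFrame(
    {
        "Bacteroides": [120, 30, 0],
        "Prevotella": [10, 200, 5],
        "Faecalibacterium": [70, 20, 95],
    },
    index=["sample1", "sample2", "sample3"],
)

# Relative abundance per sample (each row sums to 1).
rel_abund = counts.div(counts.sum(axis=1), axis=0)

# Shannon alpha diversity: H = -sum(p * ln p) over non-zero taxa.
def shannon(p):
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

alpha = rel_abund.apply(shannon, axis=1)
print(alpha)
```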


2021 ◽  Author(s): Wei Liu, Xu Liao, Xiang Zhou, Xingjie Shi, Jin Liu

Dimension reduction and (spatial) clustering are two key steps in the analysis of both single-cell RNA-sequencing (scRNA-seq) and spatial transcriptomics data collected from different platforms. Most existing methods perform dimension reduction and (spatial) clustering sequentially, treating them as two consecutive stages in a tandem analysis. However, the low-dimensional embeddings estimated in the dimension reduction step may not be relevant to the class labels inferred in the clustering step and may thus impair the performance of clustering and other downstream analyses. Here, we develop a computational method, DR-SC, that performs dimension reduction and (spatial) clustering jointly in a unified framework. Joint analysis in DR-SC ensures accurate (spatial) clustering results and effective extraction of biologically informative low-dimensional features. Importantly, DR-SC is applicable not only to cell-type clustering in scRNA-seq studies but also to spatial clustering in spatial transcriptomics, which characterizes the spatial organization of a tissue by segregating it into multiple tissue structures. For spatial transcriptomics analysis, DR-SC relies on a latent hidden Markov random field model to encourage spatial smoothness of the detected spatial cluster boundaries. We also develop an efficient expectation-maximization algorithm based on iterative conditional modes. DR-SC is not only scalable to large sample sizes but is also capable of optimizing the spatial smoothness parameter in a data-driven manner. Comprehensive simulations show that DR-SC outperforms existing clustering methods such as Seurat and spatial clustering methods such as BayesSpace and SpaGCN, and extracts more biologically relevant features than conventional dimension reduction methods such as PCA and scVI. Using 16 benchmark scRNA-seq datasets, we demonstrate that the low-dimensional embeddings and class labels estimated by DR-SC lead to improved trajectory inference. In addition, analyzing three published scRNA-seq and spatial transcriptomics datasets from three platforms, we show that DR-SC improves both spatial and non-spatial clustering performance, yields a low-dimensional representation with improved visualization, and facilitates downstream analyses such as trajectory inference.
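
As a rough illustration of why joint analysis can outperform tandem analysis, the toy sketch below alternates between a label-aware projection and Gaussian-mixture clustering on simulated data. It conveys only the spirit of iterating dimension reduction and clustering; it is not the DR-SC model, which is a probabilistic factor model with a hidden Markov random field prior fitted by EM, and the simulated data and component counts are assumptions for the example.

```python
# Toy alternation of dimension reduction and clustering (NOT DR-SC).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Simulated expression matrix: 300 cells x 50 genes, three crude groups.
X = np.vstack([rng.normal(loc=m, size=(100, 50)) for m in (0.0, 1.0, 2.0)])

# Tandem baseline: PCA embedding, then GMM clustering on that embedding.
Z = PCA(n_components=5).fit_transform(X)
labels = GaussianMixture(n_components=3, random_state=0).fit_predict(Z)

# Joint-style refinement: refit a label-aware projection, then re-cluster.
for _ in range(5):
    n_comp = max(1, len(np.unique(labels)) - 1)
    lda = LinearDiscriminantAnalysis(n_components=n_comp).fit(X, labels)
    Z = lda.transform(X)
    labels = GaussianMixture(n_components=3, random_state=0).fit_predict(Z)

print(np.bincount(labels))  # cluster sizes after the joint-style iterations
```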


2021 ◽  Author(s): Kristiina Ausmees, Federico Sanchez-Quinto, Mattias Jakobsson, Carl Nettelblad

With the ability to sequence ancient DNA to high coverage often limited by sample quality or cost, imputation of missing genotypes offers a way to increase both the power of inference and the cost-effectiveness of analyses of ancient data. However, the high degree of uncertainty often associated with ancient DNA poses several methodological challenges, and the performance of imputation methods in this context has not been fully explored. To gain further insight, we performed a systematic evaluation of imputation of ancient data using Beagle 4.0 and reference data from phase 3 of the 1000 Genomes project, investigating the effects of coverage, phased reference panel and study sample size. Making use of five ancient samples for which high-coverage data are available, we evaluated imputed data with respect to accuracy, reference bias and genetic affinities as captured by PCA. We obtained genotype concordance levels of over 99% for data with 1x coverage, and similar levels of accuracy and reference bias at coverages as low as 0.75x. Our findings suggest that using imputed data can be a realistic option for various population genetic analyses even for data in coverage ranges below 1x. We also show that a large and varied phased reference set as well as the inclusion of low- to moderate-coverage ancient samples can increase imputation performance, particularly for rare alleles. In-depth analysis of imputed data with respect to genetic variants and allele frequencies gave further insight into the nature of errors arising during imputation and can provide practical guidelines for post-processing and validation prior to downstream analysis.
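
As a minimal sketch of the kind of accuracy check described above, the snippet below computes genotype concordance between imputed calls and high-coverage "truth" calls at shared sites. The variant IDs and genotype encoding are assumptions for illustration, not the authors' evaluation pipeline.

```python
# Compare imputed genotypes against high-coverage truth calls at shared
# sites; site IDs and "0/1"-style genotype strings are illustrative.
truth   = {"rs1": "0/1", "rs2": "1/1", "rs3": "0/0", "rs4": "0/1"}
imputed = {"rs1": "0/1", "rs2": "0/1", "rs3": "0/0", "rs4": "0/1"}

shared = set(truth) & set(imputed)
matches = sum(truth[v] == imputed[v] for v in shared)
concordance = matches / len(shared)
print(f"genotype concordance: {concordance:.2%} over {len(shared)} sites")
```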


Author(s): Ming Cao, Qinke Peng, Ze-Gang Wei, Fei Liu, Yi-Fan Hou

The development of high-throughput technologies has produced increasing amounts of sequence data and an increasing need for efficient clustering algorithms that can process massive volumes of sequencing data for downstream analysis. Heuristic clustering methods are widely applied to sequence clustering because of their low computational complexity. Although numerous heuristic clustering methods have been developed, they suffer from two limitations: overestimation of the number of inferred clusters and low clustering sensitivity. To address these issues, we present a new sequence clustering method (edClust) based on Edlib, a C/C++ library for fast, exact semi-global sequence alignment, to group similar sequences. edClust was tested on three large-scale sequence databases, and we compared it to several classic heuristic clustering methods, such as UCLUST, CD-HIT, and VSEARCH. Evaluations based on the metrics of cluster number and seed sensitivity (SS) demonstrate that edClust produces fewer clusters than the other methods and that its SS is higher. The source code of edClust is available from https://github.com/zhang134/EdClust.git under the GNU GPL license.
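
The sketch below shows a greedy, centroid-style clustering loop in the spirit of such heuristic methods, using the edlib Python bindings for semi-global alignment. The identity threshold, length-first ordering and toy sequences are assumptions for illustration and do not reproduce edClust's exact algorithm.

```python
# Greedy centroid-based clustering sketch (in the spirit of heuristic
# tools such as edClust/UCLUST); thresholds and data are illustrative.
import edlib

sequences = {
    "seq1": "ACGTACGTACGTACGT",
    "seq2": "ACGTACGTACGTACGA",   # one mismatch vs. seq1
    "seq3": "TTTTGGGGCCCCAAAA",   # unrelated sequence
}

IDENTITY = 0.90          # minimum identity to join an existing cluster
clusters = {}            # seed id -> list of member ids

# Process longest sequences first so seeds tend to be full-length.
for name, seq in sorted(sequences.items(), key=lambda kv: -len(kv[1])):
    assigned = False
    for seed, members in clusters.items():
        seed_seq = sequences[seed]
        # Semi-global alignment ("HW" mode): the query may match anywhere
        # inside the seed without end-gap penalties.
        result = edlib.align(seq, seed_seq, mode="HW", task="distance")
        identity = 1.0 - result["editDistance"] / len(seq)
        if identity >= IDENTITY:
            members.append(name)
            assigned = True
            break
    if not assigned:
        clusters[name] = [name]   # start a new cluster with this seed

print(clusters)
```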


2021 ◽  Author(s): Matthew Hartley, Gerard Kleywegt, Ardan Patwardhan, Ugis Sarkans, Jason R Swedlow, ...

Despite the importance of data resources in genomics and structural biology, until now there has been no central archive of biological image data covering all imaging modalities. The BioImage Archive is a new data resource at the European Bioinformatics Institute (EMBL-EBI) designed to fill this gap. It accepts publication-associated bioimaging data in any format, from any imaging modality and at any scale, as well as reference datasets. The BioImage Archive will improve the reproducibility of published studies that derive results from image data. In addition, providing reference datasets to the scientific community reduces duplication of effort and allows downstream analysis to focus on a consistent set of data. The BioImage Archive will also help generate new insights through reuse of existing data to answer new biological questions, or by providing training, testing and benchmarking data for the development of image analysis tools. The Archive is available at https://www.ebi.ac.uk/bioimage-archive/.


2021 ◽  Author(s): Yilei Huang, Harald Ringbauer

Human ancient DNA (aDNA) studies have surged in recent years, revolutionizing the study of the human past. Typically, aDNA is poorly preserved, making such data prone to contamination from other human DNA. It is therefore important to rule out substantial contamination before proceeding to downstream analysis. As most aDNA samples can only be sequenced to low coverage (<1x average depth), computational methods that can robustly estimate contamination in the low-coverage regime are needed. However, the ultra-low-coverage regime (0.1x and below) remains challenging for existing approaches. We present a new method to estimate contamination in aDNA from male individuals. It uses a Li and Stephens haplotype-copying model for the haploid X chromosome, with mismatches modelled as genotyping error or contamination. We assessed an implementation of this new approach, hapCon, on simulated and down-sampled empirical aDNA data. Our results demonstrate that hapCon outperforms a commonly used tool for estimating male X-chromosome contamination (ANGSD), with substantially lower variance and narrower confidence intervals, especially in the low-coverage regime. We found that hapCon provides useful contamination estimates at coverages as low as 0.1x for SNP capture data (1240k) and 0.02x for whole-genome sequencing (WGS) data, substantially extending the coverage limit of previous male X-chromosome-based contamination estimation methods.
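
To convey the underlying intuition, here is a deliberately simplified toy model, not hapCon's haplotype-copying likelihood: on a male X chromosome, mismatches beyond the error rate scale with the contamination fraction, so a grid search over a Bernoulli likelihood can recover a simulated contamination level. The error rate, site count and mismatch model are assumptions for the example.

```python
# Toy grid-search likelihood for male-X contamination (illustrative only;
# the real method uses a Li and Stephens haplotype-copying model).
import numpy as np

rng = np.random.default_rng(1)
ERROR = 0.01          # assumed per-site error rate
TRUE_CONTAM = 0.08    # simulated contamination fraction

# Simulate 2000 informative X-chromosome sites: p_alt is the population
# frequency of the allele that differs from the endogenous haplotype.
p_alt = rng.uniform(0.05, 0.5, size=2000)
mismatch_prob = (1 - TRUE_CONTAM) * ERROR + TRUE_CONTAM * p_alt
observed_mismatch = rng.random(2000) < mismatch_prob

def log_likelihood(c):
    # Mismatch arises from error on endogenous reads or from a
    # contaminant read carrying the alternative allele.
    m = (1 - c) * ERROR + c * p_alt
    return np.sum(np.where(observed_mismatch, np.log(m), np.log(1 - m)))

grid = np.linspace(0.0, 0.5, 501)
estimate = grid[np.argmax([log_likelihood(c) for c in grid])]
print(f"estimated contamination: {estimate:.3f} (simulated: {TRUE_CONTAM})")
```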

