Effective identification of sequence patterns via a new convolutional model with adaptively learned kernels

2018 ◽  
Author(s):  
Jing-Yi Li ◽  
Shen Jin ◽  
Xin-Ming Tu ◽  
Yang Ding ◽  
Ge Gao

Abstract: Motif identification is among the most classical and essential computational tasks in bioinformatics and genomics. Here we propose a novel convolution-based model, Variable CNN (vCNN), for effective motif identification in high-throughput omics data based on dynamic learning of kernel length. Multiple empirical evaluations demonstrate vCNN’s superior performance in both motif identification and hyperparameter robustness. All source code and data are freely available on GitHub (https://github.com/gao-lab/vCNN) for academic usage.


2017 ◽  
Vol 2017 ◽  
pp. 1-10 ◽  
Author(s):  
Kalpana Raja ◽  
Matthew Patrick ◽  
Yilin Gao ◽  
Desmond Madu ◽  
Yuyang Yang ◽  
...  

In the past decade, the volume of “omics” data generated by different high-throughput technologies has expanded exponentially. Managing, storing, and analyzing this big data has been a great challenge for researchers, especially when moving towards the goal of generating testable data-driven hypotheses, which has been the promise of high-throughput experimental techniques. Different bioinformatics approaches have been developed to streamline downstream analyses by providing independent information to interpret the data and support biological inference. Text mining (also known as literature mining) is one of the commonly used approaches for automated generation of biological knowledge from the huge number of published articles. In this review paper, we discuss recent advances in approaches that integrate results from omics data with information generated by text mining to uncover novel biomedical information.



Genes ◽  
2020 ◽  
Vol 11 (3) ◽  
pp. 245 ◽  
Author(s):  
Gary Hardiman

A major technological shift in the research community in the past decade has been the adoption of high throughput (HT) technologies to interrogate the genome, epigenome, transcriptome, and proteome in a massively parallel fashion [...]



2019 ◽  
Author(s):  
Elizaveta V. Starikova ◽  
Polina O. Tikhonova ◽  
Nikita A. Prianichnikov ◽  
Chris M. Rands ◽  
Evgeny M. Zdobnov ◽  
...  

Summary: Phigaro is a standalone command-line application that detects prophage regions, taking raw genome and metagenome assemblies as input. It also produces dynamic annotated “prophage genome maps” and marks possible transposon insertion spots inside prophages. It provides putative taxonomic annotations that can distinguish tailed from non-tailed phages. It is applicable for mining prophage regions from large metagenomic datasets. Availability: Source code for Phigaro is freely available for download at https://github.com/bobeobibo/phigaro along with test data. The code is written in Python.



2021 ◽  
Author(s):  
Tommi Valikangas ◽  
Tomi Suomi ◽  
Courtney E Chandler ◽  
Alison J Scott ◽  
Bao Q Tran ◽  
...  

Quantitative proteomics has matured into an established tool, and longitudinal proteomic experiments have begun to emerge. However, no effective, simple-to-use differential expression method for longitudinal proteomics data has been released. Typically, such data are noisy, contain missing values, and have only a few time points and biological replicates. To address this need, we provide a comprehensive evaluation of several existing differential expression methods for high-throughput longitudinal omics data and introduce a new method, Robust longitudinal Differential Expression (RolDE). The methods were evaluated using nearly 2000 semi-simulated spike-in proteomic datasets and a large experimental dataset. The RolDE method performed best overall; it was most tolerant to missing values, displayed good reproducibility, and was the top method in ranking the results in a biologically meaningful way. Furthermore, unlike many approaches, the open-source RolDE does not require prior knowledge of the types of differences being searched for and can easily be applied even by inexperienced users.



2021 ◽  
Author(s):  
Anjun Ma ◽  
Xiaoying Wang ◽  
Cankun Wang ◽  
Jingxian Li ◽  
Tong Xiao ◽  
...  

We present DeepMAPS, a deep learning platform for cell-type-specific biological gene network inference from single-cell multi-omics (scMulti-omics). DeepMAPS includes both cells and genes in a heterogeneous graph to infer cell-cell, cell-gene, and gene-gene relations simultaneously. The graph attention neural network considers a cell and a gene with both local and global information, making DeepMAPS more robust to noise in the data. We benchmarked DeepMAPS on 18 datasets for cell clustering and network inference, and the results showed that our method outperforms various existing tools. We further applied DeepMAPS to a case study of lung tumor leukocyte CITE-seq data, where we observed superior performance in cell clustering and predicted biologically meaningful cell-cell communication pathways based on the inferred gene networks. To improve the feasibility and ensure the reproducibility of analyzing scMulti-omics data, we deployed a multi-function webserver with various visualizations. Overall, we regard DeepMAPS as a novel platform built on a state-of-the-art deep learning model for single-cell studies, one that can promote the use of scMulti-omics data in the community.





2021 ◽  
Vol 10 ◽  
Author(s):  
Yansheng Xu ◽  
Xin Ma ◽  
Xing Ai ◽  
Jiangping Gao ◽  
Yiming Liang ◽  
...  

Background: Conventional clinical detection methods such as CT, urine cytology, and ureteroscopy display low sensitivity and/or are invasive in the diagnosis of upper tract urinary carcinoma (UTUC), which limits their use. Previous studies on urine biopsy have not shown satisfactory sensitivity and specificity with either gene mutation or gene methylation panels. There is therefore an urgent need for a sensitive and non-invasive method for the diagnosis of UTUC. Methods: In this study, a total of 161 hematuria patients were enrolled with (n = 69) or without (n = 92) UTUC. High-throughput sequencing of 17 genes and methylation analysis for ONECUT2 CpG sites were combined as a liquid biopsy test panel. Further, a logistic regression prediction model that contained several significant features was used to evaluate the risk of UTUC in these patients. Results: In total, 86 UTUC− and 64 UTUC+ case samples were included in the analysis. A logistic regression analysis of significant features including age, the mutation status of the TERT promoter, and ONECUT2 methylation level resulted in an optimal model with a sensitivity of 94.0%, a specificity of 93.1%, a positive predictive value of 92.2%, and a negative predictive value of 94.7%. Notably, the area under the curve (AUC) was 0.957 in the training dataset, while internal validation produced an AUC of 0.962. It is worth noting that during follow-up, a patient initially diagnosed with ureteral inflammation, who had both positive mutation and methylation test results, was diagnosed with ureteral carcinoma 17 months after enrollment. Conclusion: This work used the epigenetic biomarker ONECUT2 for the first time in the detection of UTUC and found that it performed well. To improve sensitivity, we combined the biomarker with a high-throughput sequencing test of 17 genes. The selected logistic regression model can evaluate the risk of upper tract urinary carcinoma in patients with hematuria and outperforms other existing panels in providing clinical recommendations for the diagnosis of UTUC. Moreover, its high negative predictive value helps rule out UTUC in patients without the disease.
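
The sensitivity, specificity, positive predictive value, and negative predictive value quoted in this abstract are all ratios taken from a 2×2 confusion matrix. A minimal sketch in Python of how such figures are derived follows; the class name, function name, and the counts are hypothetical illustrations and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a 2x2 diagnostic table (hypothetical container)."""
    tp: int  # diseased patients the test flags as positive
    fp: int  # healthy patients the test flags as positive
    tn: int  # healthy patients the test flags as negative
    fn: int  # diseased patients the test flags as negative


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the four standard diagnostic ratios as fractions in [0, 1]."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "ppv": c.tp / (c.tp + c.fp),          # positive predictive value
        "npv": c.tn / (c.tn + c.fn),          # negative predictive value
    }


# Illustrative counts only; NOT the counts from the UTUC cohort.
example = ConfusionCounts(tp=45, fp=5, tn=95, fn=5)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity condition on the true disease status, while the two predictive values condition on the test result, so the predictive values also depend on how common the disease is in the cohort being tested.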



2021 ◽  
Author(s):  
Cansu H Demirel ◽  
Kaan M Arici ◽  
Nurcan Tuncbag

In line with the advances in high-throughput technologies, multiple omic datasets have accumulated to study biological systems and diseases coherently. No single omics data type is capable of fully representing...



2016 ◽  
Vol 12 (10) ◽  
pp. 2953-2964 ◽  
Author(s):  
Jonathan L. Robinson ◽  
Jens Nielsen

Biomolecular networks, such as genome-scale metabolic models and protein–protein interaction networks, facilitate the extraction of new information from high-throughput omics data.


