FIDDLE: An integrative deep learning framework for functional genomic data inference

2016 ◽  
Author(s):  
Umet Eser ◽  
L. Stirling Churchman

Abstract: Numerous advances in sequencing technologies have revolutionized genomics by generating many types of functional genomic data. Statistical tools have been developed to analyze individual data types, but strategies to integrate disparate datasets under a unified framework are lacking. Moreover, most analysis techniques rely heavily on feature selection and data preprocessing, which increase the difficulty of addressing biological questions through the integration of multiple datasets. Here, we introduce FIDDLE (Flexible Integration of Data with Deep LEarning), an open-source, data-agnostic, flexible integrative framework that learns a unified representation from multiple data types to infer another data type. As a case study, we use multiple Saccharomyces cerevisiae genomic datasets to predict global transcription start sites (TSS) through the simulation of TSS-seq data. We demonstrate that one type of data can be inferred from other data types without manually specifying the relevant features and preprocessing. We show that models built from multiple genome-wide datasets perform profoundly better than models built from individual datasets. Thus FIDDLE learns the complex synergistic relationships within individual datasets and, importantly, across datasets.

2019 ◽  
Author(s):  
Kevin Menden ◽  
Mohamed Marouf ◽  
Sergio Oller ◽  
Anupriya Dalmia ◽  
Karin Kloiber ◽  
...  

Abstract: We present Scaden, a deep neural network for cell deconvolution that uses gene expression information to infer the cellular composition of tissues. Scaden is trained on single-cell RNA-seq data to engineer discriminative features that confer robustness to bias and noise, making complex data preprocessing and feature selection unnecessary. We demonstrate that Scaden outperforms existing deconvolution algorithms in both precision and robustness. A single trained network reliably deconvolves bulk RNA-seq and microarray, human and mouse tissue expression data and leverages the combined information of multiple datasets. Due to this stability and flexibility, we surmise that deep learning will become an algorithmic mainstay for cell deconvolution of various data types. Scaden’s comprehensive software package is easy to use on novel as well as diverse existing expression datasets available in public resources, deepening the molecular and cellular understanding of developmental and disease processes.
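To make the deconvolution problem concrete, the sketch below shows the generic linear mixing assumption that is commonly used to describe bulk expression: a bulk profile is treated as a fraction-weighted sum of per-cell-type profiles, and deconvolution is the inverse task of recovering the fractions. This is not Scaden's implementation; the function name `simulate_bulk`, the cell-type labels, and the toy numbers are all hypothetical.

```python
def simulate_bulk(profiles, fractions):
    """Mix per-cell-type expression profiles into one bulk profile.

    profiles:  dict mapping cell type -> list of per-gene expression values
    fractions: dict mapping cell type -> fraction of that type (must sum to 1)

    Assumption (not from the abstract): bulk expression is a linear,
    fraction-weighted sum of the cell-type profiles.
    """
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("cell-type fractions must sum to 1")
    n_genes = len(next(iter(profiles.values())))
    bulk = [0.0] * n_genes
    for cell_type, fraction in fractions.items():
        for gene_index, value in enumerate(profiles[cell_type]):
            bulk[gene_index] += fraction * value
    return bulk


# Hypothetical toy data: two cell types, three genes.
toy_profiles = {"T cell": [10.0, 0.0, 2.0], "B cell": [0.0, 8.0, 2.0]}
toy_fractions = {"T cell": 0.25, "B cell": 0.75}
toy_bulk = simulate_bulk(toy_profiles, toy_fractions)
print(toy_bulk)
```

A deconvolution method receives only something like `toy_bulk` and has to estimate `toy_fractions`; simulated mixtures of this kind are one way to obtain labeled training examples.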


2021 ◽  
Vol 12 ◽  
Author(s):  
Xin Zeng ◽  
Sung-Joon Park ◽  
Kenta Nakai

Promoters and enhancers are well-known regulatory elements modulating gene expression. As confirmed by high-throughput sequencing technologies, these regulatory elements are bidirectionally transcribed. That is, promoters produce stable mRNA in the sense direction and unstable RNA in the antisense direction, while enhancers transcribe unstable RNA in both directions. Although it is thought that enhancers and promoters share a similar architecture of transcription start sites (TSSs), how the transcriptional machinery distinctly uses these genomic regions as promoters or enhancers remains unclear. To address this issue, we developed a deep learning (DL) method by utilizing a convolutional neural network (CNN) and the saliency algorithm. In comparison with other classifiers, our CNN presented higher predictive performance, suggesting the overarching importance of the high-order sequence features, captured by the CNN. Moreover, our method revealed that there are substantial sequence differences between the enhancers and promoters. Remarkably, the 20–120 bp downstream regions from the center of bidirectional TSSs seemed to contribute to the RNA stability. These regions in promoters tend to have a larger number of guanines and cytosines compared to those in enhancers, and this feature contributed to the classification of the regulatory elements. Our CNN-based method can capture the complex TSS architectures. We found that the genomic regions around TSSs for promoters and enhancers contribute to RNA stability and show GC-biased characteristics as a critical determinant for promoter TSSs.
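The GC bias described above can be checked on any sequence with a few lines of code. The sketch below computes the GC fraction of the window 20 to 120 bp downstream of a TSS center. It assumes a plus-strand sequence and a 0-based index for the center; the function names and the toy sequence are hypothetical and are not taken from the authors' pipeline.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G/C bases among the unambiguous A/C/G/T bases in seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0  # no unambiguous bases (e.g. all N): report 0 by convention
    return (seq.count("G") + seq.count("C")) / acgt


def downstream_window(seq: str, center: int, start: int = 20, end: int = 120) -> str:
    """Bases from center+start up to (not including) center+end, clipped to seq.

    Assumes seq is given 5'->3' on the strand being transcribed.
    """
    lo = max(0, center + start)
    hi = min(len(seq), center + end)
    return seq[lo:hi]


# Hypothetical toy sequence: AT-rich up to index 69, GC-rich from 70 to 169.
toy = "AT" * 25 + "A" * 20 + "GC" * 50 + "AT" * 10
window = downstream_window(toy, center=50)
print(len(window), gc_fraction(window))
```

Comparing this value between promoter and enhancer TSSs would be a simple descriptive check; it does not reproduce the CNN or the saliency analysis.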


2020 ◽  
Vol 71 (7) ◽  
pp. 868-880
Author(s):  
Nguyen Hong-Quan ◽  
Nguyen Thuy-Binh ◽  
Tran Duc-Long ◽  
Le Thi-Lan

With the rapid development of camera networks, video analysis systems have become increasingly popular and have been applied in various practical applications. In this paper, we focus on the person re-identification (person ReID) task, which is a crucial step of video analysis systems. The purpose of person ReID is to associate multiple images of a given person moving through a non-overlapping camera network. Many efforts have been devoted to person ReID. However, most studies on person ReID deal only with well-aligned bounding boxes that are detected manually and treated as perfect inputs for person ReID. In fact, when building a fully automated person ReID system, the quality of the two preceding steps, person detection and tracking, may strongly affect person ReID performance. The contribution of this paper is twofold. First, a unified framework for person ReID based on deep learning models is proposed. This framework couples a deep neural network for person detection with a deep-learning-based tracking method. In addition, features extracted from an improved ResNet architecture are proposed for person representation to achieve higher ReID accuracy. Second, our self-built dataset is introduced and used to evaluate all three steps of the fully automated person ReID framework.


2020 ◽  
Vol 15 ◽  
Author(s):  
Deeksha Saxena ◽  
Mohammed Haris Siddiqui ◽  
Rajnish Kumar

Background: Deep learning (DL) is an artificial neural network-driven framework with multiple levels of representation, in which non-linear modules are combined so that the representation is raised from a lower level to a much more abstract level. Though DL is used widely in almost every field, it has brought a particular breakthrough in the biological sciences, where it is used in disease diagnosis and clinical trials. DL can be combined with machine learning, but at times the two are used individually as well. DL seems to be a better platform than machine learning, as it does not require an intermediate feature extraction step and works well with larger datasets. DL is one of the most discussed fields among scientists and researchers these days for diagnosing and solving various biological problems. However, deep learning models need some improvement and experimental validation to be more productive. Objective: To review the available DL models and datasets that are used in disease diagnosis. Methods: Available DL models and their applications in disease diagnosis were reviewed, discussed, and tabulated. Types of datasets and some of the popular disease-related data sources for DL were highlighted. Results: We have analyzed the frequently used DL methods and data types, and discussed some of the recent deep learning models used for solving different biological problems. Conclusion: The review presents useful insights about DL methods, data types, and the selection of DL models for disease diagnosis.


2021 ◽  
Vol 12 (3) ◽  
pp. 46-47
Author(s):  
Nikita Saxena

Space-borne satellite radiometers measure Sea Surface Temperature (SST), which is pivotal to studies of air-sea interactions and ocean features. Under clear-sky conditions, high-resolution measurements are obtainable. Under cloudy conditions, however, data analysis is constrained to the available low-resolution measurements. We assess the efficiency of Deep Learning (DL) architectures, particularly Convolutional Neural Networks (CNNs), for downscaling oceanographic data from low spatial resolution (SR) to high SR. With a focus on SST fields of the Bay of Bengal, this study proves that a Very Deep Super Resolution CNN can successfully reconstruct SST observations from 15 km SR to 5 km SR, and from 5 km SR to 1 km SR. This outcome calls attention to the significance of DL models explicitly trained for the reconstruction of high-SR SST fields from low-SR data. Inference on DL models can act as a substitute for the existing computationally expensive downscaling technique: Dynamical Downsampling. The complete code is available on this GitHub Repository.


2021 ◽  
Vol 11 (6) ◽  
pp. 2742
Author(s):  
Fatih Ünal ◽  
Abdulaziz Almalaq ◽  
Sami Ekici

Short-term load forecasting models play a critical role in helping distribution companies make effective decisions in their planning and scheduling for production and load balancing. Unlike aggregated load forecasting at the distribution level or at substations, forecasting the load profiles of many end-users at the customer level, made possible by smart meters, is a complicated problem due to the high variability and uncertainty of load consumption as well as customer privacy issues. In terms of customers’ short-term load forecasting, these models include a high level of nonlinearity between input data and output predictions, demanding more robustness, higher prediction accuracy, and generalizability. In this paper, we develop an advanced preprocessing technique coupled with a hybrid sequential learning-based energy forecasting model that employs a convolutional neural network (CNN) and bidirectional long short-term memory (BLSTM) within a unified framework for accurate energy consumption prediction. The energy consumption outliers and feature clustering are extracted at the advanced preprocessing stage. The novel hybrid deep learning approach based on data feature coding and decoding is implemented in the prediction stage. The proposed approach is tested and validated using real-world datasets from Turkey, and the results outperformed the traditional prediction models compared in this paper.


Author(s):  
Lijing Wang ◽  
Aniruddha Adiga ◽  
Srinivasan Venkatramanan ◽  
Jiangzhuo Chen ◽  
Bryan Lewis ◽  
...  

2021 ◽  
Vol 108 (Supplement_3) ◽  
Author(s):  
L F Sánchez Peralta ◽  
J F Ortega Morán ◽  
Cr L Saratxaga ◽  
J B Pagador ◽  
A Picón ◽  
...  

Abstract INTRODUCTION Deep learning techniques have significantly contributed to the field of medical imaging analysis. In the case of colorectal cancer, they have shown great utility for increasing the adenoma detection rate at colonoscopy, but a common validation methodology is still missing. In this study, we present preliminary efforts towards the definition of a validation framework. MATERIAL AND METHODS Different models based on different backbones and encoder-decoder architectures have been trained with a publicly available dataset that contains white light and NBI colonoscopy videos, with 76 different lesions from colonoscopy procedures in 48 human patients. A computer-aided detection (CADe) demonstrator has been implemented to show the performance of the models. RESULTS This CADe demonstrator shows the areas detected as polyps by overlaying the predicted mask on the endoscopic image. It allows selecting the video to be used from among those in the test set. Although it only presents basic features such as play, pause and moving to the next video, it easily loads the model and allows for visualization of results. The demonstrator is accompanied by a set of metrics to be used depending on the intended task: polyp detection, localization and segmentation. CONCLUSIONS The use of this CADe demonstrator, together with a publicly available dataset and predefined metrics, will allow for an easier and fairer comparison of methods. Further work is still required to validate the proposed framework.
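The abstract does not name the metrics that accompany the demonstrator. As an illustration only, the sketch below computes two overlap measures that are widely used for segmentation masks, intersection over union (IoU) and the Dice coefficient, on binary masks. Choosing these two metrics is an assumption, and the function name and toy masks are hypothetical.

```python
def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two equally sized binary masks.

    pred, truth: 2D lists of 0/1 values (predicted and reference masks).
    If both masks are empty, both scores are reported as 1.0 by convention.
    """
    intersection = union = pred_area = truth_area = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += p & t  # pixel set in both masks
            union += p | t         # pixel set in at least one mask
            pred_area += p
            truth_area += t
    iou = intersection / union if union else 1.0
    total = pred_area + truth_area
    dice = 2 * intersection / total if total else 1.0
    return iou, dice


# Hypothetical 2x3 masks: two pixels overlap, four pixels are in the union.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
iou, dice = iou_and_dice(predicted, reference)
print(iou, dice)
```

Fixing such metric definitions in advance is one way a shared validation framework can make results from different models comparable.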


Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 5039
Author(s):  
Tae-Hyun Kim ◽  
Hye-Rin Kim ◽  
Yeong-Jun Cho

In this study, we present a framework for product quality inspection based on deep learning techniques. First, we categorize several deep learning models that can be applied to product inspection systems. In addition, we explain the steps for building a deep-learning-based inspection system in detail. Second, we address connection schemes that efficiently link deep learning models to product inspection systems. Finally, we propose an effective method that can maintain and enhance a product inspection system according to the improvement goals of existing product inspection systems. The proposed system is observed to offer good maintainability and stability owing to the proposed methods. All the proposed methods are integrated into a unified framework, and we provide detailed explanations of each proposed method. To verify the effectiveness of the proposed system, we compare and analyze the performance of the methods in various test scenarios. We expect that our study will provide useful guidelines to readers who wish to implement deep-learning-based systems for product inspection.

