Fully Interpretable Deep Learning Model of Transcriptional Control

2019 ◽  
Author(s):  
Yi Liu ◽  
Kenneth Barr ◽  
John Reinitz

Abstract. The universal expressibility assumption of Deep Neural Networks (DNNs) is the key motivation behind recent work in the systems biology community to employ DNNs to solve important problems in functional genomics and molecular genetics. Because of the black box nature of DNNs, such assumptions, while useful in practice, are unsatisfactory for scientific analysis. In this paper, we give an example of a DNN in which every layer is interpretable. Moreover, this DNN is biologically validated and predictive. We derive our DNN from a systems biology model that was not previously recognized as having a DNN structure. This DNN is concerned with a key unsolved biological problem, which is to understand the DNA regulatory code that controls how genes in multicellular organisms are turned on and off. Although we apply our DNN to data from the early embryo of the fruit fly Drosophila, this system serves as a testbed for analysis of much larger data sets obtained by systems biology studies on a genomic scale.

2020 ◽  
Vol 36 (Supplement_1) ◽  
pp. i499-i507 ◽  
Author(s):  
Yi Liu ◽  
Kenneth Barr ◽  
John Reinitz

Abstract. Motivation: The universal expressibility assumption of Deep Neural Networks (DNNs) is the key motivation behind recent works in the systems biology community to employ DNNs to solve important problems in functional genomics and molecular genetics. Typically, such investigations have taken a ‘black box’ approach in which the internal structure of the model used is set purely by machine learning considerations with little consideration of representing the internal structure of the biological system by the mathematical structure of the DNN. DNNs have not yet been applied to the detailed modeling of transcriptional control in which mRNA production is controlled by the binding of specific transcription factors to DNA, in part because such models are formulated in terms of specific chemical equations that appear different in form from those used in neural networks. Results: In this paper, we give an example of a DNN which can model the detailed control of transcription in a precise and predictive manner. Its internal structure is fully interpretable and is faithful to the underlying chemistry of transcription factor binding to DNA. We derive our DNN from a systems biology model that was not previously recognized as having a DNN structure. Although we apply our DNN to data from the early embryo of the fruit fly Drosophila, this system serves as a test bed for analysis of much larger data sets obtained by systems biology studies on a genomic scale. Availability and implementation: The implementation and data for the models used in this paper are in a zip file in the supplementary material. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
pp. 002203452110120
Author(s):  
C. Gluck ◽  
S. Min ◽  
A. Oyelakin ◽  
M. Che ◽  
E. Horeth ◽  
...  

The parotid, submandibular, and sublingual glands represent a trio of oral secretory glands whose primary function is to produce saliva, facilitate digestion of food, provide protection against microbes, and maintain oral health. While recent studies have begun to shed light on the global gene expression patterns and profiles of salivary glands, particularly those of mice, relatively little is known about the location and identity of transcriptional control elements. Here we have established the epigenomic landscape of the mouse submandibular salivary gland (SMG) by performing chromatin immunoprecipitation sequencing experiments for 4 key histone marks. Our analysis of the comprehensive SMG data sets and comparisons with those from other adult organs have identified critical enhancers and super-enhancers of the mouse SMG. By further integrating these findings with complementary RNA-sequencing based gene expression data, we have unearthed a number of molecular regulators such as members of the Fox family of transcription factors that are enriched and likely to be functionally relevant for SMG biology. Overall, our studies provide a powerful atlas of cis-regulatory elements that can be leveraged for better understanding the transcriptional control mechanisms of the mouse SMG, discovery of novel genetic switches, and modulating tissue-specific gene expression in a targeted fashion.


2010 ◽  
Vol 28 (16) ◽  
pp. 2777-2783 ◽  
Author(s):  
Ana Maria Gonzalez-Angulo ◽  
Bryan T.J. Hennessy ◽  
Gordon B. Mills

The development of cost-effective technologies able to comprehensively assess DNA, RNA, protein, and metabolites in patient tumors has fueled efforts to tailor medical care. Indeed validated molecular tests assessing tumor tissue or patient germline DNA already drive therapeutic decision making. However, many theoretical and regulatory challenges must still be overcome before fully realizing the promise of personalized molecular medicine. The masses of data generated by high-throughput technologies are challenging to manage, visualize, and convert to the knowledge required to improve patient outcomes. Systems biology integrates engineering, physics, and mathematical approaches with biologic and medical insights in an iterative process to visualize the interconnected events within a cell that determine how inputs from the environment and the network rewiring that occurs due to the genomic aberrations acquired by patient tumors determines cellular behavior and patient outcomes. A cross-disciplinary systems biology effort will be necessary to convert the information contained in multidimensional data sets into useful biomarkers that can classify patient tumors by prognosis and response to therapeutic modalities and to identify the drivers of tumor behavior that are optimal targets for therapy. An understanding of the effects of targeted therapeutics on signaling networks and homeostatic regulatory loops will be necessary to prevent inadvertent effects as well as to develop rational combinatorial therapies. Systems biology approaches identifying molecular drivers and biomarkers will lead to the implementation of smaller, shorter, cheaper, and individualized clinical trials that will increase the success rate and hasten the implementation of effective therapies into the clinical armamentarium.


Cancers ◽  
2021 ◽  
Vol 14 (1) ◽  
pp. 12
Author(s):  
Jose M. Castillo T. ◽  
Muhammad Arif ◽  
Martijn P. A. Starmans ◽  
Wiro J. Niessen ◽  
Chris H. Bangma ◽  
...  

The computer-aided analysis of prostate multiparametric MRI (mpMRI) could improve significant-prostate-cancer (PCa) detection. Various deep-learning- and radiomics-based methods for significant-PCa segmentation or classification have been reported in the literature. To be able to assess the generalizability of the performance of these methods, using various external data sets is crucial. While both deep-learning and radiomics approaches have been compared based on the same data set of one center, the comparison of the performances of both approaches on various data sets from different centers and different scanners is lacking. The goal of this study was to compare the performance of a deep-learning model with the performance of a radiomics model for the significant-PCa diagnosis of the cohorts of various patients. We included the data from two consecutive patient cohorts from our own center (n = 371 patients), and two external sets of which one was a publicly available patient cohort (n = 195 patients) and the other contained data from patients from two hospitals (n = 79 patients). Using multiparametric MRI (mpMRI), the radiologist tumor delineations and pathology reports were collected for all patients. During training, one of our patient cohorts (n = 271 patients) was used for both the deep-learning- and radiomics-model development, and the three remaining cohorts (n = 374 patients) were kept as unseen test sets. The performances of the models were assessed in terms of their area under the receiver-operating-characteristic curve (AUC). Whereas the internal cross-validation showed a higher AUC for the deep-learning approach, the radiomics model obtained AUCs of 0.88, 0.91 and 0.65 on the independent test sets compared to AUCs of 0.70, 0.73 and 0.44 for the deep-learning model. 
Our radiomics model that was based on delineated regions resulted in a more accurate tool for significant-PCa classification in the three unseen test sets when compared to a fully automated deep-learning model.
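
The AUC figures quoted above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following is a minimal sketch of that rank-based computation, not the evaluation code used in the study; the function name `roc_auc` and the toy labels and scores are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which
    the positive case is scored higher; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration, not from the study).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking, so a value such as 0.44 on an external test set means the model ranked cases slightly worse than chance on that cohort.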


Metabolites ◽  
2019 ◽  
Vol 9 (4) ◽  
pp. 76 ◽  
Author(s):  
Farhana R. Pinu ◽  
David J. Beale ◽  
Amy M. Paten ◽  
Konstantinos Kouremenos ◽  
Sanjay Swarup ◽  
...  

The use of multiple omics techniques (i.e., genomics, transcriptomics, proteomics, and metabolomics) is becoming increasingly popular in all facets of life science. Omics techniques provide a more holistic molecular perspective of studied biological systems compared to traditional approaches. However, due to their inherent data differences, integrating multiple omics platforms remains an ongoing challenge for many researchers. As metabolites represent the downstream products of multiple interactions between genes, transcripts, and proteins, metabolomics, and the tools and approaches routinely used in this field, could assist with the integration of these complex multi-omics data sets. The question is, how? Here we provide some answers (in terms of methods, software tools and databases) along with a variety of recommendations and a list of continuing challenges as identified during a peer session on multi-omics integration that was held at the recent ‘Australian and New Zealand Metabolomics Conference’ (ANZMET 2018) in Auckland, New Zealand (Sept. 2018). We envisage that this document will serve as a guide to metabolomics researchers and other members of the community wishing to perform multi-omics studies. We also believe that these ideas may allow the full promise of integrated multi-omics research and, ultimately, of systems biology to be realized.


2007 ◽  
Vol 7 (6) ◽  
pp. 16227-16251
Author(s):  
G. Wetzel ◽  
T. Sugita ◽  
H. Nakajima ◽  
T. Tanaka ◽  
T. Yokota ◽  
...  

Abstract. The Improved Limb Atmospheric Spectrometer (ILAS)-II sensor aboard the Japanese ADEOS-II satellite was launched into its sun-synchronous orbit on 14 December 2002 and performed solar occultation measurements of trace species, aerosols, temperature, and pressure in the polar stratosphere until 25 October 2003. Vertical trace gas profiles obtained with the balloon version of the Michelson Interferometer for Passive Atmospheric Sounding (MIPAS-B) provide one of the sparse data sets for validating ILAS-II version 2 and 1.4 data. The MIPAS-B limb emission spectra were collected on 20 March 2003 over Kiruna (Sweden, 68° N) at virtually the same location that has been sounded by ILAS-II about 5.5 h prior to the sampling of MIPAS-B. The intercomparison of the new ILAS-II version 2 (Northern Hemispheric sunrise) data to MIPAS-B vertical trace gas profiles shows a good to excellent agreement within the combined error limits for the species O3, N2O, CH4, H2O (above 21 km), HNO3, ClONO2, and CFC-11 (CCl3F) in the compared altitude range between 16 and 31 km such that these data appear to be very useful for scientific analysis. With regard to the previous version 1.4 ILAS-II data, significant improvements in the consistency with MIPAS-B are obvious especially for the species CH4 and H2O, but also for O3, HNO3, ClONO2, NO2, and N2O5. However, comparing gases like NO2, N2O5, and CFC-12 (CCl2F2) exhibits only poor agreement with MIPAS-B such that these species cannot be assumed to be validated at the present time.


2021 ◽  
Vol 7 ◽  
pp. e551
Author(s):  
Nihad Karim Chowdhury ◽  
Muhammad Ashad Kabir ◽  
Md. Muhtadir Rahman ◽  
Noortaz Rezoana

The goal of this research is to develop and implement a highly effective deep learning model for detecting COVID-19. To achieve this goal, in this paper, we propose an ensemble of Convolutional Neural Network (CNN) based on EfficientNet, named ECOVNet, to detect COVID-19 from chest X-rays. To make the proposed model more robust, we have used one of the largest open-access chest X-ray data sets named COVIDx containing three classes—COVID-19, normal, and pneumonia. For feature extraction, we have applied an effective CNN structure, namely EfficientNet, with ImageNet pre-training weights. The generated features are transferred into custom fine-tuned top layers followed by a set of model snapshots. The predictions of the model snapshots (which are created during a single training) are consolidated through two ensemble strategies, i.e., hard ensemble and soft ensemble, to enhance classification performance. In addition, a visualization technique is incorporated to highlight areas that distinguish classes, thereby enhancing the understanding of primal components related to COVID-19. The results of our empirical evaluations show that the proposed ECOVNet model outperforms the state-of-the-art approaches and significantly improves detection performance with 100% recall for COVID-19 and overall accuracy of 96.07%. We believe that ECOVNet can enhance the detection of COVID-19 disease, and thus, underpin a fully automated and efficacious COVID-19 detection system.
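
The abstract names two ensemble strategies without defining them. In common usage, a hard ensemble takes a majority vote over each model's predicted class, while a soft ensemble averages the predicted class probabilities before choosing a class. The sketch below assumes those common definitions; it is not the ECOVNet implementation, and the function names, the tie-breaking rule, and the toy probabilities are all illustrative assumptions.

```python
from collections import Counter

# Class order is an assumption made for this example.
CLASSES = ["COVID-19", "normal", "pneumonia"]


def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def hard_ensemble(snapshot_probs):
    """Majority vote over each snapshot's top class.
    Ties are broken by the lowest class index (an arbitrary choice)."""
    votes = Counter(argmax(p) for p in snapshot_probs)
    top = max(votes.values())
    return min(c for c, n in votes.items() if n == top)


def soft_ensemble(snapshot_probs):
    """Average the per-class probabilities, then take the top class."""
    n_models = len(snapshot_probs)
    n_classes = len(snapshot_probs[0])
    mean = [sum(p[k] for p in snapshot_probs) / n_models for k in range(n_classes)]
    return argmax(mean)


# Toy probabilities for one image from three snapshots (made up).
probs = [
    [0.40, 0.35, 0.25],
    [0.40, 0.35, 0.25],
    [0.05, 0.90, 0.05],
]
hard_label = CLASSES[hard_ensemble(probs)]
soft_label = CLASSES[soft_ensemble(probs)]
```

In this toy case the two strategies disagree: two weakly confident snapshots outvote one strongly confident snapshot under hard voting, while averaging lets the confident snapshot dominate.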


Author(s):  
Qusay Abdullah Abed ◽  
Osamah Mohammed Fadhil ◽  
Wathiq Laftah Al-Yaseen

In general, multidimensional data (from a mobile application, for example) contain a large amount of unnecessary information. Web app users find it difficult to get the information they need quickly and effectively because of the sheer volume of data (big data produced per second). In this paper, we study data mining in web personalization using a blended deep learning model, since one effective solution to this problem is web personalization. We also explore how this model helps to analyze and estimate huge numbers of operations. Providing personalized recommendations to improve reliability depends on the web application using the useful information it holds. The results of this research are important for the training and testing of large data sets for a blended deep learning model based on a back-propagation neural network. The HADOOP framework was used to perform a number of experiments in different environments with a learning rate between -1 and +1. In addition, a number of techniques are used to evaluate a number of parameters, including the true positive cases, in order to evaluate the proposed model.


2021 ◽  
Author(s):  
Armando Reimer ◽  
Simon Alamos ◽  
Clay Westrum ◽  
Meghan A. Turner ◽  
Paul Talledo ◽  
...  

How enhancers interpret morphogen gradients to generate spatial patterns of gene expression is a central question in developmental biology. Although recent studies have begun to elucidate that enhancers can dictate whether, when, and at what rate a promoter will engage in transcription, the complexity of endogenous enhancers calls for theoretical models with too many free parameters to quantitatively dissect these regulatory strategies. To overcome this limitation, we established a minimal synthetic enhancer system in embryos of the fruit fly Drosophila melanogaster. Here, a gradient of the Dorsal activator is read by a single Dorsal binding site. By quantifying transcriptional activity using live imaging, our experiments revealed that this single Dorsal binding site is capable of regulating whether promoters engage in transcription in a Dorsal concentration-specific manner. By modulating binding-site affinity, we determined that a gene's decision to engage in transcription and its transcriptional onset time can be explained by a simple theoretical model where the promoter has to traverse multiple kinetic barriers before transcription can ensue. The experimental platform developed here pushes the boundaries of live-imaging in studying gene regulation in the early embryo by enabling the quantification of the transcriptional activity driven by a single transcription factor binding site, and making it possible to build more complex enhancers from the ground up in the context of a dialogue between theory and experiment.
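
One simple way to picture a promoter that must cross several kinetic barriers before transcription begins is as a chain of irreversible steps. The sketch below is not the authors' model: it assumes `n_steps` identical, irreversible steps with a shared rate (taken here as a stand-in for the effect of Dorsal concentration and binding-site affinity) and a finite time window in which onset can occur. Under those assumptions the onset time follows an Erlang distribution, which gives both the fraction of promoters that switch on in time and the mean onset time. All names and numbers are hypothetical.

```python
import math


def fraction_active(rate, n_steps, window):
    """Probability that all n_steps irreversible steps, each with the
    given rate, are completed within the time window (Erlang CDF)."""
    x = rate * window
    not_done = sum(math.exp(-x) * x**i / math.factorial(i) for i in range(n_steps))
    return 1.0 - not_done


def mean_onset_time(rate, n_steps):
    """Mean of the Erlang-distributed onset time, ignoring the window."""
    return n_steps / rate


# Hypothetical rates standing in for low and high activator input.
low = fraction_active(rate=0.5, n_steps=3, window=5.0)
high = fraction_active(rate=2.0, n_steps=3, window=5.0)
```

With these assumptions, a higher rate both raises the fraction of promoters that begin transcribing within the window and shortens the mean onset time, which is the qualitative behavior the abstract describes.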

