Identifying enhancer–promoter interactions with neural network based on pre-trained DNA vectors and attention mechanism

Author(s):  
Zengyan Hong ◽  
Xiangxiang Zeng ◽  
Leyi Wei ◽  
Xiangrong Liu

Abstract Motivation: Identification of enhancer–promoter interactions (EPIs) is of great significance to human development. However, experimental methods for identifying EPIs are costly in time, labor and money, so research efforts increasingly focus on developing computational methods to solve this problem. Unfortunately, most existing computational methods require a variety of genomic data, which are not always available, especially for a new cell line, and this limits their large-scale practical application. As an alternative, computational methods that use sequences only have great prospects for genome-scale application. Results: In this article, we propose a new deep learning method, EPIVAN, that predicts long-range EPIs using only genomic sequences. To explore the key sequence characteristics, we first use pre-trained DNA vectors to encode enhancers and promoters; we then use one-dimensional convolution and a gated recurrent unit to extract local and global features; lastly, an attention mechanism is used to boost the contribution of key features, further improving the performance of EPIVAN. Benchmarking comparisons on six cell lines show that EPIVAN performs better than state-of-the-art predictors. Moreover, we build a general model that has transfer ability and can be used to predict EPIs in various cell lines. Availability and implementation: The source code and data are available at https://github.com/hzy95/EPIVAN.
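
The attention step mentioned in this abstract can be illustrated with a minimal sketch. It is not EPIVAN's implementation (that is in the linked repository). It assumes a simple dot-product attention in which each sequence position is scored against a query vector, the scores are normalised with softmax, and the features are summed with those weights. The names `attention_pool` and `query` and the toy values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Score each position by its dot product with `query`, normalise the
    scores with softmax, and return (weights, weighted-sum context vector)."""
    scores = [sum(f_d * q_d for f_d, q_d in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)
    ]
    return weights, context


# Three sequence positions with 2-dimensional features (toy values, not data
# from the paper) and a hypothetical learned query vector.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, context = attention_pool(features, query)
```

The third position has the largest dot product with the query, so it receives the largest weight and contributes most to the pooled vector.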

2018 ◽  
Author(s):  
James M McFarland ◽  
Zandra V Ho ◽  
Guillaume Kugener ◽  
Joshua M Dempster ◽  
Phillip G Montgomery ◽  
...  

The availability of multiple datasets together comprising hundreds of genome-scale RNAi viability screens across a diverse range of cancer cell lines presents new opportunities for understanding cancer vulnerabilities. Integrated analyses of these data to assess differential dependency across genes and cell lines are challenging due to confounding factors such as batch effects and variable screen quality, as well as difficulty assessing gene dependency on an absolute scale. To address these issues, we incorporated estimation of cell line screen quality parameters and hierarchical Bayesian inference into an analytical framework for analyzing RNAi screens (DEMETER2; https://depmap.org/R2-D2). We applied this model to individual large-scale datasets and show that it substantially improves estimates of gene dependency across a range of performance measures, including identification of gold-standard essential genes as well as agreement with CRISPR-Cas9-based viability screens. This model also allows us to effectively integrate information across three large RNAi screening datasets, providing a unified resource representing the most extensive compilation of cancer cell line genetic dependencies to date.


2020 ◽  
Author(s):  
Tommaso Alberti ◽  
Anna Milillo ◽  
Monica Laurenza ◽  
Stefano Massetti ◽  
Stavro Ivanovski ◽  
...  

The interaction between the interplanetary medium and planetary environments gives rise to different phenomena according to the spatio-temporal scales. Here we apply for the first time a novel data analysis method, the Hilbert-Huang Transform (HHT), to discriminate both local and global properties of Venus’ and Mercury’s environments as seen during two MESSENGER flybys. We may infer that the near-Venus environment is similar in terms of local and global features to the ambient solar wind, possibly related to the induced nature of Venus’ magnetosphere. Conversely, the near-Mercury environment presents some different local features with respect to the ambient solar wind, due to both interaction processes and intrinsic structures of the Hermean environment. Our findings support the ion kinetic nature of the Hermean plasma structures, with the foreshock and the magnetosheath regions being characterized by inhomogeneous ion-kinetic intermittent fluctuations, together with MHD and large-scale fluctuations, the latter being representative of the main structure of the magnetosphere. We also show that the HHT analysis allows us to capture and reproduce some interesting features of the Hermean environment, such as flux transfer events, Kelvin-Helmholtz vortices, and ULF wave activity, thus providing a suitable method for characterizing physical processes of different nature. Our approach proves to be very promising for characterizing the structure and dynamics of planetary magnetic fields at different scales, for identifying different planetary regions, and for detecting the “effective” planetary magnetic field that can be used for modelling purposes.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Peng Li ◽  
Qian Wang

To further mine the deep semantic information of the microbial text of public health emergencies, this paper proposes a multichannel microbial sentiment analysis model, MCMF-A. Firstly, we use word2vec and fastText to generate word vectors in the feature vector embedding layer and fuse them with lexical and location feature vectors; secondly, we build a multichannel layer based on CNN and BiLSTM to extract local and global features of the microbial text; then we build an attention mechanism layer to extract the important semantic features of the microbial text; finally, we merge the multichannel output in the fusion layer and use a softmax function in the output layer for sentiment classification. The results show that the F1 value of the MCMF-A sentiment analysis model reaches 90.21%, which is 9.71% and 9.14% higher than the benchmark CNN and BiLSTM models, respectively. The constructed dataset is small, and multimodal information such as images and speech has not been considered.
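
For readers unfamiliar with the F1 value reported above, here is a minimal sketch of how a binary F1 score is computed from predictions. The abstract does not say whether the reported figure is a binary, macro or micro average, so the binary form is an assumption, and the labels below are toy values rather than data from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical): 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
```

With these labels, precision and recall are both 2/3, so the F1 score is also 2/3.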


Author(s):  
Yongfang Peng ◽  
Shengwei Tian ◽  
Long Yu ◽  
Yalong Lv ◽  
Ruijin Wang

To improve the accuracy and automation of malicious Uniform Resource Locator (URL) recognition, a joint approach of convolutional neural network (CNN) and long short-term memory (LSTM) based on the attention mechanism (JCLA) is proposed to identify and detect malicious URLs. Firstly, URL features including texture information, lexical information and host information are extracted, filtered, and encoded during pre-processing. Then, the feature matrix more relevant to the output is chosen according to the weights of the attention mechanism and fed into a parallel processing model called CNN_LSTM, which combines CNN and LSTM to obtain local features. Next, the extracted local features are merged to calculate the global features of the URLs to be detected. Finally, the URLs are classified by a softmax classifier using the global features; the accuracy of the model in malicious URL recognition is 98.26%. The experimental results show that the JCLA model proposed in this paper is better than the traditional deep learning model or the combined CNN_LSTM model for detecting malicious URLs.
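
The abstract does not list the lexical features it extracts. As a rough sketch of what lexical URL features can look like, the function below derives a few simple counts with Python's standard `urllib.parse`. The feature set and the name `lexical_features` are hypothetical and are not taken from the JCLA paper.

```python
from urllib.parse import urlparse


def lexical_features(url):
    """Return a few simple lexical features of a URL (illustrative set only)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        # Number of non-empty path segments, e.g. /a/b/login.php has 3.
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
        "has_at_symbol": "@" in url,
    }


example_url = "http://example.com/a/b/login.php?id=42"
features = lexical_features(example_url)
```

A feature dictionary like this would still need to be encoded as a numeric matrix before it could be fed to a neural network.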


2019 ◽  
Author(s):  
Joshua M. Dempster ◽  
Clare Pacini ◽  
Sasha Pantel ◽  
Fiona M. Behan ◽  
Thomas Green ◽  
...  

Abstract Genome-scale CRISPR-Cas9 viability screens performed in cancer cell lines provide a systematic approach to identify cancer dependencies and new therapeutic targets. As multiple large-scale screens become available, a formal assessment of the reproducibility of these experiments becomes necessary. We analyzed data from recently published pan-cancer CRISPR-Cas9 screens performed at the Broad and Sanger institutes. Despite significant differences in experimental protocols and reagents, we found that the screen results are highly concordant across multiple metrics with both common and specific dependencies jointly identified across the two studies. Furthermore, robust biomarkers of gene dependency found in one dataset are recovered in the other. Through further analysis and replication experiments at each institute, we found that batch effects are driven principally by two key experimental parameters: the reagent library and the assay length. These results indicate that the Broad and Sanger CRISPR-Cas9 viability screens yield robust and reproducible findings.


2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Qianqian Wang ◽  
Fang’ai Liu ◽  
Xiaohui Zhao ◽  
Qiaoqiao Tan

Abstract Click-through rate (CTR) prediction, which aims to predict the probability of a user clicking on an item, is critical to online advertising. How to capture the user's evolving interests from the user behavior sequence is an important issue in CTR prediction. However, most existing models ignore the fact that the sequence is composed of sessions: user behavior can be divided into different sessions according to the time of occurrence. User behaviors are highly correlated within each session and are not relevant across sessions. We propose an effective model for CTR prediction, named Session Interest Model via Self-Attention (SISA). First, we divide the user's sequential behavior into sessions in the session layer. A self-attention mechanism with bias coding is used to model each session. Next, since different session interests may be related to each other or follow a sequential pattern, we use a gated recurrent unit (GRU) to capture the interaction and evolution of the user's different historical session interests in the session interest extractor module. Then, we use local activation and a GRU to aggregate them with respect to the target ad, forming the final representation of the behavior sequence in the session interest interacting module. Experimental results show that the SISA model performs better than other models.
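
Dividing a behavior sequence into sessions by time of occurrence can be sketched as follows. The abstract does not give the splitting rule, so this sketch assumes a new session starts whenever the gap between consecutive events exceeds a threshold. The 30-minute default and the name `split_into_sessions` are assumptions, not details from the paper.

```python
def split_into_sessions(events, max_gap_seconds=1800):
    """Group time-sorted (timestamp_seconds, item_id) events into sessions.

    A new session begins when the gap to the previous event exceeds
    `max_gap_seconds` (an assumed threshold, not taken from the paper).
    """
    sessions = []
    for ts, item in events:
        if sessions and ts - sessions[-1][-1][0] <= max_gap_seconds:
            sessions[-1].append((ts, item))
        else:
            sessions.append([(ts, item)])
    return sessions


# Toy click log (hypothetical): timestamps in seconds, items as letters.
events = [(0, "a"), (600, "b"), (5000, "c"), (5100, "d"), (9000, "e")]
sessions = split_into_sessions(events)
```

Each resulting session could then be modelled on its own, for example with self-attention, before a sequence model such as a GRU relates the sessions to one another.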


Author(s):  
Honegzhe Liu ◽  
Zhifang Deng ◽  
Cheng Xu

Gesture recognition aims at understanding dynamic gestures of the human body and is one of the most important ways of human–computer interaction. To extract more effective spatiotemporal features from gesture videos for more accurate gesture classification, a novel feature extractor network, spatiotemporal attention 3D DenseNet, is proposed in this study. We extend DenseNet with 3D kernels and a Refined Temporal Transition Layer based on the Temporal Transition Layer, and we also explore the attention mechanism in 3D ConvNets. We embed the Refined Temporal Transition Layer and the attention mechanism in DenseNet3D and name the resulting network “spatiotemporal attention 3D DenseNet.” Our experiments show that our Refined Temporal Transition Layer performs better than the Temporal Transition Layer, and that the proposed spatiotemporal attention 3D DenseNet outperforms the current state-of-the-art methods in each modality on the ChaLearn LAP Large-Scale Isolated gesture dataset. The code and pretrained model are released at https://github.com/dzf19927/STA3D.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Xueli Shen ◽  
Zhenxing Liang ◽  
Shiyin Li ◽  
Yanji Jiang

Speech enhancement in a vehicle environment remains a challenging task because of the complex noise. The paper presents a feature extraction method that applies an interchannel attention mechanism frame by frame to learn spatial features directly from the multichannel speech waveforms. The spatial features of the individual signals learned through the proposed method are provided as input to a two-stage BiLSTM network, which is trained to perform adaptive spatial filtering as time-domain filters spanning signal channels. The two-stage BiLSTM network is capable of extracting local and global features and reaches competitive results. Using scenarios and data based on car cockpit simulations, the results show that, compared with other methods that extract features from multichannel data, the proposed method performs strongly on all of SDR, SI-SNR, PESQ, and STOI.
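
Of the metrics named above, SI-SNR (scale-invariant signal-to-noise ratio) has a compact definition: project the zero-mean estimate onto the zero-mean reference, treat the remainder as noise, and report the energy ratio in decibels. The sketch below follows that common definition in pure Python. The small `eps` term and the toy signals are assumptions for illustration and are not taken from the paper.

```python
import math


def si_snr(estimate, reference, eps=1e-12):
    """Scale-invariant SNR in dB between two equal-length signals."""
    n = len(reference)
    mean_est = sum(estimate) / n
    mean_ref = sum(reference) / n
    est = [x - mean_est for x in estimate]
    ref = [x - mean_ref for x in reference]
    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    scale = dot / (ref_energy + eps)
    target = [scale * r for r in ref]
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise)
    return 10 * math.log10((target_energy + eps) / (noise_energy + eps))


# Toy signals (hypothetical): a clean reference and a slightly noisy estimate.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -1.1, 1.2, -0.8]
value_db = si_snr(estimate, reference)
# Rescaling the estimate should leave the metric essentially unchanged.
value_db_scaled = si_snr([2.0 * x for x in estimate], reference)
```

Because the target component is rescaled along with the estimate, multiplying the estimate by a constant does not change the score, which is what distinguishes SI-SNR from plain SNR.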


2018 ◽  
Vol 16 (1) ◽  
pp. 67-76
Author(s):  
Disyacitta Neolia Firdana ◽  
Trimurtini Trimurtini

This research aimed to determine the suitability and effectiveness of big book media for learning equivalent fractions among fourth-grade students. The research method is Research and Development (R&D). The study was conducted in the fourth grade of SDN Karanganyar 02 Kota Semarang. Data sources were media validation, material validation, learning outcomes, and teacher and student responses to the developed media. The research design was pre-experimental, with a one-group pretest-posttest design. The big book consists of equivalent fractions material, student learning activity sheets with rectangle and circle pictures, and questions about equivalent fractions. It was developed based on student and teacher needs. The big book received a media validity score of 3.75 (criterion "very good") and a score of 3 from material experts (criterion "good"). In the large-scale trial, the students' posttest results showed a learning outcome completeness of 82.14%. The N-gain calculation gave 0.55, which indicates the criterion “medium”. The t-test result, 9.6320 > 2.0484, means that the average posttest outcome is better than the average pretest outcome. Based on these data, this study has produced big book media that are suitable and effective as a medium for learning equivalent fractions in the fourth grade of elementary school.
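
The N-gain figure reported above is conventionally the normalized gain: the improvement from pretest to posttest divided by the maximum possible improvement. The sketch below assumes that definition and the commonly used cut-offs of 0.3 and 0.7 for "low", "medium" and "high". The abstract gives neither the cut-offs nor the class averages, so the scores of 60 and 82 are hypothetical values chosen only to illustrate a gain of 0.55.

```python
def normalized_gain(pre, post, max_score=100.0):
    """Normalized gain: (post - pre) / (max_score - pre)."""
    if pre >= max_score:
        raise ValueError("pretest score must be below the maximum score")
    return (post - pre) / (max_score - pre)


def gain_category(g):
    """Classify a normalized gain with assumed cut-offs of 0.3 and 0.7."""
    if g >= 0.7:
        return "high"
    if g >= 0.3:
        return "medium"
    return "low"


# Hypothetical class averages (not from the study) that yield a gain of 0.55.
gain = normalized_gain(pre=60.0, post=82.0)
category = gain_category(gain)
```

Under these assumptions, a gain of 0.55 falls in the "medium" band, which matches the criterion stated in the abstract.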


Genetics ◽  
2001 ◽  
Vol 159 (4) ◽  
pp. 1765-1778
Author(s):  
Gregory J Budziszewski ◽  
Sharon Potter Lewis ◽  
Lyn Wegrich Glover ◽  
Jennifer Reineke ◽  
Gary Jones ◽  
...  

Abstract We have undertaken a large-scale genetic screen to identify genes with a seedling-lethal mutant phenotype. From screening ~38,000 insertional mutant lines, we identified >500 seedling-lethal mutants, completed cosegregation analysis of the insertion and the lethal phenotype for >200 mutants, molecularly characterized 54 mutants, and provided a detailed description for 22 of them. Most of the seedling-lethal mutants seem to affect chloroplast function because they display altered pigmentation and affect genes encoding proteins predicted to have chloroplast localization. Although a high level of functional redundancy in Arabidopsis might be expected because 65% of genes are members of gene families, we found that 41% of the essential genes found in this study are members of Arabidopsis gene families. In addition, we isolated several interesting classes of mutants and genes. We found three mutants in the recently discovered nonmevalonate isoprenoid biosynthetic pathway and mutants disrupting genes similar to Tic40 and tatC, which are likely to be involved in chloroplast protein translocation. Finally, we directly compared T-DNA and Ac/Ds transposon mutagenesis methods in Arabidopsis on a genome scale. In each population, we found only about one-third of the insertion mutations cosegregated with a mutant phenotype.

