Transcription Factor Bound Regions Prediction: Word2Vec Technique with Convolutional Neural Network

2020 ◽  
Vol 12 (01) ◽  
pp. 1-13
Author(s):  
Rixin Chen ◽  
Ruoxi Dai ◽  
Mingye Wang
2021 ◽  
Author(s):  
Lifan Liang ◽  
Xinghua Lu ◽  
Songjian Lu

Transcription factor (TF) binding sites in ATAC-seq data are typically determined by footprint analysis. However, the performance of footprint analysis remains unsatisfactory, and most TFs do not exhibit footprint patterns. In this study, we modified the convolutional neural network to project sequences into an embedding space. Sequences with similar nucleotide patterns stay close together in the embedding. The dimensions of this embedding space represent the binding specificities of various TFs. In the simulation experiment, peak2vec accurately distinguished the three TFs in the embedding space, whereas conventional deep learning could not. When applied to the ATAC-seq profiles of hepatitis carcinoma, peak2vec recovered multiple motifs curated in databases, and a significant portion of the sequences corresponding to each TF were located at the promoter regions of its regulated genes.


2021 ◽  
Vol 22 (11) ◽  
pp. 5521
Author(s):  
Lei Deng ◽  
Hui Wu ◽  
Xuejun Liu ◽  
Hui Liu

Predicting in vivo protein–DNA binding sites is a challenging but pressing task in a variety of fields such as drug design and development. Most promoters contain a number of transcription factor (TF) binding sites, but only a small minority have been identified by biochemical experiments, which are time-consuming and laborious. To tackle this challenge, many computational methods have been proposed to predict TF binding sites from DNA sequence. Although previous methods have achieved remarkable performance in the prediction of protein–DNA interactions, there is still considerable room for improvement. In this paper, we present a hybrid deep learning framework, termed DeepD2V, for transcription factor binding site prediction. First, we construct the input matrix from an original DNA sequence and three kinds of variant sequences: its inverse, complementary, and complementary inverse sequences. A sliding window of size k with a specific stride is used to obtain the k-mer representation of the input sequences. Next, we use word2vec to obtain a pre-trained distributed representation model of k-mer words. Finally, the probability of protein–DNA binding is predicted using a recurrent and convolutional neural network. The experimental results on 50 public ChIP-seq benchmark datasets demonstrate the superior performance and robustness of DeepD2V. Moreover, we verify that DeepD2V performs better with the word2vec-based k-mer distributed representation than with one-hot encoding, and that the integrated framework of a convolutional neural network (CNN) and a bidirectional LSTM (bi-LSTM) outperforms either the CNN or the bi-LSTM model used alone. The source code of DeepD2V is available on GitHub.
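The input construction described in this abstract (three variant sequences plus a strided k-mer sliding window) can be illustrated with a minimal Python sketch. This is not the DeepD2V source code: the function names `sequence_variants` and `kmers`, the default values of `k` and `stride`, and the restriction to the four-letter alphabet A, C, G, T are all assumptions made for illustration.

```python
# Illustrative sketch only; names and defaults are hypothetical, not taken from DeepD2V.

# Translation table mapping each base to its complement (assumes only A, C, G, T).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def sequence_variants(seq):
    """Return the original sequence and its inverse, complementary,
    and complementary inverse variants, in that order."""
    seq = seq.upper()
    inverse = seq[::-1]                          # read the sequence backwards
    complementary = seq.translate(COMPLEMENT)    # swap A<->T and C<->G
    complementary_inverse = complementary[::-1]  # complement, then reverse
    return [seq, inverse, complementary, complementary_inverse]


def kmers(seq, k=3, stride=1):
    """Slide a window of size k over seq with the given stride
    and return the resulting k-mer 'words'."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


if __name__ == "__main__":
    for variant in sequence_variants("ACGTT"):
        print(variant, kmers(variant, k=3, stride=1))
```

Each list of k-mers plays the role of a sentence of words, which is the form a word2vec implementation would typically expect as training input; the embedding step itself is omitted here.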


2020 ◽  
Author(s):  
S Kashin ◽  
D Zavyalov ◽  
A Rusakov ◽  
V Khryashchev ◽  
A Lebedev

2020 ◽  
Vol 2020 (10) ◽  
pp. 181-1-181-7
Author(s):  
Takahiro Kudo ◽  
Takanori Fujisawa ◽  
Takuro Yamaguchi ◽  
Masaaki Ikehara

Image deconvolution has become an important problem in recent years. There are two kinds of approaches: non-blind and blind. Non-blind deconvolution is a classic image deblurring problem, which assumes that the point spread function (PSF) is known and spatially invariant. Recently, convolutional neural networks (CNNs) have been used for non-blind deconvolution. Though CNNs can deal with complex changes for unknown images, some conventional CNN-based methods can only handle small PSFs and do not consider the large PSFs encountered in the real world. In this paper, we propose a non-blind deconvolution framework based on a CNN that can remove large-scale ringing in a deblurred image. Our method has three key points. The first is that our network architecture is able to preserve both large and small features in the image. The second is that the training dataset is created to preserve the details. The third is that we extend the images to minimize the effects of large ringing on the image borders. In our experiments, we used three kinds of large PSFs and observed high-precision results from our method both quantitatively and qualitatively.

