motif information

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Ying Li ◽  
Hang Sun ◽  
Shiyao Feng ◽  
Qi Zhang ◽  
Siyu Han ◽  
...  

Abstract. Background: Long noncoding RNAs (lncRNAs) play important roles in multiple biological processes. Identifying lncRNA–protein interactions (LPIs) is key to understanding lncRNA functions. Although some computational methods for LPI prediction have been developed, the problem remains challenging. Research on LPIs has consistently focused on how to integrate multimodal features from more perspectives and how to build deep learning architectures with better recognition performance. Results: We present Capsule-LPI, a novel multichannel capsule network framework that integrates multimodal features for LPI prediction. Capsule-LPI integrates four groups of multimodal features: sequence features, motif information, physicochemical properties and secondary structure features. It is composed of four feature-learning subnetworks and one capsule subnetwork. Through comprehensive experimental comparisons and evaluations, we demonstrate that both the multimodal features and the multichannel capsule network architecture can significantly improve the performance of LPI prediction. The experimental results show that Capsule-LPI performs better than the existing state-of-the-art tools. The precision of Capsule-LPI is 87.3%, which represents a 1.7% improvement. The F-value of Capsule-LPI is 92.2%, which represents a 1.4% improvement. Conclusions: This study provides a novel and feasible LPI prediction tool based on the integration of multimodal features and a capsule network. A web server (http://csbg-jlu.site/lpc/predict) is provided for the convenience of users.


2021 ◽  
Vol 7 (17) ◽  
pp. eabf1754
Author(s):  
Huta R. Banjade ◽  
Sandro Hauri ◽  
Shanshan Zhang ◽  
Francesco Ricci ◽  
Weiyi Gong ◽  
...  

Incorporating physical principles into a machine learning (ML) architecture is a fundamental step toward the continued development of artificial intelligence for inorganic materials. Inspired by Pauling's rules, we propose that structure motifs in inorganic crystals can serve as a central input to a machine learning framework. We demonstrate that the presence of structure motifs and their connections in a large set of crystalline compounds can be converted into unique vector representations using an unsupervised learning algorithm. To demonstrate the use of structure motif information, we create a motif-centric learning framework by combining motif information with atom-based graph neural networks to form an atom-motif dual graph network (AMDNet), which is more accurate in predicting electronic-structure properties of metal oxides, such as bandgaps. The work illustrates a route toward the fundamental design of graph neural network learning architectures for complex materials by incorporating beyond-atom physical principles.


Genetics ◽  
2020 ◽  
Vol 216 (2) ◽  
pp. 353-358
Author(s):  
Mengchi Wang ◽  
David Wang ◽  
Kai Zhang ◽  
Vu Ngo ◽  
Shicai Fan ◽  
...  

Sequence analysis frequently requires intuitive understanding and convenient representation of motifs. Typically, motifs are represented as position weight matrices (PWMs) and visualized using sequence logos. However, in many scenarios, when interpreting motif information or searching for motif matches, it is compact and sufficient to represent motifs by wildcard-style consensus sequences (such as [GC][AT]GATAAG[GAC]). Based on mutual information theory and Jensen-Shannon divergence, we propose a mathematical framework to minimize the information loss in converting PWMs to consensus sequences. We name this representation the sequence Motto and have implemented an efficient algorithm with flexible options for converting motif PWMs into Motto from nucleotides, amino acids, and customized characters. We show that this representation provides a simple and efficient way to identify the binding sites of 1156 common transcription factors (TFs) in the human genome. The effectiveness of the method was benchmarked by comparing sequence matches found by Motto with PWM scanning results found by FIMO. On average, our method achieves a 0.81 area under the precision-recall curve, significantly (P-value < 0.01) outperforming all existing methods, including maximal positional weight, Cavener’s method, and minimal mean square error. We believe this representation provides a distilled summary of a motif, as well as the statistical justification.
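
To make the idea of converting a PWM into a wildcard-style consensus more concrete, here is a minimal sketch. It is not the Motto algorithm: it is a simplified heuristic, assumed for illustration, that for each PWM column picks the top-k nucleotides whose uniform distribution has the smallest Jensen-Shannon divergence from the column. The function names (`jsd`, `column_to_symbol`, `pwm_to_consensus`) and the `N` shorthand for "any base" are hypothetical choices, not taken from the paper.

```python
import math

ALPHABET = "ACGT"  # assumed column order of each PWM column


def jsd(p, q):
    """Jensen-Shannon divergence (in bits) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def column_to_symbol(col):
    """Pick the top-k letters whose uniform distribution is closest to col."""
    ranked = sorted(range(len(col)), key=lambda i: -col[i])
    best_div, best_set = None, None
    for k in range(1, len(col) + 1):
        chosen = ranked[:k]
        q = [1.0 / k if i in chosen else 0.0 for i in range(len(col))]
        d = jsd(col, q)
        if best_div is None or d < best_div:
            best_div, best_set = d, chosen
    letters = "".join(ALPHABET[i] for i in sorted(best_set))
    if len(letters) == len(ALPHABET):
        return "N"  # hypothetical shorthand for "any base"
    return letters if len(letters) == 1 else f"[{letters}]"


def pwm_to_consensus(pwm):
    """Convert a PWM (list of per-position probability columns) to a consensus."""
    return "".join(column_to_symbol(col) for col in pwm)


example_pwm = [
    [0.97, 0.01, 0.01, 0.01],  # strongly A
    [0.05, 0.45, 0.45, 0.05],  # C or G
    [0.25, 0.25, 0.25, 0.25],  # uninformative
]
consensus = pwm_to_consensus(example_pwm)
```

Because the output uses bracketed character classes, it can be turned into a regular expression for simple match searching (for instance by replacing `N` with `.`), which is the kind of convenience the abstract attributes to consensus representations.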


2020 ◽  
Vol 48 (W1) ◽  
pp. W208-W217
Author(s):  
Clémentine Leporcq ◽  
Yannick Spill ◽  
Delphine Balaramane ◽  
Christophe Toussaint ◽  
Michaël Weber ◽  
...  

Abstract. Transcription factors (TFs) regulate gene expression. The binding specificities of many TFs have been deciphered and summarized as position-weight matrices, also called TF motifs. Despite the availability of hundreds of known TF motifs in databases, it remains non-trivial to quickly query and visualize the enrichment of known TF motifs in genomic regions of interest. To this end, we developed TFmotifView, a web server for studying the distribution of known TF motifs in genomic regions. Based on input genomic regions and selected TF motifs, TFmotifView overlaps the genomic regions with TF motif occurrences identified using a dynamic P-value threshold. TFmotifView generates three different outputs: (i) an enrichment table and scatterplot calculating the significance of TF motif occurrences in genomic regions compared to control regions, (ii) a genomic view of the organisation of TF motifs in each genomic region and (iii) a metaplot summarizing the position of TF motifs relative to the center of the regions. TFmotifView will contribute to the integration of TF motif information with a wide range of genomic datasets, with the goal of better understanding the regulation of gene expression by transcription factors. TFmotifView is freely available at http://bardet.u-strasbg.fr/tfmotifview/.
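
The comparison of motif occurrences in input regions against control regions can be illustrated with a small enrichment test. The abstract does not say which statistical test TFmotifView uses, so the one-sided hypergeometric test below is an assumption chosen for illustration, and the function name `enrichment_pvalue` and its parameters are hypothetical.

```python
from math import comb


def enrichment_pvalue(hits_fg, n_fg, hits_bg, n_bg):
    """One-sided hypergeometric p-value for motif enrichment.

    hits_fg / n_fg: regions of interest containing the motif / total regions.
    hits_bg / n_bg: control regions containing the motif / total controls.
    Returns P(X >= hits_fg) when n_fg regions are drawn at random from the
    pooled set of n_fg + n_bg regions.
    """
    total = n_fg + n_bg
    total_hits = hits_fg + hits_bg
    upper = min(total_hits, n_fg)
    tail = sum(
        comb(total_hits, i) * comb(total - total_hits, n_fg - i)
        for i in range(hits_fg, upper + 1)
    )
    return tail / comb(total, n_fg)


# Hypothetical counts: 40 of 100 input regions vs 15 of 100 control regions.
p_value = enrichment_pvalue(40, 100, 15, 100)
```

A small p-value here would suggest the motif occurs in the input regions more often than expected from the pooled set; a real analysis would also need multiple-testing correction across motifs.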


Performance ◽  
2020 ◽  
Vol 27 (1) ◽  
pp. 1
Author(s):  
Abdul Lathif ◽  
Mohamad Syahriar Sugandi

Technological development, in particular the Web 2.0 generation of the web, forms the background for the emergence of media based on user-generated content (UGC), which have distinct characteristics. As these media develop, differences in their character and features give rise to varied motives and gratifications. This study aims to determine gratification sought and gratification obtained, and the difference between the two, among users of the UGC-based website Zomato.com in the Jakarta area. The study uses uses and gratification 2.0 theory as its framework and Shao (2008) as the reference for motives in UGC-based media. The research method is a descriptive survey, with survey data collected from 100 respondents in the Jakarta area. Sampling used a non-probability sampling technique. Survey responses were processed using univariate analysis and descriptive statistics. The results show that the highest motive for gratification sought in using Zomato.com is self-expression, at 14.72%, and the lowest is virtual communities, at 13.71%. For gratification obtained, the highest motive is information seeking, at 15.48%, and the lowest is virtual communities, at 12.64%. The difference between gratification sought and gratification obtained is dominated by decreases in the mean value of each motive, including information seeking, self-expression, mood management, entertainment, self-actualization and virtual communities. The social interaction motive, by contrast, increased in mean value.


2020 ◽  
pp. 1306-1327
Author(s):  
Gowri Rajasekaran ◽  
Rathipriya R

Many people today are affected by genetic disorders and hereditary diseases. Protein complexes and their functions are studied in order to find irregularities in gene expression. Within a group of related proteins, there exist conserved sequence patterns (motifs) that are functionally or structurally similar. The main objective of this work is to extract motif information from a given protein sequence dataset. Protein function can ideally be inferred from motif information. Clustering is a core data mining technique, and biclustering is also used in many bioinformatics studies. This work proposes a PSO K-Means clustering and biclustering approach to extract motif information. Motifs are extracted based on the structural homogeneity of the protein sequences. The clusters and biclusters are compared in terms of homogeneity and the motif information extracted. The study shows that the biclustering approach yields better results than the clustering approach.


2019 ◽  
Author(s):  
Mengchi Wang ◽  
David Wang ◽  
Kai Zhang ◽  
Vu Ngo ◽  
Shicai Fan ◽  
...  

ABSTRACT. Sequence analysis frequently requires intuitive understanding and convenient representation of motifs. Typically, motifs are represented as position weight matrices (PWMs) and visualized using sequence logos. However, in many scenarios, representing motifs by wildcard-style consensus sequences is compact and sufficient for interpreting the motif information and searching for motif matches. Based on mutual information theory and Jensen-Shannon divergence, we propose a mathematical framework to minimize the information loss in converting PWMs to consensus sequences. We name this representation the sequence Motto and have implemented an efficient algorithm with flexible options for converting motif PWMs into Motto from nucleotides, amino acids, and customized alphabets. Here we show that this representation provides a simple and efficient way to identify the binding sites of 1156 common TFs in the human genome. The effectiveness of the method was benchmarked by comparing sequence matches found by Motto with PWM scanning results found by FIMO. On average, our method achieves 0.81 area under the precision-recall curve, significantly (p-value < 0.01) outperforming all existing methods, including maximal positional weight, Douglas and minimal mean square error. We believe this representation provides a distilled summary of a motif, as well as the statistical justification. AVAILABILITY: Motto is freely available at http://wanglab.ucsd.edu/star/motto.
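
For readers unfamiliar with the PWM scanning that serves as the benchmark reference, the sketch below scores every window of a sequence against a log-odds matrix. It is a bare-bones illustration, not FIMO: FIMO additionally converts scores to p-values, which is omitted here. The uniform background, the pseudocount, the score threshold and the function names are all assumptions made for the example, and the input is assumed to contain only uppercase A, C, G and T.

```python
import math

ALPHABET = "ACGT"  # assumed column order of each PWM column


def log_odds_matrix(pwm, background=0.25, pseudocount=1e-3):
    """Convert probability columns to log2 odds against a uniform background."""
    norm = 1 + len(ALPHABET) * pseudocount
    return [
        [math.log2((p + pseudocount) / norm / background) for p in col]
        for col in pwm
    ]


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    lod = log_odds_matrix(pwm)
    width = len(lod)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(lod[i][ALPHABET.index(base)] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Hypothetical PWM that strongly prefers the word GATA.
gata_pwm = [
    [0.01, 0.01, 0.97, 0.01],  # G
    [0.97, 0.01, 0.01, 0.01],  # A
    [0.01, 0.01, 0.01, 0.97],  # T
    [0.97, 0.01, 0.01, 0.01],  # A
]
hits = scan("TTGATAAG", gata_pwm, threshold=4.0)
```

A consensus-based search replaces this per-window arithmetic with a plain pattern match, which is the trade-off between compactness and score resolution that the abstract evaluates.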

