Functional classification of transcription factor binding sites: Information content as a metric

2006 ◽  
Vol 3 (1) ◽  
pp. 32-44 ◽  
Author(s):  
D. Ashok Reddy ◽  
B. V. L. S. Prasad ◽  
Chanchal K. Mitra

Summary: The information content (relative entropy) of transcription factor binding sites (TFBS) is used to classify transcription factors (TFs). TF classes are clustered according to how their TFBS cluster by information content. Any TF in a given TF class cluster has a chance of binding to any TFBS in the corresponding clustered group. Thus, for the 41 TFBS in humans, perhaps only 5-10 TFs are actually needed, and in mouse, instead of 13 TFs, there may be only 5 or so. The TFBS data used in this study come from the JASPAR database. Experimental data on TFs for specific gene expression from the TRRD database also agree with our computational results. This gives us a new way to look at protein classification: based not on structure or function but on the nature of their TFBS.
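As a rough illustration of the metric named in the summary, the sketch below computes the relative entropy (in bits) of each column of a position frequency matrix against a background distribution and sums the columns to score a whole site. The function names, the toy matrix, and the uniform 0.25 background are assumptions made for illustration; the summary does not state the exact formula, background, or clustering procedure the authors used.

```python
import math

BASES = "ACGT"

def column_information_content(counts, background=None):
    """Relative entropy (bits) of one matrix column versus a background.

    `counts` maps each base to its observed count in that column.
    A uniform background of 0.25 per base is assumed when none is given.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    total = sum(counts[b] for b in BASES)
    ic = 0.0
    for b in BASES:
        p = counts[b] / total
        if p > 0:  # a base with zero frequency contributes nothing
            ic += p * math.log2(p / background[b])
    return ic

def site_information_content(pfm):
    """Sum of the per-column relative entropies over the whole site."""
    return sum(column_information_content(col) for col in pfm)

# Hypothetical toy matrix, not taken from JASPAR.
toy_pfm = [
    {"A": 10, "C": 0, "G": 0, "T": 0},  # fully conserved column
    {"A": 5, "C": 5, "G": 0, "T": 0},   # two equally likely bases
    {"A": 5, "C": 5, "G": 5, "T": 5},   # uniform column
]

print(site_information_content(toy_pfm))  # 3.0 bits for this toy matrix
```

With a uniform background, a fully conserved column contributes 2 bits and a uniform column contributes 0, so sites with similar per-column profiles end up close to each other under any distance built on these values.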

PLoS ONE ◽  
2011 ◽  
Vol 6 (11) ◽  
pp. e26160 ◽  
Author(s):  
Hollis Wright ◽  
Aaron Cohen ◽  
Kemal Sönmez ◽  
Gregory Yochum ◽  
Shannon McWeeney

2021 ◽  
Vol 11 (11) ◽  
pp. 5123
Author(s):  
Maiada M. Mahmoud ◽  
Nahla A. Belal ◽  
Aliaa Youssif

Transcription factors (TFs) are proteins that control the transcription of a gene from DNA to messenger RNA (mRNA). TFs bind to a specific DNA sequence called a binding site. Transcription factor binding sites have not yet been completely identified, a challenge that can be approached computationally and framed as a classification problem in machine learning. In this paper, the prediction of SP1 transcription factor binding sites on human chromosome 1 is presented using different classification techniques, and a model using voting is proposed. The highest Area Under the Curve (AUC) achieved is 0.97 using K-Nearest Neighbors (KNN), and 0.95 using the proposed voting technique. However, the proposed voting technique is more efficient with noisy data. This study highlights the applicability and significance of the voting technique for predicting binding sites, and shows that KNN outperforms the other techniques on this type of data.
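The voting idea in the abstract can be sketched as a hard majority vote over the labels produced by several classifiers. The abstract names only KNN among the base classifiers and does not say whether voting is hard or soft, so the classifier names and prediction lists below are hypothetical placeholders, not the authors' setup.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists by hard majority voting.

    `predictions` is a list of equal-length label lists, one per classifier.
    Returns one voted label per sample.
    """
    voted = []
    for labels in zip(*predictions):  # labels for one sample, across classifiers
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted

# Hypothetical outputs of three classifiers on four candidate sites
# (1 = binding site, 0 = not a binding site).
knn_pred = [1, 0, 1, 1]
svm_pred = [1, 1, 0, 1]
tree_pred = [0, 0, 1, 1]

print(majority_vote([knn_pred, svm_pred, tree_pred]))  # [1, 0, 1, 1]
```

Using an odd number of classifiers on a binary problem avoids ties; a single mislabeled prediction is outvoted by the other two, which is one plausible reason a voting ensemble can hold up better on noisy data.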

