Motif Discovery Using Expectation Maximization and Gibbs’ Sampling

Author(s):  
Gary D. Stormo
2020
Author(s):
Osamu Maruyama, Fumiko Matsuzaki

Abstract Background: The ubiquitin-proteasome system is a pathway in eukaryotic cells that degrades polyubiquitin-tagged proteins through the proteasomal machinery to control various cellular processes and maintain intracellular homeostasis. In this system, the E3 ubiquitin ligase (hereinafter E3) plays an important role in selectively recognizing and binding to specific regions of its substrate proteins. The relationship between a substrate protein and the sites on it bound by E3s is not well understood, so it is challenging to identify such sites computationally. Results: In this study, we propose a collapsed Gibbs sampling algorithm called DegSampler (Degron Sampler) to identify the binding sites of E3s. DegSampler employs a position-specific prior probability distribution based on estimated information about disorder-to-order regions bound by any protein. Conclusions: Our computational experiments show that DegSampler achieved F-measure values 5 and 3.5 times higher than those of MEME and GLAM2, respectively. Thus, DegSampler is the first model demonstrating an effective way of using estimated information on disorder-to-order binding regions in motif discovery. We expect our results to improve further as higher-quality proteome-wide disorder-to-order binding region data become available.
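To make the idea of a Gibbs sampler with a position-specific prior concrete, here is a minimal sketch of a generic one-site-per-sequence motif sampler. It is not DegSampler itself: the function name `gibbs_site_sampler`, its parameters, the fixed motif width, the single pseudocount, and the toy DNA alphabet are all assumptions made for illustration. The abstract does not specify the authors' model, so this only shows how a per-position prior weight can enter the sampling step.

```python
import random

def gibbs_site_sampler(seqs, width, prior, alphabet, iters=200, pseudo=1.0, seed=0):
    """Illustrative site sampler (hypothetical helper, not the authors' code).

    prior[i][p] is a non-negative weight for a motif starting at position p
    of seqs[i]; at least one weight per sequence must be positive.
    Returns one sampled start position per sequence.
    """
    rng = random.Random(seed)

    # Background residue frequencies, smoothed with pseudocounts.
    bg_counts = {a: pseudo for a in alphabet}
    for s in seqs:
        for ch in s:
            bg_counts[ch] += 1
    total = sum(bg_counts.values())
    bg = {a: c / total for a, c in bg_counts.items()}

    # Random initial start positions.
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]

    for _ in range(iters):
        for i, s in enumerate(seqs):
            # Motif column counts from every sequence except the held-out one.
            counts = [{a: pseudo for a in alphabet} for _ in range(width)]
            for j, t in enumerate(seqs):
                if j != i:
                    for k in range(width):
                        counts[k][t[starts[j] + k]] += 1
            denom = (len(seqs) - 1) + pseudo * len(alphabet)

            # Weight of each candidate start:
            # prior weight times the motif/background likelihood ratio.
            weights = []
            for p in range(len(s) - width + 1):
                w = prior[i][p]
                for k in range(width):
                    ch = s[p + k]
                    w *= (counts[k][ch] / denom) / bg[ch]
                weights.append(w)

            # Resample the start position of the held-out sequence.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts
```

With a uniform `prior` this reduces to an ordinary site sampler; setting a prior weight to zero excludes that start position entirely, which is one simple way that external evidence (here, hypothetically, predicted disorder-to-order binding regions) could steer the search.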

