Identification of D Modification Sites by Integrating Heterogeneous Features in Saccharomyces cerevisiae

Molecules ◽  
2019 ◽  
Vol 24 (3) ◽  
pp. 380 ◽  
Author(s):  
Pengmian Feng ◽  
Zhaochun Xu ◽  
Hui Yang ◽  
Hao Lv ◽  
Hui Ding ◽  
...  

As an abundant post-transcriptional modification, dihydrouridine (D) has been found in transfer RNA (tRNA) from bacteria, eukaryotes, and archaea. Nonetheless, knowledge of the exact biochemical roles of dihydrouridine in mediating tRNA function is still limited. Accurate identification of the position of D sites is essential for understanding their functions. Therefore, it is desirable to develop novel methods to identify D sites. In this study, an ensemble classifier was proposed for the detection of D modification sites in the Saccharomyces cerevisiae transcriptome by using heterogeneous features. The jackknife test results demonstrate that the proposed predictor is promising for the identification of D modification sites. It is anticipated that the proposed method can be widely used for identifying D modification sites in tRNA.
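The jackknife test mentioned above is leave-one-out cross-validation: each sample is held out once, the model is trained on all remaining samples, and the held-out sample is then predicted. The following is a minimal sketch of that procedure only. The nearest-centroid classifier, the function names, and the toy feature vectors are illustrative assumptions; they are not the ensemble classifier or the features used in the paper.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Predictor = Callable[[Vector], int]
Trainer = Callable[[List[Vector], List[int]], Predictor]


def train_nearest_centroid(X: List[Vector], y: List[int]) -> Predictor:
    """Toy stand-in classifier (assumption): predict the class with the closest mean vector."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x: Vector) -> int:
        def sq_dist(center: Vector) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, center))

        return min(centroids, key=lambda lab: sq_dist(centroids[lab]))

    return predict


def jackknife_accuracy(X: List[Vector], y: List[int], trainer: Trainer) -> float:
    """Hold out each sample once, train on the rest, and score the held-out prediction."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        predict = trainer(X_train, y_train)
        correct += int(predict(X[i]) == y[i])
    return correct / len(X)


# Hypothetical, well-separated toy data: label 1 = modified site, label 0 = unmodified site.
X_toy = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y_toy = [0, 0, 0, 1, 1, 1]
toy_accuracy = jackknife_accuracy(X_toy, y_toy, train_nearest_centroid)
```

Because every sample is used for testing exactly once and the split involves no randomness, the jackknife estimate is deterministic for a given dataset, at the cost of training the model once per sample.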

Author(s):  
Nataliia Koshkina

The paper proposes a method for improving the accuracy of steganalysis systems that use an ensemble classifier. The method takes a weighted final vote over several highly sensitive feature-vector models. Its effectiveness was evaluated on the task of detecting steganograms created by the Jphide program. The accuracy obtained with each single model (LIU, CC-PEV, CC-C300, DCTR, PHARM, GFR) was compared with the accuracy obtained by combining several models according to the developed method. The test results showed that weighted final voting over several highly sensitive models increases the detection accuracy for steganograms with a relatively small payload (short secret messages) without compromising the detection accuracy for steganograms with a high payload.
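A weighted final vote can be sketched as a weighted average of per-model scores followed by a threshold. The abstract does not say how the weights are chosen or what each model outputs, so the example below makes assumptions: each model returns a probability that the image is a steganogram, and the weights and the 0.5 threshold are hypothetical values chosen only for illustration.

```python
from typing import Dict


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-model probabilities (each in [0, 1]) that an image is stego."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights of the voting models must sum to a positive value")
    return sum(weights[name] * scores[name] for name in scores) / total_weight


def weighted_vote(scores: Dict[str, float], weights: Dict[str, float],
                  threshold: float = 0.5) -> bool:
    """Final decision: True means 'steganogram detected'. The threshold is an assumption."""
    return weighted_score(scores, weights) >= threshold


# Hypothetical per-model outputs and weights for a single image.
example_scores = {"DCTR": 0.9, "GFR": 0.8, "PHARM": 0.2}
example_weights = {"DCTR": 2.0, "GFR": 2.0, "PHARM": 1.0}
example_decision = weighted_vote(example_scores, example_weights)
```

With these made-up numbers the two heavily weighted models outvote the third, so the combined score is 0.72 and the image is flagged.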


2019 ◽  
Vol 35 (16) ◽  
pp. 2796-2800 ◽  
Author(s):  
Wei Chen ◽  
Hao Lv ◽  
Fulei Nie ◽  
Hao Lin

Motivation: DNA N6-methyladenine (6mA) is associated with a wide range of biological processes. Since the distribution of 6mA sites in the genome is non-random, accurate identification of 6mA sites is crucial for understanding their biological functions. Although experimental methods have been proposed for this purpose, they are still not cost-effective for detecting 6mA sites on a genome-wide scale. Therefore, it is desirable to develop computational methods to facilitate the identification of 6mA sites. Results: In this study, a computational method called i6mA-Pred was developed to identify 6mA sites in the rice genome, in which the optimal nucleotide chemical properties obtained by using a feature selection technique were used to encode the DNA sequences. i6mA-Pred yielded an accuracy of 83.13% in the jackknife test. Meanwhile, the performance of i6mA-Pred was also superior to that of other methods. Availability and implementation: A user-friendly web server, i6mA-Pred, is freely accessible at http://lin-group.cn/server/i6mA-Pred.
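To illustrate what encoding a DNA sequence by nucleotide chemical properties can look like, the sketch below maps each base to three binary properties: ring structure (purine = 1, pyrimidine = 0), functional group (amino = 1, keto = 0), and hydrogen bonding (weak = 1, strong = 0). This is one commonly used scheme and is an assumption here; the abstract does not specify which properties i6mA-Pred uses or which of them survive feature selection.

```python
from typing import Dict, List, Tuple

# (ring structure, functional group, hydrogen bonding); an assumed, commonly used scheme.
NCP: Dict[str, Tuple[int, int, int]] = {
    "A": (1, 1, 1),  # purine, amino, weak hydrogen bonding
    "C": (0, 1, 0),  # pyrimidine, amino, strong hydrogen bonding
    "G": (1, 0, 0),  # purine, keto, strong hydrogen bonding
    "T": (0, 0, 1),  # pyrimidine, keto, weak hydrogen bonding
}


def encode_ncp(sequence: str) -> List[int]:
    """Concatenate the three property bits of every base into one flat feature vector."""
    features: List[int] = []
    for base in sequence.upper():
        if base not in NCP:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        features.extend(NCP[base])
    return features


encoded = encode_ncp("ACGT")
```

A sequence of length n becomes a vector of 3n bits, which a feature selection step could then prune before classification.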


2017 ◽  
Vol 7 (7) ◽  
pp. 2219-2226 ◽  
Author(s):  
Kinnari Matheson ◽  
Lance Parsons ◽  
Alison Gammie

The yeast Saccharomyces cerevisiae has emerged as a superior model organism. Selection of distinct laboratory strains of S. cerevisiae with unique phenotypic properties, such as superior mating or sporulation efficiencies, has facilitated advancements in research. W303 is one such laboratory strain that is closely related to the first completely sequenced yeast strain, S288C. In this work, we provide a high-quality, annotated genome sequence for W303 for utilization in comparative analyses and genome-wide studies. Approximately 9500 variations exist between S288C and W303, affecting the protein sequences of ∼700 genes. A listing of the polymorphisms and divergent genes is provided for researchers interested in identifying the genetic basis for phenotypic differences between W303 and S288C. Several divergent functional gene families were identified, including flocculation and sporulation genes, likely representing selection for desirable laboratory phenotypes. Interestingly, remnants of ancestor wine strains were found on several chromosomes. Finally, as a test of the utility of the high-quality reference genome, variant mapping revealed more accurate identification of accumulated mutations in passaged mismatch repair-defective strains.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Lu Zhang ◽  
Xinyi Qin ◽  
Min Liu ◽  
Guangzhong Liu ◽  
Yuxiao Ren

As one of the most prevalent posttranscriptional modifications of RNA, N7-methylguanosine (m7G) plays an essential role in the regulation of gene expression. Accurate identification of m7G sites in the transcriptome is invaluable for revealing their potential functional mechanisms. Although high-throughput experimental methods can locate m7G sites precisely, they are expensive and time-consuming. Hence, it is imperative to design an efficient computational method that can accurately identify m7G sites. In this study, we propose a novel method that brings a BERT-based multilingual model into bioinformatics to represent the information in RNA sequences. First, we treat RNA sequences as natural sentences and employ the bidirectional encoder representations from transformers (BERT) model to transform them into fixed-length numerical matrices. Second, a feature selection scheme based on the elastic net method is constructed to eliminate redundant features and retain important ones. Finally, the selected feature subset is input into a stacking ensemble classifier to predict m7G sites, and the hyperparameters of the classifier are tuned with the tree-structured Parzen estimator (TPE) approach. In 10-fold cross-validation, BERT-m7G achieves an ACC of 95.48% and an MCC of 0.9100. The experimental results indicate that the proposed method significantly outperforms state-of-the-art prediction methods in the identification of m7G modifications.
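The two reported metrics, accuracy (ACC) and the Matthews correlation coefficient (MCC), are both computed from the counts in a binary confusion matrix. The sketch below uses the standard definitions. The counts in the example are hypothetical and are not taken from the paper.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """ACC = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for illustration only.
example_acc = accuracy(tp=45, tn=45, fp=5, fn=5)
example_mcc = mcc(tp=45, tn=45, fp=5, fn=5)
```

MCC ranges from -1 to 1 and accounts for all four cells of the confusion matrix, which is why it is often reported alongside ACC.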

