pair composition
Recently Published Documents


Total documents: 35 (five years: 8)

H-index: 13 (five years: 0)

2022 ◽  
Vol 12 ◽  
Author(s):  
Rulan Wang ◽  
Zhuo Wang ◽  
Zhongyan Li ◽  
Tzong-Yi Lee

Lysine crotonylation (Kcr) is involved in many biological processes in the human body, and various computational methods have been developed for Kcr prediction. Existing methods typically adopt sequence-based features that consider only the composition of linearly neighboring amino acids. However, a modified Kcr site is surrounded not only by its linear neighbors but also by residues that are spatially close to the target site. In this paper, we use residue–residue contact as a new feature for Kcr prediction, encoding both the linearly surrounding residues and those spatially near the target site. The spatially surrounding residues are used for feature encoding for the first time, in two schemes named residue–residue composition (RRC) and residue–residue pair composition (RRPC), which were applied in supervised learning classification for Kcr prediction. At their best, RRC achieved an accuracy of 0.77 and an area under the curve (AUC) of 0.78, and RRPC achieved an accuracy of 0.74 and an AUC of 0.80. To show that the spatial feature is competitive in significance with other sequence-based features, feature selection was carried out on those sequence-based features together with RRPC. In addition, different radii for the surrounding amino acid composition were compared. RRC and RRPC performed competitively with the sequence-based features and, in some cases, were around 0.20 higher in accuracy or 0.3 higher in AUC.
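The idea of encoding spatially surrounding residues can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' definition: one 3D coordinate per residue, a fixed distance cutoff `radius`, unordered residue pairs for RRPC, and the hypothetical function names `spatial_neighbors`, `rrc`, and `rrpc`.

```python
from itertools import combinations
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def spatial_neighbors(coords, target, radius):
    """Indices of residues within `radius` of the target residue (target excluded)."""
    return [
        i for i, c in enumerate(coords)
        if i != target and dist(c, coords[target]) <= radius
    ]


def rrc(sequence, coords, target, radius):
    """Composition of the 20 amino acids among the spatial neighbors (assumed RRC)."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for i in spatial_neighbors(coords, target, radius):
        if sequence[i] in counts:
            counts[sequence[i]] += 1
    total = sum(counts.values())
    return {aa: (n / total if total else 0.0) for aa, n in counts.items()}


def rrpc(sequence, coords, target, radius):
    """Composition of unordered residue pairs among the spatial neighbors (assumed RRPC)."""
    counts = {}
    for i, j in combinations(spatial_neighbors(coords, target, radius), 2):
        a, b = sequence[i], sequence[j]
        if a in AMINO_ACIDS and b in AMINO_ACIDS:
            pair = "".join(sorted((a, b)))
            counts[pair] = counts.get(pair, 0) + 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# Toy example: the lysine at index 1 has two residues within 1.5 units.
toy_sequence = "AKGA"
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
toy_rrc = rrc(toy_sequence, toy_coords, target=1, radius=1.5)
toy_rrpc = rrpc(toy_sequence, toy_coords, target=1, radius=1.5)
```

Varying `radius` in such a sketch corresponds to the comparison of different surrounding radii mentioned in the abstract.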


Molecules ◽  
2021 ◽  
Vol 26 (14) ◽  
pp. 4315
Author(s):  
Antonija Erben ◽  
Igor Sviben ◽  
Branka Mihaljević ◽  
Ivo Piantanida ◽  
Nikola Basarić

A series of tripeptides, TrpTrpPhe (1), TrpTrpTyr (2), and TrpTrpTyr[CH2N(CH3)2] (3), was synthesized, and their photophysical properties and non-covalent binding to polynucleotides were investigated. The fluorescent Trp residues (quantum yield in aqueous solvent ΦF = 0.03–0.06) allowed for a fluorometric study of non-covalent binding to DNA and RNA. High and similar affinities of 2×HCl and 3×HCl for all studied double-stranded (ds) polynucleotides were found (logKa = 6.0–6.8). However, the fluorescence spectral responses depended strongly on base pair composition: the GC-containing polynucleotides efficiently quenched Trp emission, in contrast to AT- or AU-polynucleotides, which induced a bisignate response. Namely, addition of AT(U) polynucleotides at excess over the studied peptide induced quenching (attributed to aggregation in the grooves of the polynucleotides), whereas at excess of DNA/RNA over peptide an increase in Trp fluorescence was observed. Thermal denaturation and circular dichroism (CD) experiments supported binding of the peptides within the grooves of the polynucleotides. The photogenerated quinone methide (QM) reacts with nucleophiles to give adducts, as demonstrated by photomethanolysis (quantum yield ΦR = 0.11–0.13). Furthermore, we have demonstrated photoalkylation of AT oligonucleotides by the QM, in contrast to previous reports describing the highest reactivity of QMs with the GC-rich regions of polynucleotides. Our investigations provide a proof of principle that a QM precursor can be embedded into a peptide and used as a photochemical switch to enable alkylation of polynucleotides, opening further applications in chemistry and biology.


2021 ◽  
Vol 15 (1) ◽  
pp. 26-37
Author(s):  
Bin Wang ◽  
Michael S. Thompson ◽  
Kevin M. Adkins

Background: Iron-responsive Elements (IREs) are hairpin structures located in the 5’ or 3’ untranslated region of some animal mRNAs. IREs have a highly conserved terminal loop and a UGC/C or C bulge five bases upstream of the terminal loop, which divides the hairpin stem into an upper stem and a lower stem. Objective: The objective of this study was to investigate the base-pair composition of the upper and lower stems of IREs to determine whether they are highly conserved among mRNAs from different genes. Methods: The mRNA sequences of six 5’IREs and five 3’IREs from several animal species were retrieved from the National Center for Biotechnology Information. The folding free energy of each IRE mRNA sequence was predicted using the RNAfold WebServer. Results: We found that the upper and lower stems of IREs are not highly conserved among the mRNAs of different genes. There are no statistically significant differences in the IRE structures or folding free energies between mammalian and non-mammalian species relative to either the ferritin heavy chain 5’IRE or ferroportin 5’IRE. There are no overall significant differences in the folding free energies between UGC/C-containing 5’IREs and C-bulge-containing 5’IREs, or between 5’IREs and 3’IREs. Conclusion: Further studies are needed to investigate whether the variations in IRE stem composition are responsible for fine-tuning the IRE/Iron-Regulatory Protein interactions among different mRNAs to maintain the balance of cellular iron metabolism, and to identify whether evolutionary processes drive the base-pair composition of the upper and lower stems of IREs toward any particular configuration.
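As a rough illustration of what "base-pair composition of a stem" means computationally, the sketch below counts the base pairs implied by a sequence and its secondary structure in dot-bracket notation (the notation RNAfold reports). The function name `base_pair_composition` and the toy hairpin are hypothetical, and the code does not call RNAfold or split the stem at the bulge; it only tallies pairs from a structure that is already given.

```python
from collections import Counter


def base_pair_composition(sequence, structure):
    """Count base pairs in an RNA hairpin given its dot-bracket structure.

    Each pair is labeled with its two bases in alphabetical order,
    so G-C and C-G are both counted as "CG".
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    open_positions = []  # stack of indices of unmatched "("
    pairs = Counter()
    for i, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(i)
        elif symbol == ")":
            if not open_positions:
                raise ValueError("unbalanced structure: unmatched ')'")
            j = open_positions.pop()
            pairs["".join(sorted((sequence[j], sequence[i])))] += 1
    if open_positions:
        raise ValueError("unbalanced structure: unmatched '('")
    return dict(pairs)


# Toy hairpin (not a real IRE): two G-C pairs and one G-U wobble pair.
toy_pairs = base_pair_composition("GGGAAAUCC", "(((...)))")
```

Applying the same count separately to the positions above and below the bulge would give the upper-stem and lower-stem compositions compared in the study.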


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Kai-Yao Huang ◽  
Fang-Yu Hung ◽  
Hui-Ju Kao ◽  
Hui-Hsuan Lau ◽  
Shun-Long Weng

Background: Protein phosphoglycerylation, the addition of 1,3-bisphosphoglyceric acid (1,3-BPG) to a lysine residue of a protein to form 3-phosphoglyceryl-lysine, is a reversible, non-enzymatic post-translational modification (PTM) that plays a regulatory role in glucose metabolism and glycolysis. As the number of experimentally verified phosphoglycerylated sites has increased significantly, statistical or machine learning methods are needed to investigate the characteristics of phosphoglycerylation sites. Currently, research into phosphoglycerylation is very limited, and only a few resources are available for the computational identification of phosphoglycerylation sites. Results: We present a bioinformatics investigation of phosphoglycerylation sites based on sequence-based features. TwoSampleLogo analysis reveals that the regions surrounding phosphoglycerylation sites contain a relatively high abundance of positively charged amino acids, especially in the upstream flanking region. Additionally, according to PTM-Logo, non-polar and aliphatic amino acids are more abundant around phosphoglycerylated lysines, which may play a functional role in discriminating between phosphoglycerylation and non-phosphoglycerylation sites. Many types of features were used to build the prediction model on the training dataset, including amino acid composition, amino acid pair composition, positional weighted matrix, and position-specific scoring matrix. To improve predictive power, the top features ranked by F-score were combined for classification, and predictive models were trained using decision tree (DT), random forest (RF), and support vector machine (SVM) classifiers. Evaluation by five-fold cross-validation showed that the selected features were the most effective in discriminating between phosphoglycerylated and non-phosphoglycerylated sites. Conclusion: The SVM model trained with the selected sequence-based features performed well, with a sensitivity of 77.5%, a specificity of 73.6%, an accuracy of 74.9%, and a Matthews correlation coefficient of 0.49. The model also performed consistently on an independent testing set, yielding a sensitivity of 75.7% and a specificity of 64.9%. Finally, the model has been implemented as a web-based system, iDPGK, which is freely available at http://mer.hc.mmh.org.tw/iDPGK/.
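For readers unfamiliar with the four measures reported above, the following sketch shows how sensitivity, specificity, accuracy, and the Matthews correlation coefficient (MCC) are derived from a binary confusion matrix. The function name `classification_metrics` and the counts in the example are illustrative assumptions and are not taken from the study.

```python
from math import sqrt


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Made-up counts for illustration only.
example_metrics = classification_metrics(tp=8, fp=4, tn=6, fn=2)
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when positive and negative sites are imbalanced.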


2020 ◽  
Vol 15 ◽  
Author(s):  
Shulin Zhao ◽  
Ying Ju ◽  
Xiucai Ye ◽  
Jun Zhang ◽  
Shuguang Han

Background: Bioluminescence is a unique and significant phenomenon in nature. It is important for the lifecycle of some organisms and is valuable in biomedical research, including gene expression analysis and bioluminescence imaging technology. In recent years, researchers have identified a number of methods for predicting bioluminescent proteins (BLPs), which have increased in accuracy but could be further improved. Method: In this paper, we propose a new bioluminescent protein prediction method based on a voting algorithm. We used four methods of feature extraction based on the amino acid sequence. We extracted 314-dimensional features in total from amino acid composition, physicochemical properties, and k-spacer amino acid pair composition. To obtain the highest MCC value and thereby establish the optimal prediction model, we used a voting algorithm to build the model. To create the best-performing model, we discuss the selection of base classifiers and vote-counting rules. Results: Our proposed model achieved 93.4% accuracy, 93.4% sensitivity, and 91.7% specificity on the test set, which was better than any other method. We also improved a previous prediction of bioluminescent proteins in three lineages using our model-building method, resulting in greatly improved accuracy.
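The k-spacer amino acid pair composition mentioned above can be sketched as follows. This is the commonly used form of the encoding (frequencies of the 400 ordered residue pairs separated by exactly k positions), written under the assumption that the authors use a similar definition; the function name `k_spacer_pair_composition` is hypothetical, and the dimensionality reduction that yields the 314 features is not reproduced here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def k_spacer_pair_composition(sequence, k):
    """Frequencies of ordered residue pairs separated by k intervening residues.

    With k = 0 this reduces to ordinary (adjacent) amino acid pair composition.
    Returns a dict with 400 entries, one per ordered pair such as "AC".
    """
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    windows = 0
    for i in range(len(sequence) - k - 1):
        pair = sequence[i] + sequence[i + k + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            windows += 1
    return {pair: (n / windows if windows else 0.0) for pair, n in counts.items()}


# Toy peptide: adjacent pairs are AC, CA, AC; 1-spaced pairs are AA, CC.
adjacent = k_spacer_pair_composition("ACAC", k=0)
one_spaced = k_spacer_pair_composition("ACAC", k=1)
```

Concatenating the vectors for several values of k gives one fixed-length feature vector per protein, which could then be passed to the base classifiers of a voting ensemble.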


Ibis ◽  
2019 ◽  
Vol 162 (3) ◽  
pp. 613-626
Author(s):  
Romain Pigeault ◽  
Camille‐Sophie Cozzarolo ◽  
Olivier Glaizot ◽  
Philippe Christe

2017 ◽  
Vol 134 ◽  
pp. 183-191 ◽  
Author(s):  
Alessandra Costanzo ◽  
Roberto Ambrosini ◽  
Manuela Caprioli ◽  
Emanuele Gatti ◽  
Marco Parolini ◽  
...  

2016 ◽  
Vol 111 (10) ◽  
pp. 614-624 ◽  
Author(s):  
Vanessa Bellini Bardella ◽  
Sebastián Pita ◽  
André Luis Laforga Vanzela ◽  
Cleber Galvão ◽  
Francisco Panzera
