Bioinformatic Analysis of rbcL Gene in Lilium

2013 ◽  
Vol 395-396 ◽  
pp. 691-696
Author(s):  
Yong Xiong ◽  
Chun Yan Zhao ◽  
Cui Yang

Sequence diversity of the chloroplast rbcL gene was used to analyze Lilium phylogeny, and models of the secondary and tertiary structure of the rbcL protein from Lilium superbum were constructed. Analysis of 1378 bp rbcL sequences from 17 Lilium species revealed 26 variable sites and 12 parsimony-informative sites. The main type of nucleotide variation was base transversion. The main variable regions ran from 560 bp to 852 bp and from 1200 bp to 1371 bp. In a phylogenetic tree constructed with the neighbor-joining (N-J) method, the 17 Lilium species clustered into four groups: the Asian hybrid group, the American hybrid group (two branches), and the longiflorum hybrid group. The longiflorum hybrid group (nine Lilium species) was divided into four subcategories. The predicted secondary structure of the Lilium superbum rbcL protein contained 18 alpha helices, 17 beta sheets and some turns. Hydrophobicity analysis indicated that it is a hydrophilic protein. A 3D model was built by homology modeling with the SWISS-MODEL online server. The scores of most amino acid residues in the 3D conformation of the rbcL protein were positive and within a reasonable range.
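
The site counts reported in this abstract can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned and of equal length, and it skips any column containing a gap or ambiguous base; the function names `classify_sites` and `substitution_type` are hypothetical and are not taken from the paper.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_sites(alignment):
    """Count variable and parsimony-informative columns in an alignment.

    `alignment` is a list of equal-length aligned DNA sequences.
    Assumption: columns with gaps or ambiguous bases are skipped.
    """
    variable = 0
    informative = 0
    for column in zip(*alignment):
        bases = [b.upper() for b in column]
        if any(b not in "ACGT" for b in bases):
            continue
        counts = Counter(bases)
        if len(counts) < 2:
            continue  # constant site
        variable += 1
        # Parsimony-informative: at least two states, each present
        # in at least two sequences.
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return variable, informative


def substitution_type(a, b):
    """Return 'transition' or 'transversion' for two different bases."""
    a, b = a.upper(), b.upper()
    if a == b:
        raise ValueError("bases are identical")
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

On a toy alignment such as `["ACGA", "ACGA", "ATGC", "ATGA"]`, the second column is parsimony-informative (two states, each seen twice), while the fourth is variable only.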

Author(s):  
Roma Chandra

Protein structure prediction is one of the important goals in bioinformatics and biotechnology. Prediction methods cover both the secondary and the tertiary structure of proteins. Secondary structure prediction infers the presence of helices, sheets and coils in a polypeptide chain, whereas tertiary structure prediction infers the three-dimensional structure of a protein. Protein secondary structures are the motifs, represented as patterns, that are predicted from the primary protein sequence in the form of alpha helices, beta strands and coils. Secondary structure prediction is useful because it yields information about the structure and function of an unknown protein sequence. Various secondary structure prediction methods are used to predict helices, sheets and coils, and various prediction tools based on these methods are under study. This study predicts the secondary structure of hemoglobin using several tools. The results give the percentage of amino acids participating in helices, sheets and coils. PHD and DSC produced the best results of all the tools used.
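
The percentage of residues in each state can be computed directly from a per-residue prediction string. The sketch below assumes a three-letter alphabet (H for helix, E for sheet or strand, C for coil), which is a common convention but is not stated in the abstract; `composition` is a hypothetical helper name.

```python
from collections import Counter


def composition(ss):
    """Return the percentage of residues in each of the H, E and C states.

    `ss` is a per-residue secondary structure string such as "HHHEECC".
    Assumption: the three-state H/E/C alphabet is used.
    """
    if not ss:
        raise ValueError("empty secondary structure string")
    counts = Counter(ss.upper())
    total = len(ss)
    return {state: 100.0 * counts.get(state, 0) / total for state in "HEC"}
```

For example, `composition("HHHHEECCCC")` reports 40% helix, 20% sheet and 40% coil.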


2010 ◽  
Vol 08 (05) ◽  
pp. 867-884 ◽  
Author(s):  
YUZHONG ZHAO ◽  
BABAK ALIPANAHI ◽  
SHUAI CHENG LI ◽  
MING LI

Accurate determination of protein secondary structure from chemical shift information is a key step in NMR tertiary structure determination. Relatively little work has been done on this subject. There needs to be a systematic investigation of algorithms that are (a) robust for large datasets; (b) easily extendable to new (dynamic) databases; and (c) approaching the limit of accuracy. We introduce new approaches that use the k-nearest neighbor algorithm for the basic prediction and the BCJR algorithm to smooth the predictions and to combine predictions from chemical shifts with predictions based on sequence information only. Our new system, SUCCES, improves on the accuracy of all existing methods on a large dataset of 805 proteins (86% Q3 accuracy, and 92.6% accuracy when boundary residues are ignored), and it is easily extendable to any new dataset without requiring any new training. The software is publicly available.
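
Q3 accuracy is the percentage of residues whose three-state label is predicted correctly. The sketch below also shows one possible way to ignore boundary residues; the definition used here (any residue adjacent to a change of state in the observed structure) is an assumption for illustration and may differ from the one used by the authors.

```python
def q3_accuracy(predicted, observed, ignore_boundaries=False):
    """Percentage of residues whose three-state label matches.

    Assumption: a "boundary residue" is one whose observed state differs
    from that of its left or right neighbor.
    """
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    n = len(observed)
    scored = 0
    correct = 0
    for i in range(n):
        if ignore_boundaries:
            left_change = i > 0 and observed[i - 1] != observed[i]
            right_change = i < n - 1 and observed[i + 1] != observed[i]
            if left_change or right_change:
                continue  # skip residues at a segment boundary
        scored += 1
        if predicted[i] == observed[i]:
            correct += 1
    if scored == 0:
        raise ValueError("no residues left to score")
    return 100.0 * correct / scored
```

With observed `"HHHHCCCC"` and predicted `"HHHCCCCC"`, the single error sits on a boundary residue, so the score rises from 87.5 to 100.0 when boundaries are ignored.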


Proteins are made up of basic units called amino acids, which are held together by bonds, namely hydrogen and ionic bonds. The way in which the amino acids are arranged has been categorized into two-dimensional and three-dimensional structures. The main advantage of predicting secondary structure is that it yields tertiary structure likelihoods, which are in great demand for continuous detection of proteins. This paper reviews the different methods adopted for predicting protein secondary structure and provides a comparative analysis of the accuracies obtained from various input datasets [1].


Author(s):  
Somasheker Akkaladevi ◽  
Ajay K. Katangur ◽  
Xin Luo

Prediction of protein secondary structure (alpha-helix, beta-sheet, coil) from the primary sequence of amino acids is a very challenging task, and the problem has been approached from several angles. A protein is a sequence of amino acid residues and can thus be considered a one-dimensional chain of ‘beads’, where each bead corresponds to one of the 20 different amino acid residues known to occur in proteins. The length of most protein sequences ranges from 50 residues to about 1000 residues, but longer proteins are also known; e.g., myosin, the major protein of muscle fibers, consists of 1800 residues (Altschul et al. 1997). Many techniques have been used by many researchers to predict protein secondary structure, but the most commonly used technique is the neural network (Qian et al. 1988). This chapter discusses a new method combining profile-based neural networks (Rost et al. 1993b), Simulated Annealing (SA) (Akkaladevi et al. 2005; Simons et al. 1997), Genetic Algorithm (GA) (Akkaladevi et al. 2005) and decision fusion algorithms (Akkaladevi et al. 2005). Researchers used the neural network (Hopfield 1982) combined with the GA and SA algorithms, then applied two decision fusion methods, the committee method and the correlation method, and obtained improved prediction accuracy (Akkaladevi et al. 2005). Sequence profiles of amino acids are fed as input to the profile-based neural network. The two decision fusion methods improved the prediction accuracy, but notably one method worked better for some input sequence profiles and the other method worked better for others (Akkaladevi et al. 2005). Instead of compromising on some of the good solutions that could have been generated by either approach, a combination of the two approaches is used to obtain better prediction accuracy. This criterion is the basis for the Bayesian inference method (Anandalingam et al. 1989; Schmidler et al. 2000; Simons et al. 1997). The results obtained show that the prediction accuracy improves by more than 2% using the combination of the decision fusion approach and the Bayesian inference method.
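
The abstract does not describe how the committee method combines predictors. As a rough illustration of decision fusion in general, the sketch below uses a per-residue majority vote over several three-state predictions; this voting rule, the tie-breaking order and the name `committee_vote` are all assumptions, not the authors' algorithm.

```python
from collections import Counter


def committee_vote(predictions, tie_order="HEC"):
    """Fuse several per-residue predictions by majority vote.

    `predictions` is a list of equal-length strings over the states in
    `tie_order`. Ties are broken by the order of states in `tie_order`
    (an arbitrary choice made for this sketch).
    """
    if not predictions:
        raise ValueError("no predictions supplied")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("predictions must have the same length")
    fused = []
    for column in zip(*predictions):
        if any(state not in tie_order for state in column):
            raise ValueError("unexpected state in predictions")
        counts = Counter(column)
        best = max(counts.values())
        # First state in tie_order that reaches the top vote count.
        winner = next(s for s in tie_order if counts.get(s, 0) == best)
        fused.append(winner)
    return "".join(fused)
```

Given the three predictions `"HHEC"`, `"HEEC"` and `"CHEC"`, the fused result is `"HHEC"`.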


Protein secondary structure (PSS) prediction is one of the most complex problems in biology. PSS is important for determining tertiary structure, for studying protein function, and for drug design. However, experimental approaches to PSS determination are time consuming and difficult to implement, so it is essential to establish effective computational methods for predicting structure from protein sequence. Prediction accuracy has recently improved owing to the rapid expansion of protein sequence data and the development of deep learning libraries. This research proposes a deep recurrent network method, stacked bidirectional long short-term memory (Stacked BLSTM) units, to predict 3-class protein secondary structure from protein sequence information. The output of the Stacked BLSTM is evaluated on publicly available datasets from the RCSB server. The study indicates that the method performs better than the latest methods on the standard public dataset. The accuracy achieved is more than 89%.


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Qi Zhang ◽  
Jianwei Zhu ◽  
Fusong Ju ◽  
Lupeng Kong ◽  
Shiwei Sun ◽  
...  

Background: The formation of contacts among protein secondary structure elements (SSEs) is an important step in protein folding, as it determines the topology of the protein tertiary structure; hence, inferring inter-SSE contacts is crucial to protein structure prediction. One existing strategy infers inter-SSE contacts directly from the predicted probabilities of inter-residue contacts without any preprocessing, and thus suffers from the excessive noise in the predicted inter-residue contacts. Another strategy first defines SSEs based on protein secondary structure prediction, and then judges whether each candidate SSE pair could form a contact. However, it is difficult to determine the boundaries of SSEs accurately because of errors in secondary structure prediction, and incorrectly deduced SSEs hinder the subsequent prediction of the contacts among them. Results: We report an accurate approach to inferring inter-SSE contacts (called ISSEC) using the deep object detection technique. The design of ISSEC is based on the observation that, in the inter-residue contact map, contacting SSEs usually form rectangular regions with characteristic patterns. ISSEC therefore infers inter-SSE contacts by detecting such rectangular regions. Unlike the existing approach that directly uses the predicted probabilities of inter-residue contacts, ISSEC applies the deep convolution technique to extract high-level features from the inter-residue contacts. More importantly, ISSEC does not rely on pre-defined SSEs. Instead, ISSEC enumerates multiple candidate rectangular regions in the predicted inter-residue contact map and, for each region, calculates a confidence score that measures whether it has the characteristic patterns. ISSEC employs a greedy strategy to select non-overlapping regions with high confidence scores, and finally infers inter-SSE contacts from these regions. Conclusions: Comprehensive experimental results suggest that ISSEC outperforms state-of-the-art approaches in predicting inter-SSE contacts. We further demonstrate the successful application of ISSEC to improving the prediction of both inter-residue contacts and tertiary structure.
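
The greedy selection of non-overlapping, high-confidence regions described in this abstract can be sketched as follows. The `Region` type, the inclusive coordinate convention and the `min_score` threshold are assumptions made for illustration; the paper's actual scoring and selection details are not given in the abstract.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """A candidate rectangle in the contact map (inclusive bounds)."""
    row_start: int
    row_end: int
    col_start: int
    col_end: int
    score: float


def overlaps(a: Region, b: Region) -> bool:
    """True if the two rectangles share at least one cell."""
    rows = a.row_start <= b.row_end and b.row_start <= a.row_end
    cols = a.col_start <= b.col_end and b.col_start <= a.col_end
    return rows and cols


def greedy_select(candidates: List[Region], min_score: float = 0.5) -> List[Region]:
    """Pick regions in descending score order, skipping any that overlap
    a region already selected. `min_score` is a hypothetical cutoff."""
    selected: List[Region] = []
    for region in sorted(candidates, key=lambda r: r.score, reverse=True):
        if region.score < min_score:
            break  # remaining candidates score even lower
        if all(not overlaps(region, kept) for kept in selected):
            selected.append(region)
    return selected
```

Each selected rectangle would then be read as one inferred inter-SSE contact, with its row and column ranges giving the residue spans of the two elements.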


Author(s):  
R Ruslin ◽  
Suci Rahmawati Putri ◽  
Muhammad Arba

Receptor tyrosine kinase-like orphan receptor 1 (ROR-1) is a transmembrane protein of 937 amino acid residues whose gene is located on chromosome 1p31.3. ROR-1 plays an important role in the leukemogenesis of CLL cells. Co-expression of ROR-1 and TCL enhances leukemogenesis and causes AKT phosphorylation, cell proliferation, and resistance to apoptosis. This study carried out physicochemical characterization and secondary and tertiary structure prediction of the ROR-1 protein using several bioinformatics tools. The physicochemical analysis was done with the ExPASy ProtParam server, and the secondary structure was predicted with SOPMA and PSIPRED. The tertiary structure of ROR-1 was predicted with Modeller 9.19, yielding five predicted models. The accuracy of the results was then evaluated by Ramachandran plot analysis, which showed that Model 3 is the best model, with a percentage of 82%.
Keywords: ROR-1, Chronic Lymphocytic Leukemia (CLL), homology modeling, Modeller, Ramachandran plot


1989 ◽  
Vol 259 (1) ◽  
pp. 145-152 ◽  
Author(s):  
E A MacGregor ◽  
B Svensson

Predictions of protein secondary structure are used with amino acid sequence alignments to show that the N-terminal domains of cyclodextrin glucanotransferases and a yeast alpha-glucosidase may have the same super-secondary structure as alpha-amylases, i.e. an (alpha/beta)8-barrel fold. Sequence similarities provide evidence that glucanotransferases, and possibly the glucosidase, are, like alpha-amylases, Ca2+-containing enzymes. The relationship between substrate specificity and the nature of the amino acid residues proposed at the active site is discussed for the transferases and alpha-glucosidase. A set of three programs for an Apple IIe computer to carry out the calculations described by Garnier, Osguthorpe & Robson [(1978) J. Mol. Biol. 120, 97-120] and a set of four programs for an Apple IIe computer to carry out the calculations described by Levin, Robson & Garnier [(1986) FEBS Lett. 205, 303-308] have been deposited as Supplementary Publication SUP 50149 (25 pages) at the British Library Document Supply Centre, Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1989) 257, 5.


Author(s):  
JAYAVARDHANA GUBBI ◽  
DANIEL T. H. LAI ◽  
MARIMUTHU PALANISWAMI ◽  
MICHAEL PARKER

Knowledge of the secondary structure and solvent accessibility of a protein plays a vital role in the prediction of its fold, and eventually of its tertiary structure. The challenging problem of predicting protein secondary structure from sequence alone is addressed. Support vector machines (SVMs) are employed for the classification, and the SVM outputs are converted to posterior probabilities for multi-class classification. The effect of using Chou–Fasman parameters and physico-chemical parameters along with evolutionary information in the form of a position-specific scoring matrix (PSSM) is analyzed. The proposed methods are tested on the RS126 and CB513 datasets. A new dataset (PSS504) is curated using a recent release of CATH. On the CB513 dataset, a sevenfold cross-validation accuracy of 77.9% was obtained using the proposed encoding method. A new method of calculating the reliability index, based on the number of votes and the SVM decision value, is also proposed. A blind test on the EVA dataset gives an average Q3 accuracy of 74.5% and ranks among the top five protein structure prediction methods. Supplementary material, including datasets, is available.


Author(s):  
Gururaj Tejeshwar ◽  
Siddesh Gaddadadevra Mat

Introduction: The primary structure of a protein is a polypeptide chain made up of a sequence of amino acids. Interactions between the atoms of the backbone produce locally folded structures within the polypeptide, which constitute the secondary structure. Sequence alignments can be made more accurate by including secondary structure information. Objective: It is difficult to identify the sequence information embedded in the secondary structure of a protein. However, deep learning methods can be used to identify the sequence information in protein structures. Methods: The scope of the proposed work is to increase the accuracy of identifying the sequence information in the primary structure and the tertiary structure, thereby increasing the accuracy of the predicted protein secondary structure (PSS). In this work, homology is eliminated by a Recurrent Neural Network (RNN) based network that consists of three layers: a bidirectional Long Short-Term Memory (LSTM) layer, a time-distributed layer and a softmax layer. Results: The proposed LDS model achieves an accuracy of approximately 86% for the prediction of the three-state secondary structure of the protein. Conclusion: The gap between the number of known protein primary structures and the number of known secondary structures is huge and increasing. Machine learning is trying to reduce this gap. In most previous attempts to predict the secondary structure of proteins, the data are divided according to the homology of the proteins. This limits the efficiency of the prediction model and limits the inputs that can be given to such models. Hence, in our model, homology was not considered while collecting the data for training or testing. As a result, our model is not affected by the homology of the protein fed to it, which removes that restriction, so any protein can be fed to it.

