A Structurally-Validated Multiple Sequence Alignment of 497 Human Protein Kinase Domains


2019. DOI: 10.1101/776740. Cited by ~1.

Author(s): Vivek Modi, Roland L. Dunbrack

Keyword(s): Protein Kinase, Secondary Structure, Sequence Alignment, Multiple Sequence Alignment, Biological Properties, Human Protein, Multiple Sequence, Calcium Calmodulin Dependent, Human Protein Kinase, Key Factor

Abstract: Studies on the structures and functions of individual kinases have been used to understand the biological properties of other kinases that do not yet have experimental structures. The key factor in accurate inference by homology is an accurate sequence alignment. We present a parsimonious, structure-based multiple sequence alignment (MSA) of 497 human protein kinase domains excluding atypical kinases, even those with related but somewhat different folds. The alignment is arranged in 17 blocks of conserved regions and unaligned blocks in between that contain insertions of varying lengths present in only a subset of kinases. The aligned blocks contain well-conserved elements of secondary structure and well-known functional motifs, such as the DFG and HRD motifs. From pairwise, all-against-all alignment of 272 human kinase structures, we estimate the accuracy of our MSA to be 97%. The remaining inaccuracy comes from a few structures with shifted elements of secondary structure, and from the boundaries of aligned and unaligned regions, where compromises need to be made to encompass the majority of kinases. A new phylogeny of the protein kinase domains in the human genome based on our alignment indicates that ten kinases previously labeled as “OTHER” can be confidently placed into the CAMK group. These kinases comprise the Aurora kinases, Polo kinases, and calcium/calmodulin-dependent kinase kinases.
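The abstract estimates MSA accuracy by comparing the alignment against pairwise structure alignments. One way such a comparison could be scored is sketched below, as an illustration only and not the authors' procedure. It assumes both alignments are given as gapped strings using `-` for gaps, and it counts the fraction of residue pairs in a reference pairwise alignment that the MSA also aligns. The function names `aligned_pairs` and `msa_agreement` are hypothetical.

```python
def aligned_pairs(row_a: str, row_b: str) -> set[tuple[int, int]]:
    """Return the (i, j) residue-index pairs that two gapped rows align.

    Indices are 0-based positions in the ungapped sequences.
    Assumption: '-' is the only gap character.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    pairs = set()
    i = j = 0  # current positions in the ungapped sequences
    for ca, cb in zip(row_a, row_b):
        a_is_residue = ca != "-"
        b_is_residue = cb != "-"
        if a_is_residue and b_is_residue:
            pairs.add((i, j))
        i += a_is_residue
        j += b_is_residue
    return pairs


def msa_agreement(msa_a: str, msa_b: str, ref_a: str, ref_b: str) -> float:
    """Fraction of reference-aligned residue pairs that the MSA rows reproduce."""
    reference = aligned_pairs(ref_a, ref_b)
    if not reference:
        raise ValueError("reference alignment has no aligned residue pairs")
    implied = aligned_pairs(msa_a, msa_b)
    return len(reference & implied) / len(reference)


# Toy example: the same two sequences (MKVLA and MKQVA) aligned two ways.
score = msa_agreement("MK-VLA", "MKQV-A", "MKVLA", "MKQVA")
print(f"agreement = {score:.2f}")
```

In a real comparison, the unaligned insertion blocks described in the abstract would presumably need to be excluded from the MSA rows before scoring; this sketch treats every non-gap column as aligned.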


Faculty Opinions recommendation of A Structurally-Validated Multiple Sequence Alignment of 497 Human Protein Kinase Domains.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature, 2020. DOI: 10.3410/f.737130960.793574510.

Author(s): Jane Endicott, Natalie Tatum

Keyword(s): Protein Kinase, Sequence Alignment, Multiple Sequence Alignment, Human Protein, Multiple Sequence, Human Protein Kinase


Real value prediction of solvent accessibility in proteins using multiple sequence alignment and secondary structure

Proteins Structure Function and Bioinformatics, 2005, Vol 61 (2), pp. 318-324. DOI: 10.1002/prot.20630. Cited by ~66.

Author(s): Aarti Garg, Harpreet Kaur, G.P.S. Raghava

Keyword(s): Secondary Structure, Sequence Alignment, Multiple Sequence Alignment, Solvent Accessibility, Multiple Sequence, Value Prediction, Real Value


PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information

Nucleic Acids Research, 2005, Vol 33 (Web Server), pp. W289-W294. DOI: 10.1093/nar/gki390. Cited by ~300.

Author(s): V. A. Simossis, J. Heringa

Keyword(s): Secondary Structure, Sequence Alignment, Multiple Sequence Alignment, Secondary Structure Information, Multiple Sequence, Structure Information


[29] Identification of functional residues and secondary structure from protein multiple sequence alignment

Methods in Enzymology - Computer Methods for Macromolecular Sequence Analysis, 1996, pp. 497-512. DOI: 10.1016/s0076-6879(96)66031-5. Cited by ~41.

Author(s): Craig D. Livingstone, Geoffrey J. Barton

Keyword(s): Secondary Structure, Sequence Alignment, Multiple Sequence Alignment, Multiple Sequence, Protein Multiple Sequence Alignment


Integrating Protein Secondary Structure Prediction and Multiple Sequence Alignment

Current Protein and Peptide Science, 2004, Vol 5 (4), pp. 249-266. DOI: 10.2174/1389203043379675. Cited by ~35.

Author(s): V. Simossis, J. Heringa

Keyword(s): Secondary Structure, Sequence Alignment, Multiple Sequence Alignment, Structure Prediction, Secondary Structure Prediction, Protein Secondary Structure, Protein Secondary Structure Prediction, Multiple Sequence


A Novel Comparative Sequence Analysis Method for ncRNA Secondary Structure Prediction without Multiple Sequence Alignment

2008 Fourth International Conference on Natural Computation, 2008. DOI: 10.1109/icnc.2008.446.

Author(s): Quan Zou, Mao-Zu Guo, Yang Liu, Zhi-An Xing

Keyword(s): Sequence Analysis, Secondary Structure, Sequence Alignment, Multiple Sequence Alignment, Structure Prediction, Secondary Structure Prediction, Comparative Sequence Analysis, Analysis Method, Multiple Sequence, Comparative Sequence


Mix'n'Match: an improved multiple sequence alignment procedure for distantly related proteins using secondary structure predictions, designed to be independent of the choice of gap penalty and scoring matrix

Protein Engineering Design and Selection, 1993, Vol 6 (7), pp. 683-690. DOI: 10.1093/protein/6.7.683. Cited by ~7.

Author(s): Lachlan H. Bell, John R. Coggins, E. James Milner-White

Keyword(s): Secondary Structure, Sequence Alignment, Multiple Sequence Alignment, Multiple Sequence, Alignment Procedure, Secondary Structure Predictions, Scoring Matrix, Related Proteins


Application of multiple sequence alignment profiles to improve protein secondary structure prediction

Proteins Structure Function and Bioinformatics, 2000, Vol 40 (3), pp. 502-511. DOI: 10.1002/1097-0134(20000815)40:3<502::aid-prot170>3.0.co;2-q. Cited by ~484.

Author(s): James A. Cuff, Geoffrey J. Barton

Keyword(s): Secondary Structure, Sequence Alignment, Multiple Sequence Alignment, Structure Prediction, Secondary Structure Prediction, Protein Secondary Structure, Protein Secondary Structure Prediction, Multiple Sequence


Predicting Consensus Structures for RNA Alignments via Pseudo-Energy Minimization

Bioinformatics and Biology Insights, 2009, Vol 3, pp. BBI.S2578. DOI: 10.4137/bbi.s2578. Cited by ~8.

Author(s): Junilda Spirollari, Jason T.L. Wang, Kaizhong Zhang, Vivian Bellofatto, Yongkyu Park, ...

Keyword(s): Free Energy, Secondary Structure, Sequence Alignment, Multiple Sequence Alignment, Energy Minimization, Secondary Structure Prediction, Sequence Alignments, RNA Sequences, Multiple Sequence, Consensus Secondary Structure

Thermodynamic processes with free energy parameters are often used in algorithms that solve the free energy minimization problem to predict secondary structures of single RNA sequences. While results from these algorithms are promising, an observation is that single sequence-based methods have moderate accuracy and more information is needed to improve on RNA secondary structure prediction, such as covariance scores obtained from multiple sequence alignments. We present in this paper a new approach to predicting the consensus secondary structure of a set of aligned RNA sequences via pseudo-energy minimization. Our tool, called RSpredict, takes into account sequence covariation and employs effective heuristics for accuracy improvement. RSpredict accepts, as input data, a multiple sequence alignment in FASTA or ClustalW format and outputs the consensus secondary structure of the input sequences in both the Vienna style Dot Bracket format and the Connectivity Table format. Our method was compared with some widely used tools including KNetFold, Pfold and RNAalifold. A comprehensive test on different datasets including Rfam sequence alignments and a multiple sequence alignment obtained from our study on the Drosophila X chromosome reveals that RSpredict is competitive with the existing tools on the tested datasets. RSpredict is freely available online as a web server and also as a jar file for download at http://datalab.njit.edu/biology/RSpredict .
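The abstract names two output formats, Vienna-style dot-bracket notation and the connectivity table (CT). As a rough illustration of how the two relate, the sketch below converts a dot-bracket string into a six-column CT listing. It is not RSpredict code: the function names are hypothetical, only `(`, `)` and `.` are handled (no pseudoknot brackets), and the whitespace layout of real CT files varies between tools.

```python
def dot_bracket_to_partners(structure: str) -> list[int]:
    """Return, for each position, the 1-based index of its partner (0 if unpaired)."""
    partners = [0] * len(structure)
    open_positions = []  # stack of 1-based positions of unmatched '('
    for pos, symbol in enumerate(structure, start=1):
        if symbol == "(":
            open_positions.append(pos)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {pos}")
            opener = open_positions.pop()
            partners[opener - 1] = pos
            partners[pos - 1] = opener
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {pos}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return partners


def to_connectivity_table(sequence: str, structure: str, title: str = "example") -> str:
    """Render a CT listing: index, base, previous, next, partner, index."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    partners = dot_bracket_to_partners(structure)
    n = len(sequence)
    lines = [f"{n} {title}"]  # header: sequence length and a free-text title
    for idx, (base, mate) in enumerate(zip(sequence, partners), start=1):
        next_idx = idx + 1 if idx < n else 0  # 0 marks the end of the chain
        lines.append(f"{idx} {base} {idx - 1} {next_idx} {mate} {idx}")
    return "\n".join(lines)


print(to_connectivity_table("GCAAGC", "((..))"))
```

The stack-based matching works because dot-bracket notation without pseudoknots is properly nested, so each closing bracket pairs with the most recent unmatched opening bracket.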
