Phylogenetic, sequence and structural analysis of insulin superfamily proteins reveals an indelible link between evolution and structure-function relationship
Abstract
The insulin superfamily proteins (ISPs), in particular insulin, IGFs and relaxins, are key modulators of animal physiology. They are known to have evolved from the same ancestral gene and have diverged into proteins with varied sequences and distinct functions, while maintaining a similar structural architecture stabilized by highly conserved disulphide bridges. A recent surge in sequence data and structures for these proteins prompted the need for a comprehensive analysis that connects the evolution of these sequences with the available functional and structural information and with their interactions with cognate receptors. Among other findings, this study reveals (a) unusually high sequence conservation of IGFs (>90%), which has not previously been reported; notably, the functional domains (excluding the signal peptide) of human, horse, pig and Ord’s kangaroo rat are 100% identical; (b) an updated definition of the signature motif of the relaxin family; and (c) a non-canonical C-peptide cleavage site in a few killifish insulin sequences. We also provide a structure-based rationale for such conservation by introducing the concept of binding-partner-imposed evolutionary constraints. The high conservation of IGFs appears to represent a classic case of resistance to sequence diversity exerted by physiologically important interactions with multiple partners. Finally, we propose a probable mechanism for C-peptide cleavage in killifish insulin sequences.