Can Power Laws Help Us Understand Gene and Proteome Information?

2013 ◽  
Vol 2013 ◽  
pp. 1-10 ◽  
Author(s):  
J. A. Tenreiro Machado ◽  
António C. Costa ◽  
Maria Dulce Quelhas

Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After the amino acids are linked during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any other prior assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed-length amino acid words (tuples). Once the initial relative frequency histograms have been created, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and the results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains like protein sequence/proteome analysis.
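The first step described above, a relative frequency histogram of fixed-length amino acid words, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `tuple_histogram` is hypothetical, and the use of overlapping windows advanced one residue at a time is an assumption, since the abstract does not say how the tuples are extracted.

```python
from collections import Counter


def tuple_histogram(sequence: str, k: int) -> dict[str, float]:
    """Relative frequency of each k-letter word found in `sequence`.

    Assumption: words are taken from overlapping windows with step 1.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    # Every window of length k, starting at each position of the sequence.
    words = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    total = len(words)
    if total == 0:
        return {}
    counts = Counter(words)
    # Divide each count by the number of windows to obtain relative frequencies.
    return {word: count / total for word, count in counts.items()}
```

For a whole proteome, the same counts would be accumulated over every protein sequence before normalizing; sorting the resulting frequencies by rank is one common way to inspect them for power-law behaviour.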

2004 ◽  
Vol 380 (1) ◽  
pp. 211-218 ◽  
Author(s):  
Chi-Wah TSEUNG ◽  
Laura G. McMAHON ◽  
Jorge VÁZQUEZ ◽  
Jan POHL ◽  
Jesse F. GREGORY

We have previously identified and purified a novel β-glucosidase, designated PNGH (pyridoxine-5´-β-d-glucoside hydrolase), from the cytosolic fraction of pig intestinal mucosa. PNGH catalyses the hydrolysis of PNG (pyridoxine-5´-β-d-glucoside), a plant derivative of vitamin B6 that exhibits partial nutritional bioavailability in humans and animals. Preliminary amino acid sequence analysis indicated regions of close similarity of PNGH to the precursor form of LPH (lactase–phlorizin hydrolase), the β-glucosidase localized to the brush-border membrane. We report in the present study amino acid sequence data for PNGH and results of Northern blot analyses, upon which we propose a common genomic origin of PNGH and LPH. Internal Edman sequencing of the PNGH band isolated by SDS/PAGE yielded data for 16 peptides, averaging 10.8 amino acids in length. These peptides from PNGH (approx. 140 kDa) were highly similar to sequences existing over most of the length of the >200 kDa precursor of rabbit LPH; however, we found no PNGH sequences that corresponded to approx. 350 amino acids between positions 463 and 812 of the LPH precursor, a region encoded by exon 7 of the LPH precursor gene (amino acids 568–784), and no sequences that corresponded to regions near the N-terminus. MS analysis of tryptic peptides yielded 25 peptides, averaging 15 amino acids, with masses that matched segments of the rabbit LPH precursor. Northern blot analysis of pig and human small intestinal polyadenylated mRNA using a non-specific LPH cDNA probe showed an expected approx. 6 kb transcript of the LPH precursor, but also an approx. 4 kb transcript that was consistent with the size predicted from the PNGH protein mass. Using a probe specific to the region encoded by exon 7, hybridization occurred only with the 6 kb transcript. Based on these observations, we propose that both PNGH and LPH enzymes have the same genomic origin, but differ in transcriptional and, possibly, post-translational processing.


1967 ◽  
Vol 167 (1009) ◽  
pp. 331-347 ◽  

Genes are made of nucleic acid. Enzymes are made of protein. The amino acid sequence of a particular protein is synthesized under instruction from a particular piece of nucleic acid. Each protein is made of one or more polypeptide chains, synthesized by condensing together amino acids, head to tail, with the elimination of water. A typical polypeptide chain is several hundred amino acid residues long. Nevertheless only twenty different kinds of amino acids are commonly found in proteins. This standard set of twenty is the same throughout nature. Nucleic acid is made of polynucleotide chains. The repeating unit of the chain is a sugar (ribose for RNA, deoxyribose for DNA) connected to a phosphate. A base is joined on to each sugar. There are four common bases in nucleic acid. DNA usually has adenine, guanine, cytosine and thymine. In RNA thymine is replaced by uracil.


1976 ◽  
Vol 54 (10) ◽  
pp. 902-914 ◽  
Author(s):  
Anne Cunningham ◽  
Hsin-Min Wang ◽  
Stephen R. Jones ◽  
Alexander Kurosky ◽  
Leticia Rao ◽  
...  

The digest of penicillopepsin (EC 3.4.23.7) with protease II from Myxobacter AL-1 gave five fragments which were separated on a Biogel P-100 column in 70% formic acid. The fragments were from 16 to 125 amino acids long. Two fragments were also isolated from a digest with a protease from Staphylococcus aureus. The analysis of these fragments by automatic sequencer gave a number of overlaps of the chymotryptic and thermolytic peptides. The available amino acid sequence data for penicillopepsin described in this paper and the accompanying papers (Kurosky, A. & Hofmann, T.: Can. J. Biochem. 54, 872 (1976); Rao, L. & Hofmann, T.: Can. J. Biochem. 54, 885 (1976); Harris, C. I., Rao, L., Shutsa, P., Kurosky, A. & Hofmann, T.: Can. J. Biochem. 54, 895 (1976)) have been combined and yield 15 fragments which range in length from 3 to 112 amino acid residues. These unique fragments account for virtually all the amino acids of the fungal protease. Four of the fragments with a total of 194 residues (about 60% of the molecule) have been aligned with corresponding sections of pig pepsin (EC 3.4.23.1) and with part of the N-terminal sequence available for calf chymosin (EC 3.4.23.4). In the alignments about 37% of the residues in the fungal enzyme are identical with at least one of the mammalian enzymes. An additional 20% are chemically similar. These results, together with previously reported active-site directed modifications, show conclusively that penicillopepsin is an evolutionary homologue of the mammalian acid proteases.


1979 ◽  
Vol 42 (05) ◽  
pp. 1652-1660 ◽  
Author(s):  
Francis J Morgan ◽  
Geoffrey S Begg ◽  
Colin N Chesterman

Summary: The amino acid sequence of the subunit of human platelet factor 4 has been determined. Human platelet factor 4 consists of identical subunits containing 70 amino acids, each with a molecular weight of 7,756. The molecule contains no methionine, phenylalanine or tryptophan. The proposed amino acid sequence of PF4 is: Glu-Ala-Glu-Glu-Asp-Gly-Asp-Leu-Gln-Cys-Leu-Cys-Val-Lys-Thr-Thr-Ser-Gln-Val-Arg-Pro-Arg-His-Ile-Thr-Ser-Leu-Glu-Val-Ile-Lys-Ala-Gly-Pro-His-Cys-Pro-Thr-Ala-Gln-Leu-Ile-Ala-Thr-Leu-Lys-Asn-Gly-Arg-Lys-Ile-Cys-Leu-Asp-Leu-Gln-Ala-Pro-Leu-Tyr-Lys-Lys-Ile-Ile-Lys-Lys-Leu-Leu-Glu-Ser. From consideration of the homology with β-thromboglobulin, disulphide bonds between residues 10 and 36 and between residues 12 and 52 can be inferred.
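The composition claims in this summary can be checked directly against the printed sequence. The short sketch below is illustrative only: the variable names are hypothetical, and residue 40 is read as Gln. It counts the residues, confirms that methionine, phenylalanine and tryptophan are absent, and lists the cysteine positions that would be available for the inferred disulphide bonds.

```python
# PF4 subunit sequence as given in the summary, ten residues per line.
PF4 = (
    "Glu-Ala-Glu-Glu-Asp-Gly-Asp-Leu-Gln-Cys-"
    "Leu-Cys-Val-Lys-Thr-Thr-Ser-Gln-Val-Arg-"
    "Pro-Arg-His-Ile-Thr-Ser-Leu-Glu-Val-Ile-"
    "Lys-Ala-Gly-Pro-His-Cys-Pro-Thr-Ala-Gln-"
    "Leu-Ile-Ala-Thr-Leu-Lys-Asn-Gly-Arg-Lys-"
    "Ile-Cys-Leu-Asp-Leu-Gln-Ala-Pro-Leu-Tyr-"
    "Lys-Lys-Ile-Ile-Lys-Lys-Leu-Leu-Glu-Ser"
)

residues = PF4.split("-")

# Residues the summary says are missing from the molecule.
absent_claimed = {"Met", "Phe", "Trp"}
found_but_claimed_absent = absent_claimed & set(residues)

# 1-based positions of cysteine, the only residues able to form disulphide bonds.
cys_positions = [i for i, r in enumerate(residues, start=1) if r == "Cys"]

print(len(residues), sorted(found_but_claimed_absent), cys_positions)
```

The four cysteine positions found this way are exactly the ones paired in the inferred bonds (10 with 36, 12 with 52).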


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Pablo Mier ◽  
Miguel A. Andrade-Navarro

Abstract: According to the amino acid composition of natural proteins, it could be expected that all possible sequences of three or four amino acids will occur at least once in large protein datasets purely by chance. However, in some species or cellular contexts, specific short amino acid motifs are missing for unknown reasons. We describe these as Avoided Motifs, short amino acid combinations missing from biological sequences. Here we identify 209 human and 154 bacterial Avoided Motifs of length four amino acids, and discuss their possible functionality according to their presence in other species. Furthermore, we determine two Avoided Motifs of length three amino acids in human proteins specifically located in the cytoplasm, and two more in secreted proteins. Our results support the hypothesis that the characterization of Avoided Motifs in particular contexts can provide us with information about functional motifs, pointing to a new approach in the use of molecular sequences for the discovery of protein function.
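The idea of an Avoided Motif can be illustrated by enumerating every possible word of a given length and subtracting the words actually observed in a dataset. The sketch below is a simplified reading of the abstract, not the authors' pipeline: the function name `avoided_motifs` is hypothetical, only the 20 standard one-letter codes are considered, and no statistical filtering is applied.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def avoided_motifs(proteins, k):
    """Return the set of length-k words that never occur in `proteins`."""
    observed = set()
    for seq in proteins:
        # Collect every overlapping window of length k in this protein.
        for i in range(len(seq) - k + 1):
            observed.add(seq[i:i + k])
    # All 20**k possible words over the standard alphabet.
    possible = {"".join(p) for p in product(AMINO_ACIDS, repeat=k)}
    return possible - observed
```

For k = 4 the search space is 20^4 = 160,000 words, which is small enough to hold in memory. On a toy input almost every word is missing, so the result only becomes informative on proteome-scale datasets.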


Amino Acids ◽  
2020 ◽  
Author(s):  
Thomas L. Williams ◽  
Debra J. Iskandar ◽  
Alexander R. Nödling ◽  
Yurong Tan ◽  
Louis Y. P. Luk ◽  
...  

Abstract: Genetic code expansion is a powerful technique for site-specific incorporation of an unnatural amino acid into a protein of interest. This technique relies on an orthogonal aminoacyl-tRNA synthetase/tRNA pair and has enabled incorporation of over 100 different unnatural amino acids into ribosomally synthesized proteins in cells. Pyrrolysyl-tRNA synthetase (PylRS) and its cognate tRNA from Methanosarcina species are arguably the most widely used orthogonal pair. Here, we investigated whether the beneficial effect on unnatural amino acid incorporation caused by N-terminal mutations in PylRS of one species is transferable to PylRS of another species. It was shown that conserved mutations in the N-terminal domain of MmPylRS improved the unnatural amino acid incorporation efficiency up to fivefold. As MbPylRS shares high sequence identity with MmPylRS, and the two homologs are often used interchangeably, we examined incorporation of five unnatural amino acids by four MbPylRS variants at two temperatures. Our results indicate that the beneficial N-terminal mutations in MmPylRS did not improve unnatural amino acid incorporation efficiency by MbPylRS. Knowledge from this work contributes to our understanding of PylRS homologs, which is needed to improve the technique of genetic code expansion in the future.


2015 ◽  
Vol 24 (4) ◽  
pp. 197-205
Author(s):  
Dwi Wulandari ◽  
Lisnawati Rachmadi ◽  
Tjahjani M. Sudiro

Background: E6 and E7 are oncoproteins of HPV16. Natural amino acid variation in HPV16 E6 can alter its carcinogenic potential. The aim of this study was to analyze phylogenetically the E6 and E7 genes and proteins of HPV16 from Indonesia and predict the effects of single amino acid substitutions on protein function. This analysis could be used to reduce time, effort, and research cost as an initial screening in the selection of proteins or isolates to be tested in vitro or in vivo. Methods: In this study, E6 and E7 gene sequences were obtained from 12 samples of Indonesian isolates, which were compared with HPV16R (prototype) and 6 standard isolates in the category of European (E), Asian (As), Asian-American (AA), African-1 (Af-1), African-2 (Af-2), and North American (NA) branch from Genbank. Bioedit v.7.0.0 was used to analyze the composition and substitution of single amino acids. Phylogenetic analysis of E6 and E7 genes and proteins was performed using Clustal X (1.81) and NJPLOT software. Effects of single amino acid substitutions on protein function of E6 and E7 were analysed by SNAP. Results: Java variants and isolate ui66* belonged to the European branch, while the others belonged to the Asian and African branches. Twelve amino acid changes were found in E6 and one in E7 proteins. SNAP analysis showed two non-neutral mutations, i.e. R10I and C63G in E6 proteins. R10I mutations were found in the Af-2 genotype (AF472509) and Indonesian isolates (Af2*), while the C63G mutation was found only in Af2*. Conclusion: E6 proteins of HPV16 variants were more variable than E7. SNAP analysis showed that only the E6 protein of the African-2 branch had functional differences compared to HPV16R.


1964 ◽  
Vol 42 (6) ◽  
pp. 755-762 ◽  
Author(s):  
David B. Smith

An outline of present ideas concerning the arrangement, folding, and chemistry of the polypeptide chains of hemoglobin is given with some references to present knowledge of myoglobin. New material includes a partial amino acid sequence of the β-chain of horse hemoglobin, details concerning the amino acids lining the heme pocket of horse hemoglobin, and the effects of carboxypeptidases A and B on horse oxy- and horse deoxy-hemoglobin. The kinetics of the latter reactions are not simple. The C-terminal amino acids are released more rapidly from the oxygenated form.


1973 ◽  
Vol 131 (3) ◽  
pp. 485-498 ◽  
Author(s):  
R. P. Ambler ◽  
Margaret Wynn

The amino acid sequences of the cytochromes c-551 from three species of Pseudomonas have been determined. Each resembles the protein from Pseudomonas strain P6009 (now known to be Pseudomonas aeruginosa, not Pseudomonas fluorescens) in containing 82 amino acids in a single peptide chain, with a haem group covalently attached to cysteine residues 12 and 15. In all four sequences 43 residues are identical. Although by bacteriological criteria the organisms are closely related, the differences between pairs of sequences range from 22% to 39%. These values should be compared with the differences in the sequence of mitochondrial cytochrome c between mammals and amphibians (about 18%) or between mammals and insects (about 33%). Detailed evidence for the amino acid sequences of the proteins has been deposited as Supplementary Publication SUP 50015 at the National Lending Library for Science and Technology, Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1973), 131, 5.
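The pairwise differences quoted above are simply the share of aligned positions at which two sequences disagree. The sketch below shows that calculation under stated assumptions: the function name `percent_difference` is hypothetical, and the inputs are taken to be already aligned, gap-free, equal-length sequences in one-letter code, which is plausible here because all four proteins contain 82 residues. The cytochrome sequences themselves are not reproduced in the abstract, so none are included.

```python
def percent_difference(a: str, b: str) -> float:
    """Percentage of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    # Count positions where the residues are not identical.
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)
```

For a chain of 82 residues, the reported range of 22% to 39% corresponds to roughly 18 to 32 differing positions per pair.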


1987 ◽  
Vol 105 (3) ◽  
pp. 1183-1190 ◽  
Author(s):  
W S Argraves ◽  
S Suzuki ◽  
H Arai ◽  
K Thompson ◽  
M D Pierschbacher ◽  
...  

The amino acid sequence deduced from cDNA of the human placental fibronectin receptor is reported. The receptor is composed of two subunits: an alpha subunit of 1,008 amino acids which is processed into two polypeptides disulfide bonded to one another, and a beta subunit of 778 amino acids. Each subunit has near its COOH terminus a hydrophobic segment. This and other sequence features suggest a structure for the receptor in which the hydrophobic segments serve as transmembrane domains anchoring each subunit to the membrane and dividing each into a large ectodomain and a short cytoplasmic domain. The alpha subunit ectodomain has five sequence elements homologous to consensus Ca2+-binding sites of several calcium-binding proteins, and the beta subunit contains a fourfold repeat strikingly rich in cysteine. The alpha subunit sequence is 46% homologous to the alpha subunit of the vitronectin receptor. The beta subunit is 44% homologous to the human platelet adhesion receptor subunit IIIa and 47% homologous to a leukocyte adhesion receptor beta subunit. The high degree of homology (85%) of the beta subunit with one of the polypeptides of a chicken adhesion receptor complex referred to as integrin complex strongly suggests that the latter polypeptide is the chicken homologue of the fibronectin receptor beta subunit. These receptor subunit homologies define a superfamily of adhesion receptors. The availability of the entire protein sequence for the fibronectin receptor will facilitate studies on the functions of these receptors.

