Bioinformatics Insights Into Microbial Xylanase Protein Sequences

2018 ◽  
Vol 15 (2) ◽  
pp. 275-294
Author(s):  
Deepsikha Anand ◽  
Jeya Nasim ◽  
Sangeeta Yadav ◽  
Dinesh Yadav

Microbial xylanases represent an industrially important group of enzymes associated with hydrolysis of xylan, a major hemicellulosic component of plant cell walls. A total of 122 protein sequences comprising 58 fungal, 25 bacterial, 19 actinomycete and 20 yeast xylanases were retrieved from the NCBI GenBank databases. These sequences were characterized in silico for homology, sequence alignment, phylogenetic tree construction, motif assessment and physico-chemical attributes. The number of amino acid residues ranged from 188 to 362, molecular weights ranged from 20.3 to 39.7 kDa and pI ranged from 3.93 to 9.69. The aliphatic index revealed comparatively low thermostability, and negative GRAVY values indicated that xylanases are hydrophilic irrespective of the source organism. Several conserved amino acid residues associated with the catalytic domain of the enzyme were observed, while the different microbial sources also revealed a few conserved amino acid residues. The comprehensive phylogenetic tree indicated seven organism-specific, distinct major clusters, designated as A, B, C, D, E, F and G. The MEME-based analysis of 10 motifs indicated a predominance of motifs specific to the GH11 family, and one motif, designated motif 3, with the sequence GTVTSDGGTYDIYTTTRTNAP, was found to be present in most of the xylanases irrespective of source. Sequence analysis of microbial xylanases provides an opportunity to develop strategies for molecular cloning and expression of xylanase genes and also for identifying sites for genetic manipulation to develop novel xylanases with desired features as per industrial needs.
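The GRAVY and aliphatic index values mentioned above can be illustrated with a minimal Python sketch. It assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index (mole percent of Ala + 2.9 × Val + 3.9 × (Ile + Leu)). The function names are hypothetical, and the abstract does not state which tool the authors used. The motif 3 sequence serves only as sample input, since it is a short fragment rather than a full xylanase.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    n = len(sequence)
    mole_percent = {aa: 100.0 * sequence.count(aa) / n for aa in "AVIL"}
    return (mole_percent["A"]
            + 2.9 * mole_percent["V"]
            + 3.9 * (mole_percent["I"] + mole_percent["L"]))


# Sample input only: the conserved motif 3 reported in the abstract.
motif3 = "GTVTSDGGTYDIYTTTRTNAP"
print(f"GRAVY: {gravy(motif3):.3f}")
print(f"Aliphatic index: {aliphatic_index(motif3):.2f}")
```

A negative GRAVY value indicates a hydrophilic sequence overall, which is the interpretation the abstract applies to the xylanases.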

1998 ◽  
Vol 79 (02) ◽  
pp. 306-309 ◽  
Author(s):  
Dougald Monroe ◽  
Julie Oliver ◽  
Darla Liles ◽  
Harold Roberts ◽  
Jen-Yea Chang

Summary: Tissue factor pathway inhibitor (TFPI) acts to regulate the initiation of coagulation by first inhibiting factor Xa. The complex of factor Xa/TFPI then inhibits the factor VIIa/tissue factor complex. The cDNA sequences of TFPI from several different species have been previously reported. A high level of similarity is present among TFPIs at the molecular level (DNA and protein sequences) as well as in biochemical function (inhibition of factor Xa, VIIa/tissue factor). In this report, we used a PCR-based screening method to clone cDNA for full-length TFPI from a mouse macrophage cDNA library. Both cDNA and predicted protein sequences show significant homology to the other reported TFPI sequences, especially to that of rat. Mouse TFPI has a signal peptide of 28 amino acid residues followed by the mature protein (in which the signal peptide is removed), which has 278 amino acid residues. Mouse TFPI, like that of other species, consists of three tandem Kunitz-type domains. Recombinant mouse TFPI was expressed in the human kidney cell line 293 and purified for functional assays. When human clotting factors were used to investigate the inhibition spectrum of mouse TFPI, it was shown that, in addition to human factor Xa, mouse TFPI inhibits human factors VIIa and IXa as well as factor XIa. Cloning and expression of the mouse TFPI gene will offer useful information and material for coagulation studies performed in a mouse model system.


2020 ◽  
Vol 17 (1) ◽  
pp. 59-77
Author(s):  
Anand Kumar Nelapati ◽  
JagadeeshBabu PonnanEttiyappan

Background: Hyperuricemia and gout are conditions that arise from the accumulation of uric acid in the blood and urine. Uric acid is the product of the purine metabolic pathway in humans. Uricase is a therapeutic enzyme that can enzymatically reduce the concentration of uric acid in serum and urine by converting it into the more soluble allantoin. Uricases are widely available from several sources such as bacteria, fungi, yeast, plants and animals. Objective: The present study is aimed at elucidating the structure and physicochemical properties of uricase by in silico analysis. Methods: A total of sixty amino acid sequences of uricase belonging to different sources were obtained from NCBI, and different analyses such as Multiple Sequence Alignment (MSA), homology search, phylogenetic relation, motif search, domain architecture and physicochemical properties including pI, EC, Ai and Ii were performed. Results: Multiple sequence alignment of all the selected protein sequences exhibited distinct differences between bacterial, fungal, plant and animal sources based on the position-specific existence of conserved amino acid residues. The maximum homology of all the selected protein sequences is between 51-388. In the individual categories, homology is between 16-337 for bacterial uricase, 14-339 for fungal uricase, 12-317 for plant uricase, and 37-361 for animal uricase. The phylogenetic tree constructed based on the amino acid sequences disclosed clusters indicating that the uricases are from different sources. The physicochemical features revealed that the uricase sequences are between 300 and 338 amino acid residues long, with molecular weights of 33-39 kDa and theoretical pI ranging from 4.95 to 8.88. The amino acid composition results showed that valine has a high average frequency of 8.79% compared with other amino acids in all analyzed species. Conclusion: In the field of bioinformatics, this work might be informative and a stepping-stone for other researchers to get an idea about the physicochemical features, evolutionary history and structural motifs of uricase, which can be widely used in the biotechnological and pharmaceutical industries. Therefore, the proposed in silico analysis can be considered for protein engineering work, as well as for gout therapy.
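The amino acid composition result reported above (an average valine frequency of 8.79%) is a per-residue percentage averaged over sequences. A minimal Python sketch of that calculation follows. The function names and the two toy sequences are hypothetical and are not uricase data; the abstract does not say which tool produced the reported figures.

```python
from collections import Counter


def composition_percent(sequence):
    """Percentage frequency of each residue in one protein sequence."""
    counts = Counter(sequence)
    return {aa: 100.0 * n / len(sequence) for aa, n in counts.items()}


def average_frequency(sequences, residue):
    """Mean percentage frequency of one residue across several sequences."""
    per_sequence = [composition_percent(s).get(residue, 0.0) for s in sequences]
    return sum(per_sequence) / len(per_sequence)


# Hypothetical toy sequences, used only to exercise the functions.
toy_sequences = ["MVKAVLGV", "MSAVKTLE"]
print(f"Average valine frequency: {average_frequency(toy_sequences, 'V'):.2f}%")
```

Averaging the per-sequence percentages, as done here, weights each sequence equally regardless of its length; pooling all residues before dividing would be an alternative convention.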


1980 ◽  
Vol 187 (1) ◽  
pp. 65-74 ◽  
Author(s):  
D Penny ◽  
M D Hendy ◽  
L R Foulds

We have recently reported a method to identify the shortest possible phylogenetic tree for a set of protein sequences [Foulds, Hendy & Penny (1979) J. Mol. Evol. 13, 127–150; Foulds, Penny & Hendy (1979) J. Mol. Evol. 13, 151–166]. The present paper discusses issues that arise during the construction of minimal phylogenetic trees from protein-sequence data. The conversion of the data from amino acid sequences into nucleotide sequences is shown to be advantageous. A new variation of a method for constructing a minimal tree is presented. Our previous methods have involved first constructing a tree and then either proving that it is minimal or transforming it into a minimal tree. The approach presented in the present paper progressively builds up a tree, taxon by taxon. We illustrate this approach by using it to construct a minimal tree for ten mammalian haemoglobin alpha-chain sequences. Finally, we define a measure of the complexity of the data and illustrate a method to derive a directed phylogenetic tree from the minimal tree.
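To make the notion of tree length concrete, the sketch below scores a fixed tree topology with Fitch's small-parsimony algorithm, which counts the minimum number of character changes the topology requires. This is a swapped-in, well-known technique and not the authors' taxon-by-taxon construction method; the nested-tuple tree encoding, the function names and the four toy nucleotide sequences are all assumptions made for illustration.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    `tree` is a leaf name (str) or a pair of subtrees; `states` maps each
    leaf name to its character at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    shared = left_set & right_set
    if shared:
        # The children can agree on a state, so no extra change is needed.
        return shared, left_cost + right_cost
    # The children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, alignment):
    """Sum the per-site minimum changes over an aligned set of sequences."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


# Hypothetical toy alignment of four taxa with three sites each.
alignment = {"a": "AAG", "b": "AAG", "c": "ACT", "d": "ACT"}
print(tree_length((("a", "b"), ("c", "d")), alignment))
print(tree_length((("a", "c"), ("b", "d")), alignment))
```

The first topology, which groups the identical sequences together, requires fewer changes than the second, so a minimal-tree search would prefer it.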


2017 ◽  
Vol 107 (4) ◽  
pp. 550-561 ◽  
Author(s):  
L. Li ◽  
Y.-T. Zhou ◽  
Y. Tan ◽  
X.-R. Zhou ◽  
B.-P. Pang

Abstract: Odorant-binding proteins (OBPs) play a fundamental role in insect olfaction. In recent years, Galeruca daurica (Joannis) (Coleoptera: Chrysomelidae) has become one of the most important insect pests in the Inner Mongolian grasslands of China. This pest feeds only on species of Allium plants, implying a central role of olfaction in its search for specific host plants. However, the olfaction-related proteins have not been investigated in this beetle. In this study, we identified 29 putative OBP genes, namely GdauOBP1–29, from the transcriptome database of G. daurica assembled in our laboratory by using RNA-Seq. All 29 genes except GdauOBP29 had full-length open reading frames, encoding proteins from 119 to 202 amino acids in length, with predicted molecular weights from 12 to 22 kDa and isoelectric points from 3.88 to 8.84. Predicted signal peptides consisting of 15–22 amino acid residues were found in all except GdauOBP6, GdauOBP13 and GdauOBP29. The amino acid sequence identity between the 29 OBPs ranged from 8.33 to 71.83%. GdauOBP1–12 belong to the Classic OBPs, while the others belong to the Minus-C OBPs. Phylogenetic analysis indicated that GdauOBPs are closest to CbowOBPs from Colaphellus bowringi. RT-PCR and qRT-PCR analyses showed that all GdauOBPs were expressed in adult antennae, 11 of which showed significant differences in their expression levels between males and females. Most GdauOBPs were also expressed in adult heads (without antennae), thoraxes, abdomens, legs and wings. Moreover, the expression levels of the GdauOBPs varied during the different developmental stages of G. daurica, with most GdauOBPs expressed highly in the adult antennae but scarcely in eggs and pupae. These results provide insights for further research on the molecular mechanisms of chemical communication in G. daurica.


2003 ◽  
Vol 373 (2) ◽  
pp. 369-379 ◽  
Author(s):  
Maria-Dolores MONTIEL ◽  
Marie-Ange KRZEWINSKI-RECCHI ◽  
Philippe DELANNOY ◽  
Anne HARDUIN-LEPERS

The human Sda antigen is formed through the addition of an N-acetylgalactosamine residue via a β1,4-linkage to a sub-terminal galactose residue substituted with an α2,3-linked sialic acid residue. We have taken advantage of the previously cloned mouse cDNA sequence of the UDP-GalNAc:Neu5Acα2-3Galβ-R β1,4-N-acetylgalactosaminyltransferase (Sda β1,4GalNAc transferase) to screen the human EST and genomic databases and to identify the corresponding human gene. The sequence spans over 35 kb of genomic DNA on chromosome 17 and comprises at least 12 exons. As judged by reverse transcription PCR, the human gene is expressed widely, since it is detected in various amounts in almost all cell types studied. Northern blot analysis indicated that five Sda β1,4GalNAc transferase transcripts of 8.8, 6.1, 4.7, 3.8 and 1.65 kb were highly expressed in colon and to a lesser extent in kidney, stomach, ileum and rectum. The complete coding nucleotide sequence was amplified from Caco-2 cells. Interestingly, the alternative use of two first exons, named E1S and E1L, leads to the production of two transcripts. These nucleotide sequences potentially give rise to two proteins of 506 and 566 amino acid residues, identical in sequence with the exception of their cytoplasmic tails. The short form is highly similar (74% identity) to the mouse enzyme, whereas the long form shows an unusually long cytoplasmic tail of 66 amino acid residues that has not yet been described for any other mammalian glycosyltransferase. Upon transient transfection in Cos-7 cells of the common catalytic domain, a soluble form of the protein was obtained, which catalysed the transfer of GalNAc residues to α2,3-sialylated acceptor substrates, to form the GalNAcβ1-4[Neu5Acα2-3]Galβ1-R trisaccharide common to both Sda and Cad antigens.


2014 ◽  
Vol 998-999 ◽  
pp. 210-213
Author(s):  
Chun Ling Zhao ◽  
Wen Jing Yu ◽  
Ji Yu Ju

The cDNA of a novel protease, designated AFEI, was cloned from the digestive tract of Arenicola cristata by RACE. The cDNA of AFEI comprised 897 bp and an open reading frame that encoded a polypeptide of 264 amino acid residues. AFEI showed similarity to the serine protease family and contained the conserved catalytic amino acid residues. The gene encoding the active form of AFEI was expressed in E. coli, and the purified recombinant protein could dissolve an artificial fibrin plate with plasminogen, which indicated that the recombinant protein might be a plasminogen activator for thrombosis therapy.


2020 ◽  
Author(s):  
Kunchur Guruprasad

Mutations in orf1ab poly-protein sequences from human SARS-CoV-2 isolates representing six geographical locations were identified by comparing with the equivalent reference sequences from the Wuhan-Hu-1, China isolate, from the epicentre of the current COVID-19 pandemic. The orf1ab poly-proteins of sequence length 7096 amino acid residues representing 10,929 genomes from six geographical locations comprised a total of 27,895 mutations that corresponded to 2,095 distinct mutation sites. The percentage of mutations was significantly high for the RdRp (33.47%), nsp2 (20.04%), helicase (15.95%) and nsp3 (12.61%) proteins, compared to the rest of the proteins, which ranged from 0.14% for nsp10 to 2.79% for nsp6. A total of 2715 mutations were observed for the unique mutation sites identified for each of the six geographical locations. The distribution of the mutations was: Africa (87), Asia (605), Europe (134), North America (1677), Oceania (200) and South America (12). The RdRp protein contained a significantly high mutation percentage (>31%) that varied among the different geographical locations. The nsp2 proteins from Asia, North America, Oceania and South America, the nsp3 proteins from Africa and Europe and the helicase proteins from North America showed a high mutation percentage next to the RdRp proteins. The P4715L mutation in RdRp, T265I in nsp2 and L3606F in nsp6 were observed in all the geographical locations, with the RdRp P4715L mutation being predominant among the orf1ab poly-proteins. In another dataset comprising 158 genomes in which the orf1ab poly-proteins comprised sequences of variable length between 7084-7095 amino acid residues, 88 additional distinct mutations were observed for the six geographical locations that included deletion mutations. The proteins containing deletion mutations were: leader protein, nsp2, nsp3, nsp4, nsp6, RdRp, 3'-to-5' exonuclease and endoRNAse.

In this work, all the mutations observed in 11,087 orf1ab poly-proteins of human SARS-CoV-2 comprising between 7084-7096 amino acid residues with reference to the human SARS-CoV-2 orf1ab poly-protein sequences from Wuhan-Hu-1, China and representing the six geographical locations (Africa, Asia, Europe, North America, Oceania and South America) are presented.
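Substitutions such as P4715L are written as the reference residue, the 1-based position, and the residue found in the isolate. A minimal Python sketch of deriving that notation by position-wise comparison with a reference follows. It assumes the two sequences have equal length, as in the 7096-residue dataset; the variable-length sequences with deletions would first need an alignment step, which is not shown. The function name and the toy sequences are hypothetical.

```python
def call_substitutions(reference, sample):
    """List substitutions in `sample` relative to `reference`.

    Both sequences must have the same length. Each substitution is
    reported as <reference residue><1-based position><sample residue>.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length; align them first")
    return [
        f"{ref}{position}{alt}"
        for position, (ref, alt) in enumerate(zip(reference, sample), start=1)
        if ref != alt
    ]


# Hypothetical toy sequences, not real orf1ab data.
reference = "MESLVPGFNE"
isolate = "MESLVLGFNK"
print(call_substitutions(reference, isolate))
```

Counting how often each label appears across many isolates would give per-site totals of the kind the abstract summarises by protein and by location.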


1999 ◽  
Vol 181 (17) ◽  
pp. 5288-5295 ◽  
Author(s):  
Irina Kataeva ◽  
Xin-Liang Li ◽  
Huizhong Chen ◽  
Sang-Ki Choi ◽  
Lars G. Ljungdahl

Abstract: The cellulolytic and hemicellulolytic complex of Clostridium thermocellum, termed the cellulosome, consists of up to 26 polypeptides, of which at least 17 have been sequenced. They include 12 cellulases, 3 xylanases, 1 lichenase, and CipA, a scaffolding polypeptide. We report here a new cellulase gene, celK, coding for CelK, a 98-kDa major component of the cellulosome. The gene has an open reading frame (ORF) of 2,685 nucleotides coding for a polypeptide of 895 amino acid residues with a calculated mass of 100,552 Da. A signal peptide of 27 amino acid residues is cut off during secretion, resulting in a mature enzyme of 97,572 Da. The nucleotide sequence is highly similar to that of cbhA (V. V. Zverlov et al., J. Bacteriol. 180:3091–3099, 1998), which has an ORF of 3,690 bp coding for the 1,230-amino-acid-residue CbhA of the same bacterium. Homologous regions of the two genes are 86.5 and 84.3% identical without deletion or insertion on the nucleotide and amino acid levels, respectively. Both have domain structures consisting of a signal peptide, a family IV cellulose binding domain (CBD), a family 9 glycosyl hydrolase domain, and a dockerin domain. A striking distinction between the two polypeptides is that there is a 330-amino-acid insertion in CbhA between the catalytic domain and the dockerin domain containing a fibronectin type 3-like domain and a family III CBD. This insertion, missing in CelK, is responsible for the size difference between CelK and CbhA. Upstream and downstream flanking sequences of the two genes show no homology. The data indicate that celK and cbhA in the genome of C. thermocellum have evolved through gene duplication and recombination of domain coding sequences. celK without a dockerin domain was expressed in Escherichia coli and purified. The enzyme had pH and temperature optima at 6.0 and 65°C, respectively. It hydrolyzed p-nitrophenyl-β-D-cellobioside with a Km and a Vmax of 1.67 μM and 15.1 U/mg, respectively. Cellobiose was a strong inhibitor of CelK activity, with a Ki of 0.29 mM. The enzyme was thermostable: after 200 h of incubation at 60°C, 97% of the original activity remained. Properties of the enzyme indicated that it is a cellobiohydrolase.
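The reported Km and Vmax can be read through the Michaelis-Menten rate equation, v = Vmax·[S] / (Km + [S]). The sketch below assumes CelK follows simple Michaelis-Menten kinetics on this substrate, which the abstract implies by reporting these two constants but does not state outright. The function name and the chosen substrate concentrations are illustrative, and the inhibition by cellobiose is not modelled because the abstract does not give the inhibition type.

```python
def michaelis_menten_rate(substrate_um, km_um=1.67, vmax=15.1):
    """Reaction rate (U/mg) at a given substrate concentration (in uM).

    Defaults are the Km and Vmax values quoted in the abstract.
    """
    return vmax * substrate_um / (km_um + substrate_um)


# Illustrative substrate concentrations (uM), chosen around the Km value.
for s in (0.5, 1.67, 5.0, 50.0):
    print(f"[S] = {s:>5} uM -> v = {michaelis_menten_rate(s):.2f} U/mg")
```

At a substrate concentration equal to Km, the rate is half of Vmax, which is the defining property of Km in this model.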


2020 ◽  
Vol 65 (6) ◽  
pp. 1065-1071
Author(s):  
А.Н. Некрасов ◽  
Ю.П. Козмин ◽  
С.В. Козырев ◽  
Н.Г. Есипова ◽  
...  

This research investigates 24,647 non-homologous protein sequences. The occurrence profile of pentapeptides was constructed for every sequence, and hierarchically organized elements of various sizes were revealed in each profile by a special mathematical method. The correlations between these hierarchical elements were analyzed, and it was shown that in the tested set of protein sequences there are 11 levels of protein organization, with elements ranging in length from 7 to 56 amino acid residues. It was suggested that the identified levels of organization correspond to elements of a super-secondary structure with different topology.
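One possible reading of a pentapeptide occurrence profile is sketched below in Python: count every overlapping five-residue window across the whole sequence set, then replace each window position in a given sequence with that count. This interpretation is an assumption, since the abstract does not define the profile, and the "special mathematical method" for finding hierarchical elements is not specified and is not attempted here. The function names and toy sequences are hypothetical.

```python
from collections import Counter


def pentapeptide_counts(sequences, k=5):
    """Count every overlapping k-residue window across all sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def occurrence_profile(sequence, counts, k=5):
    """Per-position profile: how often each window occurs in the whole set."""
    return [counts[sequence[i:i + k]] for i in range(len(sequence) - k + 1)]


# Hypothetical toy sequences sharing one pentapeptide at the start.
toy_set = ["ACDEFGH", "ACDEFKK"]
counts = pentapeptide_counts(toy_set)
print(occurrence_profile(toy_set[0], counts))
```

A sequence of length n yields a profile of n - 4 values, so sequences shorter than five residues produce an empty profile.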

