protein sequence comparison
Recently Published Documents

Total documents: 55 (five years: 13)
H-index: 14 (five years: 1)

2021 ◽  
Author(s):  
Jayanta Pal ◽  
Soumen Ghosh ◽  
Bansibadan Maji ◽  
Dilip Kumar Bhattacharya

Abstract Similarity/dissimilarity study of protein and genome sequences remains a challenging task, and the selection of techniques and descriptors plays an important role in computational biology. Genome sequence comparison is generally preferred to protein sequence comparison because a protein sequence is built from 20 amino acids, compared with only 4 nucleotides in a genome sequence. It is therefore important to choose a representation that is efficient in both time and space and that applies equally to protein sequences of equal and unequal lengths. In the binary form of representation, the Fourier transform of a protein sequence reduces to the transformation of 20 simple binary sequences in the Fourier domain, and for each such sequence Parseval's identity gives a very simple computable form of the power spectrum. This gives rise to readily usable forms of moments of different degrees. Such moments, when properly normalized, decrease monotonically as the degree of the moment increases, so it is better to use moments of smaller degrees only. In this paper, descriptors are taken as 20-component vectors, where each component corresponds to a general second-order moment of one of the 20 simple binary sequences. Distance matrices are then obtained using Euclidean distance as the distance measure between each pair of sequences. Phylogenetic trees are obtained from the distance matrices using the UPGMA algorithm. The datasets used for the similarity/dissimilarity study are 9 ND4, 16 ND5, 9 ND6, 24 TF proteins and 12 Baculovirus proteins. The phylogenetic trees produced by the present method are found to be on par with those produced by earlier methods of other authors and with their known biological references. Further, the method takes less computational time and is equally applicable to sequences of equal and unequal lengths.
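The pipeline in this abstract (20 binary indicator sequences, power spectrum, second-order moment, Euclidean distance) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the exact normalization of the "general second order moment", so the scaling used here (drop the k = 0 term and divide by count × (N − count)) is an assumption, and the function names `power_spectrum`, `moment_descriptor` and `distance_matrix` are hypothetical. For an unnormalized DFT of a 0/1 sequence of length N, Parseval's identity gives a total power of N times the number of ones, which the sketch relies on only as a sanity check.

```python
import cmath
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def power_spectrum(indicator):
    """Power spectrum |U(k)|^2 of a 0/1 sequence via a naive O(N^2) DFT."""
    n = len(indicator)
    positions = [i for i, bit in enumerate(indicator) if bit]
    return [
        abs(sum(cmath.exp(-2j * math.pi * k * p / n) for p in positions)) ** 2
        for k in range(n)
    ]


def moment_descriptor(sequence):
    """20-component vector: one second-order spectral moment per amino acid."""
    n = len(sequence)
    descriptor = []
    for residue in AMINO_ACIDS:
        indicator = [1 if ch == residue else 0 for ch in sequence]
        count = sum(indicator)
        if count == 0 or count == n:
            # Residue absent (or sequence made only of it): no spectral content.
            descriptor.append(0.0)
            continue
        spectrum = power_spectrum(indicator)
        # ASSUMPTION (not from the abstract): skip the k = 0 term and
        # normalize by count * (n - count) to reduce length dependence.
        second_moment = sum(p ** 2 for p in spectrum[1:]) / (count * (n - count))
        descriptor.append(second_moment)
    return descriptor


def distance_matrix(sequences):
    """Pairwise Euclidean distances between the 20-component descriptors."""
    descriptors = [moment_descriptor(s) for s in sequences]
    return [[math.dist(a, b) for b in descriptors] for a in descriptors]
```

Because every descriptor has 20 components regardless of sequence length, sequences of unequal lengths can be compared directly. In practice an FFT would replace the naive DFT, and a UPGMA tree could be built from the distance matrix with any average-linkage clustering routine, for example SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"`.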


2021 ◽  
Author(s):  
Soumen Ghosh ◽  
Jayanta Pal ◽  
Bansibadan Maji ◽  
Dilip Kumar Bhattacharya

2021 ◽  
Author(s):  
Eszter Bokor ◽  
Michel Flipphi ◽  
Sandor Kocsube ◽  
Judit Amon ◽  
Csaba Vagvolgyi ◽  
...  

We describe an HxnR-dependent regulon composed of 11 hxn genes (hxnS, T, R, P, Y, Z, X, W, V, M and N). The regulon is inducible by a nicotinate metabolic derivative, repressible by ammonium, and under stringent control of the GATA factor AreA. This is the first publication of a complete eukaryotic nicotinate metabolic cluster, including five novel genes. While in A. nidulans the regulon is organised in three distinct clusters, this organisation is variable in the Ascomycota. In some Pezizomycotina species all 11 genes are organised in a single cluster, in others in two clusters. This variable organisation sheds light on cluster evolution. Instances of gene duplication, followed by or simultaneous with integration in the cluster; partial or total cluster loss; and horizontal gene transfer of several genes, including an example of whole cluster re-acquisition in Aspergillus of section Flavi, were detected, together with the incorporation in some clusters of genes not found in the A. nidulans co-regulated regulon. These events underlie both the plasticity and the reticulate character of metabolic cluster evolution. This study provides the first comprehensive protein sequence comparison of six members of the cluster across representatives of all Ascomycota classes, including several hundred species.


2021 ◽  
Vol 231 ◽  
pp. 104045
Author(s):  
Carine Froment ◽  
Clément Zanolli ◽  
Mathilde Hourset ◽  
Emmanuelle Mouton-Barbosa ◽  
Andreia Moreira ◽  
...  

Author(s):  
Yanping Zhang ◽  
Ya Gao ◽  
Jianwei Ni ◽  
Pengcheng Chen ◽  
Xiaosheng Wang

Aim and Objective: Given the rapidly increasing amount of molecular biology data available, computational methods of low complexity are necessary to infer protein structure, function, and evolution. Method: In this work, we propose a novel method, FermatS, which extracts feature information from protein sequences based on global position information from the curve and local position representation from normalized moments of inertia. Furthermore, we use the features generated by the FermatS method to analyze the similarity/dissimilarity of nine ND5 proteins and to establish a prediction model for DNA-binding proteins based on logistic regression with 5-fold cross-validation. Results: In the similarity/dissimilarity analysis of the nine ND5 proteins, the results are consistent with evolutionary theory. Moreover, this method can effectively predict DNA-binding proteins in realistic situations. Conclusion: The findings demonstrate that the proposed method is effective for comparing, recognizing and predicting protein sequences. The main code and datasets can be downloaded from https://github.com/GaoYa1122/FermatS.


2020 ◽  
Vol 37 (7) ◽  
pp. 1986-2001
Author(s):  
Monica Roman-Trufero ◽  
Constance M Ito ◽  
Conrado Pedebos ◽  
Indiana Magdalou ◽  
Yi-Fang Wang ◽  
...  

Abstract Genetic variation in the enzymes that catalyze posttranslational modification of proteins is a potentially important source of phenotypic variation during evolution. Ubiquitination is one such modification that affects turnover of virtually all of the proteins in the cell in addition to roles in signaling and epigenetic regulation. UBE2D3 is a promiscuous E2 enzyme, which acts as a ubiquitin donor for E3 ligases that catalyze ubiquitination of developmentally important proteins. We have used protein sequence comparison of UBE2D3 orthologs to identify a position in the C-terminal α-helical region of UBE2D3 that is occupied by a conserved serine in amniotes and by alanine in anamniote vertebrate and invertebrate lineages. Acquisition of the serine (S138) in the common ancestor of modern amniotes created a phosphorylation site for Aurora B. Phosphorylation of S138 disrupts the structure of UBE2D3 and reduces the level of the protein in mouse embryonic stem cells (ESCs). Substitution of S138 with the anamniote alanine (S138A) increases the level of UBE2D3 in ESCs and is a gain-of-function, early embryonic lethal mutation in mice. When mutant S138A ESCs were differentiated into extraembryonic primitive endoderm, levels of the PDGFRα and FGFR1 receptor tyrosine kinases were reduced and primitive endoderm differentiation was compromised. Proximity ligation analysis showed increased interaction between UBE2D3 and the E3 ligase CBL and between CBL and the receptor tyrosine kinases. Our results identify a sequence change that altered the ubiquitination landscape at the base of the amniote lineage with potential effects on amniote biology and evolution.


2019 ◽  
Author(s):  
Pu Tian

Abstract Sequence comparison is the cornerstone of bioinformatics and is traditionally realized by alignment. Unfortunately, exponential computational complexity renders rigorous multiple sequence alignment (MSA) intractable. Approximate algorithms and heuristics provide acceptable performance for a relatively small number of sequences but incur prohibitive computational cost and unbounded accumulation of error for massive sequence sets. Alignment-free algorithms achieve linear computational cost for pairwise sequence comparison, but the challenge of multiple sequence comparison (MSC) remains. Meanwhile, a varying number of parameters and procedures need to be empirically adjusted for different MSC tasks, with their complex interactions and impact not well understood. Therefore, the development of an efficient and nonparametric global sequence comparison method is essential for handling the explosive growth of sequencing data. It is shown here that the sorted composition vector (SCV), which is based on a physical perspective on the sequence composition constraint, is a feasible nonparametric encoding scheme for global protein sequence comparison and classification with linear computational complexity, and provides a global atlas tree for natural protein sequences. This finding renders massive sequence comparison and classification, which is infeasible on supercomputers, routine on a workstation. SCV sets an example of one-way encoding that might revolutionize recognition and classification tasks in general.
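A minimal Python sketch of what a sorted composition vector might look like is given here for orientation. The abstract does not define SCV in detail, so this assumes one plausible reading: compute the 20 amino-acid fractions of a sequence and sort them in descending order, discarding which residue each fraction belongs to. The function name `sorted_composition_vector` and the handling of non-standard symbols (ignored) are assumptions, not the author's code.

```python
from collections import Counter


def sorted_composition_vector(sequence, alphabet="ACDEFGHIKLMNPQRSTVWY"):
    """Return the residue fractions of `sequence`, sorted in descending order.

    ASSUMPTION: this is one reading of "sorted composition vector";
    symbols outside `alphabet` are ignored.
    """
    counts = Counter(sequence)
    total = sum(counts[a] for a in alphabet)
    if total == 0:
        raise ValueError("sequence contains no symbols from the alphabet")
    fractions = [counts[a] / total for a in alphabet]
    # Sorting removes residue identity, so the encoding cannot be inverted.
    return sorted(fractions, reverse=True)
```

Counting is linear in sequence length and sorting 20 values costs constant time, which is in line with the linear complexity stated in the abstract. Under this reading, `"AACD"` and `"WWYK"` map to the same vector, which is one way to interpret the phrase "one-way encoding".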


Author(s):  
Lina Yang ◽  
Pu Wei ◽  
Cheng Zhong ◽  
Zuqiang Meng ◽  
Patrick Wang ◽  
...  

In bioinformatics, the biological functions of proteins and their interactions can often be analyzed through the similarity of their sequences. In this paper, the authors combine the fractal dimension, empirical mode decomposition (EMD), and a sliding window for protein sequence comparison. First, the protein sequence is characterized and digitized into a signal, and then the signal characteristics are obtained by using EMD and the fractal dimension. Each protein sequence can be decomposed into Intrinsic Mode Functions (IMFs). The fixed-window fractal dimension is applied to each IMF and to the original signal to extract the protein sequence characteristics. Experiments have shown that the features extracted by this hybrid method are superior to those extracted by the EMD method alone.


2019 ◽  
Author(s):  
Monica Roman-Trufero ◽  
Constance M Ito ◽  
Conrado Pedebos ◽  
Indiana Magdalou ◽  
Yi-Fang Wang ◽  
...  

Abstract Genetic variation in the enzymes that catalyse post-translational modification of proteins is a potentially important source of phenotypic variation during evolution. Ubiquitination is one such modification that affects turnover of virtually all of the proteins in the cell in addition to roles in signalling and epigenetic regulation. UBE2D3 is a promiscuous E2 enzyme that acts as a ubiquitin donor for E3 ligases that catalyse ubiquitination of developmentally important proteins. We have used protein sequence comparison of UBE2D3 orthologues to identify a position in the C-terminal α-helical region of UBE2D3 that is occupied by a conserved serine in amniotes and by alanine in anamniote vertebrate and invertebrate lineages. Acquisition of the serine (S138) in the common ancestor of modern amniotes created a phosphorylation site for Aurora B. Phosphorylation of S138 disrupts the structure of UBE2D3 and reduces the level of the protein in mouse ES cells (ESCs). Substitution of S138 with the anamniote alanine (S138A) increases the level of UBE2D3 in ESCs and is a gain-of-function, early embryonic lethal mutation in mice. When mutant S138A ESCs were differentiated into extra-embryonic primitive endoderm (PrE), levels of the PDGFRα and FGFR1 receptor tyrosine kinases (RTKs) were reduced and PrE differentiation was compromised. Proximity ligation analysis showed increased interaction between UBE2D3 and the E3 ligase CBL and between CBL and the RTKs. Our results identify a sequence change that altered the ubiquitination landscape at the base of the amniote lineage with potential effects on amniote biology and evolution.

