Alternative approach to protein structure prediction based on sequential similarity of physical properties

The relationship between protein sequence and structure arises entirely from amino acid physical properties. An alternative method is therefore proposed to identify homologs in which residue equivalence is based exclusively on the pairwise physical property similarities of sequences. This approach, the property factor method (PFM), is entirely different from those in current use. A comparison is made between our method and PSI-BLAST. We demonstrate that traditionally defined sequence similarity can be very low for pairs of sequences (which therefore cannot be identified using PSI-BLAST), yet similarity of their physical property distributions results in almost identical 3D structures. The performance of PFM is shown to be better than that of PSI-BLAST when sequence matching is comparable, based on a comparison using targets from CASP10 (89 targets) and CASP11 (51 targets). It is also shown that PFM outperforms PSI-BLAST on informatically challenging targets.
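
As a rough illustration of the idea that two sequences with no identical residues can still share a physical-property profile, the following sketch maps each residue to its Kyte-Doolittle hydropathy value and correlates the two profiles. This is not the authors' PFM: the use of hydropathy as the only property, the Pearson correlation as the similarity measure, the function names, and the two toy peptides are all assumptions made for illustration.

```python
from math import sqrt

# Kyte-Doolittle hydropathy scale: a single physical property used here as an
# illustrative stand-in (an assumption), not the property factors used by PFM.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def identity(a, b):
    """Fraction of positions with identical residues (equal-length, ungapped)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pearson(u, v):
    """Pearson correlation of two equal-length numeric profiles."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    su = sqrt(sum((x - mu) ** 2 for x in u))
    sv = sqrt(sum((y - mv) ** 2 for y in v))
    return cov / (su * sv)  # undefined for a constant profile


def property_similarity(a, b, scale=KD):
    """Correlation between the per-residue property profiles of two sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return pearson([scale[r] for r in a], [scale[r] for r in b])


# Hypothetical toy peptides: no position is identical, but both follow a
# hydrophobic-then-charged pattern.
seq_a, seq_b = "ILVKDE", "VILRED"
print(identity(seq_a, seq_b))             # 0.0
print(property_similarity(seq_a, seq_b))  # close to 1
```

A residue-identity measure sees nothing in common between the two toy peptides, while the property profiles are almost perfectly correlated, which is the kind of case the abstract describes.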

SeqStruct: A New Amino Acid Similarity Matrix Based on Sequence Correlations and Structural Contacts Yields Sequence-Structure Congruence

10.1101/268904 ◽

2018 ◽

Author(s):

Kejue Jia ◽

Robert L. Jernigan

Keyword(s):

Amino Acid ◽

Protein Sequence ◽

Sequence Similarity ◽

Protein Structures ◽

Substitution Matrix ◽

Similarity Matrix ◽

Sequence Matching ◽

Sequence Structure ◽

Amino Acid Similarity ◽

Simple Amino Acid

Summary: Protein sequence matching does not properly account for some well-known features of protein structures, such as surface residues being more variable than core residues and the high packing densities in globular proteins, and it does not yield good matches for the sequences of many proteins known to be close structural relatives. There are now abundant protein sequences and structures to enable major improvements to sequence matching. Here, we mount the observed sequence correlations onto structural frameworks to identify the most important correlated parts. The rationale is that protein structures provide the important physical framework for improving sequence matching. Combining the sequence and structure data in this way leads to a simple amino acid substitution matrix that can be readily incorporated into any sequence matching procedure. This enables the incorporation of allosteric information into sequence matching and effectively transforms it from a 1-D to a 3-D procedure. The results from testing on over 3,000 sequence matches demonstrate a 37% gain in sequence similarity and a 26% reduction in gaps compared with BLOSUM62. Importantly, there are major gains in the specificity of sequence matching across diverse proteins. Specifically, all known cases where protein structures match but sequences do not match well are resolved.
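
The statement that a substitution matrix can be incorporated into any sequence matching procedure can be made concrete with a minimal Needleman-Wunsch global alignment scorer that takes its substitution scores as a function argument. This is a sketch under stated assumptions: the two toy scoring functions, the gap penalty, and all names are hypothetical, and neither scoring function reproduces SeqStruct or BLOSUM62 values.

```python
def nw_score(a, b, sub, gap=-2):
    """Optimal global alignment score (Needleman-Wunsch, linear gap penalty).

    `sub(x, y)` returns the substitution score for residues x and y, so any
    substitution matrix can be plugged in without changing the algorithm.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * gap]
        for j, y in enumerate(b, start=1):
            curr.append(max(
                prev[j - 1] + sub(x, y),  # align x with y
                prev[j] + gap,            # gap in b
                curr[j - 1] + gap,        # gap in a
            ))
        prev = curr
    return prev[-1]


def identity_sub(x, y):
    """Toy scoring: reward identical residues only (assumed values)."""
    return 2 if x == y else -1


# Hypothetical set of residue pairs treated as similar (small aliphatics).
SIMILAR = {frozenset("IV"), frozenset("IL"), frozenset("LV")}


def toy_similarity_sub(x, y):
    """Toy scoring that also rewards the similar pairs above (assumed values)."""
    if x == y:
        return 2
    return 1 if frozenset((x, y)) in SIMILAR else -1


print(nw_score("IL", "VL", identity_sub))        # 1
print(nw_score("IL", "VL", toy_similarity_sub))  # 3
```

Only the `sub` argument changes between the two calls, which is what makes a new matrix a drop-in replacement in an existing alignment routine.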

Alternative Approach to Protein Structure Prediction Based on Sequential Similarity of Physical Properties

Biophysical Journal ◽

10.1016/j.bpj.2015.11.2865 ◽

2016 ◽

Vol 110 (3) ◽

pp. 535a

Author(s):

Yi He ◽

S. Rackovsky ◽

Yanping Yin ◽

Harold A. Scheraga

Keyword(s):

Protein Structure ◽

Physical Properties ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Alternative Approach

Effects of Heating Rate on the Physical Property, Porosity and Expansibility of Sewage Sludge Ceramics (SSC)

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.427-429.383 ◽

2013 ◽

Vol 427-429 ◽

pp. 383-387

Author(s):

Dong Ting Yue ◽

Qin Yan Yue ◽

Bao Yu Gao ◽

Qian Li ◽

Yan Wang

Keyword(s):

Sewage Sludge ◽

Physical Properties ◽

Water Absorption ◽

Heating Rate ◽

Glass Phase ◽

Physical Property ◽

Experimental Results ◽

Expansion Rate ◽

Forming Mechanism ◽

The Relationship

The effects of heating rate on the preparation, characterization, pore-forming mechanism and bloating mechanism of sewage sludge ceramics (SSC) were investigated. The experimental results indicated that the physical properties of SSC were highly dependent on heating rate. SSC with a higher expansion rate and lower water absorption could be obtained when the heating rate was between 4 °C/min and 5 °C/min. Porosity and expansibility of SSC were closely related to the heating rate. The heating rate could affect the relationship between the rate of gas formation and the rate of glass-phase formation, which in turn influenced the bloating of SSC.

Faculty Opinions recommendation of Alternative approach to protein structure prediction based on sequential similarity of physical properties.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.725424611.793506045 ◽

2015 ◽

Author(s):

Andras Fiser

Keyword(s):

Protein Structure ◽

Physical Properties ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Alternative Approach

Three classes of tetrahydrobiopterin-dependent enzymes

Pteridines ◽

10.1515/pterid-2013-0003 ◽

2013 ◽

Vol 24 (1) ◽

pp. 7-11

Author(s):

Ernst R. Werner

Keyword(s):

Nitric Oxide ◽

Amino Acid ◽

Protein Sequence ◽

Aromatic Amino Acid ◽

Phenylalanine Hydroxylase ◽

Current Knowledge ◽

Sequence Similarity ◽

Protein Sequences ◽

Nitric Oxide Synthases ◽

Aromatic Amino Acid Hydroxylases

Abstract: Current knowledge distinguishes three classes of tetrahydrobiopterin-dependent enzymes based on protein sequence similarity. These three protein sequence clusters hydroxylate three types of substrate atoms and use three different forms of iron for catalysis. The first class to be discovered was the aromatic amino acid hydroxylases, which, in mammals, include phenylalanine hydroxylase, tyrosine hydroxylase, and two isoforms of tryptophan hydroxylase. The protein sequences of these tetrahydrobiopterin-dependent aromatic amino acid hydroxylases are significantly similar, and all mammalian aromatic amino acid hydroxylases require a non-heme-bound iron atom in the active site of the enzyme for catalysis. The second class of tetrahydrobiopterin-dependent enzymes to be characterized was the nitric oxide synthases, which in mammals occur as three isoforms. Nitric oxide synthase protein sequences form a separate cluster of homologous sequences with no similarity to aromatic amino acid hydroxylase protein sequences. In contrast to aromatic amino acid hydroxylases, nitric oxide synthases require a heme-bound iron for catalysis. The alkylglycerol monooxygenase protein sequence was the most recent to be characterized. This sequence shares no similarity with aromatic amino acid hydroxylases or nitric oxide synthases. Motifs contained in the alkylglycerol monooxygenase protein sequence suggest that this enzyme may use a di-iron center for catalysis.

Molecular and Mechanistic Characterization of PddB, the First PLP-Independent 2,4-Diaminobutyric Acid Racemase Discovered in an Actinobacterial D-Amino Acid Homopolymer Biosynthesis

Frontiers in Microbiology ◽

10.3389/fmicb.2021.686023 ◽

2021 ◽

Vol 12 ◽

Author(s):

Kazuya Yamanaka ◽

Ryo Ozaki ◽

Yoshimitsu Hamano ◽

Tadao Oikawa

Keyword(s):

Amino Acid ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Sequence Similarity ◽

Structural Difference ◽

Structural Modeling ◽

Specific Substrate ◽

Amino Acid Residues ◽

Diaminobutyric Acid

We recently disclosed that the biosynthesis of antiviral γ-poly-D-2,4-diaminobutyric acid (poly-D-Dab) in Streptoalloteichus hindustanus involves an unprecedented cofactor-independent stereoinversion of Dab catalyzed by PddB, which shows weak homology to diaminopimelate epimerase (DapF). The enzymological properties and mechanistic details of this enzyme, however, remained to be elucidated. Here, through a series of biochemical characterizations, structural modeling, and site-directed mutagenesis, we fully characterize the first Dab-specific PLP-independent racemase, PddB, and further provide insight into its evolution. The activity of the recombinant PddB was shown to be optimal around pH 8.5, and its other fundamental properties resembled those of typical PLP-independent racemases/epimerases. The enzyme catalyzed Dab-specific stereoinversion with a calculated equilibrium constant of nearly unity, demonstrating that the reaction catalyzed by PddB is indeed racemization. Its activity was inhibited upon incubation with sulfhydryl reagents, and the site-directed substitution of two putative catalytic Cys residues abolished the activity. These observations provided critical evidence that PddB employs a thiolate-thiol pair to catalyze the interconversion of Dab isomers. Despite the low levels of sequence similarity, a phylogenetic analysis of PddB indicated its particular relevance to DapF among PLP-independent racemases/epimerases. Secondary structure prediction and 3D structural modeling of PddB revealed its remarkable conformational analogy to DapF, which in turn allowed us to predict amino acid residues potentially responsible for discriminating the structural difference between diaminopimelate and its specific substrate, Dab. Further, PddB homologs, which appear to be narrowly distributed only within the actinobacteria, were consistently encoded adjacent to the putative poly-D-Dab synthetase gene. These observations strongly suggest that PddB could have evolved from the primary metabolic DapF in order to organize the biosynthesis pathway for the particular secondary metabolite, poly-D-Dab. The present study is the first molecular characterization of a PLP-independent Dab racemase and provides insights that could contribute to the further discovery of unprecedented PLP-independent racemases.

Protein Secondary Structure Prediction using Recurrent Neural Networks

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.e6137.018520 ◽

2020 ◽

Vol 8 (5) ◽

pp. 660-663

Keyword(s):

Neural Networks ◽

Amino Acid ◽

Secondary Structure ◽

Amino Acid Sequence ◽

Recurrent Neural Networks ◽

Protein Sequence ◽

Structure Prediction ◽

Short Term Memory ◽

Secondary Structure Prediction ◽

Protein Secondary Structure

In bioinformatics, predicting a protein's secondary structure from its primary amino acid sequence is very difficult, and the problem has a large impact on science and medicine. The hardest part is learning the most effective and correct protein features to improve prediction. Here, we implement a deep learning model to enhance structure prediction. The core contribution of this paper is a group of recurrent neural networks (RNNs) that can handle high-level relational features from a pair of input and target protein sequences. The paper compares the different types of recurrent unit used in RNNs. In addition, the emphasis is on more advanced units that incorporate a gating mechanism: the long short-term memory (LSTM) unit and the more recent gated recurrent unit (GRU). These recurrent units were evaluated on predicting protein secondary structure from an amino acid sequence. The dataset was taken from a publicly available database server (RCSB), and the study shows that the LSTM unit performs better than the GRU for long protein sequences.

An Alternative Approach to Measure Co-Movement between Two Time Series

Mathematics ◽

10.3390/math8020261 ◽

2020 ◽

Vol 8 (2) ◽

pp. 261 ◽

Cited By ~ 7

Author(s):

José Pedro Ramos-Requena ◽

Juan Evangelista Trinidad-Segovia ◽

Miguel Ángel Sánchez-Granero

Keyword(s):

Time Series ◽

Portfolio Management ◽

Correlation Method ◽

Simple Method ◽

Derivatives Pricing ◽

Statistical Arbitrage ◽

Alternative Approach ◽

Complex Relationships ◽

The Relationship ◽

Better Than

The study of the dependences between different assets is a classic topic in the financial literature. Understanding how the movements of one asset affect others is critical for derivatives pricing, portfolio management, risk control, and trading strategies. Over time, researchers have proposed different methodologies. ARCH, GARCH and EGARCH models, among others, are very popular for modeling volatility autocorrelation. In this paper, a new simple method called HP is introduced to measure the co-movement between two time series. This method, based on the Hurst exponent of the product series, is designed to detect correlation, even if the relationship is weak, but it also works well with cointegration as well as nonlinear correlations or more complex relationships given by a copula. This method and different variations thereof are tested in statistical arbitrage. Results show that HP is able to detect the relationship between assets better than the traditional correlation method.

A novel avian isolate of hepatitis E virus from Pakistan

Virology Journal ◽

10.1186/s12985-019-1247-0 ◽

2019 ◽

Vol 16 (1) ◽

Cited By ~ 1

Author(s):

Tahir Iqbal ◽

Umer Rashid ◽

Muhammad Idrees ◽

Amber Afroz ◽

Saleem Kamili ◽

...

Keyword(s):

Amino Acid ◽

Structure Prediction ◽

Hepatitis E Virus ◽

Secondary Structure Prediction ◽

Clinical Symptoms ◽

Sequence Similarity ◽

Evolutionary Genetics ◽

Hepatitis E ◽

Stem Loop ◽

Layer Chickens

Abstract. Background: Avian hepatitis E virus (aHEV) has been associated with hepatitis-splenomegaly syndrome (HSS) in chickens, along with asymptomatic subclinical infection in many cases. So far, four genotypes have been described, which cause infection in chickens, specifically in broiler breeders and layer chickens. In the present study, we isolated and identified two novel aHEV strains from the bile of layer chickens in Pakistan evincing clinical symptoms related to HSS. Methodology: Histology of liver and spleen tissues was carried out to observe histopathological changes in these tissues. Bile fluid and fecal suspensions were used for viral RNA isolation by the MegNA pure and Trizol methods; the RNA was then used for viral genome detection and characterization by cDNA synthesis and amplification of partial open reading frame (ORF) 1, partial ORF2 and complete ORF3. The bioinformatics tools Molecular Evolutionary Genetics Analysis version 6.0 (MEGA 6), Mfold and ProtScale were used for phylogenetic analysis, RNA secondary structure prediction and protein hydropathy analysis, respectively. Results: Sequencing and phylogenetic analysis on the basis of the partial methyltransferase (MeT) and helicase (Hel) domains, ORF2 and the complete ORF3 sequence suggest that these Pakistani aHEV (Pak aHEV) isolates may belong to a Pakistan-specific clade. The overall sequence similarity between the Pak aHEV sequences was 98–100%. The ORF1/ORF3 intergenic region contains a conserved cis-reactive element (CRE) and stem-loop structure (SLS). Analysis of the amino acid sequence of ORF3 indicated two hydrophobic domains (HD) and a single conserved proline-rich domain (PRD), PREPSAPP (PXXPXXPP), with a single PSAP motif found in the C-terminal region. The amino acid changes S15T, A31T, Q35H and G46D, unique to the Pak aHEV sequences, were found in the N-terminal region of ORF3. Conclusions: Our data suggest that the Pak aHEV isolates may represent a novel Pakistani clade, and their high sequence homology to each other supports the supposition that they may belong to a monophyletic clade circulating in the region around Pakistan. The data presented in this study provide further information on aHEV genetic diversity, genotype mapping, global distribution and epidemiology.
