Bioluminescent proteins prediction with voting strategy.

2020 ◽  
Vol 15 ◽  
Author(s):  
Shulin Zhao ◽  
Ying Ju ◽  
Xiucai Ye ◽  
Jun Zhang ◽  
Shuguang Han

Background: Bioluminescence is a unique and significant phenomenon in nature. It is important for the lifecycle of some organisms and is valuable in biomedical research, including gene expression analysis and bioluminescence imaging technology. In recent years, researchers have developed a number of methods for predicting bioluminescent proteins (BLPs); their accuracy has increased but could be further improved. Method: In this paper, we propose a new bioluminescent protein prediction method based on a voting algorithm. We used four feature extraction methods based on the amino acid sequence, extracting 314-dimensional features in total from amino acid composition, physicochemical properties and k-spacer amino acid pair composition. To obtain the highest MCC value and establish the optimal prediction model, we used a voting algorithm to build the model. To create the best-performing model, we discuss the selection of base classifiers and vote counting rules. Results: Our proposed model achieved 93.4% accuracy, 93.4% sensitivity and 91.7% specificity on the test set, which was better than the other methods compared. We also improved a previous prediction of bioluminescent proteins in three lineages using our model building method, resulting in greatly improved accuracy.
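The ingredients named in this abstract (sequence-derived composition features, hard voting over base classifiers, and the MCC score) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the labels are assumed to be binary (1 = BLP, 0 = non-BLP), and the paper's exact 314-dimensional encoding and its choice of base classifiers are not reproduced.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return the 20 relative residue frequencies of a sequence."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def k_spaced_pair_composition(seq, k):
    """Return 400 frequencies of residue pairs separated by k positions."""
    n_pairs = len(seq) - k - 1
    pairs = Counter((seq[i], seq[i + k + 1]) for i in range(max(n_pairs, 0)))
    total = max(n_pairs, 1)  # avoid division by zero for short sequences
    return [pairs[(a, b)] / total for a in AMINO_ACIDS for b in AMINO_ACIDS]


def majority_vote(predictions):
    """Hard voting: each inner list holds one classifier's 0/1 labels."""
    # A sample is labeled 1 only if more than half of the classifiers say 1.
    return [int(2 * sum(votes) > len(votes)) for votes in zip(*predictions)]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

With an odd number of base classifiers the vote cannot tie; with an even number, this sketch resolves ties to label 0, which is one possible vote counting rule among several.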

Author(s):  
Deepak Singh ◽  
Dilip Singh Sisodia ◽  
Pradeep Singh

This chapter explores a novel evolutionary feature selection model for ACP identification that uncovers relationships hidden across the various feature descriptors. In this model, the authors amalgamate nine feature descriptors from three groups of peptide feature descriptors: amino acid composition (three descriptors), grouped amino acid composition and composition/transition/distribution (three descriptors). The proposed model integrates these features to reveal the hidden associations between the diverse features in peptide classification. However, including irrelevant, redundant, and noisy attributes in the model-building phase can result in poor predictive performance and increased computation. Hence, the model uses evolutionary feature selection, which combines search with feature utility estimation by ReliefF score. Extensive experiments on a benchmark dataset demonstrate that the proposed model achieves improved performance.


2020 ◽  
Vol 15 ◽  
Author(s):  
Affan Alim ◽  
Abdul Rafay ◽  
Imran Naseem

Background: Proteins contribute significantly to every task of cellular life. Their functions encompass the building and repairing of tissues in humans and other organisms; hence they are the building blocks of bones, muscles, cartilage, skin, and blood. Similarly, antifreeze proteins (AFPs) are of prime significance for organisms that live in very cold areas. With the help of these proteins, cold-water organisms can survive at sub-zero temperatures and resist the water crystallization process, which may rupture internal cells and tissues. AFPs have attracted attention and interest in the food industry and in cryopreservation. Objective: With the increasing availability of genomic sequence data for proteins, an automated and sophisticated tool for AFP recognition and identification is urgently needed. The sequences and structures of AFPs are highly distinct; therefore, most of the proposed methods fail to show promising results on different structures. A consolidated method is proposed to produce competitive performance on highly distinct AFP structures. Methods: In this study, we propose to use the machine learning-based algorithms Principal Component Analysis (PCA) followed by Gradient Boosting (GB) for antifreeze protein identification. To analyze and validate the performance of the proposed model, various combinations of two-segment amino acid and dipeptide compositions are used. PCA is used for dimension reduction while retaining the high-variance directions of the data, and is followed by an ensemble method, gradient boosting, for modelling and classification. Results: The proposed method obtained superior performance on the PDB, Pfam and UniProt datasets compared with the RAFP-Pred method. In experiment 3, using only 150 PCA components, a high accuracy of 89.63 was achieved, which is superior to the 87.41 reported for the RAFP-Pred method using 300 significant features. Experiment 2 was conducted using two different datasets, with non-AFPs from the PISCES server and AFPs from the Protein Data Bank. In this experiment, our proposed method attained a high sensitivity of 79.16, which is 12.50 better than the state-of-the-art RAFP-Pred method. Conclusion: AFPs have a common function with distinct structures. Therefore, a single model developed for different sequences often fails to identify AFPs. Our proposed model has shown robust results across diverse training and testing datasets. The proposed model outperformed previous AFP prediction methods such as RAFP-Pred. Our model consists of PCA for dimension reduction followed by gradient boosting for classification. Owing to its simplicity, scalability and high performance, our model can be easily extended to analyze proteomic and genomic datasets.
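The dimension reduction step described in this abstract can be illustrated with a small pure-Python sketch of PCA. It is an illustration only, under stated assumptions: it extracts just the first principal component by power iteration, the helper names are hypothetical, and neither the 150-component projection nor the gradient boosting classifier from the paper is reproduced.

```python
from math import sqrt


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iters=200):
    """Leading eigenvector of cov by power iteration (unit length)."""
    d = len(cov)
    # Assumption: the start vector is not orthogonal to the leading
    # eigenvector; a random start would make that failure case unlikely.
    v = [1.0 / sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


def project(centered, v):
    """Coordinate of each centered row along direction v."""
    return [sum(r[j] * v[j] for j in range(len(v))) for r in centered]
```

In practice the projected coordinates, rather than the raw composition features, would be passed to the classifier, which is why fewer retained components can still preserve most of the variance in the data.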


2021 ◽  
Vol 11 (5) ◽  
pp. 2083
Author(s):  
Jia Xie ◽  
Zhu Wang ◽  
Zhiwen Yu ◽  
Bin Guo ◽  
Xingshe Zhou

Ischemic stroke is a typical chronic disease caused by degeneration of the neural system; it usually causes great harm to patients and significantly reduces quality of life. It is therefore crucial to extract useful predictors from physiological signals, and to diagnose or predict ischemic stroke when there are no apparent symptoms. In this study, we put forward a novel prediction method that explores sleep-related features. First, to characterize the pattern of ischemic stroke accurately, we extract a set of effective features from several aspects, including clinical features, fine-grained sleep structure-related features and electroencephalogram-related features. Second, we design a two-step prediction model that combines commonly used classifiers with a data filter model to optimize the prediction result. We evaluate the framework using a real polysomnogram dataset that contains 20 stroke patients and 159 healthy individuals. Experimental results demonstrate that the proposed model can predict stroke events effectively: the Precision, Recall, Precision Recall Curve and Area Under the Curve are 63%, 85%, 0.773 and 0.919, respectively.
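To make the reported precision and recall concrete on such an imbalanced dataset, the short sketch below computes both from confusion-matrix counts. The counts are hypothetical and are not reported in the abstract: they were chosen only because they are consistent with 20 stroke patients and with the rounded figures of 63% and 85%, assuming every patient is scored once.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts (an assumption, not data from the paper):
# 17 of 20 patients detected, 10 healthy individuals flagged by mistake.
precision, recall = precision_recall(tp=17, fp=10, fn=3)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints "precision=0.63 recall=0.85"
```

Under these assumed counts, 149 of the 159 healthy individuals would be classified correctly, yet precision is only 63%. This is why accuracy alone says little when one class is roughly eight times larger than the other.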


2021 ◽  
Vol 99 (Supplement_1) ◽  
pp. 55-56
Author(s):  
Christian D Ramirez-Camba ◽  
Crystal L Levesque

Abstract: A mechanistic model was developed to characterize weight gain and essential amino acid (EAA) deposition in the different tissue pools that make up the pregnant sow: placenta, allantoic fluid, amniotic fluid, fetus, uterus, mammary gland, and maternal body. The data used in this modelling approach were obtained from published scientific articles reporting weights, crude protein (CP), and EAA composition in these tissues; only studies reporting at least 5 datapoints across gestation were considered. A total of 12 scientific articles published between 1977 and 2020 were selected for the development of the model, and the model was validated using 11 separate scientific papers. The model consists of three connected sub-models: a protein deposition (Pd) model, a weight gain model, and an EAA deposition model. Weight gain, Pd, and EAA deposition curves were developed with nonparametric statistics using spline regression. The validation of the model showed strong agreement between observed and predicted growth (r² = 0.92, root mean square error = 3%). The proposed model also offered descriptive insights into weight gain and Pd during gestation. The model suggests that the definition of time-dependent Pd is more accurately described as an increase in fluid deposition during mid-gestation coinciding with a reduction in Pd. In addition, due to differences in CP composition between pregnancy-related tissues and the maternal body, Pd by itself may not be the best measurement criterion for estimating the EAA requirement of pregnant sows. The proposed model also captures the negative maternal Pd that occurs in late gestation and indicates that litter size influences maternal tissue mobilization more than parity. The model predicts that the EAA requirements in early and mid-gestation are 75, 55 and 50% lower for primiparous sows than parity 2, 3 and 4+ sows, respectively, which suggests the potential benefits of parity-segregated feeding.


2011 ◽  
Vol 56 (3) ◽  
pp. 1331-1341 ◽  
Author(s):  
Philip J. F. Troke ◽  
Marilyn Lewis ◽  
Paul Simpson ◽  
Katrina Gore ◽  
Jennifer Hammond ◽  
...  

ABSTRACT Filibuvir (PF-00868554) is an investigational nonnucleoside inhibitor of the hepatitis C virus (HCV) nonstructural 5B (NS5B) RNA-dependent RNA polymerase currently in development for treating chronic HCV infection. The aim of this study was to characterize the selection of filibuvir-resistant variants in HCV-infected individuals receiving filibuvir as short (3- to 10-day) monotherapy. We identified amino acid M423 as the primary site of mutation arising upon filibuvir dosing. Through bulk cloning of clinical NS5B sequences into a transient-replicon system, and supported by site-directed mutagenesis of the Con1 replicon, we confirmed that mutations M423I/T/V mediate phenotypic resistance. Selection in patients of an NS5B mutation at M423 was associated with a reduced replicative capacity in vitro relative to the pretherapy sequence; consistent with this, reversion to wild-type M423 was observed in the majority of patients following therapy cessation. Mutations at NS5B residues R422 and M426 were detected in a small number of patients at baseline or the end of therapy and also mediate reductions in filibuvir susceptibility, suggesting these are rare but clinically relevant alternative resistance pathways. Amino acid variants at position M423 in HCV NS5B polymerase are the preferred pathway for selection of viral resistance to filibuvir in vivo.


2004 ◽  
Vol 78 (2) ◽  
pp. 868-881 ◽  
Author(s):  
Rachel H. Edwards ◽  
Diane Sitki-Green ◽  
Dominic T. Moore ◽  
Nancy Raab-Traub

ABSTRACT Seven distinct sequence variants of the Epstein-Barr virus latent membrane protein 1 (LMP1) have been identified by distinguishing amino acid changes in the carboxy-terminal domain. In this study the transmembrane domains are shown to segregate identically with the distinct carboxy-terminal amino acid sequences. Since strains of LMP1 have been shown to differ in abundance between blood and throat washes, nasopharyngeal carcinomas (NPCs) from areas of endemicity and nonendemicity with matching blood were analyzed by using a heteroduplex tracking assay to distinguish LMP1 variants. Striking differences were found between the compartments with the Ch1 strain prevalent in the NPCs from areas of endemicity and nonendemicity and the B958 strain prevalent in the blood of the endemic samples, whereas multiple strains of LMP1 were prevalent in the blood of the nonendemic samples. The possible selection against the B958 strain appearing in the tumor was highly significant (P < 0.0001). Sequence analysis of the full-length LMP1 variants revealed changes in many of the known and computer-predicted HLA-restricted epitopes with changes in key positions in multiple, potential epitopes for the specific HLA of the patients. These amino acid substitutions at key positions in the LMP1 epitopes may result in a reduced cytotoxic-T-lymphocyte response. These data indicate that strains with specific variants of LMP1 are more likely to be found in NPC. The predominance of specific LMP1 variants in NPC could reflect differences in the biologic or molecular properties of the distinct forms of LMP1 or possible immune selection.


Author(s):  
S. Elavaar Kuzhali ◽  
D. S. Suresh

Image denoising is considered a fundamental pre-processing step in handling digital images for various applications. Diverse image denoising algorithms have been introduced in the past few decades. The main intent of this proposal is to develop an effective image denoising model on the basis of internal and external patches. This model adopts Non-local means (NLM) for denoising, which uses redundant information of the image in the pixel or spatial domain to reduce the noise. While performing the image denoising using NLM, “denoising an image patch using the other noisy patches within the noisy image is done for internal denoising and denoising a patch using the external clean natural patches is done for external denoising”. Here, the optimal block is selected from the entire dataset, including internal noisy images and external clean natural images, by a new hybrid optimization algorithm. Two renowned optimization algorithms, Chicken Swarm Optimization (CSO) and Dragon Fly Algorithm (DA), are merged, and the resulting hybrid algorithm, Rooster-based Levy Updated DA (RLU-DA), is adopted. The experimental results, in terms of some relevant performance measures, show promising results for the proposed model with remarkable stability and high accuracy.


Vaccine ◽  
2018 ◽  
Vol 36 (43) ◽  
pp. 6383-6392 ◽  
Author(s):  
Marta L. DeDiego ◽  
Kevin Chiem ◽  
David J. Topham

2019 ◽  
Vol 2 (4) ◽  
pp. 530
Author(s):  
Amr Hassan Yassin ◽  
Hany Hamdy Hussien

Due to the exponential growth of E-Business and of pay-for-use computing capabilities over the web, the risk factors regarding security issues also increase rapidly. As usage increases, it becomes very difficult to identify malicious attacks since the attack patterns change. Therefore, host machines in the network must be continually monitored for intrusions, since they are the final endpoint of any network. The purpose of this work is to introduce a generalized neural network model that can detect network intrusions. Two recent heuristic algorithms inspired by the behavior of natural phenomena, namely the particle swarm optimization (PSO) and gravitational search (GSA) algorithms, are introduced. These algorithms are combined to train a feed-forward neural network (FNN), using their strengths to reduce the problems of getting stuck in local minima and slow convergence. Dimension reduction uses information obtained from the NSL-KDD Cup 99 data set to select some features for discovering the type of attack. Attack detection and the performance of the proposed model are evaluated under different patterns of network data.

