positive dataset
Recently Published Documents

Identification of distinct characteristics of antibiofilm peptides and prospection of diverse sources for efficacious sequences

A majority of microbial infections are associated with biofilms. Targeting biofilms is considered an effective strategy to limit microbial virulence while minimizing the development of antibiotic resistance. To this end, antibiofilm peptides are an attractive arsenal, since they have properties orthogonal to those of small-molecule drugs. In this work, we developed machine learning models to identify the distinguishing characteristics of known antibiofilm peptides and to mine peptide databases from diverse habitats for new peptides with potential antibiofilm activity. Additionally, we used the reported minimum biofilm inhibitory/eradication concentrations (MBIC/MBEC) of the antibiofilm peptides to build a regression model on top of the classification model to predict the effectiveness of new antibiofilm peptides. We used a positive dataset containing 242 antibiofilm peptides and a negative dataset which, unlike previous datasets, contains peptides that are likely to promote biofilm formation. Our model achieved a classification accuracy greater than 98%, with harmonic mean of precision and recall (F1) and Matthews correlation coefficient (MCC) scores greater than 0.90; the regression model achieved an MCC score greater than 0.81. We used our classification-regression pipeline to evaluate 135,015 peptides from diverse sources and identified antibiofilm peptide candidates that are efficacious against preformed biofilms at micromolar concentrations. Structural analysis of the top 37 hits revealed more helices and coils than sheets. Sequence alignment of these hits with known antibiofilm peptides revealed that, while some of the hits showed relatively high sequence similarity with known peptides, others did not, indicating the presence of antibiofilm activity in novel sources or sequences. Further, some of the hits had previously recognized therapeutic properties or host defense traits, suggesting drug repurposing applications. Taken together, this work demonstrates a new in silico approach to predicting antibiofilm efficacy and identifies promising new candidates for biofilm eradication.
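
The abstract above reports F1 and MCC scores for the classifier. As a reminder of how these two metrics are defined, here is a minimal sketch that computes both from a binary confusion matrix. The function name `f1_and_mcc` and the counts are hypothetical and are not taken from the paper.

```python
import math

def f1_and_mcc(tp, fp, fn, tn):
    """Return (F1, MCC) for a binary confusion matrix."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc

# Hypothetical counts for illustration only.
f1, mcc = f1_and_mcc(tp=45, fp=5, fn=5, tn=45)
print(f"F1={f1:.3f}  MCC={mcc:.3f}")
```

Unlike F1, MCC also rewards correct negatives, which is relevant when the negative dataset is chosen as deliberately as the one described here.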

Diabetes Mellitus Prediction and Severity Level Estimation Using OWDANN Algorithm

Computational Intelligence and Neuroscience, 2021, Vol 2021, pp. 1-11
DOI: 10.1155/2021/5573179

Author(s): Annamalai R, Nedunchelian R

Keyword(s): Diabetes Mellitus, State Of The Art, Disease Prediction, Severity Level, Training Time, The World, Positive Dataset, Diabetes Prediction, Two Phases, F Measure

Today, diabetes is one of the most prevalent, chronic, and deadly diseases in the world, owing to its complications. If accurate early diagnosis is feasible, the risk factors and incidence of diabetes may be greatly decreased. Diabetes prediction is stable and reliable, since only minimal labelling evidence and outliers are found in diabetes datasets. Numerous works have addressed diabetes prediction and proposed solutions, but the existing methods offer low detection accuracy and require long training times. This paper therefore proposes an OWDANN algorithm for diabetes mellitus prediction and severity level estimation. The proposed system consists of two phases: disease prediction and severity level estimation. In the disease prediction phase, the Pima dataset is preprocessed, features are extracted from the preprocessed data, and classification is performed using OWDANN. In the severity level estimation phase, the diabetes-positive dataset is preprocessed, features are extracted, and the severity level is predicted using GDHC. Extensive experimental results showed that the proposed system outperforms state-of-the-art methods, with 98.97% accuracy, 94.98% sensitivity, 95.62% specificity, 97.02% precision, 93.84% recall, 94.04% F-measure, 0.094% FDR, and 0.023% FPR.
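
The rates quoted in this abstract all derive from the four cells of a binary confusion matrix. The sketch below shows the standard textbook definitions, which may differ from how the paper computed its figures. The function name `confusion_rates` and the counts are hypothetical.

```python
def confusion_rates(tp, fp, fn, tn):
    """Standard rates derived from a binary confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate (textbook recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),    # positive predictive value
        "fdr": fp / (fp + tp),          # false discovery rate = 1 - precision
        "fpr": fp / (fp + tn),          # false positive rate = 1 - specificity
    }

# Hypothetical counts for illustration only.
rates = confusion_rates(tp=90, fp=10, fn=10, tn=190)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```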

A Two-Stage Machine Learning Classification Approach to Identify Extremism in Arabic Opinions

International Journal of Advanced Trends in Computer Science and Engineering, 2021, Vol 10 (2), pp. 736-745
DOI: 10.30534/ijatcse/2021/391022021

Keyword(s): Machine Learning, Binary Classification, Feature Selection Method, Support Vector, Two Stage, Machine Learning Classification, Second Stage, Testing Data, Stage Classification, Positive Dataset

The increased use of the Internet and social networks has enabled people to express their views, which have attracted growing attention lately. Sentiment Analysis (SA) techniques are used to determine the polarity of information, either positive or negative, toward a given topic, including opinions. In this research, we introduce a machine learning approach based on Support Vector Machine (SVM), Naïve Bayes (NB), and Random Forest (RF) classifiers to find and classify extreme opinions in Arabic reviews. To this end, a dataset of 1500 Arabic reviews was collected from the Google Play Store, and a two-stage classification process was applied to classify the reviews. In the first stage, we built a binary classifier to separate positive from negative reviews. In the second stage, we applied a binary classification mechanism based on a set of proposed rules that distinguishes extreme positive from positive reviews and extreme negative from negative reviews. Four major experiments, comprising a total of 10 sub-experiments, were conducted to carry out the two-stage process using different cross-validation schemes and the Term Frequency-Inverse Document Frequency (TF-IDF) feature selection method. The results indicated that, in the first-stage classification, SVM performed best with 30% testing data and NB performed best with 20% testing data. In the second-stage classification, SVM scored better in identifying extreme positive reviews in the positive dataset, with an overall accuracy of 68.7%, while NB showed better accuracy in identifying extreme negative reviews in the negative dataset, with an overall accuracy of 72.8%.
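
The abstract names TF-IDF as its feature method without giving the exact weighting. The sketch below shows one common variant, assuming term frequency normalized by document length and an unsmoothed logarithmic inverse document frequency. The function name `tfidf` and the English toy reviews are hypothetical stand-ins for the Arabic data.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Toy English reviews, already tokenized on whitespace (illustration only).
reviews = [
    "great app love it".split(),
    "bad app hate it".split(),
    "great great update".split(),
]
vectors = tfidf(reviews)
print(vectors[0])
```

Terms that occur in fewer reviews, such as "love", receive a higher weight than terms shared across reviews, such as "app", which is what makes the weights useful as classifier inputs.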

A novel approach for predicting protein S-glutathionylation

BMC Bioinformatics, 2020, Vol 21 (S11)
DOI: 10.1186/s12859-020-03571-w
Cited By ~ 1

Author(s): Anastasia A. Anashkina, Yuri M. Poluektov, Vladimir A. Dmitriev, Eugene N. Kuznetsov, Vladimir A. Mitkevich, ...

Keyword(s): Disulfide Bonds, Prediction Method, Protein S, Redox Status, Cysteine Residue, Cysteine Residues, Amino Acid Residues, Novel Approach, Positive Dataset, Unknown Structure

Background: S-glutathionylation is the formation of disulfide bonds between the tripeptide glutathione and cysteine residues of a protein, protecting them from irreversible oxidation and in some cases changing the protein's function. Regulatory glutathionylation of proteins is a controllable and reversible process associated with the cell's response to changing redox status. Predicting the cysteine residues that undergo glutathionylation makes it possible to find new target proteins whose function can be altered in pathologies associated with impaired redox status. We set out to analyze this issue and create a new tool for predicting S-glutathionylated cysteine residues.

Results: One hundred forty proteins with experimentally proven S-glutathionylated cysteine residues were found in the literature and the RedoxDB database. These proteins contain 1018 non-S-glutathionylated cysteines and 235 S-glutathionylated ones. From the 235 S-glutathionylated cysteines, a non-redundant positive dataset of 221 heptapeptide sequences was constructed. From these 221 heptapeptide sequences, a position-specific matrix was created by analyzing the protein sequence near the cysteine residue (three amino acid residues before and three after the cysteine). We propose a method for calculating a glutathionylation propensity score that uses the position-specific matrix, together with a criterion for predicting glutathionylated peptides.

Conclusion: Non-S-glutathionylated sites were enriched in cysteines at the −3 and +3 positions. The proposed prediction method correctly predicts 76.6% of S-glutathionylated cysteines. This method can be used to detect new glutathionylation sites, especially in proteins with an unknown structure.
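
The abstract does not give the formula for the propensity score, so the following is only a sketch of the general idea of a position-specific matrix over heptapeptides. It assumes a log-odds matrix against a uniform background with add-one pseudocounts, summed over the six flanking positions. The names `build_matrix` and `propensity` and the example peptides are hypothetical, and this is not the authors' scoring scheme.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_matrix(heptapeptides, pseudocount=1.0):
    """Per-position log-odds of each residue versus a uniform background."""
    background = 1.0 / len(AMINO_ACIDS)
    total = len(heptapeptides) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(7):
        counts = Counter(pep[pos] for pep in heptapeptides)
        matrix.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return matrix

def propensity(matrix, heptapeptide):
    """Sum log-odds over the flanking positions; index 3 is the cysteine."""
    return sum(matrix[pos][aa]
               for pos, aa in enumerate(heptapeptide) if pos != 3)

# Made-up heptapeptides centred on cysteine; not taken from RedoxDB.
positives = ["AKLCDEG", "AKVCDEG", "GKLCEEG", "AKLCDDA"]
matrix = build_matrix(positives)
score_like = propensity(matrix, "AKLCDEG")
score_unlike = propensity(matrix, "WWWCWWW")
print(f"similar: {score_like:.2f}  dissimilar: {score_unlike:.2f}")
```

A prediction criterion in this setting would be a threshold on the score; the threshold used in the paper is not stated in the abstract.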

Construction of pseudo CRS frontier for a negative data using RTS model of Allahyar & modified multiplier BCC model

RAIRO - Operations Research, 2020
DOI: 10.1051/ro/2020039

Author(s): Subhadip Sarkar

Keyword(s): Decision Making, Returns To Scale, Data Envelopment, Negative Data, Data Set, Decision Making Unit, Multiple Input, Data Problem, Positive Dataset, Decision Making Units

Performance measurement of Decision Making Units (DMUs) with a mix of positive and negative data has been an extensively researched topic in Data Envelopment Analysis. However, assessment of Returns to Scale (RTS) under the negative data problem is rarely seen apart from the steps given by Allahyar, M. (2015), who proposed a solution in the vicinity of the Decision Making Unit under examination to predict the nature of a firm's returns to scale. The present investigation aims to extend the research of Allahyar, M. (2015) to identify a Pseudo Frontier for a negative data problem under Constant Returns to Scale (CRS). In addition, a new origin based on the provided data is computed in order to convert the entire data set into a positive dataset. However, this approach appears to be ineffective for creating a frontier in the multiple-input, multiple-output scenario. To address this, a new variation of the multiplier form of the BCC model is proposed to detect the new origin for designing the Pseudo CRS Frontier. Small examples illustrate the CRS-efficient DMUs obtained with the methods described by Allahyar, M. (2015) and the identification of the new origin from the multiplier form of the BCC model.
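
The idea of moving the origin so that a mixed-sign data set becomes a positive dataset can be illustrated with a plain translation. The sketch below assumes the simplest possible choice of new origin, the per-column minimum minus a margin. It is a stand-in for illustration and not the origin obtained from the modified multiplier BCC model in the paper. The function name `shift_to_positive` and the data are hypothetical.

```python
def shift_to_positive(rows, margin=1.0):
    """Translate each column so that its minimum value becomes `margin`."""
    n_cols = len(rows[0])
    # New origin: one coordinate per column, just below the column minimum.
    origin = [min(row[j] for row in rows) - margin for j in range(n_cols)]
    shifted = [[row[j] - origin[j] for j in range(n_cols)] for row in rows]
    return origin, shifted

# Each row is one DMU; columns mix negative and positive values.
data = [[-3.0, 5.0], [2.0, -1.0], [4.0, 0.5]]
origin, shifted = shift_to_positive(data)
print("new origin:", origin)
print("shifted data:", shifted)
```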
