Supervised learning techniques to predict compounds in pathway modules based on molecular properties

Author(s):  
Hayat Ali Shah

# Machine learning classifiers for prediction of pathway modules and their classes

We use the SMILES representation of query molecules to generate relevant fingerprints, which are then fed to the machine learning classifiers ETC to produce binary labels for the pathway module and its classes. The details of the work are described in our paper.

The dataset comprises 6597 compounds downloaded from KEGG: 4612 compounds are labeled by whether or not they belong to a pathway module in a metabolic pathway, and the remaining 1985 compounds are used for the module class prediction problem.

### Requirements

* Chemoinformatics tools
* Python
* scikit-learn
* RDKit
* Jupyter Notebook

### Usage

We provide two folders containing classifier files, grid search for hyperparameter optimization, and datasets (module, module classes).
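The flow described above (SMILES string, then fingerprint, then binary label) can be illustrated with a small standard-library sketch. It is not the repository's code: the repository relies on RDKit fingerprints and scikit-learn classifiers, whereas this sketch swaps in a toy hashed-substring fingerprint and a 1-nearest-neighbour rule based on Tanimoto similarity, purely to show how the data moves. The function names, the `LABELLED` list, and its labels are hypothetical and carry no chemical meaning.

```python
import zlib

def toy_fingerprint(smiles, n_bits=256, max_len=3):
    """Hash every substring of length 1..max_len to a bit index.

    Toy stand-in for a real chemical fingerprint (e.g. one produced by RDKit);
    it looks only at characters, not at molecular structure.
    """
    bits = set()
    for size in range(1, max_len + 1):
        for start in range(len(smiles) - size + 1):
            fragment = smiles[start:start + size]
            # crc32 is deterministic across runs, unlike the built-in hash().
            bits.add(zlib.crc32(fragment.encode("utf-8")) % n_bits)
    return frozenset(bits)

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def predict_label(query_smiles, labelled):
    """Return the label of the most similar labelled compound (1-NN)."""
    query_fp = toy_fingerprint(query_smiles)
    _, best_label = max(
        labelled,
        key=lambda item: tanimoto(query_fp, toy_fingerprint(item[0])),
    )
    return best_label

# Hypothetical training pairs: (SMILES, 1 = in a pathway module, 0 = not).
# The labels are invented for illustration only.
LABELLED = [
    ("CCO", 0),
    ("CC(=O)O", 1),
    ("c1ccccc1", 0),
    ("OCC(O)CO", 1),
]

print(predict_label("CCCO", LABELLED))
```

In a real run, the fingerprint step would come from RDKit and the classifier from scikit-learn, with hyperparameters chosen by grid search as the Usage section mentions.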

2021 ◽  

2019 ◽  
Vol 8 (7) ◽  
pp. 1050 ◽  
Author(s):  
Meghana Padmanabhan ◽  
Pengyu Yuan ◽  
Govind Chada ◽  
Hien Van Nguyen

Machine learning is often perceived as a sophisticated technology accessible only to highly trained experts. This prevents many physicians and biologists from using this tool in their research. The goal of this paper is to eliminate this outdated perception. We argue that the recent development of auto machine learning techniques enables biomedical researchers to quickly build competitive machine learning classifiers without requiring in-depth knowledge about the underlying algorithms. We study the case of predicting the risk of cardiovascular diseases. To support our claim, we compare auto machine learning techniques against a graduate student using several important metrics, including the total amount of time required for building machine learning models and the final classification accuracies on unseen test datasets. In particular, the graduate student manually builds multiple machine learning classifiers and tunes their parameters for one month using scikit-learn, a popular machine learning library, to obtain the ones that perform best on two given, publicly available datasets. We run an auto machine learning library called auto-sklearn on the same datasets. Our experiments find that automatic machine learning takes 1 h to produce classifiers that perform better than the ones built by the graduate student in one month. More importantly, building these classifiers requires only a few lines of standard code. Our findings are expected to change the way physicians see machine learning and encourage wide adoption of Artificial Intelligence (AI) techniques in clinical domains.


Author(s):  
Surafel Mehari Atnafu ◽  
Anuja Kumar Acharya

Today, information is transmitted from one place to another using network communication technology. Because of this, networked systems require a high-security environment. The main strategy for securing this environment is to correctly identify each packet and detect whether it contains malicious content or whether any illegal activity has occurred in the network. To accomplish this, we use an intrusion detection system (IDS). Intrusion detection is a security technology designed to detect intrusions and automatically alert or notify a responsible person. However, creating an efficient intrusion detection system faces a number of challenges, namely false detections and data with a high number of features. Currently, many researchers use machine learning techniques to overcome the limitations of intrusion detection and to increase its efficiency in correctly identifying whether a packet is normal or malicious. Many machine learning techniques are used in intrusion detection; the question, however, is which machine learning classifiers have the potential to address the intrusion detection problem in a network security environment. Choosing the appropriate machine learning technique is required to improve the accuracy of an intrusion detection system. In this work, three machine learning classifiers are analyzed: Support Vector Machine, Naïve Bayes, and K-Nearest Neighbor. These algorithms are tested on the NSL-KDD dataset using a combination of Chi-square and Extra Tree feature selection methods, and Python is used to implement, analyze, and evaluate the classifiers. Experimental results show that the K-Nearest Neighbor classifier outperforms the other methods in categorizing whether a packet is normal or malicious.
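The abstract does not give implementation details, so the following is only a minimal sketch of two of the ingredients it names: chi-square feature scoring and a K-Nearest Neighbor classifier. It uses the classical Pearson chi-square statistic on a contingency table of categorical feature values versus class labels, and a Hamming-distance K-NN with majority vote. The Extra Tree half of the feature selection is not reproduced. The rows, labels, and function names are invented for illustration and are not taken from NSL-KDD or from the authors' code.

```python
from collections import Counter

def chi_square(feature_values, labels):
    """Pearson chi-square statistic between one categorical feature and the class."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    stat = 0.0
    for value, value_count in feature_totals.items():
        for label, label_count in label_totals.items():
            expected = value_count * label_count / n
            stat += (observed[(value, label)] - expected) ** 2 / expected
    return stat

def select_top_k(rows, labels, k):
    """Score every column and return the indices of the k best, plus all scores."""
    n_features = len(rows[0])
    scores = [chi_square([row[j] for row in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores

def project(row, columns):
    """Keep only the selected columns of a row."""
    return tuple(row[j] for j in columns)

def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k training rows closest in Hamming distance."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))
    nearest = sorted(zip(train_rows, train_labels),
                     key=lambda pair: hamming(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up toy data: column 0 separates the classes perfectly,
# column 1 is independent of the class, column 2 is weakly informative.
ROWS = [(0, 0, 0), (0, 1, 0), (0, 0, 0), (0, 1, 1),
        (1, 0, 1), (1, 1, 1), (1, 0, 1), (1, 1, 0)]
LABELS = ["normal"] * 4 + ["attack"] * 4

selected, scores = select_top_k(ROWS, LABELS, k=2)
train = [project(row, selected) for row in ROWS]
prediction = knn_predict(train, LABELS, project((1, 0, 1), selected), k=3)
print(selected, scores, prediction)
```

Scoring features before classification addresses the "high number of features" challenge mentioned above: uninformative columns such as column 1 here receive a score of zero and are dropped before distances are computed.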


2020 ◽  
Vol 16 (1) ◽  
pp. 67
Author(s):  
Minghua Jia ◽  
Xiaodong Wang ◽  
Yue Xu ◽  
Zhanqi Cui ◽  
Ruilin Xie

2021 ◽  
Vol 13 (8) ◽  
pp. 1433
Author(s):  
Shobitha Shetty ◽  
Prasun Kumar Gupta ◽  
Mariana Belgiu ◽  
S. K. Srivastav

Machine learning classifiers are increasingly used for Land Use and Land Cover (LULC) mapping from remote sensing images. However, choosing the right classifier requires understanding the main factors influencing their performance. The present study first investigated the effect of training sampling design on the classification results obtained by the Random Forest (RF) classifier and then compared its performance with other machine learning classifiers for LULC mapping using multi-temporal satellite remote sensing data and the Google Earth Engine (GEE) platform. We evaluated the impact of three sampling methods, namely Stratified Equal Random Sampling (SRS(Eq)), Stratified Proportional Random Sampling (SRS(Prop)), and Stratified Systematic Sampling (SSS), on the classification results obtained by the RF-trained LULC model. Our results showed that the SRS(Prop) method favors major classes while achieving good overall accuracy. The SRS(Eq) method provides good class-level accuracies, even for minority classes, whereas the SSS method performs well for areas with large intra-class variability. In the comparison of machine learning classifiers, RF outperformed Classification and Regression Trees (CART), Support Vector Machine (SVM), and Relevance Vector Machine (RVM) at a >95% confidence level. The performance of the CART and SVM classifiers was found to be similar. RVM achieved good classification results with a limited number of training samples.
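The difference between the equal and proportional stratified designs can be sketched in a few lines. This is an illustration under assumptions, not the study's procedure: the abstract does not define the exact allocation rules, so the sketch assumes that the equal design draws the same number of training samples from every class and that the proportional design draws in proportion to class size. The class names, pixel counts, and function names are hypothetical, and the systematic design (SSS) is not covered.

```python
import random
from collections import defaultdict

def group_by_class(samples):
    """Map each class name to the list of sample ids that belong to it."""
    groups = defaultdict(list)
    for sample_id, cls in samples:
        groups[cls].append(sample_id)
    return groups

def stratified_equal(samples, n_total, seed=0):
    """Draw the same number of samples from every class."""
    rng = random.Random(seed)
    groups = group_by_class(samples)
    per_class = n_total // len(groups)
    return {cls: rng.sample(ids, min(per_class, len(ids)))
            for cls, ids in groups.items()}

def stratified_proportional(samples, n_total, seed=0):
    """Draw from every class in proportion to its share of the population."""
    rng = random.Random(seed)
    groups = group_by_class(samples)
    total = len(samples)
    return {cls: rng.sample(ids, max(1, round(n_total * len(ids) / total)))
            for cls, ids in groups.items()}

# Hypothetical population of 100 labelled pixels with imbalanced classes.
CLASSES = ["forest"] * 70 + ["water"] * 20 + ["urban"] * 10
PIXELS = list(enumerate(CLASSES))

equal = stratified_equal(PIXELS, n_total=30)
proportional = stratified_proportional(PIXELS, n_total=30)
print({cls: len(ids) for cls, ids in equal.items()})
print({cls: len(ids) for cls, ids in proportional.items()})
```

With these assumed allocations, the minority class contributes as many training samples as the majority class under the equal design but only a small share under the proportional one, which is consistent with the abstract's observation that SRS(Prop) favors major classes while SRS(Eq) keeps class-level accuracy for minority classes.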

