Classification study of solvation free energies of organic molecules using machine learning techniques

N. S. Hari Narayana Moorthy; Silvia A. Martins; Sergio F. Sousa; Maria J. Ramos; Pedro A. Fernandes

doi:10.1039/c4ra07961b

Alzheimer's Disease Early Detection Using Machine Learning Techniques

10.21203/rs.3.rs-624520/v1 ◽

2021 ◽

Author(s):

Roobaea Alroobaea ◽

Seifeddine Mechti ◽

Mariem Haoues ◽

Saeed Rubaiee ◽

Anas Ahmed ◽

...

Keyword(s):

Machine Learning ◽

Alzheimer's Disease ◽

Support Vector Machine ◽

Logistic Regression ◽

Random Forest ◽

Machine Learning Techniques ◽

Support Vector ◽

Disease Detection ◽

Learning Techniques

Abstract Alzheimer's disease is the main cause of dementia and frequently affects older adults. The disease is costly, especially in terms of treatment, and it is one of the causes of death among elderly people. Early detection of Alzheimer's helps medical staff diagnose the disease, which should decrease the risk of death. This makes early detection of Alzheimer's disease a crucial problem in the healthcare industry. The objective of this study is to introduce a computer-aided diagnosis system for Alzheimer's disease detection using machine learning techniques. We employed data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the Open Access Series of Imaging Studies (OASIS) brain datasets. Common supervised machine learning techniques, such as logistic regression, support vector machine, random forest and linear discriminant analysis, among others, were applied for automatic Alzheimer's disease detection. On the ADNI dataset, the best accuracies were 99.43% and 99.10%, obtained with logistic regression and support vector machine, respectively; on the OASIS dataset, the best accuracies were 84.33% and 83.92%, obtained with logistic regression and random forest, respectively.

Exploring Statistical Parameters of Machine Learning Techniques for Detection and Classification of Brain Tumor

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.j9860.0881019 ◽

2019 ◽

Vol 8 (10) ◽

pp. 4118-4124

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Brain Tumor ◽

Random Forest ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Techniques ◽

Artificial Neural ◽

Computerized System ◽

The Brain

A computerized system can improve doctors' ability to identify disease and reduce the time needed for identification and decision-making in healthcare. Gliomas are brain tumors that can be labeled as benign (non-cancerous) or malignant (cancerous). Hence, identifying the stage of the tumor is extremely important for choosing appropriate medication. In this paper, a system is proposed to detect brain tumors of different stages from MR images. The proposed system uses Fuzzy C-Means (FCM) as a clustering technique. The main focus of this paper is to refine the required features in two steps, with the help of the Discrete Wavelet Transform (DWT) and Independent Component Analysis (ICA), and to apply three machine learning techniques: Random Forest (RF), Artificial Neural Network (ANN) and Support Vector Machine (SVM). The experiments indicate that the proposed system identifies brain tumors using RF, ANN and SVM with 100%, 91.6% and 95.8% accuracy, respectively. We have also calculated sensitivity, specificity, the Matthews correlation coefficient and the AUC-ROC curve. Random Forest shows the highest accuracy compared with Support Vector Machine and Artificial Neural Network.
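
The evaluation measures named in this abstract (sensitivity, specificity, accuracy and the Matthews correlation coefficient) can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function name `binary_metrics` and the counts in the usage line are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives.
    """
    # Sensitivity (recall): share of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of actual negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not results from the paper).
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
```

With these made-up counts, `metrics["sensitivity"]` is 0.8 and `metrics["specificity"]` is 0.9. The MCC stays informative when the classes are imbalanced, which is one common reason to report it alongside accuracy.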

Combining classifiers to detect faults in wastewater networks

Water Science & Technology ◽

10.2166/wst.2018.131 ◽

2018 ◽

Vol 77 (9) ◽

pp. 2184-2189 ◽

Cited By ~ 2

Author(s):

Joshua Myrans ◽

Zoran Kapelan ◽

Richard Everson

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Machine Learning Techniques ◽

Support Vector ◽

Minimal Impact ◽

South West ◽

Learning Techniques ◽

Increase In Accuracy ◽

The Uk

Abstract This work presents a methodology for automatic detection of structural faults in sewers from CCTV footage, which has been improved by combining the outputs of different machine learning techniques. The predictions of support vector machine and random forest classifiers are combined using three distinct techniques: ‘both’, ‘most likely’ and ‘stacking’. Each technique is tested on CCTV data taken from real surveys covering a range of pipes at locations in the south-west of the UK. The best technique tested, stacking, offers a 5% increase in accuracy with minimal impact on efficiency, proving useful for future development and implementation of the fault detection methodology.
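
The abstract names the combination rules but does not define them, so the sketch below is only one plausible reading. It assumes that each classifier outputs a fault probability, that ‘both’ flags a fault only when the two classifiers agree, and that ‘most likely’ follows whichever classifier is more confident. The function names, the 0.5 threshold and the probabilities are all hypothetical. Stacking is omitted because it requires training a further model on the two outputs.

```python
def combine_both(p_svm, p_rf, threshold=0.5):
    """Assumed 'both' rule: report a fault only if both classifiers do."""
    return p_svm >= threshold and p_rf >= threshold


def combine_most_likely(p_svm, p_rf, threshold=0.5):
    """Assumed 'most likely' rule: follow the more confident classifier.

    Confidence is measured as distance from the decision threshold.
    """
    if abs(p_svm - threshold) >= abs(p_rf - threshold):
        chosen = p_svm
    else:
        chosen = p_rf
    return chosen >= threshold


# Hypothetical fault probabilities for one CCTV frame.
both_says_fault = combine_both(0.9, 0.4)
most_likely_says_fault = combine_most_likely(0.9, 0.4)
```

Under these assumptions the two rules disagree on the example frame: ‘both’ rejects it because the random forest is below the threshold, while ‘most likely’ accepts it because the support vector machine is the more confident of the two.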

Machine learning predictivity applied to consumer creditworthiness

Future Business Journal ◽

10.1186/s43093-020-00041-w ◽

2020 ◽

Vol 6 (1) ◽

Author(s):

Maisa Cardoso Aniceto ◽

Flavio Barboza ◽

Herbert Kimura

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Credit Risk ◽

Performance Metrics ◽

Prediction Models ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Techniques ◽

Default Prediction

Abstract Credit risk evaluation plays an important role for financial institutions, since lending may result in real and immediate losses. In particular, default prediction is one of the most challenging activities in managing credit risk. This study analyzes the adequacy of borrower classification models using a Brazilian bank's loan database and explores machine learning techniques. We develop Support Vector Machine, Decision Tree, Bagging, AdaBoost and Random Forest models, and compare their predictive accuracy with a benchmark based on a Logistic Regression model. Comparisons are based on the usual classification performance metrics. Our results show that Random Forest and AdaBoost perform better than the other models. Moreover, Support Vector Machine models perform poorly with both linear and nonlinear kernels. Our findings suggest that there are value-creating opportunities for banks to improve default prediction models by exploring machine learning techniques.

Data-Driven Trend Forecasting in Stock Market Using Machine Learning Techniques

Journal of Information Technology Research ◽

10.4018/jitr.2020010109 ◽

2020 ◽

Vol 13 (1) ◽

pp. 130-149

Author(s):

Puneet Misra ◽

Siddharth Chaurasia

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Stock Market ◽

Machine Learning Techniques ◽

Support Vector ◽

Raw Data ◽

Learning Techniques ◽

Clear Winner ◽

Trend Forecasting ◽

Trend Data

Stock market movements are affected by numerous factors, making their forecasting one of the most challenging problems. This article attempts to predict the direction of movement of stocks and stock indices. The study uses three classifiers (Artificial Neural Network, Random Forest and Support Vector Machine) with four different representations of the inputs. The first representation uses raw data (open, high, low, close and volume). The second uses ten features in the form of technical indicators generated by technical analysis. The third and fourth representations are two different ways of converting the indicator data into discrete trend data. Experimental results suggest that, for raw data, the support vector machine provides the best results. For the other representations, there is no clear winner among the models applied, but representing the data with the proposed approach gave the best overall results for all the models and financial series. The consistency of the results highlights the importance of feature generation and the right representation of the dataset for machine learning techniques.
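
The abstract does not say how the indicator data are converted into discrete trend data, so the following is only an illustration of the general idea, not the authors' method. It assumes a simple moving average as the technical indicator and maps each value to +1 when it rises relative to the previous value and to -1 otherwise. The function names, the window size and the closing prices are hypothetical.

```python
def simple_moving_average(prices, window):
    """Return the moving average of `prices` over a sliding window."""
    return [
        sum(prices[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(prices))
    ]


def to_trend(values):
    """Map a continuous series to +1 (rising) or -1 (falling or flat).

    Each output compares a value with the one before it, so the result
    is one element shorter than the input.
    """
    return [1 if cur > prev else -1 for prev, cur in zip(values, values[1:])]


# Made-up closing prices, for illustration only.
closes = [10, 11, 12, 11, 13]
sma = simple_moving_average(closes, window=2)
trend = to_trend(sma)
```

A discrete representation of this kind gives the classifier the direction of an indicator rather than its magnitude, which is the sort of feature representation whose importance the abstract highlights.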

A review of machine learning techniques using decision tree and support vector machine

2016 International Conference on Computing Communication Control and automation (ICCUBEA) ◽

10.1109/iccubea.2016.7860040 ◽

2016 ◽

Cited By ~ 14

Author(s):

Madan Somvanshi ◽

Pranjali Chavan ◽

Shital Tambade ◽

S. V. Shinde

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Decision Tree ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Techniques

Credit Risk Assessment using Machine Learning Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a4936.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 3482-3486

Keyword(s):

Machine Learning ◽

Risk Assessment ◽

Random Forest ◽

Credit Risk ◽

Banking Sector ◽

Machine Learning Techniques ◽

Support Vector ◽

Credit Risk Assessment ◽

Learning Techniques ◽

Cart Algorithm

Credit scoring analysis is an effective credit risk assessment technique and one of the major research fields in the banking sector. Machine learning has a variety of applications in the banking sector and has been widely used for data analysis. Modern techniques such as machine learning provide a self-regulating process for analyzing data using classification techniques. Classification is a supervised learning process in which the computer learns from the input data provided and uses this information to classify new data. This research paper presents a comparison of various machine learning techniques used to evaluate credit risk. Models that decide whether a credit transaction should be accepted or rejected are trained and applied to the dataset using different machine learning algorithms. The techniques are implemented on the German credit dataset taken from the UCI repository, which has 1000 instances and 21 attributes, on the basis of which transactions are either accepted or rejected. This paper compares algorithms such as Support Vector Network, Neural Network, Logistic Regression, Naive Bayes, Random Forest, and the Classification and Regression Trees (CART) algorithm, and the results show that the Random Forest algorithm predicted credit risk with higher accuracy than the other algorithms.

Framework for Providing Security in Private Cloud using Machine Learning Techniques

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f9121.109119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 7641-7645

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Learning Algorithms ◽

Feature Reduction ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

Cyber Attack ◽

Learning Techniques

The advancement in cyber-attack technologies has ushered in various new attacks that are difficult to detect using traditional intrusion detection systems (IDS). Existing IDS are trained to detect known patterns, so newer attacks bypass them and go undetected. In this paper, a two-level framework is proposed that can detect unknown new attacks using machine learning techniques. In the first level, the known attack classes are determined using supervised machine learning algorithms such as Support Vector Machine (SVM) and neural networks (NN). The second level uses unsupervised machine learning algorithms such as K-means. The experiments are carried out with four models on the NSL-KDD dataset in an OpenStack cloud environment. The model combining Support Vector Machine for supervised learning, Gradual Feature Reduction (GFR) for feature selection and K-means for unsupervised learning provided the best efficiency, 94.56%.

Interpolation of Instantaneous Air Temperature Using Geographical and MODIS Derived Variables with Machine Learning Techniques

10.20944/preprints201906.0008.v1 ◽

2019 ◽

Author(s):

Marcos Ruiz-Álvarez ◽

Francisco Alonso-Sarría ◽

Francisco Gomariz-Castillo

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Linear Regression ◽

Air Temperature ◽

Satellite Data ◽

Multivariate Linear Regression ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector

Several methods have been tried to estimate air temperature using satellite imagery. In this paper, the results of two machine learning algorithms, Support Vector Machine and Random Forest, are compared with Multivariate Linear Regression, TVX and ordinary kriging. Several geographic, remote sensing and time variables are used as predictors. The validation is carried out using four different statistics on a daily basis, allowing the use of ANOVA to compare the results. The main conclusion is that Random Forest with residual kriging produces the best results (R² = 0.612 ± 0.019, NSE = 0.578 ± 0.025, RMSE = 1.068 ± 0.027, PBIAS = -0.172 ± 0.046), whereas TVX produces the least accurate results. The environmental conditions in the study area are not well suited to TVX; moreover, this method only takes satellite data into account. On the other hand, the regression methods (Support Vector Machine, Random Forest and Multivariate Linear Regression) use several parameters that are easily calculated from a Digital Elevation Model, adding very little difficulty compared with using satellite data alone. The most important variables in the Random Forest model were satellite temperature, potential irradiation and cdayt, a cosine transformation of the Julian day.
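
The four validation statistics reported in this abstract can be sketched as follows. The abstract does not give the formulas, so these are assumptions: R² is taken as the squared Pearson correlation, NSE as the Nash-Sutcliffe efficiency, and PBIAS as 100 × Σ(sim − obs) / Σ(obs). The sign convention for PBIAS varies between references, and the one used here is a choice, not necessarily the authors'. The function names and temperature values are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root mean squared error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def pbias(obs, sim):
    """Percent bias; positive means overestimation under this sign convention."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


# Made-up temperatures, for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
simulated = [11.0, 12.0, 13.0, 16.0]
scores = {
    "rmse": rmse(observed, simulated),
    "nse": nse(observed, simulated),
    "pbias": pbias(observed, simulated),
    "r2": r_squared(observed, simulated),
}
```

Reporting these measures together is useful because they capture different things: RMSE gives the error magnitude in the units of the variable, NSE compares the model with simply predicting the observed mean, PBIAS shows systematic over- or underestimation, and R² measures linear association regardless of bias.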

Machine Learning Algorithms For Understanding The Determinants of Under-Five Mortality

10.21203/rs.3.rs-1021040/v1 ◽

2021 ◽

Author(s):

Rakesh Kumar Saroj ◽

Pawan Kumar Yadav ◽

Rajneesh Singh ◽

Obvious Nchimunya Chilyabanyama

Keyword(s):

Machine Learning ◽

Random Forest ◽

Information Gain ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Mortality Data ◽

Mortality Factors ◽

Under Five ◽

Learning Techniques

Abstract Background: The death rate of under-five children in India has declined over the last few decades, but a few of the larger states still perform poorly. This is a matter of serious concern for child health as well as social development. Nowadays, machine learning techniques play a crucial role in smart health care systems by capturing hidden factors and patterns in outcomes. In this paper, we used machine learning techniques to predict the important factors of under-five mortality. This study aims to explore the value of machine learning techniques for predicting under-five mortality and to find the important factors that cause it. The data were taken from the National Family Health Survey-IV of Uttar Pradesh. We used four machine learning techniques (decision tree, support vector machine, random forest, and logistic regression) to predict under-five mortality factors and assessed the accuracy of each model. We also used information gain to rank the variables that are important for accurate prediction in under-five mortality data. Result: Random Forest (RF) predicts the child mortality factors with the highest accuracy, 97.5%, and identifies the number of living children, births in the last five years, educational level, birth order, total children ever born, currently breastfeeding, and size of child at birth as essential factors for under-five mortality. Conclusion: The study focuses on machine learning techniques to predict and identify important factors for under-five mortality. The random forest model provides an excellent predictive result for estimating the risk factors of under-five mortality. Based on these outcomes, policymakers can make policies and plans to reduce under-five mortality.
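
Ranking variables by information gain, as mentioned in this abstract, can be sketched as below. This uses the standard entropy-based definition for categorical features; the abstract does not describe the authors' exact procedure. The feature names `feature_a` and `feature_b` and the toy records are hypothetical and are not survey data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data, for illustration only: 1 = outcome occurred, 0 = it did not.
labels = [1, 1, 0, 0]
features = {
    "feature_a": ["x", "x", "y", "y"],  # separates the labels perfectly
    "feature_b": ["p", "q", "p", "q"],  # carries no information about them
}
gains = {name: information_gain(vals, labels) for name, vals in features.items()}
ranking = sorted(gains, key=gains.get, reverse=True)
```

In this toy case `feature_a` receives a gain of 1 bit and `feature_b` a gain of 0, so `feature_a` is ranked first. Continuous variables would need to be discretized before this definition applies.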