Noise detection in classification problems

Large volumes of data have been produced in many application domains. Nonetheless, when data quality is low, the performance of Machine Learning techniques is harmed. Real data are frequently affected by noise, which, when present in the data used to train Machine Learning techniques for predictive tasks, can result in complex models, with high induction time and low predictive performance. Identification and removal of noise can improve data quality and, as a result, the induced model. This thesis proposes new techniques for noise detection and the development of a recommendation system based on meta-learning to recommend the most suitable filter for new tasks. Experiments using artificial and real datasets show the relevance of this research.

A Comparative Study of Different Machine Learning Algorithms for Disease Prediction

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i7/0177 ◽

2017 ◽

Vol 7 (7) ◽

pp. 172

Author(s):

Anantvir Singh Romana

Keyword(s):

Machine Learning ◽

Subsequent Treatment ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Disease Prediction ◽

Classification Problems ◽

Learning Techniques ◽

Neural Network Classifiers ◽

Diagnostic Detection

Accurate diagnostic detection of the disease in a patient is critical and may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently being used in various classification problems due to their accurate prediction performance. Various techniques may provide different accuracies, and it is therefore imperative to use the most suitable method, which provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree and neural network classifiers on breast cancer and diabetes datasets.

Application of Machine Learning Techniques to Predict Binding Affinity for Drug Targets: A Study of Cyclin-Dependent Kinase 2

Current Medicinal Chemistry ◽

10.2174/2213275912666191102162959 ◽

2020 ◽

Vol 28 (2) ◽

pp. 253-265 ◽

Cited By ~ 3

Author(s):

Gabriela Bitencourt-Ferreira ◽

Amauri Duarte da Silva ◽

Walter Filgueira de Azevedo

Keyword(s):

Machine Learning ◽

Binding Affinity ◽

Predictive Performance ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Scoring Functions ◽

Cyclin Dependent Kinase ◽

Learning Models ◽

Learning Techniques ◽

Machine Learning Models

Background: The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made it possible to develop targeted scoring functions for virtual screening aimed to identify new inhibitors for this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cell-cycle progression and control. Such drugs have potential anticancer activities. Objective: Our goal here is to review recent applications of machine learning methods to predict ligand-binding affinity for protein targets. To assess the predictive performance of classical scoring functions and targeted scoring functions, we focused our analysis on CDK2 structures. Methods: We have experimental structural data for hundreds of binary complexes of CDK2 with different ligands, many of them with inhibition constant information. We investigate here computational methods to calculate the binding affinity of CDK2 through classical scoring functions and machine-learning models. Results: Analysis of the predictive performance of classical scoring functions available in docking programs such as Molegro Virtual Docker, AutoDock4, and AutoDock Vina indicated that these methods failed to predict binding affinity with significant correlation with experimental data. Targeted scoring functions developed through supervised machine learning techniques showed a significant correlation with experimental data. Conclusion: Here, we described the application of supervised machine learning techniques to generate a scoring function to predict binding affinity. Machine learning models showed superior predictive performance when compared with classical scoring functions. Analysis of the computational models obtained through machine learning could capture essential structural features responsible for binding affinity against CDK2.

Predictive and interpretable models via the stacked elastic net

Bioinformatics ◽

10.1093/bioinformatics/btaa535 ◽

2020 ◽

Author(s):

Armin Rauschenberger ◽

Enrico Glaab ◽

Mark van de Wiel

Keyword(s):

Machine Learning ◽

R Package ◽

Elastic Net ◽

Machine Learning Techniques ◽

Supplementary Information ◽

Biomedical Sciences ◽

Molecular Features ◽

Learning Techniques ◽

Meta Learning ◽

Interpretable Models

Motivation: Machine learning in the biomedical sciences should ideally provide predictive and interpretable models. When predicting outcomes from clinical or molecular features, applied researchers often want to know which features have effects, whether these effects are positive or negative, and how strong these effects are. Regression analysis includes this information in the coefficients but typically renders less predictive models than more advanced machine learning techniques. Results: Here we propose an interpretable meta-learning approach for high-dimensional regression. The elastic net provides a compromise between estimating weak effects for many features and strong effects for some features. It has a mixing parameter to weight between ridge and lasso regularisation. Instead of selecting one weighting by tuning, we combine multiple weightings by stacking. We do this in a way that increases predictivity without sacrificing interpretability. Availability and Implementation: The R package starnet is available on GitHub: https://github.com/rauschenberger/starnet. Supplementary information: Supplementary data are available at Bioinformatics online.
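
The mixing parameter and the stacking step described in this abstract can be written out as a short worked formula. This is only a sketch under assumptions: it uses the common glmnet-style parameterisation for a linear (Gaussian) model, and the symbols (alpha, lambda, beta, omega, K) are illustrative and do not come from the abstract. The exact meta-learner used by the authors is not stated here, so the second equation shows stacking only in its generic form.

```latex
% Assumed glmnet-style notation; none of these symbols appear in the abstract.
% Elastic net for a linear model with mixing parameter alpha:
\hat{\beta}(\alpha,\lambda) = \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
  + \lambda\Bigl[\alpha\,\lVert\beta\rVert_1
  + \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\Bigr],
  \qquad \alpha \in [0,1]
% alpha = 1 gives the lasso penalty, alpha = 0 gives the ridge penalty.

% Generic stacking over a grid alpha_1, ..., alpha_K of weightings:
\hat{y}_i^{\text{stack}} = \omega_0 + \sum_{k=1}^{K} \omega_k\,\hat{y}_i^{(\alpha_k)}
```

One way to read the interpretability claim: if every base prediction is linear in the features, a weighted sum of them is again linear, so the stacked model can still be summarised by a single coefficient vector, here the weighted sum of the base coefficient vectors.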

A Hybrid Vision-Map Method for Urban Road Detection

Journal of Advanced Transportation ◽

10.1155/2017/7090549 ◽

2017 ◽

Vol 2017 ◽

pp. 1-21 ◽

Cited By ~ 6

Author(s):

Carlos Fernández ◽

David Fernández-Llorca ◽

Miguel A. Sotelo

Keyword(s):

Machine Learning ◽

Urban Environments ◽

Machine Learning Techniques ◽

Learning Approaches ◽

Classification Problems ◽

Road Detection ◽

Training Set ◽

Digital Maps ◽

The Road ◽

Learning Techniques

A hybrid vision-map system is presented to solve the road detection problem in urban scenarios. The standardized use of machine learning techniques in classification problems has been merged with digital navigation map information to increase system robustness. The objective of this paper is to create a new environment perception method to detect the road in urban environments, fusing stereo vision with digital maps by detecting road appearance and road limits such as lane markings or curbs. Deep learning approaches make the system hard-coupled to the training set. Even though our approach is based on machine learning techniques, the features are calculated from different sources (GPS, map, curbs, etc.), making our system less dependent on the training set.

Modern machine learning outperforms GLMs at predicting spikes

10.1101/111450 ◽

2017 ◽

Cited By ~ 4

Author(s):

Ari S. Benjamin ◽

Hugo L. Fernandes ◽

Tucker Tomlinson ◽

Pavan Ramkumar ◽

Chris VerSteeg ◽

...

Keyword(s):

Machine Learning ◽

Neural Activity ◽

Linear Models ◽

Feedforward Neural Networks ◽

Predictive Performance ◽

Machine Learning Techniques ◽

Machine Learning Methods ◽

Learning Techniques ◽

Neural Spiking ◽

Modern Machine

Neuroscience has long focused on finding encoding models that effectively ask “what predicts neural spiking?” and generalized linear models (GLMs) are a typical approach. It is often unknown how much of explainable neural activity is captured, or missed, when fitting a GLM. Here we compared the predictive performance of GLMs to three leading machine learning methods: feedforward neural networks, gradient boosted trees (using XGBoost), and stacked ensembles that combine the predictions of several methods. We predicted spike counts in macaque motor (M1) and somatosensory (S1) cortices from standard representations of reaching kinematics, and in rat hippocampal cells from open field location and orientation. In general, the modern methods (particularly XGBoost and the ensemble) produced more accurate spike predictions and were less sensitive to the preprocessing of features. This discrepancy in performance suggests that standard feature sets may often relate to neural activity in a nonlinear manner not captured by GLMs. Encoding models built with machine learning techniques, which can be largely automated, more accurately predict spikes and can offer meaningful benchmarks for simpler models.

Analysis of Intrusion Detection and Classification using Machine Learning Approaches

SMART MOVES JOURNAL IJOSCIENCE ◽

10.24113/ijoscience.v3i10.13 ◽

2017 ◽

Vol 3 (10) ◽

Author(s):

Anjum Khan ◽

Anjana Nigam

Keyword(s):

Machine Learning ◽

Network Security ◽

Intrusion Detection ◽

Detection System ◽

Real Data ◽

Machine Learning Techniques ◽

Learning Approaches ◽

High Detection Rate ◽

Learning Techniques ◽

Result Analysis

As network-based applications are growing quickly, network security mechanisms need a lot of attention to improve their speed and precision. Ever-evolving new intrusion types pose a significant threat to network security. Although various network security tools have been developed, the rapid growth of intrusive activities continues to be a significant issue. Intrusion detection systems (IDSs) are used to detect intrusive activities on the network. Analysis has shown that applying machine learning techniques to intrusion detection can reach a high detection rate. Machine learning and classification algorithms help to design “Intrusion Detection Models” which can classify network traffic as intrusive or normal. This paper discusses some commonly used machine learning techniques in Intrusion Detection Systems and also reviews some of the existing machine learning IDSs proposed by researchers at different times. In this paper an experimental analysis is performed to demonstrate the performance of some existing techniques so that they can be used further in developing a Hybrid Classifier for real data packet classification. The given result analysis shows that KNN, RF and SVM perform best on the NSL-KDD dataset.

Customer determinants of used auto loan churn: comparing predictive performance using machine learning techniques

Journal of Marketing Analytics ◽

10.1057/s41270-021-00135-6 ◽

2021 ◽

Author(s):

Chandrasekhar Valluri ◽

Sudhakar Raju ◽

Vivek H. Patil

Keyword(s):

Machine Learning ◽

Predictive Performance ◽

Machine Learning Techniques ◽

Learning Techniques

Traditional vs. Machine-Learning Techniques for OSM Quality Assessment

Geospatial Intelligence ◽

10.4018/978-1-5225-8054-6.ch022 ◽

2019 ◽

pp. 469-487

Author(s):

Musfira Jilani ◽

Michela Bertolotto ◽

Padraig Corcoran ◽

Amerah Alghanim

Keyword(s):

Machine Learning ◽

Data Quality ◽

Quality Assessment ◽

Spatial Data ◽

Current Data ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Data Quality Assessment ◽

Expensive Process

Nowadays an ever-increasing number of applications require complete and up-to-date spatial data, in particular maps. However, mapping is an expensive process and the vastness and dynamics of our world usually render centralized and authoritative maps outdated and incomplete. In this context crowd-sourced maps have the potential to provide a complete, up-to-date, and free representation of our world. However, the proliferation of such maps largely remains limited due to concerns about their data quality. While most of the current data quality assessment mechanisms for such maps require referencing to authoritative maps, we argue that such referencing of a crowd-sourced spatial database is ineffective. Instead we focus on the use of machine learning techniques that we believe have the potential to not only allow the assessment but also to recommend the improvement of the quality of crowd-sourced maps without referencing to external databases. This chapter gives an overview of these approaches.

Prediction and Recommendation System for Diabetes Using Machine Learning Models

10.4018/978-1-7998-7709-7.ch018 ◽

2022 ◽

pp. 316-327

Author(s):

Nareshkumar Mustary ◽

Phani Kumar Singamsetty

Keyword(s):

Machine Learning ◽

Recommendation System ◽

Medical Center ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Diabetes Prediction ◽

Artery Disease ◽

The Right ◽

Deep Learning Model ◽

Machine Learning Models

Diabetes is one of the most deadly diseases on the planet. It is also a cause of a variety of illnesses, such as coronary artery disease, blindness, and urinary organ disease. In this situation, the patient must visit a medical center to obtain their results following consultation. Finding the right combination of characteristics and machine learning techniques for classification is also critical. However, with the advancement of machine learning techniques, we now have the potential to find a solution to the current problem. A healthcare recommendation system (HRS) may be designed to predict health by evaluating patient lifestyle, physical health, and mental health aspects using machine learning. For example, training the model using people's age and diabetes status helps to predict diabetes in new patients without a specific diagnostic test. The proposed deep learning model with convolutional neural network (D-CNN) achieves an overall accuracy of 96.25%. D-CNN is found to be more successful for diabetes prediction than other machine learning (ML) approaches in the experimental analysis.

An Aviation Delay Prediction and Recommendation System Using Machine Learning Techniques

Algorithms for Intelligent Systems - Proceedings of Integrated Intelligence Enable Networks and Computing ◽

10.1007/978-981-33-6307-6_25 ◽

2021 ◽

pp. 239-253

Author(s):

Ranga Swamy Sirisati ◽

Kalavala Gowthami Prasanthi ◽

Anga Gautami Latha

Keyword(s):

Machine Learning ◽

Recommendation System ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Delay Prediction
