Dealing with Noise Problem in Machine Learning Data-sets: A Systematic Review

2019 ◽  
Vol 161 ◽  
pp. 466-474 ◽  
Author(s):  
Shivani Gupta ◽  
Atul Gupta
Author(s):  
Paul Rippon ◽  
Kerrie Mengersen

Learning algorithms are central to pattern recognition, artificial intelligence, machine learning, data mining, and statistical learning. The term often implies analysis of large and complex data sets with minimal human intervention. Bayesian learning has been variously described as a method of updating opinion based on new experience, updating parameters of a process model based on data, modelling and analysis of complex phenomena using multiple sources of information, posterior probabilistic expectation, and so on. In all of these guises, it has exploded in popularity over recent years.
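To make "updating parameters of a process model based on data" concrete, here is a minimal sketch of one textbook case, the conjugate Beta-Binomial update. It is not taken from the chapter itself; the function names and the observation counts are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: conjugate Beta-Binomial updating.
# Function names and the data below are assumptions, not from the source.

def update_beta(alpha, beta, successes, failures):
    """Return the posterior Beta parameters after observing binary outcomes."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Start from a uniform prior, Beta(1, 1), over an unknown success probability.
alpha, beta = 1.0, 1.0

# Hypothetical new experience: 7 successes and 3 failures.
alpha, beta = update_beta(alpha, beta, successes=7, failures=3)

# The posterior is Beta(8, 4); its mean is the updated point estimate.
posterior_mean = beta_mean(alpha, beta)
print(alpha, beta, posterior_mean)
```

The prior counts act as pseudo-observations, so each new batch of data simply adds to them; this is the sense in which opinion is updated by experience.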


2020 ◽  
Vol 13 (1) ◽  
pp. 25-40
Author(s):  
Aleksei Dudchenko ◽  
Matthias Ganzinger ◽  
Georgy Kopanitsa

Background: Over recent decades, Machine Learning (ML) has shown a wide variety of possible applications in medicine and can be of great use. Meanwhile, cardiovascular diseases cause about a third of all deaths globally. Does ML work in the cardiology domain, and what is the current progress in this regard? To answer this question, we present a systematic review aiming at 1) identifying studies where machine learning algorithms were applied in the domain of cardiology; 2) providing an overview, based on the existing literature, of the state-of-the-art ML algorithms applied in cardiology. Methods: We organized this review according to the PRISMA statement. We used PubMed as the search engine and combined the search keywords “Machine Learning”, “Data Mining”, “Cardiology”, and “Cardiovascular”. Scientific articles and conference papers published between 2013 and 2017 that report implementations of ML algorithms in the domain of cardiology were included. Results: In total, 27 relevant papers were included. We examined four aspects: the aims of the ML systems, the methods, the datasets, and the evaluation metrics. Most of the papers aimed at predicting the risk of mortality. Reinforcement learning, a promising branch of machine learning, was never proposed in the observed papers. Tree-based ensembles are common and show good results, whereas deep neural networks are poorly represented. Most papers (20 of 27) used datasets that are hardly available to other researchers, e.g. unpublished local registries. We also identified 28 different metrics for model evaluation. This variety of metrics makes it difficult to compare the results of different studies. Conclusion: We expect that this systematic review will be helpful for researchers developing medical machine learning systems, in cardiology in particular.


2020 ◽  
Vol 10 (17) ◽  
pp. 5811
Author(s):  
Imatitikua D. Aiyanyo ◽  
Hamman Samuel ◽  
Heuiseok Lim

This is a systematic review of over one hundred research papers about machine learning methods applied to defensive and offensive cybersecurity. In contrast to previous reviews, which focused on isolated fragments of research topics in this area, this paper systematically and comprehensively combines domain knowledge into a single review. Ultimately, this paper seeks to provide a base for researchers who wish to delve into the field of machine learning for cybersecurity. Our findings identify the frequently used machine learning methods within supervised, unsupervised, and semi-supervised machine learning, the most useful data sets for evaluating intrusion detection methods within supervised learning, and methods from machine learning that have shown promise in tackling various threats in defensive and offensive cybersecurity.


2008 ◽  
pp. 1877-1887
Author(s):  
Desheng Wu ◽  
David L. Olson

The technique for order preference by similarity to ideal solution (TOPSIS) can consider any number of measures, seeking to identify solutions close to an ideal solution and far from a nadir solution. TOPSIS has traditionally been applied in multiple criteria decision analysis. In this paper we propose an approach to develop a TOPSIS classifier. We demonstrate its use in credit scoring, providing a way to deal with large sets of data using machine learning. Data sets often contain many potential explanatory variables, some preferably minimized, some preferably maximized. Results compare favorably with the traditional data mining technique of decision trees. Proposed models are validated using Monte Carlo simulation.
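The idea of scoring alternatives by closeness to an ideal and distance from a nadir can be sketched as follows. This is the standard TOPSIS closeness computation under common assumptions (vector normalization, Euclidean distance), not the classifier proposed by the authors; the function name, the weights, and the toy applicant data are hypothetical.

```python
import math

# Illustrative sketch only: standard TOPSIS closeness scores.
# Not the authors' classifier; names, weights, and data are assumptions.

def topsis_scores(rows, weights, maximize):
    """Score each row in [0, 1]; higher means closer to the ideal solution.

    rows:     list of alternatives, each a list of measure values
    weights:  one non-negative weight per measure
    maximize: one bool per measure (True = larger is better)
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n_cols = len(weights)
    total_w = sum(weights)
    # Vector-normalize each column, then apply the normalized weights.
    norms = [math.sqrt(sum(r[j] ** 2 for r in rows)) for j in range(n_cols)]
    v = [[r[j] / norms[j] * weights[j] / total_w for j in range(n_cols)]
         for r in rows]
    cols = list(zip(*v))
    # Ideal takes the best value per measure, nadir the worst.
    ideal = [max(c) if mx else min(c) for c, mx in zip(cols, maximize)]
    nadir = [min(c) if mx else max(c) for c, mx in zip(cols, maximize)]
    scores = []
    for row in v:
        d_ideal = math.dist(row, ideal)
        d_nadir = math.dist(row, nadir)
        scores.append(d_nadir / (d_ideal + d_nadir))
    return scores


# Hypothetical applicants: [income (maximize), debt ratio (minimize)].
applicants = [[10.0, 1.0], [5.0, 5.0], [1.0, 10.0]]
scores = topsis_scores(applicants, weights=[1.0, 1.0], maximize=[True, False])
print(scores)
```

One plausible way to turn such scores into a classifier is to threshold them, though how the paper sets weights and cutoffs is not stated in the abstract.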


2021 ◽  
Vol 141 (8) ◽  
pp. 284-291
Author(s):  
Ryohei Matsui ◽  
Iwao Tanuma ◽  
Ryotaro Kawahara ◽  
Naoko Ushio ◽  
Hiroyuki Yoshimoto ◽  
...  
