A Comparison Study of Data Mining Algorithms for Blood Cancer Prediction

passer ◽  
2019 ◽  
Vol 3 (1) ◽  
pp. 174-179
Author(s):  
Noor Bahjat ◽  
Snwr Jamak

Cancer is a common disease that threatens the lives of one in every three people. This dangerous disease urgently requires early detection and diagnosis. Recent progress in data mining methods, such as classification, has shown the need to apply machine learning algorithms to large datasets. This paper aims to use data mining techniques to classify cancer datasets into blood cancer and non-blood cancer based on pre-defined information and post-defined information obtained after blood tests and CT scans. The research was conducted using the WEKA data mining tool with 10-fold cross-validation to evaluate and compare different classification algorithms, extract meaningful information from the dataset, and identify the most suitable predictive model. The results show that the most suitable classifier, with the best ability to predict on the cancer dataset, is the multilayer perceptron, with an accuracy of 99.3967%.
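The evaluation protocol named in this abstract, 10-fold cross-validation used to compare classifiers, can be sketched as follows. This is a minimal illustration in plain Python, not the WEKA workflow the authors used: the two classifiers (a majority-class baseline and a nearest-centroid rule), the function names, and the synthetic two-class data are all assumptions made for the example and have no connection to the paper's dataset or results.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def majority_classifier(train_X, train_y):
    """Baseline: always predict the most frequent training label."""
    label = max(set(train_y), key=train_y.count)
    return lambda x: label

def nearest_centroid_classifier(train_X, train_y):
    """Predict the label whose class centroid is closest (squared Euclidean)."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )
    return predict

def cross_val_accuracy(make_classifier, X, y, k=10, seed=0):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predict = make_classifier(
            [X[i] for i in train_idx], [y[i] for i in train_idx]
        )
        correct = sum(predict(X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return mean(scores)

# Toy, synthetic two-class data (hypothetical; NOT the paper's dataset).
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)] + \
    [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

# Evaluate every candidate with the same folds, then pick the best one.
results = {
    "majority": cross_val_accuracy(majority_classifier, X, y),
    "nearest_centroid": cross_val_accuracy(nearest_centroid_classifier, X, y),
}
best = max(results, key=results.get)
```

Using the same seed for every candidate means all classifiers are scored on identical folds, which keeps the comparison fair.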


2021 ◽  
Vol 297 ◽  
pp. 01032
Author(s):  
Harish Kumar ◽  
Anshal Prasad ◽  
Ninad Rane ◽  
Nilay Tamane ◽  
Anjali Yeole

Phishing is a common attack that tricks credulous people into disclosing their personal information. It is a type of cybercrime in which fake sites lure victims into giving up sensitive data. This paper deals with methods for detecting phishing websites by analyzing various features of URLs with machine learning techniques. It discusses methods for detecting phishing websites based on lexical features, host properties and page importance properties. We consider various data mining algorithms for evaluating the features in order to better understand the structure of URLs that spread phishing. To protect end users from visiting these sites, we can try to identify phishing URLs by analyzing their lexical and host-based features. A particular challenge in this domain is that criminals constantly devise new strategies to counter our defense measures. To succeed in this contest, we need machine learning algorithms that continually adapt to new examples and features of phishing URLs.
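As a rough illustration of what "lexical features" of a URL can look like, the sketch below computes a handful of them with Python's standard library. The particular feature set, the function name `lexical_features`, and the example URL (which uses a documentation-only IP address) are assumptions for illustration; the abstract does not list the features the authors actually used.

```python
import re
from urllib.parse import urlparse

def lexical_features(url):
    """Return a dict of simple lexical features computed from the URL string."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        # A raw IPv4 address in place of a domain name (pattern check only).
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
        "uses_https": parsed.scheme == "https",
        # Number of non-empty path segments.
        "path_depth": len([p for p in parsed.path.split("/") if p]),
    }

# Hypothetical example URL, not taken from any dataset.
features = lexical_features("http://192.0.2.10/secure-login/update.php?id=42")
```

A dictionary like this can be turned into a numeric feature vector and passed to any classifier; host-based and page-importance features would need external lookups and are not covered here.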



2016 ◽  
Vol 2016 ◽  
pp. 1-11 ◽  
Author(s):  
Ivan Kholod ◽  
Ilya Petukhov ◽  
Andrey Shorov

This paper describes the construction of a Cloud for Distributed Data Analysis (CDDA) based on the actor model. The design maps data mining algorithms onto decomposed functional blocks, which are assigned to actors. Using actors allows users to move the computation closer to the stored data. The process does not require loading data sets into the cloud and allows users to analyze confidential information locally. Experimental results show that the proposed approach is more efficient than established solutions.
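The idea of moving computation to the data with actors can be sketched in a few lines. This is a toy illustration using Python threads and queues, not the CDDA design itself: the `DataActor` class, the `"count_labels"` message, and the sample partitions are hypothetical names and data chosen for the example.

```python
import queue
import threading
from collections import Counter

class DataActor(threading.Thread):
    """An actor that owns one local data partition and answers messages about it."""

    def __init__(self, partition):
        super().__init__(daemon=True)
        self._partition = partition   # raw rows stay local to this actor
        self.mailbox = queue.Queue()  # messages are the only way to reach it

    def run(self):
        while True:
            message, reply_to = self.mailbox.get()
            if message == "stop":
                break
            if message == "count_labels":
                # Only the small aggregate leaves the actor, never the rows.
                reply_to.put(Counter(label for _, label in self._partition))

def distributed_label_counts(partitions):
    """Ask one actor per partition for its local counts and merge the replies."""
    actors = [DataActor(p) for p in partitions]
    replies = queue.Queue()
    for actor in actors:
        actor.start()
        actor.mailbox.put(("count_labels", replies))
    total = Counter()
    for _ in actors:
        total += replies.get()
    for actor in actors:
        actor.mailbox.put(("stop", None))
        actor.join()
    return total

# Hypothetical partitions of (row_id, label) pairs held at different sites.
partitions = [
    [("row1", "a"), ("row2", "b")],
    [("row3", "a")],
    [("row4", "b"), ("row5", "b")],
]
counts = distributed_label_counts(partitions)
```

The coordinator only ever sees per-partition label counts, which mirrors the abstract's point that confidential data can be analyzed locally without being uploaded.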



Author(s):  
Balazs Feil ◽  
Janos Abonyi

This chapter aims to give a comprehensive view of the links between fuzzy logic and data mining. It will be shown that knowledge extracted from simple data sets or huge databases can be represented by fuzzy rule-based expert systems. It is highlighted that both model performance and interpretability of the mined fuzzy models are of major importance, and effort is required to keep the resulting rule bases small and comprehensible. Therefore, in recent years, soft computing based data mining algorithms have been developed for feature selection, feature extraction, model optimization, and model reduction (rule base simplification). Application of these techniques is illustrated using the wine data classification problem. The results illustrate that fuzzy tools can be applied in a synergistic manner through the nine steps of knowledge discovery.



2015 ◽  
Vol 813-814 ◽  
pp. 1104-1113 ◽  
Author(s):  
A. Sumesh ◽  
Dinu Thomas Thekkuden ◽  
Binoy B. Nair ◽  
K. Rameshkumar ◽  
K. Mohandas

The quality of a weld depends on the welding parameters and the environmental conditions to which it is exposed. Improper selection of welding process parameters is one of the main causes of weld defects. In this work, arc sound signals are captured during the welding of carbon steel plates, and statistical features of the sound signals are extracted during the welding process. Data mining algorithms such as Naive Bayes, Support Vector Machines and Neural Networks were used to classify the weld conditions according to the features of the sound signal. Two weld conditions were considered in this study: good welds and welds with defects, namely lack of fusion and burn-through. The classification efficiencies of the machine learning algorithms were compared. The neural network was found to produce better classification efficiency than the other algorithms considered in this study.
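The kind of statistical features that can be extracted from a sampled sound signal is sketched below. The abstract does not say which features the authors computed, so the set chosen here (mean, standard deviation, RMS, peak, crest factor, skewness, kurtosis) and the function name `statistical_features` are assumptions for illustration only.

```python
import math
from statistics import fmean, pstdev

def statistical_features(signal):
    """Compute common time-domain statistics of a sampled signal."""
    mu = fmean(signal)
    sigma = pstdev(signal, mu)                      # population std deviation
    rms = math.sqrt(fmean([x * x for x in signal]))
    peak = max(abs(x) for x in signal)
    if sigma > 0:
        # Standardized third and fourth central moments.
        skewness = fmean([(x - mu) ** 3 for x in signal]) / sigma ** 3
        kurtosis = fmean([(x - mu) ** 4 for x in signal]) / sigma ** 4
    else:
        skewness = kurtosis = 0.0                   # constant signal
    return {
        "mean": mu,
        "std": sigma,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms if rms > 0 else 0.0,
        "skewness": skewness,
        "kurtosis": kurtosis,                       # non-excess (normal = 3)
    }

# Hypothetical toy signal: a square wave alternating between +1 and -1.
features = statistical_features([1.0, -1.0, 1.0, -1.0])
```

In practice such a function would be applied to short windows of the recorded signal, giving one feature vector per window for the classifier.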



Author(s):  
Balaji Rajagopalan ◽  
Ravi Krovi

Data mining is the process of sifting through the mass of organizational (internal and external) data to identify patterns critical for decision support. Successful implementation of the data mining effort requires a careful assessment of the various tools and algorithms available. The basic premise of this study is that machine-learning algorithms, which are assumption free, should outperform their traditional counterparts when mining business databases. The objective of this study is to test this proposition by investigating the performance of the algorithms for several scenarios. The scenarios are based on simulations designed to reflect the extent to which typical statistical assumptions are violated in the business domain. The results of the computational experiments support the proposition that machine learning algorithms generally outperform their statistical counterparts under certain conditions. These can be used as prescriptive guidelines for the applicability of data mining techniques.



2010 ◽  
Vol 1 (1) ◽  
pp. 60-92 ◽  
Author(s):  
Joaquín Derrac ◽  
Salvador García ◽  
Francisco Herrera

The use of Evolutionary Algorithms to perform data reduction tasks has become an effective approach to improve the performance of data mining algorithms. Many proposals in the literature have shown that Evolutionary Algorithms obtain excellent results when applied as Instance Selection and Instance Generation procedures. The purpose of this paper is to present a survey on the application of Evolutionary Algorithms to the Instance Selection and Generation processes. It covers approaches applied to the enhancement of the nearest neighbor rule, as well as other approaches focused on improving the models extracted by some well-known data mining algorithms. Furthermore, some proposals developed to tackle two emerging problems in data mining, Scaling Up and Imbalanced Data Sets, are also reviewed.



2016 ◽  
Vol 39 (16) ◽  
pp. 4829-4845 ◽  
Author(s):  
Jose Manuel Rodríguez-Jiménez ◽  
Pablo Cordero ◽  
Manuel Enciso ◽  
Angel Mora


2014 ◽  
Vol 490-491 ◽  
pp. 1361-1367
Author(s):  
Xin Huang ◽  
Hui Juan Chen ◽  
Mao Gong Zheng ◽  
Ping Liu ◽  
Jing Qian

With the advent of location-based social media and location-acquisition technologies, trajectory data are becoming more and more ubiquitous in the real world. A lot of data mining algorithms have been successfully applied to trajectory data sets. Trajectory pattern mining has received a lot of attention in recent years. In this paper, we review the most influential methods as well as typical applications within the context of trajectory pattern mining.



2014 ◽  
Vol 556-562 ◽  
pp. 3901-3904
Author(s):  
Cui Xia Tao

Data mining means extracting potentially useful but previously unknown information and knowledge from large quantities of implicit, incomplete, random data. With the rapid advancement of modern information technology, the volume of accumulated data is increasing sharply, often on the order of terabytes. How to extract meaningful information from large amounts of data has become a major problem that must be tackled. For mining such huge amounts of data, distributed parallel processing and incremental processing are valid solutions.



2018 ◽  
Vol 7 (1.7) ◽  
pp. 179
Author(s):  
Nivedhitha G ◽  
Carmel Mary Belinda M.J ◽  
Rupavathy N

The growth of phishing sites is, by all accounts, striking. Although web users are aware of these kinds of phishing attacks, many still fall victim to them. A number of attacks are launched with the aim of making web users believe that they are communicating with a trusted entity; phishing is one of them. Phishing keeps growing because it is easy to copy an entire website using its HTML source code. By making slight changes to the source code, it is possible to direct the victim to the phishing site. Phishers use many techniques to lure unsuspecting web users. Consequently, an efficient mechanism is required to distinguish phishing sites from legitimate sites in order to protect credential data. To detect phishing websites and identify them as information-leaking sites, the system proposes data mining algorithms. In this paper, machine-learning algorithms are used to model the prediction task. The processes of identity extraction and feature extraction are discussed, and the various experiments carried out to assess the performance of the models are presented.


