BCD-WERT: a novel approach for breast cancer detection using whale optimization based efficient features and extremely randomized tree algorithm

PeerJ Computer Science ◽

10.7717/peerj-cs.390 ◽

2021 ◽

Vol 7 ◽

pp. e390

Author(s):

Shafaq Abbas ◽

Zunera Jalil ◽

Abdul Rehman Javed ◽

Iqra Batool ◽

Mohammad Zubair Khan ◽

...

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Support Vector Machine ◽

Feature Selection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Experimental Results ◽

Support Vector ◽

Novel Approach ◽

Whale Optimization

Breast cancer is one of the leading causes of death in the current age. It often results in subpar living conditions for patients, who have to go through expensive and painful treatments to fight the cancer. One in eight women all over the world is affected by this disease. Almost half a million women die from it annually. Machine learning algorithms have proven to outperform all existing solutions for the prediction of breast cancer using models built on previously available data. In this paper, a novel approach named BCD-WERT is proposed that utilizes the Extremely Randomized Tree and the Whale Optimization Algorithm (WOA) for efficient feature selection and classification. WOA reduces the dimensionality of the dataset and extracts the relevant features for accurate classification. Experimental results on a state-of-the-art comprehensive dataset demonstrated improved performance in comparison with eight other machine learning algorithms: Support Vector Machine (SVM), Random Forest, Kernel Support Vector Machine, Decision Tree, Logistic Regression, Stochastic Gradient Descent, Gaussian Naive Bayes and k-Nearest Neighbor. BCD-WERT outperformed all of them with the highest accuracy rate of 99.30%, followed by SVM with 98.60% accuracy. Experimental results also reveal the effectiveness of feature selection techniques in improving prediction accuracy.
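The abstract names the Whale Optimization Algorithm but does not describe its update rules. As a rough illustration only, the sketch below follows the commonly described continuous form of WOA (encircling, random search, and spiral moves) and minimizes a toy sphere function. It is not the paper's feature-selection variant; the function names, bounds, population size, and the choice to update the best solution immediately within an iteration are all assumptions made for this example.

```python
import math
import random


def sphere(x):
    """Toy objective (assumed for illustration): sum of squares, minimum at 0."""
    return sum(v * v for v in x)


def woa_minimize(objective, dim, n_whales=20, n_iter=100,
                 lo=-5.0, hi=5.0, b=1.0, seed=0):
    """Minimal continuous WOA sketch; returns (best, best_fit, initial_fit)."""
    rng = random.Random(seed)
    whales = [[rng.uniform(lo, hi) for _ in range(dim)]
              for _ in range(n_whales)]
    best = min(whales, key=objective)[:]
    best_fit = objective(best)
    initial_fit = best_fit

    for t in range(n_iter):
        a = 2.0 - 2.0 * t / n_iter          # decreases linearly from 2 towards 0
        for i, x in enumerate(whales):
            A = 2.0 * a * rng.random() - a  # step coefficient in [-a, a]
            C = 2.0 * rng.random()          # weighting coefficient in [0, 2]
            if rng.random() < 0.5:
                # Encircle the best whale when |A| < 1, otherwise explore
                # around a randomly chosen whale.
                ref = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [r - A * abs(C * r - v) for r, v in zip(ref, x)]
            else:
                # Spiral move around the best whale.
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
                new = [abs(r - v) * spiral + r for r, v in zip(best, x)]
            # Clamp to the search bounds and keep the best-so-far solution.
            whales[i] = [min(hi, max(lo, v)) for v in new]
            fit = objective(whales[i])
            if fit < best_fit:
                best, best_fit = whales[i][:], fit
    return best, best_fit, initial_fit


best, best_fit, initial_fit = woa_minimize(sphere, dim=3)
print(best_fit <= initial_fit)
```

For feature selection, a wrapper method would typically map each position to a binary feature mask and use classifier error as the objective. That mapping is not shown here because the abstract does not specify it.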

A Support Vector Machine and Decision Tree Based Breast Cancer Prediction

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a1752.029320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 2972-2976

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Support Vector Machine ◽

Decision Tree ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Classification Model ◽

Supervised Machine Learning ◽

Misclassification Rate ◽

Support Vector

The first step in the diagnosis of breast cancer is identification of the disease. Early detection of breast cancer is important for reducing the mortality rate. Machine learning algorithms can be used to identify breast cancer. Supervised machine learning algorithms such as the Support Vector Machine (SVM) and the Decision Tree are widely used in classification problems, such as the identification of breast cancer. In this study, a machine learning model is proposed that employs two learning algorithms, namely the support vector machine and the decision tree. A dataset from the Kaggle data repository consisting of 569 observations of malignant and benign cases is used to develop the proposed model. Finally, the model is evaluated on the test set using accuracy, the confusion matrix, precision, and recall as performance metrics. The analysis showed that the support vector machine (SVM) has better accuracy, a lower misclassification rate, and better precision than the decision tree algorithm. The average accuracy of the support vector machine (SVM) is 91.92% and that of the decision tree classification model is 87.12%.
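The evaluation metrics named in this abstract (accuracy, confusion matrix, precision, recall, misclassification rate) can be computed directly from predicted and true labels. The following is a minimal sketch using made-up labels, where 1 is assumed to mean malignant and 0 benign. The helper names are hypothetical and the numbers are not the study's data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred, positive=1):
    """Compute the metrics named in the abstract from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "misclassification_rate": 1.0 - accuracy,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Illustrative labels only (assumed, not taken from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
print(confusion_counts(y_true, y_pred))  # (3, 1, 3, 1)
print(summarize(y_true, y_pred))
```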

Classification of Child Items in a Gold Tree using Support Vector Machine Classifier

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d8026.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 3208-3216

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Support Vector Machine Classifier ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Svm Classifier ◽

Main Application ◽

Novel Approach

Sorting of images has been a challenge for machine learning algorithms over the years. Various algorithms have been proposed to sort an image, but none of them is able to sort the image clearly. The drawback of the existing systems is that the sorted image is not clearly identified. To overcome this drawback, we propose a novel approach to sort the children of a tree and match them with the existing designs. The images are sorted on the basis of their class. The images are taken from the image, and manual binning of those images is done. Then the images are split for training and testing. GLCM features are extracted from the training and testing images and later fed to the SVM classifier, which then classifies each image. Around 7000 images are used to train the SVM for classification. More than 300 different classes have been created in the database for comparison. Real-time images of child items are captured and fed to the SVM for classification. The main application of this approach is distinguishing the designs in ornaments. The various parts of the ornaments can be differentiated clearly. Thus, the proposed method is precise compared to the existing methods.
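The abstract does not say which GLCM (gray-level co-occurrence matrix) features or pixel offsets were used. As a minimal sketch, assuming a horizontal offset of one pixel and a non-symmetric, normalized matrix, the following shows how a GLCM and two common texture descriptors (contrast and angular second moment) could be computed for a tiny quantized image. All names and the sample image are illustrative.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs at offset (dy, dx)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Sum of p(i, j) * (i - j)^2; larger for rougher textures."""
    return sum(p[i][j] * (i - j) ** 2
               for i in range(len(p)) for j in range(len(p)))


def angular_second_moment(p):
    """Sum of squared probabilities; larger for more uniform textures."""
    return sum(v * v for row in p for v in row)


# Tiny 3x3 image with gray levels 0..2 (assumed example data).
image = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
p = glcm(image, levels=3)
features = [contrast(p), angular_second_moment(p)]
print(features)
```

A feature vector like `features` is the kind of input that would then be passed to a classifier such as an SVM.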

Feature Selection with Fast Correlation-Based Filter for Breast Cancer Prediction and Classification Using Machine Learning Algorithms

2018 International Symposium on Advanced Electrical and Communication Technologies (ISAECT) ◽

10.1109/isaect.2018.8618688 ◽

2018 ◽

Author(s):

Youness Khourdifi ◽

Mohamed Bahaj

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Feature Selection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Cancer Prediction

Machine Learning Models for Finger Bend Evaluation using Implemented Low cost Flex Sensor

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35742 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 3605-3611

Author(s):

Pratyush Kaware

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Low Cost ◽

Learning Algorithms ◽

Cost Effective ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Models ◽

Machine Learning Models

In this paper, a cost-effective sensor has been implemented to read finger bend signals by attaching the sensor to a finger, so as to classify the signals based on the degree of bend as well as the joint about which the finger was being bent. This was done by testing various machine learning algorithms to find the most accurate and consistent classifier. We found that the Support Vector Machine was the algorithm best suited to classifying our data; using it, we were able to predict the live state of a finger, i.e., the degree of bend and the joints involved. The live voltage values from the sensor were transmitted using a NodeMCU micro-controller, converted to digital values, and uploaded to a database for analysis.

Book Genre Categorization Using Machine Learning Algorithms (K-Nearest Neighbor, Support Vector Machine and Logistic Regression) using Customized Dataset

International Journal of Computer Science and Mobile Computing ◽

10.47760/ijcsmc.2021.v10i03.002 ◽

2021 ◽

Vol 10 (3) ◽

pp. 14-25

Author(s):

Parilkumar Shiroya

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Logistic Regression ◽

Nearest Neighbor ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor

Techniques for Detecting Malware Traffic: A Comprehensive Approach to Feature Selection and Classification

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.39088 ◽

2021 ◽

Vol 9 (12) ◽

pp. 1-10

Author(s):

Harsha A K

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Learning Algorithms ◽

Malware Detection ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Steady Increase ◽

Extreme Gradient Boosting

Since the advent of encryption, there has been a steady increase in malware being transmitted over encrypted networks. Traditional approaches to detecting malware, such as packet content analysis, are inefficient in dealing with encrypted data. In the absence of actual packet contents, we can make use of other features such as packet size, arrival time, source and destination addresses, and other such metadata to detect malware. Such information can be used to train machine learning classifiers to distinguish malicious from benign packets. In this paper, we offer an efficient malware detection approach using machine learning classification algorithms such as support vector machine, random forest, and extreme gradient boosting. We employ an extensive feature selection process to reduce the dimensionality of the chosen dataset. The dataset is then split into training and testing sets. Machine learning algorithms are trained using the training set. These models are then evaluated against the testing set in order to assess their respective performances. We further attempt to tune the hyperparameters of the algorithms in order to achieve better results. Random forest and extreme gradient boosting algorithms performed exceptionally well in our experiments, resulting in area under the curve values of 0.9928 and 0.9998 respectively. Our work demonstrates that malware traffic can be effectively classified using conventional machine learning algorithms and also shows the importance of dimensionality reduction in such classification problems.
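The results above are reported as area under the curve (AUC) values. As a minimal sketch of what that number measures, the function below computes ROC AUC as the probability that a randomly chosen positive (malicious) sample receives a higher score than a randomly chosen negative (benign) one, counting ties as half. The function name, labels, and scores are illustrative assumptions, not the paper's data or code.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Illustrative classifier scores (assumed): 1 = malicious, 0 = benign.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

The pairwise form is quadratic in the number of samples, which is fine for illustration. Larger test sets would normally use a rank-based computation instead.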

Tremor Identification Using Machine Learning in Parkinson's Disease

Early Detection of Neurological Disorders Using Machine Learning Systems - Advances in Medical Technologies and Clinical Practice ◽

10.4018/978-1-5225-8567-1.ch008 ◽

2019 ◽

pp. 128-151

Author(s):

Angana Saikia ◽

Vinayak Majhi ◽

Masaraf Hussain ◽

Sudip Paul ◽

Amitava Datta

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Parkinson's Disease ◽

Discriminant Analysis ◽

Learning Algorithms ◽

The Body ◽

Machine Learning Algorithms ◽

Support Vector

Tremor is an involuntary quivering movement or shake. Characteristically occurring at rest, the classic slow, rhythmic tremor of Parkinson's disease (PD) typically starts in one hand, foot, or leg and can eventually affect both sides of the body. The resting tremor of PD can also occur in the jaw, chin, mouth, or tongue. Loss of dopamine leads to the symptoms of Parkinson's disease and may include a tremor. For some people, a tremor might be the first symptom of PD. Various studies have proposed measurable technologies and the analysis of the characteristics of Parkinsonian tremors using different techniques. Various machine-learning algorithms such as a support vector machine (SVM) with three kernels, a discriminant analysis, a random forest, and a kNN algorithm are also used to classify and identify various kinds of tremors. This chapter focuses on an in-depth review on identification and classification of various Parkinsonian tremors using machine learning algorithms.

Introduction and Implementation of Machine Learning Algorithms in R

Advances in Business Information Systems and Analytics - Sentiment Analysis and Knowledge Discovery in Contemporary Business ◽

10.4018/978-1-5225-4999-4.ch008 ◽

2019 ◽

pp. 126-147

Author(s):

S. R. Mani Sekhar ◽

G. M. Siddesh

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Support Vector Machine ◽

Discriminant Analysis ◽

Computer Science ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Linear Discriminant ◽

The Given

Machine learning is one of the important areas in the field of computer science. It helps to provide optimized solutions to real-world problems by using past knowledge or data from previous experience. There are different types of machine learning algorithms in computer science. This chapter provides an overview of selected machine learning algorithms such as linear regression, linear discriminant analysis, support vector machine, naive Bayes classifier, neural networks, and decision trees. Each of these methods is illustrated in detail with an example and R code, which in turn helps readers generate their own solutions to given problems.

Evaluating Variable Selection and Machine Learning Algorithms for Estimating Forest Heights by Combining Lidar and Hyperspectral Data

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi9090507 ◽

2020 ◽

Vol 9 (9) ◽

pp. 507

Author(s):

Sanjiwana Arjasakusuma ◽

Sandiaga Swahyu Kusuma ◽

Stuart Phinn

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Learning Algorithms ◽

Principal Component ◽

Hyperspectral Data ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Forest Height ◽

Extreme Gradient Boosting

Machine learning has been employed for various mapping and modeling tasks using input variables from different sources of remote sensing data. For feature selection involving data of high spatial and spectral dimensionality, various methods have been developed and incorporated into the machine learning framework to ensure an efficient and optimal computational process. This research aims to assess the accuracy of various feature selection and machine learning methods for estimating forest height using AISA (airborne imaging spectrometer for applications) hyperspectral bands (479 bands) and airborne light detection and ranging (lidar) height metrics (36 metrics), alone and combined. Feature selection and dimensionality reduction using Boruta (BO), principal component analysis (PCA), simulated annealing (SA), and genetic algorithm (GA) were evaluated in combination with machine learning algorithms such as multivariate adaptive regression spline (MARS), extra trees (ET), support vector regression (SVR) with radial basis function, and extreme gradient boosting (XGB) with tree (XGBtree and XGBdart) and linear (XGBlin) classifiers. The results demonstrated that the combinations of BO-XGBdart and BO-SVR delivered the best model performance for estimating tropical forest height by combining lidar and hyperspectral data, with R² = 0.53 and RMSE = 1.7 m (18.4% of nRMSE and 0.046 m of bias) for BO-XGBdart and R² = 0.51 and RMSE = 1.8 m (15.8% of nRMSE and −0.244 m of bias) for BO-SVR. Our study also demonstrated the effectiveness of BO for variable selection; it could reduce the data by 95%, selecting the 29 most important variables from the initial 516 variables from lidar metrics and hyperspectral data.
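The regression metrics reported above (R², RMSE, nRMSE, and bias) can be illustrated with a short calculation. The abstract does not state how nRMSE was normalized or which sign convention was used for bias. The sketch below assumes nRMSE is RMSE divided by the mean observed value and bias is the mean of predicted minus observed. The height values are made up for illustration.

```python
import math


def regression_metrics(observed, predicted):
    """Return R^2, RMSE, nRMSE (RMSE / mean observed, assumed) and bias."""
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "nrmse": rmse / mean_obs,   # normalization by the mean is an assumption
        "bias": sum(residuals) / n,  # predicted minus observed, assumed sign
    }


# Illustrative forest heights in metres (assumed values, not the study's data).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]
print(regression_metrics(observed, predicted))
```

Other normalizations (for example, by the range of observed values) would give different nRMSE figures for the same predictions.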

A comparison of machine learning algorithms for assessment of delamination in fiber-reinforced polymer composite beams

Structural Health Monitoring ◽

10.1177/1475921720967157 ◽

2020 ◽

pp. 147592172096715

Author(s):

Mengyue He ◽

Yishou Wang ◽

Karthik Ram Ramakrishnan ◽

Zhifang Zhang

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Polymer Composites ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Fiber Reinforced Polymer ◽

Support Vector ◽

Fiber Reinforced ◽

Reinforced Polymer ◽

Fiber Reinforced Polymer Composites

Structural health monitoring techniques based on vibration parameters have been used to assess internal delamination damage in fiber-reinforced polymer composites. Recently, machine learning algorithms have been adopted to solve the inverse problem of predicting delamination parameters from natural frequency shifts. In this article, a delamination detection methodology is proposed based on the changes in multiple modes of frequencies to assess the interface, location, and size of delamination in fiber-reinforced polymer composites. Three types of machine learning algorithms, namely back propagation neural network, extreme learning machine, and support vector machine, were adopted as inverse algorithms for assessment of the delamination parameters, with a special focus on the interface prediction. A theoretical model of a fiber-reinforced polymer beam with delamination under vibration was constructed to learn how the frequencies are affected by the delaminations (“forward problem”) and to generate a database of “frequency shifts versus delamination parameters” to be used in machine learning algorithms for delamination prediction (“inverse problem”). Multiple carbon/epoxy fiber-reinforced polymer beam specimens were manufactured and measured by a laser scanning Doppler vibrometer to extract the modal frequencies. Numerical and experimental verification results have shown that the support vector machine has the best prediction performance among the three machine learning algorithms, with high prediction accuracy while requiring only a small number of samples. For predicting the interface of delamination, which is a discrete variable, support vector machine classification achieved better prediction accuracy and required less running time than regression. This study is one of the first to prove the applicability of the support vector machine for structural health monitoring of delamination damage in fiber-reinforced polymer composites and has the potential to improve the prediction capability of machine learning algorithms. Another significant outcome of the study is that the interface of delamination has been predicted accurately with the support vector machine.
