Churn Prediction in Telecom Industry Using Machine Learning Algorithms with K-Best and Principal Component Analysis

Author(s):  
K. V. Anjana ◽  
Siddhaling Urolagin
Author(s):  
Anupam Sen

Machine learning (ML) techniques play an important role in the medical field. Early diagnosis is required to improve the treatment of carcinoma. In this analysis, the Breast Cancer Coimbra dataset (BCCD), which has ten predictors, is analyzed to classify carcinoma. Feature selection methods and machine learning algorithms are applied to the dataset, which is taken from the UCI repository. The WEKA (“Waikato Environment for Knowledge Analysis”) tool is used to run the machine learning techniques, and Principal Component Analysis (PCA) is used for feature extraction. Several classification algorithms are applied through WEKA, namely Glmnet, Gbm, ada Boosting, Adabag Boosting, C50, Cforest, DcSVM, fnn, Ksvm, and Node Harvest, and they are compared on accuracy as well as on the Kappa statistic, Mean Absolute Error (MAE), and Root Mean Square Error (RMSE). The 10-fold cross-validation method is used for training, testing, and validation.
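
The evaluation measures named in this abstract can be illustrated with a small standard-library sketch. It is not the authors' WEKA workflow: the function names are hypothetical, the errors are assumed to be measured between 0/1 labels and predicted positive-class probabilities, and the folds are assumed to be contiguous and unshuffled.

```python
import math
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Kappa statistic: agreement between labels and predictions beyond chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: a single class everywhere
    return (observed - expected) / (1.0 - expected)


def mae(y_true, p_pos):
    """Mean absolute error between 0/1 labels and predicted probabilities."""
    return sum(abs(t - p) for t, p in zip(y_true, p_pos)) / len(y_true)


def rmse(y_true, p_pos):
    """Root mean square error between 0/1 labels and predicted probabilities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, p_pos)) / len(y_true))


def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size
```

In practice each fold's test indices would be scored with a classifier trained on the matching train indices, and the measures averaged over the ten folds.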


2018 ◽  
Author(s):  
Kang K. Yan ◽  
Xiaofei Wang ◽  
Wendy Lam ◽  
Varut Vardhanabhuti ◽  
Anne W.M. Lee ◽  
...  

Abstract: Radiomics is a newly emerging field that involves the extraction of a large number of quantitative features from biomedical images through the use of data-characterization algorithms. Radiomics provides a noninvasive approach to personalized therapy decisions by identifying distinctive imaging features for predicting prognosis and therapeutic response. So far, many of the published radiomics studies use existing out-of-the-box algorithms, which are not specific to radiomics data, to identify prognostic markers from biomedical images. To better utilize biomedical images, we propose a novel machine learning approach, stability selection supervised principal component analysis (SSSuperPCA), that identifies a set of stable features from radiomics big data and couples this with dimension reduction for right-censored survival outcomes. In this paper, we describe stability selection supervised principal component analysis for radiomics data with right-censored survival outcomes. The proposed approach allows us to identify a set of stable features that are highly associated with the survival outcomes, control the per-family error rate, and predict survival in a simple yet meaningful manner. We evaluate the performance of SSSuperPCA using simulations and real data sets for non-small cell lung cancer and head and neck cancer, and compare it with other machine learning algorithms. The results demonstrate that our method has a competitive edge over other existing methods in identifying prognostic markers from biomedical big imaging data for the prediction of right-censored survival outcomes. An R package, SSSuperPCA, is available at the website: http://web.hku.hk/~herbpang/SSSuperPCA.html


2021 ◽  
Vol 40 ◽  
pp. 03010
Author(s):  
Chinmay Lokare ◽  
Rachana Patil ◽  
Saloni Rane ◽  
Deepakkumar Kathirasen ◽  
Yogita Mistry

In today’s world it is necessary to protect one’s identity so that personal information can be accessed only with a person’s authentic credentials. Malpractices such as signature forgery, used to gain access to a person’s important information, are on the rise. To counter the signature verification problem, there have been a number of advances in verifying the authenticity of signatures using various techniques, including machine learning and deep learning. This paper introduces a novel approach to verifying signatures using the difference of Gaussians filtering technique, the gray-level co-occurrence matrix feature extraction technique, principal component analysis, and kernel principal component analysis, combined with various machine learning algorithms. The publicly available Kaggle offline handwritten signature dataset is used for training. This article compares the accuracy of various machine learning algorithms on the dataset. After training, the lowest accuracy achieved is 56.66% for the Naive Bayes algorithm. The highest accuracies achieved are 82% for K-Nearest Neighbour (KNN) and 81.66% for Random Forest, using the principal components and kernel principal components of the dataset.


Author(s):  
Hyeuk Kim

Unsupervised learning in machine learning divides data into several groups. Observations in the same group have similar characteristics, and observations in different groups have different characteristics. In this paper, we cluster data using partitioning around medoids, which has some advantages over k-means clustering. We apply it to baseball players in the Korea Baseball League. We also apply principal component analysis to the data and plot the result using two components as the axes. Through this procedure, we interpret the meaning of the clustering graphically. The combination of partitioning around medoids and principal component analysis can be applied to any other data, and the approach makes it easy to identify the characteristics of each group.
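
Partitioning around medoids can be sketched in a few lines of standard-library Python. This is a simplified variant, not the paper's implementation: it assumes Euclidean distance, starts from the first k points instead of the usual BUILD phase, and accepts the first swap that lowers the total cost. The function names and sample points are hypothetical.

```python
import math


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k):
    """Simplified PAM: naive start, then swap medoids while the cost drops."""
    medoids = list(range(k))  # assumption: the first k points start as medoids
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in range(len(points)):
                if candidate in medoids:
                    continue
                # Try replacing the i-th medoid with a non-medoid point.
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                cost = total_cost(points, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    # Assign each point to the position of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: math.dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels


# Two well-separated groups of made-up 2D points.
sample_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sample_medoids, sample_labels = pam(sample_points, k=2)
```

Because each medoid is an actual observation, a cluster centre here would correspond to a real player, which is one reason medoids can be easier to interpret than k-means centroids.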

