Can machine learning techniques predict customer dissatisfaction? A feasibility study for the automotive industry

The automotive industry faces stronger competition than ever as new competitors disrupt the sector. Delivering services that maximize customer satisfaction will be one of the most crucial competitive advantages in the future. Around 1 terabyte of objective data is created every hour today, and this volume will grow significantly as the number of connected services within the automotive industry increases. However, customer satisfaction is currently determined solely from subjective questionnaires, without taking the vast amount of objective sensor and service process data into account. This work presents an industrial application that addresses this research gap and thus provides a solution with high practical impact for the highly competitive automotive market. The work addresses two fundamental business questions: 1) Can dissatisfied customers be classified based on data that is produced during every service visit? 2) Can dissatisfaction indicators be derived from service process data? A machine learning problem was set up that compared 5 classifiers on data from 19,008 real service visits at an automotive company. The 105 extracted features were drawn from the most significant available sources: warranty, diagnostic, dealer system and general vehicle data. The best result for customer dissatisfaction classification was 88.8%, achieved with the SVM classifier (RBF kernel). Furthermore, the 46 most promising indicators of dissatisfaction were identified by evolutionary feature selection. Our system was capable of classifying customer dissatisfaction solely based on the objective data that is generated by almost every service visit. As the amount of such data is continuously growing, we expect that the presented data-driven approach can achieve even better results in the future.
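
The RBF kernel mentioned in the abstract above has a standard definition, k(x, z) = exp(-gamma * ||x - z||^2), which the following minimal Python sketch illustrates. The function name, the `gamma` value, and the feature vectors are assumptions made for this example; the abstract does not state the study's hyperparameters or implementation.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Hypothetical, already-scaled feature vectors for two service visits
# (illustrative values only, not data from the study).
visit_a = [0.1, 0.8, 0.3]
visit_b = [0.2, 0.7, 0.3]

print(rbf_kernel(visit_a, visit_a))  # identical inputs have similarity 1.0
print(rbf_kernel(visit_a, visit_b))  # similarity decays with squared distance
```

An SVM with this kernel scores a new visit by its similarity to the support vectors, so visits that lie close together in feature space tend to receive the same label.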


Binary Spectrum Feature for Improved Classifier Performance

10.36227/techrxiv.12993122 ◽

2020 ◽

Author(s):

Nalika Ulapane ◽

Karthick Thiyagarajan ◽

Sarath Kodagoda

Keyword(s):

Machine Learning ◽

Classification Performance ◽

Feature Reduction ◽

Sensor Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Monitoring Task ◽

Classifier Performance ◽

Spectrum Feature

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.


A Non-Intrusive Approach for Indoor Occupancy Detection in Smart Environments

Sensors ◽

10.3390/s18113953 ◽

2018 ◽

Vol 18 (11) ◽

pp. 3953 ◽

Cited By ~ 3

Author(s):

Bruno Abade ◽

David Perez Abreu ◽

Marilia Curado

Keyword(s):

Machine Learning ◽

Environmental Data ◽

Machine Learning Techniques ◽

Smart Environments ◽

Prototype System ◽

Indoor Environments ◽

Process Data ◽

Indoor Space ◽

Occupancy Detection ◽

Real Scenario

Smart Environments try to adapt their conditions, focusing on the detection, localisation, and identification of people to improve their comfort. It is common to use different sensors, actuators, and analytic techniques in this kind of environment to process data from the surroundings and actuate accordingly. In this research, a solution to improve the user’s experience in Smart Environments based on information obtained from indoor areas, following a non-intrusive approach, is proposed. We used Machine Learning techniques to detect occupants and estimate the number of persons in a specific indoor space. The proposed solution was tested in a real scenario using a prototype system, composed of nodes and sensors, specifically designed and developed to gather the environmental data of interest. The results obtained demonstrate that with the developed system it is possible to obtain, process, and store environmental information. Additionally, the analysis performed on the gathered data using Machine Learning and pattern recognition mechanisms shows that it is possible to determine the occupancy of indoor environments.


Supermarket Sales Prediction Using Regression

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/951022021 ◽

2021 ◽

Vol 10 (2) ◽

pp. 1153-1157

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Low Cost ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Customer Data ◽

Sales Data ◽

Online Marketplace ◽

Sales Prediction ◽

The Future

Sales forecasting is important for companies engaged in retailing, logistics, manufacturing, marketing and wholesaling. It allows companies to allocate resources efficiently, to estimate sales revenue and to plan strategies that benefit the company’s future. In this paper, product sales at a particular store are predicted in a way that produces better performance than other machine learning algorithms. The dataset used for this project is the Big Mart Sales data from 2013. Nowadays, shopping malls and supermarkets keep track of the sales data of each individual item in order to predict future customer demand. The data contains a large amount of customer data and item attributes. Further, frequent patterns are detected by mining the data from the data warehouse. The data can then be used to predict future sales with the help of several machine learning techniques (algorithms) for companies like Big Mart. In this project, we propose a model using the XGBoost algorithm for predicting the sales of companies like Big Mart and found that it produces better performance than other existing models. The project also compares this model with other models in terms of their performance metrics. Big Mart is an online marketplace where people can buy, sell or advertise their merchandise at low cost. The goal of the paper is to make Big Mart a shopping paradise for buyers and a marketing solution for sellers. The ultimate aim is the complete satisfaction of the customers. The project “SUPERMARKET SALES PREDICTION” builds a predictive model and estimates the sales of each product at a particular store. Big Mart can use this model to understand the product properties that play a major role in increasing sales. This can also be done on the basis of hypotheses formed before looking at the data.


Analyzing Behavior of Cancer Patients using Machine Learning Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i8414.078919 ◽

2019 ◽

Vol 8 (9) ◽

pp. 1547-1556

Keyword(s):

Machine Learning ◽

Natural Language ◽

Cancer Patients ◽

Language Processing ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Operating Characteristics ◽

Decision Tree Classifier ◽

Tree Classifier

Online discussion forums and blogs are vibrant platforms where cancer patients express their views in the form of stories. These stories sometimes become a source of inspiration for other patients who are anxiously searching for similar cases. This paper proposes a method using natural language processing and machine learning to analyze unstructured texts collected from patients’ reviews and stories. The proposed methodology aims to identify behavior, emotions, side-effects, decisions and demographics associated with cancer patients. The pre-processing phase of our work involves extraction of web text, followed by text cleaning in which some special characters and symbols are omitted, and finally tagging the texts using the POS (Parts of Speech) tagger of NLTK (Natural Language Toolkit). The post-processing phase trains seven machine learning classifiers (see Table 6). The Decision Tree classifier shows the highest precision (0.83) among the classifiers, while the area under the receiver operating characteristic curve (AUC) is highest for the Support Vector Machine (SVM) classifier (0.98).
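
The two metrics reported in the abstract above, precision and AUC, can be computed from labels and classifier scores as in the following Python sketch. It uses the standard definitions; the function names and the toy labels and scores are assumptions for illustration and are not taken from the paper.

```python
def precision(y_true, y_pred):
    """Precision = true positives / all predicted positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive outscores
    a random negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else (0.5 if p == n else 0.0)
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and classifier scores (not data from the paper).
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.7, 0.3, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(precision(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Precision depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason different classifiers can lead on each metric.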


Arabic English Cross-Lingual Plagiarism Detection Based on Keyphrases Extraction, Monolingual and Machine Learning Approach

Asian Journal of Research in Computer Science ◽

10.9734/ajrcos/2018/v2i330075 ◽

2019 ◽

pp. 1-12

Author(s):

Mokhtar Al-Suhaiqi ◽

Muneer A. S. Hazaa ◽

Mohammed Albared

Keyword(s):

Machine Learning ◽

Machine Learning Techniques ◽

Detection Methods ◽

Support Vector ◽

Svm Classifier ◽

Learning Approach ◽

Plagiarism Detection ◽

Machine Learning Approach ◽

Cross Lingual ◽

Cross Language

Due to the rapid growth of research articles in various languages, the cross-lingual plagiarism detection problem has received increasing interest in recent years. Cross-lingual plagiarism detection is a more challenging task than monolingual plagiarism detection. This paper addresses the problem of cross-lingual plagiarism detection (CLPD) by proposing a method that combines keyphrase extraction, monolingual detection methods and a machine learning approach. The research methodology used in this study made it possible to accomplish the objectives of designing, developing, and implementing an efficient Arabic-English cross-lingual plagiarism detection system. This paper empirically evaluates five different monolingual plagiarism detection methods, namely i) N-Grams Similarity, ii) Longest Common Subsequence, iii) Dice Coefficient, iv) Fingerprint-based Jaccard Similarity and v) Fingerprint-based Containment Similarity. In addition, three machine learning approaches, namely i) naïve Bayes, ii) Support Vector Machine, and iii) linear logistic regression classifiers, are used for Arabic-English cross-language plagiarism detection. Several experiments are conducted to evaluate the performance of the keyphrase extraction methods. In addition, several experiments investigate the performance of machine learning techniques to find the best method for Arabic-English cross-language plagiarism detection. In these experiments, the highest result was obtained using the SVM classifier, with a 92% F-measure. In addition, all classifiers achieved their highest results when most of the monolingual plagiarism detection methods were used.
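
The similarity measures named in the abstract above have standard textbook definitions, sketched below in Python as a rough illustration. This is not the authors' implementation: the whitespace tokenization, the n-gram size, and all function names are assumptions made for the example, and the fingerprinting step that the paper applies before Jaccard and containment is omitted.

```python
def word_ngrams(text, n=2):
    """Return the set of word n-grams in `text` (lowercased, whitespace split)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def dice(a, b):
    """Dice coefficient: 2|A & B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard similarity: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a, b):
    """Containment of A in B: |A & B| / |A|."""
    return len(a & b) / len(a) if a else 0.0


def lcs_length(x, y):
    """Length of the longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            curr.append(prev[j - 1] + 1 if xi == yj else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# Hypothetical sentences, assumed to be already in the same language.
suspicious = "the cat sat on the mat"
source = "the cat lay on the mat"
a, b = word_ngrams(suspicious), word_ngrams(source)

print(dice(a, b), jaccard(a, b), containment(a, b))
print(lcs_length(suspicious.split(), source.split()))
```

In a pipeline like the one described, scores of this kind could serve as input features for the classifiers, though the abstract does not spell out exactly how they are combined.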


Improved argumentative paragraphs detection in academic theses supported with unit segmentation

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-219237 ◽

2021 ◽

pp. 1-11

Author(s):

Jesús Miguel García-Gorrostieta ◽

Aurelio López-López ◽

Samuel González-López ◽

Adrián Pastor López-Monroy

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Automatic Detection ◽

Machine Learning Techniques ◽

Svm Classifier ◽

Complex Task ◽

Decision Tree Classifier ◽

Learning Techniques ◽

Tree Classifier ◽

Academic Author

Academic thesis writing is a complex task that requires the author to be skilled in argumentation. The goal of the academic author is to communicate clear ideas and to convince the reader of the presented claims. However, few students are good arguers, and this is a skill that takes time to master. In this paper, we present an exploration of lexical features used to model automatic detection of argumentative paragraphs using machine learning techniques. We present a novel proposal, which combines the information in the complete paragraph with the detection of argumentative segments in order to achieve improved results for the detection of argumentative paragraphs. We propose two approaches: a more descriptive one, which uses a decision tree classifier with indicators and lexical features, and a more efficient one, which uses an SVM classifier with lexical features and a Document Occurrence Representation (DOR). Both approaches consider the detection of argumentative segments to ensure that a paragraph detected as argumentative does indeed contain segments with argumentation. We achieved encouraging results for both approaches.


Gender Identification and Classification of Drosophila melanogaster Flies Using Machine Learning Techniques

Computational and Mathematical Methods in Medicine ◽

10.1155/2022/4593330 ◽

2022 ◽

Vol 2022 ◽

pp. 1-9

Author(s):

Channabasava Chola ◽

J. V. Bibal Benifa ◽

D. S. Guru ◽

Abdullah Y. Muaad ◽

J. Hanumanthappa ◽

...

Keyword(s):

Machine Learning ◽

Drosophila Melanogaster ◽

Genetic Model ◽

Ventral View ◽

Model Organism ◽

Automated System ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Biological Studies

Drosophila melanogaster is an important genetic model organism used extensively in medical and biological studies. About 61% of known human genes have a recognizable match with the genetic code of Drosophila flies, and 50% of fly protein sequences have mammalian analogues. Recently, several investigations have been conducted in Drosophila to study the functions of specific genes that exist in the central nervous system, heart, liver, and kidney. The outcomes of research in Drosophila are also used as a unique tool to study human-related diseases. This article presents a novel automated system to classify the gender of Drosophila flies from microscopic images (ventral view). The proposed system takes an image as input and converts it into a grayscale image in order to extract texture features. Then, machine learning (ML) classifiers such as support vector machines (SVM), Naive Bayes (NB), and K-nearest neighbour (KNN) are used to classify the Drosophila as male or female. The proposed model is evaluated using a real microscopic image dataset, and the results show that the accuracy of the KNN is 90%, which is higher than the accuracy of the SVM classifier.
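
The K-nearest neighbour step described in the abstract above can be illustrated with a minimal Python sketch. It assumes that texture features have already been extracted into numeric vectors; the two-dimensional feature values, the choice of k, and the function name are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest
    training points, using Euclidean distance."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D texture feature vectors (illustrative values only).
train_X = [(0.2, 0.9), (0.3, 0.8), (0.25, 0.85),
           (0.8, 0.2), (0.9, 0.3), (0.85, 0.25)]
train_y = ["female", "female", "female", "male", "male", "male"]

print(knn_predict(train_X, train_y, (0.3, 0.9)))
print(knn_predict(train_X, train_y, (0.7, 0.2)))
```

An odd k avoids tied votes in a two-class problem such as this one; in practice k would be tuned on held-out data.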


Blog Backlinks Malicious Domain Name Detection via Supervised Learning

International Journal on Semantic Web and Information Systems ◽

10.4018/ijswis.2021070101 ◽

2021 ◽

Vol 17 (3) ◽

pp. 1-17

Author(s):

Abdulrahman A. Alshdadi ◽

Ahmed S. Alghamdi ◽

Ali Daud ◽

Saqib Hussain

Keyword(s):

Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Financial Loss ◽

Domain Name ◽

Hybrid Features ◽

Learning Techniques ◽

Web Spam ◽

Social Media Platforms

Web spam consists of unwanted requests on websites, low-quality backlinks, emails, and reviews generated by automated programs. It is a major threat for website owners: because of it, they can lose their top keyword rankings in search engines, which results in huge financial loss to the business. Over the years, researchers have tried to identify malicious domains based on specific features. However, features from the Lighthouse plugin, the Ahrefs tool, and social media platforms have been ignored. In this paper, the authors focus on detecting spam domain names in a dataset that mixes legitimate and spam domain names. The dataset is taken from Google webmaster tools. Machine learning models are applied to individual, distributed, and hybrid features, which significantly improves on the performance of existing machine learning techniques for malicious domain detection. The support vector machine (SVM) classifier achieves better accuracy than Naïve Bayes, C4.5, AdaBoost, and LogitBoost.


Online inspection system based on machine learning techniques: real case study of fabric textures classification for the automotive industry

Journal of Intelligent Manufacturing ◽

10.1007/s10845-016-1254-6 ◽

2016 ◽

Vol 30 (1) ◽

pp. 351-361 ◽

Cited By ~ 3

Author(s):

Pedro Malaca ◽

Luis F. Rocha ◽

D. Gomes ◽

João Silva ◽

Germano Veiga

Keyword(s):

Machine Learning ◽

Automotive Industry ◽

Machine Learning Techniques ◽

Real Case ◽

Inspection System ◽

Learning Techniques ◽

Online Inspection


Simulation of a CSP Solar Steam Generator, Using Machine Learning

Energies ◽

10.3390/en14123613 ◽

2021 ◽

Vol 14 (12) ◽

pp. 3613

Author(s):

Adrian Gonzalez Gonzalez ◽

Jose Valeriano Alvarez Cabal ◽

Miguel Angel Vigil Berrocal ◽

Rogelio Peón Menéndez ◽

Adrian Riesgo Fernández

Keyword(s):

Machine Learning ◽

Performance Model ◽

Machine Learning Techniques ◽

Support Vector ◽

Concentrated Solar Power ◽

Process Data ◽

Mena Region ◽

Power Block ◽

Learning Techniques ◽

Fitting In

Developing an accurate concentrated solar power (CSP) performance model requires significant effort and time. The power block (PB) is the most complex system, and its modeling is clearly the most complicated and time-demanding part. Nonetheless, PB layouts are quite similar across CSP plants, meaning that enough historical process data are available from commercial plants to apply machine learning techniques. These algorithms allowed the development of a very accurate black-box PB model in a very short amount of time. This PB model could be easily integrated as a block into the overall performance model (PM). The machine learning technique selected was SVR (support vector regression). The PB model was trained using a complete year of data from a commercial CSP plant situated in southern Spain. With a very limited set of inputs, the PB model results were very accurate, according to their validation against a new complete year of data. The model fit well not only on an aggregate basis, but also in the transients between operation modes. To validate applicability, the same modeling methodology was applied to data from a very different CSP plant, located in the MENA region and with more than double the nominal electric power, obtaining an excellent fit in the validation.
