scholarly journals A Proposed Model for Predicting Employee Turnover of Information Technology Specialists Using Data Mining Techniques

Author(s):  
Ahmed Hosny Ghazi ◽  
Samir Ismail Elsayed ◽  
Ayman Elsayed Khedr

This article proposes a data mining framework to predict the significant explanations of employee turn-over problems. Using Support vector machine, decision tree, deep learning, random forest, and other classification algorithms, the authors propose features prediction framework to determine the influencing factors of employee turn-over problem. The proposed framework categorizes a set of historical behavior such as years at company, over time, performance rating, years since last promotion, and total working years. The proposed framework also classifies demographics features such as Age, Monthly Income, and Distance from Home, Marital Status, Education, and Gender. It also uses attitudinal employee characteristics to determine the reasons for employee turnover in the information technology sector. It has been found that the monthly rate, overtime, and employee age are the most significant factors which cause employee turnover.

Plants ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 95
Author(s):  
Heba Kurdi ◽  
Amal Al-Aldawsari ◽  
Isra Al-Turaiki ◽  
Abdulrahman S. Aldawood

In the past 30 years, the red palm weevil (RPW), Rhynchophorus ferrugineus (Olivier), a pest that is highly destructive to all types of palms, has rapidly spread worldwide. However, detecting infestation with the RPW is highly challenging because symptoms are not visible until the death of the palm tree is inevitable. In addition, the use of automated RPW weevil identification tools to predict infestation is complicated by a lack of RPW datasets. In this study, we assessed the capability of 10 state-of-the-art data mining classification algorithms, Naive Bayes (NB), KSTAR, AdaBoost, bagging, PART, J48 Decision tree, multilayer perceptron (MLP), support vector machine (SVM), random forest, and logistic regression, to use plant-size and temperature measurements collected from individual trees to predict RPW infestation in its early stages before significant damage is caused to the tree. The performance of the classification algorithms was evaluated in terms of accuracy, precision, recall, and F-measure using a real RPW dataset. The experimental results showed that infestations with RPW can be predicted with an accuracy up to 93%, precision above 87%, recall equals 100%, and F-measure greater than 93% using data mining. Additionally, we found that temperature and circumference are the most important features for predicting RPW infestation. However, we strongly call for collecting and aggregating more RPW datasets to run more experiments to validate these results and provide more conclusive findings.


Author(s):  
Efat Jabarpour ◽  
Amin Abedini ◽  
Abbasali Keshtkar

Introduction: Osteoporosis is a disease that reduces bone density and loses the quality of bone microstructure leading to an increased risk of fractures. It is one of the major causes of inability and death in elderly people. The current study aims at determining the factors influencing the incidence of osteoporosis and providing a predictive model for the disease diagnosis to increase the diagnostic speed and reduce diagnostic costs. Methods: An Individual's data including personal information, lifestyle, and disease information were reviewed. A new model has been presented based on the Cross-Industry Standard Process CRISP methodology. Besides, Support Vector Machine (SVM) and Bayes methods (Tree Augmented Naïve Bayes (TAN)) and Clementine12 have been used as data mining tools. Results: Some features have been detected to affect this disease. The rules have been extracted that can be used as a pattern for the prediction of the patients' status. Classification precision was calculated to be 88.39% for SVM, and 91.29% for  (TAN) when the precision of  TAN  is higher comparing to other methods. Conclusion: The most effective factors concerning osteoporosis are detected and can be used for a new sample with defined characteristics to predict the possibility of osteoporosis in a person.  


2019 ◽  
Vol 2019 ◽  
pp. 1-14 ◽  
Author(s):  
Grigore Stamatescu ◽  
Iulia Stamatescu ◽  
Nicoleta Arghira ◽  
Ioana Fagarasan

Considering the advances in building monitoring and control through networks of interconnected devices, effective handling of the associated rich data streams is becoming an important challenge. In many situations, the application of conventional system identification or approximate grey-box models, partly theoretic and partly data driven, is either unfeasible or unsuitable. The paper discusses and illustrates an application of black-box modelling achieved using data mining techniques with the purpose of smart building ventilation subsystem control. We present the implementation and evaluation of a data mining methodology on collected data from over one year of operation. The case study is carried out on four air handling units of a modern campus building for preliminary decision support for facility managers. The data processing and learning framework is based on two steps: raw data streams are compressed using the Symbolic Aggregate Approximation method, followed by the resulting segments being input into a Support Vector Machine algorithm. The results are useful for deriving the behaviour of each equipment in various modi of operation and can be built upon for fault detection or energy efficiency applications. Challenges related to online operation within a commercial Building Management System are also discussed as the approach shows promise for deployment.


Author(s):  
Mohammad M. Masud ◽  
Latifur Khan ◽  
Bhavani Thuraisingham

This chapter applies data mining techniques to detect email worms. Email messages contain a number of different features such as the total number of words in message body/subject, presence/absence of binary attachments, type of attachments, and so on. The goal is to obtain an efficient classification model based on these features. The solution consists of several steps. First, the number of features is reduced using two different approaches: feature-selection and dimension-reduction. This step is necessary to reduce noise and redundancy from the data. The feature-selection technique is called Two-phase Selection (TPS), which is a novel combination of decision tree and greedy selection algorithm. The dimensionreduction is performed by Principal Component Analysis. Second, the reduced data is used to train a classifier. Different classification techniques have been used, such as Support Vector Machine (SVM), Naïve Bayes and their combination. Finally, the trained classifiers are tested on a dataset containing both known and unknown types of worms. These results have been compared with published results. It is found that the proposed TPS selection along with SVM classification achieves the best accuracy in detecting both known and unknown types of worms.


2020 ◽  
Vol 17 (8) ◽  
pp. 3804-3809
Author(s):  
A. Yovan Felix ◽  
Karthik Reddy Vuyyuru ◽  
Viswas Puli

Human Resource Management has gotten one of the basic pastimes of supervisors and chiefs in practically wide variety of corporations to include plans for accurately locating profoundly qualified representatives. In similar way, administrations come to be intrigued about the presentation of these representatives. Particularly to guarantee the fitting person apportioned to the beneficial employment on the opportune time. From right here the enthusiasm of statistics in mining process has been growing that its goal is disclosure of facts from huge measures of statistics. Three fundamental Data Mining strategies were applied for building the arrangement version and distinguishing the quality factors that emphatically impact the exhibition. To get a profoundly actual version, a few trials were achieved dependent on the beyond procedures which can be actualized in WEKA tool for empowering leaders and Human Resource professionals to anticipate and improve the exhibition of their representatives. This paper makes use of Hadoop for the remedy of great measure of data with which may be guaranteed to be able to decide the impact.


Author(s):  
M. Jupri ◽  
Riyanarto Sarno

The achievement of accepting optimal tax need effective and efficient tax supervision can be achieved by classifying taxpayer compliance to tax regulations. Considering this issue, this paper proposes the classification of taxpayer compliance using data mining algorithms; i.e. C4.5, Support Vector Machine, K-Nearest Neighbor, Naive Bayes, and Multilayer Perceptron based on the compliance of taxpayer data. The taxpayer compliance can be classified into four classes, which are (1) formal and material compliant taxpayers, (2) formal compliant taxpayers, (3) material compliant taxpayers, and (4) formal and material non-compliant taxpayers. Furthermore, the results of data mining algorithms are compared by using Fuzzy AHP and TOPSIS to determine the best performance classification based on the criteria of Accuracy, F-Score, and Time required. Selection of the taxpayer's priority for more detailed supervision at each level of taxpayer compliance is ranked using Fuzzy AHP and TOPSIS based on criteria of dataset variables. The results show that C4.5 is the best performance classification and achieves preference value of 0.998; whereas the MLP algorithm results from the lowest preference value of 0.131. Alternative taxpayer A233 is the top priority taxpayer with a preference value of 0.433; whereas alternative taxpayer A051 is the lowest priority taxpayer with a preference value of 0.036.


2021 ◽  
Vol 11 (19) ◽  
pp. 9096
Author(s):  
Idongesit Ekerete ◽  
Matias Garcia-Constantino ◽  
Alexandros Konios ◽  
Mustafa A. Mustafa ◽  
Yohanca Diaz-Skeete ◽  
...  

This paper proposes the fusion of Unobtrusive Sensing Solutions (USSs) for human Activity Recognition and Classification (ARC) in home environments. It also considers the use of data mining models and methods for cluster-based analysis of datasets obtained from the USSs. The ability to recognise and classify activities performed in home environments can help monitor health parameters in vulnerable individuals. This study addresses five principal concerns in ARC: (i) users’ privacy, (ii) wearability, (iii) data acquisition in a home environment, (iv) actual recognition of activities, and (v) classification of activities from single to multiple users. Timestamp information from contact sensors mounted at strategic locations in a kitchen environment helped obtain the time, location, and activity of 10 participants during the experiments. A total of 11,980 thermal blobs gleaned from privacy-friendly USSs such as ceiling and lateral thermal sensors were fused using data mining models and methods. Experimental results demonstrated cluster-based activity recognition, classification, and fusion of the datasets with an average regression coefficient of 0.95 for tested features and clusters. In addition, a pooled Mean accuracy of 96.5% was obtained using classification-by-clustering and statistical methods for models such as Neural Network, Support Vector Machine, K-Nearest Neighbour, and Stochastic Gradient Descent on Evaluation Test.


2019 ◽  
Vol 3 (2) ◽  
pp. 33
Author(s):  
Raheleh Hamedanizad ◽  
Elham Bahmani ◽  
Mojtaba Jamshidi ◽  
Aso Mohammad Darwesh

   Addiction to narcotics is one of the greatest health challenges in today’s world which has become a serious threat for social, economic, and cultural structures and has ruined a part of an active force of the society and it is one of the main factors of growth of diseases such as HIV and hepatitis. Today, addiction is known as a disease and welfare organization, and many of the dependent centers try to help the addicts treat this disease. In this study, using data mining algorithms and based on data collected from opioid withdrawal applicants referring to welfare organization, a prediction model is proposed to predict the success of opioid withdrawal applicants. In this study, the statistical population is comprised opioid withdrawal applicants in a welfare organization. This statistical population includes 26 features of 793 instances including men and women. The proposed model is a combination of meta-learning algorithms (decorate and bagging) and J48 decision tree implemented in Weka data mining software. The efficiency of the proposed model is evaluated in terms of precision, recall, Kappa, and root mean squared error and the results are compared with algorithms such as multilayer perceptron neural network, Naive Bayes, and Random Forest. The results of various experiments showed that the precision of the proposed model is 71.3% which is superior over the other compared algorithms.


Sign in / Sign up

Export Citation Format

Share Document