Why Do Data Scientists Want to Change Jobs: Using Machine Learning Techniques to Analyze Employees’ Intentions in Switching Jobs

Data scientists are among the highest-paid and most in-demand employees of the 21st century, which makes it relatively easy for them to switch jobs. In this paper, we follow the Cross-Industry Standard Process for Data Mining (CRISP-DM) and the data science life cycle to analyze the factors that predict whether a data scientist is looking for a new job. Specifically, we apply machine learning techniques to data from Kaggle.com. We find that the features with the highest impact on whether a data scientist wants to change jobs include the city development index, company size, and company type. When we examine the city development index more closely, we find evidence suggesting that employees move from cities with lower development indexes to cities with higher ones as they become more experienced. The predictive analysis system we use achieves an average accuracy above 78%.

Comparative Analysis of Machine Learning Techniques Using Predictive Modeling

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999200904164539 ◽

2020 ◽

Vol 13 ◽

Author(s):

Ritu Khandelwal ◽

Hemlata Goyal ◽

Rajveer Singh Shekhawat

Keyword(s):

Machine Learning ◽

Comparative Analysis ◽

Data Science ◽

Training Data ◽

Machine Learning Techniques ◽

Future Trends ◽

Data Set ◽

Learning Stage ◽

Learning Techniques ◽

Different Types

Introduction: Machine learning acts as a bridge between business and data science. With data science involved, the business goal becomes extracting valuable insights from the available data. Bollywood, a multi-million-dollar industry, makes up a large part of Indian cinema. This paper attempts to predict whether an upcoming Bollywood movie will be a Blockbuster, Superhit, Hit, Average, or Flop by applying machine learning techniques for classification and prediction. The first step in building a classifier or prediction model is the learning stage, in which a training data set is used to train the model with a chosen technique or algorithm; the rules generated in this stage form the model, which can then predict future trends in different types of organizations. Methods: Classification and prediction techniques, namely Support Vector Machine (SVM), Random Forest, Decision Tree, Naïve Bayes, Logistic Regression, AdaBoost, and KNN, are applied to find efficient and effective results. All of these can be applied through GUI-based workflows organized into categories such as Data, Visualize, Model, and Evaluate. Result: A classifier or prediction model is built in the learning stage, in which the training data set is used to train the model with a chosen technique or algorithm, and the generated rules are used to build the model and predict future trends in different types of organizations. Conclusion: This paper focuses on a comparative analysis based on parameters such as accuracy and the confusion matrix to identify the best possible model for predicting movie success. Using the predicted success rate, production houses can plan their advertising and choose the best time to release a movie to gain higher returns.
Discussion: Data mining is the process of discovering patterns and relationships in large data sets in order to solve business problems and predict forthcoming trends. These predictions can help production houses plan their advertising and costs, and by accounting for these factors they can make a movie more profitable.
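
The abstract names accuracy and the confusion matrix as the comparison criteria but does not show how they are computed. The following is a minimal Python sketch of both measures for the five success classes. The function names and the sample labels are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the authors' GUI-based workflow.

```python
# The five success classes named in the abstract.
LABELS = ["Blockbuster", "Superhit", "Hit", "Average", "Flop"]


def confusion_matrix(y_true, y_pred, labels):
    """Count predictions: rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Made-up labels for illustration only; not data from the paper.
y_true = ["Hit", "Flop", "Flop", "Average", "Blockbuster", "Hit"]
y_pred = ["Hit", "Flop", "Average", "Average", "Superhit", "Hit"]

cm = confusion_matrix(y_true, y_pred, LABELS)
for label, row in zip(LABELS, cm):
    print(f"{label:12s} {row}")
print("accuracy:", accuracy(cm))
```

Computing one such matrix per classifier on the same held-out data gives a like-for-like comparison. The off-diagonal cells show which success classes a model confuses, which a single accuracy figure hides.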

Sentiment Analysis in Social Media using Machine Learning Techniques

Iraqi Journal of Science ◽

10.24996/ijs.2020.61.1.22 ◽

2020 ◽

pp. 193-201 ◽

Cited By ~ 1

Author(s):

Hayder A. Alatabi ◽

Ayad R. Abbas

Keyword(s):

Machine Learning ◽

Social Media ◽

Sentiment Analysis ◽

Machine Learning Techniques ◽

Great Success ◽

Social Media Data ◽

Learning Techniques ◽

The World ◽

Analysis System ◽

Media Data

In recent years, social media has achieved widespread use worldwide; statistics indicate that more than three billion people are on social media, which generates large quantities of online data. To analyze these data, a classification method known as sentiment analysis is used. This paper presents a new sentiment analysis system based on machine learning techniques, which aims to extract polarity from social media texts. Sentiment analysis based on machine learning techniques has achieved great success around the world. This paper investigates the topic and proposes a sentiment analysis system built on the Bayesian Rough Decision Tree (BRDT) algorithm. The experimental results show that the system achieves an accuracy of more than 95% on social media data.

Machine Learning Techniques for Internet of Things

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Integrating the Internet of Things Into Software Engineering Practices ◽

10.4018/978-1-5225-7790-4.ch008 ◽

2019 ◽

pp. 160-180

Author(s):

P. Priakanth ◽

S. Gopikrishnan

Keyword(s):

Machine Learning ◽

Data Science ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Independent Learning ◽

Machine Learning Techniques ◽

Analytical Models ◽

Guided Learning ◽

Learning Techniques ◽

Learning Machine

The idea of an intelligent, independent learning machine has fascinated humans for decades. The philosophy behind machine learning is to automate the creation of analytical models so that algorithms can learn continuously from the available data. Since IoT will be among the major sources of new data, data science will make a great contribution to making IoT applications more intelligent. Machine learning can be applied in cases where the desired outcome is known (guided learning), where the data is not known beforehand (unguided learning), or where learning results from interaction between a model and its environment (reinforcement learning). This chapter answers three questions: How can machine learning algorithms be applied to IoT smart data? What is the taxonomy of machine learning algorithms that can be adopted in IoT? And what characteristics of real-world IoT data require data analytics?

Machine Learning Techniques for Internet of Things

Research Anthology on Artificial Intelligence Applications in Security ◽

10.4018/978-1-7998-7705-9.ch067 ◽

2021 ◽

pp. 1490-1506

Author(s):

P. Priakanth ◽

S. Gopikrishnan

Keyword(s):

Machine Learning ◽

Data Science ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Independent Learning ◽

Machine Learning Techniques ◽

Analytical Models ◽

Guided Learning ◽

Learning Techniques ◽

Learning Machine

Introduction to Computational Psychometrics: Towards a Principled Integration of Data Science and Machine Learning Techniques into Psychometrics

Methodology of Educational Measurement and Assessment - Computational Psychometrics: New Methodologies for a New Generation of Digital Learning and Assessment ◽

10.1007/978-3-030-74394-9_1 ◽

2021 ◽

pp. 1-6

Author(s):

Alina A. von Davier ◽

Robert J. Mislevy ◽

Jiangang Hao

Keyword(s):

Machine Learning ◽

Data Science ◽

Machine Learning Techniques ◽

Learning Techniques

Application of Machine Learning Techniques As a Means of Mooring Integrity Monitoring

Volume 3: Structures, Safety, and Reliability ◽

10.1115/omae2019-96411 ◽

2019 ◽

Author(s):

Jonathan M. Gumley ◽

Hayden Marcollo ◽

Stuart Wales ◽

Andrew E. Potts ◽

Christopher J. Carra

Keyword(s):

Machine Learning ◽

Data Science ◽

Single Point ◽

Original System ◽

Training Data ◽

Machine Learning Techniques ◽

Mooring Line ◽

Artificial Noise ◽

Data Set ◽

Learning Techniques

It is increasingly important in the offshore floating production sector to develop reliable and robust means of continuously monitoring the integrity of mooring systems for FPSOs and FPUs, particularly in light of the upcoming introduction of API-RP-2MIM. Here, the limitations of the current range of monitoring techniques are discussed, including well-established technologies such as load cells, sonar, or visual inspection, within the context of the growing mainstream acceptance of data science and machine learning. Due to the large fleet of floating production platforms currently in service, there is a need for a readily deployable solution that can be retrofitted to existing platforms to passively monitor the performance of floating assets on their moorings, for which machine-learning-based systems have particular advantages. An earlier investigation conducted in 2016 on a shallow-water, single-point-moored FPSO employed host facility data from in-service field measurements before and after a single mooring line failure event. This paper presents how the same machine learning techniques were applied to a deep-water, semi-taut, spread-moored system where no host facility data was available, so a calibrated hydrodynamic numerical model had to be used as the basis for the training data set. The machine learning techniques applied to both real and synthetically generated data were successful in replicating the response of the original system, even with the latter subjected to different variations of artificial noise. Furthermore, using a probability-based approach, it was demonstrated that replicating the response of the underlying system was a powerful technique for predicting changes in the mooring system.
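
The abstract does not specify its probability-based approach. One way to picture the general idea is to treat a model's predicted response as the expected behaviour of the intact system and score how far new measurements drift from it. The Python sketch below does this with a simple z-score on the mean residual. The function names, the z-score test, and the numbers are all illustrative assumptions and are not the authors' method.

```python
import statistics


def fit_residual_baseline(measured, predicted):
    """Mean and standard deviation of (measured - predicted) while the system is intact."""
    residuals = [m - p for m, p in zip(measured, predicted)]
    return statistics.mean(residuals), statistics.stdev(residuals)


def change_score(measured, predicted, baseline):
    """z-score of the mean residual in a new window relative to the intact baseline."""
    baseline_mean, baseline_std = baseline
    residuals = [m - p for m, p in zip(measured, predicted)]
    standard_error = baseline_std / len(residuals) ** 0.5
    return (statistics.mean(residuals) - baseline_mean) / standard_error


# Made-up response values (arbitrary units); `predicted` stands in for the model output.
predicted = [10.0, 10.0, 10.0, 10.0]
intact_measured = [10.1, 9.9, 10.2, 9.8]
shifted_measured = [11.0, 11.2, 10.9, 11.1]

baseline = fit_residual_baseline(intact_measured, predicted)
print("intact score :", change_score(intact_measured, predicted, baseline))
print("shifted score:", change_score(shifted_measured, predicted, baseline))
```

A large absolute score means the measured response no longer matches what the model learned from the intact system. In this hypothetical setup that would prompt an inspection, with the alarm threshold chosen from the false-alarm rate an operator can tolerate.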

Airbnb (Air Bed and Breakfast) Listing Analysis Through Machine Learning Techniques

10.4018/978-1-7998-8455-2.ch008 ◽

2022 ◽

pp. 209-232

Author(s):

Xiang Li ◽

Jingxi Liao ◽

Tianchuan Gao

Keyword(s):

Machine Learning ◽

Principal Component Analysis ◽

Data Science ◽

Principal Component ◽

Machine Learning Techniques ◽

Classification Models ◽

Performance Measurements ◽

Learning Techniques ◽

Source Data ◽

Bed And Breakfast

Machine learning is a broad field that spans multiple disciplines, including mathematics, computer science, and data science. Some of its concepts, such as deep neural networks, can be complicated and difficult to explain in a few words. This chapter focuses on essential methods, namely classification from supervised learning, clustering, and dimensionality reduction, that can be interpreted and explained in a way accessible to beginners. Data for Airbnb (Air Bed and Breakfast) listings in London are used as the source data to study the effect of each machine learning technique. Using K-means clustering, principal component analysis (PCA), random forest, and other methods to build classification models from the features, the chapter predicts classification results and provides performance measurements to test the models.
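
K-means, one of the methods the chapter names, can be sketched in a few lines of plain Python. This is a minimal version of Lloyd's algorithm run on made-up listings. The `(price, bedrooms)` features, the sample values, and the choice of the first k points as initial centroids are assumptions made for illustration and do not come from the chapter.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm using the first k points as the initial centroids."""
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return centroids, assignments


# Hypothetical listings as (nightly price, bedrooms); not real Airbnb data.
listings = [(50, 1), (60, 1), (52, 2), (200, 4), (220, 5), (210, 4)]
centroids, assignments = kmeans(listings, k=2)
print(centroids)
print(assignments)
```

Price dominates the distance here because the features are left unscaled. In practice features are usually standardized first, and initial centroids are chosen at random or with a scheme such as k-means++ instead of taking the first k rows.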

Data Analytics and Modeling in IoT-Fog Environment for Resource Constrained IoT-Applications - A Review

Recent Advances in Computer Science and Communications ◽

10.2174/2666255814666210715161630 ◽

2021 ◽

Vol 14 ◽

Author(s):

Omar Farooq ◽

Parminder Singh

Keyword(s):

Machine Learning ◽

Data Science ◽

Data Classification ◽

Machine Learning Techniques ◽

Learning Approaches ◽

Resource Constrained ◽

Learning Techniques ◽

Iot Devices ◽

The Right ◽

Continuous Use

Introduction: The emergence of concepts such as Big Data, Data Science, Machine Learning (ML), and the Internet of Things (IoT) has expanded the potential for research. The continuous use of IoT devices and sensors that constantly collect data puts tremendous pressure on the existing IoT network. Materials and Methods: This resource-constrained IoT environment is flooded with data acquired from millions of IoT nodes deployed at the device level. The limited resources of the IoT network have driven researchers towards data management. This paper focuses on data classification at the device level, edge/fog level, and cloud level using machine learning techniques. Results: The data coming from different devices is vast and varied, so it is essential to choose the right approach for classification and analysis. Doing so will help optimize the data at the device edge/fog level to improve the network's future performance. Conclusion: This paper presents data classification, machine learning approaches, and a proposed mathematical model for the IoT environment.

Coupling data science with community crowdsourcing for urban renewal policy analysis: An evaluation of Atlanta’s Anti-Displacement Tax Fund

Environment and Planning B Urban Analytics and City Science ◽

10.1177/2399808318819847 ◽

2018 ◽

Vol 47 (6) ◽

pp. 1081-1097 ◽

Cited By ~ 3

Author(s):

Jeremy Auerbach ◽

Christopher Blackburn ◽

Hayley Barton ◽

Amanda Meng ◽

Ellen Zegura

Keyword(s):

Machine Learning ◽

Urban Renewal ◽

Data Science ◽

Property Tax ◽

Machine Learning Techniques ◽

Learning Approaches ◽

Household Level ◽

Learning Techniques ◽

Level Data ◽

Program Costs

We estimate the cost and impact of a proposed anti-displacement program in the Westside of Atlanta (GA) with data science and machine learning techniques. This program intends to fully subsidize property tax increases for eligible residents of neighborhoods where there are two major urban renewal projects underway, a stadium and a multi-use trail. We first estimate household-level income eligibility for the program with data science and machine learning approaches applied to publicly available household-level data. We then forecast future property appreciation due to urban renewal projects using random forests with historic tax assessment data. Combining these projections with household-level eligibility, we estimate the costs of the program for different eligibility scenarios. We find that our household-level data and machine learning techniques result in fewer eligible homeowners but significantly larger program costs, due to higher property appreciation rates than the original analysis, which was based on census and city-level data. Our methods have limitations, namely incomplete data sets, the accuracy of representative income samples, the availability of characteristic training set data for the property tax appreciation model, and challenges in validating the model results. The eligibility estimates and property appreciation forecasts we generated were also incorporated into an interactive tool for residents to determine program eligibility and view their expected increases in home values. Community residents have been involved with this work and provided greater transparency, accountability, and impact of the proposed program. Data collected from residents can also correct and update the information, which would increase the accuracy of the program estimates and validate the modeling, leading to a novel application of community-driven data science.
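
The cost estimate described above combines household-level eligibility with forecast property appreciation. The Python sketch below shows that arithmetic in its simplest form. The field names, the flat tax rate, the constant appreciation rate, the income limit, and all the numbers are hypothetical assumptions, and the code does not reproduce the paper's random-forest forecasts or its eligibility model.

```python
from dataclasses import dataclass


@dataclass
class Household:
    income: float          # estimated annual household income
    assessed_value: float  # current assessed property value
    appreciation: float    # forecast annual appreciation rate, e.g. 0.10 for 10%


def program_cost(households, income_limit, tax_rate, years):
    """Total subsidy: each eligible household's tax increase over its
    baseline-year tax, summed across every year of the horizon."""
    total = 0.0
    for household in households:
        if household.income > income_limit:
            continue  # not eligible under this scenario
        baseline_tax = household.assessed_value * tax_rate
        for year in range(1, years + 1):
            future_value = household.assessed_value * (1 + household.appreciation) ** year
            total += future_value * tax_rate - baseline_tax
    return total


# Hypothetical households and eligibility scenarios for illustration only.
households = [
    Household(income=30_000, assessed_value=100_000, appreciation=0.10),
    Household(income=80_000, assessed_value=200_000, appreciation=0.10),
]
for limit in (50_000, 100_000):
    cost = program_cost(households, limit, tax_rate=0.01, years=2)
    print(f"income limit {limit}: cost {cost:.2f}")
```

Varying `income_limit` corresponds to the different eligibility scenarios the abstract mentions. Because appreciation compounds, higher forecast rates can raise total cost even when fewer households qualify, which matches the direction of the paper's finding.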

Buildings Energy Efficiency Analysis and Classification Using Various Machine Learning Technique Classifiers

Energies ◽

10.3390/en13133497 ◽

2020 ◽

Vol 13 (13) ◽

pp. 3497 ◽

Cited By ~ 1

Author(s):

César Benavente-Peces ◽

Nisrine Ibadah

Keyword(s):

Machine Learning ◽

Energy Efficiency ◽

Data Science ◽

Smart Cities ◽

Thermal Modeling ◽

Modern Society ◽

Machine Learning Techniques ◽

Single Family ◽

Residential Areas ◽

Learning Techniques

Energy efficiency is a major concern for achieving sustainability in modern society. The sustainability of smart cities depends on the availability of energy-efficient infrastructure and services. Buildings make up most of a city and are responsible for a large share (40%) of energy consumption and emissions to the atmosphere. Smart cities need smart buildings to achieve sustainability goals. Thermal modeling of buildings is essential in the race for energy efficiency. In this paper, we show how ICT and data science technologies and techniques can be applied to evaluate the energy efficiency of buildings. Specifically, we apply machine learning techniques to classify buildings based on their energy efficiency, focusing on single-family buildings in residential areas. Throughout the paper, we demonstrate the capabilities of machine learning techniques for this classification task, and we analyze and compare the performance of different classifiers. Furthermore, we introduce new parameters that have some impact on building thermal modeling, especially those concerning the environment where the building is located. We also discuss ICT and its growing relevance for acquiring and monitoring relevant parameters through wireless sensor networks. An appropriate and reliable dataset is needed to achieve the best results. Finally, we demonstrate that reliable classification is feasible with only a few feature parameters.
