prediction models Latest Research Papers

Machine Learning and Survey-based Predictors of InfoSec Non-Compliance

ACM Transactions on Management Information Systems ◽

10.1145/3466689 ◽

2022 ◽

Vol 13 (2) ◽

pp. 1-20

Author(s):

Byron Marshall ◽

Michael Curry ◽

Robert E. Crossler ◽

John Correia

Keyword(s):

Feature Selection ◽

Security Policy ◽

Prediction Models ◽

Training Programs ◽

Selection Process ◽

Multiple Time ◽

Compliance Behavior ◽

Systematic Feature ◽

Tree Models ◽

Time Frames

Survey items developed in behavioral Information Security (InfoSec) research should be practically useful in identifying individuals who are likely to create risk by failing to comply with InfoSec guidance. The literature shows that attitudes, beliefs, and perceptions drive compliance behavior, and it has influenced the creation of a multitude of training programs focused on improving one's InfoSec behaviors. While automated controls and directly observable technical indicators are generally preferred by InfoSec practitioners, difficult-to-monitor user actions can still compromise the effectiveness of automatic controls. For example, despite prohibition, doubtful or skeptical employees often increase organizational risk by using the same password to authenticate to corporate and external services. Analysis of network traffic or device configurations is unlikely to provide evidence of these vulnerabilities, but responses to well-designed surveys might. Guided by the relatively new IPAM model, this study administered 96 survey items from the behavioral InfoSec literature, across three separate points in time, to 217 respondents. Using systematic feature selection techniques, manageable subsets of 29, 20, and 15 items were identified and tested as predictors of non-compliance with security policy. The feature selection process validates IPAM's innovation in using nuanced self-efficacy and planning items across multiple time frames. Prediction models were trained using several ML algorithms. Practically useful levels of prediction accuracy were achieved with, for example, ensemble tree models identifying 69% of the riskiest individuals within the top 25% of the sample. The findings indicate the usefulness of psychometric items from the behavioral InfoSec literature in guiding training programs and other cybersecurity control activities and demonstrate that they are promising as additional inputs to AI models that monitor networks for security events.
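The "share of the riskiest individuals found within the top 25% of the sample" figure reported above can be read as a capture rate at a fixed cutoff. The sketch below is one plausible way to compute such a rate, not the authors' procedure: the function name `top_fraction_capture`, the binary risk labels, and the toy data are all assumptions made for illustration.

```python
def top_fraction_capture(scores, is_risky, fraction=0.25):
    """Share of truly risky individuals found among the highest-scored `fraction`."""
    if len(scores) != len(is_risky):
        raise ValueError("scores and is_risky must have the same length")
    total_risky = sum(is_risky)
    if total_risky == 0:
        raise ValueError("no risky individuals in the sample")
    # Rank respondents from highest to lowest predicted risk.
    ranked = sorted(zip(scores, is_risky), key=lambda pair: pair[0], reverse=True)
    # Keep at least one respondent so tiny samples still yield a cutoff.
    cutoff = max(1, int(len(ranked) * fraction))
    captured = sum(flag for _, flag in ranked[:cutoff])
    return captured / total_risky


# Hypothetical toy data: 8 respondents, 3 of them non-compliant (flag = 1).
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]
is_risky = [1, 0, 1, 0, 1, 0, 0, 0]
capture = top_fraction_capture(scores, is_risky, fraction=0.25)
```

With these made-up numbers, the top 25% holds two respondents, one of whom is non-compliant, so the capture rate is one third. A model that ranked at random would be expected to capture about 25% at this cutoff, which is the baseline a figure such as 69% would be compared against.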

Network-Level Bridge Deterioration Prediction Models That Consider the Effect of Maintenance and Rehabilitation

Journal of Infrastructure Systems ◽

10.1061/(asce)is.1943-555x.0000662 ◽

2022 ◽

Vol 28 (1) ◽

Author(s):

Feiyue Wang ◽

Cheng-Chun “Barry” Lee ◽

Nasir G. Gharaibeh

Keyword(s):

Prediction Models ◽

Maintenance And Rehabilitation

Operating Speed Prediction Models by Vehicle Type on Two-Lane Rural Highways in Indian Hilly Terrains

Journal of Transportation Engineering Part A Systems ◽

10.1061/jtepbs.0000644 ◽

2022 ◽

Vol 148 (3) ◽

Author(s):

Jaydip Goyani ◽

Purvang Chaudhari ◽

Shriniwas Arkatkar ◽

Gaurang Joshi ◽

Said M. Easa

Keyword(s):

Prediction Models ◽

Operating Speed ◽

Rural Highways ◽

Vehicle Type ◽

Speed Prediction

Automatic missing value imputation for cleaning phase of diabetic’s readmission prediction model

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v12i2.pp2001-2013 ◽

2022 ◽

Vol 12 (2) ◽

pp. 2001

Author(s):

Jesmeen Mohd Zebaral Hoque ◽

Jakir Hossen ◽

Shohel Sayeed ◽

Chy. Mohammed Tawsif K. ◽

Jaya Ganesan ◽

...

Keyword(s):

Incomplete Data ◽

Missing Values ◽

Prediction Models ◽

Low Cost ◽

Support Vector ◽

Data Sampling ◽

Data Set ◽

Missing Value ◽

Missing Value Imputation ◽

Proper Analysis

The healthcare industry has recently begun generating large volumes of data. If hospitals can put this data to use, they could more easily predict outcomes and provide better treatment at early stages and at low cost. Here, data analytics (DA) was used to support correct decisions through proper analysis and prediction. However, inappropriate data may lead to flawed analysis and thus yield unacceptable conclusions. Hence, transforming the improper data in the data set into useful data is essential. Machine learning (ML) techniques were used to overcome the issues caused by incomplete data. A new architecture, automatic missing value imputation (AMVI), was developed to predict missing values in the dataset, including data sampling and feature selection. Four prediction models (logistic regression, support vector machine (SVM), AdaBoost, and random forest) were selected from well-known classification algorithms. The performance of the complete AMVI architecture was evaluated using a structured data set obtained from the UCI repository. Accuracy of around 90% was achieved. Cross-validation also confirmed that the trained ML model is suitable and not over-fitted. The trained model is developed from the dataset and does not depend on a specific environment. It trains and selects the best-performing model depending on the data available.

A Comparative Study of Energy Big Data Analysis for Product Management in a Smart Factory

Journal of Organizational and End User Computing ◽

10.4018/joeuc.291559 ◽

2022 ◽

Vol 34 (2) ◽

pp. 1-17

Author(s):

Rahman A. B. M. Salman ◽

Lee Myeongbae ◽

Lim Jonghyun ◽

Yongyun Cho ◽

Shin Changsun

Keyword(s):

Economic Growth ◽

Support Vector Machine ◽

Coefficient Of Variation ◽

Prediction Models ◽

Mean Squared Error ◽

Training Data ◽

Smart Factory ◽

Support Vector ◽

Product Management ◽

Data Set

Energy has been regarded as one of the key inputs for a country's economic growth and social development. Analysis and modeling of industrial energy are currently a time-intensive process because more and more energy is consumed for economic growth in a smart factory. This study aims to present and analyse predictive models of the data-driven system to be used by appliances and to find the most significant product item. With repeated cross-validation, three statistical models were trained and then evaluated on a test set: 1) General Linear Regression Model (GLM), 2) Support Vector Machine (SVM), and 3) Boosting Tree (BT). The performance of the prediction models was measured by R², Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Coefficient of Variation (CV). The best model in the study is the Support Vector Machine (SVM), which achieved an R² of 0.86 on the training data set and 0.85 on the testing data set with a low coefficient of variation, and the most significant product of this smart factory is Skelp.
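The four evaluation measures named in this abstract can be written out in a few lines. The following is a minimal sketch using the textbook definitions; the abstract does not say how CV was computed, so treating it as RMSE divided by the mean of the observed values is an assumption, as are the function name `regression_metrics` and the toy values.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE and CV (assumed here to be RMSE / mean of y_true)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)             # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    # CV is undefined when the observed mean is zero.
    cv = rmse / mean_true if mean_true != 0 else float("nan")
    return {"r2": 1 - ss_res / ss_tot, "rmse": rmse, "mae": mae, "cv": cv}


# Hypothetical energy readings and model predictions.
y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [2.5, 3.5, 6.5, 7.5]
metrics = regression_metrics(y_true, y_pred)
```

Comparing R² on the training and testing sets, as the abstract does with 0.86 and 0.85, is a quick check for over-fitting: a large gap between the two would suggest the model does not generalise.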

Causal Feature Selection with Missing Data

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3488055 ◽

2022 ◽

Vol 16 (4) ◽

pp. 1-24

Author(s):

Kui Yu ◽

Yajing Yang ◽

Wei Ding

Keyword(s):

Feature Selection ◽

Missing Data ◽

Real World ◽

Missing Values ◽

Prediction Models ◽

Causal Structure ◽

Data Imputation ◽

Accurate Data ◽

Unified Framework ◽

Class Variable

Causal feature selection aims at learning the Markov blanket (MB) of a class variable for feature selection. The MB of a class variable captures the local causal structure among the class variable and its MB, and all other features are probabilistically independent of the class variable conditional on its MB. This enables causal feature selection to identify potential causal features for building robust and physically meaningful prediction models. Missing data, ubiquitous in many real-world applications, remains an open research problem in causal feature selection due to its technical complexity. In this article, we discuss a novel multiple imputation MB (MimMB) framework for causal feature selection with missing data. MimMB integrates Data Imputation with MB Learning in a unified framework to enable the two key components to engage with each other. MB Learning enables Data Imputation in a potentially causal feature space for achieving accurate data imputation, while accurate Data Imputation in turn helps MB Learning identify a reliable MB of the class variable. We then design an enhanced kNN estimator for imputing missing values and instantiate MimMB. In our comprehensive experimental evaluation on synthetic and real-world datasets, the new approach effectively learns the MB of a given variable in a Bayesian network and outperforms rival algorithms.
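For readers unfamiliar with kNN-based imputation, the sketch below shows the plain, unenhanced technique: a missing entry is filled with the average of that column over the k most similar rows. It is not the enhanced estimator or the MimMB framework from the article, whose details the abstract does not give; the function name `knn_impute`, the distance normalisation, and the toy rows are assumptions.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest rows (plain kNN)."""

    def distance(a, b):
        # Root-mean-square difference over columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in rows]  # copy so the input stays untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have column j observed.
            candidates = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            candidates = [c for c in candidates if c[0] != math.inf]
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:  # leave the gap if no comparable donor exists
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Hypothetical data: the second row is missing its third feature.
rows = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [5.0, 6.0, 7.0],
    [1.2, 1.9, 3.2],
]
filled = knn_impute(rows, k=2)
```

In this toy case the two nearest donors are the first and last rows, so the gap is filled with the average of 3.0 and 3.2. The abstract's point is that restricting the distance computation to a potentially causal feature space (the MB) should make such neighbours more relevant than when all features are used, as they are here.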

Development of shale gas production prediction models based on machine learning using early data

Energy Reports ◽

10.1016/j.egyr.2021.12.040 ◽

2022 ◽

Vol 8 ◽

pp. 1229-1237

Author(s):

Wente Niu ◽

Jialiang Lu ◽

Yuping Sun

Keyword(s):

Machine Learning ◽

Shale Gas ◽

Prediction Models ◽

Gas Production ◽

Early Data ◽

Gas Production Prediction ◽

Production Prediction

Energy-saving potential prediction models for large-scale building: A state-of-the-art review

Renewable and Sustainable Energy Reviews ◽

10.1016/j.rser.2021.111992 ◽

2022 ◽

Vol 156 ◽

pp. 111992

Author(s):

Xiu'e Yang ◽

Shuli Liu ◽

Yuliang Zou ◽

Wenjie Ji ◽

Qunli Zhang ◽

...

Keyword(s):

Energy Saving ◽

Large Scale ◽

Prediction Models ◽

State Of The Art ◽

Energy Saving Potential

Models for Efficient Utilization of Resources for Upgrading Android Mobile Technology

International Journal of System Dynamics Applications ◽

10.4018/ijsda.20220701.oa2 ◽

2022 ◽

Vol 11 (2) ◽

pp. 1-22

Author(s):

Abha Jain ◽

Ankita Bansal

Keyword(s):

Mobile Phones ◽

Operating Systems ◽

Mobile Technology ◽

Prediction Models ◽

High Dimensionality ◽

Careful Attention ◽

Huge Amount ◽

Efficient Utilization ◽

Reduction Techniques ◽

Dimensionality Reduction Techniques

Customers' need to be connected to the network at all times has driven the evolution of mobile technology. Operating systems play a vital role in this technology. Android is currently one of the most widely used operating systems in mobile phones. The authors analysed three stable versions of Android: 6.0, 7.0, and 8.0. Incorporating a change in a version after it is released requires a lot of rework and thus incurs a huge amount of cost. In this paper, the aim is to reduce this rework by identifying, during the early phase of development, the parts of a version that need careful attention. Machine learning prediction models are developed to identify the parts that are more prone to change. The accuracy of such models should be high, as developers rely heavily on them. The high dimensionality of the dataset may hamper the accuracy of the models. Thus, the authors explore four dimensionality reduction techniques, which are unexplored in the field of network and communication. The results show that accuracy improves after reducing the features.

A modified mayfly-SVM approach for early detection of type 2 diabetes mellitus

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v12i1.pp524-533 ◽

2022 ◽

Vol 12 (1) ◽

pp. 524

Author(s):

Ratna Patil ◽

Sharvari Tamane ◽

Shitalkumar Adhar Rawandale ◽

Kanishk Patil

Keyword(s):

Diabetes Mellitus ◽

Type 2 Diabetes ◽

Type 2 Diabetes Mellitus ◽

Prediction Models ◽

Early Stage ◽

Machine Learning Algorithms ◽

Support Vector ◽

Test Accuracy ◽

Proposed Model

Diabetes mellitus is a chronic disease that severely affects many people worldwide. Early diagnosis of this disease is of paramount importance, as physicians and patients can then work towards prevention and mitigation of future complications. Hence, there is a need to develop a system that diagnoses type 2 diabetes mellitus (T2DM) at an early stage. Recently, a large number of studies have emerged with prediction models to diagnose T2DM. Most importantly, the published literature lacks multi-class studies. Therefore, the primary objective of the study is the development of a multi-class predictive model that takes advantage of routinely available clinical data in diagnosing T2DM using machine learning algorithms. In this work, a modified mayfly-support vector machine is implemented to detect the prediabetic stage accurately. To assess the effectiveness of the proposed model, a comparative study was undertaken against T2DM prediction models developed by other researchers over the last five years. The proposed model was validated on data collected from local hospitals and on the benchmark PIMA dataset available in the UCI repository. The study reveals that the modified Mayfly-SVM has a considerable edge over metaheuristic optimization algorithms in local as well as global searching capabilities and attained a maximum test accuracy of 94.5% on PIMA.
