Evaluation of Interstate Work Zone Mobility using Probe Vehicle Data and Machine Learning Techniques

According to the Federal Highway Administration (FHWA), US work zones on freeways account for nearly 24% of nonrecurring freeway delays and 10% of overall congestion. Historically, there have been limited scalable datasets to investigate the specific causes of congestion due to work zones or to improve work zone planning processes to characterize the impact of work zone congestion. In recent years, third-party data vendors have provided scalable speed data from Global Positioning System (GPS) devices and cell phones which can be used to characterize mobility on all roadways. Each work zone has unique characteristics and varying mobility impacts which are predicted during the planning and design phases, but can realistically be quite different from what is ultimately experienced by the traveling public. This paper uses these datasets to introduce a scalable Work Zone Mobility Audit (WZMA) template. Additionally, the paper uses metrics developed for individual work zones to characterize the impact of more than 250 work zones varying in length and duration from Southeast Michigan. The authors make recommendations to work zone engineers on useful data to collect for improving the WZMA. As more systematic work zone data are collected, improved analytical assessment techniques, such as machine learning processes, can be used to identify the factors that will predict future work zone impacts. The paper concludes by demonstrating two machine learning algorithms, Random Forest and XGBoost, which show historical speed variation is a critical component when predicting the mobility impact of work zones.

Machine Learning Approach to Forecast Work Zone Mobility using Probe Vehicle Data

Transportation Research Record: Journal of the Transportation Research Board ◽

10.1177/0361198120927401 ◽

2020 ◽

Vol 2674 (9) ◽

pp. 157-167

Author(s):

Mohsen Kamyab ◽

Stephen Remias ◽

Erfan Najmi ◽

Sanaz Rabinia ◽

Jonathan M. Waddell

Keyword(s):

Machine Learning ◽

Traffic Congestion ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Work Zone ◽

Data Sources ◽

Work Zones ◽

Probe Vehicle ◽

Vehicle Data ◽

Lane Closures

The aim of deploying intelligent transportation systems (ITS) is often to help engineers and operators identify traffic congestion. The future of ITS-based traffic management is the prediction of traffic conditions using ubiquitous data sources. There are currently well-developed prediction models for recurrent traffic congestion such as during peak hour. However, there is a need to predict traffic congestion resulting from non-recurring events such as highway lane closures. As agencies begin to understand the value of collecting work zone data, rich data sets will emerge consisting of historical work zone information. In the era of big data, rich mobility data sources are becoming available that enable the application of machine learning to predict mobility for work zones. The purpose of this study is to utilize historical lane closure information with supervised machine learning algorithms to forecast spatio-temporal mobility for future lane closures. Various traffic data sources were collected from 1,160 work zones on Michigan interstates between 2014 and 2017. This study uses probe vehicle data to retrieve a mobility profile for these historical observations, and uses these profiles to apply random forest, XGBoost, and artificial neural network (ANN) classification algorithms. The mobility prediction results showed that the ANN model outperformed the other models by reaching up to 85% accuracy. The objective of this research was to show that machine learning algorithms can be used to capture patterns for non-recurrent traffic congestion even when hourly traffic volume is not available.

Insider Threat Detection Using Supervised Machine Learning Algorithms on an Extremely Imbalanced Dataset

International Journal of Cyber Warfare and Terrorism ◽

10.4018/ijcwt.2020040101 ◽

2020 ◽

Vol 10 (2) ◽

pp. 1-26

Author(s):

Naghmeh Moradpoor Sheykhkanloo ◽

Adam Hall

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Machine Learning Algorithms ◽

Third Party ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Insider Threat ◽

Threat Detection ◽

Imbalanced Dataset ◽

The Impact

An insider threat can take on many forms and fall under different categories. This includes malicious insider, careless/unaware/uneducated/naïve employee, and the third-party contractor. Machine learning techniques have been studied in published literature as a promising solution for such threats. However, they can be biased and/or inaccurate when the associated dataset is hugely imbalanced. Therefore, this article addresses the insider threat detection on an extremely imbalanced dataset which includes employing a popular balancing technique known as spread subsample. The results show that although balancing the dataset using this technique did not improve performance metrics, it did improve the time taken to build the model and the time taken to test the model. Additionally, the authors realised that running the chosen classifiers with parameters other than the default ones has an impact on both balanced and imbalanced scenarios, but the impact is significantly stronger when using the imbalanced dataset.
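
The balancing step mentioned in this abstract can be illustrated with a minimal sketch of spread subsampling: random undersampling so that no class is more than a chosen multiple of the rarest class. This is an illustration of the general idea only, not the authors' tooling; the function name `spread_subsample`, the `max_spread` parameter, and the toy event data are assumptions made for the example.

```python
import random
from collections import defaultdict


def spread_subsample(rows, label_of, max_spread=1.0, seed=0):
    """Randomly undersample so that no class holds more than
    max_spread * (size of the rarest class) rows."""
    if max_spread < 1.0:
        raise ValueError("max_spread must be >= 1.0")
    # Group rows by class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    if not by_class:
        return []
    rarest = min(len(members) for members in by_class.values())
    cap = int(rarest * max_spread)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    balanced = []
    for members in by_class.values():
        if len(members) > cap:
            members = rng.sample(members, cap)  # sample without replacement
        balanced.extend(members)
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 1,000 "normal" events and 10 "threat" events.
events = [("normal", i) for i in range(1000)] + [("threat", i) for i in range(10)]
balanced = spread_subsample(events, label_of=lambda e: e[0], max_spread=1.0)
relaxed = spread_subsample(events, label_of=lambda e: e[0], max_spread=3.0)
```

With `max_spread=1.0` every class is cut down to the size of the rarest one; larger values keep more majority-class rows. Because the balanced set is far smaller than the original, a shorter model build time of the kind the abstract reports is plausible even when the performance metrics do not improve.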

Application of LiDAR and Connected Vehicle Data to Evaluate the Impact of Work Zone Geometry on Freeway Traffic Operations

Transportation Research Record: Journal of the Transportation Research Board ◽

10.1177/0361198118758050 ◽

2018 ◽

Vol 2672 (16) ◽

pp. 1-13 ◽

Cited By ~ 2

Author(s):

Michelle M. Mekker ◽

Yun-Jou Lin ◽

Magdy K. I. Elbahnasawy ◽

Tamer S. A. Shamseldin ◽

Howell Li ◽

...

Keyword(s):

Case Studies ◽

Work Zone ◽

Extensive Literature ◽

Vehicle Speed ◽

Connected Vehicle ◽

Traffic Operations ◽

Work Zones ◽

Geometric Data ◽

Vehicle Data ◽

The Impact

Extensive literature exists regarding recommendations for lane widths, merging tapers, and work zone geometry to provide safe and efficient traffic operations. However, it is often infeasible or unsafe for inspectors to check these geometric features in a freeway work zone. This paper discusses the integration of LiDAR (Light Detection And Ranging)-generated geometric data with connected vehicle speed data to evaluate the impact of work zone geometry on traffic operations. Connected vehicle speed data can be used at either a system-wide (statewide) or segment-level view to identify periods of congestion and queueing. Examples of regional trends, localized incidents, and recurring bottlenecks are shown in the data in this paper. A LiDAR-mounted vehicle was deployed to a variety of work zones where recurring bottlenecks were identified to collect geometric data. In total, 350 directional miles were covered, resulting in approximately 360 GB of data. Two case studies, where geometric anomalies were identified, are discussed in this paper: a short segment with a narrow lane width of 10–10.5 feet and a merging taper that was about 200 feet shorter than recommended by the Manual on Uniform Traffic Control Devices. In both case studies, these work zone features did not conform to project specifications but were difficult to assess safely by an inspector in the field because of the high volume of traffic. The paper concludes by recommending the use of connected vehicle data to systematically identify work zones with recurring congestion and the use of LiDAR to assess work zone geometrics.

Optimizing Laboratory Investigations of Saline Intrusion by Incorporating Machine Learning Techniques

Water ◽

10.3390/w12112996 ◽

2020 ◽

Vol 12 (11) ◽

pp. 2996

Author(s):

Georgios Etsias ◽

Gerard A. Hamill ◽

Eric M. Benner ◽

Jesús F. Águila ◽

Mark C. McDonnell ◽

...

Keyword(s):

Machine Learning ◽

Image Processing ◽

Porous Medium ◽

Glass Bead ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Image Processing Technique ◽

Saline Intrusion ◽

Learning Techniques ◽

The Impact

Deriving saltwater concentrations from the light intensity values of dyed saline solutions is a long-established image processing practice in laboratory scale investigations of saline intrusion. The current paper presents a novel methodology that employs the predictive ability of machine learning algorithms in order to determine saltwater concentration fields. The proposed approach consists of three distinct parts: image pre-processing, porous medium classification (glass bead structure recognition) and saltwater field generation (regression). It minimizes the need for aquifer-specific calibrations, significantly shortening the experimental procedure by up to 50% of the time required. A series of typical saline intrusion experiments were conducted in homogeneous and heterogeneous aquifers, consisting of glass beads of varying sizes, to recreate the necessary laboratory data. An innovative method of distinguishing and filtering out the common experimental error introduced by both backlighting and the optical irregularities of the glass bead medium was formulated. This enabled the acquisition of quality predictions by classical, easy-to-use machine learning techniques, such as feedforward Artificial Neural Networks, using a limited amount of training data, proving the applicability of the procedure. The new process was benchmarked against a traditional regression algorithm. A series of variables were utilized to quantify the variance between the results generated by the two procedures. No compromise was found to the quality of the derived concentration fields and it was established that the proposed image processing technique is robust when applied to homogeneous and heterogeneous domains alike, outperforming the classical approach in all test cases. Moreover, the method minimized the impact of experimental errors introduced by small movements of the camera and the presence of air bubbles trapped in the porous medium.

A Framework for Structuring Learning Assessment in a Massively Multiplayer Online Educational Game

International Journal of Game-Based Learning ◽

10.4018/ijgbl.2014010103 ◽

2014 ◽

Vol 4 (1) ◽

pp. 37-59 ◽

Cited By ~ 13

Author(s):

Shawn Conrad ◽

Jody Clarke-Midura ◽

Eric Klopfer

Keyword(s):

Machine Learning ◽

Educational Games ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Educational Game ◽

Learning Assessment ◽

Assessment Design ◽

Assessment Techniques ◽

Future Work ◽

Massively Multiplayer

Educational games offer an opportunity to engage and inspire students to take interest in science, technology, engineering, and mathematical (STEM) subjects. Unobtrusive learning assessment techniques coupled with machine learning algorithms can be utilized to record students' in-game actions and formulate a model of the students' knowledge without interrupting the students' play. This paper introduces “Experiment Centered Assessment Design” (XCD), a framework for structuring a learning assessment feedback loop. XCD builds on the “Evidence Centered Assessment Design” (ECD) approach, which uses tasks to elicit evidence about students and their learning. XCD defines every task as an experiment in the scientific method, where an experiment maps a test of factors to observable outcomes. This XCD framework was applied to prototype quests in a massively multiplayer online (MMO) educational game. Future work would build upon the XCD framework and use machine learning techniques to provide feedback to students, teachers, and researchers.

Multicontextual Machine-Learning Approach to Modeling Traffic Impact of Urban Highway Work Zones

Transportation Research Record: Journal of the Transportation Research Board ◽

10.3141/2645-20 ◽

2017 ◽

Vol 2645 (1) ◽

pp. 184-194 ◽

Cited By ~ 3

Author(s):

Junseo Bae ◽

Kunhee Choi ◽

Jeong Ho Oh

Keyword(s):

Machine Learning ◽

Traffic Flow ◽

Local Governments ◽

Ad Hoc ◽

Work Zone ◽

Work Zones ◽

Highway Infrastructure ◽

Traffic Impact ◽

State And Local ◽

The Impact

Impact assessments of highway construction work zones (CWZs) are mandated for all federally funded highway infrastructure improvement projects. However, most existing approaches are ad hoc or project specific, so they are incapable of being benchmarked for any particular spatial region. A novel multicontextual approach to modeling the traffic impact of urban highway CWZs is proposed and tested in this paper. The proposed approach is unique because it models the impact of CWZ operations through a multicontextual quantitative method using big data for improved accuracy. In this study, a machine-learning technique was adopted to predict long-term traffic flow rates and the corresponding truck percentages. With the use of these predicted values, stereotypical patterns of traffic volume-to-capacity ratios were created for typical urban nighttime closures. Third-order curve-fitting models to achieve potential work zone travel time delays in heavily trafficked large urban cores were then developed and validated. This study will greatly help state and local governments and the general traveling public in major cities know the potential traffic flow resulting from construction and thereby facilitate progress on highway improvement projects with the better-informed work zone traffic flow and thus improve safety and mobility in and between CWZs.
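
The third-order curve fitting mentioned in this abstract can be sketched as an ordinary least-squares fit of a cubic polynomial. The sketch below assumes, for illustration only, that delay is modeled as a function of the volume-to-capacity ratio; the function names, the synthetic data, and the coefficient values are hypothetical and do not come from the paper.

```python
def fit_cubic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 + c3*x^3.

    Solves the normal equations (V^T V) c = V^T y, where V[i][j] = xs[i] ** j,
    and returns [c0, c1, c2, c3].
    """
    n = 4
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("need at least four distinct x values")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Synthetic, noise-free data generated from a known cubic (hypothetical values):
# x is a volume-to-capacity ratio, y is a delay in arbitrary units.
vc_ratio = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
delay = [1.0 + 2.0 * x - 3.0 * x ** 2 + 4.0 * x ** 3 for x in vc_ratio]
coeffs = fit_cubic(vc_ratio, delay)
```

On noise-free data the fit recovers the generating coefficients up to rounding error. With real observations, the fitted curve would need to be validated against held-out closures, as the abstract indicates the authors did for their own models.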

Towards Predicting Student’s Dropout in University Courses Using Different Machine Learning Techniques

Applied Sciences ◽

10.3390/app11073130 ◽

2021 ◽

Vol 11 (7) ◽

pp. 3130

Author(s):

Janka Kabathova ◽

Martin Drlik

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Machine Learning Algorithms ◽

Classification Model ◽

Machine Learning Techniques ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Unseen Data ◽

E Learning ◽

The Impact

Predicting student dropout early and precisely from available educational data is a widely studied topic in the learning analytics research field. Despite the amount of research already carried out, progress has not been significant, and this holds at all educational data levels. Even though various features have already been researched, it remains an open question which features can be considered appropriate for different machine learning classifiers applied to the typically scarce set of educational data at the e-learning course level. Therefore, the main goal of the research is to emphasize the importance of the data understanding and data gathering phases, stress the limitations of the available datasets of educational data, compare the performance of several machine learning classifiers, and show that even a limited set of features, which are available to teachers in the e-learning course, can predict student dropout with sufficient accuracy if the performance metrics are thoroughly considered. The data collected from four academic years were analyzed. The features selected in this study proved to be applicable in predicting course completers and non-completers. The prediction accuracy varied between 77 and 93% on unseen data from the next academic year. In addition to the frequently used performance metrics, the homogeneity of the machine learning classifiers was compared to overcome the impact of the limited size of the dataset on the high values of performance metrics obtained. The results showed that several machine learning algorithms could be successfully applied to a scarce dataset of educational data. At the same time, classification performance metrics should be thoroughly considered before deciding to deploy the best-performing classification model to predict potential dropout cases and design beneficial intervention mechanisms.
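
The advice to weigh performance metrics carefully can be made concrete with a small sketch. On a dataset where dropouts are the minority, accuracy alone can look good while the classifier misses every dropout. The function name `binary_metrics` and the toy labels below are assumptions made for illustration, not data from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical course with 20 students, 4 of whom drop out (label 1).
y_true = [1] * 4 + [0] * 16
# A trivial classifier that predicts "completes" for everyone.
always_completes = [0] * 20
# A classifier that finds 3 of the 4 dropouts and raises 2 false alarms.
useful = [1, 1, 1, 0] + [1, 1] + [0] * 14

trivial_scores = binary_metrics(y_true, always_completes)
useful_scores = binary_metrics(y_true, useful)
```

The trivial classifier reaches 80% accuracy with zero recall, while the second classifier has only slightly higher accuracy but identifies most dropouts, which is why recall and F1 are worth reporting alongside accuracy.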

Application of Machine Learning Techniques to Predict the Impact of Health Insurance on the Wellbeing of an Individual

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b7247.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 3065-3070

Keyword(s):

Machine Learning ◽

Health Insurance ◽

Large Scale ◽

Insurance Industry ◽

Financial Burden ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Insurance Companies ◽

The Government ◽

The Impact

The healthcare domain in India has suffered considerably despite the advancement in technology. Several financing schemes are endorsed by the insurance companies to lessen the financial burden faced by the government and people. Nonetheless, the Health Insurance segment in India remains underdeveloped due to various complexities that it faces. This paper exploits a heuristic sampling approach combined with ensemble Machine Learning algorithms on large-scale insurance business data to realize the current shape of the Health Insurance industry in India. Through the courtesy of Data Mining and Data Analytics, it is plausible to furnish insights that assist the common people in acquiring closure that helps in the process of decision making.

Adversarial Machine Learning Attacks and Defense Methods in the Cyber Security Domain

ACM Computing Surveys ◽

10.1145/3453158 ◽

2021 ◽

Vol 54 (5) ◽

pp. 1-36

Author(s):

Ishai Rosenberg ◽

Asaf Shabtai ◽

Yuval Elovici ◽

Lior Rokach

Keyword(s):

Machine Learning ◽

Cyber Security ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Future Research ◽

Research Directions ◽

Future Research Directions ◽

Adversarial Attack ◽

The Impact

In recent years, machine learning algorithms, and more specifically deep learning algorithms, have been widely used in many fields, including cyber security. However, machine learning systems are vulnerable to adversarial attacks, and this limits the application of machine learning, especially in non-stationary, adversarial environments, such as the cyber security domain, where actual adversaries (e.g., malware developers) exist. This article comprehensively summarizes the latest research on adversarial attacks against security solutions based on machine learning techniques and illuminates the risks they pose. First, the adversarial attack methods are characterized based on their stage of occurrence, and the attacker’s goals and capabilities. Then, we categorize the applications of adversarial attack and defense methods in the cyber security domain. Finally, we highlight some characteristics identified in recent research and discuss the impact of recent advancements in other adversarial learning domains on future research directions in the cyber security domain. To the best of our knowledge, this work is the first to discuss the unique challenges of implementing end-to-end adversarial attacks in the cyber security domain, map them in a unified taxonomy, and use the taxonomy to highlight future research directions.

Applying Machine Learning Techniques in Older People Activity Recognition using Wearable and Mobile Devices

10.5753/webmedia_estendido.2019.8140 ◽

2019 ◽

Author(s):

Flavio Vinicius Vieira Santana ◽

Bruno Henrique Rasteiro ◽

Larissa Cardoso Zimmermann ◽

Luciana De Nardin ◽

Maria da Graça Campos Pimentel

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Accelerometer Data ◽

Smartphone Apps ◽

Online Recognition ◽

Learning Techniques ◽

Combined Use ◽

Future Work

We investigate the potential of the combined use of smartwatch accelerometer data and smartphone apps for online older adults' activity recognition. We selected machine learning algorithms which resulted in a posteriori recognition accuracy of 98.92%. Our smartphone app, with the selected machine learning algorithms, carried out online recognition from data captured on the smartwatch. These results allow us, as future work, to assess the accuracy of online recognition when the system is used by older adults.
