An Ensemble Learning Model for Short-Term Passenger Flow Prediction

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Xiangping Wang ◽  
Lei Huang ◽  
Haifeng Huang ◽  
Baoyu Li ◽  
Ziyang Xia ◽  
...  

In recent years, the continuous improvement of urban public transportation capacity has made citizens’ travel increasingly convenient, but problems remain, such as morning and evening peak congestion, imbalance between vehicle supply and passenger demand, emergencies, and local surges in passenger flow caused by special circumstances such as public events and inclement weather. Properly guiding local passenger flow and deploying operating buses reasonably requires an understanding of how short-term public transportation passenger flow changes. This paper builds a short-term passenger flow prediction model for urban public transportation based on ensemble (integrated) learning. The goal is to use the integrated model to accurately predict short-term passenger flow. Multivariable Linear Regression (MLR), K-Nearest Neighbor (KNN), eXtreme Gradient Boosting (XGBoost), and Gated Recurrent Unit (GRU) serve as the four seed models, and a regression algorithm then integrates them to predict the passenger flow, station boarding and alighting, and cross-sectional passenger flow of the representative line 428 in the “Huitian Area” of Beijing from January 1, 2020, to May 31, 2020. Finally, the prediction results of the submodels are compared with those of the integrated model to verify the superiority of the integrated model. The results can enrich the short-term passenger flow forecasting system of urban public transportation and provide data support and a scientific basis for passenger flow management, vehicle management, and dispatch.
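The integration step described above, in which a regression combines the outputs of several seed models, can be illustrated with a minimal stacking sketch. This is not the paper's implementation: the held-out predictions, the targets, and the helper names (`solve_linear_system`, `fit_stacking_weights`, `stacked_predict`) are hypothetical, only two base models are shown instead of four, and the meta-learner is assumed to be an ordinary least-squares regression without an intercept.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_stacking_weights(base_preds, targets):
    """Least-squares weights w from the normal equations (P^T P) w = P^T y."""
    k = len(base_preds[0])
    gram = [[sum(row[i] * row[j] for row in base_preds) for j in range(k)]
            for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(base_preds, targets))
           for i in range(k)]
    return solve_linear_system(gram, rhs)


def stacked_predict(weights, preds):
    """Combine one row of base-model predictions into a single forecast."""
    return sum(w * p for w, p in zip(weights, preds))


# Hypothetical held-out predictions (passengers per interval) from two base models.
held_out_preds = [[100.0, 120.0], [200.0, 180.0], [150.0, 170.0], [90.0, 110.0]]
# Hypothetical observed flows, constructed here as 0.25 * model_a + 0.75 * model_b.
observed = [115.0, 185.0, 165.0, 105.0]

weights = fit_stacking_weights(held_out_preds, observed)
forecast = stacked_predict(weights, [130.0, 140.0])
print(weights, forecast)
```

In practice the meta-learner should be fitted on predictions for data the base models did not see during training; otherwise the weights tend to favor whichever base model overfits most.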

2021 ◽  
Vol 13 (5) ◽  
pp. 1021
Author(s):  
Hu Ding ◽  
Jiaming Na ◽  
Shangjing Jiang ◽  
Jie Zhu ◽  
Kai Liu ◽  
...  

Artificial terraces are of great importance for agricultural production and soil and water conservation. Automatic high-accuracy mapping of artificial terraces is the basis of monitoring and related studies. Previous research achieved artificial terrace mapping based on high-resolution digital elevation models (DEMs) or imagery. Because contextual information is important for terrace mapping, object-based image analysis (OBIA) combined with machine learning (ML) is widely used. However, the selection of an appropriate classifier is of great importance for the terrace mapping task. In this study, the performance of an integrated framework using OBIA and ML for terrace mapping was tested. A catchment, Zhifanggou, in the Loess Plateau, China, was used as the study area. First, optimized image segmentation was conducted. Then, features from the DEMs and imagery were extracted, and the correlations between the features were analyzed and ranked for classification. Finally, three commonly used ML classifiers, namely, extreme gradient boosting (XGBoost), random forest (RF), and k-nearest neighbor (KNN), were used for terrace mapping. The comparison with the ground truth, as delineated by field survey, indicated that random forest performed best, with a 95.60% overall accuracy (followed by 94.16% and 92.33% for XGBoost and KNN, respectively). The influence of class imbalance and feature selection is discussed. This work provides a credible framework for mapping artificial terraces.
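The overall accuracy figures quoted above are the share of correctly classified samples among all samples. A minimal sketch of that calculation from a confusion matrix follows; the matrix values and the function name `overall_accuracy` are hypothetical and are not taken from the study.

```python
def overall_accuracy(confusion):
    """Overall accuracy = sum of the diagonal / sum of all cells.

    Rows are ground-truth classes, columns are predicted classes.
    """
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical two-class matrix (terrace, non-terrace), not data from the study.
example_confusion = [
    [50, 5],   # true terrace: 50 correct, 5 missed
    [3, 42],   # true non-terrace: 3 false alarms, 42 correct
]
accuracy = overall_accuracy(example_confusion)
print(f"{accuracy:.2%}")
```

Overall accuracy can look high when one class dominates, which is one reason the class imbalance discussed in the abstract matters when comparing classifiers.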


Author(s):  
Irfan Ullah Khan ◽  
Nida Aslam ◽  
Malak Aljabri ◽  
Sumayh S. Aljameel ◽  
Mariam Moataz Aly Kamaleldin ◽  
...  

The COVID-19 outbreak is currently one of the biggest challenges facing countries around the world. Millions of people have lost their lives due to COVID-19. Therefore, the accurate early detection and identification of severe COVID-19 cases can reduce the mortality rate and the likelihood of further complications. Machine Learning (ML) and Deep Learning (DL) models have been shown to be effective in the detection and diagnosis of several diseases, including COVID-19. This study used ML algorithms, namely Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and K-Nearest Neighbor (KNN), together with a DL model (containing six layers with ReLU activation and an output layer with sigmoid activation), to predict the mortality rate in COVID-19 cases. Models were trained using data on confirmed COVID-19 patients from 146 countries. Comparative analysis was performed among the ML and DL models using a reduced feature set. The best results were achieved using the proposed DL model, with an accuracy of 0.97. Experimental results reveal the significance of the proposed model over the baseline study in the literature with the reduced feature set.


Author(s):  
Eun Hak Lee ◽  
Kyoungtae Kim ◽  
Seung-Young Kho ◽  
Dong-Kyu Kim ◽  
Shin-Hyung Cho

As the share of public transport increases, the express strategy of the urban railway is regarded as one of the solutions that allow the public transportation system to operate efficiently. It is crucial that the urban railway’s express strategy balance the passenger load between the two types of trains, that is, local and express trains. This research aims to estimate passengers’ preference between local and express trains based on a machine learning technique. Extreme gradient boosting (XGBoost) is trained to model express train preference using smart card and train log data. The passengers are categorized into four types according to their preference for the local and express trains. The smart card data and train log data of Metro Line 9 in Seoul are combined to generate the individual trip chain alternatives for each passenger. With the dataset, the train preference is estimated by XGBoost, and Shapley additive explanations (SHAP) is used to interpret and analyze the importance of individual features. The overall F1 score of the model is estimated to be 0.982. The results of feature analysis show that the total travel time of the local train substantially affects the probability of express train preference, with a SHAP value of 1.871. The probability of express train preference increases with longer total travel time, shorter in-vehicle time, shorter waiting time, and fewer transfers on the passenger’s route. The model shows notable accuracy and provides an understanding of the estimation results.
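The SHAP values mentioned above attribute a prediction to individual features using Shapley values. The sketch below computes exact Shapley values for a toy scoring function by averaging marginal contributions over every feature ordering. It is an illustration only: the function `toy_preference_score`, its coefficients, the feature names, and the baseline are hypothetical, and the study itself applies SHAP to a trained XGBoost model rather than this brute-force enumeration.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)          # start from the reference input
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # switch one feature to its actual value
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_preference_score(f):
    """Hypothetical score for preferring the express train (not the paper's model)."""
    return (0.08 * f["local_total_time"]
            - 0.05 * f["express_wait_time"]
            - 0.4 * f["transfers"]
            + 0.01 * f["local_total_time"] * f["transfers"])


trip = {"local_total_time": 40.0, "express_wait_time": 4.0, "transfers": 1.0}
reference = {"local_total_time": 30.0, "express_wait_time": 6.0, "transfers": 2.0}

contributions = shapley_values(toy_preference_score, trip, reference)
for name, value in contributions.items():
    print(name, round(value, 4))
```

The contributions sum to the difference between the score for the trip and the score for the reference input, which is the additivity property that gives SHAP its name. Enumerating all orderings grows factorially with the number of features, so this approach only suits very small examples.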


2020 ◽  
Vol 34 (01) ◽  
pp. 808-816
Author(s):  
Zhicheng Liu ◽  
Fabio Miranda ◽  
Weiting Xiong ◽  
Junyan Yang ◽  
Qiao Wang ◽  
...  

Predicting commuting flows based on infrastructure and land-use information is critical for urban planning and public policy development. However, it is a challenging task given the complex patterns of commuting flows. Conventional models, such as the gravity model, are mainly derived from physics principles and have limited predictive power in real-world scenarios where many factors need to be considered. Meanwhile, most existing machine learning-based methods ignore spatial correlations and fail to model the influence of nearby regions. To address these issues, we propose the Geo-contextual Multitask Embedding Learner (GMEL), a model that captures spatial correlations from geographic contextual information for commuting flow prediction. Specifically, we first construct a geo-adjacency network containing the geographic contextual information. Then, an attention mechanism based on the graph attention network (GAT) framework is proposed to capture the spatial correlations and encode geographic contextual information into an embedding space. Two separate GATs are used to model supply and demand characteristics. To enhance the effectiveness of the embedding representation, a multitask learning framework is used to introduce stronger restrictions, forcing the embeddings to encapsulate an effective representation for flow prediction. Finally, a gradient boosting machine is trained on the learned embeddings to predict commuting flows. We evaluate our model using a real-world dataset from New York City, and the experimental results demonstrate the effectiveness of our proposed method against the state of the art.
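For context on the conventional baseline named above, one common form of the gravity model estimates the flow between two regions as proportional to the product of their sizes and inversely proportional to a power of the distance between them. The sketch below assumes that form; the function name `gravity_flow`, the constant `k`, the exponent `beta`, and the sample inputs are hypothetical and do not come from the paper.

```python
def gravity_flow(origin_mass, dest_mass, distance, k=1.0, beta=2.0):
    """Gravity-model flow estimate: k * origin_mass * dest_mass / distance ** beta.

    origin_mass and dest_mass stand for region sizes (for example residents
    and jobs); k and beta are parameters that would normally be calibrated
    against observed flows.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * origin_mass * dest_mass / distance ** beta


# Hypothetical regions: 1000 residents, 500 jobs, 10 distance units apart.
estimate = gravity_flow(1000, 500, 10)
print(estimate)
```

Because the estimate depends only on two sizes and one distance, it cannot reflect land use or the influence of neighboring regions, which is the limitation the abstract raises.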


Vaccines ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. 709
Author(s):  
Ivan Dimitrov ◽  
Nevena Zaharieva ◽  
Irini Doytchinova

The identification of protective immunogens is the most important and vigorous initial step in the long-lasting and expensive process of vaccine design and development. Machine learning (ML) methods are very effective in data mining and in the analysis of big data such as microbial proteomes. They are able to significantly reduce the experimental work for discovering novel vaccine candidates. Here, we applied six supervised ML methods (partial least squares-based discriminant analysis, k nearest neighbor (kNN), random forest (RF), support vector machine (SVM), random subspace method (RSM), and extreme gradient boosting (xgboost)) to a set of 317 known bacterial immunogens and 317 bacterial non-immunogens and derived models for immunogenicity prediction. The models were validated by internal cross-validation in 10 groups from the training set and by the external test set. All of them showed good predictive ability, but the xgboost model displayed the most prominent ability to identify immunogens, recognizing 84% of the known immunogens in the test set. The combined RSM-kNN model was the best in the recognition of non-immunogens, identifying 92% of them in the test set. The three best performing ML models (xgboost, RSM-kNN, and RF) were implemented in the new version of the server VaxiJen, and the prediction of bacterial immunogens is now based on majority voting.
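The majority voting mentioned above can be sketched as follows. The per-model outputs are hypothetical and the function `majority_vote` is an illustration, not the VaxiJen server's code.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label that most models predicted for one sample."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of three models for three proteins
# (1 = immunogen, 0 = non-immunogen).
model_outputs = {
    "xgboost": [1, 0, 1],
    "rsm_knn": [1, 0, 0],
    "rf":      [0, 0, 1],
}

# zip(*...) regroups the per-model lists into one tuple of votes per protein.
consensus = [majority_vote(votes) for votes in zip(*model_outputs.values())]
print(consensus)
```

With three binary classifiers a tie cannot occur, so each protein always receives a well-defined consensus label.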


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 42946-42955 ◽  
Author(s):  
Jianyuan Guo ◽  
Zhen Xie ◽  
Yong Qin ◽  
Limin Jia ◽  
Yaguan Wang

Symmetry ◽  
2018 ◽  
Vol 10 (9) ◽  
pp. 369 ◽  
Author(s):  
Huawei Zhai ◽  
Licheng Cui ◽  
Yu Nie ◽  
Xiaowei Xu ◽  
Weishi Zhang

To meet real-time public travel demand, bus operators need to adjust timetables promptly. It is therefore necessary to predict variations in short-term passenger flow. With the help of advanced public transportation systems, a large amount of real-time passenger flow data is collected from automatic passenger counters, automatic fare collection systems, and similar sources. Using these data, various methods have been proposed to predict future variations in short-term bus passenger flow. Based on their properties and background knowledge, these methods are classified into three categories: linear, nonlinear, and combined methods. Their performance is evaluated in detail in terms of prediction accuracy, the complexity of the training data structure, and the modeling process. For comparison, some long-term prediction methods are also briefly analyzed. Finally, the paper points out that, with the help of automatic technology, a large amount of passenger flow data will be collected, and that using big data technology to speed up data preprocessing and modeling may be a direction worth studying in the future.

