Enriching Activity-Based Models using Smartphone-Based Travel Surveys

Smartphone-based travel surveys have attracted much attention recently for their potential to improve data quality and response rates. One of the first such survey systems, Future Mobility Sensing (FMS), leverages smartphone sensors and machine learning techniques to collect detailed personal travel data. The main purpose of this research is to compare data collected by FMS with data collected by traditional methods, and to study the implications of using FMS data for travel behavior modeling. Since its initial field test in Singapore, FMS has been used in several large-scale household travel surveys, including one in Tel Aviv, Israel. We present comparative analyses that make use of the rich datasets from Singapore and Tel Aviv, focusing on three main aspects: (1) richness of the activity behaviors observed, (2) completeness of travel and activity data, and (3) data accuracy. Results show that FMS has clear advantages over traditional travel surveys: it offers higher resolution and better accuracy for times, locations, and paths; it represents out-of-work and leisure activities well; and it reveals large variability in day-to-day activity patterns, which is inadequately captured by the one-day snapshot typical of traditional surveys. FMS also captures travel and activities that tend to be under-reported in traditional surveys, such as multiple stops in a tour and work-based sub-tours. These richer, more complete, and more accurate data can improve future activity-based modeling.

Faculty Opinions recommendation of Large-scale physical activity data reveal worldwide activity inequality.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.727795643.793534116 ◽

2017 ◽

Author(s):

Mark A Febbraio

Keyword(s):

Physical Activity ◽

Large Scale ◽

Activity Data ◽

Physical Activity Data

The Characteristics of Leisure Activities and the Built Environment Influences in Large-scale Social Housing Communities in China: The Case Study of Shanghai and Nanjing

Journal of Asian Architecture and Building Engineering ◽

10.1080/13467581.2021.1906257 ◽

2021 ◽

Author(s):

Lingling Zhang ◽

Zihao Wu

Keyword(s):

Built Environment ◽

Social Housing ◽

Large Scale ◽

Leisure Activities

Using Bus Ticketing Big Data to Investigate the Behaviors of the Population Flow of Chinese Suburban Residents in the Post-COVID-19 Phase

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18116066 ◽

2021 ◽

Vol 18 (11) ◽

pp. 6066

Author(s):

Yanbing Bai ◽

Lu Sun ◽

Haoyu Liu ◽

Chao Xie

Keyword(s):

Travel Behavior ◽

Large Scale ◽

Behavior Patterns ◽

Population Movements ◽

Economic Level ◽

Passenger Travel ◽

Scale Population ◽

Travel Behaviors ◽

Railway Passenger ◽

And Control

Large-scale population movements can turn local diseases into widespread epidemics. Understanding the characteristics of population flow in the context of COVID-19 is of great significance for informing epidemiology and formulating scientific and reasonable prevention and control policies. In the post-COVID-19 phase especially, it is essential to sustain the gains made in the fight against the epidemic. Previous research has focused on the travel behavior and patterns of flight and railway passengers, but China also has numerous suburban residents with a relatively low economic level, and investigating their travel behaviors is significant for national stability. However, estimating the impact of COVID-19 on suburban residents’ travel behaviors remains challenging because appropriate data are lacking. Here we present bus ticketing data comprising approximately 26,000,000 records from April 2020 to August 2020 for 2705 stations. Our results indicate that suburban residents in southern regions of China are more likely to travel by bus and travel more frequently. With respect to economic level, we find that residents in economically developed regions are more likely to travel or carry out various social activities. From the perspective of who travels, we find that men and young people are more likely to travel by bus; they are also the main workforce. Our findings indicate that suburban residents’ travel behavior is profoundly affected by the economy and is consistent with the inherent behavior patterns observed before the COVID-19 outbreak. Verification using typical regions confirms this.

Exploiting Social Networks for Large-Scale Human Behavior Modeling

IEEE Pervasive Computing ◽

10.1109/mprv.2011.70 ◽

2011 ◽

Vol 10 (4) ◽

pp. 45-53 ◽

Cited By ~ 29

Author(s):

Nicholas D. Lane ◽

Ye Xu ◽

Hong Lu ◽

Andrew T. Campbell ◽

Tanzeem Choudhury ◽

...

Keyword(s):

Social Networks ◽

Human Behavior ◽

Large Scale ◽

Behavior Modeling ◽

Human Behavior Modeling

Statistical Characteristics and Correlation Analysis for Control Index of Soil-Aggregate Mixture

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.718-720.1872 ◽

2013 ◽

Vol 718-720 ◽

pp. 1872-1877 ◽

Cited By ~ 1

Author(s):

Xu Xi Chang ◽

Xie Jian Ming ◽

Jiang Ling Fa ◽

Chen Shan Xiong

Keyword(s):

Grain Size ◽

Correlation Analysis ◽

Normal Distribution ◽

Large Scale ◽

Soil Aggregate ◽

Site Preparation ◽

Statistical Characteristics ◽

Coarse Grain ◽

Systematic Research ◽

Large Variability

Soil-aggregate mixtures are now widely used in large-scale site preparation projects, and their compaction characteristics have received increasing attention from engineers and researchers. However, systematic research on how to choose the filler is insufficient. Moreover, industry regulations differ in their requirements for filler. Drawing on a large site preparation project, this paper discusses the statistical characteristics and correlations of the maximal grain size, coarse grain content, gradation, and other parameters of the soil-aggregate mixture. The results show that the maximal and median grain sizes have small discreteness and follow a normal distribution, indicating that the site filler easily meets the requirements. The coefficient of curvature, coefficient of nonuniformity, and coarse grain content have large discreteness and do not follow a normal distribution, indicating that the filler has large variability. The median grain size is highly correlated with the coarse grain content; the maximal grain size is not correlated with the coefficient of nonuniformity, the coefficient of curvature, or the coarse grain content. Based on the correlation analysis, we suggest that the control parameters of filler be ranked in importance as coarse grain content, maximal grain size, and gradation. This research may be of value to other similar projects.

Different firm responses to the COVID-19 pandemic shocks: machine-learning evidence on the Vietnamese labor market

International Journal of Emerging Markets ◽

10.1108/ijoem-02-2021-0292 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Lam Hoang Viet Le ◽

Toan Luu Duc Huynh ◽

Bryan S. Weber ◽

Bao Khac Quoc Nguyen

Keyword(s):

Machine Learning ◽

Labor Market ◽

Large Scale ◽

Government Support ◽

Policy Implications ◽

Machine Learning Techniques ◽

Firm Characteristics ◽

Data Set ◽

Content Type ◽

Firm Responses

Purpose: This paper aims to identify the disproportionate impacts of the COVID-19 pandemic on labor markets. Design/methodology/approach: The authors conduct a large-scale survey of 16,000 firms from 82 industries in Ho Chi Minh City, Vietnam, and analyze the data set using different machine-learning methods. Findings: First, job loss and reduction in state-owned enterprises have been significantly larger than in other types of organizations. Second, employees of foreign direct investment enterprises suffer significantly lower labor income than those of other groups. Third, the adverse effects of the COVID-19 pandemic on the labor market are heterogeneous across industries and geographies. Finally, firms with high revenue in 2019 are more likely to adopt preventive measures, including the reduction of labor forces. The authors also find a significant correlation between firms' revenue and labor reduction, as both traditional econometrics and machine-learning techniques suggest. Originality/value: This study has two main policy implications. First, although government support through taxes has been provided, the authors highlight evidence that there may be some additional benefit from targeting firms that have characteristics associated with layoffs or other negative labor responses. Second, the authors provide information that shows which firm characteristics are associated with particular labor market responses such as layoffs, which may help target stimulus packages. Although the COVID-19 pandemic affects most industries and occupations, heterogeneous firm responses suggest that there could be several varieties of targeted policies, targeting either firms that are likely to reduce labor forces or firms likely to face reduced revenue. In this paper, the authors outline several industries and firm characteristics that appear to be more directly reducing employee counts or having negative labor responses, which may lead to more cost-effective stimulus.

Artificial Intelligence-based Approach For Electric Vehicle Travel Behavior Modeling

Electric Vehicles in Energy Systems ◽

10.1007/978-3-030-34448-1_2 ◽

2020 ◽

pp. 21-46

Author(s):

Hamidreza Jahangir ◽

Masoud Aliakbar Golkar ◽

Ali Ahmadian ◽

Ali Elkamel

Keyword(s):

Artificial Intelligence ◽

Electric Vehicle ◽

Travel Behavior ◽

Behavior Modeling ◽

Travel Behavior Modeling

Machine Learning-Based Prediction Model for Papillary Thyroid Carcinoma Recurrence

10.21203/rs.3.rs-113105/v1 ◽

2020 ◽

Author(s):

Young Min Park ◽

Byung-Joo Lee

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Tumor Size ◽

Large Scale ◽

Prediction Models ◽

Prognostic Significance ◽

Disease Recurrence ◽

Machine Learning Techniques ◽

Papillary Thyroid ◽

Recurrence Prediction

Background: This study analyzed the prognostic significance of nodal factors, including the number of metastatic LNs and LNR, in patients with PTC, and attempted to construct a disease recurrence prediction model using machine learning techniques. Methods: We retrospectively analyzed clinico-pathologic data from 1040 patients diagnosed with papillary thyroid cancer between 2003 and 2009. Results: We analyzed clinico-pathologic factors related to recurrence through logistic regression analysis. Among the factors that we included, only sex and tumor size were significantly correlated with disease recurrence. Parameters such as age, sex, tumor size, tumor multiplicity, ETE, ENE, pT, pN, ipsilateral central LN metastasis, contralateral central LN metastasis, number of metastatic LNs, and LNR were used as inputs for construction of a machine learning prediction model. The performance of five machine learning models for recurrence prediction was compared based on accuracy. The Decision Tree model showed the best accuracy at 95%, and the lightGBM and stacking model together showed 93% accuracy. Conclusions: We confirmed that all machine learning prediction models showed an accuracy of 90% or more for predicting disease recurrence in PTC. Large-scale multicenter clinical studies should be performed to improve the performance of our prediction models and verify their clinical effectiveness.

Blue-collared Workers’ Travel Behavior Modeling using “exPlainable” Machine Learning Model: The Case of Qatar

10.29117/quarfe.2021.0198 ◽

2021 ◽

Author(s):

Aya Alkhereibi ◽

Ali AbuZaid ◽

Tadesse Wakjira

Keyword(s):

Machine Learning ◽

Travel Behavior ◽

Total Population ◽

Performance Metrics ◽

Predictive Accuracy ◽

Mode Choice ◽

Behavior Modeling ◽

Significant Feature ◽

Machine Learning Model ◽

Occupation Level

This paper presents a novel study examining an explainable machine learning (ML) technique to predict mode choice for communities with a majority of blue-collared workers. A total of 4875 trip records for 1050 blue-collared workers have been used to predict their travel mode choices based on 11 trip and socio-economic attributes. The data used in this paper were obtained from the Ministry of Transportation and Communication (MoTC), which targeted blue-collared workers as they represent 89% of the total population in the State of Qatar. Four ML models are evaluated using different performance metrics to propose the best predictive model. The prediction results showed that the random forest (RF) model performed best, with a predictive accuracy of 0.97. Moreover, the SHapley Additive exPlanation (SHAP) approach is used to investigate the significance of the input features and explain the output of the RF model. The results of the SHAP analysis revealed that occupation level is the most significant feature influencing mode choice, followed by occupation section, arrival time, and arrival municipality.

Deep Imputation on Large-Scale Drug Discovery Data

10.22541/au.161111205.55340339/v2 ◽

2021 ◽

Author(s):

Benedict Irwin ◽

Thomas Whitehead ◽

Scott Rowland ◽

Samar Mahmoud ◽

Gareth Conduit ◽

...

Keyword(s):

Deep Learning ◽

Drug Discovery ◽

High Throughput Screening ◽

Large Scale ◽

Early Stage ◽

Quantitative Structure Activity Relationship ◽

Biological Properties ◽

Data Repository ◽

Activity Data ◽

Predicted Values

More accurate predictions of the biological properties of chemical compounds would guide the selection and design of new compounds in drug discovery and help to address the enormous cost and low success rate of pharmaceutical R&D. However, this domain presents a significant challenge for AI methods due to the sparsity of compound data and the noise inherent in results from biological experiments. In this paper, we demonstrate how data imputation using deep learning provides substantial improvements over quantitative structure-activity relationship (QSAR) machine learning models that are widely applied in drug discovery. We present the largest-to-date successful application of deep-learning imputation to datasets that are comparable in size to the corporate data repository of a pharmaceutical company (678,994 compounds by 1166 endpoints). We demonstrate this improvement for three areas of practical application linked to distinct use cases: i) target activity data compiled from a range of drug discovery projects, ii) a high-value and heterogeneous dataset covering complex absorption, distribution, metabolism and elimination properties, and iii) high-throughput screening data, testing the algorithm’s limits on early-stage noisy and very sparse data. Achieving median coefficients of determination, R², of 0.69, 0.36 and 0.43 respectively across these applications, the deep learning imputation method offers an unambiguous improvement over random forest QSAR methods, which achieve median R² values of 0.28, 0.19 and 0.23 respectively. We also demonstrate that robust estimates of the uncertainties in the predicted values correlate strongly with the accuracies in prediction, enabling greater confidence in decision-making based on the imputed values.
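
The abstract above summarizes model quality as a median coefficient of determination across endpoints. As a reading aid, the following is a minimal Python sketch of how such a figure could be computed, assuming the standard definition (one minus the ratio of residual to total sum of squares). The function names `r_squared` and `median_r_squared` and the input layout are hypothetical and are not taken from the paper, whose exact evaluation procedure is not described in this listing.

```python
from statistics import median


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviation of observations from their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("the score is undefined when all observed values are identical")
    return 1 - ss_res / ss_tot


def median_r_squared(endpoints):
    """Median of per-endpoint scores.

    `endpoints` is an iterable of (observed, predicted) pairs, one per endpoint
    (an assumed layout chosen for illustration).
    """
    return median(r_squared(obs, pred) for obs, pred in endpoints)
```

Taking the median rather than the mean keeps a few very poorly predicted endpoints from dominating the summary, which is one plausible reason to report it for sparse, noisy data.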
