Recovering the Association Between Unlinked Fare Machines and Stations Using Automated Fare Collection Data in Metro Systems

Data quality is the foundation of data-driven applications in transportation. Data problems such as missing and invalid data can sharply reduce the performance of the methods used in these applications. Although many studies address data quality issues, they focus only on missing or invalid data caused by infrastructure failures (e.g., loop detector malfunction). In general, data quality issues arising from insufficient data management have received little attention. This paper proposes a tensor-decomposition-based framework to tackle a specific missing data problem that occurs when the machine-station dictionary of an automated fare collection system database is incomplete. In such cases, a large amount of origin/destination information is lost because the affected fare machines are not linked to any station, and all transactions associated with those machines may lack origin/destination information. The proposed framework recovers the dictionary by capturing features of the passenger flow passing through each unlinked fare machine. Evaluation results show that the proposed approach can recover the missing data with high accuracy even when several fare machines are not linked to a station. The framework could also support other beneficial applications.


Missing Data Problem in the Monitoring System: A Review

IEEE Sensors Journal ◽

10.1109/jsen.2020.3009265 ◽

2020 ◽

Vol 20 (23) ◽

pp. 13984-13998

Author(s):

Jinghan Du ◽

Minghua Hu ◽

Weining Zhang

Keyword(s):

Missing Data ◽

Monitoring System ◽

System A ◽

Missing Data Problem ◽

Data Problem


A novel classifier modification approach to missing data problem for noisy speech recognition

7'th International Symposium on Telecommunications (IST'2014) ◽

10.1109/istel.2014.7000747 ◽

2014 ◽

Cited By ~ 1

Author(s):

Kian Ebrahim Kafoori ◽

Seyed Mohammad Ahadi

Keyword(s):

Speech Recognition ◽

Missing Data ◽

Noisy Speech ◽

Missing Data Problem ◽

Data Problem ◽

Noisy Speech Recognition


Nachwort: Die Dunkle Seite ist ein „Missing Data“-Problem

Bad Science ◽

10.15358/9783800660292-135 ◽

2020 ◽

pp. 135-138

Author(s):

Florian Meinfelder ◽

Rebekka Kluge

Keyword(s):

Missing Data ◽

Missing Data Problem ◽

Data Problem


PRM223 - THE MISSING DATA PROBLEM: USING PROPENSITY SCORES TO ESTIMATE NON-RANDOMISED TREATMENT EFFECTS WITH MISSING COVARIATE DATA

Value in Health ◽

10.1016/j.jval.2018.09.2341 ◽

2018 ◽

Vol 21 ◽

pp. S394

Author(s):

L Rasouliyan ◽

E Plana ◽

D Martínez ◽

J Aguado ◽

R Ziemiecki

Keyword(s):

Missing Data ◽

Treatment Effects ◽

Propensity Scores ◽

Covariate Data ◽

Missing Covariate Data ◽

Missing Data Problem ◽

Data Problem


Assessing the missing data problem in criminal network analysis using forensic DNA data

Social Networks ◽

10.1016/j.socnet.2019.09.003 ◽

2020 ◽

Vol 61 ◽

pp. 99-106

Author(s):

Sabine De Moor ◽

Christophe Vandeviver ◽

Tom Vander Beken

Keyword(s):

Missing Data ◽

Network Analysis ◽

Forensic DNA ◽

Criminal Network ◽

Missing Data Problem ◽

Data Problem ◽

Criminal Network Analysis


Economic Aspects of the Missing Data Problem – the Case of the Patient Registry

Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis ◽

10.11118/actaun201765051779 ◽

2017 ◽

Vol 65 (5) ◽

pp. 1779-1791

Author(s):

Hatice Uenal ◽

David Hampel

Keyword(s):

Missing Data ◽

Data Quality ◽

Quality Analysis ◽

Quality Of Data ◽

Quality Costs ◽

Missing Data Problem ◽

Study Results ◽

The Cost ◽

Cost Factors

Registries are indispensable in medical studies and provide the basis for reliable study results for research questions. Depending on the purpose of use, high data quality is a prerequisite. However, as registry quality increases, costs increase accordingly. Considering these time and cost factors, this work attempts to estimate the cost advantages of applying statistical tools to existing registry data, including quality evaluation. The quality analysis showed unquestionable savings of millions in study costs from reducing the time horizon, with an average saving of €523,126 for every year by which it is reduced. Additionally, replacing the more than 25% of data missing in some variables immensely improved data quality. To conclude, our findings clearly showed the importance of data quality and statistical input in avoiding biased conclusions due to incomplete data.
