2017 ◽ Vol 20 (4) ◽ pp. 3617-3628
Author(s): Aziz Nasridinov ◽ Jong-Hyeok Choi ◽ Young-Ho Park

Author(s): Minh Pham ◽ Craig A. Knoblock ◽ Muhao Chen ◽ Binh Vu ◽ Jay Pujara

Error detection is one of the most important steps in data cleaning and usually requires extensive human interaction to ensure quality. Existing supervised methods for error detection require a significant amount of training data, while unsupervised methods rely on fixed inductive biases that are usually hard to generalize. In this paper, we present SPADE, a novel semi-supervised probabilistic approach for error detection. SPADE introduces a probabilistic active learning model in which the system suggests examples to be labeled based on the agreement between user labels and indicative signals, which are designed to capture potential errors. SPADE uses a two-phase data augmentation process to enrich a dataset before training a deep learning classifier to detect unlabeled errors. In our evaluation, SPADE achieves an average F1-score of 0.91 over five datasets and yields a 10% improvement compared with state-of-the-art systems.


Author(s): Yu-Bin Yang ◽ Hui Lin

This chapter presents an automatic meteorological data mining system that analyzes and mines heterogeneous remotely sensed image datasets, making it possible to forecast potential rainstorms in advance. A two-phase data mining method employing machine learning techniques, including the C4.5 decision tree algorithm and dependency network analysis, is proposed. It generates a group of derivation rules and a conceptual model of meteorological environment factors to assist the automatic weather forecasting task. Experimental results show that the system reduces the heavy workload of manual weather forecasting and provides meaningful interpretations of the forecast results.
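
The abstract names C4.5 but does not describe how it is applied. As a rough illustration only, the sketch below computes the gain ratio, the split criterion C4.5 is generally known for, on a categorical attribute. The function names and the toy `humidity`/`rainstorm` data are hypothetical and are not taken from the chapter.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` on the categorical attribute `values`."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data (assumption, not from the chapter).
humidity = ["high", "high", "low", "low"]
rainstorm = ["yes", "yes", "no", "no"]
print(gain_ratio(humidity, rainstorm))
```

A tree learner in this style would evaluate the ratio for each candidate attribute and split on the highest-scoring one; an attribute that takes a single value carries no split information and scores zero here.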


Author(s): Balázs Rácz ◽ Csaba István Sidló ◽ András Lukács ◽ András A. Benczúr
