Contextual Data Cleaning

Author(s):  
Morteza Alipour-Langouri ◽  
Zheng Zheng ◽  
Fei Chiang ◽  
Lukasz Golab ◽  
Jaroslaw Szlichta
2010 ◽  
Vol 21 (4) ◽  
pp. 632-643 ◽  
Author(s):  
Yu GU ◽  
Ge YU ◽  
Xiao-Long HU ◽  
Yi WANG

2019 ◽  
Vol 15 (3) ◽  
pp. 79-100 ◽  
Author(s):  
Watanee Jearanaiwongkul ◽  
Frederic Andres ◽  
Chutiporn Anutariya

Nowadays, farmers can search for treatments for their plants using search engines and applications. Most existing systems are rule-based question-answering platforms. However, a farmer may report an observation incorrectly. This work argues that diseases and treatments should be inferred from a set of related observations. Thus, we develop a theoretical framework for systems that manage farmers' observation data. We investigate and formalize desirable characteristics of such systems. Observation data is tagged with a geolocation at which related contextual data can be found. The framework is formalized algebraically, identifying the required types and functions. Its key components are: (1) a defined type called warncons for representing observation data; (2) a similarity function for warncons; and (3) a warncons composition function for composing similar warncons. Finally, we show that the framework enriches observation data and improves advice finding.
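The three components named in this abstract (a warncons type, a similarity function, and a composition function) can be illustrated with a rough sketch. Everything concrete here is an assumption made for illustration only, since the abstract does not define these constructs: the fields of `Warncons`, the Jaccard-style `similarity`, the `max_distance` cutoff, and the midpoint rule in `compose` are all hypothetical and should not be read as the authors' formalization.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Warncons:
    """Hypothetical stand-in for the abstract's 'warncons' type."""
    location: tuple[float, float]  # assumed (latitude, longitude) pair
    observations: frozenset[str]   # assumed set of reported symptoms


def similarity(a: Warncons, b: Warncons, max_distance: float = 0.5) -> float:
    """Assumed similarity: Jaccard overlap of observations, 0.0 if too far apart."""
    distance = hypot(a.location[0] - b.location[0],
                     a.location[1] - b.location[1])
    if distance > max_distance:
        return 0.0
    union = a.observations | b.observations
    if not union:
        return 0.0
    return len(a.observations & b.observations) / len(union)


def compose(a: Warncons, b: Warncons) -> Warncons:
    """Assumed composition: union of observations, placed at the midpoint."""
    midpoint = ((a.location[0] + b.location[0]) / 2,
                (a.location[1] + b.location[1]) / 2)
    return Warncons(midpoint, a.observations | b.observations)


# Two nearby reports that share one symptom (made-up example data).
first = Warncons((13.0, 100.0), frozenset({"yellow leaves", "brown spots"}))
second = Warncons((13.2, 100.0), frozenset({"brown spots", "wilting"}))
merged = compose(first, second)
```

In this sketch, composing two similar reports yields a single record carrying the union of their observations, which is one plausible reading of how composition could make observation data "richer".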


2021 ◽  
Vol 25 (4) ◽  
pp. 763-787
Author(s):  
Alladoumbaye Ngueilbaye ◽  
Hongzhi Wang ◽  
Daouda Ahmat Mahamat ◽  
Ibrahim A. Elgendy ◽  
Sahalu B. Junaidu

Knowledge extraction, data mining, e-learning, and web application platforms use heterogeneous and distributed data. The proliferation of these multifaceted platforms faces many challenges, such as high scalability, the coexistence of complex similarity metrics, and the requirement of data quality evaluation. In this study, an extended, complete formal taxonomy and algorithms for detecting and correcting contextual data quality anomalies were developed and implemented on structured data. Our methods detected and corrected more data anomalies than existing taxonomy techniques and also highlighted a drawback of the Support Vector Machine (SVM). The proposed techniques are therefore relevant to detecting and correcting errors in large contextual data (big data).


2021 ◽  
Vol 10 (2) ◽  
pp. 50
Author(s):  
Naomi Biegel ◽  
Karel Neels ◽  
Layla Van den Berg

Grandparents constitute an important source of childcare for many parents. Focusing on the Belgian context, this paper improves our understanding of childcare decision-making by investigating how the availability of formal childcare and of grandparents affects childcare arrangements. Using multinomial regression models, we simultaneously model parents' uptake of formal and informal childcare. Combining linked microdata from the Belgian censuses with contextual data on childcare at the municipal level, we consider formal childcare availability at a local level while including a wide array of characteristics that may affect grandparental availability. Results indicate that increasing formal care crowds out informal care as the sole care arrangement, whereas combined use of formal and informal care becomes more prevalent. Characteristics indicating a lack of grandmaternal availability increase uptake of formal care and, to a lesser extent, inhibit the uptake of combined formal and informal care. While increasing formal care substitutes for informal care use, the lack of availability of informal care by grandparents may be problematic, particularly for those families most prone to use informal care.

