HADCLEAN: A hybrid approach to data cleaning in data warehouses

Author(s):  
Arindam Paul ◽  
Varuni Ganesan ◽  
Jagat Sesh Challa ◽  
Yashvardhan Sharma
Author(s):  
A. Anny Leema ◽  
M. Hemalatha

Radio Frequency Identification (RFID) refers to wireless technology that uses radio waves to automatically identify items within a certain proximity. It is being widely used in various applications, but there is reluctance in the deployment of RFID due to the high cost involved and the challenging problems found in the observed colossal RFID data. The obtained data is of low quality and contains anomalies like false positives, false negatives, and duplication. To enhance the quality of data, cleaning is the essential task, so that the resultant data can be applied for high-end applications. This chapter investigates the existing physical, middleware, and deferred approaches to deal with the anomalies found in the RFID data. A novel hybrid approach is developed to solve data quality issues so that the demand for RFID data will certainly grow to meet the user needs.


2019 ◽  
Vol 120 ◽  
pp. 60-82
Author(s):  
Zoubir Ouaret ◽  
Doulkifli Boukraa ◽  
Omar Boussaid ◽  
Rachid Chalal

2008 ◽  
pp. 530-555
Author(s):  
Laura Irina Rusu ◽  
J. Wenny Rahayu ◽  
David Taniar

Developing a data warehouse for XML documents involves two major processes: one of creating it, by processing XML raw documents into a specified data warehouse repository; and the other of querying it, by applying techniques to better answer users’ queries. This paper focuses on the first part; that is identifying a systematic approach for building a data warehouse of XML documents, specifically for transferring data from an underlying XML database into a defined XML data warehouse. The proposed methodology on building XML data warehouses covers processes including data cleaning and integration, summarization, intermediate XML documents, and updating/linking existing documents and creating fact tables. In this paper, we also present a case study on how to put this methodology into practice. We utilise the XQuery technology in all of the above processes.


Author(s):  
William E. Winkler

Fayyad and Uthurusamy (2002) have stated that the majority of the work (representing months or years) in creating a data warehouse is in cleaning up duplicates and resolving other anomalies. This article provides an overview of two methods for improving quality. The first is data cleaning for finding duplicates within files or across files. The second is edit/imputation for maintaining business rules and for filling in missing data. The fastest data-cleaning methods are suitable for files with hundreds of millions of records (Winkler, 1999b, 2003b). The fastest edit/imputation methods are suitable for files with millions of records (Winkler, 1999a, 2004b).


VASA ◽  
2016 ◽  
Vol 45 (5) ◽  
pp. 417-422 ◽  
Author(s):  
Anouk Grandjean ◽  
Katia Iglesias ◽  
Céline Dubuis ◽  
Sébastien Déglise ◽  
Jean-Marc Corpataux ◽  
...  

Abstract. Background: Multilevel peripheral arterial disease is frequently observed in patients with intermittent claudication or critical limb ischemia. This report evaluates the efficacy of one-stage hybrid revascularization in patients with multilevel peripheral arterial disease. Patients and methods: A retrospective analysis of a prospective database included all consecutive patients treated by a hybrid approach for multilevel peripheral arterial disease. The primary outcome was the patency rate at 6 months and 1 year. Secondary outcomes were early and midterm complication rates, limb salvage and mortality rate. Statistical analyses, including a Kaplan-Meier estimate and univariate and multivariate Cox regression analyses, were carried out on primary, primary assisted and secondary patency, comparing the impact of various risk factors in pre- and post-operative treatments. Results: 64 patients were included in the study, with a mean follow-up time of 428 days (range: 4–1140). The technical success rate was 100 %. The primary, primary assisted and secondary patency rates at 1 year were 39 %, 66 % and 81 %, respectively. The limb-salvage rate was 94 %. The early mortality rate was 3.1 %. Early and midterm complication rates were 15.4 % and 6.4 %, respectively. Conclusions: The hybrid approach is a major alternative in the treatment of peripheral arterial disease in patients with multilevel disease and comorbidities, with low complication and mortality rates and a high limb-salvage rate.


2011 ◽  
Vol 14 (1) ◽  
pp. 67 ◽  
Author(s):  
Ireneusz Haponiuk ◽  
Maciej Chojnicki ◽  
Radosław Jaworski ◽  
Jacek Juściński ◽  
Mariusz Steffek ◽  
...  

There are several strategies of surgical approach for the repair of multiple muscular ventricular septal defects (mVSDs), but none leads to a fully predictable, satisfactory therapeutic outcome in infants. We followed a concept of treating multiple mVSDs consisting of a hybrid approach based on intraoperative perventricular implantation of occluding devices. In this report, we describe a 2-step procedure consisting of a final hybrid approach for multiple mVSDs in the infant following initial coarctation repair with pulmonary artery banding in the newborn. At 7 months, sternotomy and debanding were performed, the right ventricle was punctured under transesophageal echocardiographic guidance, and the 8-mm device was implanted into the septal defect. Color Doppler echocardiography results showed complete closure of all VSDs by 11 months after surgery, probably via a mechanism of a localized inflammatory response reaction, ventricular septum growth, and implant endothelization.


Controlling ◽  
2003 ◽  
Vol 15 (6) ◽  
pp. 323-330 ◽  
Author(s):  
Jürgen Propach ◽  
Svend Reuse