inconsistent data
Recently Published Documents


Total documents: 114 (five years: 16)
H-index: 14 (five years: 1)

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Samir Al-Janabi ◽  
Ryszard Janicki

Purpose: Data quality is a major challenge in data management. For organizations, the cleanliness of data is a significant problem that affects many business activities. Errors in data occur for different reasons, such as violations of business rules. Because of the huge amount of data, manual cleaning alone is infeasible; methods are required that detect dirty data automatically and then repair and clean it. The purpose of this work is to extend the density-based data cleaning approach using conditional functional dependencies to achieve better data repair. Design/methodology/approach: A set of conditional functional dependencies is introduced as an input to the density-based data cleaning algorithm. The algorithm repairs inconsistent data using this set. Findings: This new approach was evaluated through experiments on real-world as well as synthetic datasets. The repair quality was determined using the F-measure. The results showed that the quality and scalability of the density-based data cleaning approach improved when conditional functional dependencies were introduced. Originality/value: Conditional functional dependencies capture semantic errors among data values. This work demonstrates that the density-based data cleaning approach can be improved in terms of repairing inconsistent data by using conditional functional dependencies.
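As an illustration only, the sketch below shows what checking a single conditional functional dependency and scoring a repair with the F-measure might look like. It is not the authors' density-based algorithm: the function names `cfd_violations` and `f_measure`, the row layout, and the example attributes (`country`, `zip`, `city`) are all hypothetical, and the F-measure is the standard harmonic mean of precision and recall computed over sets of row indices.

```python
from collections import defaultdict


def cfd_violations(rows, condition, lhs, rhs):
    """Return sorted indices of rows violating the CFD (condition: lhs -> rhs).

    Hypothetical helper: among rows matching `condition`, rows that agree on
    all `lhs` attributes must also agree on the `rhs` attribute.
    """
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        # The dependency applies only to rows matching the condition pattern.
        if all(row.get(key) == value for key, value in condition.items()):
            groups[tuple(row[attr] for attr in lhs)].append(i)

    violating = []
    for indices in groups.values():
        # More than one distinct rhs value within a group is a violation.
        if len({rows[i][rhs] for i in indices}) > 1:
            violating.extend(indices)
    return sorted(violating)


def f_measure(flagged, truly_dirty):
    """Harmonic mean of precision and recall over two sets of row indices."""
    true_positives = len(flagged & truly_dirty)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(flagged)
    recall = true_positives / len(truly_dirty)
    return 2 * precision * recall / (precision + recall)


# Assumed toy data: within the UK, a zip code should determine the city.
rows = [
    {"country": "UK", "zip": "EH4", "city": "Edinburgh"},
    {"country": "UK", "zip": "EH4", "city": "London"},
    {"country": "US", "zip": "EH4", "city": "Boston"},
    {"country": "UK", "zip": "W1", "city": "London"},
]
flagged = set(cfd_violations(rows, {"country": "UK"}, ["zip"], "city"))
```

The US row shares the zip value but is outside the condition, so it is not flagged; this is the sense in which a conditional dependency is narrower than a plain functional dependency.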


2021 ◽  
Vol 61 ◽  
pp. e20216117
Author(s):  
Gabriel Biffi ◽  
Marcia Marise Defraia ◽  
Carlos Campaner

The Museu de Zoologia da Universidade de São Paulo (MZUSP) houses an important Megalopodidae collection representing 144 species, especially from Brazil and the Neotropical fauna. The type specimens of some species have never been accessed since their original descriptions, and are thus unknown to most Megalopodidae researchers. Presented here is an illustrated catalogue of the megalopodid type specimens of 32 species deposited at MZUSP, featuring photos of habitus and labels, and complete label data of all the specimens, which originally belonged to Jacintho Guérin’s personal collection. Conflicting and inconsistent data provided in the literature and in the specimens’ labels are addressed. Taxa originally proposed as species varieties are here reaffirmed as valid, with subspecific rank. Agathomerus varians Monrós, 1945 and Plesioagathomerus vittatus Monrós, 1945, originally described as junior synonyms, are considered unavailable names. The subspecific epithet ngriapex is emended to Agathomerus bifasciatus nigrapex Guérin, 1949. An overview of the MZUSP Megalopodidae collection is presented with a history of the arrival of Guérin’s specimens.


2021 ◽  
Vol 192 ◽  
pp. 1265-1273
Author(s):  
José Miguel Blanco ◽  
Mouzhi Ge ◽  
Tomáš Pitner

2021 ◽  
Author(s):  
Fernando Elias Melo Borges ◽  
Danton Diego Ferreira ◽  
Antônio Carlos de Sousa Couto Júnior

The Rural Environmental Registry (CAR) is a mandatory public electronic registry for all rural properties in the Brazilian territory. It integrates environmental information on the properties and assists in monitoring them and in the fight against deforestation. However, a large number of registrations are filed with errors, generating inconsistent data that leads to the registrations being canceled or returned for correction. Automatic verification of these records is important for improving their processing. This paper proposes an automatic classification method that approves or cancels CAR registrations and provides an interpretation of the classifications performed. Four machine learning-based classifiers were tested and their results evaluated. The best-performing model was then interpreted using the Local Interpretable Model-agnostic Explanations (LIME) algorithm. The results showed the potential of the method for future real applications.


Author(s):  
Charlotte Lynch ◽  
Irene Reguilon ◽  
Deanna L Langer ◽  
Damon Lane ◽  
Prithwish De ◽  
...  

Objective: To explore differences in positron emission tomography-computed tomography (PET-CT) service provision internationally to further understand the impact variation may have upon cancer services. To identify areas of further exploration for researchers and policymakers to optimize PET-CT services and improve the quality of cancer services. Design: Comparative analysis using data based on pre-defined PET-CT service metrics from PET-CT stakeholders across seven countries. This was further informed via document analysis of clinical indication guidance and expert consensus through round-table discussions of relevant PET-CT stakeholders. Descriptive comparative analyses were produced on use, capacity and indication guidance for PET-CT services between jurisdictions. Setting: PET-CT services across 21 jurisdictions in seven countries (Australia, Denmark, Canada, Ireland, New Zealand, Norway and the UK). Participants: None. Intervention(s): None. Main Outcome Measure(s): None. Results: PET-CT service provision has grown over the period 2006–2017, but scale of increase in capacity and demand is variable. Clinical indication guidance varied across countries, particularly for small-cell lung cancer staging and the specific acknowledgement of gastric cancer within oesophagogastric cancers. There is limited and inconsistent data capture, coding, accessibility and availability of PET-CT activity across countries studied. Conclusions: Variation in PET-CT scanner quantity, acquisition over time and guidance upon use exists internationally. There is a lack of routinely captured and accessible PET-CT data across the International Cancer Benchmarking Partnership countries due to inconsistent data definitions, data linkage issues, uncertain coverage of data and lack of specific coding. This is a barrier in improving the quality of PET-CT services globally. There needs to be greater, richer data capture of diagnostic and staging tools to facilitate learning of best practice and optimize cancer services.


2020 ◽  
pp. 0739456X2097170
Author(s):  
Gregory F. Randolph ◽  
Chandan Deuskar

How are urban population and its growth distributed across urban settlements of different sizes? In the Global South, this is a critical planning question, yet its answer is muddled by contradictory claims in the literature and inconsistent data. Using a new dataset measuring urbanization from 1975 to 2015, we find that urban populations in the South are less concentrated in megacities than in the North—contrary to conventional wisdom. Given an explosion in the number (not simply size) of urban settlements in the South, we suggest reviving the concept of “barefoot planning” as an approach for empowering communities beyond the metropolis to shape the urbanization process.


2020 ◽  
Vol 12 (9) ◽  
pp. 142
Author(s):  
Zhijun Wu ◽  
Bohua Cui

To address the low interconnection efficiency caused by the wide variety of data in SWIM (System-Wide Information Management) and by its inconsistent data naming methods, this paper proposes a new hybrid data naming scheme that combines a TLC (Type-Length-Content) structure with Bloom filters. The scheme meets the uniqueness and durability requirements of SWIM data names, solves the “suffix loopholes” encountered in prefix-based route aggregation in hierarchical naming, and achieves scalable and effective route state aggregation. Simulation results show that the hybrid naming scheme has a lower probability of route identification errors than prefix-based aggregation. In terms of search time, the scheme improves on the commonly used hierarchical and flat naming methods by 17.8% and 18.2%, respectively. Compared with the same two naming methods, scalability increases by 19.1% and 18.4%, respectively.
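For readers unfamiliar with the two building blocks named in this abstract, the sketch below shows a generic Type-Length-Content encoding of a name component and a small Bloom filter for membership tests. It does not reproduce the paper's scheme: the one-byte type and length fields, the type code `1`, the filter size, the double-hashing construction and the example name are all assumptions made for illustration.

```python
import hashlib


def tlc_encode(type_code: int, content: str) -> bytes:
    """Encode one name component as type byte, length byte, then content.

    Assumed layout for illustration; the paper's field widths are not given.
    """
    data = content.encode("utf-8")
    if not 0 <= type_code <= 255 or len(data) > 255:
        raise ValueError("type code and content length must each fit in one byte")
    return bytes([type_code, len(data)]) + data


class BloomFilter:
    """Fixed-size Bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        # May return a false positive, but never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# Hypothetical usage: register an encoded name, then test membership.
name = tlc_encode(1, "swim") + tlc_encode(1, "weather")
known_names = BloomFilter()
known_names.add(name)
```

A Bloom filter trades a small false-positive rate for constant-size state, which is why it is a common choice when many names have to be summarized compactly.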


2020 ◽  
Vol 391 ◽  
pp. 52-64
Author(s):  
Zhixin Qi ◽  
Hongzhi Wang ◽  
Tao He ◽  
Jianzhong Li ◽  
Hong Gao
