Quality of Open Research Data: Values, Convergences and Governance

Information, 2020, Vol. 11 (4), p. 175
Author(s): Tibor Koltay

This paper focuses on the characteristics of research data quality and aims to cover the most important related issues, giving particular attention to its attributes and to data governance. The corporate world's considerable interest in data quality is evident in the ideas and issues reported in business-related publications, even if there are apparent differences between the values of, and approaches to, data in corporate and academic (research) environments. The paper also takes into consideration that addressing data quality would be unimaginable without considering big data.

2021, Vol. 23 (06), pp. 1011-1018
Author(s): Aishrith P Rao, Raghavendra J C, Dr. Sowmyarani C N, Dr. Padmashree T, ...

With the advancement of technology and the large volume of data produced, processed, and stored, it is becoming increasingly important to maintain the quality of data in a cost-effective and productive manner. The most important aspects of Big Data (BD) are storage, processing, privacy, and analytics. The Big Data community has identified quality as a critical aspect of its maturity, and quality management is an approach that should be adopted early in the lifecycle and gradually extended to the other primary processes. Companies rely heavily on, and derive profits from, the huge amounts of data they collect. When data consistency deteriorates, the ramifications are unpredictable and may lead to entirely undesirable conclusions. In the context of BD, determining data quality is difficult, but it is essential to uphold data quality before proceeding with any analytics. In this paper, we investigate data quality during the data gathering, preprocessing, data repository, and evaluation/analysis stages of BD processing. Related solutions are also suggested, based on an elaboration and review of the identified problems.
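A minimal sketch of such stage-wise quality checks, assuming a tabular pandas workflow; the dimensions checked (completeness, uniqueness, validity) and all column names, thresholds, and example data are illustrative assumptions, not taken from the paper:

```python
# Illustrative stage-wise quality checks for a BD pipeline; the checked
# dimensions and the example data are assumptions for demonstration only.
import pandas as pd

def check_gathering(df: pd.DataFrame) -> dict:
    """Checks at the data-gathering stage."""
    return {
        "completeness": 1.0 - df.isna().mean().mean(),  # non-missing cell share
        "uniqueness": 1.0 - df.duplicated().mean(),     # non-duplicate row share
    }

def check_preprocessing(df: pd.DataFrame, valid_ranges: dict) -> dict:
    """Checks after preprocessing: share of values inside declared ranges."""
    return {col: df[col].between(lo, hi).mean()
            for col, (lo, hi) in valid_ranges.items()}

df = pd.DataFrame({"age": [25, 31, -4, 52],
                   "income": [30e3, None, 45e3, 52e3]})
print(check_gathering(df))                               # run before cleaning
print(check_preprocessing(df.dropna(), {"age": (0, 120)}))
```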


2021, Vol. 13 (3), pp. 1-15
Author(s): Rada Chirkova, Jon Doyle, Juan Reutter

Assessing and improving the quality of data are fundamental challenges in Big-Data applications. These challenges have given rise to numerous solutions targeting transformation, integration, and cleaning of data. However, while schema design, data cleaning, and data migration are nowadays reasonably well understood in isolation, not much attention has been given to the interplay between standalone tools in these areas. In this article, we focus on the problem of determining whether the available data-transforming procedures can be used together to bring about the desired quality characteristics of the data in business or analytics processes. For example, to help an organization avoid building a data-quality solution from scratch when facing a new analytics task, we ask whether the data quality can be improved by reusing the tools that are already available, and if so, which tools to apply, and in which order, all without presuming knowledge of the internals of the tools, which may be external or proprietary. Toward addressing this problem, we conduct a formal study in which individual data cleaning, data migration, or other data-transforming tools are abstracted as black-box procedures with only some of the properties exposed, such as their applicability requirements, the parts of the data that the procedure modifies, and the conditions that the data satisfy once the procedure has been applied. As a proof of concept, we provide foundational results on sequential applications of procedures abstracted in this way, to achieve prespecified data-quality objectives, for the use case of relational data and for procedures described by standard relational constraints. We show that, while reasoning in this framework may be computationally infeasible in general, there exist well-behaved cases in which these foundational results can be applied in practice for achieving desired data-quality results on Big Data.
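As a rough illustration of this black-box view, the sketch below models each tool as a named pair of pre- and postconditions over abstract quality properties and searches for an admissible sequence; the property names, the assumption that guarantees persist once established, and the breadth-first search are simplifications for demonstration, not the paper's formalism:

```python
# Rough sketch of the black-box abstraction: each tool exposes only its
# applicability requirements and the properties it guarantees afterwards.
# Property names, the monotonic treatment of guarantees, and the
# breadth-first search are simplifying assumptions.
from collections import deque
from dataclasses import dataclass

@dataclass(frozen=True)
class Procedure:
    name: str
    requires: frozenset  # properties that must hold before application
    ensures: frozenset   # properties guaranteed afterwards

def find_sequence(start: set, goal: set, procedures, max_len: int = 5):
    """Search for a tool sequence whose cumulative guarantees cover the
    data-quality objective, without inspecting any tool's internals."""
    queue = deque([(frozenset(start), [])])
    seen = {frozenset(start)}
    while queue:
        state, plan = queue.popleft()
        if goal <= state:
            return plan
        if len(plan) >= max_len:
            continue
        for p in procedures:
            if p.requires <= state:
                nxt = state | p.ensures
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + [p.name]))
    return None  # no admissible sequence within max_len steps

tools = [
    Procedure("dedupe", frozenset({"schema_ok"}), frozenset({"no_duplicates"})),
    Procedure("migrate", frozenset(), frozenset({"schema_ok"})),
    Procedure("impute", frozenset({"schema_ok"}), frozenset({"no_nulls"})),
]
print(find_sequence(set(), {"no_duplicates", "no_nulls"}, tools))
# -> ['migrate', 'dedupe', 'impute']
```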


2017, Vol. 4 (1), pp. 25-31
Author(s): Diana Effendi

The Information Product Approach (IP Approach) is an information management approach that can be used to manage information products and to analyze data quality. Organizations can use an IP-Map to facilitate the management of knowledge in collecting, storing, maintaining, and using data in an organized way. The data management process for academic activities at X University has not yet used the IP Approach: X University has paid no attention to managing the quality of its information, concerning itself so far only with the system applications used to automate data management in its academic activities. The IP-Map constructed in this paper can be used as a basis for analyzing the quality of data and information. With the IP-Map, X University is expected to identify which parts of the process need improvement in the quality of its data and information management. Index terms: IP Approach, IP-Map, information quality, data quality.
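As a rough illustration of what an IP-Map captures, the sketch below models the map as a small directed graph of typed blocks; the block kinds follow the standard IP-Map notation (source, process, storage, consumer), while the academic-data example itself is an assumption, not taken from the paper:

```python
# Rough sketch of an IP-Map as a directed graph of typed blocks.
# The example blocks are illustrative; only the block kinds follow
# the standard IP-Map notation.
from dataclasses import dataclass, field

@dataclass
class Block:
    name: str
    kind: str  # "source" | "process" | "storage" | "consumer"
    outputs: list = field(default_factory=list)  # downstream block names

blocks = {
    "registrar_form": Block("registrar_form", "source", ["validate"]),
    "validate": Block("validate", "process", ["student_db"]),
    "student_db": Block("student_db", "storage", ["transcript"]),
    "transcript": Block("transcript", "consumer", []),
}

def trace(start: str) -> list:
    """Follow an information product from a source to its consumers,
    the path along which quality problems would propagate."""
    path, frontier = [], [start]
    while frontier:
        name = frontier.pop(0)
        path.append(name)
        frontier.extend(blocks[name].outputs)
    return path

print(" -> ".join(trace("registrar_form")))
# registrar_form -> validate -> student_db -> transcript
```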


2021, 004912412199553
Author(s): Jan-Lucas Schanze

An increasing age of respondents and cognitive impairment are the usual suspects for increasing difficulties in survey interviews and decreasing data quality. This is why survey researchers tend to label residents of retirement and nursing homes as hard to interview and exclude them from most social surveys. In this article, I examine to what extent this label is justified and whether the quality of data collected among residents of institutions for the elderly really differs from data collected within private households. For this purpose, I analyze response behavior and quality indicators in three waves of the Survey of Health, Ageing and Retirement in Europe (SHARE). To control for confounding variables, I use propensity score matching to identify respondents in private households who share similar characteristics with institutionalized residents. My results confirm that most indicators of response behavior and data quality are worse in institutions than in private households. However, when controlling for sociodemographic and health-related variables, the differences become very small. These results suggest the importance of health for data quality, irrespective of the housing situation.
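A minimal sketch of the matching step, assuming simulated covariates and scikit-learn; the covariates, the logistic propensity model, and one-nearest-neighbor matching are illustrative assumptions, not the article's exact specification:

```python
# Illustrative propensity score matching: pair each institutionalized
# respondent with the private-household respondent whose estimated
# propensity score is closest. Data and covariates are simulated.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.normal(70, 10, n),        # age
    rng.integers(0, 5, n),        # number of health conditions
])
treated = rng.integers(0, 2, n).astype(bool)  # True = lives in an institution

# 1. Estimate propensity scores: P(institutionalized | covariates).
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# 2. Match each institutionalized respondent to the nearest
#    private-household respondent on the propensity score.
nn = NearestNeighbors(n_neighbors=1).fit(ps[~treated].reshape(-1, 1))
_, idx = nn.kneighbors(ps[treated].reshape(-1, 1))
controls = np.flatnonzero(~treated)[idx.ravel()]
print("matched control indices:", controls[:10])
```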


Author(s): Christopher D O’Connor, John Ng, Dallas Hill, Tyler Frederick

Policing is increasingly being shaped by data collection and analysis. However, we still know little about the quality of the data police services acquire and utilize. Drawing on a survey of analysts from across Canada, this article examines several data collection, analysis, and quality issues. We argue that as we move towards an era of big data policing, it is imperative that police services pay more attention to the quality of the data they collect. We conclude by discussing the implications of ignoring data quality issues and the need to develop a more robust research culture in policing.


Author(s): Marco Angrisani, Anya Samek, Arie Kapteyn

The number of data sources available for academic research on retirement economics and policy has increased rapidly in the past two decades. Data quality and comparability across studies have also improved considerably, with survey questionnaires progressively converging towards common ways of eliciting the same measurable concepts. Probability-based Internet panels have become a more accepted and recognized tool to obtain research data, allowing for fast, flexible, and cost-effective data collection compared to more traditional modes such as in-person and phone interviews. In an era of big data, academic research has also increasingly been able to access administrative records (e.g., Kostøl and Mogstad, 2014; Cesarini et al., 2016), private-sector financial records (e.g., Gelman et al., 2014), and administrative data married with surveys (Ameriks et al., 2020), to answer questions that could not be successfully tackled otherwise.


2021, Vol. 27 (3), pp. 8-34
Author(s): Tatyana Cherkashina

The article presents the experience of converting non-targeted administrative data into research data, using as an example data on the income and property of deputies of local legislative bodies of the Russian Federation for 2019, collected as part of anti-corruption operations. This particular empirical fragment was selected for a pilot study of administrative data, which includes assessing the possibility of integrating scattered fragments of information into a single database, assessing the quality of the data, and assessing their relevance for solving research problems, particularly the analysis of high-income strata and the apparent trend towards individualization of private property. The system of indicators for assessing data quality includes timeliness, availability, interpretability, reliability, comparability, coherence, errors of representation and measurement, and relevance. In the case of the data set in question, measurement errors are more common than representation errors. Overall, the article emphasizes that introducing new non-targeted data into circulation requires preliminary testing, while data quality assessment becomes distributed both in time and between different subjects. The transition from created data to "obtained" data shifts the function of evaluating its quality from the researcher-creator to the researcher-user. And though in this case data quality is partly ensured by the legal support for their production, the transformation of administrative data into research data involves assessing a variety of quality dimensions, from availability to uniformity and accuracy.
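A minimal sketch of such a multi-indicator assessment; the indicator names follow the article, while the 0-1 scoring scale and the aggregation rule are illustrative assumptions:

```python
# Illustrative scoring of a data set against the quality indicators
# named in the article. The 0-1 scale and the naive aggregate are
# assumptions for demonstration, not the article's methodology.
from dataclasses import dataclass

@dataclass
class QualityReport:
    timeliness: float            # how current the records are
    availability: float          # share of records actually obtainable
    interpretability: float      # share of fields with clear definitions
    reliability: float
    comparability: float
    coherence: float
    representation_error: float  # lower is better
    measurement_error: float     # lower is better
    relevance: float

    def overall(self) -> float:
        """Naive aggregate: mean of 'higher is better' indicators
        minus mean of the two error rates."""
        positives = [self.timeliness, self.availability, self.interpretability,
                     self.reliability, self.comparability, self.coherence,
                     self.relevance]
        errors = [self.representation_error, self.measurement_error]
        return sum(positives) / len(positives) - sum(errors) / len(errors)

report = QualityReport(0.9, 0.8, 0.7, 0.85, 0.6, 0.75, 0.05, 0.15, 0.9)
print(f"overall quality score: {report.overall():.2f}")
```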


2008, Vol. 13 (5), pp. 378-389
Author(s): Xiaohua Douglas Zhang, Amy S. Espeseth, Eric N. Johnson, Jayne Chin, Adam Gates, ...

RNA interference (RNAi) not only plays an important role in drug discovery but can also be developed directly into drugs. RNAi high-throughput screening (HTS) biotechnology allows us to conduct genome-wide RNAi research. A central challenge in genome-wide RNAi research is to integrate both experimental and computational approaches to obtain high-quality RNAi HTS assays. Based on our daily practice in RNAi HTS experiments, we propose the implementation of 3 experimental and analytic processes to improve the quality of data from RNAi HTS biotechnology: (1) select effective biological controls; (2) adopt appropriate plate designs to display and/or adjust for systematic errors of measurement; and (3) use effective analytic metrics to assess data quality. Applications in 5 real RNAi HTS experiments demonstrate the effectiveness of integrating these processes to improve data quality. Given their effectiveness in improving data quality in RNAi HTS experiments, the methods and guidelines contained in the 3 experimental and analytic processes are likely to have broad utility in genome-wide RNAi research. (Journal of Biomolecular Screening 2008:378-389)
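One widely used analytic metric for assessing HTS assay quality from control wells is the Z'-factor (Zhang et al., 1999); the sketch below computes it from simulated positive and negative controls. The control values are assumptions for demonstration, and the paper's own preferred metrics (such as SSMD) may differ:

```python
# Generic sketch of a plate-quality metric for HTS data: the Z'-factor,
# computed from positive and negative control wells. Control values are
# simulated; this is not necessarily the paper's chosen metric.
import numpy as np

def z_prime(pos: np.ndarray, neg: np.ndarray) -> float:
    """Z' = 1 - 3*(sd_pos + sd_neg) / |mean_pos - mean_neg|.
    Values above ~0.5 are conventionally taken to indicate an
    excellent separation between controls."""
    return 1 - 3 * (pos.std(ddof=1) + neg.std(ddof=1)) / abs(pos.mean() - neg.mean())

rng = np.random.default_rng(1)
pos = rng.normal(100, 5, 32)  # positive-control wells
neg = rng.normal(20, 4, 32)   # negative-control wells
print(f"Z'-factor: {z_prime(pos, neg):.2f}")
```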


Tunas Agraria, 2021, Vol. 4 (2), pp. 168-174
Author(s): Maslusatun Mawadah

The South Jakarta Administrative City Land Office is one of the land offices targeted to achieve complete land administration in 2020. The current condition of land parcel data demands an update, namely improving data quality from classes KW1 through KW6 towards valid KW1. The purpose of this study is to determine the condition of land data quality in South Jakarta, the implementation of the data quality improvement, and the problems and solutions encountered in that implementation. The research method used is qualitative with a descriptive approach. The results show that after the improvement was implemented, the share of KW1 data increased from 86.45% to 87.01%. The roles of man, material, machine, and method have been fulfilled, but the implementation of the data quality improvement does not comply with the 2019 Complete City Guidelines with respect to the inventory of territorial boundaries. Obstacles also remain: the absence of the buku tanah (land book), surat ukur (survey document), and gambar ukur (survey drawing) at the land office; regional subdivision while sub-district boundaries are not yet certain; and land parcels that have been separated from the mapping without the knowledge of the office administrator.


Author(s): Kamalendu Pal

Global retail business has become diverse, and the latest Information Technology (IT) advancements have created new possibilities for managing the deluge of data generated by the worldwide business operations of its supply chain. In this business, external data from social media and supplier networks provide a huge influx that augments existing data. This is combined with data from sensors and intelligent machines, commonly known as Internet of Things (IoT) data. This data, originating from the global retail supply chain, is known simply as Big Data, because of its enormous volume, the velocity with which it arrives in the global retail business environment, its veracity with respect to quality-related issues, and the value it generates for the global supply chain. Many retail products manufacturing companies are trying to find ways to enhance the quality of their operational performance while reducing business support costs. They do this primarily by improving defect tracking and forecasting. These manufacturing and operational improvements, along with a favorable customer experience, remain crucial to thriving in global competition. In recent years, Big Data and its associated technologies have attracted huge research interest from academics, industry practitioners, and government agencies. Big Data-based software applications are widely used within retail supply chain management, in recommendation, prediction, and decision support systems. The spectacular growth of these software systems has enormous potential for improving the daily performance of retail product and service companies. However, there are increasing data quality problems, resulting in erroneous testing costs in retail Supply Chain Management (SCM). The heavy investment made in Big Data-based software applications puts increasing pressure on management to justify the quality assurance of these software systems. This chapter discusses data quality and the dimensions of data quality for Big Data applications. It also examines some of the challenges presented by managing the quality and governance of Big Data, and how those can be balanced with the need to deliver usable Big Data-based software systems. Finally, the chapter highlights the importance of data governance; it also covers some Big Data managerial-practice issues and their justifications for achieving application software quality assurance.

