Evaluating the Quality and Usefulness of Data Breach Information Systems

2011 ◽  
Vol 5 (4) ◽  
pp. 31-46 ◽  
Author(s):  
Benjamin Ngugi ◽  
Jafar Mana ◽  
Lydia Segal

As the nation confronts a growing tide of security breaches, the importance of having quality data breach information systems becomes paramount. Yet too little attention is paid to evaluating these systems. This article draws on data quality scholarship to develop a yardstick that assesses the quality of data breach notification systems in the U.S. at both the state and national levels from the perspective of key stakeholders, who include law enforcement agencies, consumers, shareholders, investors, researchers, and businesses that sell security products. Findings reveal major shortcomings that reduce the value of data breach information to these stakeholders. The study concludes with detailed recommendations for reform.


Author(s):  
Nishita Shewale

Abstract: This paper introduces unified information systems that give different establishments insight into how data-related activities take place and ensure that their results are of assured quality. Data accumulation, replication, missing entities, incorrect formatting, anomalies, and similar problems can surface when data are collected across different information systems, and they can have an array of adverse effects on data quality, so the subject deserves careful treatment. This paper examines data quality problems in information systems and introduces techniques that enable organizations to improve the quality of their data.
Keywords: Information Systems (IS), Data Quality, Data Cleaning, Data Profiling, Standardization, Database, Organization
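As a minimal sketch of the profiling and cleaning checks discussed above (missing entities, replication, incorrect formatting), the following pandas snippet is illustrative only; the dataset and column names such as `email` and `amount` are assumptions, not taken from the paper.

```python
# Minimal data-profiling sketch (pandas); the dataset and column names are illustrative.
import pandas as pd


def profile_quality(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize missing values, duplicated rows, and value variety per column."""
    return pd.DataFrame({
        "missing_pct": df.isna().mean() * 100,                             # missing entities
        "duplicate_rows": [int(df.duplicated().sum())] * len(df.columns),  # replication
        "distinct_values": df.nunique(),
    })


def flag_bad_emails(df: pd.DataFrame, col: str = "email") -> pd.Series:
    """Flag rows whose email does not match a simple pattern (incorrect formatting)."""
    return ~df[col].str.match(r"[^@\s]+@[^@\s]+\.[^@\s]+", na=False)


if __name__ == "__main__":
    data = pd.DataFrame({
        "email": ["a@example.com", "not-an-email", None],
        "amount": [10, 10, 99999],  # the last value may be an anomaly worth reviewing
    })
    print(profile_quality(data))
    print(flag_bad_emails(data))
```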


2017 ◽  
Vol 4 (1) ◽  
pp. 25-31 ◽  
Author(s):  
Diana Effendi

The Information Product Approach (IP Approach) is an information management approach that can be used to manage product information and analyze data quality. An IP-Map helps organizations manage knowledge by collecting, storing, maintaining, and using data in an organized manner. The data management process for academic activities at X University has not yet used the IP Approach: the university has paid little attention to managing the quality of its information and has so far concerned itself only with the system applications used to automate data management in its academic processes. The IP-Map developed in this paper can serve as a basis for analyzing data and information quality. With the IP-Map, X University is expected to identify which parts of the process need improvement in data and information quality management.
Index terms: IP Approach, IP-Map, information quality, data quality.


2020 ◽  
pp. 089443932092824 ◽  
Author(s):  
Michael J. Stern ◽  
Erin Fordyce ◽  
Rachel Carpenter ◽  
Melissa Heim Viox ◽  
Stuart Michaels ◽  
...  

Social media recruitment is no longer an uncharted avenue for survey research. The results thus far provide evidence of an engaging means of recruiting hard-to-reach populations. Questions remain, however, regarding whether this method of recruitment produces quality data. This article assesses one aspect that may influence the quality of data gathered through nonprobability sampling using social media advertisements for a hard-to-reach sexual and gender minority youth population: recruitment design formats. The data come from the Survey of Today’s Adolescent Relationships and Transitions, which used a variety of advertisement formats as survey recruitment tools on Facebook, Instagram, and Snapchat. Results demonstrate that design decisions such as the format of the advertisement (e.g., video or static) and the use of eligibility language on the advertisements affect the quality of the data as measured by break-off rates and the use of nonsubstantive responses. Additionally, the type of device used affected the measures of data quality.
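For illustration, a hedged sketch of how the two quality measures mentioned above, break-off rates and nonsubstantive responses, might be computed by advertisement format; the column names (ad_format, completed, q1, q2) are assumptions, not the study's actual variables.

```python
# Hedged sketch: break-off rate and nonsubstantive-response rate by advertisement format.
# Column names (ad_format, completed, q1, q2) are assumptions, not the study's schema.
import pandas as pd

NONSUBSTANTIVE = {"don't know", "prefer not to answer", "refused"}


def quality_by_format(df: pd.DataFrame, item_cols: list) -> pd.DataFrame:
    out = df.copy()
    # Break-off: the respondent started the survey but did not finish it.
    out["break_off"] = ~out["completed"]
    # Share of a respondent's item answers that are nonsubstantive.
    answers = out[item_cols].apply(lambda s: s.str.lower().isin(NONSUBSTANTIVE))
    out["nonsubstantive_rate"] = answers.mean(axis=1)
    return out.groupby("ad_format")[["break_off", "nonsubstantive_rate"]].mean()


if __name__ == "__main__":
    responses = pd.DataFrame({
        "ad_format": ["video", "static", "video"],
        "completed": [True, False, True],
        "q1": ["yes", "don't know", "no"],
        "q2": ["refused", "no", "yes"],
    })
    print(quality_by_format(responses, ["q1", "q2"]))
```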


2021 ◽  
Vol 280 ◽  
pp. 08012
Author(s):  
Yordanka Anastasova ◽  
Nikolay Yanev

The purpose of this article is to present modern approaches to data storage and processing, as well as technologies for achieving the data quality needed for specific purposes in the mining industry. Regarding data formats, the article examines NoSQL and NewSQL technologies, with the focus shifting from common solutions (traditional RDBMS) to specific ones aimed at integrating data into industrial information systems. The information systems used in the mining industry are characterized by their specificity and diversity, which makes the flexibility of NoSQL data models well suited to their integration. In modern industrial information systems, data is considered high-quality if it accurately reflects the described object and supports effective management decisions. The article also discusses the criteria for data quality from the point of view of information technology and from that of its users, and presents technologies providing an optimal set of functions that ensure the desired data quality in the information systems applicable in the industry. The format and quality of data in client-server information systems is of particular importance, especially given the dynamics of data input and processing in the systems used in the mining industry.
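As a rough illustration only (not code from the article), the snippet below shows a flexible document-style record for mining telemetry that carries data-quality attributes alongside the payload, plus a simple fitness-for-use check; all field names are assumed.

```python
# Illustrative only: a flexible document-style (NoSQL) record for mining telemetry that
# carries data-quality attributes alongside the payload. All field names are assumed.
import json
from datetime import datetime, timezone

reading = {
    "equipment_id": "excavator-07",
    "measured_at": datetime.now(timezone.utc).isoformat(),
    "payload": {"vibration_mm_s": 4.2, "ore_moisture_pct": 11.5},
    "quality": {                      # quality metadata travels with the record
        "source_system": "scada-a",
        "completeness": 1.0,          # fraction of expected fields present
        "validated": True,
    },
}


def is_fit_for_use(doc: dict, min_completeness: float = 0.9) -> bool:
    """Treat a record as usable for management decisions only if its quality block says so."""
    quality = doc.get("quality", {})
    return bool(quality.get("validated")) and quality.get("completeness", 0.0) >= min_completeness


print(json.dumps(reading, indent=2))
print(is_fit_for_use(reading))
```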


2021 ◽  
Vol 13 (1) ◽  
pp. 1-33
Author(s):  
Nelson Novaes Neto ◽  
Stuart Madnick ◽  
Anchises Moraes G. De Paula ◽  
Natasha Malara Borges

If the mantra “data is the new oil” of our digital economy is correct, then data leak incidents are the critical disasters of the online society. The initial goal of our research was to present a comprehensive database of data breaches of personal information that took place in 2018 and 2019, drawn from press reports, industry studies, and reports from regulatory agencies across the world. This article identified the 430 largest data breach incidents among more than 10,000 data breach incidents. In the process, we encountered many complications, especially the lack of standardization in reporting. This article should be especially interesting to the readers of JDIQ because it describes both the range of data quality and consistency issues found and what was learned from the database created. The database, available at https://www.databreachdb.com, shows that the number of data records breached in those top 430 incidents increased from around 4B in 2018 to more than 22B in 2019. This increase occurred despite strong efforts by regulatory agencies across the world to enforce strict rules on data protection and privacy, such as the General Data Protection Regulation (GDPR) that went into effect in Europe in May 2018. Such regulatory effort could explain why so many more data breach cases are reported in the European Union than in the U.S. (more than 10,000 data breaches publicly reported in the U.S. since 2018, compared with more than 160,000 in the EU since May 2018). However, we still face the problem of an excessive number of breach incidents around the world. This research helps to understand the challenges of achieving proper visibility of such incidents on a global scale. The results can help government entities, regulatory bodies, security and data quality researchers, companies, and managers to improve the data quality of data breach reporting and increase the visibility of the data breach landscape around the world in the future.
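A small illustrative sketch of the kind of yearly aggregation behind the 4B-to-22B comparison, using an assumed incident schema (incident, year, records_breached) rather than the actual databreachdb.com format:

```python
# Small illustrative sketch: totaling breach incidents and records per year, using an
# assumed schema (incident, year, records_breached), not the actual databreachdb.com format.
import pandas as pd

incidents = pd.DataFrame([
    {"incident": "Example Corp", "year": 2018, "records_breached": 1_500_000_000},
    {"incident": "Sample Inc",   "year": 2019, "records_breached": 9_000_000_000},
    {"incident": "Acme Ltd",     "year": 2019, "records_breached": 4_000_000_000},
])

# Incident counts and total records breached per year, mirroring the comparison in the article.
print(incidents.groupby("year")["records_breached"].agg(["count", "sum"]))
```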


2018 ◽  
Vol 10 (1) ◽  
Author(s):  
Brian E. Dixon ◽  
Chen Wen ◽  
Tony French ◽  
Jennifer Williams ◽  
Shaun J. Grannis

Objective: To extend an open source analytics and visualization platform for measuring the quality of electronic health data transmitted to syndromic surveillance systems.

Introduction: Effective clinical and public health practice in the twenty-first century requires access to data from an increasing array of information systems. However, the quality of data in these systems can be poor or “unfit for use.” Therefore measuring and monitoring data quality is an essential activity for clinical and public health professionals as well as researchers [1]. Current methods for examining data quality largely rely on manual queries and processes conducted by epidemiologists. Better, automated tools for examining data quality are desired by the surveillance community.

Methods: Using the existing, open-source platform Atlas developed by the Observational Health Data Sciences and Informatics collaborative (OHDSI; www.ohdsi.org), we added new functionality to measure and visualize the quality of data electronically reported from disparate information systems. Our extensions focused on analysis of data reported electronically to public health agencies for disease surveillance. Specifically, we created methods for examining the completeness and timeliness of reported data as well as the information entropy of the data within syndromic surveillance messages sent from emergency department information systems.

Results: To date we transformed 111 million syndromic surveillance message segments pertaining to 16.4 million emergency department encounters representing 6 million patients into the OHDSI common data model. We further measured completeness, timeliness, and entropy of the syndromic surveillance data. In Figure 1, the OHDSI tool Atlas summarizes the analysis of data completeness for key fields in over one million syndromic surveillance messages sent to Indiana’s health department in 2014. Completeness is reported by age category (e.g., 0-10, 20-30, 60+). Gender is generally complete, but both race and ethnicity fields are often complete for less than half of the patients in the cohort. These results suggest areas for improvement with respect to data quality that could be actionable by the syndromic surveillance coordinator at the state health department.

Conclusions: Our project remains a work in progress. While functions that assess completeness, timeliness, and entropy are complete, there may be other functions important to public health that need to be developed. We are currently soliciting feedback from syndromic surveillance stakeholders to gather ideas for what other functions would be useful to epidemiologists. Suggestions could be developed into functions over the next year. We are further working with the OHDSI collaborative to distribute the Atlas enhancements to other platforms, including the National Syndromic Surveillance Platform (NSSP). Our goal is to enable epidemiologists to quickly analyze data quality at scale.

References: [1] Dixon BE, Rosenman M, Xia Y, Grannis SJ. A vision for the systematic monitoring and improvement of the quality of electronic health data. Studies in Health Technology and Informatics. 2013;192:884-8.
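A hedged sketch of the three measures named in the Methods (completeness, timeliness, and information entropy) over a toy message table; the field names are illustrative and not the OHDSI/Atlas schema.

```python
# Hedged sketch of the three measures described in the Methods: completeness, timeliness,
# and information entropy. Field names are illustrative, not the OHDSI/Atlas schema.
import math

import pandas as pd


def completeness_by_group(df: pd.DataFrame, field: str, group: str) -> pd.Series:
    """Fraction of non-missing values of `field` within each group (e.g., age category)."""
    return df.groupby(group)[field].apply(lambda s: s.notna().mean())


def timeliness_hours(df: pd.DataFrame) -> pd.Series:
    """Lag between the emergency department visit and receipt of the message, in hours."""
    return (df["received_at"] - df["visit_at"]).dt.total_seconds() / 3600


def shannon_entropy(values: pd.Series) -> float:
    """Entropy (bits) of a categorical field; very low entropy can signal default-filled data."""
    p = values.value_counts(normalize=True, dropna=True)
    return float(-(p * p.map(math.log2)).sum())


if __name__ == "__main__":
    msgs = pd.DataFrame({
        "age_group": ["0-10", "0-10", "60+"],
        "race": ["white", None, None],
        "visit_at": pd.to_datetime(["2014-01-01 08:00", "2014-01-01 09:00", "2014-01-02 10:00"]),
        "received_at": pd.to_datetime(["2014-01-01 10:00", "2014-01-01 12:00", "2014-01-03 10:00"]),
    })
    print(completeness_by_group(msgs, "race", "age_group"))
    print(timeliness_hours(msgs))
    print(shannon_entropy(msgs["age_group"]))
```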


Author(s):  
Patrick Ohemeng Gyaase ◽  
Joseph Tei Boye-Doe ◽  
Christiana Okantey

Quality data from the Expanded Programme on Immunization (EPI), which is pivotal in reducing infant mortality globally, is critical for knowledge management on the EPI. This chapter assesses the quality of EPI data on the six childhood killer diseases drawn from the EPI tally books, monthly reports, and the District Health Information Management System (DHIMS II), using the Data Quality Self-Assessment (DQS) tool of the WHO. The study found high availability and completeness of data in the EPI tally books and the monthly EPI reports. The accuracy and currency of data on all antigens, comparing EPI tally book records with the reported numbers issued, were comparatively low. The composite quality index of the EPI data is thus low, an indication of poor supervision of the EPI programme in the health facilities. There is therefore a need for effective monitoring and data validation at the point of collection and entry to improve data quality for knowledge management on the EPI programme.
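For illustration only, a sketch of the kind of accuracy check described above, comparing doses recorded in EPI tally books against those in the monthly report per antigen; the figures and the "verification ratio" label are invented for the example, and the WHO DQS tool defines its own indices.

```python
# Illustration only: comparing doses recorded in EPI tally books with doses in the monthly
# report, per antigen. Numbers and the "verification ratio" label are invented for the example.
tally_book = {"BCG": 120, "OPV": 115, "Measles": 98}
monthly_report = {"BCG": 130, "OPV": 115, "Measles": 110}

for antigen, recorded in tally_book.items():
    reported = monthly_report.get(antigen, 0)
    # A ratio far from 1.0 indicates over- or under-reporting relative to the tally book.
    ratio = recorded / reported if reported else float("nan")
    print(f"{antigen}: verification ratio = {ratio:.2f}")
```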


Author(s):  
Diego Milano

Data quality is a complex concept defined by various dimensions such as accuracy, currency, completeness, and consistency (Wang & Strong, 1996). Recent research has highlighted the importance of data quality issues in various contexts. In particular, in some specific environments characterized by extensive data replication, high quality of data is a strict requirement. Among such environments, this article focuses on Cooperative Information Systems. Cooperative information systems (CISs) are distributed and heterogeneous information systems that cooperate by sharing information, constraints, and goals (Mylopoulos & Papazoglou, 1997). Quality of data is a necessary requirement for a CIS: a system in the CIS will not easily exchange data with another system without knowledge of the quality of data provided by the other system, resulting in reduced cooperation. Also, when the quality of exchanged data is poor, there is a progressive deterioration of the overall data quality in the CIS. On the other hand, the high degree of data replication that characterizes a CIS can be exploited for improving data quality, as different copies of the same data may be compared in order to detect quality problems and possibly solve them. In Scannapieco, Virgillito, Marchetti, Mecella, and Baldoni (2004) and Mecella et al. (2003), the DaQuinCIS architecture is described as an architecture managing data quality in cooperative contexts, in order to avoid the spread of low-quality data and to exploit data replication for the improvement of the overall quality of cooperative data. In this article we describe the design of a component of our system named the quality factory, whose purpose is to evaluate the quality of the XML data sources of the cooperative system. While the need for such a component had been previously identified, this article first presents the design of the quality factory and proposes an overall methodology to evaluate the quality of XML data sources. Quality values measured by the quality factory are used by the data quality broker, which has two main functionalities: (1) quality brokering, which allows users to select data in the CIS according to their quality; and (2) quality improvement, which diffuses best-quality copies of data in the CIS.
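A minimal sketch, not the DaQuinCIS implementation, of one check a quality factory might perform: scoring the completeness of an XML data source by testing whether expected child elements are present and non-empty; the element names are assumed.

```python
# Minimal sketch (not the DaQuinCIS implementation): scoring the completeness of an XML data
# source by checking whether expected child elements are present and non-empty per record.
# The element names are assumed for illustration.
import xml.etree.ElementTree as ET

EXPECTED_FIELDS = ["name", "address", "tax_code"]


def xml_completeness(xml_text: str, record_tag: str = "citizen") -> float:
    root = ET.fromstring(xml_text)
    records = root.findall(record_tag)
    if not records:
        return 0.0
    present = 0
    for record in records:
        for field in EXPECTED_FIELDS:
            element = record.find(field)
            if element is not None and (element.text or "").strip():
                present += 1
    return present / (len(records) * len(EXPECTED_FIELDS))


sample = """<citizens>
  <citizen><name>Ada</name><address>Rome</address><tax_code/></citizen>
  <citizen><name>Bob</name><address></address><tax_code>XYZ</tax_code></citizen>
</citizens>"""
print(xml_completeness(sample))  # 4 of 6 expected values present, about 0.67
```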


2020 ◽  
Author(s):  
Cristina Costa-Santos ◽  
Ana Luísa Neves ◽  
Ricardo Correia ◽  
Paulo Santos ◽  
Matilde Monteiro-Soares ◽  
...  

Background: High-quality data is crucial for guiding decision making and practicing evidence-based healthcare, especially if previous knowledge is lacking. Nevertheless, data quality frailties have been exposed worldwide during the current COVID-19 pandemic. Focusing on a major Portuguese surveillance dataset, our study aims to assess data quality issues and suggest possible solutions.

Methods: On April 27th 2020, the Portuguese Directorate-General of Health (DGS) made a dataset (DGSApril) available to researchers upon request. On August 4th, an updated dataset (DGSAugust) was also obtained. Data quality was assessed through analysis of data completeness and of consistency between the two datasets.

Results: DGSAugust did not follow the same data format and variables as DGSApril, and a significant number of missing data and inconsistencies were found (e.g., 4,075 cases from DGSApril were apparently not included in DGSAugust). Several variables also showed a low degree of completeness and/or changed their values from one dataset to the other (e.g., the variable ‘underlying conditions’ had more than half of cases showing different information between datasets). There were also significant inconsistencies between the number of cases and deaths due to COVID-19 shown in DGSAugust and those in the DGS reports publicly provided daily.

Conclusions: The low quality of COVID-19 surveillance datasets limits their usability for informing good decisions and performing useful research. Major improvements in surveillance datasets are therefore urgently needed - e.g., simplification of data entry processes, constant monitoring of data, and increased training and awareness of health care providers - as low data quality may lead to deficient pandemic control.
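A hedged sketch of the comparison described above: which cases from the earlier extract are missing from the later one, and how many shared cases changed a given variable. The column names (case_id, underlying_conditions) are assumptions about the dataset layout.

```python
# Hedged sketch of the comparison described above: which cases from the earlier extract are
# missing from the later one, and how many shared cases changed a given variable.
# Column names (case_id, underlying_conditions) are assumptions about the dataset layout.
import pandas as pd


def compare_extracts(earlier: pd.DataFrame, later: pd.DataFrame, key: str, var: str) -> dict:
    merged = earlier.merge(later, on=key, how="left", suffixes=("_early", "_late"), indicator=True)
    missing_later = int((merged["_merge"] == "left_only").sum())            # cases that disappeared
    shared = merged[merged["_merge"] == "both"]
    changed = int((shared[f"{var}_early"] != shared[f"{var}_late"]).sum())  # values that drifted
    return {"missing_in_later_extract": missing_later, f"changed_{var}": changed}


if __name__ == "__main__":
    april = pd.DataFrame({"case_id": [1, 2, 3], "underlying_conditions": ["none", "asthma", "none"]})
    august = pd.DataFrame({"case_id": [1, 2], "underlying_conditions": ["none", "diabetes"]})
    print(compare_extracts(april, august, "case_id", "underlying_conditions"))
```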

