Poor Quality Data
Recently Published Documents

Total documents: 40 (five years: 13) ◽ H-index: 6 (five years: 1)

2021 ◽ Vol 11 (1) ◽ Author(s): M. A. Dakka, T. V. Nguyen, J. M. M. Hall, S. M. Diakiw, M. VerMilyea, et al.

Abstract: The detection and removal of poor-quality data in a training set is crucial to achieving high-performing AI models. In healthcare, data can be inherently poor-quality due to uncertainty or subjectivity; moreover, as is often the case, data-privacy requirements restrict AI practitioners from accessing raw training data, meaning manual visual verification of private patient data is not possible. Here we describe a novel method for automated identification of poor-quality data, called Untrainable Data Cleansing. This method is shown to have numerous benefits, including protection of private patient data; improvement in AI generalizability; and reduction in the time, cost, and data needed for training; all while offering a truer reporting of AI performance itself. Additionally, results show that Untrainable Data Cleansing could be useful as a triage tool to identify difficult clinical cases that may warrant in-depth evaluation or additional testing to support a diagnosis.
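The abstract does not spell out how Untrainable Data Cleansing decides that a sample is untrainable. As a minimal sketch of the general idea only, the code below flags samples that are consistently misclassified across repeated cross-validated training runs; the criterion, the flag_untrainable name, and the 0.8 threshold are illustrative assumptions, not the authors' implementation.

    # Illustrative sketch only: "untrainable" samples are approximated here as
    # those misclassified in most held-out predictions across repeated runs.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import StratifiedKFold

    def flag_untrainable(X, y, n_repeats=5, threshold=0.8, seed=0):
        """Return a boolean mask of samples misclassified in at least
        `threshold` fraction of their held-out predictions."""
        rng = np.random.RandomState(seed)
        errors = np.zeros(len(y))
        for _ in range(n_repeats):
            cv = StratifiedKFold(n_splits=5, shuffle=True,
                                 random_state=rng.randint(1 << 30))
            for train_idx, test_idx in cv.split(X, y):
                model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
                pred = model.predict(X[test_idx])
                errors[test_idx] += (pred != y[test_idx])
        return (errors / n_repeats) >= threshold  # consistently wrong -> flag

    # mask = flag_untrainable(X, y)
    # X_clean, y_clean = X[~mask], y[~mask]  # retrain on the cleansed set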


Electronics ◽ 2021 ◽ Vol 10 (17) ◽ pp. 2049 ◽ Author(s): Kennedy Edemacu, Jong Wook Kim

Nowadays, the Internet of Things (IoT) generates data in several application domains. Logistic regression, a standard machine learning algorithm with a wide application range, is often built on such data. Nevertheless, building a powerful and effective logistic regression model requires large amounts of data, so collaboration between multiple IoT participants has often been the go-to approach. However, privacy concerns and poor data quality are two challenges that threaten the success of such a setting. Several studies have proposed different methods to address the privacy concern, but to the best of our knowledge, little attention has been paid to the poor-data-quality problem in multi-party logistic regression. Thus, in this study, we propose a multi-party privacy-preserving logistic regression framework with poor-quality-data filtering for IoT data contributors that addresses both problems. Specifically, we propose a new metric, gradient similarity, in a distributed setting, which we employ to filter out parameters from data contributors with poor-quality data. To solve the privacy challenge, we employ homomorphic encryption. Theoretical analysis and experimental evaluations using real-world datasets demonstrate that our proposed framework is privacy-preserving and robust against poor-quality data.
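As a rough illustration of gradient-similarity filtering in a multi-party setting, the sketch below scores each contributor's gradient by cosine similarity to the aggregate and drops low-scoring parties. The exact metric, the threshold, and the names are assumptions, not the paper's protocol; in particular, the paper's scheme additionally runs under homomorphic encryption, which is omitted here for clarity.

    # Illustrative plaintext sketch of filtering contributors whose gradients
    # disagree with the aggregate direction (a proxy for poor-quality data).
    import numpy as np

    def filter_contributors(gradients, threshold=0.5):
        """gradients: list of per-party gradient vectors of equal shape.
        Returns indices of parties retained for aggregation."""
        G = np.stack(gradients)
        mean_grad = G.mean(axis=0)
        sims = G @ mean_grad / (np.linalg.norm(G, axis=1)
                                * np.linalg.norm(mean_grad) + 1e-12)
        return [i for i, s in enumerate(sims) if s >= threshold]

    # Aggregation then uses only the retained parties' gradients:
    # kept = filter_contributors(party_grads)
    # update = np.mean([party_grads[i] for i in kept], axis=0)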


Author(s): Scott Davis, Sumit Mohan

Patients who receive a kidney transplant commonly experience failure of their allograft. Transplant failure often comes with complex management decisions, such as when and how to wean immunosuppression and start the transition to a second transplant or to dialysis. These decisions are made in the context of important concerns about competing risks, including sensitization and infection. Unfortunately, the management of the failed allograft is, at present, guided by relatively poor-quality data and, as a result, practice patterns are variable and suboptimal given that patients with failed allografts experience excess morbidity and mortality compared with their transplant-naive counterparts. In this review, we summarize the management strategies through the often-precarious transition from transplant to dialysis, highlighting the paucity of data and the critical gaps in our knowledge that are necessary to inform the optimal care of the patient with a failing kidney transplant.


2020 ◽ Vol 73 (6) ◽ pp. 1372-1386 ◽ Author(s): Zihan Peng, Chengfa Gao, Rui Shang

The tight combination model improves the positioning accuracy of the Global Navigation Satellite System (GNSS) in complex environments by increasing the redundancy of observations. However, the ambiguity cannot be calculated directly because of its correlation with the phase differential inter-system bias (DISB) in the model. This paper proposes a method of DISB estimation based on the principle of maximum ratio. The data analysis shows that, for the standard deviation of the code DISB, the method yields an improvement of up to 0.179 m on poor-quality data. In addition, compared with the parameter combination method, the proposed method decreased the standard deviation of all the phase DISBs. For the phase DISB of GPS L1/Galileo E1, the standard deviation decreased from 0.014/0.022/0.009/0.051 cycles to 0.006/0.015/0.004/0.029 cycles over the four baselines, improvements of 57.14/31.82/55.56/43.14%. For the phase DISB of GPS L1/BDS B1, the standard deviation decreased from 0.014/0.061/0.010/0.052 cycles to 0.002/0.005/0.009/0.004 cycles over the four baselines, improvements of 85.71/91.80/10.00/92.31%.
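The quoted percentages follow from the standard relative reduction in standard deviation; for the first GPS L1/Galileo E1 baseline, for example:

\[
\frac{\sigma_\text{old}-\sigma_\text{new}}{\sigma_\text{old}}\times 100\%
= \frac{0.014-0.006}{0.014}\times 100\% \approx 57.14\%
\]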


2020 ◽ Vol 9 (1) ◽ pp. 2535-2539

Data is very valuable and is generated in large volumes. Using high-quality data for decision making has become a major task, as it helps people make better decisions, analyses, and predictions. We are surrounded by data containing errors; data cleaning is a slow, complicated task and is considered costly. Data polishing is important because errors must be removed from the data before it is transferred to the data warehouse, where poor-quality data is eliminated to obtain the desired results. Error-free data produces precise and accurate results when queried, so consistent and proper data is required for decision making. The two components of data polishing are data repairing and data association. Association is defined as identifying homogeneous objects and linking each to its most closely associated object. Repairing is defined as making the database reliable by finding and fixing faults. In big data applications, we do not use all the existing data, only subsets of appropriate data; association is the process of converting extensive amounts of raw data into such useful subsets. Once the appropriate data is obtained, it is analyzed, and this leads to knowledge [14]. Multiple approaches are used to associate the given data and to derive meaningful, useful knowledge for fixing or repairing it [12]. Maintaining this polished quality of data is referred to as data polishing. The objectives of data polishing are usually not properly defined; this paper discusses the goals of data cleaning and different approaches for data cleaning platforms.
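As a generic illustration of the repairing and association steps described above (not the paper's platform), here is a minimal pandas sketch; the column names, records, and rules are hypothetical.

    # Repairing fixes faults in records; association links records that
    # refer to the same real-world object so they can be deduplicated.
    import pandas as pd

    df = pd.DataFrame({
        "name": ["Acme Corp", "ACME corp.", "Beta LLC", None],
        "revenue": [1200.0, 1200.0, -50.0, 300.0],
    })

    # Repairing: detect and fix faults (missing values, out-of-range entries).
    df["name"] = df["name"].fillna("unknown")
    df.loc[df["revenue"] < 0, "revenue"] = float("nan")  # negative revenue is a fault

    # Association: normalize a key so homogeneous objects match, then deduplicate.
    df["name_key"] = df["name"].str.lower().str.replace(r"[^a-z0-9]", "", regex=True)
    print(df.drop_duplicates(subset=["name_key", "revenue"]))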


Author(s): Arnelle Etienne, Tarana Laroia, Harper Weigle, Amber Afelin, Shawn K Kelly, et al.

Abstract: EEG is a powerful and affordable brain sensing and imaging tool used extensively for the diagnosis of neurological disorders (e.g. epilepsy), brain-computer interfacing, and basic neuroscience. Unfortunately, most EEG electrodes and systems are not designed to accommodate the coarse and curly hair common in individuals of African descent. This can lead to poor-quality data that might be discarded in scientific studies recording from a broader population and, in clinical settings, to an uncomfortable and/or emotionally taxing experience and, in the worst cases, misdiagnosis. In this work, we design a system that explicitly accommodates coarse and curly hair, and demonstrate that, across time, our electrodes, in conjunction with appropriate braiding, attain substantially (~10x) lower impedance than state-of-the-art systems. This builds on our prior work, which demonstrated that braiding hair in patterns consistent with the clinical standard 10-20 arrangement improves impedance with existing systems.


2020 ◽ Vol 17 (1) ◽ pp. 253-269 ◽ Author(s): El Alaoui, El Fazziki, Fatima Ennaji, Mohamed Sadgal

The ubiquity of mobile devices and their advanced features have increased the use of crowdsourcing in many areas, such as mobility in smart cities. With the advent of high-quality sensors on smartphones, online communities can easily collect and share information. This information is of great importance to institutions, which must analyze the facts, for example by facilitating the collection of data on crimes and criminals. This paper proposes an approach to developing a crowdsensing framework that allows wider collaboration between citizens and the authorities. The framework takes advantage of an objectivity analysis to ensure the participants' credibility and the information's reliability, as law enforcement is often affected by unreliable and poor-quality data. In addition, the proposed framework ensures the protection of users' private data through a de-identification process. Experimental results show that the proposed framework is an interesting tool for improving the quality of crowdsensing information in a government context.
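As a minimal sketch of the kind of de-identification step the framework describes, the code below replaces a reporter's identifier with a salted hash and drops direct identifiers before sharing; all field names and the salting scheme are illustrative assumptions, not the paper's design.

    # Illustrative de-identification: pseudonymize the reporter and strip
    # direct identifiers while keeping the report's content usable.
    import hashlib

    def deidentify(report, salt):
        """Return a copy of `report` with the user ID replaced by a salted
        hash and direct identifiers removed."""
        cleaned = dict(report)
        user_id = cleaned.pop("user_id")
        cleaned["reporter_token"] = hashlib.sha256((salt + user_id).encode()).hexdigest()
        for field in ("name", "phone", "email"):  # direct identifiers
            cleaned.pop(field, None)
        return cleaned

    report = {"user_id": "u42", "name": "Jane Doe",
              "text": "Incident at 5th Ave", "geo": (33.58, -7.62)}
    print(deidentify(report, salt="per-deployment-secret"))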


2020 ◽ Vol 182 ◽ pp. 127-134 ◽ Author(s): Zhonghyun Kim, Heewon Jeong, Sora Shin, Jinho Jung, Joon Ha Kim, et al.
