No Silver Bullet

Author(s):  
Nan Zhang ◽  
Liam O’Neill ◽  
Gautam Das ◽  
Xiuzhen Cheng ◽  
Heng Huang

In accordance with HIPAA regulations, patients’ personal information is typically removed or generalized prior to being released as public data files. However, it is not known if the standard method of de-identification is sufficient to prevent re-identification by an intruder. The authors conducted analytical processing to identify security vulnerabilities in the protocols to de-identify hospital data. Their techniques for discovering privacy leakage utilized three disclosure channels: (1) data inter-dependency, (2) biomedical domain knowledge, and (3) suppression algorithms and partial suppression results. One state’s inpatient discharge data set was used to represent the current practice of de-identification of health care data, where a systematic approach had been employed to suppress certain elements of the patient’s record. Of the 1,098 records for which the hospital ID was suppressed, the original hospital ID was recovered for 616 records, leading to a nullification rate of 56.1%. Utilizing domain knowledge based on the patient’s Diagnosis Related Group (DRG) code, the authors recovered the real age of 64 patients, the gender of 83 male patients and 713 female patients. They also successfully identified the ZIP code of 1,219 patients. The procedure used to de-identify hospital records was found to be inadequate to prevent disclosure of patient information. As the masking procedure described was found to be reversible, this increases the risk that an intruder could use this information to re-identify individual patients.
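The abstract's second disclosure channel (domain knowledge) and its nullification-rate figure can be illustrated with a small sketch. This is not the authors' procedure: the DRG labels, the `SEX_SPECIFIC_DRG` table, and the record layout are hypothetical placeholders, chosen only to show how a suppressed field might be inferred from a field left in the clear and how the rate is computed (616 of 1,098 gives 56.1%).

```python
from typing import Optional

# Hypothetical lookup: DRG labels that can only apply to one sex.
# These are placeholder labels, not real DRG codes.
SEX_SPECIFIC_DRG = {"DRG_OBSTETRIC": "F", "DRG_PROSTATE": "M"}


def infer_sex(record: dict) -> Optional[str]:
    """Return the released sex, or infer it from the DRG label if suppressed."""
    if record.get("sex") is not None:
        return record["sex"]
    # Suppressed value: fall back to domain knowledge; None if nothing applies.
    return SEX_SPECIFIC_DRG.get(record.get("drg"))


def nullification_rate(recovered: int, suppressed: int) -> float:
    """Percentage of suppressed values that were recovered."""
    if suppressed <= 0:
        raise ValueError("suppressed must be positive")
    return 100.0 * recovered / suppressed


# Made-up records for illustration only; 'sex' is suppressed in all three.
records = [
    {"id": 1, "sex": None, "drg": "DRG_OBSTETRIC"},
    {"id": 2, "sex": None, "drg": "DRG_GENERIC"},
    {"id": 3, "sex": None, "drg": "DRG_PROSTATE"},
]
recovered = sum(1 for r in records if infer_sex(r) is not None)
print(f"{nullification_rate(recovered, len(records)):.1f}% of suppressed values recovered")

# Hospital ID figures reported in the abstract: 616 recovered of 1,098 suppressed.
print(f"{nullification_rate(616, 1098):.1f}%")
```

The point of the sketch is that suppressing one field offers little protection when another released field determines it; any real assessment would need the actual DRG definitions and the state's suppression rules.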

Thorax ◽  
2017 ◽  
Vol 73 (4) ◽  
pp. 339-349 ◽  
Author(s):  
Margreet Lüchtenborg ◽  
Eva J A Morris ◽  
Daniela Tataru ◽  
Victoria H Coupland ◽  
Andrew Smith ◽  
...  

Introduction: The International Cancer Benchmarking Partnership (ICBP) identified significant international differences in lung cancer survival. Differing levels of comorbid disease across ICBP countries have been suggested as a potential explanation of this variation but, to date, no studies have quantified their impact. This study investigated whether comparable, robust comorbidity scores can be derived from the different routine population-based cancer data sets available in the ICBP jurisdictions and, if so, whether they can be used to quantify international variation in comorbidity and determine its influence on outcome. Methods: Linked population-based lung cancer registry and hospital discharge data sets were acquired from nine ICBP jurisdictions in Australia, Canada, Norway and the UK, providing a study population of 233 981 individuals. For each person in this cohort, Charlson, Elixhauser and inpatient bed day Comorbidity Scores were derived relating to the 4–36 months prior to their lung cancer diagnosis. The scores were then compared to assess their validity and feasibility of use in international survival comparisons. Results: It was feasible to generate the three comorbidity scores for each jurisdiction, which were found to have good content, face and concurrent validity. Predictive validity was limited and there was evidence that the reliability was questionable. Conclusion: The results presented here indicate that interjurisdictional comparability of recorded comorbidity was limited due to probable differences in coding and hospital admission practices in each area. Before the contribution of comorbidity to international differences in cancer survival can be investigated, an internationally harmonised comorbidity index is required.


1989 ◽  
Vol 19 (2) ◽  
pp. 56-62 ◽  
Author(s):  
Don Hindle ◽  
Angela Cook ◽  
John Pilla

The content of discharge abstracts (or ‘morbidity statistics forms’ as they are popularly known in Australia) is determined by perceived needs for information about inpatients. It should be sensitive to changes in those needs. The emergence of interest in diagnosis related group (DRG) data has had an impact on discharge abstracting. However, revisions have been minor because the DRG system was designed to make use of the standard discharge data set in the United States. In Australia, it has been necessary only to adjust practices to bring them more or less in line with the American standard. In the near future, however, it is likely that more significant changes will be needed. In this paper the authors discuss one new area of interest concerning measurement of resource use by DRG. They suggest that it will lead to the addition of new fields on the discharge abstract and to major changes in the way that discharges are defined. (AMRJ (1989). 19(2), 56–62).


2021 ◽  
Vol 20 ◽  
pp. 117693512110398
Author(s):  
Dafne Alejandra Reyes ◽  
Victor Manuel Saure Sarría ◽  
Marcela Salazar-Viedma ◽  
Vívian D’Afonseca

Gastric cancer (GC) is one of the most frequent tumors in the world. Stomach adenocarcinoma is a heterogeneous tumor, which makes prognosis prediction and patients’ clinical management difficult. Some diagnostic tests for GC are being developed using knowledge of polymorphisms, somatic copy number alteration (SCNA) and aberrant histone methylation. This last event, a posttranslational modification that occurs at the chromatin level, is an important epigenetic alteration seen in several tumors, including stomach adenocarcinoma. Histone methyltransferases (HMTs) are the proteins responsible for methylating specific amino acid residues of histone tails. Here, we present several HMTs that could be related to the GC process. We used public data from 440 patients with stomach adenocarcinoma. We evaluated alterations such as SCNAs, mutations, and gene expression levels of HMTs in these samples. As a result, we identified the 10 most altered HMTs (up to 30%) in stomach adenocarcinoma samples, which are the PRDM14, PRDM9, SUV39H2, NSD2, SMYD5, SETDB1, PRDM12, SUV39H1, NSD3, and EHMT2 genes. The PRDM9 gene is among the most mutated and amplified HMTs within the data set studied. PRDM14 is downregulated in 79% of the samples, and the SUV39H2 gene is downregulated in patients with recurrent/progressed disease. Several HMTs are altered in many cancers. It is important to generate a genetic atlas of alterations of cancer-related genes to improve the understanding of tumorigenesis events and to propose novel tools of diagnosis and prognosis for cancer control.


Author(s):  
Murat Dikmen ◽  
Catherine Burns

This work explores the application of Cognitive Work Analysis (CWA) in the context of Explainable Artificial Intelligence (XAI). We built an AI system using a loan evaluation data set and applied an XAI technique to obtain data-driven explanations for predictions. Using an Abstraction Hierarchy (AH), we generated domain knowledge-based explanations to accompany data-driven explanations. An online experiment was conducted to test the usefulness of AH-based explanations. Participants read financial profiles of loan applicants, the AI system’s loan approval/rejection decisions, and explanations that justify the decisions. Presence or absence of AH-based explanations was manipulated, and participants’ perceptions of the explanation quality were measured. The results showed that providing AH-based explanations helped participants learn about the loan evaluation process and improved the perceived quality of explanations. We conclude that a CWA approach can increase understandability when explaining the decisions made by AI systems.


2021 ◽  
pp. 1-21
Author(s):  
Quan-Hoang Vuong ◽  
Viet-Phuong La ◽  
Manh-Toan Ho ◽  
Thanh-Hang Pham ◽  
Thu-Trang Vuong ◽  
...  

Abstract: Science, technology, engineering, and mathematics (STEM) education has become a critical factor in promoting sustainable development. Meanwhile, book reading is still an essential method for cognitive development and knowledge acquisition. In developing countries where STEM teaching and learning resources are limited, book reading is an important educational tool to promote STEM. Nevertheless, public data sets about STEM education and book reading behaviors in emerging countries are scarce. This article, therefore, aims to present a data set of 4,966 secondary school students from a school-based data collection in Vietnam. The data set comprises five major categories: 1) students' personal information (including STEM performance), 2) family-related information, 3) book reading preferences, 4) book reading frequency/habits, and 5) classroom activities. By introducing the design principles, the data collection method, and the variables in the data set, we aim to provide researchers, policymakers, and educators with well-validated resources and guidelines to conduct low-cost research and pedagogical programs in emerging countries.


1996 ◽  
Vol 35 (01) ◽  
pp. 41-51 ◽  
Author(s):  
F. Molino ◽  
D. Furia ◽  
F. Bar ◽  
S. Battista ◽  
N. Cappello ◽  
...  

Abstract: The study reported in this paper is aimed at evaluating the effectiveness of a knowledge-based expert system (ICTERUS) in diagnosing jaundiced patients, compared with a statistical system based on probabilistic concepts (TRIAL). The performances of both systems have been evaluated using the same set of data in the same number of patients. Both systems are spin-off products of the European project Euricterus, an EC-COMACBME Project designed to document the occurrence and diagnostic value of clinical findings in the clinical presentation of jaundice in Europe, and have been developed as decision-making tools for the identification of the cause of jaundice based only on clinical information and routine investigations. Two groups of jaundiced patients were studied, including 500 (retrospective sample) and 100 (prospective sample) subjects, respectively. All patients were independently submitted to both decision-support tools. The input of both systems was the data set agreed within the Euricterus Project. The performances of both systems were evaluated with respect to the reference diagnoses provided by experts on the basis of the full clinical documentation. Results indicate that both systems are clinically reliable, although the diagnostic prediction provided by the knowledge-based approach is slightly better.


2021 ◽  
pp. postgradmedj-2020-139361
Author(s):  
María Matesanz-Fernández ◽  
Teresa Seoane-Pillado ◽  
Iria Iñiguez-Vázquez ◽  
Roi Suárez-Gil ◽  
Sonia Pértega-Díaz ◽  
...  

Objective: We aim to identify patterns of disease clusters among inpatients of a general hospital and to describe the characteristics and evolution of each group. Methods: We used two data sets (hospitalisations and patients) from the CMBD (Conjunto mínimo básico de datos - Minimum Basic Hospital Data Set (MBDS)) of the Lucus Augusti Hospital (Spain) to carry out a retrospective cohort study among the 74 220 patients discharged from the Medic Area between 01 January 2000 and 31 December 2015. We created multimorbidity clusters using multiple correspondence analysis. Results: We identified five clusters for both gender and age. Cluster 1: alcoholic liver disease, alcoholic dependency syndrome, lung and digestive tract malignant neoplasms (age under 50 years). Cluster 2: large intestine, prostate, breast and other malignant neoplasms, lymphoma and myeloma (age over 70, mostly males). Cluster 3: malnutrition, Parkinson disease and other mobility disorders, dementia and other mental health conditions (age over 80 years and mostly women). Cluster 4: atrial fibrillation/flutter, cardiac failure, chronic kidney failure and heart valve disease (age between 70 and 80 and mostly women). Cluster 5: hypertension/hypertensive heart disease, type 2 diabetes mellitus, ischaemic cardiomyopathy, dyslipidaemia, obesity and sleep apnea, including mostly men (age range 60–80). We found significant differences among the clusters in gender, age, number of chronic pathologies, number of rehospitalisations and mortality during the hospitalisation (p<0001 in all cases). Conclusions: We identify for the first time in a hospital environment five clusters of disease combinations among the inpatients. These clusters contain several high-incidence diseases related to both age and gender that express their own evolution and clinical characteristics over time.


Author(s):  
Sebastian Hoppe Nesgaard Jensen ◽  
Mads Emil Brix Doest ◽  
Henrik Aanæs ◽  
Alessio Del Bue

Abstract: Non-rigid structure from motion (nrsfm) is a long-standing and central problem in computer vision, and its solution is necessary for obtaining 3D information from multiple images when the scene is dynamic. A main obstacle to the further development of this important computer vision topic is the lack of high-quality data sets. We address this issue by presenting a data set created for this purpose, which is made publicly available and is considerably larger than the previous state of the art. To validate the applicability of this data set, and to provide an investigation into the state of the art of nrsfm, including potential directions forward, we present a benchmark and a scrupulous evaluation using this data set. This benchmark evaluates 18 different methods with available code that reasonably span the state of the art in sparse nrsfm. This new public data set and evaluation protocol will provide benchmark tools for further development in this challenging field.

