Locational privacy-preserving distance computations with intersecting sets of randomly labeled grid points

Background We introduce and study a recently proposed method for privacy-preserving distance computations which has received little attention in the scientific literature so far. The method, which is based on intersecting sets of randomly labeled grid points and is henceforth denoted ISGP, allows calculating approximate distances between masked spatial data. Coordinates are replaced by sets of hash values. The method allows the computation of distances between locations $L$ when the locations at different points in time $t$ are not known simultaneously. The distance between $L_1$ and $L_2$ can be computed even when $L_2$ does not exist at $t_1$ and $L_1$ has been deleted at $t_2$. An example would be patients from a medical data set and the locations of later hospitalizations. ISGP is a new tool for privacy-preserving handling of geo-referenced data sets in general. Furthermore, the technique can be used to include geographical identifiers as additional information for privacy-preserving record linkage. To show that the technique can be implemented in most high-level programming languages with a few lines of code, a complete implementation in the statistical programming language R is given. The properties of the method are explored using simulations based on large-scale real-world data of hospitals ($n=850$) and residential locations ($n=13{,}000$). The method has already been used in a real-world application. Results ISGP yields very accurate results. Our simulation study showed that, with appropriately chosen parameters, 99% accuracy in the approximated distances is achieved. Conclusion We discussed a new method for privacy-preserving distance computations in microdata. The method is highly accurate, fast, has a low computational burden, and does not require excessive storage.
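
The abstract does not spell out the algorithm, so the following Python sketch is only one plausible reading of the idea, not the authors' implementation (which, according to the abstract, is given in R). It assumes that each location is replaced by the keyed hashes of all grid points within a fixed radius, and that the number of shared hashes approximates the overlap area of two equal circles, which can then be inverted to a distance. The function names, the HMAC-SHA256 labeling, the square grid, and all parameter values are assumptions made for illustration.

```python
import hashlib
import hmac
import math


def masked_points(x, y, radius, spacing, key):
    """Return keyed hashes of all grid points within `radius` of (x, y).

    Only the hash labels leave this function; the coordinates do not.
    """
    labels = set()
    i_min = math.floor((x - radius) / spacing)
    i_max = math.ceil((x + radius) / spacing)
    j_min = math.floor((y - radius) / spacing)
    j_max = math.ceil((y + radius) / spacing)
    for i in range(i_min, i_max + 1):
        for j in range(j_min, j_max + 1):
            gx, gy = i * spacing, j * spacing
            if (gx - x) ** 2 + (gy - y) ** 2 <= radius ** 2:
                msg = f"{i},{j}".encode()
                labels.add(hmac.new(key, msg, hashlib.sha256).hexdigest())
    return labels


def lens_area(d, r):
    """Overlap area of two circles of radius r whose centers are d apart."""
    if d >= 2 * r:
        return 0.0
    return 2 * r * r * math.acos(d / (2 * r)) - 0.5 * d * math.sqrt(4 * r * r - d * d)


def estimate_distance(set_a, set_b, radius, spacing):
    """Invert the overlap area by bisection (the area decreases with distance).

    A result close to 2 * radius only means "at least 2 * radius apart".
    """
    shared_area = len(set_a & set_b) * spacing ** 2
    lo, hi = 0.0, 2.0 * radius
    for _ in range(60):
        mid = (lo + hi) / 2
        if lens_area(mid, radius) > shared_area:
            lo = mid  # overlap too large at mid, so the true distance is larger
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical example: two locations 600 units apart, masked at different times
# with the same secret key, then compared using only their hash sets.
key = b"shared-secret"
set_1 = masked_points(0.0, 0.0, radius=1000, spacing=10, key=key)
set_2 = masked_points(600.0, 0.0, radius=1000, spacing=10, key=key)
estimate = estimate_distance(set_1, set_2, radius=1000, spacing=10)
print(round(estimate, 1))
```

In this sketch the two hash sets can be created independently, so the second location does not need to exist when the first one is masked. A finer grid spacing should give a closer estimate at the cost of larger sets.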

AMS-Net: An Attention-Based Multi-Scale Network for Classification of 3D Terracotta Warrior Fragments

Remote Sensing ◽

10.3390/rs13183713 ◽

2021 ◽

Vol 13 (18) ◽

pp. 3713

Author(s):

Jie Liu ◽

Xin Cao ◽

Pingchuan Zhang ◽

Xueli Xu ◽

Yangyang Liu ◽

...

Keyword(s):

Real World ◽

Data Sets ◽

Semantic Features ◽

Real World Data ◽

Global Features ◽

Data Set ◽

Multi Scale ◽

Public Data ◽

High Level

Fragment classification is an essential step in the restoration of the Terracotta Warriors, and its results directly affect the performance of fragment matching and splicing. However, most existing methods are based on traditional techniques and have low classification accuracy. A practical and effective classification method for fragments is urgently needed. To this end, an attention-based multi-scale neural network named AMS-Net is proposed to extract significant geometric and semantic features. AMS-Net is a hierarchical structure consisting of a multi-scale set abstraction block (MS-BLOCK) and a fully connected (FC) layer. MS-BLOCK consists of a local-global layer (LGLayer) and an improved multi-layer perceptron (IMLP). With a multi-scale strategy, LGLayer can extract local and global features from different scales in parallel. IMLP can concatenate the high-level and low-level features for classification tasks. Extensive experiments are conducted on the public data set (ModelNet40/10) and the real-world Terracotta Warrior fragments data set. With normals as input, the accuracy reaches 93.52% and 96.22%, respectively. For the real-world data sets, the accuracy is the best among existing methods. The robustness and effectiveness of the method on the task of 3D point cloud classification are also investigated. The results indicate that the proposed end-to-end learning network is more effective and better suited to the classification of the Terracotta Warrior fragments.

Empirical evaluation of feature subset selection based on a real-world data set

Engineering Applications of Artificial Intelligence ◽

10.1016/j.engappai.2004.03.005 ◽

2004 ◽

Vol 17 (3) ◽

pp. 285-288 ◽

Cited By ~ 5

Author(s):

Petra Perner ◽

Chid Apte

Keyword(s):

Real World ◽

Empirical Evaluation ◽

Subset Selection ◽

Feature Subset Selection ◽

Feature Subset ◽

Real World Data ◽

Data Set ◽

World Data

Multi-Robot SLAM in Dynamic Environments with Parallel Maps

International Journal of Humanoid Robotics ◽

10.1142/s0219843621500110 ◽

2021 ◽

pp. 2150011

Author(s):

Sajad Badalkhani ◽

Ramazan Havangi ◽

Mohsen Farshad

Keyword(s):

Large Scale ◽

Dynamic Environment ◽

Dynamic Environments ◽

Extensive Literature ◽

Real World Data ◽

Data Set ◽

Cooperative Approach ◽

Localization And Mapping ◽

Multi Robot

There is an extensive literature regarding multi-robot simultaneous localization and mapping (MRSLAM). In most of this research, the environment is assumed to be static, while the dynamic parts of the environment degrade the estimation quality of SLAM algorithms and lead to inherently fragile systems. To enhance the performance and robustness of SLAM in dynamic environments (SLAMIDE), a novel cooperative approach named parallel-map (p-map) SLAM is introduced in this paper. The objective of the proposed method is to deal with the dynamics of the environment by detecting dynamic parts and preventing their inclusion in SLAM estimations. In this approach, each robot builds a limited map in its own vicinity, while the global map is built through a hybrid centralized MRSLAM. The restricted size of the local maps bounds the computational complexity and resources needed to handle a large-scale dynamic environment. Using a probabilistic index, the proposed method differentiates between stationary and moving landmarks based on their positions relative to other parts of the environment. Stationary landmarks are then used to refine a consistent map. The proposed method is evaluated with different levels of dynamism, and for each level the performance is measured in terms of accuracy, robustness, and the hardware resources needed for implementation. The method is also evaluated with a publicly available real-world data set. Experimental validation along with simulations indicates that the proposed method is able to perform consistent SLAM in a dynamic environment, suggesting its feasibility for MRSLAM applications.

Network Data Characteristics

Statistical Techniques for Network Security ◽

10.4018/978-1-59904-708-9.ch004 ◽

2011 ◽

pp. 104-122

Author(s):

Yu Wang

Keyword(s):

Real World ◽

Random Variables ◽

Network Data ◽

Traffic Data ◽

Real World Data ◽

Additional Information ◽

Data Points ◽

Key Features ◽

Basic Concepts ◽

Data Elements

Data represents the natural phenomena of our real world. Data is organized in rows and columns; usually rows represent the observations and columns represent the variables. Observations, also called subjects, records, or data points, represent a phenomenon in the real world, and variables, also known as data elements or data fields, represent the characteristics of observations in data. Variables take different values for different observations, which can make observations independent of each other. Figure 4.1 illustrates a section of TCP/IP traffic data, in which the rows are individual network traffic records and the columns, separated by a space, are characteristics of the traffic. In this example, the first column is a session index of each connection and the second column is the date when the connection occurred. In this chapter, we will discuss some fundamental key features of variables and network data. We will present detailed discussions on variable characteristics and distributions in Sections Random Variables and Variables Distributions, and describe network data modules in Section Network Data Modules. The material covered in this chapter will help readers who do not have a solid background in this area gain an understanding of the basic concepts of variables and data. Additional information can be found in Introduction to the Practice of Statistics by Moore and McCabe (1998).

Characterizing the Feasibility and Performance of Real-World Tumor Progression End Points and Their Association With Overall Survival in a Large Advanced Non–Small-Cell Lung Cancer Data Set

JCO Clinical Cancer Informatics ◽

10.1200/cci.19.00013 ◽

2019 ◽

pp. 1-13 ◽

Cited By ~ 14

Author(s):

Sandra D. Griffith ◽

Rebecca A. Miksad ◽

Geoff Calkins ◽

Paul You ◽

Nicole G. Lipitz ◽

...

Keyword(s):

Lung Cancer ◽

Small Cell Lung Cancer ◽

Cell Lung Cancer ◽

Real World ◽

Large Scale ◽

Progression Free Survival ◽

Small Cell ◽

Small Cell Lung ◽

Data Set ◽

End Points

PURPOSE Large, generalizable real-world data can enhance traditional clinical trial results. The current study evaluates reliability, clinical relevance, and large-scale feasibility for a previously documented method with which to characterize cancer progression outcomes in advanced non–small-cell lung cancer from electronic health record (EHR) data. METHODS Patients who were diagnosed with advanced non–small-cell lung cancer between January 1, 2011, and February 28, 2018, with two or more EHR-documented visits and one or more systemic therapy line initiated were identified in Flatiron Health’s longitudinal EHR-derived database. After institutional review board approval, we retrospectively characterized real-world progression (rwP) dates, with a random duplicate sample to ascertain interabstractor agreement. We calculated real-world progression-free survival, real-world time to progression, real-world time to next treatment, and overall survival (OS) using the Kaplan-Meier method (index date was the date of first-line therapy initiation), and correlations between OS and other end points were assessed at the patient level (Spearman’s ρ). RESULTS Of 30,276 eligible patients, 16,606 (55%) had one or more rwP event. Of these patients, 11,366 (68%) had subsequent death, treatment discontinuation, or new treatment initiation. Correlation of real-world progression-free survival with OS was moderate to high (Spearman’s ρ, 0.76; 95% CI, 0.75 to 0.77; evaluable patients, n = 20,020), and for real-world time to progression correlation with OS was lower (Spearman’s ρ, 0.69; 95% CI, 0.68 to 0.70; evaluable patients, n = 11,902). Interabstractor agreement on rwP occurrence was 0.94 (duplicate sample, n = 1,065) and on rwP date 0.85 (95% CI, 0.81 to 0.89; evaluable patients n = 358 [patients with two independent event captures within 30 days]). Median rwP abstraction time from individual EHRs was 18.0 minutes (interquartile range, 9.7 to 34.4 minutes). CONCLUSION We demonstrated that rwP-based end points correlate with OS, and that rwP curation from a large, contemporary EHR data set can be reliable, clinically relevant, and feasible on a large scale.

Consensus Development of a Modern Ontology of Emergency Department Presenting Problems—The Hierarchical Presenting Problem Ontology (HaPPy)

Applied Clinical Informatics ◽

10.1055/s-0039-1691842 ◽

2019 ◽

Vol 10 (03) ◽

pp. 409-420 ◽

Cited By ~ 3

Author(s):

Steven Horng ◽

Nathaniel R. Greenbaum ◽

Larry A. Nathanson ◽

James C. McClay ◽

Foster R. Goss ◽

...

Keyword(s):

Emergency Department ◽

Real World ◽

Emergency Department Patient ◽

Snomed Ct ◽

Presenting Problems ◽

Real World Data ◽

Validation Data ◽

Data Set ◽

World Data ◽

Academic Level

Objective Numerous attempts have been made to create a standardized “presenting problem” or “chief complaint” list to characterize the nature of an emergency department visit. Previous attempts have failed to gain widespread adoption as they were not freely shareable or did not contain the right level of specificity, structure, and clinical relevance to gain acceptance by the larger emergency medicine community. Using real-world data, we constructed a presenting problem list that addresses these challenges. Materials and Methods We prospectively captured the presenting problems for 180,424 consecutive emergency department patient visits at an urban, academic, Level I trauma center in the Boston metro area. No patients were excluded. We used a consensus process to iteratively derive our system using real-world data. We used the first 70% of consecutive visits to derive our ontology, followed by a 6-month washout period, and the remaining 30% for validation. All concepts were mapped to Systematized Nomenclature of Medicine–Clinical Terms (SNOMED CT). Results Our system consists of a polyhierarchical ontology containing 692 unique concepts, 2,118 synonyms, and 30,613 nonvisible descriptions to correct misspellings and nonstandard terminology. Our ontology successfully captured structured data for 95.9% of visits in our validation data set. Discussion and Conclusion We present the HierArchical Presenting Problem ontologY (HaPPy). This ontology was empirically derived and then iteratively validated by an expert consensus panel. HaPPy contains 692 presenting problem concepts, each concept being mapped to SNOMED CT. This freely sharable ontology can help to facilitate presenting problem-based quality metrics, research, and patient care.

Structure Identification-Based Clustering According to Density Consistency

Mathematical Problems in Engineering ◽

10.1155/2011/890901 ◽

2011 ◽

Vol 2011 ◽

pp. 1-14 ◽

Cited By ~ 1

Author(s):

Chunzhong Li ◽

Zongben Xu

Keyword(s):

High Dimension ◽

Real World ◽

Clustering Algorithm ◽

Density Difference ◽

Structure Identification ◽

Data Sets ◽

Critical Importance ◽

Real World Data ◽

Data Set ◽

High Dimension Data

The structure of a data set, especially the density difference feature, is of critical importance in identifying clusters. In this paper, we present a clustering algorithm based on density consistency, a filtering process that identifies points with the same structure feature and assigns them to the same cluster. The method is not restricted by cluster shape or by high-dimensional data, and it is robust to noise and outliers. Extensive experiments on synthetic and real-world data sets validate the proposed clustering algorithm.

Recovery and Resilience After a Nuclear Power Plant Disaster: A Medical Decision Model for Managing an Effective, Timely, and Balanced Response

Disaster Medicine and Public Health Preparedness ◽

10.1017/dmp.2013.5 ◽

2013 ◽

Vol 7 (2) ◽

pp. 136-145 ◽

Cited By ~ 6

Author(s):

C. Norman Coleman ◽

Daniel J. Blumenthal ◽

Charles A. Casto ◽

Michael Alfant ◽

Steven L. Simon ◽

...

Keyword(s):

Power Plant ◽

Nuclear Power Plant ◽

Nuclear Power ◽

Decision Model ◽

Large Scale ◽

Medical Decision ◽

Incident Management ◽

Additional Information ◽

Medical Issues ◽

High Level

Resilience after a nuclear power plant or other radiation emergency requires response and recovery activities that are appropriately safe, timely, effective, and well organized. Timely informed decisions must be made, and the logic behind them communicated during the evolution of the incident before the final outcome is known. Based on our experiences in Tokyo responding to the Fukushima Daiichi nuclear power plant crisis, we propose a real-time, medical decision model by which to make key health-related decisions that are central drivers to the overall incident management. Using this approach, on-site decision makers empowered to make interim decisions can act without undue delay using readily available and high-level scientific, medical, communication, and policy expertise. Ongoing assessment, consultation, and adaption to the changing conditions and additional information are additional key features. Given the central role of health and medical issues in all disasters, we propose that this medical decision model, which is compatible with the existing US National Response Framework structure, be considered for effective management of complex, large-scale, and large-consequence incidents. (Disaster Med Public Health Preparedness. 2012;0:1-10)

A Real-time Dynamic Simulation Scheme for Large-Scale Flood Hazard Using 3D Real World Data

2007 11th International Conference Information Visualization (IV '07) ◽

10.1109/iv.2007.15 ◽

2007 ◽

Cited By ~ 6

Author(s):

C Wang ◽

T. R. Wan ◽

I. J. Palmer

Keyword(s):

Real Time ◽

Dynamic Simulation ◽

Real World ◽

Large Scale ◽

Flood Hazard ◽

Real World Data ◽

World Data ◽

Time Dynamic ◽

Simulation Scheme

Proton Pump Inhibitors and Risk of Dementia: A Hypothesis Generated but Not Adequately Tested

American Journal of Alzheimer's Disease & Other Dementias® ◽

10.1177/15333175211062413 ◽

2021 ◽

Vol 36 ◽

pp. 153331752110624

Author(s):

Mishah Azhar ◽

Lawrence Fiedler ◽

Patricio S. Espinosa ◽

Charles H. Hennekens

Keyword(s):

Proton Pump Inhibitors ◽

Proton Pump ◽

Real World ◽

Large Scale ◽

Basic Research ◽

The United States ◽

Epidemiological Studies ◽

Real World Data ◽

Health Authorities ◽

Public Health Authorities

We reviewed the evidence on proton pump inhibitors (PPIs) and dementia. PPIs are among the most widely utilized drugs in the world. Dementia affects roughly 5% of the population of the United States (US) and world aged 60 years and older. With respect to PPIs and dementia, basic research has suggested plausible mechanisms but descriptive and analytic epidemiological studies are not inconsistent. In addition, a single large-scale randomized trial showed no association. When the evidence is incomplete, it is appropriate for clinicians and researchers to remain uncertain. Regulatory or public health authorities sometimes need to make real-world decisions based on real-world data. When the evidence is complete, then the most rational judgments for individual patients and the health of the general public are possible. At present, the evidence on PPIs and dementia suggests more reassurance than alarm. Further large-scale randomized evidence is necessary to do so.
