Pattern Mining for Outbreak Discovery Preparedness

2012 ◽

pp. 125-137

Author(s):

Zalizah Awang Long ◽

Abdul Razak Hamdan ◽

Azuraliza Abu Bakar ◽

Mazrura Sahani

Keyword(s):

False Alarm ◽

Process Model ◽

Pattern Mining ◽

Moving Average ◽

False Positive Rate ◽

Outbreak Detection ◽

Positive Rate ◽

Knowledge Rules ◽

Overall Performance ◽

Public Health Surveillance System

Today, the objective of a public health surveillance system is to reduce the impact of outbreaks by enabling appropriate intervention. Commonly used techniques detect an outbreak from changes or aberrations in health events relative to the normal history. The main problem encountered in outbreak detection is the high rate of false alarms. High false alarm rates can lead to unnecessary interventions, and falsely detected outbreaks lead to costly investigations. In this chapter, the authors review data mining techniques, focusing on frequent and outlier mining, to develop a generic outbreak detection process model named the “Frequent-outlier” model. The process model was tested against a real dengue dataset obtained from FSK, UKM, and on a synthetic respiratory dataset obtained from AUTON LAB. ROC analysis was used to compare the overall performance of the “Frequent-outlier” model with CUSUM and Moving Average (MA). The results were promising and were evaluated using detection rate, false positive rate, and overall performance. An important outcome of this study is the set of knowledge rules derived from the notification of outbreak cases, to be used in countermeasure assessment for outbreak preparedness.
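The two baseline detectors named in the abstract, Moving Average and CUSUM, can be illustrated with a minimal sketch. This is not the authors' “Frequent-outlier” model. The window length, the multiplier `k`, the CUSUM `slack` and `limit`, the baseline statistics, and the toy daily counts are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def moving_average_alerts(counts, window=7, k=2.0):
    """Flag day t when counts[t] exceeds mean + k*sd of the previous `window` days."""
    alerts = []
    for t in range(window, len(counts)):
        history = counts[t - window:t]
        threshold = mean(history) + k * stdev(history)
        if counts[t] > threshold:
            alerts.append(t)
    return alerts

def cusum_alerts(counts, baseline_mean, baseline_sd, slack=0.5, limit=4.0):
    """One-sided CUSUM on standardized counts; the sum is reset after each alert."""
    alerts, s = [], 0.0
    for t, c in enumerate(counts):
        z = (c - baseline_mean) / baseline_sd
        s = max(0.0, s + z - slack)
        if s > limit:
            alerts.append(t)
            s = 0.0
    return alerts

def rates(alerts, outbreak_days, n_days):
    """Return (detection rate, false positive rate) computed per day."""
    flagged = set(alerts)
    tp = len(flagged & outbreak_days)
    fp = len(flagged - outbreak_days)
    return tp / len(outbreak_days), fp / (n_days - len(outbreak_days))

# Hypothetical daily case counts with an injected outbreak on days 10-12.
counts = [3, 4, 2, 5, 3, 4, 3, 4, 2, 3, 12, 14, 13, 4, 3]
outbreak_days = {10, 11, 12}

ma = moving_average_alerts(counts)
cs = cusum_alerts(counts, baseline_mean=3.3, baseline_sd=1.0)
print("MA alerts:", ma, rates(ma, outbreak_days, len(counts)))
print("CUSUM alerts:", cs, rates(cs, outbreak_days, len(counts)))
```

In this toy series the moving-average rule misses the third outbreak day because the first two outbreak days inflate its own baseline, which is one reason such detectors are usually compared across a range of thresholds, as in an ROC analysis.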

P6334 Daily body weight in patients with chronic heart failure: improved diagnostic value by analysing prolonged time intervals

European Heart Journal ◽

10.1093/eurheartj/ehz746.0931 ◽

2019 ◽

Vol 40 (Supplement_1) ◽

Author(s):

K Gudmundsson ◽

P Lynga ◽

A Langius-Eklof ◽

E Hagglund ◽

A Hagg-Martinell ◽

...

Keyword(s):

Heart Failure ◽

Body Weight ◽

Chronic Heart Failure ◽

Diagnostic Performance ◽

Moving Average ◽

False Positive Rate ◽

Diagnostic Value ◽

Time Interval ◽

Time Intervals ◽

Positive Rate

Abstract Background: Daily body weight (BW) is a mainstay in the management of patients with chronic heart failure (HF). Guidelines recommend taking action if BW increases by more than 2 kg within 3 days. However, the evidence behind the 2kg/3d rule is unclear and studies have shown poor diagnostic performance of this algorithm. Purpose: To assess the diagnostic value of different BW thresholds and time intervals to alert for imminent HF decompensation. Methods: We studied 184 patients with HF (age 71±10 yr, EF 26±11%). 43% had been hospitalized for HF during the preceding year. They were assessed by daily BW using digital scales with direct data transfer to a central database. The mean follow-up was 286 days. To decrease day-to-day variability, BW was analysed based on a daily moving average over 3 days. We retrospectively calculated the sensitivity and false-positive rate of BW thresholds at 1.5, 2.0, 2.5, 3.0 and 3.5 kg and time intervals between 2 and 30 days. Threshold crossings occurring within 30 days prior to a hospitalization for decompensated HF were deemed a positive alert. Results: The sensitivity of 2kg/3d was poor (13%). Prolonging the time interval of weight changes markedly improved sensitivity. Increasing the weight threshold decreased the false positive rate. The greatest sensitivity (60%) was achieved using a 14-day interval at a weight threshold of 1.5 kg. However, this was associated with a high rate of false alerts (3.1 per patient/year). A weight threshold of 3.5 kg resulted in excellent specificity (0.3 false alerts per patient/year); however, sensitivity was low (20%, 20-day time interval). Conclusion: Monitoring daily BW using a 2kg/3d algorithm is associated with poor diagnostic performance. Generally, by analysing stable trends over time (moving average) and using prolonged time intervals, BW monitoring with digital scales can achieve a clinically meaningful diagnostic performance. This new approach to BW monitoring may improve early detection of imminent HF decompensation.
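The threshold-and-interval rule described in the Methods can be sketched as follows. The abstract does not state exactly how a weight change over an interval was computed, so this sketch assumes it is the difference between the 3-day moving average on a given day and the moving average `interval_days` earlier. The function names and the weight series are hypothetical.

```python
def moving_average(weights, window=3):
    """Trailing moving average; the first window-1 days have no value."""
    return [
        sum(weights[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(weights))
    ]

def weight_alert_days(weights, threshold_kg, interval_days, window=3):
    """Days (indices into `weights`) on which the smoothed gain over
    `interval_days` exceeds `threshold_kg`."""
    smoothed = moving_average(weights, window)
    alerts = []
    for i in range(interval_days, len(smoothed)):
        gain = smoothed[i] - smoothed[i - interval_days]
        if gain > threshold_kg:
            alerts.append(i + window - 1)  # map back to the daily series
    return alerts

# Hypothetical patient: stable at 80 kg for 5 days, then gaining 0.2 kg per day.
weights = [80.0] * 5 + [80.0 + 0.2 * d for d in range(1, 15)]

print(weight_alert_days(weights, threshold_kg=2.0, interval_days=3))   # 2kg/3d rule
print(weight_alert_days(weights, threshold_kg=1.5, interval_days=14))  # 1.5kg/14d rule
```

For this slow, steady gain the 2kg/3d rule never fires, while the 1.5 kg over 14 days rule does, which matches the direction of the reported result that longer intervals improve sensitivity.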

Mutual Information Pre-processing Based Broken-stick Linear Regression Technique for Web User Behaviour Pattern Mining

International Journal of Intelligent Engineering and Systems ◽

10.22266/ijies2021.0228.24 ◽

2021 ◽

Vol 14 (1) ◽

pp. 244-256

Author(s):

Gokulapriya Raman ◽

Ganesh Raj

Keyword(s):

Linear Regression ◽

Mutual Information ◽

Pattern Mining ◽

False Positive Rate ◽

Behaviour Pattern ◽

Time Requirement ◽

User Behaviour ◽

Web Log ◽

Log Files ◽

Positive Rate

Web usage behaviour mining is an important research problem because it identifies the behaviour patterns of different users by analysing web log files. However, existing approaches identify users' frequently accessed web patterns with limited accuracy and require more time. The Mutual Information Pre-processing based Broken-Stick Linear Regression (MIP-BSLR) technique is proposed to improve the accuracy of web user behaviour pattern mining. Initially, web log files from the Apache web log dataset and the NASA dataset are taken as input. Then, the Mutual Information based Pre-processing (MI-P) method is applied to compute the mutual dependence between two web patterns. Based on the computed value, relevant web access patterns are retained for further processing and irrelevant patterns are removed. After that, Broken-Stick Linear Regression analysis (BLRA) is performed in MIP-BSLR for web user behaviour analysis. By applying BLRA, the frequently visited web patterns are identified. With these patterns identified, the MIP-BSLR technique accurately predicts the usage behaviour of web users and improves the performance of web usage behaviour mining. The MIP-BSLR method is evaluated experimentally on pattern mining accuracy, false positive rate, time requirement, and space requirement with respect to the number of web patterns. The results show that, on the Apache web log dataset, the proposed technique improves pattern mining accuracy by 14% and reduces the false positive rate by 52%, the time requirement by 19%, and space complexity by 21% compared with conventional methods. Similarly, on the NASA dataset, pattern mining accuracy increases by 16%, while the false positive rate falls by 47%, the time requirement by 20%, and space complexity by 22% compared with conventional methods.

A statistical algorithm for outbreak detection in a multi-site setting: the case of sick leave monitoring

10.1101/2020.09.22.20199406 ◽

2020 ◽

Author(s):

Tom Duchemin ◽

Angela Noufaily ◽

Mounia N. Hocine

Keyword(s):

Sick Leave ◽

Negative Binomial ◽

Management Practice ◽

False Positive Rate ◽

Probability Of Detection ◽

Outbreak Detection ◽

Mixed Effect ◽

Wide Range ◽

Positive Rate ◽

Data Surveillance

Surveillance for infectious disease outbreaks or other processes must sometimes be implemented simultaneously at multiple sites to detect local events. Sick leave can be monitored across companies to detect issues such as local outbreaks and to identify company-related issues such as local spread of infectious diseases or bad management practice. In this context, we propose an adaptation of the Quasi-Poisson regression-based Farrington algorithm for multi-site surveillance. The proposed algorithm consists of a Negative-Binomial mixed effect regression with a new re-weighting procedure to account for past outbreaks and increase the sensitivity of the model. We perform a wide range of simulations to assess the performance of the model in terms of False Positive Rate and Probability of Detection. We propose an application to sick leave rates in the context of COVID-19. The proposed algorithm provides good overall performance and opens up new opportunities for multi-site data surveillance.

INVARIANT DESCRIPTOR LEARNING USING A SIAMESE CONVOLUTIONAL NEURAL NETWORK

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsannals-iii-3-11-2016 ◽

2016 ◽

Vol III-3 ◽

pp. 11-18 ◽

Cited By ~ 1

Author(s):

L. Chen ◽

F. Rottensteiner ◽

C. Heipke

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Moving Average ◽

False Positive Rate ◽

Feature Space ◽

Recall Rate ◽

L2 Norm ◽

Positive Rate ◽

Benchmark Datasets ◽

Input Module

In this paper we describe learning of a descriptor based on the Siamese Convolutional Neural Network (CNN) architecture and evaluate our results on a standard patch comparison dataset. The descriptor learning architecture is composed of an input module, a Siamese CNN descriptor module and a cost computation module that is based on the L2 Norm. The cost function we use pulls the descriptors of matching patches close to each other in feature space while pushing the descriptors for non-matching pairs away from each other. Compared to related work, we optimize the training parameters by combining a moving average strategy for gradients and Nesterov's Accelerated Gradient. Experiments show that our learned descriptor performs well and achieves state-of-the-art results in terms of the false positive rate at a 95% recall rate on standard benchmark datasets.
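The abstract does not give the exact form of the cost function. One common cost with the pull-together, push-apart behaviour it describes is the contrastive loss on the L2 distance between two descriptors, sketched below. The `margin` parameter and the function names are assumptions, and this may differ from the loss the authors actually used.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(desc_a, desc_b, is_match, margin=1.0):
    """Matching pairs are penalized by squared distance (pulled together);
    non-matching pairs are penalized only while closer than `margin`
    (pushed apart)."""
    d = l2_distance(desc_a, desc_b)
    if is_match:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Hypothetical 2-D descriptors for illustration.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], is_match=True))
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], is_match=False))
print(contrastive_loss([0.0, 0.0], [0.6, 0.0], is_match=False))
```

A non-matching pair that is already farther apart than the margin contributes zero loss, so training effort concentrates on pairs that are still confusable.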

Improving diagnosis of acute appendicitis with atypical findings by Tc-99m HMPAO leukocyte scan

Nuklearmedizin ◽

10.1055/s-0038-1623994 ◽

2002 ◽

Vol 41 (01) ◽

pp. 37-41 ◽

Cited By ~ 3

Author(s):

S. Shung-Shung ◽

S. Yu-Chien ◽

Y. Mei-Due ◽

W. Hwei-Chung ◽

A. Kao

Keyword(s):

Acute Appendicitis ◽

False Positive ◽

False Positive Rate ◽

Accurate Method ◽

Clinical Findings ◽

Pathological Findings ◽

Lower Quadrant ◽

Predictive Values ◽

Positive Rate

Summary Aim: Even with careful observation, the overall false-positive rate of laparotomy remains 10-15% when acute appendicitis is suspected. Therefore, the clinical efficacy of the Tc-99m HMPAO labeled leukocyte (TC-WBC) scan for the diagnosis of acute appendicitis in patients presenting with atypical clinical findings was assessed. Patients and Methods: Eighty patients presenting with acute abdominal pain and possible acute appendicitis but atypical findings were included in this study. After intravenous injection of TC-WBC, serial anterior abdominal/pelvic images at 30, 60, 120 and 240 min with 800k counts were obtained with a gamma camera. Any abnormal localization of radioactivity in the right lower quadrant of the abdomen, equal to or greater than bone marrow activity, was considered a positive scan. Results: 36 out of 49 patients showing positive TC-WBC scans underwent appendectomy. They all proved to have positive pathological findings. Five positive TC-WBC scans were due to other pathological lesions rather than acute appendicitis. Eight patients were not operated on, and clinical follow-up after one month revealed no acute abdominal condition. Three of 31 patients with negative TC-WBC scans underwent appendectomy. They also presented positive pathological findings. The remaining 28 patients did not undergo surgery and showed no evidence of appendicitis after at least one month of follow-up. The overall sensitivity, specificity, accuracy, positive and negative predictive values for TC-WBC scan to diagnose acute appendicitis were 92, 78, 86, 82, and 90%, respectively. Conclusion: TC-WBC scan provides a rapid and highly accurate method for the diagnosis of acute appendicitis in patients with equivocal clinical examination. It proved useful in reducing the false-positive rate of laparotomy and in shortening the time necessary for clinical observation.
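The reported performance figures can be approximately reproduced from the counts in the Results under one reading of the abstract, which is an assumption here: the five positive scans attributed to other pathological lesions are excluded, and the eight positive scans without surgery count as false positives. That gives 36 true positives, 8 false positives, 3 false negatives and 28 true negatives.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic accuracy measures, as fractions."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Counts under the assumed reading of the abstract (75 of the 80 patients).
metrics = diagnostic_metrics(tp=36, fp=8, fn=3, tn=28)
percent = {name: round(100 * value, 1) for name, value in metrics.items()}
print(percent)
```

Under this assumption the sketch gives 92.3, 77.8, 85.3, 81.8 and 90.3%, close to the reported 92, 78, 86, 82 and 90%. Only the accuracy differs, by about one point, so the authors' exact tabulation may differ slightly.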

Predicting Fetal Chromosome Anomalies in the First Trimester Using Pregnancy Associated Plasma Protein-A: A Comparison of Statistical Methods

Methods of Information in Medicine ◽

10.1055/s-0038-1634910 ◽

1993 ◽

Vol 32 (02) ◽

pp. 175-179 ◽

Cited By ~ 7

Author(s):

B. Brambati ◽

T. Chard ◽

J. G. Grudzinskas ◽

M. C. M. Macintosh

Keyword(s):

Logistic Regression ◽

General Population ◽

Likelihood Ratio ◽

False Positive ◽

False Positive Rate ◽

Ratio Method ◽

Detection Rates ◽

Gaussian Distributions ◽

Positive Rate ◽

Likelihood Ratio Method

Abstract: The analysis of the clinical efficiency of a biochemical parameter in the prediction of chromosome anomalies is described, using a database of 475 cases including 30 abnormalities. A comparison was made of two different approaches to the statistical analysis: the use of Gaussian frequency distributions and likelihood ratios, and logistic regression. Both methods computed that for a 5% false-positive rate approximately 60% of anomalies are detected on the basis of maternal age and serum PAPP-A. The logistic regression analysis is appropriate where the outcome variable (chromosome anomaly) is binary and the detection rates refer to the original data only. The likelihood ratio method is used to predict the outcome in the general population. The latter method depends on the data or some transformation of the data fitting a known frequency distribution (Gaussian in this case). The precision of the predicted detection rates is limited by the small sample of abnormals (30 cases). Varying the means and standard deviations (to the limits of their 95% confidence intervals) of the fitted log Gaussian distributions resulted in a detection rate varying between 42% and 79% for a 5% false-positive rate. Thus, although the likelihood ratio method is potentially the better method in determining the usefulness of a test in the general population, larger numbers of abnormal cases are required to stabilise the means and standard deviations of the fitted log Gaussian distributions.
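How a detection rate at a fixed false-positive rate follows from fitted Gaussian distributions can be shown with a simplified single-marker sketch. It omits maternal age and assumes equal standard deviations in both groups, so that thresholding the marker is equivalent to thresholding the likelihood ratio. It also assumes that lower marker values indicate higher risk. The means and standard deviations are hypothetical and are not the values fitted in the paper.

```python
from statistics import NormalDist

def detection_rate_at_fpr(unaffected, affected, fpr=0.05):
    """Choose the cutoff that flags the lowest `fpr` fraction of unaffected
    cases, then return the fraction of affected cases below that cutoff."""
    cutoff = unaffected.inv_cdf(fpr)
    return affected.cdf(cutoff)

# Hypothetical log-transformed marker distributions (not the paper's values).
unaffected = NormalDist(mu=0.0, sigma=0.25)
affected = NormalDist(mu=-0.45, sigma=0.25)

rate = detection_rate_at_fpr(unaffected, affected)
print(f"Detection rate at 5% false-positive rate: {rate:.1%}")

# Shifting the affected mean shows how sensitive the estimate is to the fit.
for mu in (-0.35, -0.45, -0.55):
    shifted = NormalDist(mu=mu, sigma=0.25)
    print(mu, round(detection_rate_at_fpr(unaffected, shifted), 3))
```

The loop illustrates the point made in the abstract: with few abnormal cases the fitted mean is uncertain, and modest changes in it move the predicted detection rate substantially.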

Identification of and Correction for Publication Bias: Comment

10.31222/osf.io/dh87m ◽

2019 ◽

Author(s):

Amanda Kvarven ◽

Eirik Strømland ◽

Magnus Johannesson

Keyword(s):

Publication Bias ◽

False Positive ◽

Large Scale ◽

Meta Analysis ◽

False Positive Rate ◽

Effect Sizes ◽

Replication Studies ◽

Moderate Reduction ◽

Positive Rate ◽

Meta Analyses

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the result of 15 meta-analyses and compare the adjusted results to 15 large-scale multiple labs replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.

Forms, Importance, and Ineffability of Factor Interactions to Define Personality Disorders: Comment on Lilienfeld et al. (2019)

10.31234/osf.io/v8t7q ◽

2019 ◽

Author(s):

Stephen D Benning ◽

Edward Smith

Keyword(s):

Personality Disorders ◽

False Positive Rate ◽

Empirical Work ◽

Design Studies ◽

Linear Threshold ◽

Positive Rate ◽

Saturation Effects ◽

Factor Interactions ◽

Main Effects ◽

Criterion Variables

The emergent interpersonal syndrome (EIS) approach conceptualizes personality disorders as the interaction among their constituent traits to predict important criterion variables. We detail the difficulties we have experienced finding such interactive predictors in our empirical work on psychopathy, even when using uncorrelated traits that maximize power. Rather than explaining a large absolute proportion of variance in interpersonal outcomes, EIS interactions might explain small amounts of variance relative to the main effects of each trait. Indeed, these interactions may necessitate samples of almost 1,000 observations for 80% power and a false positive rate of .05. EIS models must describe which specific traits’ interactions constitute a particular EIS, as effect sizes appear to diminish as higher-order trait interactions are analyzed. Considering whether EIS interactions are ordinal with non-crossing slopes, disordinal with crossing slopes, or entail non-linear threshold or saturation effects may help researchers design studies, sampling strategies, and analyses to model their expected effects efficiently.
