Mutual Information Pre-processing Based Broken-Stick Linear Regression Technique for Web User Behaviour Pattern Mining

2021 ◽  
Vol 14 (1) ◽  
pp. 244-256
Author(s):  
Gokulapriya Raman ◽  
Ganesh Raj

Web usage behaviour mining is a substantial research problem, as it identifies different users' behaviour patterns by analysing web log files. However, existing methods are limited in the accuracy with which they find the frequently accessed web patterns of users, and they also require more time. A Mutual Information Pre-processing based Broken-Stick Linear Regression (MIP-BSLR) technique is proposed to refine the performance of web user behaviour pattern mining with higher accuracy. Initially, web log files from the Apache web log dataset and the NASA dataset are taken as input. Then, the Mutual Information based Pre-processing (MI-P) method is applied to compute the mutual dependence between two web patterns. Based on the computed value, relevant web access patterns are retained for further processing and irrelevant patterns are removed. After that, Broken-Stick Linear Regression Analysis (BLRA) is performed in MIP-BSLR for web user behaviour analysis. By applying BLRA, the frequently visited web patterns are identified. With these patterns identified, the MIP-BSLR technique accurately predicts the usage behaviour of web users and also improves the performance of web usage behaviour mining. Experimental evaluation of the MIP-BSLR method is conducted on factors such as pattern mining accuracy, false positives, time requirement and space requirement with respect to the number of web patterns. Results show that the proposed technique improves pattern mining accuracy by 14%, and reduces the false positive rate by 52%, the time requirement by 19% and the space complexity by 21% on the Apache web log dataset compared to conventional methods. Similarly, on the NASA dataset, pattern mining accuracy is increased by 16%, with reductions in false positive rate of 47%, time requirement of 20% and space complexity of 22% compared to conventional methods.
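The MI-P filtering step can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: the session-occurrence matrix, the pattern names and the 0.1 relevance threshold are all assumed for the example.

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Mutual information (bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py = Counter(x), Counter(y)
    pxy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        p_ab = c / n
        mi += p_ab * log2(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi

# Toy session-by-pattern occurrence matrix: 1 = pattern accessed in session.
sessions = {
    "/home":     [1, 1, 1, 0, 1, 1, 0, 1],
    "/products": [1, 1, 1, 0, 1, 1, 0, 1],   # co-occurs with /home
    "/random":   [0, 1, 0, 1, 0, 1, 0, 1],   # independent noise
}

target = sessions["/home"]
threshold = 0.1  # assumed cut-off; the paper does not state its value here
relevant = [p for p, v in sessions.items() if p != "/home"
            and mutual_information(target, v) >= threshold]
print(relevant)  # ['/products'] — the independent pattern is removed
```

With real logs, the occurrence vectors would be extracted per session from the cleaned log file before the dependence computation.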

Data Mining ◽  
2013 ◽  
pp. 2057-2068
Author(s):  
Zalizah Awang Long ◽  
Abdul Razak Hamdan ◽  
Azuraliza Abu Bakar ◽  
Mazrura Sahani

Today, the objective of a public health surveillance system is to reduce the impact of outbreaks by enabling appropriate intervention. Commonly used techniques detect an outbreak from changes or aberrations in health events compared with normal history. The main problem encountered in outbreak detection is a high false alarm rate. High false alarm rates can lead to unnecessary interventions, and falsely detected outbreaks lead to costly investigation. In this chapter, the authors review data mining techniques, focusing on frequent and outlier mining, to develop a generic outbreak detection process model named the “Frequent-outlier” model. The process model was tested against a real dengue dataset obtained from FSK, UKM, and a synthetic respiratory dataset obtained from the AUTON LAB. ROC analysis was run to compare the overall performance of the “frequent-outlier” model with CUSUM and Moving Average (MA). The results were promising and were evaluated using detection rate, false positive rate, and overall performance. An important outcome of this study is the set of knowledge rules derived from the notified outbreak cases, to be used in countermeasure assessment for outbreak preparedness.
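For readers unfamiliar with the baseline detectors the model is compared against, a minimal one-sided CUSUM over daily case counts can be sketched as below. This is illustrative Python only; the counts, reference mean and the k and h parameters are assumed, not taken from the chapter.

```python
def cusum_alarm(counts, mean, k=1.0, h=5.0):
    """One-sided CUSUM: accumulate daily excesses over (mean + k) and
    raise an alarm once the cumulative sum exceeds threshold h."""
    s, alarms = 0.0, []
    for c in counts:
        s = max(0.0, s + (c - mean - k))
        alarms.append(s > h)
    return alarms

# Baseline of ~3 cases/day with a simulated outbreak starting at day 7.
daily_cases = [3, 2, 4, 3, 2, 3, 4, 9, 11, 12, 10]
alarms = cusum_alarm(daily_cases, mean=3.0)
print([day for day, a in enumerate(alarms) if a])  # [8, 9, 10]
```

A tuned CUSUM trades the false alarm rate against detection delay through k and h, which is exactly the trade-off the ROC comparison in the chapter measures.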




2019 ◽  
Vol 104 (4) ◽  
pp. 569-574
Author(s):  
Shotaro Asano ◽  
Hiroshi Murata ◽  
Masato Matsuura ◽  
Yuri Fujino ◽  
Atsuya Miki ◽  
...  

Background/aim: We previously reported the benefit of applying binomial pointwise linear regression (PLR: binomial PLR) to detect 10–2 glaucomatous visual field (VF) progression. The purpose of the current study was to validate the usefulness of the binomial PLR for detecting glaucomatous VF progression in the central 24°. Methods: Series of 15 VFs (Humphrey Field Analyzer 24–2 SITA-standard) from 341 eyes of 233 patients, obtained over 7.9±2.1 years (mean±SD), were investigated. PLR was performed by regressing the total deviation of all test points. VF progression was determined from the test-point analyses using the binomial test (one-sided, p<0.025). The time needed to detect VF progression was compared across binomial PLR, permutation analysis of PLR (PoPLR) and mean total deviation (mTD) trend analysis. Results: Binomial PLR was comparable with PoPLR and mTD trend analyses in positive predictive value (0.18–0.87), negative predictive value (0.89–0.95) and false positive rate (0.057–0.35) for evaluating glaucomatous VF progression. The time to classify progression with binomial PLR (5.8±2.8 years) was significantly shorter than with mTD trend analysis (6.7±2.8 years) and PoPLR (6.6±2.7 years). Conclusions: The binomial PLR method, which detected glaucomatous VF progression in the central 24° significantly earlier than PoPLR and mTD trend analyses, shows promise for improving our ability to detect visual field progression in the clinical management of glaucoma and in clinical trials of new glaucoma therapies.
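The idea behind binomial PLR can be sketched in a few lines: regress each test point's total deviation over visits, then apply a one-sided binomial test to the number of deteriorating points. The Python sketch below is illustrative only; it classifies a point as deteriorating by the sign of its slope rather than by the per-point significance criterion the authors use, and the simulated series are assumed.

```python
import random
from math import comb

def slope(ys):
    """Least-squares slope of ys against visit index 0..n-1."""
    n = len(ys)
    xm, ym = (n - 1) / 2, sum(ys) / n
    num = sum((i - xm) * (y - ym) for i, y in enumerate(ys))
    den = sum((i - xm) ** 2 for i in range(n))
    return num / den

def binomial_p(k, n, p=0.5):
    """One-sided binomial test: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Toy data: total-deviation series (dB) at 52 test points over 15 visits;
# 40 points get a true negative trend, 12 are pure measurement noise.
random.seed(0)
series = [[-0.5 * v + random.gauss(0, 1) for v in range(15)] for _ in range(40)]
series += [[random.gauss(0, 1) for _ in range(15)] for _ in range(12)]

negatives = sum(slope(s) < 0 for s in series)
p = binomial_p(negatives, len(series))
print(negatives, p)  # progression is flagged when p < 0.025
```

Under no progression roughly half the noisy slopes are negative by chance, which is why a binomial test against p = 0.5 separates real field-wide deterioration from noise.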


2012 ◽  
Vol 8 (2) ◽  
pp. 1-22 ◽  
Author(s):  
Vishnu Priya ◽  
A. Vadivel

In this paper, the authors build, in a single scan, a tree using both frequent and non-frequent items, named the Revised PLWAP with Non-frequent Items (RePLNI) tree. While mining sequential patterns, the links related to non-frequent items are virtually discarded, so there is no need to delete nodes or maintain node information when revising the tree to mine an updated weblog. Since the algorithm supports both incremental and interactive mining, the tree need not be reconstructed from scratch, nor the patterns re-computed, each time the weblog is updated or the minimum support changes. The performance of the proposed tree remains better even when the size of the incremental database is more than 50% of the existing one, which is not the case for a recently proposed algorithm. For evaluation, the authors used a benchmark weblog and found the performance of the proposed tree encouraging compared with some recently proposed approaches.
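The idea of keeping non-frequent items in the tree while virtually discarding their links during mining can be sketched as below. This is an illustrative Python toy, not the RePLNI-tree algorithm itself: it mines only root-anchored patterns and omits the header-link structure of PLWAP-style trees, and the log sequences and minimum support are assumed.

```python
from collections import Counter

class Node:
    def __init__(self, item=None):
        self.item, self.count, self.children = item, 0, {}

def build_tree(sequences):
    """Insert every sequence in a single scan, keeping non-frequent items."""
    root = Node()
    for seq in sequences:
        node = root
        for item in seq:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root

def mine(node, frequent, prefix=(), out=None):
    """Follow only links to frequent items; links to non-frequent items
    are skipped ('virtually discarded') but stay in the tree for updates."""
    if out is None:
        out = {}
    for item, child in node.children.items():
        if item in frequent:
            pat = prefix + (item,)
            out[pat] = out.get(pat, 0) + child.count
            mine(child, frequent, pat, out)
        else:
            mine(child, frequent, prefix, out)  # bypass the non-frequent node
    return out

logs = [["a", "b", "c"], ["a", "b"], ["a", "x", "b"]]
support = Counter(item for seq in logs for item in set(seq))
min_sup = 2
frequent = {item for item, c in support.items() if c >= min_sup}
tree = build_tree(logs)
print(mine(tree, frequent))  # {('a',): 3, ('a', 'b'): 3}
```

Because the non-frequent nodes ("x", "c") remain in the tree, lowering the minimum support or appending new log sequences only requires re-running `mine`, not rebuilding the tree.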


2002 ◽  
Vol 41 (01) ◽  
pp. 37-41 ◽  
Author(s):  
S. Shung-Shung ◽  
S. Yu-Chien ◽  
Y. Mei-Due ◽  
W. Hwei-Chung ◽  
A. Kao

Aim: Even with careful observation, the overall false-positive rate of laparotomy remains 10-15% when acute appendicitis is suspected. Therefore, the clinical efficacy of the Tc-99m HMPAO labelled leukocyte (TC-WBC) scan for the diagnosis of acute appendicitis in patients presenting with atypical clinical findings was assessed. Patients and Methods: Eighty patients presenting with acute abdominal pain and possible acute appendicitis but atypical findings were included in this study. After intravenous injection of TC-WBC, serial anterior abdominal/pelvic images at 30, 60, 120 and 240 min with 800k counts were obtained with a gamma camera. Any abnormal localization of radioactivity in the right lower quadrant of the abdomen, equal to or greater than bone marrow activity, was considered a positive scan. Results: 36 of 49 patients with positive TC-WBC scans received appendectomy; all proved to have positive pathological findings. Five positive TC-WBC scans were unrelated to acute appendicitis, being due to other pathological lesions. Eight patients were not operated on, and clinical follow-up after one month revealed no acute abdominal condition. Three of 31 patients with negative TC-WBC scans received appendectomy; they also presented positive pathological findings. The remaining 28 patients did not receive operations and showed no evidence of appendicitis after at least one month of follow-up. The overall sensitivity, specificity, accuracy, positive and negative predictive values of the TC-WBC scan for diagnosing acute appendicitis were 92, 78, 86, 82, and 90%, respectively. Conclusion: The TC-WBC scan provides a rapid and highly accurate method for the diagnosis of acute appendicitis in patients with equivocal clinical findings. It proved useful in reducing the false-positive rate of laparotomy and shortening the time necessary for clinical observation.
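Most of the reported figures follow directly from the counts in the abstract. A short Python check, under one plausible grouping of the cases (treating the five positive scans explained by other pathological lesions as excluded from the 2x2 table; the abstract does not print the table explicitly):

```python
# Counts reconstructed from the abstract:
tp = 36  # positive scan, appendicitis confirmed at surgery
fp = 8   # positive scan, no appendicitis on follow-up
fn = 3   # negative scan, appendicitis confirmed at surgery
tn = 28  # negative scan, no appendicitis on follow-up

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
print(f"sens {sensitivity:.0%}, spec {specificity:.0%}, "
      f"PPV {ppv:.0%}, NPV {npv:.0%}")  # matches the reported 92/78/82/90%
```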


1993 ◽  
Vol 32 (02) ◽  
pp. 175-179 ◽  
Author(s):  
B. Brambati ◽  
T. Chard ◽  
J. G. Grudzinskas ◽  
M. C. M. Macintosh

Abstract: The analysis of the clinical efficiency of a biochemical parameter in the prediction of chromosome anomalies is described, using a database of 475 cases including 30 abnormalities. A comparison was made of two different approaches to the statistical analysis: the use of Gaussian frequency distributions and likelihood ratios, and logistic regression. Both methods computed that for a 5% false-positive rate approximately 60% of anomalies are detected on the basis of maternal age and serum PAPP-A. The logistic regression analysis is appropriate where the outcome variable (chromosome anomaly) is binary and the detection rates refer to the original data only. The likelihood ratio method is used to predict the outcome in the general population. The latter method depends on the data or some transformation of the data fitting a known frequency distribution (Gaussian in this case). The precision of the predicted detection rates is limited by the small sample of abnormals (30 cases). Varying the means and standard deviations (to the limits of their 95% confidence intervals) of the fitted log Gaussian distributions resulted in a detection rate varying between 42% and 79% for a 5% false-positive rate. Thus, although the likelihood ratio method is potentially the better method in determining the usefulness of a test in the general population, larger numbers of abnormal cases are required to stabilise the means and standard deviations of the fitted log Gaussian distributions.
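The likelihood-ratio calculation of a detection rate at a fixed 5% false-positive rate can be sketched as follows. The distribution parameters below are assumed for illustration; they are not the paper's fitted values.

```python
from statistics import NormalDist

# Illustrative log10 MoM distributions for a serum marker (e.g. PAPP-A),
# with lower values in affected pregnancies. Parameters are assumed.
unaffected = NormalDist(mu=0.0, sigma=0.25)
affected = NormalDist(mu=-0.35, sigma=0.30)

# Choose the cut-off that fixes the false-positive rate at 5%
# (flagging low marker values), then read off the detection rate.
cutoff = unaffected.inv_cdf(0.05)
detection_rate = affected.cdf(cutoff)
print(f"cutoff {cutoff:.3f} log10 MoM, detection rate {detection_rate:.1%}")
```

Shifting the assumed means and standard deviations within their confidence limits moves `detection_rate` substantially, which is exactly the 42-79% sensitivity of the method to the fitted parameters that the abstract reports.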


2012 ◽  
Vol 3 (2) ◽  
pp. 298-300 ◽  
Author(s):  
Soniya P. Chaudhari ◽  
Prof. Hitesh Gupta ◽  
S. J. Patil

In this paper we review research from various journal papers on improving Web searching efficiency. Some important methods are based on sequential pattern mining; others are based on supervised or unsupervised learning. Other techniques, such as fuzzy logic and neural networks, are also used.


2019 ◽  
Author(s):  
Amanda Kvarven ◽  
Eirik Strømland ◽  
Magnus Johannesson

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the result of 15 meta-analyses and compare the adjusted results to 15 large-scale multiple labs replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.


2019 ◽  
Author(s):  
Stephen D Benning ◽  
Edward Smith

The emergent interpersonal syndrome (EIS) approach conceptualizes personality disorders as the interaction among their constituent traits to predict important criterion variables. We detail the difficulties we have experienced finding such interactive predictors in our empirical work on psychopathy, even when using uncorrelated traits that maximize power. Rather than explaining a large absolute proportion of variance in interpersonal outcomes, EIS interactions might explain small amounts of variance relative to the main effects of each trait. Indeed, these interactions may necessitate samples of almost 1,000 observations for 80% power and a false positive rate of .05. EIS models must describe which specific traits’ interactions constitute a particular EIS, as effect sizes appear to diminish as higher-order trait interactions are analyzed. Considering whether EIS interactions are ordinal with non-crossing slopes, disordinal with crossing slopes, or entail non-linear threshold or saturation effects may help researchers design studies, sampling strategies, and analyses to model their expected effects efficiently.
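The sample-size claim can be explored with a Monte Carlo power sketch for a two-way interaction between uncorrelated traits. This is illustrative Python, not the authors' analysis; the effect sizes (0.3 main effects, 0.1 interaction, unit residual variance) are assumed, not taken from the paper.

```python
import random
from statistics import NormalDist

def interaction_power(n, beta_int, sims=200, alpha=0.05, seed=1):
    """Monte Carlo power to detect a two-way interaction between two
    uncorrelated standard-normal traits via a z-test on its OLS slope."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(sims):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        x2 = [rng.gauss(0, 1) for _ in range(n)]
        x12 = [a * b for a, b in zip(x1, x2)]
        y = [0.3 * a + 0.3 * b + beta_int * c + rng.gauss(0, 1)
             for a, b, c in zip(x1, x2, x12)]
        # x1, x2 and x1*x2 are mutually uncorrelated here, so the interaction
        # slope can be estimated by simple regression of y on the product term
        mx, my = sum(x12) / n, sum(y) / n
        sxx = sum((c - mx) ** 2 for c in x12)
        b_hat = sum((c - mx) * (v - my) for c, v in zip(x12, y)) / sxx
        resid_var = sum((v - my - b_hat * (c - mx)) ** 2
                        for c, v in zip(x12, y)) / (n - 2)
        se = (resid_var / sxx) ** 0.5
        if abs(b_hat / se) > z_crit:
            hits += 1
    return hits / sims

# With a small interaction of ~0.1, roughly n = 1000 gives ~80% power,
# consistent with the sample sizes discussed above.
power = interaction_power(n=1000, beta_int=0.1)
print(power)
```

Raising the trait correlation or moving to higher-order trait interactions shrinks the unique variance of the product term, which is why the required samples grow even larger in those designs.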

