Tuning IMS station processing parameters and detection thresholds to increase detection precision and decrease detection miss rate

Author(s):  
Christos Saragiotis ◽  
Ivan Kitov

Two principal performance measures of the detection capability of International Monitoring System (IMS) stations are the rate of automatic detections associated with events in the Reviewed Event Bulletin (REB) and the rate of detections manually added to the REB. These two metrics roughly correspond to the precision (the complement of the false-discovery rate) and the miss rate (false-negative rate) of a binary classification test, respectively. The false-discovery and miss rates are strongly influenced by the number of phases detected by the detection algorithm, which in turn depends on prespecified slowness-, frequency-, and azimuth-dependent threshold values used in the short-term average over long-term average (STA/LTA) ratio detection scheme of the IMS stations. In particular, the lower the threshold, the more detections and therefore the lower the miss rate but the higher the false-discovery rate; the higher the threshold, the fewer detections and therefore the higher the miss rate but the lower the false-discovery rate. In that sense, decreasing the false-discovery rate and decreasing the miss rate are conflicting goals that need to be balanced. On one hand, the miss rate must be as low as possible, since no nuclear explosion should go unnoticed by the IMS. On the other hand, a high false-discovery rate compromises the quality of the automatically generated event lists and adds a heavy, unnecessary workload for the seismic analysts during the interactive processing stage.

A previous study concluded that one way to decrease both the miss and false-discovery rates, as well as the analyst workload, is to increase the retiming interval, i.e., the maximum time by which an analyst may move an arrival pick without having to declare a new arrival. Indeed, when a detection needs to be moved by more than the retiming interval, not only is this far more time-consuming for the analyst than simply retiming it, but it also negatively affects both the associated rate (the automatic detection is deleted and therefore not associated with an event) and the added rate (a new arrival has to be added to the arrival list). The International Data Centre increased the retiming interval from 4 s to 10 s in October 2018. We show how this change affected the associated-detection and added-detection rates and how these metrics can be further improved by tuning the detection threshold levels.
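For illustration, the sketch below shows a generic energy-based STA/LTA detector with a single tunable threshold. The window lengths, threshold value, and function name are illustrative assumptions and do not reflect the actual IMS station configurations, which use slowness-, frequency-, and azimuth-dependent thresholds.

```python
import numpy as np

def sta_lta_detections(x, fs, sta_win=1.0, lta_win=30.0, threshold=4.0):
    """Return sample indices where the STA/LTA ratio first exceeds `threshold`."""
    nsta, nlta = int(sta_win * fs), int(lta_win * fs)
    energy = np.asarray(x, dtype=float) ** 2

    # Trailing moving averages of signal energy via a cumulative sum.
    csum = np.concatenate(([0.0], np.cumsum(energy)))
    sta = (csum[nsta:] - csum[:-nsta]) / nsta   # short-term average ending at each sample
    lta = (csum[nlta:] - csum[:-nlta]) / nlta   # long-term average ending at each sample

    # Align both averages on the same end sample and form the ratio.
    ratio = sta[nlta - nsta:] / np.maximum(lta, 1e-12)

    # A detection is declared on each upward crossing of the threshold: lowering
    # `threshold` yields more detections (lower miss rate, lower precision);
    # raising it yields fewer detections (higher miss rate, higher precision).
    above = ratio > threshold
    onsets = np.flatnonzero(above[1:] & ~above[:-1]) + 1 + (nlta - 1)
    return onsets
```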

2005 ◽  
Vol 45 (8) ◽  
pp. 859 ◽  
Author(s):  
G. J. McLachlan ◽  
R. W. Bean ◽  
L. Ben-Tovim Jones ◽  
J. X. Zhu

An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local false discovery rate is provided for each gene, and it can be implemented so that the implied global false discovery rate is bounded as with the Benjamini-Hochberg methodology based on tail areas. The latter procedure is too conservative, unless it is modified according to the prior probability that a gene is not differentially expressed. An attractive feature of the mixture model approach is that it provides a framework for the estimation of this probability and its subsequent use in forming a decision rule. The rule can also be formed to take the false negative rate into account.
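As an illustration of the kind of adjustment described above, the sketch below applies the Benjamini-Hochberg step-up rule with its threshold relaxed by an estimate of the prior probability pi0 that a gene is not differentially expressed. The Storey-type pi0 estimator and the tuning parameter `lam` are assumptions for illustration, not the mixture-model estimate used in the paper.

```python
import numpy as np

def adaptive_bh(pvalues, alpha=0.05, lam=0.5):
    """Benjamini-Hochberg step-up rule with the rejection threshold relaxed by
    an estimate of pi0, the prior probability that a gene is not
    differentially expressed (Storey-type estimate at tuning point `lam`)."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    # Fraction of p-values above `lam`, rescaled; floored to avoid division by zero.
    pi0 = min(1.0, max(np.mean(p > lam) / (1.0 - lam), 1.0 / m))

    order = np.argsort(p)
    ranked = p[order]
    # Largest k with p_(k) <= k * alpha / (m * pi0); reject the k smallest p-values.
    crit = np.arange(1, m + 1) * alpha / (m * pi0)
    passed = np.flatnonzero(ranked <= crit)
    reject = np.zeros(m, dtype=bool)
    if passed.size:
        reject[order[:passed[-1] + 1]] = True
    return reject, pi0
```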


Author(s):  
Parasuram P. Harihara ◽  
Alexander G. Parlos

Analysis of electrical signatures has been in use for some time for estimating the condition of induction motors, by extracting spectral indicators from motor current waveforms. In most applications, motors are used to drive dynamic loads, such as pumps, fans, and blowers, by means of power transmission devices, such as belts, couplers, and gearboxes. Failure of either the electric motors or the driven loads is associated with operational disruption. The large costs associated with the resulting idle equipment and personnel can often be avoided if the degradation is detected in its early stages, prior to reaching failure conditions. Hence the need arises for cost-effective detection schemes for assessing the condition not only of the motor but also of the driven load. This prompts one to consider approaches that use no add-on sensors, in order to avoid reducing overall system reliability and increasing costs. This paper presents an experimentally demonstrated sensorless approach to detecting varying levels of cavitation in centrifugal pumps. The proposed approach is sensorless in the sense that no mechanical sensors are required on either the pump or the motor driving the pump. Rather, the onset of pump cavitation is detected using only the line voltages and phase currents of the electric motor driving the pump. Moreover, most industrial motor switchgear is equipped with potential transformers and current transformers, which can be used to measure the motor voltages and currents. The developed fault detection scheme is insensitive to electric power supply and mechanical load variations. Furthermore, it does not require a priori knowledge of a motor or pump model or any detailed motor or pump design parameters; a model of the system is adaptively estimated on-line. The developed detection algorithm has been tested on data collected from a centrifugal pump connected to a 3 φ, 3 hp induction motor. Several cavitation levels are staged with increasing severity. In addition to these staged pump faults, extensive experiments are also conducted to test the false alarm performance of the algorithm. Results from these experiments lead us to conclude that, for the cases under consideration, the proposed model-based detection scheme achieves cavitation detection times comparable to those obtained from vibration analysis, with a detection threshold significantly lower than that used in industrial practice.
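The following is only a generic sketch of the model-based, sensorless idea, not the authors' algorithm: an ARX model of one phase current is estimated on-line by recursive least squares from voltage and current samples, and the running RMS of the one-step prediction residual serves as a fault indicator to be compared against a threshold derived from healthy-baseline data. The model order and forgetting factor are illustrative assumptions.

```python
import numpy as np

def residual_indicator(voltage, current, order=8, forget=0.995):
    """Estimate an ARX model of the phase current on-line with recursive least
    squares and return the running RMS of the one-step prediction residual."""
    voltage = np.asarray(voltage, dtype=float)
    current = np.asarray(current, dtype=float)
    n = len(current)
    theta = np.zeros(2 * order)            # ARX coefficients (current and voltage lags)
    P = np.eye(2 * order) * 1e3            # RLS covariance matrix
    rms, acc = np.zeros(n), 0.0
    for t in range(order, n):
        # Regressor: the most recent `order` current and voltage samples.
        phi = np.concatenate((current[t - order:t][::-1], voltage[t - order:t][::-1]))
        err = current[t] - phi @ theta     # one-step-ahead prediction residual
        gain = P @ phi / (forget + phi @ P @ phi)
        theta = theta + gain * err
        P = (P - np.outer(gain, phi @ P)) / forget
        acc = forget * acc + (1.0 - forget) * err ** 2   # exponentially weighted residual power
        rms[t] = np.sqrt(acc)
    return rms                             # compare against a baseline-derived threshold
```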


2019 ◽  
Vol 2019 ◽  
pp. 1-9 ◽  
Author(s):  
Hong Zhao ◽  
Zhaobin Chang ◽  
Guangbin Bao ◽  
Xiangyan Zeng

Malicious domain name attacks have become a serious issue for Internet security. In this study, a malicious domain name detection algorithm based on N-Gram is proposed. The top 100,000 domain names in Alexa 2013 are used in the N-Gram method. Each domain name, excluding the top-level domain, is segmented by domain level into substrings of lengths 3, 4, 5, 6, and 7. The substring set of the 100,000 domain names is established, and the weight value of a substring is calculated according to its occurrence count in the substring set. To detect a malicious domain, the domain name under test is also segmented by the N-Gram method and its reputation value is calculated based on the weight values of its substrings. Finally, whether the domain name is malicious is judged by thresholding the reputation value. In experiments on Alexa 2017 and the Malware Domain List, the proposed detection algorithm yielded an accuracy rate of 94.04%, a false negative rate of 7.42%, and a false positive rate of 6.14%. Its time complexity is lower than that of other popular malicious domain name detection algorithms.
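A minimal sketch of the described pipeline is given below, assuming a logarithmic occurrence-count weight and a simple average for the reputation value; the exact weighting formula and threshold used in the paper may differ.

```python
from collections import Counter
from math import log

LENGTHS = range(3, 8)  # substring lengths 3, 4, 5, 6 and 7, as in the paper

def substrings(label):
    """All substrings of a domain label with the configured lengths."""
    return [label[i:i + n] for n in LENGTHS for i in range(len(label) - n + 1)]

def build_weights(benign_domains):
    """Weight each substring by its occurrence count over a benign corpus
    (e.g. the Alexa top domains); a log-count weight is assumed here."""
    counts = Counter()
    for domain in benign_domains:
        for label in domain.lower().split('.')[:-1]:   # drop the top-level domain
            counts.update(substrings(label))
    return {s: log(1 + c) for s, c in counts.items()}

def reputation(domain, weights):
    """Average substring weight of a domain; low scores look algorithmically
    generated and can be flagged by comparing against a threshold."""
    subs = [s for label in domain.lower().split('.')[:-1] for s in substrings(label)]
    return sum(weights.get(s, 0.0) for s in subs) / len(subs) if subs else 0.0
```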


2010 ◽  
Vol 15 (9) ◽  
pp. 1116-1122 ◽  
Author(s):  
Xiaohua Douglas Zhang

In most genome-scale RNA interference (RNAi) screens, the ultimate goal is to select small-interfering RNAs (siRNAs) with a large inhibition or activation effect. The selection of hits typically requires statistical control of two errors: false positives and false negatives. Traditional methods of controlling false positives and false negatives do not take into account an important feature of RNAi screens, namely that many siRNAs may have very small but real nonzero average effects on the measured response, and thus they cannot effectively control false positives and false negatives. To address the deficiencies of traditional approaches in RNAi screening, the author proposes a new method for controlling false positives and false negatives in RNAi high-throughput screens. The false negatives are statistically controlled through a false-negative rate (FNR) or false nondiscovery rate (FNDR). FNR is the proportion of false negatives among all siRNAs examined, whereas FNDR is the proportion of false negatives among declared nonhits. The author also proposes new concepts, the q*-value and p*-value, to control FNR and FNDR, respectively. The proposed method should have broad utility for hit selection in genome-scale RNAi screens in which both false discovery and false nondiscovery rates need to be controlled in a robust manner.
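For reference, the sketch below computes the rates discussed above from confusion-matrix counts, using the author's definition of FNR as the proportion of false negatives among all siRNAs examined; the helper name is hypothetical.

```python
def error_rates(tp, fp, tn, fn):
    """Error rates from confusion-matrix counts over all siRNAs examined:
    FNR  = FN / (TP + FP + TN + FN)   proportion of false negatives among all siRNAs
    FNDR = FN / (TN + FN)             proportion of false negatives among declared non-hits
    FDR  = FP / (TP + FP)             proportion of false positives among declared hits"""
    total = tp + fp + tn + fn
    fnr = fn / total if total else 0.0
    fndr = fn / (tn + fn) if (tn + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return fnr, fndr, fdr
```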


2006 ◽  
Vol 16 (05) ◽  
pp. 353-362 ◽  
Author(s):  
LIAT BEN-TOVIM JONES ◽  
RICHARD BEAN ◽  
GEOFFREY J. MCLACHLAN ◽  
JUSTIN XI ZHU

An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local FDR (false discovery rate) is provided for each gene. An attractive feature of the mixture model approach is that it provides a framework for the estimation of the prior probability that a gene is not differentially expressed, and this probability can subsequently be used in forming a decision rule. The rule can also be formed to take the false negative rate into account. We apply this approach to a well-known publicly available data set on breast cancer, and discuss our findings with reference to other approaches.
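As an illustration of how a local FDR arises from a mixture model, the sketch below uses a two-component normal mixture for gene-wise z-statistics. The component parameters are taken as inputs here, whereas in practice they would be estimated from the data (e.g., by EM), and the 0.2 cut-off is an illustrative assumption.

```python
import numpy as np
from scipy import stats

def local_fdr(z, pi0, mu1, sigma1, cutoff=0.2):
    """Local FDR from a two-component normal mixture for gene-wise z-statistics,
    f(z) = pi0 * N(0, 1) + (1 - pi0) * N(mu1, sigma1^2).
    The local FDR is the posterior probability that a gene is not differentially
    expressed; genes with local FDR below `cutoff` are declared differentially
    expressed."""
    z = np.asarray(z, dtype=float)
    f0 = stats.norm.pdf(z, 0.0, 1.0)            # null component density
    f1 = stats.norm.pdf(z, mu1, sigma1)         # non-null component density
    f = pi0 * f0 + (1.0 - pi0) * f1             # mixture density
    lfdr = pi0 * f0 / f
    return lfdr, lfdr < cutoff
```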


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Daniel L. Cameron ◽  
Jonathan Baber ◽  
Charles Shale ◽  
Jose Espejo Valle-Inclan ◽  
Nicolle Besselink ◽  
...  

GRIDSS2 is the first structural variant caller to explicitly report single breakends—breakpoints in which only one side can be unambiguously determined. By treating single breakends as a fundamental genomic rearrangement signal on par with breakpoints, GRIDSS2 can explain 47% of somatic centromere copy number changes using single breakends to non-centromere sequence. On a cohort of 3782 deeply sequenced metastatic cancers, GRIDSS2 achieves an unprecedented 3.1% false negative rate and 3.3% false discovery rate and identifies a novel 32–100 bp duplication signature. GRIDSS2 simplifies complex rearrangement interpretation through phasing of structural variants, with 16% of somatic calls phasable using paired-end sequencing.


2002 ◽  
Vol 123 (2) ◽  
pp. 1086-1094 ◽  
Author(s):  
A. M. Hopkins ◽  
C. J. Miller ◽  
A. J. Connolly ◽  
C. Genovese ◽  
R. C. Nichol ◽  
...  

Author(s):  
Haipeng Yao ◽  
Yiqing Liu ◽  
Chao Fang

Network anomaly detection is an important way to analyze and detect malicious behavior in networks. Effectively detecting anomalous network flows at big-data scale is an important problem that has attracted increasing attention from researchers. In this paper, we propose a new model based on big data analysis, which avoids the influence of shifts in the network traffic distribution, increases detection accuracy, and reduces the false negative rate. Simulation results reveal that, compared with the k-means, decision tree, and random forest algorithms, the proposed model performs much better, achieving a detection rate of 95.4% on normal data, 98.6% on DoS attacks, 93.9% on Probe attacks, 56.1% on U2R attacks, and 77.2% on R2L attacks.
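The per-class detection rates quoted above are class-wise recalls; a minimal sketch of how such rates are computed from labelled test data is shown below (the helper name is hypothetical and the paper's exact evaluation protocol may differ).

```python
from collections import defaultdict

def per_class_detection_rate(y_true, y_pred):
    """Detection rate (class-wise recall): the fraction of flows of each true
    class that the detector labels correctly."""
    totals, hits = defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if pred == truth:
            hits[truth] += 1
    return {cls: hits[cls] / totals[cls] for cls in totals}
```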


2017 ◽  
Author(s):  
Weihao Ge ◽  
Zeeshan Fazal ◽  
Eric Jakobsson

Background: A central question in bioinformatics is how to minimize arbitrariness and bias in analysis of patterns of enrichment in data. A prime example of such a question is enrichment of gene ontology (GO) classes in lists of genes. Our paper deals with two issues within this larger question. One is how to calculate the false discovery rate (FDR) within a set of apparently enriched ontologies, and the second is how to set that FDR within the context of assessing significance for addressing biological questions. To answer these questions, we compare a random resampling method with a commonly used method for assessing FDR, the Benjamini-Hochberg (BH) method. We further develop a heuristic method for evaluating Type II (false negative) errors to enable utilization of F-measure binary classification theory for distinguishing "significant" from "non-significant" degrees of enrichment.

Results: The results show the preferability and feasibility of random resampling assessment of FDR over the analytical methods with which we compare it. They also show that the reasonableness of any arbitrary threshold depends strongly on the structure of the dataset being tested, suggesting that the less arbitrary method of F-measure optimization to determine the significance threshold is preferable.

Conclusion: Therefore, we suggest using F-measure optimization, instead of placing an arbitrary threshold, to evaluate the significance of Gene Ontology enrichment results, and using resampling to replace analytical methods.
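As a sketch of the suggested F-measure optimization, the function below selects the candidate significance threshold with the highest F-measure, given estimated true-positive, false-positive (e.g., from random resampling), and false-negative counts at each threshold; the interface and names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def f_measure_threshold(thresholds, est_tp, est_fp, est_fn, beta=1.0):
    """Select the significance threshold that maximizes the F-measure.
    est_tp[i], est_fp[i], est_fn[i] are the estimated true-positive,
    false-positive (e.g. from random resampling) and false-negative counts
    if the i-th candidate threshold were applied."""
    est_tp, est_fp, est_fn = (np.asarray(a, dtype=float) for a in (est_tp, est_fp, est_fn))
    precision = est_tp / np.maximum(est_tp + est_fp, 1e-12)
    recall = est_tp / np.maximum(est_tp + est_fn, 1e-12)
    f = (1 + beta ** 2) * precision * recall / np.maximum(beta ** 2 * precision + recall, 1e-12)
    best = int(np.argmax(f))
    return thresholds[best], f[best]
```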


Animals ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 123
Author(s):  
Yangyang Guo ◽  
Samuel E. Aggrey ◽  
Adelumola Oladeinde ◽  
Jasmine Johnson ◽  
Gregory Zock ◽  
...  

Equipment in the poultry house (e.g., water pipes and feed buckets) can occlude parts of broiler chickens in top-view images. This can affect the analysis of chicken behaviors by vision-based machine learning methods. In our previous study, we developed a machine vision-based method for monitoring broiler chicken floor distribution; here we process and restore the areas of broiler chickens that were occluded by equipment. To verify the performance of the developed restoration method, top-view video of broiler chickens was recorded in two research broiler houses (240 birds equally raised in 12 pens per house). First, a target detection algorithm was used to initially detect the target areas in each image, and then the Hough transform and color features were used to remove the occluding equipment from the detection result. In the poultry images, a broiler chicken occluded by equipment appears as either two areas (TA) or one area (OA). To reconstruct the occluded area of broiler chickens, a linear restoration method and an elliptical fitting restoration method were developed and tested. Three evaluation indices, the overlap rate (OR), false-positive rate (FPR), and false-negative rate (FNR), were used to evaluate the restoration method. From images collected on days 2, 9, 16, and 23, about 100 sample images were selected for testing the proposed method, and around 80 high-quality detected broiler areas were then further evaluated for occlusion restoration. According to the results, the average values of OR, FPR, and FNR for TA were 0.8150, 0.0032, and 0.1850, respectively. For OA, the average values of OR, FPR, and FNR were 0.8788, 0.2227, and 0.1212, respectively. The study provides a new method for restoring occluded chicken areas, which otherwise hamper the success of vision-based machine predictions.
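A minimal sketch of how such evaluation indices can be computed from boolean pixel masks is shown below, assuming OR is the recovered fraction of the ground-truth area, FPR the spurious restored area relative to the ground-truth area, and FNR = 1 − OR; the paper's exact definitions may differ slightly.

```python
import numpy as np

def restoration_metrics(restored_mask, truth_mask):
    """Overlap rate (OR), false-positive rate (FPR) and false-negative rate (FNR)
    of a restored chicken area against a manually labelled ground-truth area,
    computed from boolean pixel masks."""
    restored = np.asarray(restored_mask, dtype=bool)
    truth = np.asarray(truth_mask, dtype=bool)
    true_area = truth.sum()
    if true_area == 0:
        return 0.0, 0.0, 0.0
    overlap_rate = (restored & truth).sum() / true_area   # recovered fraction of the true area
    fpr = (restored & ~truth).sum() / true_area           # spurious restored area, relative to the true area
    fnr = 1.0 - overlap_rate                              # missed fraction of the true area
    return overlap_rate, fpr, fnr
```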

