A Comparison Between New Modification of ANWK and Classical ANWK Methods in Nonparametric Regression

2021 ◽  
Vol 5 (2) ◽  
pp. 32-37
Author(s):  
Hazhar T. A. Blbas ◽  
Wasfi T. Kahwachi

Nonparametric kernel estimators are widely used in a variety of statistical research fields. The Nadaraya-Watson kernel estimator (NWK) is one of the most important nonparametric kernel estimators and is often used in regression models with a fixed bandwidth. In this article, we consider four newly proposed Adaptive Nadaraya-Watson Kernel (ANWK) regression estimators (Interquartile Range, Standard Deviation, Mean Absolute Deviation, and Median Absolute Deviation) as alternatives to the existing ones (Fixed Bandwidth, Adaptive Geometric, Adaptive Mean, Adaptive Range, and Adaptive Median). The results on both simulated data and real leukemia cancer data show that the four new ANW kernel estimators (Interquartile Range, Standard Deviation, Mean Absolute Deviation, and Median Absolute Deviation) are more effective than the kernel estimators with fixed bandwidth used in previous studies, according to the Mean Square Error (MSE) criterion.
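To make the fixed versus adaptive distinction concrete, the following is a minimal sketch of a Nadaraya-Watson estimate that accepts one bandwidth per observation. It is not the authors' implementation: the Gaussian kernel, the function names, and the Silverman-style geometric-mean factors (with an assumed sensitivity `alpha=0.5`) are illustrative assumptions, and the four proposed factors (IQR, SD, mean absolute deviation, MAD) are not reproduced here because the abstract does not define them.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def nw_estimate(x0, xs, ys, bandwidths):
    """Nadaraya-Watson estimate at x0 with one bandwidth per observation.

    Passing the same bandwidth for every point gives the fixed-bandwidth case.
    """
    num = 0.0
    den = 0.0
    for x, y, h in zip(xs, ys, bandwidths):
        w = gaussian_kernel((x0 - x) / h) / h
        num += w * y
        den += w
    return num / den

def geometric_local_factors(xs, h, alpha=0.5):
    """Hypothetical local factors: (pilot density / geometric mean) ** -alpha."""
    n = len(xs)
    pilot = [sum(gaussian_kernel((xi - xj) / h) for xj in xs) / (n * h)
             for xi in xs]
    g = math.exp(sum(math.log(p) for p in pilot) / n)  # geometric mean
    return [(p / g) ** (-alpha) for p in pilot]

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
h = 1.0

fixed = nw_estimate(2.0, xs, ys, [h] * len(xs))
factors = geometric_local_factors(xs, h)
adaptive = nw_estimate(2.0, xs, ys, [h * f for f in factors])
```

In this sketch, points in sparse regions receive factors above 1 (wider kernels) and points in dense regions receive factors below 1. Swapping the geometric mean for another scale statistic would be the natural place to try the variants the abstract lists.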

2021 ◽  
Vol 14 (11) ◽  
pp. 2114-2126
Author(s):  
Zhiwei Chen ◽  
Shaoxu Song ◽  
Ziheng Wei ◽  
Jingyun Fang ◽  
Jiang Long

The median absolute deviation (MAD) is a statistic measuring the variability of a set of quantitative elements. It is known to be more robust to outliers than the standard deviation (SD), and is therefore widely used in outlier detection. Computing the exact MAD, however, is costly, e.g., by calling a median-finding algorithm twice, with space cost O(n) over n elements in a set. In this paper, we propose the first fully mergeable approximate MAD algorithm, OP-MAD, with a one-pass scan of the data. Remarkably, by calling the proposed algorithm at most twice, namely TP-MAD, it is guaranteed to return an (ϵ, 1)-accurate MAD, i.e., the error relative to the exact MAD is bounded by the desired ϵ or 1. The space complexity is reduced to O(m) while the time complexity is O(n + m log m), where m is the size of the sketch used to compress the data and is related to the desired error bound ϵ. To get a more accurate MAD, i.e., with a smaller ϵ, the sketch size m must be larger, a trade-off between effectiveness and efficiency. In practice, we often have sketch size m ≪ n, leading to constant space cost O(1) and linear time cost O(n). Extensive experiments over various datasets demonstrate the superiority of our solution, e.g., 160000× less memory and 18× faster than the aforesaid exact method on the datasets pareto and norm. Finally, we further implement and evaluate the parallelizable TP-MAD in Apache Spark, and the fully mergeable OP-MAD in Structured Streaming.
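For reference, the exact baseline that the abstract describes as costly (two median computations over data held fully in memory) can be sketched as follows. This is only the exact method, not OP-MAD or TP-MAD. The function names and the outlier cutoff `k=3.0` are assumptions made for illustration.

```python
from statistics import median, stdev

def exact_mad(values):
    """Exact MAD: median of absolute deviations from the median.

    Needs all values in memory (O(n) space) and two median computations.
    """
    center = median(values)
    return median([abs(v - center) for v in values])

def mad_outliers(values, k=3.0):
    """Flag values farther than k * MAD from the median (k is an assumed cutoff).

    If the MAD is 0, every value different from the median is flagged.
    """
    center = median(values)
    mad = exact_mad(values)
    return [v for v in values if abs(v - center) > k * mad]

data = [1, 2, 3, 4, 100]
mad = exact_mad(data)         # deviations from median 3 are [2, 1, 0, 1, 97]
sd = stdev(data)              # inflated by the single extreme value
flagged = mad_outliers(data)
```

On this small example the MAD stays at 1 while the standard deviation is pulled above 40 by the value 100, which is the robustness property the abstract refers to.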


1994 ◽  
Vol 77 (6) ◽  
pp. 1660-1663
Author(s):  
Richard L Johnson ◽  
George W Latimer ◽  
Cliff Spiegelman

Improved standard deviation estimates from possibly biased duplicate measurements can be derived from appropriately trimmed plots of standard deviation estimates using pairs of replicates vs the quantiles of a half-normal distribution. Simulated studies show that these estimates exhibit generally lower mean-squared errors and biases than do more standard robust estimators of location: ¾ times the interquartile range and 3/2 times the mean absolute deviation from the median.


2021 ◽  
Vol 10 (4) ◽  
pp. 2212-2222
Author(s):  
Alvincent E. Danganan ◽  
Edjie Malonzo De Los Reyes

The improved multi-cluster overlapping k-means extension (IMCOKE) uses the median absolute deviation (MAD) to detect outliers in datasets, which makes the algorithm more effective for overlapping clustering. However, the position at which MAD is applied was not analyzed. In this paper, the use of MAD to detect outliers in the datasets is analyzed to determine the appropriate position for identifying outliers before the clustering step. The study assumes that the size of the clusters and the closeness of clusters to one another can lead to higher runtime performance for overlapping clusters. Therefore, additional parameters, namely the radius of each cluster and the distance between clusters, are added as measurements in the algorithm's procedure. Evaluation was done through experiments on synthetic and real datasets. The performance of eHMCOKE was evaluated in terms of the F1-measure, speed, and percentage of improvement. The results show that eHMCOKE takes less time to discover overlapping clusters, with an improvement rate of 22%, and achieves its best performance of 91.5% accuracy via the F1-measure in identifying overlapping clusters, compared with the IMCOKE algorithm. These results show that eHMCOKE significantly outperforms the IMCOKE algorithm in most of the tests conducted.


2021 ◽  
Author(s):  
Deepanshu Sharma ◽  
Surya Priya Ulaganathan ◽  
Vinay Sharma ◽  
Sakshi Piplani ◽  
Ravi Ranjan Kumar Niraj

Background and objectives: Meta-analysis is a statistical procedure that enables the researcher to integrate the results of various studies that were conducted for the same purpose. However, more often than not, researchers find themselves unable to proceed further due to the complexity of the mathematics involved and the unavailability of raw data. To alleviate this difficulty, we present a tool that will enable researchers to process raw data. Methods: The GUI tool is written in Python. The tool offers automated conversion to the mean and standard deviation (SD) from the median and interquartile range, using the methods of Hozo et al. 2005 and Bland 2015. Results: The tool was tested on sample data, and the Bland method was validated on the data provided in the Bland method publication (14). Conclusions: The tool is an easy alternative for preparing the input data required for clinical meta-analysis in the required format.
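As an illustration of the kind of conversion involved, the sketch below uses the approximations commonly attributed to Hozo et al. 2005, which take the minimum, median, maximum, and sample size as inputs. The formulas and sample-size thresholds are written down here as an assumption and should be checked against the original publication. This is not the tool's code, the function names are hypothetical, and the Bland 2015 extension that also uses the quartiles is not shown.

```python
import math

def hozo_mean(low, med, high):
    """Approximate mean from minimum, median, and maximum."""
    return (low + 2.0 * med + high) / 4.0

def hozo_sd(low, med, high, n):
    """Approximate SD from minimum, median, maximum, and sample size n.

    Thresholds (n <= 15, n <= 70) follow the commonly quoted rule of thumb.
    """
    spread = high - low
    if n <= 15:
        # Small samples: variance formula that also uses the median.
        return math.sqrt(((low - 2.0 * med + high) ** 2 / 4.0 + spread ** 2) / 12.0)
    if n <= 70:
        return spread / 4.0   # moderate samples: range / 4
    return spread / 6.0       # large samples: range / 6

est_mean = hozo_mean(0.0, 5.0, 10.0)
sd_small = hozo_sd(0.0, 5.0, 10.0, n=10)
sd_moderate = hozo_sd(0.0, 5.0, 10.0, n=50)
sd_large = hozo_sd(0.0, 5.0, 10.0, n=100)
```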


Sensors ◽  
2020 ◽  
Vol 20 (3) ◽  
pp. 654 ◽  
Author(s):  
Wilmar Hernandez ◽  
Alfredo Mendez ◽  
Rasa Zalakeviciute ◽  
Angela Maria Diaz-Marquez

In this article, robust confidence intervals for PM2.5 (particles with size less than or equal to 2.5 μm) concentration measurements performed in La Carolina Park, Quito, Ecuador, have been built. Different techniques have been applied for the construction of the confidence intervals, and routes around the park and through the middle of it have been used to build the confidence intervals and classify this urban park in accordance with categories established by the Quito air quality index. These intervals have been based on the following estimators: the mean and standard deviation, median and median absolute deviation, median and semi-interquartile range, α-trimmed mean and Winsorized standard error of order α, location and scale estimators based on Andrews' wave, biweight location and scale estimators, and estimators based on the bootstrap-t method. The results of the classification of the park and its surrounding streets showed that, in terms of air pollution by PM2.5, the park is not at caution levels. The results of the classification of the routes that were followed through the park and its surrounding streets showed that, in terms of air pollution by PM2.5, these routes are at either desirable, acceptable or caution levels. Therefore, this urban park is actually removing or attenuating unwanted PM2.5 concentration measurements.
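One of the estimator pairs listed above, the α-trimmed mean with its Winsorized standard error, can be sketched as follows. The abstract does not give the exact formulas used in the article, so this follows the usual textbook form (standard error equal to the Winsorized standard deviation divided by (1 - 2α)√n). The function name, the default `alpha=0.2`, and the sample values are assumptions. A confidence interval would multiply the standard error by a Student-t quantile, which is omitted here to keep the example free of external libraries.

```python
import math

def trimmed_mean_and_se(values, alpha=0.2):
    """Return the alpha-trimmed mean and its Winsorized standard error."""
    xs = sorted(values)
    n = len(xs)
    g = int(math.floor(alpha * n))        # points trimmed from each tail
    if n - 2 * g < 1:
        raise ValueError("alpha trims away every observation")
    kept = xs[g:n - g]
    tmean = sum(kept) / len(kept)
    # Winsorize: replace each trimmed tail with the nearest retained value.
    wins = [xs[g]] * g + kept + [xs[n - g - 1]] * g
    wmean = sum(wins) / n
    wvar = sum((w - wmean) ** 2 for w in wins) / (n - 1)
    se = math.sqrt(wvar) / ((1.0 - 2.0 * alpha) * math.sqrt(n))
    return tmean, se

# Hypothetical readings with one extreme value.
readings = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
tmean, se = trimmed_mean_and_se(readings, alpha=0.2)
```

With these made-up readings the two largest and two smallest values are trimmed, so the extreme value 100 does not affect the location estimate.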


Sensors ◽  
2019 ◽  
Vol 19 (2) ◽  
pp. 339 ◽  
Author(s):  
Yongsong Li ◽  
Zhengzhou Li ◽  
Kai Wei ◽  
Weiqi Xiong ◽  
Jiangpeng Yu ◽  
...  

Noise estimation for image sensors is a key technique in many image pre-processing applications such as blind de-noising. The existing noise estimation methods for additive white Gaussian noise (AWGN) and Poisson-Gaussian noise (PGN) may underestimate or overestimate the noise level for heavily textured scene images. To cope with this problem, a novel homogeneous block-based noise estimation method is proposed in this paper to estimate these noise types. Initially, the noisy image is transformed into a map of local gray statistic entropy (LGSE), and the weakly textured image blocks are selected as those with the largest LGSE values in descending order. Then, the Haar wavelet-based local median absolute deviation (HLMAD) is presented to compute the local variance of these selected homogeneous blocks. After that, the noise parameters can be estimated accurately by applying maximum likelihood estimation (MLE) to the local mean and variance of the selected blocks. Extensive experiments on synthesized noisy images are conducted, and the experimental results show that the proposed method not only estimates the noise of various scene images with different noise levels more accurately than the compared state-of-the-art methods, but also improves the performance of the blind de-noising algorithm.
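The MAD-on-wavelet-details idea that HLMAD builds on can be illustrated with the classical global estimator, sigma ≈ median(|d|) / 0.6745, applied to one-level Haar diagonal detail coefficients. This sketch is not the paper's HLMAD: it has no block selection, no LGSE map, and no MLE step, and it assumes purely additive Gaussian noise. The function names and the synthetic test image are hypothetical.

```python
import random
from statistics import median

def haar_diagonal_details(image):
    """One-level orthonormal Haar diagonal detail for each 2x2 block.

    `image` is a list of equal-length rows; a trailing odd row or column is ignored.
    """
    rows = len(image) - len(image) % 2
    cols = len(image[0]) - len(image[0]) % 2
    details = []
    for r in range(0, rows, 2):
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            details.append((a - b - d + e) / 2.0)
    return details

def mad_noise_sigma(image):
    """Estimate the Gaussian noise level as median(|detail|) / 0.6745."""
    details = haar_diagonal_details(image)
    return median([abs(v) for v in details]) / 0.6745

# Synthetic flat scene (gray level 100) with Gaussian noise of sigma = 5.
rng = random.Random(0)
noisy = [[100.0 + rng.gauss(0.0, 5.0) for _ in range(64)] for _ in range(64)]
sigma_hat = mad_noise_sigma(noisy)
```

On a flat scene the diagonal details contain only noise, so the estimate should land close to the true sigma. On textured scenes, edges leak into the details and inflate the estimate, which is the failure mode that motivates selecting homogeneous blocks first.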

