kernel smoothing
Recently Published Documents


TOTAL DOCUMENTS

262
(FIVE YEARS 60)

H-INDEX

28
(FIVE YEARS 2)

2022 ◽  
Vol 23 (S1) ◽  
Author(s):  
Xifang Sun ◽  
Donglin Wang ◽  
Jiaqiang Zhu ◽  
Shiquan Sun

Background: DNA methylation has long been known as an epigenetic gene silencing mechanism. As a motivating example, the methylomes of cancer and non-cancer cells show a number of methylation differences, indicating that certain characteristics of cancer cells may be related to methylation characteristics. Robust methods for detecting differentially methylated regions (DMRs) could help scientists narrow down genome regions and even find biologically important regions. Although some statistical methods have been developed for detecting DMRs, there is no default or strongest method. Fisher’s exact test is direct but not suitable for data with multiple replications, while regression-based methods usually come with a large number of assumptions. More complicated methods have been proposed, but those methods are often difficult to interpret. Results: In this paper, we propose a three-step nonparametric kernel smoothing method that is both flexible and straightforward to implement and interpret. The proposed method relies on local quadratic fitting to find the set of equilibrium points (points at which the first derivative is 0) and the corresponding set of confidence windows. Potential regions are further refined using biological criteria and finally selected based on a Bonferroni-adjusted t-test cutoff. Using a comparison of three senescent and three proliferating cell lines to illustrate our method, we were able to identify a total of 1077 DMRs on chromosome 21. Conclusions: We proposed a completely nonparametric, statistically straightforward, and interpretable method for detecting differentially methylated regions. Compared with existing methods, the non-reliance on model assumptions and the straightforward nature of our method make it a competitive alternative for defining DMRs.
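The first step described in this abstract (local quadratic fitting, then locating points where the first derivative is 0) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian kernel, the fixed `bandwidth`, and the function names `local_quadratic_fit` and `equilibrium_points` are assumptions made for illustration, and the confidence windows, biological refinement, and Bonferroni step are not reproduced.

```python
import numpy as np

def local_quadratic_fit(x, y, grid, bandwidth):
    """Return (fitted value, first-derivative estimate) at each grid point."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    fit = np.empty(len(grid))
    slope = np.empty(len(grid))
    for i, x0 in enumerate(grid):
        d = x - x0
        w = np.exp(-0.5 * (d / bandwidth) ** 2)      # Gaussian kernel weights (assumed)
        X = np.column_stack([np.ones_like(d), d, d ** 2])
        sw = np.sqrt(w)
        # Weighted least squares: minimise sum_i w_i * (y_i - X_i @ beta)^2
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        fit[i], slope[i] = beta[0], beta[1]           # intercept and linear coefficient
    return fit, slope

def equilibrium_points(grid, slope):
    """Interpolate the locations where the estimated derivative changes sign.

    Exact zeros that fall on a grid point are not handled in this sketch.
    """
    grid = np.asarray(grid, dtype=float)
    s = np.sign(slope)
    idx = np.where(s[:-1] * s[1:] < 0)[0]
    step = grid[idx + 1] - grid[idx]
    return grid[idx] - slope[idx] * step / (slope[idx + 1] - slope[idx])
```

In a local polynomial fit centred at `x0`, the linear coefficient estimates the first derivative at `x0`, which is why sign changes of `slope` mark candidate equilibrium points.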


Author(s):  
Paula Saavedra-Nieves ◽  
Rosa M. Crujeiras

Highest density regions (HDRs) are defined as level sets containing sample points of relatively high density. Although Euclidean HDR estimation from a random sample generated from the underlying density has been widely considered in the statistical literature, this problem has not yet been addressed for directional data. In this work, directional HDRs are formally defined, and plug-in estimators based on kernel smoothing, together with associated confidence regions, are proposed. We also provide a new bootstrap bandwidth selector for plug-in HDR estimation, based on minimizing an error criterion that involves the Hausdorff distance between the boundaries of the theoretical and estimated HDRs. An extensive simulation study shows the performance of the resulting estimator for the circle and for the sphere. The methodology is applied to two real data sets in animal orientation and seismology.
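A plug-in HDR on the circle can be sketched as follows. This is only an illustration under stated assumptions: the von Mises kernel with a fixed concentration `kappa`, the quantile-based threshold rule, and the function names are choices made here, not details taken from the paper, and the bootstrap bandwidth selector is not reproduced.

```python
import numpy as np

def von_mises_kde(theta_eval, sample, kappa):
    """Kernel density estimate on the circle with a von Mises kernel."""
    theta_eval = np.atleast_1d(np.asarray(theta_eval, dtype=float))
    sample = np.asarray(sample, dtype=float)
    diff = theta_eval[:, None] - sample[None, :]
    norm = 2.0 * np.pi * np.i0(kappa)                 # von Mises normalising constant
    return np.exp(kappa * np.cos(diff)).sum(axis=1) / (norm * len(sample))

def hdr_threshold(sample, kappa, coverage):
    """Plug-in level: the (1 - coverage) quantile of the density at the sample points."""
    dens = von_mises_kde(sample, sample, kappa)
    return np.quantile(dens, 1.0 - coverage)

def in_hdr(theta, sample, kappa, coverage):
    """True where the estimated density reaches the plug-in level."""
    return von_mises_kde(theta, sample, kappa) >= hdr_threshold(sample, kappa, coverage)
```

The estimated HDR is then the set of angles where `in_hdr` is true; on the circle it is a union of arcs.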


Author(s):  
M. I. Borrajo ◽  
C. Comas ◽  
S. Costafreda-Aumedes ◽  
J. Mateu

Wildlife-vehicle collisions on road networks represent a natural problem between human populations and the environment that affects wildlife management and poses a risk to the life and safety of car drivers. We propose a statistically principled method for kernel smoothing of point pattern data on a linear network when the first-order intensity depends on covariates. In particular, we present a consistent kernel estimator for the first-order intensity function that uses a convenient relationship between the intensity and the density of event locations over the network, and that also exploits the theoretical relationship between the original point process on the network and its transformed process through the covariate. We derive the asymptotic bias and variance of the estimator, and adapt some data-driven bandwidth selectors to estimate the optimal bandwidth. The performance of the estimator is analysed through a simulation study under inhomogeneous scenarios. We present a real data analysis of wildlife-vehicle collisions in a region of north-east Spain.


Computation ◽  
2021 ◽  
Vol 9 (8) ◽  
pp. 83
Author(s):  
Vladimir Viktorovich Bukhtoyarov ◽  
Vadim Sergeevich Tynchenko

This article deals with the problem of designing regression models for evaluating the operating parameters of complex technological equipment, namely hydroturbine units. A promising approach to constructing regression models based on nonparametric Nadaraya–Watson kernel estimates is considered. A known problem in applying this approach is determining effective values of the kernel-smoothing coefficients. These coefficients significantly affect the accuracy of the regression model, especially when the noise and the sample parameters vary across the input space of the model. This fully corresponds to the characteristics of the problem of estimating the parameters of hydraulic turbines. We propose using an evolutionary genetic algorithm, supplemented with a local-search stage, to adjust the smoothing coefficients. This ensures the local convergence of the tuning procedure, which is important given the high sensitivity of the quality criterion of the nonparametric model. On a set of test problems, the modeling error was reduced by 20% and 28% when the coefficients were adjusted by the standard and hybrid genetic algorithms, respectively, compared with an arbitrary choice of coefficient values. For the task of estimating the operating parameters of a hydroturbine unit, a number of promising approaches to constructing regression models, based on artificial neural networks, multivariate adaptive regression splines, and the evolutionary method of genetic programming, were included in the study. The proposed nonparametric approach with a hybrid smoothing-coefficient tuning scheme was found to be the most effective, reducing the modeling error by about 5% compared with the best of the alternative approaches considered, which, according to the numerical experiments, was multivariate adaptive regression splines.
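To make the role of the smoothing coefficient concrete, here is a minimal one-dimensional Nadaraya–Watson sketch. Note the substitution: instead of the genetic algorithm with local search described in the abstract, the bandwidth is chosen by a plain grid search on leave-one-out cross-validation error. The Gaussian kernel, the single scalar bandwidth `h`, and all function names are assumptions for illustration.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, h):
    """Kernel-weighted average of y_train at each evaluation point."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    u = (x_eval[:, None] - x_train[None, :]) / h
    w = np.exp(-0.5 * u ** 2)                         # Gaussian kernel (assumed)
    return (w @ y_train) / w.sum(axis=1)

def loo_cv_error(x, y, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    u = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u ** 2)
    np.fill_diagonal(w, 0.0)                          # exclude each point from its own fit
    pred = (w @ y) / w.sum(axis=1)
    return float(np.mean((y - pred) ** 2))

def select_bandwidth(x, y, candidates):
    """Grid search over candidate bandwidths (stand-in for the genetic algorithm)."""
    errors = [loo_cv_error(x, y, h) for h in candidates]
    return candidates[int(np.argmin(errors))]
```

Very small bandwidths can leave a point with no effective neighbours and produce a division by zero, so the candidate list should stay above the typical spacing of the inputs.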


2021 ◽  
Vol 10 (2) ◽  
pp. 180-188
Author(s):  
Sarwar A. Hamad ◽  
Kawa S. Mohamed Ali

Two non-parametric statistical methods are studied in this work: nearest-neighbor regression and the Nadaraya–Watson kernel smoothing technique. We prove that, under a precise condition, the nearest-neighbor estimator and the Nadaraya–Watson smoother produce smoothed data with the same error level, which means they have the same performance. Another result of the paper is that the nearest-neighbor estimator performs better locally, but graphically it shows a weakness when a large data set is considered on a global scale.
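One well-known link between the two estimators, shown here only as an illustration and not necessarily the condition proved in the paper, is that k-nearest-neighbor regression coincides with a Nadaraya–Watson estimate that uses a uniform kernel whose bandwidth at each point equals the distance to the k-th nearest neighbor (assuming no ties). The function names below are hypothetical.

```python
import numpy as np

def knn_regression(x_train, y_train, x_eval, k):
    """Average the responses of the k nearest training points."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    d = np.abs(x_eval[:, None] - x_train[None, :])
    idx = np.argsort(d, axis=1)[:, :k]
    return y_train[idx].mean(axis=1)

def uniform_nw(x_train, y_train, x_eval, h):
    """Nadaraya-Watson with a uniform kernel; h may be a scalar or one value per point."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    d = np.abs(x_eval[:, None] - x_train[None, :])
    w = (d <= np.reshape(h, (-1, 1))).astype(float)   # weight 1 inside the window, 0 outside
    return (w @ y_train) / w.sum(axis=1)
```

With a fixed scalar `h` the two estimators generally differ, because the number of points falling inside the window then varies with the local density of the inputs.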


Water ◽  
2021 ◽  
Vol 13 (10) ◽  
pp. 1422
Author(s):  
Kaiyang Wang ◽  
Lingrong Kong ◽  
Zixin Yang ◽  
Prateek Singh ◽  
Fangyu Guo ◽  
...  

This study explores the quality of data produced by Global Precipitation Measurement (GPM) and the potential of GPM for real-time short-term nowcasting using MATLAB and the Short-Term Ensemble Prediction System (STEPS). Precipitation data obtained by rain gauges during the period 2015 to 2017 were used in this comparative analysis. The results show that the quality of GPM precipitation data varies at the national scale, as revealed at the performance analysis stage of the study. After data quality checking, five representative precipitation events were selected for nowcasting evaluation. The GPM-estimated precipitation was compared with a 30 min forecast using STEPS nowcast results, showing that the GPM precipitation data performed well in nowcasting between 0 and 120 min. However, the accuracy and quality of the nowcast precipitation decreased significantly with increased lead time. A major finding of the study is that the quality of precipitation data can be improved through blending processes such as kriging with external drift and the double-kernel smoothing method, which enhances the quality of nowcasts over longer lead times.


Author(s):  
G. Shenbrot ◽  
B. Kryštufek

Habitat niche breadth for Palearctic Arvicolinae species was estimated at both local (α-niche) and global (the entire geographic range, γ-niche) scales using occurrence records of species and environmental (climate, topography, and vegetation) data. Niche breadth was estimated in the space of the first two principal components of the environmental variables using kernel smoothing of the densities of species occurrence points. The breadth of α-niches was estimated for a set of random points inside the geographic range in a series of buffers of increasing size around these points. Within each buffer, we calculated the overlap between the distribution of environmental values for the kernel-smoothed densities of species occurrence points and the distribution of environmental values in the background environment. The α-niche breadth was calculated as the slope of the linear regression, with a zero intercept, of the niche breadth for buffers of different sizes on the ln area of these buffers. The γ-niche breadth was calculated as the overlap between the distribution of environmental values for the kernel-smoothed densities of species occurrence points over the whole geographic range and the distribution of environmental values in the background environment, and was also approximated by linear regression of the species’ average α-niche to the geographic range area of the species. The results demonstrated that the geographic range size was significantly related to the α- and γ-niche breadth. The γ-niche breadth was significantly positively correlated with the α-niche breadth. Finally, the differences (Δ) between the γ-niche breadth values that were directly estimated and those extrapolated from the α-niche breadth were positively correlated with the geographic range size. Thus, we conclude that species occupy larger geographic ranges because they have broader niches. Our estimates of the γ-niche breadth increase with the geographic range size, and this is not due to a parallel increase in environmental diversity (spatial autocorrelation in the environment).

