Two-Step Method for Assessing Similarity of Random Sets

2021 ◽  
Vol 40 (3) ◽  
pp. 127-140
Author(s):  
Vesna Gotovac Đogaš ◽  
Kateřina Helisová ◽  
Bogdan Radović ◽  
Jakub Staněk ◽  
Markéta Zikmundová ◽  
...  

The paper concerns a new statistical method for assessing the dissimilarity of two random sets based on one realisation of each. The method focuses on the shapes of the components of the random sets, namely on the curvature of their boundaries together with the ratios of their perimeters and areas. The theoretical background is introduced; the method is then described, justified by a simulation study, and applied to real data from two different types of tissue: mammary cancer and mastopathy.
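A perimeter-to-area ratio of the kind the method uses can be estimated directly from a binary mask of one component. The sketch below (function name and the edge-counting perimeter estimate are ours, not the authors') counts pixel edges that face the background:

```python
import numpy as np

def perimeter_area_ratio(mask):
    """Estimate the perimeter/area ratio of a binary component.

    Perimeter is approximated by counting pixel edges exposed to the
    background; area is the pixel count.
    """
    m = np.pad(mask.astype(int), 1)        # zero border so edge pixels count
    horiz = np.sum(m[:, 1:] != m[:, :-1])  # transitions between columns
    vert = np.sum(m[1:, :] != m[:-1, :])   # transitions between rows
    perimeter = horiz + vert
    area = mask.sum()
    return perimeter / area

square = np.ones((3, 3), dtype=bool)       # 3x3 solid square
print(perimeter_area_ratio(square))        # 12 edges / 9 pixels ≈ 1.333
```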

Mathematics ◽  
2020 ◽  
Vol 8 (8) ◽  
pp. 1259 ◽  
Author(s):  
Henry Velasco ◽  
Henry Laniado ◽  
Mauricio Toro ◽  
Víctor Leiva ◽  
Yuhlong Lio

Both cell-wise and case-wise outliers may appear in a real data set at the same time. Few methods have been developed to deal with both types of outliers when formulating a regression model. In this work, a robust estimator is proposed based on a three-step method named 3S-regression, which uses the comedian as a highly robust scatter estimate. An intensive simulation study is conducted to evaluate the performance of the proposed comedian 3S-regression estimator in the presence of cell-wise and case-wise outliers. In addition, a comparison of this estimator with recently developed robust methods is carried out. The proposed method is also extended to the model with continuous and dummy covariates. Finally, a real data set is analyzed to illustrate potential applications.
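The comedian of two samples is the median of the products of their componentwise deviations from the medians. A minimal numpy version of the scatter estimate (a sketch, not the full 3S-regression pipeline):

```python
import numpy as np

def comedian(x, y):
    """Comedian COM(x, y) = med((x - med x) * (y - med y)),
    a highly robust analogue of the covariance."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.median((x - np.median(x)) * (y - np.median(y)))

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 100.0])   # one case-wise outlier pair
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0, -50.0])
print(comedian(x, y))   # 1.5: stays positive despite the outlier
```

A classical covariance on the same data would be dominated by the single outlying pair; the median makes the comedian insensitive to it.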


2021 ◽  
Author(s):  
Jakob Raymaekers ◽  
Peter J. Rousseeuw

Abstract Many real data sets contain numerical features (variables) whose distribution is far from normal (Gaussian). Instead, their distribution is often skewed. In order to handle such data it is customary to preprocess the variables to make them more normal. The Box–Cox and Yeo–Johnson transformations are well-known tools for this. However, the standard maximum likelihood estimator of their transformation parameter is highly sensitive to outliers, and will often try to move outliers inward at the expense of the normality of the central part of the data. We propose a modification of these transformations as well as an estimator of the transformation parameter that is robust to outliers, so the transformed data can be approximately normal in the center and a few outliers may deviate from it. It compares favorably to existing techniques in an extensive simulation study and on real data.
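The Yeo–Johnson transform itself is simple to write down. The sketch below implements it and, as a stand-in for the robust estimator proposed in the paper, picks the parameter λ by minimizing the skewness of the trimmed center of the transformed data (the trimming rule is our simplification, not the authors' estimator):

```python
import numpy as np

def yeo_johnson(x, lam):
    """Yeo-Johnson transform, defined piecewise for x >= 0 and x < 0."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    if abs(lam) > 1e-12:
        out[pos] = ((x[pos] + 1.0) ** lam - 1.0) / lam
    else:
        out[pos] = np.log1p(x[pos])
    if abs(lam - 2.0) > 1e-12:
        out[~pos] = -(((-x[~pos] + 1.0) ** (2.0 - lam) - 1.0) / (2.0 - lam))
    else:
        out[~pos] = -np.log1p(-x[~pos])
    return out

def robust_lambda(x, grid=np.linspace(-2, 4, 121), trim=0.1):
    """Grid-search lambda minimizing |skewness| of the trimmed center."""
    def trimmed_skew(lam):
        y = np.sort(yeo_johnson(x, lam))
        k = int(trim * len(y))
        c = y[k:len(y) - k]            # drop the tails, keep the center
        s = c.std()
        return abs(((c - c.mean()) ** 3).mean() / s ** 3) if s > 0 else np.inf
    return min(grid, key=trimmed_skew)
```

Quick sanity checks: for λ = 1 the transform is the identity, and for λ = 0 on non-negative data it reduces to log(x + 1).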


2021 ◽  
Vol 9 (4) ◽  
pp. 410
Author(s):  
Fan Zhang ◽  
Xin Peng ◽  
Liang Huang ◽  
Man Zhu ◽  
Yuanqiao Wen ◽  
...  

In this study, a method for dynamically establishing a ship domain in inland waters is proposed to support decisions on ship collision avoidance. The waters surrounding the target ship are divided into grids, and the grid densities of ships are calculated at each moment to determine the shape and size of the ship domain for different types of ships. Finally, based on a spatiotemporal statistical method, the characteristics of the ship domains of different ship types in different navigational environments were analyzed. The proposed method is applied to establish ship domains for different ship types in the Wuhan section of the Yangtze River in January, February, July, and August 2014. The results show that the size of the ship domain increases with ship size in each month. The domain size is significantly influenced by the water level: in inland waters, the ship domain is larger in dry seasons than in wet seasons.
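The grid-density step can be sketched with numpy's 2-D histogram: bin the positions of surrounding ships into a grid centred on the target ship (the window extent and cell size below are illustrative, not values from the paper):

```python
import numpy as np

def grid_densities(ship_xy, target_xy, half_width=1000.0, cell=100.0):
    """Count surrounding ships per grid cell, relative to the target ship.

    ship_xy   : (n, 2) array of ship positions at one moment (metres)
    target_xy : (2,) position of the target ship
    Returns a 2-D array of counts over a square window around the target.
    """
    rel = np.asarray(ship_xy, dtype=float) - np.asarray(target_xy, dtype=float)
    edges = np.arange(-half_width, half_width + cell, cell)
    counts, _, _ = np.histogram2d(rel[:, 0], rel[:, 1], bins=(edges, edges))
    return counts

ships = np.array([[120.0, -40.0], [130.0, -50.0], [-900.0, 700.0]])
grid = grid_densities(ships, target_xy=np.array([0.0, 0.0]))
print(grid.sum())   # all three ships fall inside the 2 km window
```

Repeating this over time and averaging the grids per ship type yields the density field from which a domain boundary can be extracted.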


Author(s):  
Liangli Yang ◽  
Yongmei Su ◽  
Xinjian Zhuo

The outbreak of COVID-19 has had a great impact on the world. Considering that different populations have different infection delays, which can be expressed as a distributed delay, and that distributed time-delays are rarely used in fractional-order models to simulate real data, we establish two fractional-order (Caputo and Caputo–Fabrizio) COVID-19 models with distributed time-delay. Parameters are estimated by the least-squares method from the reported data of China and 12 other countries. The results of the Caputo and Caputo–Fabrizio models with distributed time-delay and without delay, and of the integer-order model with distributed delay, are compared. They show that the fractional-order models fit the real data better. Moreover, the Caputo model is better for short-term fitting, while the Caputo–Fabrizio model is better for long-term fitting and prediction. Finally, the influence of several parameters is simulated in the Caputo model, which further verifies the importance of taking strict quarantine measures and paying close attention to the incubation-period population.
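For readers unfamiliar with fractional derivatives: the order-α derivative can be approximated numerically, e.g. with the Grünwald–Letnikov scheme, which agrees with the Caputo definition for functions with f(0) = 0. The sketch below is a generic illustration of that scheme, not the authors' solver:

```python
import numpy as np

def gl_fractional_derivative(f_vals, alpha, h):
    """Grunwald-Letnikov approximation of the order-alpha derivative
    at the final grid point, given samples f_vals on a grid of step h."""
    n = len(f_vals)
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):                  # recurrence for (-1)^k * C(alpha, k)
        w[k] = w[k - 1] * (1.0 - (alpha + 1.0) / k)
    # sum w_k * f(t - k*h), scaled by h^(-alpha)
    return h ** (-alpha) * np.dot(w, f_vals[::-1])

# sanity check: the half-derivative of f(t) = t is 2*sqrt(t/pi)
h = 1e-3
t = np.arange(0.0, 1.0 + h, h)
approx = gl_fractional_derivative(t, 0.5, h)
print(approx)   # close to 2/sqrt(pi) ≈ 1.1284 at t = 1
```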


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Liansheng Larry Tang ◽  
Michael Caudy ◽  
Faye Taxman

Multiple meta-analyses may use similar search criteria and focus on the same topic of interest, but they may yield different or sometimes discordant results. The lack of statistical methods for synthesizing these findings makes it challenging to properly interpret the results from multiple meta-analyses, especially when their results are conflicting. In this paper, we first introduce a method to synthesize the meta-analytic results when multiple meta-analyses use the same type of summary effect estimates. When meta-analyses use different types of effect sizes, the meta-analysis results cannot be directly combined. We propose a two-step frequentist procedure to first convert the effect size estimates to the same metric and then summarize them with a weighted mean estimate. Our proposed method offers several advantages over existing methods by Hemming et al. (2012). First, different types of summary effect sizes are considered. Second, our method provides the same overall effect size as conducting a meta-analysis on all individual studies from multiple meta-analyses. We illustrate the application of the proposed methods in two examples and discuss their implications for the field of meta-analysis.
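Once the estimates are on a common metric, the second step is an inverse-variance weighted mean. One widely used conversion for the first step is turning a log odds ratio into a standardized mean difference via d = ln(OR)·√3/π; this is a textbook conversion used here for illustration, and the paper's own procedure may differ in detail:

```python
import numpy as np

def log_or_to_d(log_or, var_log_or):
    """Convert a log odds ratio (and its variance) to Cohen's d."""
    scale = np.sqrt(3.0) / np.pi
    return log_or * scale, var_log_or * 3.0 / np.pi ** 2

def inverse_variance_mean(estimates, variances):
    """Fixed-effect pooled estimate: weight each estimate by 1/variance."""
    w = 1.0 / np.asarray(variances, dtype=float)
    est = np.asarray(estimates, dtype=float)
    pooled = np.sum(w * est) / np.sum(w)
    pooled_var = 1.0 / np.sum(w)
    return pooled, pooled_var

# one meta-analysis reports a log OR, two others report d directly
d1, v1 = log_or_to_d(0.8, 0.04)
pooled, var = inverse_variance_mean([d1, 0.35, 0.50], [v1, 0.02, 0.05])
```

With equal variances the pooled estimate reduces to the simple mean, which is a convenient sanity check.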


2016 ◽  
Vol 32 (4) ◽  
pp. 887-905 ◽  
Author(s):  
Luciana Dalla Valle

Abstract Official statistics are a fundamental source of publicly available information that periodically provides a great amount of data on all major areas of citizens’ lives, such as economics, social development, education, and the environment. However, these extraordinary sources of information are often neglected, especially by business and industrial statisticians. In particular, data collected from small businesses, like small and medium-sized enterprises (SMEs), are rarely integrated with official statistics data. In official statistics data integration, the quality of data is essential to guarantee reliable results. Considering the analysis of surveys on SMEs, one of the most common issues related to data quality is the high proportion of nonresponses that leads to self-selection bias. This work illustrates a flexible methodology to deal with self-selection bias, based on the generalization of Heckman’s two-step method with the introduction of copulas. This approach allows us to assume different distributions for the marginals and to express various dependence structures. The methodology is illustrated through a real data application, where the parameters are estimated according to the Bayesian approach and official statistics data are incorporated into the model via informative priors.
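The core of Heckman's classical two-step correction, which the paper generalizes with copulas, is the inverse Mills ratio λ(z) = φ(z)/Φ(z): step one fits a selection (probit) model, and step two adds λ of the fitted index as an extra regressor in the outcome equation. A minimal sketch of step two, assuming the probit index has already been fitted (function names are ours):

```python
import numpy as np
from scipy.stats import norm

def inverse_mills_ratio(z):
    """lambda(z) = phi(z) / Phi(z), the selection-bias correction term."""
    return norm.pdf(z) / norm.cdf(z)

def heckman_step2(X, y, probit_index):
    """Outcome-equation OLS augmented with the inverse Mills ratio.

    X            : (n, p) regressors for the selected observations
    y            : (n,) outcomes for the selected observations
    probit_index : (n,) fitted linear index from the step-1 probit
    Returns the coefficient vector, with the IMR coefficient last.
    """
    imr = inverse_mills_ratio(probit_index)
    Xa = np.column_stack([X, imr])
    beta, *_ = np.linalg.lstsq(Xa, y, rcond=None)
    return beta
```

At z = 0 the ratio equals φ(0)/Φ(0) ≈ 0.798, a quick numerical sanity check.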


2021 ◽  
Vol 9 (1) ◽  
pp. 190-210
Author(s):  
Arvid Sjölander ◽  
Ola Hössjer

Abstract Unmeasured confounding is an important threat to the validity of observational studies. A common way to deal with unmeasured confounding is to compute bounds for the causal effect of interest, that is, a range of values that is guaranteed to include the true effect, given the observed data. Recently, bounds have been proposed that are based on sensitivity parameters, which quantify the degree of unmeasured confounding on the risk ratio scale. These bounds can be used to compute an E-value, that is, the degree of confounding required to explain away an observed association, on the risk ratio scale. We complement and extend this previous work by deriving analogous bounds, based on sensitivity parameters on the risk difference scale. We show that our bounds can also be used to compute an E-value, on the risk difference scale. We compare our novel bounds with previous bounds through a real data example and a simulation study.
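On the risk ratio scale the E-value has a well-known closed form, E = RR + √(RR(RR − 1)) for RR ≥ 1, with RR replaced by 1/RR for protective effects. The sketch below covers only this classical risk-ratio case; the paper's contribution is the analogous machinery on the risk difference scale:

```python
import math

def e_value(rr):
    """E-value on the risk ratio scale: the minimum strength of
    confounding (with both exposure and outcome) needed to explain
    away an observed risk ratio."""
    if rr < 1.0:
        rr = 1.0 / rr            # protective effects: invert first
    return rr + math.sqrt(rr * (rr - 1.0))

print(e_value(4.0))   # 4 + sqrt(12) ≈ 7.46
```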


2021 ◽  
Vol 17 (3) ◽  
pp. e1008256
Author(s):  
Shuonan Chen ◽  
Jackson Loper ◽  
Xiaoyin Chen ◽  
Alex Vaughan ◽  
Anthony M. Zador ◽  
...  

Modern spatial transcriptomics methods can target thousands of different types of RNA transcripts in a single slice of tissue. Many biological applications demand a high spatial density of transcripts relative to the imaging resolution, leading to partial mixing of transcript rolonies in many voxels; unfortunately, current analysis methods do not perform robustly in this highly-mixed setting. Here we develop a new analysis approach, BARcode DEmixing through Non-negative Spatial Regression (BarDensr): we start with a generative model of the physical process that leads to the observed image data and then apply sparse convex optimization methods to estimate the underlying (demixed) rolony densities. We apply BarDensr to simulated and real data and find that it achieves state-of-the-art signal recovery, particularly in densely-labeled regions or data with low spatial resolution. Finally, BarDensr is fast and parallelizable. We provide open-source code as well as an implementation for the ‘NeuroCAAS’ cloud platform.
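At the heart of the approach is non-negative regression: each voxel's observed intensity is modelled as a non-negative combination of known barcode signatures. A minimal single-voxel sketch using scipy's NNLS solver, with illustrative made-up signatures (the paper's method adds sparsity penalties and a spatial model on top of this):

```python
import numpy as np
from scipy.optimize import nnls

# columns = barcode signatures across 4 imaging channels (illustrative values)
signatures = np.array([[1.0, 0.0, 0.2],
                       [0.0, 1.0, 0.1],
                       [0.5, 0.0, 1.0],
                       [0.0, 0.5, 0.3]])

true_density = np.array([2.0, 0.0, 1.5])   # rolony densities in one voxel
observed = signatures @ true_density       # mixed signal seen by the camera

estimate, residual = nnls(signatures, observed)
print(np.round(estimate, 3))               # ≈ [2.0, 0.0, 1.5]
```

Because the true densities are non-negative and the signature matrix has full column rank, NNLS recovers them exactly in this noiseless toy case.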


2021 ◽  
Vol 22 (1) ◽  
pp. 91-107
Author(s):  
F. S. Lobato ◽  
G. M. Platt ◽  
G. B. Libotte ◽  
A. J. Silva Neto

Different types of mathematical models have been used to predict the dynamic behavior of the novel coronavirus (COVID-19). Many of them involve the formulation and solution of inverse problems. This kind of problem is generally carried out by considering the model, the vector of design variables, and the system parameters as deterministic values. In this contribution, a methodology based on a double-loop iteration process, devoted to evaluating the influence of uncertainties on the inverse problem, is proposed. The inner optimization loop is used to find the solution associated with the highest probability value, and the outer loop is the regular optimization loop used to determine the vector of design variables. For this task, we use an inverse reliability approach and the Differential Evolution algorithm. For illustration purposes, the proposed methodology is applied to estimate the parameters of the SIRD (Susceptible-Infectious-Recovered-Dead) model associated with the dynamic behavior of the COVID-19 pandemic, considering real data from China's epidemic and uncertainties in the basic reproduction number (R0). The obtained results demonstrate, as expected, that increasing the reliability increases the objective function value.
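The forward SIRD model being fitted can be sketched with a simple Euler integration (the parameter values below are illustrative, not the estimates from the paper):

```python
import numpy as np

def simulate_sird(beta, gamma, mu, s0, i0, days, dt=0.1):
    """Forward Euler integration of the SIRD model:
    dS/dt = -beta*S*I/N        dI/dt = beta*S*I/N - (gamma + mu)*I
    dR/dt = gamma*I            dD/dt = mu*I
    """
    n_steps = int(days / dt)
    N = s0 + i0
    S, I, R, D = float(s0), float(i0), 0.0, 0.0
    traj = np.empty((n_steps + 1, 4))
    traj[0] = S, I, R, D
    for k in range(1, n_steps + 1):
        infections = beta * S * I / N      # new infections per unit time
        dS = -infections
        dI = infections - (gamma + mu) * I
        dR = gamma * I
        dD = mu * I
        S, I, R, D = S + dS * dt, I + dI * dt, R + dR * dt, D + dD * dt
        traj[k] = S, I, R, D
    return traj

# illustrative run: 120 days in a population of one million
traj = simulate_sird(beta=0.3, gamma=0.1, mu=0.01, s0=999_990, i0=10, days=120)
```

Since the four derivatives sum to zero, the total population S + I + R + D is conserved along the trajectory, which is a useful correctness check; the inverse problem then searches over (beta, gamma, mu) to match such trajectories to reported case data.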


2017 ◽  
Vol 40 (2) ◽  
pp. 205-221 ◽  
Author(s):  
Shahryar Mirzaei ◽  
Gholam Reza Mohtashami Borzadaran ◽  
Mohammad Amini

In this paper, we consider two well-known methods for analysis of the Gini index, namely U-statistics and linearization, for some income distributions. In addition, we evaluate the two methods with respect to some properties of their proposed estimators, and we compare them with resampling techniques in approximating properties of the Gini index. A simulation study shows that the linearization method performs well compared to the Gini estimator based on U-statistics. A brief study on real data supports our findings.
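The U-statistic estimator of the Gini index is the mean absolute difference over all distinct pairs, scaled by twice the sample mean. A direct numpy version, using the unbiased n(n − 1) pair count:

```python
import numpy as np

def gini_u_statistic(x):
    """Gini index as a U-statistic: mean |x_i - x_j| over distinct
    pairs, divided by twice the sample mean."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mad = np.abs(x[:, None] - x[None, :]).sum() / (n * (n - 1))
    return mad / (2.0 * x.mean())

print(gini_u_statistic([5.0, 5.0, 5.0]))    # 0.0: perfect equality
print(gini_u_statistic([0.0, 0.0, 10.0]))   # close to 1.0: one person holds everything
```

The pairwise construction is O(n²) in memory; for large samples the same estimator can be computed in O(n log n) after sorting.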

