Simple and Fast Generalized-M (GM) Estimator and Its Application to Real Data Set

2021 ◽  
Vol 50 (3) ◽  
pp. 859-867
Author(s):  
HABSHAH MIDI ◽  
SHELAN SAIED ISMAEEL ◽  
JAYANTHI ARASAN ◽  
MOHAMMED A MOHAMMED

It is now evident that some robust methods, such as the MM-estimator, do not possess a bounded influence function, which means that their estimates can still be affected by outliers in the X direction, or high leverage points (HLPs), even though they have high efficiency and a high breakdown point (BDP). Generalized M (GM) estimators, such as the GM6 estimator, were put forward with the main aim of bounding the influence of HLPs through a weight function. The limitation of GM6 is that it assigns lower weights to both bad leverage points (BLPs) and good leverage points (GLPs), which decreases its efficiency when more GLPs are present in a data set. Moreover, GM6 requires a longer computational time. In this paper, we develop a new version of the GM-estimator based on a simple and fast algorithm. The attractive feature of this method is that it downweights only BLPs and vertical outliers (VOs), which increases its efficiency. The merit of our proposed GM estimator is studied through a simulation study and the well-known aircraft data set.
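The idea of bounding the influence of HLPs through a weight function can be sketched as iteratively reweighted least squares with Huber residual weights multiplied by leverage-based weights. This is a generic GM-type sketch, not the authors' GM6 or their proposed fast algorithm; the function name `gm_irls`, the 2(p+1)/n leverage cut-off, and the Huber constant c = 1.345 are illustrative choices.

```python
import numpy as np

def gm_irls(X, y, c=1.345, n_iter=50):
    """GM-type estimator sketch: IRLS with Huber residual weights times
    hat-value-based leverage weights that bound the influence of HLPs."""
    n, p = X.shape
    Xc = np.column_stack([np.ones(n), X])
    # Leverage weights from hat-matrix diagonals: downweight points whose
    # leverage exceeds the usual 2*(p+1)/n cut-off.
    H = Xc @ np.linalg.pinv(Xc.T @ Xc) @ Xc.T
    h = np.diag(H)
    cut = 2 * (p + 1) / n
    w_x = np.minimum(1.0, cut / h)
    beta = np.linalg.lstsq(Xc, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - Xc @ beta
        s = 1.4826 * np.median(np.abs(r - np.median(r))) + 1e-12  # MAD scale
        u = r / s
        w_r = np.where(np.abs(u) <= c, 1.0, c / np.abs(u))  # Huber weights
        W = np.diag(w_r * w_x)
        beta = np.linalg.solve(Xc.T @ W @ Xc, Xc.T @ W @ y)
    return beta
```

On data contaminated with a few bad leverage points, the combined weights drive the contribution of those points toward zero, so the fit stays close to the clean-data line.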

Mathematics ◽  
2020 ◽  
Vol 8 (8) ◽  
pp. 1259 ◽  
Author(s):  
Henry Velasco ◽  
Henry Laniado ◽  
Mauricio Toro ◽  
Víctor Leiva ◽  
Yuhlong Lio

Both cell-wise and case-wise outliers may appear in a real data set at the same time. Few methods have been developed in order to deal with both types of outliers when formulating a regression model. In this work, a robust estimator is proposed based on a three-step method named 3S-regression, which uses the comedian as a highly robust scatter estimate. An intensive simulation study is conducted in order to evaluate the performance of the proposed comedian 3S-regression estimator in the presence of cell-wise and case-wise outliers. In addition, a comparison of this estimator with recently developed robust methods is carried out. The proposed method is also extended to the model with continuous and dummy covariates. Finally, a real data set is analyzed for illustration in order to show potential applications.
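The comedian mentioned in the abstract has a simple closed form, med{(x_i - med x)(y_i - med y)} (Falk's robust covariance analogue). A minimal sketch of the pairwise comedian scatter matrix, the kind of robust scatter plugged into a 3S-type pipeline, might look as follows; the paper's actual 3S-regression steps are not reproduced here.

```python
import numpy as np

def comedian(x, y):
    """Comedian: a highly robust, symmetric analogue of covariance,
    med{(x_i - med x)(y_i - med y)}.  For x == y (odd sample size) it
    reduces to the squared MAD (without the consistency constant)."""
    return np.median((x - np.median(x)) * (y - np.median(y)))

def comedian_matrix(X):
    """Pairwise comedian scatter estimate for the columns of X."""
    p = X.shape[1]
    S = np.empty((p, p))
    for i in range(p):
        for j in range(p):
            S[i, j] = comedian(X[:, i], X[:, j])
    return S
```

Because only medians are involved, a single wild cell (such as the 100 below) leaves the estimate unchanged, which is why comedian-based scatter resists cell-wise contamination.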


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Shokrya Saleh

The Akaike Information Criterion (AIC) based on least squares (LS) regression minimizes the sum of squared residuals, but LS is sensitive to outlying observations. Alternative criteria that are less sensitive to outliers have been proposed; examples are the robust AIC (RAIC), robust Mallows Cp (RCp), and robust Bayesian information criterion (RBIC). In this paper, we propose a robust AIC obtained by replacing the scale estimate with a high-breakdown-point estimate of scale. The robustness of the proposed method is studied through its influence function. We show, through simulated and real data examples, that the proposed robust AIC is effective in selecting accurate models in the presence of outliers and high leverage points.
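The substitution the abstract describes can be illustrated in a few lines: the usual ML variance estimate RSS/n in the AIC is replaced by a high-breakdown scale, here the MAD of the residuals. This is only a sketch of the idea; the paper's exact criterion may differ.

```python
import numpy as np

def robust_aic(y, yhat, k):
    """AIC-type criterion n*log(s^2) + 2k where s is a high-breakdown
    scale (MAD of residuals) instead of the outlier-sensitive RSS/n."""
    r = y - yhat
    n = len(y)
    s = 1.4826 * np.median(np.abs(r - np.median(r)))  # high-BDP scale
    return n * np.log(s ** 2) + 2 * k
```

A single gross outlier barely moves the MAD, so the criterion value stays nearly unchanged, whereas the classical RSS-based AIC would jump.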


2019 ◽  
Vol XVI (2) ◽  
pp. 1-11
Author(s):  
Farrukh Jamal ◽  
Hesham Mohammed Reyad ◽  
Soha Othman Ahmed ◽  
Muhammad Akbar Ali Shah ◽  
Emrah Altun

A new three-parameter continuous model called the exponentiated half-logistic Lomax distribution is introduced in this paper. Basic mathematical properties of the proposed model were investigated, including raw and incomplete moments, skewness, kurtosis, generating functions, Rényi entropy, Lorenz, Bonferroni and Zenga curves, probability weighted moments, the stress-strength model, order statistics, and record statistics. The model parameters were estimated using the maximum likelihood criterion, and the behaviour of these estimates was examined through a simulation study. The applicability of the new model is illustrated by applying it to a real data set.
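The maximum-likelihood-plus-simulation workflow described above can be sketched with the base Lomax (Pareto Type II) distribution as a stand-in, since the paper's exponentiated half-logistic Lomax density is not reproduced here; `scipy.stats.lomax` provides both sampling and ML fitting.

```python
from scipy.stats import lomax

# Simulate from a Lomax with known shape c = 3, then recover c by
# maximum likelihood with location and scale held fixed, mimicking
# one replication of a simulation study for the ML estimates.
data = lomax.rvs(c=3.0, size=2000, random_state=0)
c_hat, loc_hat, scale_hat = lomax.fit(data, floc=0, fscale=1)
```

Repeating this over many seeds and sample sizes, and tracking the bias and spread of `c_hat`, is the usual way such simulation studies assess estimator behaviour.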


Author(s):  
Parisa Torkaman

The generalized inverted exponential distribution has been introduced as a lifetime model with good statistical properties. In this paper, estimation of its probability density function and cumulative distribution function is considered using five different estimation methods: the uniformly minimum variance unbiased (UMVU), maximum likelihood (ML), least squares (LS), weighted least squares (WLS), and percentile (PC) estimators. The performance of these estimation procedures is compared through numerical simulations based on the mean squared error (MSE). The simulation studies show that the UMVU estimator performs better than the others, and that when the sample size is large enough, the ML and UMVU estimators are almost equivalent and more efficient than the LS, WLS, and PC estimators. Finally, the results of analyzing a real data set are presented.
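For reference, the density and distribution function of the generalized inverted exponential model, in the form usually stated in the literature (assumed here, not quoted from this paper), can be coded directly; the pdf below is the derivative of the cdf, which a numerical integration check confirms.

```python
import numpy as np

def gie_pdf(x, alpha, lam):
    """GIE density: f(x) = (a*l/x^2) e^{-l/x} (1 - e^{-l/x})^{a-1}."""
    return (alpha * lam / x ** 2) * np.exp(-lam / x) \
        * (1 - np.exp(-lam / x)) ** (alpha - 1)

def gie_cdf(x, alpha, lam):
    """GIE distribution function: F(x) = 1 - (1 - e^{-l/x})^alpha."""
    return 1 - (1 - np.exp(-lam / x)) ** alpha
```

With these two functions in hand, the paper's five estimators could each be implemented and compared by MSE over repeated simulated samples.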


2019 ◽  
Vol 14 (2) ◽  
pp. 148-156
Author(s):  
Nighat Noureen ◽  
Sahar Fazal ◽  
Muhammad Abdul Qadir ◽  
Muhammad Tanvir Afzal

Background: Specific combinations of histone modifications (HMs), contributing to the histone code hypothesis, lead to various biological functions. HM combinations have been utilized by various studies to divide the genome into different regions, which have been classified as chromatin states. Mostly, Hidden Markov Model (HMM)-based techniques have been utilized for this purpose. In chromatin studies, data from Next Generation Sequencing (NGS) platforms are used. Chromatin states based on histone modification combinatorics are annotated by mapping them to functional regions of the genome. The numbers of states predicted so far by HMM tools have been justified biologically. Objective: The present study aims to provide a computational scheme for identifying the number of underlying hidden states in the data under consideration. Methods: We propose a computational scheme, HCVS, based on a hierarchical clustering and visualization strategy. Results: We tested the proposed scheme on a real data set of nine cell types comprising nine chromatin marks. The approach successfully identified the state numbers for various possibilities. The results were also compared with one of the existing models and showed quite good correlation. Conclusion: The HCVS model not only helps in deciding the optimal number of states for particular data, but also justifies the results biologically, thereby correlating the computational and biological aspects.
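One common hierarchical-clustering heuristic for choosing the number of hidden states is to cut the dendrogram at the largest jump in merge distance. The sketch below applies it to a toy stand-in for a genome-bin-by-mark signal matrix; this is only one generic heuristic, not necessarily the paper's HCVS rule, and the three "states" and noise level are invented for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy stand-in for a bin x histone-mark matrix: 3 hidden "states",
# 40 genomic bins each, 3 marks, small Gaussian noise.
rng = np.random.default_rng(0)
states = np.array([[5.0, 0, 0], [0, 5.0, 0], [0, 0, 5.0]])
data = np.vstack([s + rng.normal(0, 0.3, (40, 3)) for s in states])

Z = linkage(data, method="ward")
# The largest gap between successive merge distances suggests where
# the dendrogram should be cut, i.e. the number of underlying states.
gaps = np.diff(Z[:, 2])
n_states = len(data) - (np.argmax(gaps) + 1)
labels = fcluster(Z, t=n_states, criterion="maxclust")
```

Visualizing `Z` as a dendrogram (the "visualization strategy" part) lets the analyst confirm that the chosen cut is biologically sensible.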


2021 ◽  
Vol 13 (9) ◽  
pp. 1703
Author(s):  
He Yan ◽  
Chao Chen ◽  
Guodong Jin ◽  
Jindong Zhang ◽  
Xudong Wang ◽  
...  

The traditional method of constant false-alarm rate detection is based on the assumption of an echo statistical model; against a background of sea clutter and other interference, its target recognition accuracy is low and its false-alarm rate is high. Therefore, computer vision techniques are widely discussed as a way to improve detection performance. However, the majority of studies have focused on synthetic aperture radar because of its high resolution; for defense radar, detection performance is not satisfactory because of its low resolution. To this end, we propose a novel target detection method for coastal defense radar based on the faster region-based convolutional neural network (Faster R-CNN). The main processing steps are as follows: (1) Faster R-CNN is selected as the sea-surface target detector because of its high target detection accuracy; (2) a modified Faster R-CNN, adapted to the sparsity and small target sizes in the data set, is employed; and (3) soft non-maximum suppression is exploited to eliminate possibly overlapping detection boxes. Furthermore, detailed comparative experiments based on a real coastal defense radar data set are performed. The mean average precision of the proposed method is improved by 10.86% compared with that of the original Faster R-CNN.
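Step (3), soft non-maximum suppression, is a published post-processing algorithm (Bodla et al., 2017): instead of deleting boxes that overlap the current top-scoring box, their scores are decayed by a Gaussian of the overlap. A minimal sketch, with the sigma and threshold values as illustrative defaults:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def soft_nms(boxes, scores, sigma=0.5, thresh=0.001):
    """Gaussian soft-NMS: decay scores of boxes overlapping the current
    best box by exp(-IoU^2 / sigma) instead of discarding them."""
    scores = list(scores)
    keep, idx = [], list(range(len(boxes)))
    while idx:
        m = max(idx, key=lambda i: scores[i])
        keep.append(m)
        idx.remove(m)
        for i in idx:
            scores[i] *= np.exp(-iou(boxes[m], boxes[i]) ** 2 / sigma)
        idx = [i for i in idx if scores[i] >= thresh]
    return keep, scores
```

Because overlapping boxes are merely demoted rather than removed, nearby small targets, common in low-resolution radar imagery, are less likely to be suppressed outright.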


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Zahra Arefinia ◽  
Dip Prakash Samajdar

Numerical simulations of plasmonic polymer solar cells (PSCs) incorporating a disordered array of non-uniformly sized plasmonic nanoparticles (NPs) impose prohibitively long and complex computational demands. To surmount this limitation, we present a novel semi-analytical model that dramatically reduces computation time and resource consumption while remaining acceptably accurate. For this purpose, the optical model of the active layer incorporating plasmonic metal NPs, described by a homogenization theory based on a modified Maxwell–Garnett–Mie theory, is fed into an electrical model based on the coupled Poisson, continuity, and drift–diffusion equations. In addition, our model accounts for absorption in the non-active layers, interference induced by the electrodes, and scattered light escaping from the PSC. The modeling results satisfactorily reproduce a series of experimental photovoltaic parameters of plasmonic PSCs, demonstrating the validity of our modeling approach. On this basis, we apply the semi-analytical model to propose a new high-efficiency plasmonic PSC based on the PM6:Y6 PSC, which has the highest power conversion efficiency (PCE) reported to date. The results show that incorporating plasmonic NPs into the PM6:Y6 active layer leads to a PCE of over 18%.
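The homogenization step rests on effective-medium theory. As a baseline (the paper uses a modified Maxwell–Garnett–Mie theory for non-uniform NP sizes, which is more involved), the classical Maxwell–Garnett mixing rule for spherical inclusions is a one-liner; the permittivity values in the usage check are illustrative, not taken from the paper.

```python
def maxwell_garnett(eps_p, eps_m, f):
    """Classical Maxwell-Garnett effective permittivity for a volume
    fraction f of spherical particles (eps_p) embedded in a host
    matrix (eps_m); solves (e-em)/(e+2em) = f*(ep-em)/(ep+2em)."""
    beta = (eps_p - eps_m) / (eps_p + 2 * eps_m)
    return eps_m * (1 + 2 * f * beta) / (1 - f * beta)
```

Feeding the resulting complex effective permittivity of the NP-loaded active layer into a transfer-matrix optical calculation is the usual way such a homogenized layer enters the optical part of a drift–diffusion device model.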


2021 ◽  
Vol 1978 (1) ◽  
pp. 012047
Author(s):  
Xiaona Sheng ◽  
Yuqiu Ma ◽  
Jiabin Zhou ◽  
Jingjing Zhou

2021 ◽  
pp. 1-11
Author(s):  
Velichka Traneva ◽  
Stoyan Tranev

Analysis of variance (ANOVA), developed by Fisher, is an important method in data analysis. There are situations, however, in which the data are imprecise. In order to analyze such data, the aim of this paper is to introduce, for the first time, an intuitionistic fuzzy two-factor ANOVA (2-D IFANOVA) without replication, as an extension of classical ANOVA and of one-way IFANOVA, for the case where the data are intuitionistic fuzzy rather than real numbers. The proposed approach employs the apparatus of intuitionistic fuzzy sets (IFSs) and index matrices (IMs). The paper also analyzes a unique data set of daily ticket sales over a year in a multiplex of Cinema City Bulgaria, part of the Cineworld PLC Group, applying both two-factor ANOVA and the proposed 2-D IFANOVA to study the influence of the “season” and “ticket price” factors. A comparative analysis of the results obtained after applying ANOVA and 2-D IFANOVA to the real data set is also presented.
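The crisp procedure that 2-D IFANOVA extends is the classical two-factor ANOVA without replication: on an a-by-b table, the total sum of squares is split into row, column, and residual parts, giving one F statistic per factor. A minimal sketch (the intuitionistic fuzzy extension with IMs is not attempted here):

```python
import numpy as np

def two_way_anova_no_rep(table):
    """Two-factor ANOVA without replication on an a x b table
    (rows = levels of factor A, columns = levels of factor B).
    Returns the F statistics for factors A and B."""
    a, b = table.shape
    grand = table.mean()
    ss_a = b * ((table.mean(axis=1) - grand) ** 2).sum()   # row effect
    ss_b = a * ((table.mean(axis=0) - grand) ** 2).sum()   # column effect
    ss_err = ((table - grand) ** 2).sum() - ss_a - ss_b    # residual
    ms_err = ss_err / ((a - 1) * (b - 1))
    return (ss_a / (a - 1)) / ms_err, (ss_b / (b - 1)) / ms_err
```

In the paper's setting each F statistic would be compared with the appropriate F quantile; the IF version replaces the real-valued cell entries with intuitionistic fuzzy pairs.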


Author(s):  
A Salman Avestimehr ◽  
Seyed Mohammadreza Mousavi Kalan ◽  
Mahdi Soltanolkotabi

Dealing with the sheer size and complexity of today's massive data sets requires computational platforms that can analyze data in a parallelized and distributed fashion. A major bottleneck that arises in such modern distributed computing environments is that some of the worker nodes may run slowly. These nodes, a.k.a. stragglers, can significantly slow down computation, as the slowest node may dictate the overall computation time. A recent computational framework, called encoded optimization, creates redundancy in the data to mitigate the effect of stragglers. In this paper, we develop a novel mathematical understanding of this framework, demonstrating its effectiveness in much broader settings than previously understood. We also analyze the convergence behavior of iterative encoded optimization algorithms, allowing us to characterize fundamental trade-offs between convergence rate, size of the data set, accuracy, computational load (or data redundancy), and straggler toleration in this framework.
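The redundancy idea can be sketched in its simplest form: replicate data partitions across workers so the exact least-squares gradient is still recoverable when a straggler never responds. This is plain replication for illustration, not the specific encoding scheme analyzed in the paper; the partition layout and worker assignment are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(9, 3))   # data matrix, split into 3 partitions
b = rng.normal(size=9)
x = rng.normal(size=3)        # current iterate

parts = np.split(np.arange(9), 3)
workers = [(0, 1), (1, 2), (2, 0)]   # each worker stores 2 partitions

def partial_grad(p):
    """Gradient of 0.5*||Ax - b||^2 restricted to partition p."""
    idx = parts[p]
    return A[idx].T @ (A[idx] @ x - b[idx])

# Worker 1 straggles; workers 0 and 2 still jointly cover all partitions,
# so the master sums each partition's gradient exactly once.
responding = [0, 2]
seen, g = set(), np.zeros(3)
for w in responding:
    for p in workers[w]:
        if p not in seen:
            seen.add(p)
            g += partial_grad(p)

full_grad = A.T @ (A @ x - b)   # what a single machine would compute
```

The 2x storage overhead here is the "computational load (or data redundancy)" side of the trade-off the paper characterizes against straggler toleration and convergence rate.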

