breakdown point
Recently Published Documents





Atmosphere ◽  
2022 ◽  
Vol 13 (1) ◽  
pp. 145
Siti Mariana Che Mat Nor ◽  
Shazlyn Milleana Shaharudin ◽  
Shuhaida Ismail ◽  
Sumayyah Aimi Mohd Najib ◽  
Mou Leong Tan ◽  

This study was conducted to identify the spatiotemporal torrential rainfall patterns of the East Coast of Peninsular Malaysia, as it is the region most affected by the torrential rainfall of the Northeast Monsoon season. Dimension reduction, such as the classical Principal Components Analysis (PCA) coupled with a clustering approach, is often applied to reduce the dimension of the data while simultaneously performing cluster partitions. However, the classical PCA is highly sensitive to outliers, as it assigns equal weight to every observation. Hence, applying the classical PCA could distort the cluster partitions of the rainfall patterns. Furthermore, traditional clustering algorithms only allow each element to belong to exactly one cluster, so observations within overlapping clusters of the torrential rainfall datasets might not be captured effectively. In this study, a statistical model of torrential rainfall pattern recognition was proposed to alleviate these issues. Here, a Robust PCA (RPCA) based on Tukey’s biweight correlation was introduced and the optimum breakdown point for extracting the number of components was identified. A breakdown point of 0.4 at 85% cumulative variance percentage efficiently extracted the number of components while avoiding low-frequency variations or insignificant clusters on a spatial scale. Based on the extracted components, the rainfall patterns were further characterized using cluster solutions obtained with Fuzzy C-means clustering (FCM), which allows data elements to belong to more than one cluster, as the rainfall data structure permits this. Lastly, data generated using a Monte Carlo simulation were used to evaluate the performance of the proposed statistical modeling. It was found that the proposed RPCA-FCM performed better than the classical PCA coupled with FCM in identifying the torrential rainfall patterns of Peninsular Malaysia’s East Coast.
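The step of extracting components at a cumulative variance percentage can be illustrated with a minimal sketch. It assumes the eigenvalues of a correlation matrix are already available; the function name `components_for_variance` and the eigenvalues below are hypothetical and are not taken from the study.

```python
def components_for_variance(eigenvalues, threshold=0.85):
    """Return the smallest k whose k largest eigenvalues explain
    at least `threshold` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues, for illustration only.
example_eigenvalues = [4.0, 2.5, 1.5, 1.0, 0.6, 0.4]
k_at_85 = components_for_variance(example_eigenvalues, threshold=0.85)
k_at_70 = components_for_variance(example_eigenvalues, threshold=0.70)
print(k_at_85, k_at_70)
```

A higher threshold retains more components, which is why the choice of cumulative variance percentage affects how many clusters are considered downstream.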

Liang Chen ◽  
Jintang Li ◽  
Qibiao Peng ◽  
Yang Liu ◽  
Zibin Zheng ◽  

Recent studies have shown that Graph Convolutional Networks (GCNs) are vulnerable to adversarial attacks on the graph structure. Although multiple works have been proposed to improve their robustness against such structural adversarial attacks, the reasons for the success of the attacks remain unclear. In this work, we theoretically and empirically demonstrate that structural adversarial examples can be attributed to the non-robust aggregation scheme (i.e., the weighted mean) of GCNs. Specifically, our analysis takes advantage of the breakdown point which can quantitatively measure the robustness of aggregation schemes. The key insight is that weighted mean, as the basic design of GCNs, has a low breakdown point and its output can be dramatically changed by injecting a single edge. We show that adopting the aggregation scheme with a high breakdown point (e.g., median or trimmed mean) could significantly enhance the robustness of GCNs against structural attacks. Extensive experiments on four real-world datasets demonstrate that such a simple but effective method achieves the best robustness performance compared to state-of-the-art models.
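The contrast between low and high breakdown point aggregators can be seen in a small sketch. It is a simplification, assuming scalar neighbor features and uniform weights rather than a full GCN layer; the values and the helper `trimmed_mean` are hypothetical.

```python
from statistics import mean, median


def trimmed_mean(values, trim=1):
    """Mean after dropping the `trim` smallest and `trim` largest values."""
    ordered = sorted(values)
    if len(ordered) <= 2 * trim:
        raise ValueError("not enough values to trim")
    return mean(ordered[trim:len(ordered) - trim])


# Neighbor features of one node before and after a single injected edge.
clean = [0.9, 1.0, 1.1, 1.0, 0.95]
attacked = clean + [1000.0]  # one adversarial neighbor

mean_shift = abs(mean(attacked) - mean(clean))
median_shift = abs(median(attacked) - median(clean))
trimmed_after_attack = trimmed_mean(attacked, trim=1)

print(mean_shift, median_shift, trimmed_after_attack)
```

A single extreme neighbor moves the mean by a large amount, while the median and the trimmed mean stay close to the clean values.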

2021 ◽  
Vol 10 (3) ◽  
pp. 329

Rice production in Blitar Regency rises and falls, influenced by several factors, including the number of farmers, fertilizer allocation, average rainfall, harvested area, planted area, productivity, and rice processing equipment. Regression analysis can therefore be used to identify which of these factors are most significant for achieving optimal rice production. However, outliers in research data can disrupt the analysis. Robust regression is an efficient method for analyzing data that contain outliers. Robust regression has several estimation methods, two of which are Least Trimmed Square (LTS) and S-estimation, which share similar characteristics in efficiency and breakdown point. This study aims to compare the two methods on 2018 rice production data from Blitar Regency with seven independent variables (number of farmers, fertilizer allocation, average rainfall, harvested area, planted area, productivity, and rice processing equipment). The 2018 data were chosen because the documentation was complete and because of concern that the Covid-19 pandemic would affect the data. Robust regression with the Least Trimmed Square (LTS) method on rice production in Blitar Regency gave the model Y = −11262.756 − 0.01x1 + 0.031x2 − 14.304x3 + 2.292x4 + 3.741x5 + 188.274x6 − 0.419x7, and robust regression with the S-estimation method gave the model Y = −9698.949 − 0.14x1 − 0.49x2 − 19.531x3 + 0.133x4 + 5.714x5 + 175.018x6 − 0.507x7. The results show that the Least Trimmed Square (LTS) method produces the best model: its coefficient of determination (R2) of 0.99999 is larger than the 0.99882 obtained with S-estimation, and its Mean Square Error (MSE) of 0.62105 is smaller than the 9.04800 obtained with S-estimation.
Keywords: outliers, rice production, robust regression
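To make the idea behind Least Trimmed Squares concrete, the sketch below evaluates the ordinary least squares objective and the LTS objective for one candidate line. It only evaluates the objectives and does not search for the minimizing coefficients; the data, the candidate line, and the coverage `h` are hypothetical.

```python
def squared_residuals(xs, ys, intercept, slope):
    """Squared residuals of the line y = intercept + slope * x."""
    return [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]


def ols_objective(res2):
    """Ordinary least squares: sum of all squared residuals."""
    return sum(res2)


def lts_objective(res2, h):
    """Least trimmed squares: sum of the h smallest squared residuals."""
    if not 1 <= h <= len(res2):
        raise ValueError("h must be between 1 and the number of observations")
    return sum(sorted(res2)[:h])


# Four points on y = 2x plus one vertical outlier (hypothetical data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 100]

res2 = squared_residuals(xs, ys, intercept=0.0, slope=2.0)
ols_value = ols_objective(res2)
lts_value = lts_objective(res2, h=4)
print(ols_value, lts_value)
```

For the line through the four clean points, the outlier dominates the OLS objective but is trimmed from the LTS objective, so an LTS fit is free to follow the bulk of the data.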

2021 ◽  
Vol 16 (2) ◽  
pp. 109-115
Nicholas P. Dibal ◽  
Hamadu Dallah

Observations in certain real-life cases include units that are inconsistent with the rest of the data. Such extreme values influence the estimates obtained by conventional estimators, so robust estimators are necessary for efficient estimation of parameters. This paper uses stratification with simple random sampling without replacement to optimize sample allocation across strata for efficient parameter estimation, as an alternative method of handling highly contaminated samples. Our proposed method stratifies the highly contaminated population into two non-overlapping sub-populations, and stratified samples of sizes 50, 200, and 500 were drawn. We estimate the model parameters from the contaminated sample data using ordinary least squares under the proposed method, and using two high breakdown point estimators: the Least Median of Squares and Least Trimmed Squares. Our findings show that the proposed method did not perform well for low contamination levels (⩽ 30%) but outperformed Least Median of Squares and Least Trimmed Squares for higher contamination rates (⩾ 40%). This indicates that our proposed method compares well and competes favorably with the two high breakdown point estimators.

Jurnal Varian ◽  
2021 ◽  
Vol 4 (2) ◽  
pp. 91-98
Trianingsih Eni Lestari ◽  
Rike Desy Tri Yuansa Yuansa

The response surface method is similar to regression analysis in that it estimates the response function regression model using the Ordinary Least Square (OLS) method. Unfortunately, the least squares method has a drawback: it is sensitive to violations of its assumptions caused by outliers. One solution to the outlier problem is robust regression. Many parameter estimation methods exist for robust regression; this study uses the Least Trimmed Square (LTS) and MM-estimator methods because both have a high breakdown point of nearly 50%. The response variables were red roselle plant height (Y1) and red roselle flower weight (Y2), while the independent variables were the soil moisture factor (X1) and the NPK fertilizer application factor (X2). The purpose of this study is to estimate the response surface regression parameters using the LTS and MM-estimator methods on data that contain outliers. For both responses, the analysis shows that the best model is obtained with the LTS estimation method. The plant height model had an R-Square value of 98.27% with an error of 1.243, while the flower weight model had an R-Square value of 97.31% with an error of 0.6632.

Stats ◽  
2021 ◽  
Vol 4 (2) ◽  
pp. 327-347
Francesca Torti ◽  
Aldo Corbellini ◽  
Anthony C. Atkinson

The forward search (FS) is a general method of robust data fitting that moves smoothly from very robust to maximum likelihood estimation. The regression procedures are included in the MATLAB toolbox FSDA. The work on a SAS version of the FS originates from the need for the analysis of large datasets expressed by law enforcement services operating in the European Union that use our SAS software for detecting data anomalies that may point to fraudulent customs returns. Specific to our SAS implementation, the fsdaSAS package, we describe the approximation used to provide fast analyses of large datasets using an FS which progresses through the inclusion of batches of observations, rather than progressing one observation at a time. We do, however, test for outliers one observation at a time. We demonstrate that our SAS implementation becomes appreciably faster than the MATLAB version as the sample size increases and is also able to analyse larger datasets. The series of fits provided by the FS leads to the adaptive data-dependent choice of maximally efficient robust estimates. This also allows the monitoring of residuals and parameter estimates for fits of differing robustness levels. We mention that our fsdaSAS also applies the idea of monitoring to several robust estimators for regression for a range of values of breakdown point or nominal efficiency, leading to adaptive values for these parameters. We have also provided a variety of plots linked through brushing. Further programmed analyses include the robust transformations of the response in regression. Our package also provides the SAS community with methods of monitoring robust estimators for multivariate data, including multivariate data transformations.

2021 ◽  
Vol 50 (3) ◽  
pp. 859-867

It is now evident that some robust methods, such as the MM-estimator, do not have a bounded influence function, which means that their estimates are still affected by outliers in the X direction, or high leverage points (HLPs), even though they have high efficiency and a high breakdown point (BDP). Generalized M (GM) estimators, such as the GM6 estimator, were put forward with the main aim of bounding the influence of HLPs through a weight function. The limitation of GM6 is that it gives lower weight to both bad leverage points (BLPs) and good leverage points (GLPs), which decreases its efficiency when more GLPs are present in a data set. Moreover, GM6 takes longer to compute. In this paper, we develop a new version of the GM-estimator that is based on a simple and fast algorithm. The attractive feature of this method is that it downweights only BLPs and vertical outliers (VOs), which increases its efficiency. The merit of our proposed GM estimator is studied through a simulation study and a well-known aircraft data set.

2021 ◽  
Vol 151 ◽  
pp. 106810
Yulong Wang ◽  
Baoxing Duan ◽  
Licheng Sun ◽  
Xin Yang ◽  
Yunjia Huang ◽  

Shazlyn Milleana Shaharudin ◽  
Norhaiza Ahmad ◽  
Siti Mariana Che Mat Nor

This paper presents a modified correlation in principal component analysis (PCA) for selecting the number of clusters when identifying rainfall patterns. Clustering guided by PCA is extensively employed for high-dimensional data, especially in identifying the spatial distribution patterns of daily torrential rainfall. Typically, rainfall patterns in climatological investigations are identified using a T mode-based Pearson correlation matrix to extract the relative variance retained. However, rainfall data in Peninsular Malaysia consist of purely positive values that are skewed toward higher values. Therefore, applying PCA based on the Pearson correlation to the rainfall data set has the potential to influence the cluster partitions and to produce exceptionally uneven clusters in a high-dimensional space. In the current research, a robust dimension reduction method in PCA was employed to resolve the unbalanced clusters in the rainfall patterns caused by the skewed character of the data. This led to the introduction of a robust measure in PCA, Tukey’s biweight correlation, to downweight observations, together with the optimal breakdown point for obtaining the number of PCA components. The results showed a substantial improvement for the robust PCA over the Pearson-correlation-based PCA in terms of the average number of clusters obtained, with a cumulative variance percentage of 70% at a breakdown point of 0.4.
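How a biweight function downweights extreme observations can be sketched as follows. This is an illustration of Tukey's biweight weights only, not the paper's procedure: the scaling by the median absolute deviation (MAD), the tuning constant `c = 9.0`, and the rainfall values are assumptions, and the abstract does not state how the breakdown point of 0.4 maps to a tuning constant.

```python
from statistics import median


def biweight_weights(values, c=9.0):
    """Tukey biweight weights: (1 - u^2)^2 for |u| < 1, else 0,
    with u = (x - median) / (c * MAD). The constant c is an assumption."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; weights are undefined")
    weights = []
    for v in values:
        u = (v - med) / (c * mad)
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return weights


# Hypothetical daily rainfall amounts with one extreme value.
rainfall = [10, 12, 11, 13, 12, 95]
weights = biweight_weights(rainfall)
print(weights)
```

Observations near the median keep a weight close to 1, while the extreme value receives a weight of 0 and so cannot dominate a correlation computed from the weighted data.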

S.L. Ting ◽  
P.K. Tan ◽  
Y.L. Pan ◽  
H.H.W. Thoungh ◽  
S.Y. Thum ◽  

Gate oxide breakdown has always been a critical reliability issue in Complementary Metal-Oxide-Semiconductor (CMOS) devices. Pinhole analysis is one of the failure analysis (FA) techniques commonly used to analyze gate oxide breakdown. However, to better understand the root cause and mechanism, the customer and the process/module departments require a defect that is physically intact, free of damage or chemical attack. In other words, it is crucial to perform Transmission Electron Microscopy (TEM) analysis at the exact gate oxide breakdown point, because TEM analysis provides detailed physical evidence and insight into the root cause of gate oxide failures. It is challenging to locate the site for TEM analysis when the poly gate layout is a complex structure rather than a single line. In this paper, we develop and demonstrate the use of cross-sectional Scanning Electron Microscope (XSEM) passive voltage contrast (PVC) to isolate the defective leaky polysilicon (PC) gate and subsequently prepare a TEM lamella in a perpendicular direction from the post-XSEM PVC sample. This technique provides an alternative approach for identifying a defective leaky polysilicon gate for subsequent TEM analysis.
