Information in Missing Patterns: Enhancing Prediction Accuracy in Weighted Linear Regression with Missing Data Using Soft Clustering

10.36227/techrxiv.12191667.v1 ◽

2020 ◽

Author(s):

Ashkan Esmaeili ◽

Mohammadamin Fakharian ◽

Yasaman Amiri Abyaneh

Keyword(s):

Prediction Accuracy ◽

Mean Squared Error ◽

State Of The Art ◽

Missing Information ◽

Learning Capability ◽

Constraint Matrix ◽

New Methods ◽

Squared Error ◽

The Mean ◽

Bias Variance

The linear system with missing information is investigated in this paper. New methods are introduced to improve the Mean Squared Error (MSE) on the test set in comparison to state-of-the-art methods, through appropriate tuning of the Bias-Variance trade-off. The concept is to cluster the data and adapt the learning model to each cluster. Hence, we introduce a controlled bias into the problem and utilize it to enhance learning capability on the instances in a specific neighborhood. To deal with missing information, we propose a novel algorithm, "Missing-SCOP", based on the SCOP-KMEANS algorithm introduced by Wagstaff et al., which utilizes the missing pattern of the dataset to construct a soft-constraint matrix and to cluster in the missing-data scenario. It is shown that the controlled over-fitting suggested by our algorithm improves prediction accuracy in various cases. Numerical experiments confirm the efficacy of our proposed algorithm in enhancing the prediction accuracy.


Optimizing MSE for Clustering with Balanced Size Constraints

Symmetry ◽

10.3390/sym11030338 ◽

2019 ◽

Vol 11 (3) ◽

pp. 338 ◽

Cited By ~ 6

Author(s):

Wei Tang ◽

Yang Yang ◽

Lanling Zeng ◽

Yongzhao Zhan

Keyword(s):

Clustering Algorithm ◽

Mean Squared Error ◽

State Of The Art ◽

Simplex Algorithm ◽

Linear Program ◽

Constraint Matrix ◽

Squared Error ◽

Group Data ◽

Size Constraints ◽

The Mean

Clustering groups data so that the observations in the same group are more similar to each other than to those in other groups. k-means is a popular clustering algorithm in data mining. Its objective is to optimize the mean squared error (MSE). The traditional k-means algorithm is not suitable for applications where the sizes of clusters need to be balanced. Given n observations, our objective is to optimize the MSE under the constraint that the observations need to be evenly divided into k clusters. In this paper, we propose an iterative method for the task of clustering with balanced size constraints. Each iteration can be split into two steps, namely an assignment step and an update step. In the assignment step, the data are evenly assigned to each cluster. The balanced assignment task here is formulated as an integer linear program (ILP), and we prove that the constraint matrix of this ILP is totally unimodular. Thus the ILP can be relaxed to a linear program (LP), which can be solved efficiently with the simplex algorithm. In the update step, the new centers are updated as the centroids of the observations in the clusters. Assuming that there are n observations and the algorithm needs m iterations to converge, we show that the average time complexity of the proposed algorithm is between O(mn^1.65) and O(mn^1.70). Experimental results indicate that, compared with state-of-the-art methods, the proposed algorithm is efficient in deriving more accurate clusterings.
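The assignment and update steps described in this abstract can be written out as a short formulation. This is a sketch in notation chosen here, not taken from the paper: p_i denotes observation i, c_j the center of cluster j, and x_ij a binary assignment indicator. It also assumes that n is divisible by k; otherwise the cluster-size constraint would need floor and ceiling bounds.

```latex
% Assignment step (centers c_j fixed): balanced assignment as an ILP
\begin{aligned}
\min_{x}\quad & \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{k} x_{ij}\,\lVert p_i - c_j\rVert^{2} \\
\text{s.t.}\quad & \sum_{j=1}^{k} x_{ij} = 1, \qquad i = 1,\dots,n
  && \text{(each observation joins exactly one cluster)} \\
& \sum_{i=1}^{n} x_{ij} = \frac{n}{k}, \qquad j = 1,\dots,k
  && \text{(every cluster has the same size)} \\
& x_{ij} \in \{0, 1\}.
\end{aligned}

% LP relaxation: replace x_{ij} \in \{0,1\} by 0 \le x_{ij} \le 1.
% Total unimodularity of the constraint matrix, together with the integer
% right-hand sides, means the vertex solutions of the LP are already integral.

% Update step (assignments x_{ij} fixed): centroid of each cluster
c_j = \frac{k}{n}\sum_{i=1}^{n} x_{ij}\, p_i, \qquad j = 1,\dots,k.
```

Because the simplex algorithm returns a vertex solution, the relaxed LP yields a valid 0/1 assignment without any rounding step.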


Adaptive Kernel Estimation of the Conditional Quantiles

International Journal of Statistics and Probability ◽

10.5539/ijsp.v5n1p79 ◽

2015 ◽

Vol 5 (1) ◽

pp. 79

Author(s):

Raid B. Salha ◽

Hazem I. El Shekh Ahmed ◽

Hossam O. EL-Sayed

Keyword(s):

Distribution Function ◽

Conditional Distribution ◽

Mean Squared Error ◽

Kernel Estimation ◽

Conditional Quantiles ◽

Squared Error ◽

Adaptive Kernel ◽

The Mean ◽

Bias Variance ◽

Kernel Estimations

In this paper, we define the adaptive kernel estimation of the conditional distribution function (cdf) for independent and identically distributed (iid) data using varying bandwidth. The bias, variance and the mean squared error of the proposed estimator are investigated. Moreover, the asymptotic normality of the proposed estimator is investigated. The results of the simulation study show that the adaptive kernel estimation of the conditional quantiles with varying bandwidth performs better than the kernel estimations with fixed bandwidth.
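To make the idea of a varying bandwidth concrete, here is a minimal sketch of a generic Nadaraya-Watson-type conditional cdf estimator in which each sample point carries its own bandwidth. It is not necessarily the estimator defined in the paper: the Gaussian kernel, the 1/h weighting, the function names, and the caller-supplied `bandwidths` are all assumptions made for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def conditional_cdf(x0, y0, xs, ys, bandwidths):
    """Estimate P(Y <= y0 | X = x0) with one bandwidth per sample point.

    A fixed-bandwidth estimator is the special case where all entries of
    `bandwidths` are equal.
    """
    weights = [gaussian_kernel((x0 - x) / h) / h for x, h in zip(xs, bandwidths)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no kernel mass at x0; widen the bandwidths")
    # Weighted share of the samples whose response does not exceed y0.
    below = sum(w for w, y in zip(weights, ys) if y <= y0)
    return below / total


def conditional_quantile(x0, p, xs, ys, bandwidths):
    """Smallest observed y whose estimated conditional cdf reaches level p."""
    for y in sorted(ys):
        if conditional_cdf(x0, y, xs, ys, bandwidths) >= p:
            return y
    return max(ys)
```

The quantile is obtained by inverting the estimated cdf over the observed responses, which is why an improvement in the cdf estimate carries over to the conditional quantiles.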


ESTIMATING THE INTENSITY IN THE FORM OF A POWER FUNCTION OF AN INHOMOGENEOUS POISSON PROCESS

Journal of Mathematics and Its Applications ◽

10.29244/jmap.4.1.51-57 ◽

2005 ◽

Vol 4 (1) ◽

pp. 51

Author(s):

I W. MANGKU ◽

I. WIDIYASTUTI ◽

I G. P. PURNABA

Keyword(s):

Asymptotic Normality ◽

Poisson Process ◽

Power Function ◽

Mean Squared Error ◽

Asymptotic Bias ◽

Inhomogeneous Poisson Process ◽

Squared Error ◽

Single Realization ◽

The Mean ◽

Bias Variance

An estimator of the intensity in the form of a power function of an inhomogeneous Poisson process is constructed and investigated. It is assumed that only a single realization of the Poisson process is observed in a bounded window. We prove that the proposed estimator is consistent when the size of the window indefinitely expands. The asymptotic bias, variance and the mean-squared error of the proposed estimator are computed. Asymptotic normality of the estimator is also established.


The Mean Squared Error of the Instrumental Variables Estimator When the Disturbance Has an Elliptical Distribution

Econometric Reviews ◽

10.1080/07474930500545488 ◽

2006 ◽

Vol 25 (1) ◽

pp. 117-138 ◽

Cited By ~ 1

Author(s):

Fernanda P. M. Peixe ◽

Alastair R. Hall ◽

Kostas Kyriakoulis

Keyword(s):

Instrumental Variables ◽

Mean Squared Error ◽

Elliptical Distribution ◽

Squared Error ◽

The Mean


Computing trade-offs in robust design: Perspectives of the mean squared error

Computers & Industrial Engineering ◽

10.1016/j.cie.2010.11.006 ◽

2011 ◽

Vol 60 (2) ◽

pp. 248-255 ◽

Cited By ~ 30

Author(s):

Sangmun Shin ◽

Funda Samanlioglu ◽

Byung Rae Cho ◽

Margaret M. Wiecek

Keyword(s):

Robust Design ◽

Mean Squared Error ◽

Squared Error ◽

Trade Offs ◽

The Mean


Measuring The Tensile Strain of Wood By Visible And Near-Infrared Spatially Resolved Spectroscopy

10.21203/rs.3.rs-570550/v1 ◽

2021 ◽

Author(s):

Ma Te ◽

Tetsuya Inagaki ◽

Masato Yoshida ◽

Mayumi Ichino ◽

Satoru Tsuchikawa

Keyword(s):

Light Scattering ◽

Prediction Accuracy ◽

Tensile Strain ◽

Near Infrared ◽

Mean Squared Error ◽

Calibration Model ◽

Spatially Resolved ◽

Squared Error ◽

Stiffness Evaluation ◽

Spatially Resolved Spectroscopy

Wood has various mechanical properties, so stiffness evaluation is critical for quality management. Using conventional strain gauges constantly is costly, and measuring precious wood materials with them is challenging because they require a strong adhesive. This study demonstrates the correlation between light scattering changes inside the wood cell walls and tensile strain. A multifiber-based visible-near-infrared (Vis–NIR) spatially resolved spectroscopy (SRS) system was designed to rapidly and conveniently acquire such light scattering changes. For the preliminary experiment, samples with different thicknesses were measured to evaluate the influence of thickness. The differences in Vis–NIR SRS spectral data diminish with an increase in sample thickness, which suggests that the SRS method can successfully measure the whole strain (i.e., surface and inside) of wood samples. Then, for the primary experiment, 18 wood samples with the same thickness (2 mm) were tested to construct a strain calibration model. The prediction accuracy was characterized by a determination coefficient (R2) of 0.86 with a root mean squared error (RMSE) of 297.89 με for five-fold cross-validation; for test validation, the prediction accuracy was characterized by an R2 of 0.82 and an RMSE of 345.44 με.
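The two accuracy figures quoted in this abstract, R2 and RMSE, can be computed as in the sketch below. It assumes that R2 is defined as one minus the ratio of residual to total sum of squares; some calibration studies report the squared correlation coefficient instead, and the abstract does not say which was used. The function names and the toy numbers in the usage note are hypothetical and are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the measurements."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Determination coefficient as 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

For example, with hypothetical reference strains `[100, 200, 300, 400]` and predictions `[110, 190, 310, 390]`, every prediction is off by 10, so `rmse` returns 10.0, while `r_squared` is close to 1 because the errors are small relative to the spread of the reference values.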


Day-Ahead Forecasting of Hourly Photovoltaic Power Based on Robust Multilayer Perception

Sustainability ◽

10.3390/su10124863 ◽

2018 ◽

Vol 10 (12) ◽

pp. 4863 ◽

Cited By ~ 6

Author(s):

Chao Huang ◽

Longpeng Cao ◽

Nanxin Peng ◽

Sijia Li ◽

Jing Zhang ◽

...

Keyword(s):

Power Plants ◽

Mean Squared Error ◽

Absolute Error ◽

Multilayer Perception ◽

Squared Error ◽

The Mean ◽

Effectiveness And Efficiency ◽

Mlp Network ◽

Grid Operation ◽

Better Than

Photovoltaic (PV) modules convert renewable and sustainable solar energy into electricity. However, the uncertainty of PV power production brings challenges for grid operation. To facilitate the management and scheduling of PV power plants, forecasting is an essential technique. In this paper, a robust multilayer perception (MLP) neural network was developed for day-ahead forecasting of hourly PV power. A generic MLP is usually trained by minimizing the mean squared loss. The mean squared error is sensitive to a few particularly large errors, which can lead to a poor estimator. To tackle the problem, the pseudo-Huber loss function, which combines the best properties of squared loss and absolute loss, was adopted in this paper. The effectiveness and efficiency of the proposed method were verified by benchmarking against a generic MLP network with real PV data. Numerical experiments illustrated that the proposed method performed better than the generic MLP network in terms of root mean squared error (RMSE) and mean absolute error (MAE).
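The claim that the pseudo-Huber loss combines the properties of squared and absolute loss can be checked with a few lines of code. The sketch below uses the standard textbook form of the pseudo-Huber loss; the scale parameter `delta` and its default value are assumptions, since the abstract does not state which setting the authors used.

```python
import math


def squared_loss(residual):
    """Half squared error; grows quadratically, so outliers dominate."""
    return 0.5 * residual ** 2


def pseudo_huber_loss(residual, delta=1.0):
    """Pseudo-Huber loss: roughly 0.5 * r^2 near zero, roughly delta * |r| for large |r|."""
    return delta ** 2 * (math.sqrt(1.0 + (residual / delta) ** 2) - 1.0)


def pseudo_huber_grad(residual, delta=1.0):
    """Derivative with respect to the residual; its magnitude never exceeds delta."""
    return residual / math.sqrt(1.0 + (residual / delta) ** 2)
```

Because the gradient is bounded, a single very large forecasting error cannot dominate a training update the way it does under the squared loss, whose gradient grows linearly with the residual.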


On double stage minimax-shrinkage estimator for generalized Rayleigh model

International Journal of Applied Mathematical Research ◽

10.14419/ijamr.v5i1.5553 ◽

2016 ◽

Vol 5 (1) ◽

pp. 39 ◽

Cited By ~ 1

Author(s):

Abbas Najim Salman ◽

Maymona Ameen

Keyword(s):

Sample Size ◽

Shape Parameter ◽

Mean Squared Error ◽

Scale Parameter ◽

Rayleigh Distribution ◽

Shrinkage Estimator ◽

Squared Error ◽

Expected Sample Size ◽

Generalized Rayleigh Distribution ◽

The Mean

This paper is concerned with a minimax shrinkage estimator that uses a double stage shrinkage technique to lower the mean squared error. It is intended to estimate the shape parameter (a) of the Generalized Rayleigh distribution in a region (R) around available prior knowledge (a0) about the actual value (a), used as an initial estimate, in the case when the scale parameter (l) is known. In situations where the experiments are time consuming or very costly, a double stage procedure can be used to reduce the expected sample size needed to obtain the estimator. The proposed estimator is shown to have a smaller mean squared error for certain choices of the shrinkage weight factor y(×) and a suitable region R. Expressions for the Bias, Mean squared error (MSE), Expected sample size [E (n/a, R)], Expected sample size proportion [E(n/a,R)/n], probability of avoiding the second sample, and percentage of overall sample saved are derived for the proposed estimator. Numerical results and conclusions for the expressions mentioned above are presented for the case in which the considered estimator is a testimator of level of significance D. Comparisons with the minimax estimator and with the most recent studies were made to show the effectiveness of the proposed estimator.


Performance Analysis of AOA-Based Localization Using the LS Approach: Explicit Expression of Mean-Squared Error

Journal of Sensors ◽

10.1155/2020/9346142 ◽

2020 ◽

Vol 2020 ◽

pp. 1-22

Author(s):

Byung-Kwon Son ◽

Do-Jin An ◽

Joon-Ho Lee

Keyword(s):

Explicit Expression ◽

Mean Squared Error ◽

Weighted Least Squares ◽

Localization Algorithm ◽

Angle Of Arrival ◽

Squared Error ◽

Distance Weighted ◽

The Mean ◽

Passive Localization ◽

Location Estimate

In this paper, a passive localization of the emitter using noisy angle-of-arrival (AOA) measurements, called Brown DWLS (Distance Weighted Least Squares) algorithm, is considered. The accuracy of AOA-based localization is quantified by the mean-squared error. Various estimates of the AOA-localization algorithm have been derived (Doğançay and Hmam, 2008). Explicit expression of the location estimate of the previous study is used to get an analytic expression of the mean-squared error (MSE) of one of the various estimates. To validate the derived expression, we compare the MSE from the Monte Carlo simulation with the analytically derived MSE.
