Information in Missing Patterns: Enhancing Prediction Accuracy in Weighted Linear Regression with Missing Data Using Soft Clustering
This paper investigates linear systems with missing information. New methods are introduced that reduce the Mean Squared Error (MSE) on the test set relative to state-of-the-art methods by appropriately tuning the bias-variance trade-off. The idea is to cluster the data and adapt the learning model to each cluster. This introduces a controlled bias into the problem, which we exploit to improve learning on the instances within a specific neighborhood. To deal with missing information, we propose a novel algorithm, "Missing-SCOP", based on the SCOP-KMEANS algorithm introduced by Wagstaff et al. It uses the missing pattern of the dataset to construct a soft-constraint matrix and to cluster the data in the missing-data scenario. We show that the controlled over-fitting induced by our algorithm improves prediction accuracy in various cases. Numerical experiments confirm the efficacy of the proposed algorithm in enhancing prediction accuracy.
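The two ingredients named above, a soft-constraint matrix derived from the missing pattern and a regression model fitted per cluster, can be illustrated with a minimal sketch. This is not the Missing-SCOP algorithm itself: the function names `missing_pattern_constraints` and `fit_per_cluster`, the mask-agreement score in [-1, 1], and the zero-imputation step are all assumptions made for illustration, and the cluster labels are taken as given rather than produced by a constrained k-means step.

```python
import numpy as np

def missing_pattern_constraints(X):
    """Hypothetical soft-constraint matrix built from missingness masks.

    Entry (i, j) is +1 when rows i and j have identical missing patterns
    and -1 when their patterns disagree on every feature.
    """
    M = np.isnan(X)                                   # True where a value is missing
    agree = (M[:, None, :] == M[None, :, :]).mean(axis=2)
    return 2.0 * agree - 1.0                          # rescale [0, 1] to [-1, 1]

def fit_per_cluster(X, y, labels):
    """Fit one least-squares model per cluster (assumed zero-imputation)."""
    X_filled = np.nan_to_num(X, nan=0.0)              # assumption: impute zeros
    A = np.hstack([X_filled, np.ones((len(X), 1))])   # append intercept column
    models = {}
    for k in np.unique(labels):
        idx = labels == k
        coef, *_ = np.linalg.lstsq(A[idx], y[idx], rcond=None)
        models[int(k)] = coef
    return models

# Toy data: two rows miss feature 1, two rows miss feature 0.
X = np.array([[1.0, np.nan], [2.0, np.nan], [np.nan, 3.0], [np.nan, 4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])
W = missing_pattern_constraints(X)
labels = np.array([0, 0, 1, 1])                       # assumed cluster assignment
models = fit_per_cluster(X, y, labels)
```

In this toy case the rows that share a missing pattern receive a positive (must-link-like) weight and rows with opposite patterns a negative (cannot-link-like) weight, which is the kind of signal a soft-constrained clustering step could consume before the per-cluster fit.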