On Bias Plus Variance

1997 ◽  
Vol 9 (6) ◽  
pp. 1211-1243 ◽  
Author(s):  
David H. Wolpert

This article presents several additive corrections to the conventional quadratic-loss bias-plus-variance formula. One of these corrections is appropriate when both the target is not fixed (as in Bayesian analysis) and training sets are averaged over (as in the conventional bias-plus-variance formula). Another additive correction casts conventional fixed-training-set Bayesian analysis directly in terms of bias plus variance. Another correction is appropriate for measuring full generalization error over a test set rather than (as with conventional bias plus variance) error at a single point. Yet another correction can help explain the recent counterintuitive bias-variance decomposition of Friedman for zero-one loss. After presenting these corrections, this article discusses some other loss-function-specific aspects of supervised learning. In particular, there is a discussion of the fact that if the loss function is a metric (e.g., zero-one loss), then there is a bound on the change in generalization error accompanying changing the algorithm's guess from h1 to h2, a bound that depends only on h1 and h2 and not on the target. This article ends by presenting versions of the bias-plus-variance formula appropriate for logarithmic and quadratic scoring, and then all the additive corrections appropriate to those formulas. Each of the correction terms presented is a covariance between the learning algorithm and the posterior distribution over targets. Accordingly, in the (very common) contexts in which those terms apply, there is not a “bias-variance trade-off” or a “bias-variance dilemma,” as one often hears. Rather, there is a bias-variance-covariance trade-off.
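For reference, with both the algorithm's guess h and the target f treated as random (the setting in which the covariance correction arises), quadratic loss admits the following standard decomposition; the notation here is generic and not the article's own:

```latex
% Generic identity (notation is illustrative, not the article's):
% guess h and target f are both random variables.
\mathbb{E}\!\left[(h-f)^2\right]
  = \underbrace{\left(\mathbb{E}[h]-\mathbb{E}[f]\right)^2}_{\text{bias}^2}
  + \underbrace{\operatorname{Var}(h)}_{\text{variance}}
  + \underbrace{\operatorname{Var}(f)}_{\text{target noise}}
  - \underbrace{2\,\operatorname{Cov}(h,f)}_{\text{covariance term}}
```

When the target is held fixed, the Var(f) and Cov(h, f) terms vanish and the familiar bias-plus-variance form is recovered.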

2017 ◽  
Vol 5 (2) ◽  
pp. 141
Author(s):  
Wajiha Nasir

In this study, the Frechet distribution is analyzed within a Bayesian framework. Posterior distributions are derived under gamma and exponential priors. Bayes estimators and their posterior risks are obtained under five different loss functions, and the hyperparameters are elicited via prior predictive distributions. A simulation study is carried out to examine the behavior of the posterior distributions. The quasi-quadratic loss function combined with the exponential prior is found to perform best overall.
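As an illustration of the type of computation involved (a minimal sketch under simplifying assumptions, not the study's exact model): if the Frechet shape parameter is known and a gamma prior is placed on the remaining scale-type parameter, the prior is conjugate and the Bayes estimator under squared error loss is simply the posterior mean.

```python
import numpy as np

def frechet_posterior(x, alpha, a, b):
    """Posterior of theta for Frechet data with known shape alpha.

    Assumes density theta * alpha * x**(-alpha - 1) * exp(-theta * x**(-alpha))
    and a Gamma(a, b) prior (rate parameterization) on theta, which is conjugate:
    the posterior is Gamma(a + n, b + sum(x**(-alpha))).
    """
    x = np.asarray(x, dtype=float)
    return a + x.size, b + np.sum(x ** (-alpha))

def bayes_estimate_sel(a_post, b_post):
    """Bayes estimator under squared error loss = posterior mean of theta."""
    return a_post / b_post

# Illustrative usage with simulated data (all values hypothetical).
rng = np.random.default_rng(0)
alpha, theta_true = 2.0, 1.5
u = rng.uniform(size=200)
x = (-np.log(u) / theta_true) ** (-1.0 / alpha)   # Frechet draws via inverse CDF
a_post, b_post = frechet_posterior(x, alpha, a=1.0, b=1.0)
print("posterior mean of theta:", bayes_estimate_sel(a_post, b_post))
```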


Author(s):  
Innocent Boyle Eraikhuemen ◽  
Olateju Alao Bamigbala ◽  
Umar Alhaji Magaji ◽  
Bassa Shiwaye Yakura ◽  
Kabiru Ahmed Manju

In the present paper, a three-parameter Weibull-Lindley distribution is considered for Bayesian analysis. The shape parameter of the Weibull-Lindley distribution is estimated using both classical and Bayesian methods. Bayesian estimators are obtained under Jeffrey's prior, a uniform prior, and a gamma prior, each combined with the squared error, quadratic, and precautionary loss functions. Estimation by the method of maximum likelihood is also discussed. The methods are compared by mean squared error through a simulation study with varying parameter values and sample sizes.
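For context, the Bayes estimators under these loss functions have standard closed forms in terms of posterior moments and can be evaluated from posterior draws; the sketch below is generic (the posterior sampler for the Weibull-Lindley shape parameter is assumed to exist and is not shown): squared error loss gives the posterior mean, the quadratic loss L(θ, d) = ((θ − d)/θ)² gives E[θ⁻¹]/E[θ⁻²], and the precautionary loss L(θ, d) = (θ − d)²/d gives √E[θ²].

```python
import numpy as np

def bayes_estimators(theta_draws):
    """Bayes estimators of a positive parameter from posterior draws.

    Uses the standard closed forms in terms of posterior moments for the
    squared error, quadratic, and precautionary loss functions.
    """
    t = np.asarray(theta_draws, dtype=float)
    return {
        "squared_error": t.mean(),                               # posterior mean
        "quadratic": np.mean(1.0 / t) / np.mean(1.0 / t**2),     # E[1/t] / E[1/t^2]
        "precautionary": np.sqrt(np.mean(t**2)),                 # sqrt(E[t^2])
    }

# Illustrative usage with hypothetical posterior draws (stand-in sampler).
rng = np.random.default_rng(1)
draws = rng.gamma(shape=5.0, scale=0.4, size=10_000)
print(bayes_estimators(draws))
```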


2012 ◽  
Vol 190-191 ◽  
pp. 977-981 ◽  
Author(s):  
Xian Bin Wu

This paper presents a Bayesian analysis of zero-failure data with double hyperparameters a and b. The prior distribution of the failure probability p_i is taken to be its conjugate Beta distribution on the interval (p_{i-1}, 1), with the hyperparameter b uniformly distributed on (1, c). Under the quadratic loss function, the E-Bayesian estimate of p_i is derived for p_i in (p_{i-1}, 1), and for 0 < c < s_i, under two stated conditions, ordering relations among the estimates are established. The properties of the E-Bayesian estimation are given. A simulation example is discussed, which shows that the method is both efficient and easy to operate.
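The E-Bayesian idea can be illustrated with a deliberately simplified sketch (not the paper's exact setup: here the prior on p_i is an ordinary Beta(1, b) on (0, 1), the conditional Bayes estimate is taken as the posterior mean, and b is averaged over its Uniform(1, c) hyperprior; the paper's truncated prior and quadratic loss lead to different expressions):

```python
import numpy as np
from scipy import integrate

def bayes_estimate(b, n):
    """Posterior mean of p for zero failures in n tests under a Beta(1, b) prior.

    With zero failures the posterior is Beta(1, b + n), whose mean is 1 / (1 + b + n).
    """
    return 1.0 / (1.0 + b + n)

def e_bayes_estimate(n, c):
    """E-Bayesian estimate: average the Bayes estimate over b ~ Uniform(1, c)."""
    val, _ = integrate.quad(lambda b: bayes_estimate(b, n), 1.0, c)
    return val / (c - 1.0)

# The integral also has the closed form log((n + c + 1) / (n + 2)) / (c - 1).
n, c = 50, 4.0
print(e_bayes_estimate(n, c), np.log((n + c + 1) / (n + 2)) / (c - 1))
```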


2021 ◽  
Vol 13 (9) ◽  
pp. 1779
Author(s):  
Xiaoyan Yin ◽  
Zhiqun Hu ◽  
Jiafeng Zheng ◽  
Boyong Li ◽  
Yuanyuan Zuo

Radar beam blockage is an important error source that affects the quality of weather radar data. An echo-filling network (EFnet) based on a deep learning algorithm is proposed to correct the echo intensity in the occluded area of the Nanjing S-band new-generation weather radar (CINRAD/SA). The training dataset is constructed from labels, which are the echo intensities at the 0.5° elevation in the unblocked area, and from input features, which are the intensities in a cube spanning multiple elevations and range gates corresponding to the location of each label. Two loss functions are used to compile the network: one is the common mean square error (MSE), and the other is a self-defined loss function that increases the weight of strong echoes. Considering that the radar beam broadens with distance and height, the 0.5° elevation scan is divided into six range bands of 25 km each to train separate models. The models are evaluated by three indicators: explained variance (EVar), mean absolute error (MAE), and correlation coefficient (CC). Two cases are presented to compare the behavior of the echo-filling model under the two loss functions. The results suggest that EFnet can effectively correct the echo reflectivity and improve the data quality in the occluded area, with better results for strong echoes when the self-defined loss function is used.
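The self-defined loss is described only as an MSE variant that up-weights strong echoes; one possible form is sketched below, with the threshold and weight values purely illustrative:

```python
import numpy as np

def weighted_mse(y_true, y_pred, threshold_dbz=35.0, strong_weight=4.0):
    """MSE that up-weights errors on strong echoes (illustrative sketch).

    y_true, y_pred : reflectivity fields in dBZ (arrays of the same shape).
    Pixels whose true reflectivity exceeds `threshold_dbz` receive weight
    `strong_weight`; all others receive weight 1. Both values are hypothetical.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = np.where(y_true > threshold_dbz, strong_weight, 1.0)
    return np.mean(w * (y_pred - y_true) ** 2)
```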


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2803
Author(s):  
Rabeea Jaffari ◽  
Manzoor Ahmed Hashmani ◽  
Constantino Carlos Reyes-Aldasoro

The segmentation of power lines (PLs) from aerial images is a crucial task for the safe navigation of unmanned aerial vehicles (UAVs) operating at low altitudes. Despite the advances in deep learning-based approaches for PL segmentation, these models are still vulnerable to the class imbalance present in the data. The PLs occupy only a minimal portion (1–5%) of the aerial images as compared to the background region (95–99%). Generally, this class imbalance problem is addressed via the use of PL-specific detectors in conjunction with the popular balanced binary cross-entropy (BBCE) loss function. However, these PL-specific detectors do not work outside their application areas, and the BBCE loss requires hyperparameter tuning of the class-wise weights, which is not trivial. Moreover, the BBCE loss results in low dice scores and precision values and thus fails to achieve an optimal trade-off between dice scores, model accuracy, and precision–recall values. In this work, we propose a generalized focal loss function based on the Matthews correlation coefficient (MCC), or the Phi coefficient, to address the class imbalance problem in PL segmentation while utilizing a generic deep segmentation architecture. We evaluate our loss function by improving the vanilla U-Net model with an additional convolutional auxiliary classifier head (ACU-Net) for better learning and faster model convergence. The evaluation on two PL datasets, namely the Mendeley Power Line Dataset and the Power Line Dataset of Urban Scenes (PLDU), where PLs occupy around 1% and 2% of the aerial image area, respectively, reveals that our proposed loss function outperforms the popular BBCE loss by 16% in PL dice scores on both datasets, by 19% in precision and false detection rate (FDR) values for the Mendeley PL dataset, and by 15% in precision and FDR values for the PLDU, with a minor degradation in the accuracy and recall values. Moreover, our proposed ACU-Net outperforms the baseline vanilla U-Net by 1–10% on the characteristic evaluation parameters for both PL datasets. Thus, our proposed loss function with ACU-Net achieves an optimal trade-off for the characteristic evaluation parameters without any bells and whistles. Our code is available on GitHub.
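A minimal sketch of an MCC-based focal-style loss of the kind described is given below; the soft confusion counts and the focusing exponent gamma are illustrative choices, not necessarily the paper's exact formulation:

```python
import numpy as np

def focal_mcc_loss(y_true, y_prob, gamma=1.5, eps=1e-7):
    """MCC (Phi coefficient) based focal-style loss for binary segmentation.

    Soft confusion-matrix counts are computed from predicted probabilities,
    the Matthews correlation coefficient is formed from them, and (1 - MCC)
    is raised to a focusing exponent `gamma` (value illustrative).
    """
    y = np.asarray(y_true, dtype=float).ravel()
    p = np.asarray(y_prob, dtype=float).ravel()
    tp = np.sum(p * y)
    fp = np.sum(p * (1.0 - y))
    fn = np.sum((1.0 - p) * y)
    tn = np.sum((1.0 - p) * (1.0 - y))
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) + eps
    mcc = (tp * tn - fp * fn) / denom
    return (1.0 - mcc) ** gamma
```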


Technologies ◽  
2021 ◽  
Vol 9 (1) ◽  
pp. 14
Author(s):  
James Dzisi Gadze ◽  
Akua Acheampomaa Bamfo-Asante ◽  
Justice Owusu Agyemang ◽  
Henry Nunoo-Mensah ◽  
Kwasi Adu-Boahen Opare

Software-Defined Networking (SDN) is a new paradigm that revolutionizes the idea of a software-driven network through the separation of the control and data planes. It addresses the problems of traditional network architectures. Nevertheless, this brilliant architecture is exposed to several security threats, e.g., the distributed denial of service (DDoS) attack, which is hard to contain in such software-based networks. The concept of a centralized controller in SDN makes it a single point of attack as well as a single point of failure. In this paper, deep learning-based models, long short-term memory (LSTM) and convolutional neural network (CNN), are investigated, and their feasibility and efficiency in detecting and mitigating DDoS attacks are illustrated. The paper focuses on TCP, UDP, and ICMP flood attacks that target the controller. The performance of the models was evaluated based on accuracy, recall, and true negative rate. We compared the performance of the deep learning models with classical machine learning models. We further provide details on the time taken to detect and mitigate the attack. Our results show that the RNN LSTM is a viable deep learning algorithm that can be applied to the detection and mitigation of DDoS attacks on the SDN controller. Our proposed model produced an accuracy of 89.63%, which outperformed classical models such as SVM (86.85%) and Naive Bayes (82.61%). Although KNN, another classical model, outperformed our proposed model (achieving an accuracy of 99.4%), our proposed model provides a good trade-off between precision and recall, which makes it suitable for DDoS classification. In addition, it was observed that the train/test split ratio can change the measured performance of a deep learning algorithm in a given study. The model achieved the best performance with a 70/30 split, compared to the 80/20 and 60/40 split ratios.
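A minimal sketch of an LSTM-based flow classifier of the kind investigated is shown below; the sequence length, feature count, and layer sizes are illustrative and not the paper's configuration:

```python
import tensorflow as tf

def build_lstm_detector(seq_len=10, n_features=20):
    """Minimal LSTM-based binary classifier for DDoS flow detection (sketch).

    Inputs are assumed to be sequences of per-flow statistics; the output is
    the probability that a window of traffic belongs to an attack.
    seq_len and n_features are hypothetical values.
    """
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(seq_len, n_features)),
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy", tf.keras.metrics.Recall()])
    return model
```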


Energies ◽  
2021 ◽  
Vol 14 (12) ◽  
pp. 3654
Author(s):  
Nastaran Gholizadeh ◽  
Petr Musilek

In recent years, machine learning methods have found numerous applications in power systems for load forecasting, voltage control, power quality monitoring, anomaly detection, and more. Distributed learning is a subfield of machine learning and a descendant of the multi-agent systems field. It is a collaborative, decentralized approach to machine learning designed to handle large data volumes, solve complex learning problems, and increase privacy. Moreover, it can reduce the risk of a single point of failure compared to fully centralized approaches and lower bandwidth and central storage requirements. This paper introduces three existing distributed learning frameworks and reviews the applications that have been proposed for them in power systems so far. It summarizes the methods, benefits, and challenges of distributed learning frameworks in power systems and identifies gaps in the literature for future studies.
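As one concrete example of how such frameworks operate (the abstract does not name the three frameworks, so federated averaging is used here purely as an illustration; the function below is a hypothetical sketch):

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """One aggregation round of federated averaging (illustrative sketch).

    client_weights : list of per-client model weights, each a list of numpy
                     arrays produced by local training on that client's data.
    client_sizes   : number of local training samples per client, used to
                     weight the average so larger datasets count more.
    Raw data never leaves the clients; only model weights are shared.
    """
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    return [
        sum(w[k] * (n / total) for w, n in zip(client_weights, client_sizes))
        for k in range(n_layers)
    ]
```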


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Zhangguo Tang ◽  
Junfeng Wang ◽  
Huanzhou Li ◽  
Jian Zhang ◽  
Junhao Wang

In the intelligent era of human-computer symbiosis, the use of machine learning methods for covert communication confrontation has become a hot topic in network security. Existing covert communication techniques focus on the statistical abnormality of traffic behavior and do not consider the perceptual abnormality seen by security censors, so they face the core problem of lacking cognitive ability. In order to further improve the concealment of communication, a game method of “cognitive deception” is proposed, aimed at eliminating traffic anomalies in both the behavioral and cognitive dimensions. Accordingly, a Wasserstein Generative Adversarial Network of Covert Channel (WCCGAN) model is established. The model uses constrained sampling from cognitive priors to construct the constraint mechanisms of “functional equivalence” and “cognitive equivalence” and is trained by a dynamic strategy-updating learning algorithm. The generative module adopts joint representation learning that integrates network protocol knowledge to improve the expressiveness and discriminability of cognitive traffic features. The equivalence module guides the discriminant module to learn pragmatically relevant features through a traffic activity loss function and a protocol application loss function for end-to-end training. The experimental results show that WCCGAN can directly synthesize traffic with comprehensive concealment ability, with behavioral concealment and cognitive deception rates as high as 86.2% and 96.7%, respectively. Moreover, the model has good convergence and generalization ability and does not depend on specific assumptions or specific covert algorithms, realizing a new paradigm of cognitive games in covert communication.
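The Wasserstein-GAN backbone that WCCGAN builds on optimizes the following generic critic and generator objectives; this is a plain WGAN sketch, not the WCCGAN model itself, and the equivalence constraints and traffic/protocol loss terms described above are not reproduced:

```python
import torch

def critic_loss(critic, real, fake):
    """WGAN critic objective: maximize mean(critic(real)) - mean(critic(fake));
    returned negated so it can be minimized with a standard optimizer."""
    return critic(fake).mean() - critic(real).mean()

def generator_loss(critic, fake):
    """WGAN generator objective: push synthesized (covert) traffic toward
    regions the critic scores as real."""
    return -critic(fake).mean()

# Illustrative usage with stand-in modules (shapes and sizes hypothetical).
critic = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))
generator = torch.nn.Sequential(torch.nn.Linear(32, 128))
noise = torch.randn(16, 32)
real = torch.randn(16, 128)          # stand-in for real traffic features
fake = generator(noise)              # stand-in for synthesized covert traffic
print(critic_loss(critic, real, fake).item(), generator_loss(critic, fake).item())
```

A full WGAN additionally constrains the critic to be approximately 1-Lipschitz (e.g., via weight clipping or a gradient penalty), a step omitted here for brevity.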

