The SWAG solution for probabilistic predictions with a single neural network

Author(s):  
Yann Haddad ◽  
Michaël Defferrard ◽  
Gionata Ghiggi

Ensemble predictions are essential to characterize forecast uncertainty and the likelihood of an event occurring. Stochasticity in predictions comes from data and model uncertainty. In deep learning (DL), data uncertainty can be addressed by training an ensemble of DL models on data subsets or by performing data augmentations (e.g., random or singular value decomposition (SVD) perturbations). Model uncertainty is typically addressed by training a DL model multiple times from different weight initializations (DeepEnsemble) or by training sub-networks by dropping weights (Dropout). Dropout is cheap but less effective, while DeepEnsemble is computationally expensive.

We propose instead to tackle model uncertainty with SWAG (Maddox et al., 2019), a method that learns stochastic weights whose sampling allows hundreds of forecast realizations to be drawn at a fraction of the cost required by DeepEnsemble. In the context of data-driven weather forecasting, we demonstrate that the SWAG ensemble i) has better deterministic skill than a single DL model trained in the usual way, and ii) approaches the deterministic and probabilistic skill of DeepEnsemble at a fraction of the cost. Finally, multiSWAG (SWAG applied on top of DeepEnsemble models) provides a trade-off between computational cost, model diversity, and performance.

We believe that the method we present will become a common tool for generating large ensembles at a fraction of the current cost. Additionally, the possibility of sampling DL models allows the design of data-driven/emulated stochastic model components and sub-grid parameterizations.

Reference

Maddox, W. J., Garipov, T., Izmailov, P., Vetrov, D., and Wilson, A. G., 2019: A Simple Baseline for Bayesian Uncertainty in Deep Learning. arXiv:1902.02476.
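
To make the weight-sampling idea concrete, the following is a minimal sketch of the diagonal variant of SWAG in PyTorch. It is not the authors' implementation; `train_one_epoch`, `n_snapshots`, and `scale` are illustrative assumptions, and the full method in Maddox et al. (2019) additionally maintains a low-rank deviation matrix.

```python
# Minimal sketch of diagonal SWAG (Maddox et al., 2019); names are illustrative.
import copy
import torch

def collect_swag_moments(model, train_one_epoch, n_snapshots=20):
    """Continue SGD from a pretrained solution and accumulate weight moments."""
    mean = [torch.zeros_like(p) for p in model.parameters()]
    sq_mean = [torch.zeros_like(p) for p in model.parameters()]
    for k in range(1, n_snapshots + 1):
        train_one_epoch(model)                                # one SGD epoch per snapshot
        for m, s, p in zip(mean, sq_mean, model.parameters()):
            m.mul_((k - 1) / k).add_(p.detach() / k)          # running mean of weights
            s.mul_((k - 1) / k).add_(p.detach() ** 2 / k)     # running mean of squared weights
    return mean, sq_mean

def sample_swag_model(model, mean, sq_mean, scale=0.5):
    """Draw one ensemble member by sampling weights from the diagonal Gaussian."""
    sampled = copy.deepcopy(model)
    for p, m, s in zip(sampled.parameters(), mean, sq_mean):
        var = torch.clamp(s - m ** 2, min=1e-30)              # diagonal weight variance
        p.data.copy_(m + scale * var.sqrt() * torch.randn_like(m))
    return sampled
```

Each call to `sample_swag_model` yields a new forecast realization, so an ensemble of arbitrary size can be drawn from a single training run.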

2021 ◽  
Author(s):  
Lei Xu ◽  
Nengcheng Chen ◽  
Chao Yang

Abstract. Precipitation forecasting is an important task in weather science. In recent years, data-driven precipitation forecasting techniques have come to complement numerical prediction in tasks such as precipitation nowcasting, monthly precipitation projection, and extreme precipitation event identification. In data-driven precipitation forecasting, the predictive uncertainty arises mainly from data and model uncertainties. Current deep learning forecasting methods can model the parametric uncertainty by random sampling from the parameters. However, the data uncertainty is usually ignored in the forecasting process, so the derivation of predictive uncertainty is incomplete. In this study, the input data uncertainty, target data uncertainty, and model uncertainty are jointly modeled in a deep learning precipitation forecasting framework to estimate the predictive uncertainty. Specifically, the data uncertainty is estimated a priori and the input uncertainty is propagated forward through the model weights according to the law of error propagation. The model uncertainty is considered by sampling from the parameters and is coupled with the input and target data uncertainties in the objective function during the training process. Finally, the predictive uncertainty is produced by propagating the input uncertainty and sampling the weights in the testing process. The experimental results indicate that the proposed joint uncertainty modeling and precipitation forecasting framework exhibits forecasting accuracy comparable with existing methods, while reducing the predictive uncertainty to a large extent relative to two existing joint uncertainty modeling approaches. The developed joint uncertainty modeling method is a general uncertainty estimation approach for data-driven forecasting applications.
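
As a rough illustration of how input uncertainty can be propagated through model weights by the law of error propagation, the sketch below treats a single linear layer with independent input errors; it is only an assumption-laden illustration, not the authors' framework.

```python
# First-order error propagation through y = W x + b: with independent input
# errors, Var[y] ≈ (W ∘ W) Var[x] (the law of error propagation).
import torch

def propagate_linear(x_mean, x_var, weight, bias):
    """Propagate an input mean and (diagonal) variance through a linear layer."""
    y_mean = x_mean @ weight.T + bias
    y_var = x_var @ (weight ** 2).T       # first-order propagation of the variance
    return y_mean, y_var

# Example: a 3-feature input with a priori uncertainty, mapped to 2 outputs.
W = torch.randn(2, 3)
b = torch.zeros(2)
x_mu, x_sigma2 = torch.randn(1, 3), torch.full((1, 3), 0.1)
y_mu, y_sigma2 = propagate_linear(x_mu, x_sigma2, W, b)
```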


2020 ◽  
Author(s):  
Ben Geoffrey

The rise in the application of data science and machine/deep learning methods in the chemical and biological sciences must be discussed in the light of the fore-running disciplines of bio/chem-informatics and computational chemistry and biology, which helped accumulate the enormous research data because of which the successful application of data-driven approaches has now become possible. Many of the tasks and goals of ab initio methods in computational chemistry, such as the determination of optimized structures and other molecular properties of atoms, molecules, and compounds, are being carried out at much lower computational cost with data-driven machine/deep learning-based predictions. One observes a similar trend in computational biology, wherein data-driven machine/deep learning methods are being proposed to predict the structure and dynamics of interactions of biological macromolecules such as proteins and DNA, over computationally expensive molecular dynamics-based methods. In the cheminformatics space, one sees the rise of deep neural network-based methods that have scaled traditional structure-property/structure-activity relationships to handle big data, in order to design new materials with desired properties and drugs with the required activity through deep learning-based de novo molecular design methods. In the bioinformatics space, data-driven machine/deep learning approaches to genomic and proteomic data have led to interesting applications in fields such as precision medicine, prognosis prediction, and more. Thus, the success story of the application of data science, machine/deep learning, and artificial intelligence to the disciplines of chem/bio-informatics and computational chemistry and biology has been told in light of how these fore-running disciplines created huge repositories of data for data-driven approaches to succeed in these disciplines.


2022 ◽  
Vol 19 (1) ◽  
pp. 1-26
Author(s):  
Prasanth Chatarasi ◽  
Hyoukjun Kwon ◽  
Angshuman Parashar ◽  
Michael Pellauer ◽  
Tushar Krishna ◽  
...  

A spatial accelerator’s efficiency depends heavily on both its mapper and cost models to generate optimized mappings for various operators of DNN models. However, existing cost models lack a formal boundary over their input programs (operators) for accurate and tractable cost analysis of the mappings, and this results in adaptability challenges to the cost models for new operators. We consider the recently introduced Maestro Data-Centric (MDC) notation and its analytical cost model to address this challenge, because any mapping expressed in the notation is precisely analyzable using the MDC’s cost model. In this article, we characterize the set of input operators and their mappings expressed in the MDC notation by introducing a set of conformability rules. The outcome of these rules is that any loop nest that is perfectly nested with affine tensor subscripts and without conditionals is conformable to the MDC notation. A majority of the primitive operators in deep learning are such loop nests. In addition, our rules enable us to automatically translate a mapping expressed in the loop nest form to MDC notation and use the MDC’s cost model to guide upstream mappers. Our conformability rules over the input operators result in a structured mapping space of the operators, which enables us to introduce a mapper based on our decoupled off-chip/on-chip approach to accelerate mapping space exploration. Our mapper decomposes the original higher-dimensional mapping space of operators into two lower-dimensional off-chip and on-chip subspaces and then optimizes the off-chip subspace followed by the on-chip subspace. We implemented our overall approach in a tool called Marvel, and a benefit of our approach is that it applies to any operator conformable with the MDC notation. We evaluated Marvel over major DNN operators and compared it with past optimizers.
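
As an illustration of the conformability rules (a hypothetical example written in Python, not taken from the article), the 1-D convolution below is a perfectly nested loop with affine tensor subscripts and no conditionals, and is therefore the kind of operator that can be expressed in the MDC notation and analyzed by its cost model.

```python
# Hypothetical conformable operator: a 1-D convolution as a perfect loop nest.
import numpy as np

def conv1d_loop_nest(inp, weights):
    # inp: (C, W) input, weights: (K, C, S) filters
    K, C, S = weights.shape
    W = inp.shape[1]
    out = np.zeros((K, W - S + 1))
    # Every subscript below (k, w), (c, w + s), (k, c, s) is an affine function
    # of the loop indices, and the body contains no conditionals.
    for k in range(K):
        for c in range(C):
            for w in range(W - S + 1):
                for s in range(S):
                    out[k, w] += inp[c, w + s] * weights[k, c, s]
    return out
```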


2020 ◽  
Author(s):  
Yu Zhao ◽  
Yue Yin ◽  
Guan Gui

Decentralized edge computing techniques have attracted strong attention in many applications of the intelligent internet of things (IIoT). Among these applications, intelligent edge surveillance (LEDS) techniques play a very important role in automatically recognizing object feature information from surveillance video by virtue of edge computing together with image processing and computer vision. Traditional centralized surveillance techniques recognize objects at the cost of high latency and high cost, and also require large amounts of storage. In this paper, we propose a deep learning-based LEDS technique for a specific IIoT application. First, we introduce depthwise separable convolutions to build a lightweight neural network and reduce its computational cost. Second, we combine edge computing with cloud computing to reduce network traffic. Third, we apply the proposed LEDS technique to a practical construction site for the validation of a specific IIoT application. The detection speed of our proposed lightweight neural network reaches 16 frames per second on edge devices. After cloud server fine detection, the precision of the detection reaches 89%. In addition, the operating cost at the edge device is only one-tenth of that of the centralized server.
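
For reference, a depthwise separable convolution block of the kind used to build lightweight networks can be sketched as follows; this is a generic, assumed PyTorch block, not the authors' exact architecture.

```python
# Depthwise separable convolution: a per-channel depthwise convolution followed
# by a 1x1 pointwise convolution, which reduces the multiply-accumulate cost
# roughly by a factor of 1/out_channels + 1/k^2 versus a standard convolution.
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                   stride=stride, padding=kernel_size // 2,
                                   groups=in_channels, bias=False)
        self.pointwise = nn.Conv2d(in_channels, out_channels, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```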


2021 ◽  
Author(s):  
Marwa Majdi ◽  
David Delene

Unmanned Aircraft System (UAS) operations have spread rapidly worldwide, performing a variety of military and civilian applications. The ability and performance of UAS to carry out these applications are strongly affected by poor weather conditions. Fog is one of the critical issues that threaten the safety of UAS missions by altering visibility. Therefore, mission planning based on accurate visibility nowcasts prior to Beyond Visual Line Of Sight (BVLOS) UAS missions will be mandatory to ensure safer UAS operations.

Two types of models are generally considered for visibility nowcasting: physics-based and data-driven models. However, physics-based visibility forecasts remain expensive and difficult to use operationally. Recently, with the increase in the amount of available historical data, data-driven models, especially those using deep learning approaches, have attracted increasing attention in weather forecasting and have proven themselves to be a powerful prediction tool.

This study aims at developing a Visibility Nowcasting System (VNS) that improves the performance and capability of nowcasting visibility using deep learning over the U.S. To that end, a deep neural network, namely an encoder-decoder convolutional neural network (CNN), is used to demonstrate specifically how basic NWP fields such as temperature, wind speed, and relative humidity, together with visibility from surface observations, can provide accurate visibility nowcasts. The VNS will then be tested in different geographical environments where UAS flights are deployed (for example, over North Dakota), since it can learn the time and space correlation from the historical data.

To train the network, we created a labeled data set from available METAR reports and hourly reanalysis data from the High-Resolution Rapid Refresh (HRRR) model. This data set will also be used to test the CNN and evaluate its nowcasting performance. The model will then be evaluated in operational use cases and compared to other available visibility observations during fog events.
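
A small encoder-decoder CNN of the general kind described, mapping a stack of gridded NWP fields to a visibility field, might look like the sketch below; the number of input fields and the layer sizes are assumptions, not the VNS configuration.

```python
# Illustrative encoder-decoder CNN: gridded NWP fields in, visibility map out.
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         nn.ReLU(inplace=True))

class VisibilityEncoderDecoder(nn.Module):
    def __init__(self, n_input_fields=5):
        super().__init__()
        # Encoder: extract spatial features while downsampling the grid.
        self.encoder = nn.Sequential(conv_block(n_input_fields, 32),
                                     nn.MaxPool2d(2),
                                     conv_block(32, 64),
                                     nn.MaxPool2d(2))
        # Decoder: upsample back to the original grid and predict visibility.
        self.decoder = nn.Sequential(nn.Upsample(scale_factor=2),
                                     conv_block(64, 32),
                                     nn.Upsample(scale_factor=2),
                                     conv_block(32, 16),
                                     nn.Conv2d(16, 1, 1))

    def forward(self, x):                  # x: (batch, n_input_fields, H, W)
        return self.decoder(self.encoder(x))
```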


2020 ◽  
Author(s):  
Everardo González Ávalos ◽  
Ewa Burwicz

Over the past decade, deep learning has been used to solve a wide array of regression and classification tasks. Compared to classical machine learning approaches (k-Nearest Neighbours, Random Forests, …), deep learning algorithms excel at learning complex, non-linear internal representations, in part due to the highly over-parametrised nature of their underlying models; this advantage, however, often comes at the cost of interpretability. In this work, we used a deep neural network to construct a global total organic carbon (TOC) seafloor concentration map. By implementing softmax distributions on implicitly continuous data (regression tasks), we were able to obtain probability distributions to assess prediction reliability. A variation of Dropout called Monte Carlo Dropout is also used during the inference step, providing a tool to model prediction uncertainties. We used these techniques to create a model information map, which is a key element for developing new data-driven sampling strategies for data acquisition.
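
The Monte Carlo Dropout step can be sketched as follows: a minimal illustration assuming a generic PyTorch `model` containing Dropout layers, where the mean over repeated stochastic forward passes gives the prediction and the spread gives a measure of model uncertainty.

```python
# Monte Carlo Dropout at inference time: keep Dropout active and query the
# network repeatedly; the sample spread approximates prediction uncertainty.
import torch

def mc_dropout_predict(model, x, n_samples=50):
    model.train()  # keep Dropout layers stochastic even at inference time
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)  # prediction, uncertainty
```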


2020 ◽  
Vol 13 (4) ◽  
pp. 627-640 ◽  
Author(s):  
Avinash Chandra Pandey ◽  
Dharmveer Singh Rajpoot

Background: Sentiment analysis is a contextual mining of text which determines the viewpoint of users with respect to sentimental topics commonly present on social networking websites. Twitter is one of the social sites where people express their opinion about any topic in the form of tweets. These tweets can be examined using various sentiment classification methods to find the opinion of users. Traditional sentiment analysis methods use manually extracted features for opinion classification. The manual feature extraction process is a complicated task since it requires predefined sentiment lexicons. On the other hand, deep learning methods automatically extract relevant features from data; hence, they provide better performance and richer representation competency than the traditional methods. Objective: The main aim of this paper is to enhance the sentiment classification accuracy and to reduce the computational cost. Method: To achieve the objective, a hybrid deep learning model based on a convolutional neural network and a bidirectional long short-term memory neural network has been introduced. Results: The proposed sentiment classification method achieves the highest accuracy for most of the datasets. Further, the efficacy of the proposed method has been validated through statistical analysis. Conclusion: Sentiment classification accuracy can be improved by creating well-designed hybrid models. Moreover, performance can also be enhanced by tuning the hyperparameters of deep learning models.
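
A hybrid model of this kind can be sketched as follows; this is an illustrative PyTorch example with assumed layer sizes, not the authors' exact configuration.

```python
# Hybrid CNN + BiLSTM sentiment classifier: the 1-D convolution extracts local
# n-gram features from word embeddings, and the BiLSTM models longer-range
# dependencies before a final linear classifier.
import torch
import torch.nn as nn

class CnnBiLstmClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, n_filters=64,
                 hidden_size=64, n_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, n_filters, kernel_size=3, padding=1)
        self.bilstm = nn.LSTM(n_filters, hidden_size, batch_first=True,
                              bidirectional=True)
        self.fc = nn.Linear(2 * hidden_size, n_classes)

    def forward(self, tokens):                      # tokens: (batch, seq_len)
        x = self.embedding(tokens).transpose(1, 2)  # (batch, embed_dim, seq)
        x = torch.relu(self.conv(x)).transpose(1, 2)
        _, (h, _) = self.bilstm(x)
        h = torch.cat([h[-2], h[-1]], dim=1)        # last forward/backward states
        return self.fc(h)
```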


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3937
Author(s):  
Seungeon Song ◽  
Bongseok Kim ◽  
Sangdong Kim ◽  
Jonghun Lee

Recently, Doppler radar-based foot gesture recognition has attracted attention as a hands-free tool. Doppler radar-based recognition of various foot gestures is still very challenging, and so far no studies have dealt deeply with the recognition of various foot gestures based on Doppler radar and a deep learning model. In this paper, we propose a method of foot gesture recognition using a new high-compression radar signature image and deep learning. A new high-compression radar signature is created by extracting dominant features via Singular Value Decomposition (SVD) processing, and four different foot gestures, including kicking, swinging, sliding, and tapping, are recognized by means of a deep learning AlexNet model. Instead of using the original radar signature, the proposed method improves the memory efficiency required for deep learning training by using the high-compression radar signature. Original and reconstructed radar images with high compression values of 90%, 95%, and 99% were applied to the deep learning AlexNet model. In the experiments, movements of all four different foot gestures and of a rolling baseball were recognized with an accuracy of approximately 98.64%. Due to the radar’s inherent robustness to the surrounding environment, this foot gesture recognition sensor using Doppler radar and deep learning will be widely useful in future automotive and smart home industry fields.
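
The SVD-based compression can be illustrated with the sketch below; it assumes that a compression value of, e.g., 99% means reconstructing the signature from roughly 1% of its singular components, which is an interpretation on our part rather than the paper's exact definition.

```python
# Illustrative truncated-SVD compression of a radar signature image.
import numpy as np

def svd_compress(image, compression=0.99):
    U, s, Vt = np.linalg.svd(image, full_matrices=False)
    k = max(1, int(round(len(s) * (1.0 - compression))))   # ranks kept
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]            # reconstructed image

# Example: compress a 128x128 signature at 90%, 95%, and 99%.
sig = np.random.rand(128, 128)
reconstructions = {c: svd_compress(sig, c) for c in (0.90, 0.95, 0.99)}
```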

