Nowcasting Foehn Wind Events Using the AdaBoost Machine Learning Algorithm

2017 ◽  
Vol 32 (3) ◽  
pp. 1079-1099 ◽  
Author(s):  
Michael Sprenger ◽  
Sebastian Schemm ◽  
Roger Oechslin ◽  
Johannes Jenkner

Abstract The south foehn is a characteristic downslope windstorm in the valleys of the northern Alps in Europe that demands reliable forecasts because of its substantial economic and societal impacts. Traditionally, a foehn is predicted based on pressure differences and tendencies across the Alpine ridge. Here, a new objective method for foehn prediction is proposed based on a machine learning algorithm (called AdaBoost, short for adaptive boosting). Three years (2000–02) of hourly simulations of the Consortium for Small-Scale Modeling’s (COSMO) numerical weather prediction (NWP) model and corresponding foehn wind observations are used to train the algorithm to distinguish between foehn and nonfoehn events. The predictors (133 in total) are subjectively extracted from the 7-km COSMO reanalysis dataset based on the main characteristics of foehn flows. The performance of the algorithm is then assessed with a validation dataset based on a contingency table that concisely summarizes the cooccurrence of observed and predicted (non)foehn events. The main performance measures are probability of detection (88.2%), probability of false detection (2.9%), missing rate (11.8%), correct alarm ratio (66.2%), false alarm ratio (33.8%), and missed alarm ratio (0.8%). To gain insight into the prediction model, the relevance of the single predictors is determined, resulting in a predominance of pressure differences across the Alpine ridge (i.e., similar to the traditional methods) and wind speeds at the foehn stations. The predominance of pressure-related predictors is further established in a sensitivity experiment where ~2500 predictors are objectively incorporated into the prediction model using the AdaBoost algorithm. The performance is very similar to the run with the subjectively determined predictors. Finally, some practical aspects of the new foehn index are discussed (e.g., the predictability of foehn events during the four seasons). 
The correct alarm rate is highest in winter (86.5%), followed by spring (79.6%), and then autumn (69.2%). The lowest rates are found in summer (51.2%).
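
The performance measures quoted in the abstract can all be derived from a 2x2 contingency table. The following is a minimal sketch, assuming the standard forecast-verification definitions (the abstract itself does not spell them out); the function name and the counts are hypothetical and are not taken from the paper.

```python
def contingency_scores(hits, false_alarms, misses, correct_negatives):
    """Compute verification measures from a 2x2 contingency table.

    Assumed definitions (standard in forecast verification, not quoted
    from the paper):
      hits              = foehn predicted and observed
      false_alarms      = foehn predicted, not observed
      misses            = foehn observed, not predicted
      correct_negatives = no foehn predicted, none observed
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    return {
        "probability_of_detection": a / (a + c),
        "probability_of_false_detection": b / (b + d),
        "missing_rate": c / (a + c),
        "correct_alarm_ratio": a / (a + b),
        "false_alarm_ratio": b / (a + b),
        "missed_alarm_ratio": c / (c + d),
    }


# Hypothetical counts for illustration only.
scores = contingency_scores(hits=90, false_alarms=30, misses=10,
                            correct_negatives=870)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions the probability of detection and the missing rate sum to one, as do the correct alarm ratio and the false alarm ratio, which matches the pairs reported in the abstract (88.2% with 11.8%, and 66.2% with 33.8%).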

2019 ◽  
Vol 8 (07) ◽  
pp. 24680-24782
Author(s):  
Manisha Bagri ◽  
Neha Aggarwal

By 2020, around 25-50 billion devices are likely to be connected to the internet. This development gives rise to what is called the Internet of Things (IoT). The interconnected devices can generate and share data over a network. Machine learning plays a key role in IoT in handling the vast amount of data. It gives IoT devices a brain to think, which is often called intelligence. The data can be fed to machines to learn patterns; based on this training, the machines can make predictions for the future. This paper gives a brief explanation of IoT and a crisp explanation of machine learning algorithms and their types. The Support Vector Machine (SVM) is explained in detail, along with its merits and demerits. An algorithm is also proposed for weather prediction using SVM for IoT.


2021 ◽  
Vol 8 (3) ◽  
pp. 209-221
Author(s):  
Li-Li Wei ◽  
Yue-Shuai Pan ◽  
Yan Zhang ◽  
Kai Chen ◽  
Hao-Yu Wang ◽  
...  

Abstract Objective To study the application of a machine learning algorithm for predicting gestational diabetes mellitus (GDM) in early pregnancy. Methods This study identified indicators related to GDM through a literature review and expert discussion. Pregnant women who had attended medical institutions for an antenatal examination from November 2017 to August 2018 were selected for analysis, and the collected indicators were retrospectively analyzed. Based on Python, the indicators were classified and modeled using a random forest regression algorithm, and the performance of the prediction model was analyzed. Results We obtained 4806 analyzable data from 1625 pregnant women. Among these, 3265 samples with all 67 indicators were used to establish data set F1; 4806 samples with 38 identical indicators were used to establish data set F2. Each of F1 and F2 was used for training the random forest algorithm. The overall predictive accuracy of the F1 model was 93.10%, area under the receiver operating characteristic curve (AUC) was 0.66, and the predictive accuracy of GDM-positive cases was 37.10%. The corresponding values for the F2 model were 88.70%, 0.87, and 79.44%. The results thus showed that the F2 prediction model performed better than the F1 model. To explore the impact of sacrificial indicators on GDM prediction, the F3 data set was established using 3265 samples (F1) with 38 indicators (F2). After training, the overall predictive accuracy of the F3 model was 91.60%, AUC was 0.58, and the predictive accuracy of positive cases was 15.85%. Conclusions In this study, a model for predicting GDM with several input variables (e.g., physical examination, past history, personal history, family history, and laboratory indicators) was established using a random forest regression algorithm. The trained prediction model exhibited a good performance and is valuable as a reference for predicting GDM in women at an early stage of pregnancy. 
In addition, there are certain requirements for the proportions of negative and positive cases in sample data sets when the random forest algorithm is applied to the early prediction of GDM.
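
The gap between overall accuracy and the accuracy on GDM-positive cases reported above is typical of imbalanced data sets. The sketch below illustrates the effect with hypothetical labels; it assumes that "predictive accuracy of positive cases" means the fraction of true positive cases that were predicted positive, which the abstract does not state explicitly. The function name and the data are illustrative only.

```python
def accuracy_summary(y_true, y_pred):
    """Return (overall accuracy, accuracy on positive cases).

    Labels are 1 for positive and 0 for negative. Accuracy on positive
    cases is taken here as true positives / all actual positives
    (an assumed reading of the abstract's wording).
    """
    total = len(y_true)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    positives = sum(1 for t in y_true if t == 1)
    true_positives = sum(1 for t, p in zip(y_true, y_pred)
                         if t == 1 and p == 1)
    return correct / total, true_positives / positives


# Hypothetical imbalanced sample: 10 positive and 90 negative cases.
y_true = [1] * 10 + [0] * 90
# The model finds only 3 of the 10 positives and raises 1 false alarm.
y_pred = [1] * 3 + [0] * 7 + [1] * 1 + [0] * 89

overall, positive = accuracy_summary(y_true, y_pred)
print(f"overall accuracy: {overall:.2f}")
print(f"accuracy on positive cases: {positive:.2f}")
```

With few positive cases, a model can score well overall while missing most of them, which is one reason the proportions of negative and positive cases matter.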


2021 ◽  
Author(s):  
Jincheng Yang

BACKGROUND Diabetes mellitus and cancer are among the leading causes of death worldwide; hyperglycemia plays a major contributory role in the risk of neoplastic transformation. Support Vector Machine (SVM) is a supervised learning method that analyzes data and recognizes patterns, and is mainly used for statistical classification and regression. OBJECTIVE Using adverse events reported for PD-1 or PD-L1 (programmed death 1 or ligand 1) inhibitors in post-marketing monitoring, we aimed to construct an effective machine learning algorithm to predict, efficiently and rapidly, the probability of a hyperglycemic adverse reaction in patients treated with PD-1/PD-L1 inhibitors. METHODS Raw data were downloaded from the US Food and Drug Administration Adverse Event Reporting System (FDA FAERS). Signals of a relationship between drug and adverse reaction were based on disproportionality analysis and Bayesian analysis. A multivariate SVM pattern classification was used to construct a classifier to separate patients with adverse hyperglycemic reactions. A 10-fold-3-time cross validation for model setup within the training data (80% of the data) output the best parameter values for the SVM in R software. The model was validated in each testing data set (20% of the data) and in two total drug data sets, with exactly predictor parameter variables: gamma and nu. RESULTS A total of 95918 case files were downloaded for 7 relevant drugs (cemiplimab, avelumab, durvalumab, atezolizumab, pembrolizumab, ipilimumab, nivolumab). The number-type/number-optimization method was selected to optimize the model. Both gamma and nu values correlated with case number and showed high adjusted r2 in curve regressions (both r2 >0.95). Indexes of accuracy, F1 score, kappa and sensitivity were greatly improved by the prediction model in the training data and the two total drug data sets. CONCLUSIONS The SVM prediction model established here can non-invasively and precisely predict the occurrence of hyperglycemic adverse drug reaction (ADR) in patients treated with PD-1/PD-L1 inhibitors. 
Such information is vital to overcome ADR and to improve outcomes by distinguishing patients at high risk of hyperglycemia, and this machine learning algorithm can eventually add value to clinical decision making. CLINICALTRIAL N/A
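
The data split described in the methods can be sketched as follows. This is an illustration only: it assumes that "10-fold-3-time cross validation" means 10-fold cross validation repeated 3 times on the 80% training portion, and it is written in Python although the study used R. The function names, seeds, and sample size are hypothetical.

```python
import random


def train_test_split_indices(n, train_fraction=0.8, seed=0):
    """Shuffle indices 0..n-1 and split them into train and test parts."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    cut = int(round(n * train_fraction))
    return indices[:cut], indices[cut:]


def repeated_kfold(indices, k=10, repeats=3, seed=0):
    """Yield (repeat, fold, train, validation) index lists.

    Each repeat reshuffles the indices and deals them into k folds;
    every fold serves once as the validation set.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for i in range(k):
            validation = folds[i]
            train = [j for f, fold in enumerate(folds) if f != i
                     for j in fold]
            yield r, i, train, validation


# Hypothetical data set of 100 cases: 80 for training, 20 held out.
train_idx, test_idx = train_test_split_indices(100)
splits = list(repeated_kfold(train_idx))
print(len(train_idx), len(test_idx), len(splits))
```

In such a scheme, candidate values of the SVM parameters (gamma and nu in the abstract) would be scored on each validation fold, and the held-out 20% would be used only for the final check.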


2020 ◽  
Author(s):  
Jincheng Yang ◽  
Weilong Lin ◽  
Liming Shi ◽  
Ming Deng ◽  
Wenjing Yang

Abstract Background: Diabetes mellitus and cancer are among the leading causes of death worldwide; hyperglycemia plays a major contributory role in the risk of neoplastic transformation. Using adverse events reported for PD-1 or PD-L1 (programmed death 1 or ligand 1) inhibitors in post-marketing monitoring, we aimed to construct an effective machine learning algorithm to predict, efficiently and rapidly, the probability of a hyperglycemic adverse reaction in patients treated with PD-1/PD-L1 inhibitors. Methods: Raw data were downloaded from the US Food and Drug Administration Adverse Event Reporting System (FDA FAERS). Signals of a relationship between drug and adverse reaction were based on disproportionality analysis and Bayesian analysis. A multivariate Support Vector Machine (SVM) pattern classification was used to construct a classifier to separate patients with adverse hyperglycemic reactions. A 10-fold-3-time cross validation for model setup within the training data (80% of the data) output the best parameter values for the SVM in R software. The model was validated in each testing data set (20% of the data) and in two total drug data sets, with exactly predictor parameter variables: gamma and nu. Results: A total of 95918 case files were downloaded for 7 relevant drugs (cemiplimab, avelumab, durvalumab, atezolizumab, pembrolizumab, ipilimumab, nivolumab). The number-type/number-optimization method was selected to optimize the model. Both gamma and nu values correlated with case number and showed high adjusted r2 in curve regressions (both r2 >0.95). Indexes of accuracy, F1 score, kappa and sensitivity were greatly improved by the prediction model in the training data and the two total drug data sets. Conclusions: The SVM prediction model established here can non-invasively and precisely predict the occurrence of hyperglycemic adverse drug reaction (ADR) in patients treated with PD-1/PD-L1 inhibitors. 
Such information is vital to overcome ADR and to improve outcomes by distinguishing patients at high risk of hyperglycemia, and this machine learning algorithm can eventually add value to clinical decision making.


2021 ◽  
Author(s):  
Quentin Lenouvel ◽  
Vincent Génot ◽  
Philippe Garnier ◽  
Benoit Lavraud ◽  
Sergio Toledo

The understanding of the physical processes of magnetic reconnection has improved considerably thanks to data from the Magnetospheric Multiscale (MMS) mission. However, much work remains to better characterize the core of the reconnection process: the electron diffusion region (EDR). We previously developed a machine learning algorithm to automatically detect EDR candidates, in order to extend the list of events identified in the literature. However, identifying the parameters most relevant to describing EDRs is complex, all the more so because some of the small-scale plasma/field parameters show limitations in some configurations, such as low particle densities or large guide fields. In this study, we perform a statistical study of previously reported dayside EDRs as well as newly reported EDR candidates found using machine learning methods. We also show different single- and multi-spacecraft parameters that can be used to better identify dayside EDRs in time series of MMS data recorded at the magnetopause. Finally, we analyze the link between the guide field and the strength of the energy conversion around each EDR.


This project proposes a method for forecasting weather conditions and predicting rainfall by means of machine learning. There are two setups: one measures weather parameters such as temperature and humidity using sensors connected to an Arduino, and the other displays the current values (status) and the predicted rainfall based on the trained machine learning data sets. Weather forecasting and prediction are based on older collected data sets, which are compared with the current values. The user need not keep a backup of huge amounts of data to predict rainfall; a machine learning algorithm suffices instead. The temperature and humidity sensor modules are used to measure weather parameters and are interfaced to an Arduino controller. The proposed setup will compare the forecast value with real-time data, and then predict rainfall based on the data set fed to the machine learning algorithm.

