Operational aspects of machine learning in a met-service

2021 ◽  
Author(s):  
Mark A. Liniger ◽  
Daniel Cattani ◽  
Benoit Crouzy ◽  
Daniele Nerini ◽  
Lionel Moret ◽  
...  

Machine learning has great potential for a variety of tasks along the whole value chain of a national met-service. Indeed, many research groups and private and national weather services have started to explore the possibilities, and first real-time operational implementations are already in place. However, building up the expertise is difficult, large amounts of data have to be made available in an efficient way, and the necessary tools have special and demanding requirements concerning infrastructure and maintenance. The transition from research results to operational tools running in real time is a particular challenge. Not least, trust from end-users must be built, while avoiding the short-term hype trap.

In this presentation, we show some examples of machine learning at MeteoSwiss that are in operational use or soon will be. These include a measurement system to identify pollen species, the quality control of meteorological observations, the postprocessing of numerical weather forecasts, and the condensation of weather forecast information for the meteorologists. These examples have different characteristics and cover a wide range of applications, but also share some common properties. We juxtapose these properties with the incentives and conditions under which machine learning methods are developed and employed in a more research-oriented context such as academia. It turns out that an operational setup of machine learning has very different requirements than machine learning in a research context. Identifying these differences, as well as the similarities, can help in understanding the challenge of bringing research results into operation and how to alleviate this challenge in the future.

Energies ◽  
2020 ◽  
Vol 13 (3) ◽  
pp. 616
Author(s):  
Ana Carolina do Amaral Burghi ◽  
Tobias Hirsch ◽  
Robert Pitz-Paal

Weather forecast uncertainty is a key driver of energy market volatility. By intelligently accounting for uncertainties during schedule development, renewable energy systems with storage could improve dispatching accuracy and, therefore, participate effectively in electricity wholesale markets. Deterministic forecasts have traditionally been used to support dispatch planning, carrying reduced or no information about uncertainty in the future weather. Aiming to better represent the uncertainties involved, probabilistic forecasts have been developed to increase forecasting accuracy. For dispatch planning, this can strongly support the development of a more precise schedule. This work extends a dispatch planning method to the use of probabilistic weather forecasts. The underlying method uses a schedule optimizer coupled to a post-processing machine learning algorithm. This machine learning algorithm was adapted to include probabilistic forecasts, taking advantage of their additional information on uncertainties. The post-processing applies a calibration of the planned schedule based on knowledge about uncertainties obtained from similar past situations. Simulations performed with a concentrated solar power plant model following the proposed strategy demonstrated promising financial improvement and relevant potential for dealing with uncertainties. In particular, the results show that the information included in probabilistic forecasts can increase financial revenues by up to 15% (in comparison to a persistence solar-driven approach) if processed in a suitable way.
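The calibration idea described in this abstract, correcting a planned schedule using knowledge from similar past situations, can be sketched as a simple analog search. This is a minimal illustration, not the authors' implementation; the k-nearest-neighbour matching, the feature layout, and the toy data are all assumptions:

```python
import numpy as np

def calibrate_schedule(planned, forecast_features, past_features, past_errors, k=5):
    """Correct a planned schedule using the errors observed in the k
    most similar past forecast situations (a simple analog approach)."""
    # Distance between today's forecast situation and each past one
    d = np.linalg.norm(past_features - forecast_features, axis=1)
    nearest = np.argsort(d)[:k]
    # Average schedule error seen in those similar situations
    correction = past_errors[nearest].mean(axis=0)
    return planned - correction

rng = np.random.default_rng(0)
past_features = rng.normal(size=(200, 4))   # e.g. irradiance forecast statistics
past_errors = rng.normal(size=(200, 24))    # hourly schedule errors, historical
planned = np.full(24, 50.0)                 # planned hourly dispatch
today = rng.normal(size=4)                  # today's forecast features
calibrated = calibrate_schedule(planned, today, past_features, past_errors)
```

In practice the features would summarize the probabilistic forecast (e.g. ensemble spread), which is exactly the extra information the paper exploits.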


Author(s):  
Bakhtiyor Askarovich Meraliyev ◽  
Kurmangazy Sakenuly Kongratbayev

What shall we expect from the year 2020? The coronavirus pandemic is not the worst thing that humanity may face in the near future. According to scientists' observations, in March 2020 the planet warmed to a record-high level. The temperature of the world's oceans also exceeded its average by 80%, and the prognosis from meteorological observations is not good. The warming seas have already led to catastrophic disasters. The average temperature increase can also lead to hurricanes, drought, locust invasions and, worst of all, forest fires. Natural disasters cause loss of life, destruction of property and infrastructure, loss of natural animal habitats, and displacement of people, and all of this leads to humanitarian catastrophes, both social and economic. Situations related to nature are always very serious, as the whole world is involved. This is like a butterfly effect: a natural disaster in Australia affects the economic and ecological situation in the USA and England. Australia, in particular, faced a problem that could not be avoided. Nevertheless, the world can prepare for and prevent huge disasters. Forecasting forest fires can be genuinely helpful, as can inquiry into the weather's impact on fires. The current paper focuses on the study of fire forecasting and the influence of weather on fire. The study is highly relevant, as global warming and human-caused fires are increasing, and there is a trend for Australia's fires to become more dangerous and longer lasting. Artificial intelligence, particularly machine learning algorithms, can help make the calculations and predictions needed to save the ecosystem and human lives. According to our preliminary research results, in-depth multidimensional analysis confirms an almost 100 percent dependence of bushfires on weather conditions. Using machine learning algorithms, it would be possible to predict the time and location of an ignition source.


Author(s):  
Hyeuk Kim

Unsupervised learning in machine learning divides data into several groups: observations in the same group have similar characteristics, while observations in different groups have different characteristics. In this paper, we classify data by partitioning around medoids (PAM), which has some advantages over k-means clustering. We apply it to baseball players in the Korea Baseball League. We also apply principal component analysis to the data and draw the graph using two components as axes. Through this procedure, we interpret the meaning of the clustering graphically. The combination of partitioning around medoids and principal component analysis can be applied to any other data, and the approach makes it easy to identify the characteristics of the groups.
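The workflow above (k-medoids clustering plus a PCA projection for plotting) can be sketched in a few lines of numpy. This is a simplified k-medoids iteration in the spirit of PAM, not the authors' code, and the two-blob toy data stands in for the player statistics:

```python
import numpy as np

def pam(X, k, n_iter=100, seed=0):
    """Simplified k-medoids: medoids are actual data points, which makes
    the method less sensitive to outliers than k-means."""
    rng = np.random.default_rng(seed)
    n = len(X)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)  # pairwise distances
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            # New medoid: the member minimising total distance to its cluster
            new[j] = members[np.argmin(D[np.ix_(members, members)].sum(axis=1))]
        if np.array_equal(new, medoids):
            break
        medoids = new
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels

# Toy data: two well-separated blobs
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(5, 0.5, (20, 2))])
medoids, labels = pam(X, k=2)

# PCA projection for plotting (trivial here since X is already 2-D,
# but the same lines work for higher-dimensional player statistics)
Xc = X - X.mean(axis=0)
scores = Xc @ np.linalg.svd(Xc, full_matrices=False)[2][:2].T
```

Plotting `scores` coloured by `labels` gives the kind of graphical interpretation the paper describes.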


Author(s):  
Nataliya Stoyanets ◽  
Mathias Onuh Aboyi ◽  

The article establishes that for the successful implementation of an innovative project and the introduction of a new product into production, it is necessary to use advanced technologies and modern software, which are an integral part of successful innovation, taking into account the life cycle of innovations. It is proposed to consider the overall potential of the enterprise through its main components, namely: production and technological, scientific and technical, financial and economic, personnel, and actual innovation potential. The base for the introduction of technological innovations is LLC "ALLIANCE-PARTNER", which provides a wide range of support and consulting services, including services in the employment market, tourism, insurance, translation and more. To form a model of innovative development of the enterprise, it is advisable to establish the following key aspects: the system of value creation through the model of cooperation with partners and suppliers; the creation of a value chain; the technological platform; and the infrastructure, determining the cost of supply and the cost of activities for customers and for the enterprise as a whole. A system of factors influencing the formation of a model of strategic innovative development of the enterprise is offered. The cost of the complex of technological equipment, amounting to UAH 6,800.0 thousand, is economically justified. Given that the company plans to receive funds under the program of socio-economic development of the Sumy region, the evaluation of the effectiveness of the innovation project (the purchase of technological equipment) shows that the payback period of the project is 3 years and 10 months. In terms of net present value (NPV), the project under study is profitable. The project profitability index (PI) exceeds 1.0, meeting the requirement for a positive decision on project implementation. The internal rate of return (IRR) of the project also has a positive value of 22%, as it exceeds the discount rate.
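The investment metrics used in this evaluation (NPV, PI, IRR) follow standard definitions, which a short sketch makes concrete. The cash flows and the 10% discount rate below are invented for illustration and are not the project's actual figures:

```python
def npv(rate, cashflows):
    """Net present value; cashflows[0] is the initial outlay (negative)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

def profitability_index(rate, cashflows):
    """PI = present value of future inflows / initial investment;
    a project is acceptable when PI > 1.0."""
    pv_inflows = npv(rate, [0.0] + list(cashflows[1:]))
    return pv_inflows / -cashflows[0]

def irr(cashflows, lo=0.0, hi=1.0, tol=1e-6):
    """Internal rate of return via bisection: the rate at which NPV = 0
    (assumes NPV is decreasing in the rate, true for a single outlay)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, cashflows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical flows in thousand UAH: outlay, then four annual inflows
flows = [-6800.0, 2200.0, 2500.0, 2800.0, 3000.0]
project_npv = npv(0.10, flows)
project_pi = profitability_index(0.10, flows)
project_irr = irr(flows)
```

With these made-up flows the project clears all three hurdles: positive NPV, PI above 1.0, and an IRR above the 10% discount rate, which is exactly the decision logic the abstract applies.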


2018 ◽  
Author(s):  
Sherif Tawfik ◽  
Olexandr Isayev ◽  
Catherine Stampfl ◽  
Joseph Shapter ◽  
David Winkler ◽  
...  

Materials constructed from different van der Waals two-dimensional (2D) heterostructures offer a wide range of benefits, but these systems have been little studied because of their experimental and computational complexity, and because of the very large number of possible combinations of 2D building blocks. The simulation of the interface between two different 2D materials is computationally challenging due to the lattice mismatch problem, which sometimes necessitates the creation of very large simulation cells for performing density-functional theory (DFT) calculations. Here we use a combination of DFT, linear regression and machine learning techniques to rapidly determine the interlayer distance between two different 2D materials stacked in a bilayer heterostructure, as well as the band gap of the bilayer. Our work provides an excellent proof of concept by quickly and accurately predicting a structural property (the interlayer distance) and an electronic property (the band gap) for a large number of hybrid 2D materials. This work paves the way for rapid computational screening of the vast parameter space of van der Waals heterostructures to identify new hybrid materials with useful and interesting properties.
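The screening idea (fit a cheap regression on a training set of DFT results, then predict properties of new bilayers without further DFT) can be sketched with ordinary least squares. The descriptors and all numbers below are synthetic placeholders, not real DFT data or the paper's actual feature set:

```python
import numpy as np

# Each candidate bilayer is described by a few features (e.g. lattice
# constants and monolayer band gaps of the two layers); the target is
# the DFT interlayer distance. Synthetic ground truth for illustration.
rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 4))                    # descriptors per bilayer
w_true = np.array([0.8, -0.3, 0.5, 0.1])
y = 3.3 + X @ w_true + rng.normal(0, 0.01, 300)   # interlayer distance, Angstrom

# Least-squares fit with an intercept column
A = np.hstack([np.ones((300, 1)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Screen a new candidate without running DFT
x_new = np.array([1.0, 0.46, 0.3, 0.2, 0.7])      # leading 1 = intercept
d_pred = x_new @ coef
```

The paper additionally predicts an electronic property (the band gap); the same pattern applies with a second regression target.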


2020 ◽  
Author(s):  
Sina Faizollahzadeh Ardabili ◽  
Amir Mosavi ◽  
Pedram Ghamisi ◽  
Filip Ferdinand ◽  
Annamaria R. Varkonyi-Koczy ◽  
...  

Several outbreak prediction models for COVID-19 are being used by officials around the world to make informed decisions and enforce relevant control measures. Among the standard models for COVID-19 global pandemic prediction, simple epidemiological and statistical models have received the most attention from authorities, and they are popular in the media. Due to a high level of uncertainty and a lack of essential data, standard models have shown low accuracy for long-term prediction. Although the literature includes several attempts to address this issue, the generalization and robustness of existing models need to be improved. This paper presents a comparative analysis of machine learning and soft computing models to predict the COVID-19 outbreak as an alternative to SIR and SEIR models. Among the wide range of machine learning models investigated, two showed promising results: the multi-layered perceptron (MLP) and the adaptive network-based fuzzy inference system (ANFIS). Based on the results reported here, and due to the highly complex nature of the COVID-19 outbreak and the variation in its behavior from nation to nation, this study suggests machine learning as an effective tool to model the outbreak. This paper provides an initial benchmarking to demonstrate the potential of machine learning for future research, and further suggests that real novelty in outbreak prediction can be realized by integrating machine learning and SEIR models.
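The MLP approach named above amounts to fitting a small neural network to the outbreak curve. The sketch below trains a tiny one-hidden-layer network by gradient descent on a synthetic logistic curve (not real COVID-19 counts, and not the paper's architecture or data):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)[:, None]             # normalised time
y = 1.0 / (1.0 + np.exp(-12 * (t - 0.5)))       # synthetic outbreak curve

# One hidden layer of 16 tanh units, trained with full-batch GD
W1 = rng.normal(0, 1, (1, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 1, (16, 1)); b2 = np.zeros(1)
lr = 0.1

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

_, pred0 = forward(t)
loss0 = np.mean((pred0 - y) ** 2)               # loss before training
for _ in range(5000):
    h, pred = forward(t)
    g = 2 * (pred - y) / len(t)                 # dLoss/dpred (MSE)
    gW2 = h.T @ g; gb2 = g.sum(0)
    gh = g @ W2.T * (1 - h ** 2)                # backprop through tanh
    gW1 = t.T @ gh; gb1 = gh.sum(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, pred = forward(t)
loss = np.mean((pred - y) ** 2)                 # loss after training
```

A real benchmark would of course fit actual case counts per country and evaluate out-of-sample, which is where the paper's comparison against SIR/SEIR models comes in.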


2021 ◽  
Vol 15 ◽  
Author(s):  
Alhassan Alkuhlani ◽  
Walaa Gad ◽  
Mohamed Roushdy ◽  
Abdel-Badeeh M. Salem

Background: Glycosylation is one of the most common post-translational modifications (PTMs) in organism cells. It plays important roles in several biological processes, including cell-cell interaction, protein folding, antigen recognition, and immune response. In addition, glycosylation is associated with many human diseases such as cancer, diabetes and coronaviruses. The experimental techniques for identifying glycosylation sites are time-consuming, labor-intensive, and expensive. Therefore, computational intelligence techniques are becoming very important for glycosylation site prediction. Objective: This paper is a theoretical discussion of the technical aspects of applying biotechnological approaches (e.g., artificial intelligence and machine learning) to digital bioinformatics research and intelligent biocomputing. Computational intelligence techniques have shown efficient results for predicting N-linked, O-linked and C-linked glycosylation sites. In the last two decades, many studies have been conducted on glycosylation site prediction using these techniques. In this paper, we analyze and compare a wide range of the intelligent techniques used in these studies from multiple aspects. The current challenges and difficulties facing software developers and knowledge engineers in predicting glycosylation sites are also included. Method: The comparison between these different studies covers many criteria, such as databases, feature extraction and selection, machine learning classification methods, evaluation measures, and performance results. Results and conclusions: Many challenges and problems are presented. Consequently, more effort is needed to obtain more accurate prediction models for the three basic types of glycosylation sites.
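A concrete example of the feature extraction step these predictors share: for N-linked glycosylation, the classic candidate motif is the sequon N-X-S/T (X any residue except proline). A minimal motif scan, as one first-pass feature among the many the surveyed methods combine (the example sequence is made up):

```python
import re

def find_n_glyc_sequons(seq):
    """Return 0-based positions of candidate N-linked glycosylation
    sites: the N-X-[S/T] sequon, where X is any residue except proline.
    A motif hit is necessary but not sufficient, which is why the
    surveyed ML predictors layer many more features on top."""
    # Lookahead keeps overlapping matches (e.g. in 'NNST')
    return [m.start() for m in re.finditer(r'N(?=[^P][ST])', seq)]

print(find_n_glyc_sequons("MKNFTANPSTLLNGSA"))  # → [2, 12]
```

Note the N at position 6 is correctly skipped: it is followed by proline, which blocks the sequon.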


2021 ◽  
pp. 1-10
Author(s):  
Ahmet Tezcan Tekin ◽  
Tolga Kaya ◽  
Ferhan Cebi

The use of fuzzy logic in machine learning is becoming widespread. In machine learning problems, data with different characteristics are trained and predicted together. Training a single model on data with different characteristics can increase the prediction error. In this study, we suggest a new approach to ensembling predictions with fuzzy clustering. Our approach aims to cluster the data according to their fuzzy membership values and to model data with similar characteristics together. This allows for efficient clustering of objects that carry the characteristics of more than one cluster. On the other hand, our approach enables us to combine boosting-type ensemble algorithms, which are widely used in machine learning due to their excellent success in the literature. We used a mobile game's customer marketing and gameplay data to predict customer lifetime value in order to test our approach. Customer lifetime value prediction is crucial for determining the marketing cost cap for companies. The findings reveal that using a fuzzy method to ensemble the algorithms outperforms implementing the algorithms individually.
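The fuzzy membership values the approach relies on can be produced by fuzzy c-means, where each point belongs partially to every cluster. Below is a simplified fuzzy c-means sketch on synthetic two-blob data; it is an illustration of the clustering idea only, not the authors' pipeline or data:

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy c-means: returns a membership matrix U (n x c)
    whose rows sum to 1, so each point can partially belong to
    several clusters (fuzzifier m controls the softness)."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None] - centers[None], axis=2) + 1e-12
        # Standard FCM membership update: u_ij proportional to d_ij^(-2/(m-1))
        inv = d ** (-2 / (m - 1))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, centers

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.5, (30, 2)), rng.normal(4, 0.5, (30, 2))])
U, centers = fuzzy_cmeans(X, c=2)
# Route each point to the model of its highest-membership cluster
assignment = U.argmax(axis=1)
```

In the paper's setting, a separate boosting-type model would then be trained per cluster, with the membership values deciding which model (or weighted combination) predicts each customer's lifetime value.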


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Sungmin O. ◽  
Rene Orth

While soil moisture information is essential for a wide range of hydrologic and climate applications, spatially continuous soil moisture data are only available from satellite observations or model simulations. Here we present SoMo.ml, a global, long-term dataset of soil moisture derived through machine learning trained with in-situ measurements. We train a Long Short-Term Memory (LSTM) model to extrapolate daily soil moisture dynamics in space and in time, based on in-situ data collected from more than 1,000 stations across the globe. SoMo.ml provides multi-layer soil moisture data (0–10 cm, 10–30 cm, and 30–50 cm) at 0.25° spatial and daily temporal resolution over the period 2000–2019. The performance of the resulting dataset is evaluated through cross-validation and inter-comparison with existing soil moisture datasets. SoMo.ml performs especially well in terms of temporal dynamics, making it particularly useful for applications requiring time-varying soil moisture, such as anomaly detection and memory analyses. SoMo.ml complements the existing suite of modelled and satellite-based datasets given its distinct derivation, supporting large-scale hydrological, meteorological, and ecological analyses.


Energies ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 1377
Author(s):  
Musaab I. Magzoub ◽  
Raj Kiran ◽  
Saeed Salehi ◽  
Ibnelwaleed A. Hussein ◽  
Mustafa S. Nasser

The traditional way to mitigate loss circulation in drilling operations is to use preventative and curative materials. However, it is difficult to quantify the amount of materials from every possible combination to produce customized rheological properties. In this study, machine learning (ML) is used to develop a framework to identify material composition for loss circulation applications based on the desired rheological characteristics. The relation between the rheological properties and the mud components for polyacrylamide/polyethyleneimine (PAM/PEI)-based mud is assessed experimentally. Four different ML algorithms were implemented to model the rheological data for various mud components at different concentrations and testing conditions: (a) k-Nearest Neighbor, (b) Random Forest, (c) Gradient Boosting, and (d) AdaBoost. The Gradient Boosting model showed the highest accuracy (91% and 74% for plastic and apparent viscosity, respectively), which can be further used for hydraulic calculations. Overall, the experimental study presented in this paper, together with the proposed ML-based framework, adds valuable information to the design of PAM/PEI-based mud. The ML models allowed a wide range of rheology assessments for various drilling fluid formulations with a mean accuracy of up to 91%. The case study has shown that with the appropriate combination of materials, reasonable rheological properties could be achieved to prevent loss circulation by managing the equivalent circulating density (ECD).
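Gradient boosting, the best-performing of the four algorithms, fits a sequence of weak learners to the residuals of the ensemble so far. A bare-bones sketch with depth-1 trees (stumps) and squared loss on synthetic stand-in data; the study itself used established library implementations, and the "viscosity" function below is invented, not measured:

```python
import numpy as np

class StumpBooster:
    """Bare-bones gradient boosting for regression with decision stumps."""

    def __init__(self, n_estimators=150, lr=0.1):
        self.n_estimators, self.lr = n_estimators, lr

    def _fit_stump(self, X, r):
        # Best single split over a grid of quantile thresholds
        best = None
        for j in range(X.shape[1]):
            for t in np.quantile(X[:, j], np.linspace(0.05, 0.95, 20)):
                left = X[:, j] <= t
                if left.all() or not left.any():
                    continue
                lv, rv = r[left].mean(), r[~left].mean()
                sse = ((r[left] - lv) ** 2).sum() + ((r[~left] - rv) ** 2).sum()
                if best is None or sse < best[0]:
                    best = (sse, j, t, lv, rv)
        return best[1:]

    def fit(self, X, y):
        self.base = y.mean()
        self.stumps = []
        pred = np.full(len(y), self.base)
        for _ in range(self.n_estimators):
            j, t, lv, rv = self._fit_stump(X, y - pred)  # fit current residuals
            pred += self.lr * np.where(X[:, j] <= t, lv, rv)
            self.stumps.append((j, t, lv, rv))
        return self

    def predict(self, X):
        pred = np.full(len(X), self.base)
        for j, t, lv, rv in self.stumps:
            pred += self.lr * np.where(X[:, j] <= t, lv, rv)
        return pred

# Synthetic stand-in: "viscosity" as a nonlinear function of two
# component concentrations
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (150, 2))
y = 10 + 5 * X[:, 0] ** 2 + 3 * np.sin(3 * X[:, 1])
model = StumpBooster().fit(X, y)
r2 = 1 - np.sum((model.predict(X) - y) ** 2) / np.sum((y - y.mean()) ** 2)
```

The shrinkage factor `lr` trades off per-stage correction against the number of stages, the same knob a library implementation exposes as the learning rate.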

