Using machine learning techniques to generate analog ensembles for data assimilation

Author(s):  
Lucia Yang ◽  
Ian Grooms

<p>We propose to use analogs of the forecast mean to generate an ensemble of perturbations for use in ensemble optimal interpolation (EnOI) or ensemble variational (EnVar) methods.  In addition to finding analogs from a library, we propose a new method of constructing analogs using autoencoders (a machine learning method).  To extend the scalability of constructed analogs for use in data assimilation on geophysical models, we propose using patching schemes to divide the global spatial domain into digestible chunks.  Using patches makes training the generative models possible and has the added benefit of being able to exploit parallel computing.  The resulting analog methods using analogs from a catalog (AnEnOI), constructed analogs (cAnEnOI), and patched constructed analogs (p-cAnEnOI) are tested in the context of a multiscale Lorenz-'96 model, with standard EnOI and an ensemble square root filter for comparison.  The use of analogs from a modestly-sized catalog is shown to improve the performance of EnOI, with limited marginal improvements resulting from increases in the catalog size.  The method using constructed analogs is found to perform as well as a full ensemble square root filter, and to be robust over a wide range of tuning parameters.  Lastly, we find that p-cAnEnOI with larger patches produces the best data assimilation performance despite having larger reconstruction errors.  All patch variants except for the variant that uses the smallest patch size outperform cAnEnOI as well as some traditional data assimilation methods such as the ensemble square root filter.</p>
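The catalog-based idea in the abstract above (pick stored states that resemble the forecast mean, then use them as ensemble perturbations) can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' implementation: the function names `find_analogs` and `analog_perturbations` are hypothetical, the similarity measure is plain Euclidean distance, and the perturbations are formed by centering the analogs on their own mean.

```python
import math

def find_analogs(forecast_mean, catalog, k):
    """Return the k catalog states closest to forecast_mean.

    Assumption: closeness is measured with the Euclidean distance.
    """
    def dist(state):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(state, forecast_mean)))
    return sorted(catalog, key=dist)[:k]

def analog_perturbations(analogs):
    """Center the analogs on their own mean to obtain ensemble perturbations.

    Assumption: centering on the analog mean is one simple choice; other
    choices (for example centering on the forecast mean) are possible.
    """
    n = len(analogs)
    dim = len(analogs[0])
    mean = [sum(s[i] for s in analogs) / n for i in range(dim)]
    return [[s[i] - mean[i] for i in range(dim)] for s in analogs]

# Toy two-variable catalog (made-up numbers, not data from the paper).
catalog = [[0.0, 0.0], [1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [0.9, 1.1]]
forecast_mean = [1.0, 1.0]

analogs = find_analogs(forecast_mean, catalog, k=3)
perturbations = analog_perturbations(analogs)
```

In an EnOI-style update, perturbations like these would stand in for the static ensemble when estimating the background covariance; that step is omitted here.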

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Majid Amirfakhrian ◽  
Mahboub Parhizkar

Abstract. In the next decade, machine vision technology will have an enormous impact on industrial work because of the latest technological advances in this field. These advances are so significant that the use of this technology is now essential. Machine vision is the process of using a wide range of technologies and methods to provide automated inspections in an industrial setting based on imaging, process control, and robot guidance. One of the applications of machine vision is to diagnose traffic accidents. Moreover, car vision is utilized for detecting the amount of damage to vehicles during traffic accidents. In this article, using image processing and machine learning techniques, a new method is presented to improve the accuracy of detecting damaged areas in traffic accidents. Evaluating the proposed method and comparing it with previous works showed that the proposed method is more accurate in identifying damaged areas and has a shorter execution time.


2021 ◽  
Author(s):  
K. Emma Knowland ◽  
Christoph Keller ◽  
Krzysztof Wargan ◽  
Brad Weir ◽  
Pamela Wales ◽  
...  

<p>NASA's Global Modeling and Assimilation Office (GMAO) produces high-resolution global forecasts for weather, aerosols, and air quality. The NASA Global Earth Observing System (GEOS) model has been expanded to provide global near-real-time 5-day forecasts of atmospheric composition at unprecedented horizontal resolution of 0.25 degrees (~25 km). This composition forecast system (GEOS-CF) combines the operational GEOS weather forecasting model with the state-of-the-science GEOS-Chem chemistry module (version 12) to provide detailed analysis of a wide range of air pollutants such as ozone, carbon monoxide, nitrogen oxides, and fine particulate matter (PM2.5). Satellite observations are assimilated into the system for improved representation of weather and smoke. The assimilation system is being expanded to include chemically reactive trace gases. We discuss current capabilities of the GEOS Constituent Data Assimilation System (CoDAS) to improve atmospheric composition modeling and possible future directions, notably incorporating new observations (TROPOMI, geostationary satellites) and machine learning techniques. We show how machine learning techniques can be used to correct for sub-grid-scale variability, which further improves model estimates at a given observation site.</p>


2021 ◽  
pp. 194855062110349
Author(s):  
Bastian Jaeger ◽  
Alex L. Jones

Which facial characteristics do people rely on when forming personality impressions? Previous research has uncovered an array of facial features that influence people’s impressions. Even though some (classes of) features, such as resemblances to emotional expressions or facial width-to-height ratio (fWHR), play a central role in theories of social perception, their relative importance in impression formation remains unclear. Here, we model faces along a wide range of theoretically important dimensions and use machine learning techniques to test how well 28 features predict impressions of trustworthiness and dominance in a diverse set of 597 faces. In line with overgeneralization theory, emotion resemblances were most predictive of both traits. Other features that have received a lot of attention in the literature, such as fWHR, were relatively uninformative. Our results highlight the importance of modeling faces along a wide range of dimensions to elucidate their relative importance in impression formation.


2020 ◽  
Author(s):  
Futo Tomizawa ◽  
Yohei Sawada

Abstract. Prediction of spatio-temporal chaotic systems is important in various fields, such as Numerical Weather Prediction (NWP). While data assimilation methods have been applied in NWP, machine learning techniques, such as Reservoir Computing (RC), are recently recognized as promising tools to predict spatio-temporal chaotic systems. However, the sensitivity of the skill of the machine learning based prediction to the imperfectness of observations is unclear. In this study, we evaluate the skill of RC with noisy and sparsely distributed observations. We intensively compare the performances of RC and Local Ensemble Transform Kalman Filter (LETKF) by applying them to the prediction of the Lorenz 96 system. Although RC can successfully predict the Lorenz 96 system if the system is perfectly observed, we find that RC is vulnerable to observation sparsity compared with LETKF. To overcome this limitation of RC, we propose to combine LETKF and RC. In our proposed method, the system is predicted by RC that learned the analysis time series estimated by LETKF. Our proposed method can successfully predict the Lorenz 96 system using noisy and sparsely distributed observations. Most importantly, our method can predict better than LETKF when the process-based model is imperfect.
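For readers unfamiliar with the test system named above, the following is a minimal sketch of the standard single-scale Lorenz 96 equations, dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F with cyclic indices, integrated with a fourth-order Runge-Kutta step. The forcing F = 8, the 40 variables, the step size 0.05, and the function names are common illustrative choices assumed here, not settings taken from the paper.

```python
def lorenz96_tendency(x, forcing=8.0):
    """Right-hand side of the Lorenz 96 model with cyclic boundary conditions."""
    n = len(x)
    # Negative indices wrap around in Python, which gives the cyclic
    # neighbours x[i-1] and x[i-2] for free; x[i+1] needs an explicit modulo.
    return [(x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
            for i in range(n)]

def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency([xi + 0.5 * dt * k for xi, k in zip(x, k1)], forcing)
    k3 = lorenz96_tendency([xi + 0.5 * dt * k for xi, k in zip(x, k2)], forcing)
    k4 = lorenz96_tendency([xi + dt * k for xi, k in zip(x, k3)], forcing)
    return [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

# Start from the unstable equilibrium x_i = F and perturb one variable
# (assumed setup: 40 variables, dt = 0.05, 100 steps).
state = [8.0] * 40
state[19] += 0.01
for _ in range(100):
    state = rk4_step(state, 0.05)
```

A trajectory generated this way can serve as the "truth" from which noisy or sparse observations are drawn when comparing prediction methods.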


Online shopping has achieved immense growth. It is popular because there is no need to physically go to the shop, and online sites offer a wide range of collections from which to buy a product. Customers usually tend to purchase a product that has good customer reviews and the highest rating. Numerous reviews are given for a single product, and most of the important reviews are not organized well, which makes them disappear among the other reviews. Numerous researchers have worked on structuring reviews for various purposes. In this work we propose a sentiment analysis of customer reviews for various hotel items. All the items are reviewed by customers, and the proposed work analyzes the reviews obtained for a particular item across all the available shops. This analysis is helpful in judging the food most likely to be consumed by nearby customers and in gauging the competitiveness of the product being delivered to the customers. Machine learning techniques and natural language processing (NLP) are used for the proposed work, which is observed to produce an efficient result.


Author(s):  
Graziano Fiorillo ◽  
Hani Nassif

The MAP-21 Act requires information on bridge assets to be at the element level for management operations in the U.S.A. This approach has the objective of improving future predictions of the performance of bridge assets for a more precise evaluation of condition and correct allocation of management funds to keep bridges in a good state of repair. Although bridge element conditions were introduced in the 1990s, the application of such data had never been mandatory for bridge asset management until 2014; therefore, the amount of historical data on bridge element (BE) condition is still limited. On the other hand, National Bridge Inventory (NBI) ratings have been collected since the 1970s and a wide range of data are available. Therefore, it is natural to ask whether BE condition can be predicted using NBI data. In the past, researchers statistically related BE and NBI data, but little has been done to revert NBI to BE. This paper addresses both challenges of mapping BE–NBI condition data using several machine learning techniques. The results of the analysis of these techniques applied to a sample of about 9,000 bridges from northeastern states of the U.S.A. show that between 79.8% and 100% of the NBI ratings for deck, superstructure, and substructure can be predicted within a rating error of ± 1. The back-mapping operation of NBI time-dependent ratings to BE deterioration profiles for deck, superstructure, and substructure can also be predicted accurately with a probability greater than 50% at the 95% confidence level.
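The "within a rating error of ± 1" figure quoted above is a tolerance-based accuracy. A minimal sketch of how such a score could be computed follows; the function name `fraction_within_tolerance` and the sample ratings are hypothetical and are not data from the study.

```python
def fraction_within_tolerance(observed, predicted, tolerance=1):
    """Share of predictions whose absolute error does not exceed tolerance."""
    hits = sum(1 for o, p in zip(observed, predicted) if abs(o - p) <= tolerance)
    return hits / len(observed)

# Made-up condition ratings on the 0-9 NBI scale, for illustration only.
observed_ratings = [7, 6, 5, 8, 4]
predicted_ratings = [7, 5, 7, 8, 4]

score = fraction_within_tolerance(observed_ratings, predicted_ratings)
```

With these toy values, four of the five predictions fall within one rating point of the observation.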


Author(s):  
Frederico Luiz Caram ◽  
Bruno Rafael De Oliveira Rodrigues ◽  
Amadeu Silveira Campanelli ◽  
Fernando Silva Parreiras

Code smells or bad smells are an accepted approach to identify design flaws in the source code. Although they have been explored by researchers, their interpretation by programmers is rather subjective. One way to deal with this subjectivity is to use machine learning techniques. This paper provides the reader with an overview of machine learning techniques and code smells found in the literature, aiming at determining which methods and practices are used when applying machine learning for code smell identification and which machine learning techniques have been used for code smell identification. A mapping study was used to identify the techniques used for each smell. We found that Bloaters were the main kind of smell studied, addressed by 35% of the papers. The most commonly used technique was Genetic Algorithms (GA), used by 22.22% of the papers. Regarding the smells addressed by each technique, there was a high level of redundancy, in that the smells are covered by a wide range of algorithms. Nevertheless, Feature Envy stood out, being targeted by 63% of the techniques. When it comes to performance, the best average was provided by Decision Tree, followed by Random Forest, Semi-supervised and Support Vector Machine Classifier techniques. Five of the 25 analyzed smells were not handled by any machine learning technique. Most of them focus on several code smells and in general there is no outperforming technique, except for a few specific smells. We also found a lack of comparable results due to the heterogeneity of the data sources and of the provided results. We recommend the pursuit of further empirical studies to assess the performance of these techniques on a standardized dataset to improve the comparison reliability and replicability.


Author(s):  
Dr. E. Baraneetharan

Machine Learning is capable of providing real-time solutions that maximize the utilization of resources in the network, thereby increasing the lifetime of the network. It is able to process automatically without being externally programmed, thus making the process easier, more efficient, cost-effective, and reliable. ML algorithms can handle complex data more quickly and accurately. Machine Learning is used to enhance the ability of the Wireless Sensor Network environment. A Wireless Sensor Network (WSN) is a combination of several networks and is decentralized and distributed in nature. A WSN consists of sensor nodes and sink nodes, which are self-organizing and self-healing. WSNs are used in applications such as biodiversity and ecosystem protection, surveillance, climate change tracking, and military applications. Nowadays, WSNs are developing rapidly owing to advances in electronics and wireless communication technologies, but several drawbacks remain, such as low computational capacity, small memory, limited energy resources, and physical vulnerability, which call for security measures where privacy plays a key role. WSNs are used to monitor dynamic environments, and to adapt to such situations sensor networks need Machine Learning techniques to avoid unnecessary redesign. Surveys of machine learning techniques for WSNs cover a wide range of applications in which security is given top priority. To secure data from attackers, the WSN system should be able to delete the instruction if any hackers/attackers are trying to steal data.


2021 ◽  
Vol 14 (1) ◽  
pp. 453-463
Author(s):  
Abdul Syukur ◽  
◽  
Deden Istiawan ◽  

LQ45 is an Indonesia Stock Exchange Index (ISX) comprising 45 companies that meet certain criteria, intended to help investors select certain stocks. The prediction of stock price direction in the financial world is a major issue. The implementation of machine learning and other algorithms for market price analysis and forecasting is a very promising field. Different types of classification algorithms have been used to predict the stock market. However, when individual studies are considered separately, there is no clear consensus on which algorithms work best. In this research, a comparison framework is proposed, which aims to benchmark the performance of a wide range of classification models and use them to predict the LQ45 index. The data in this research, which contain the transaction level and capitalization size, were obtained from the Indonesian Stock Exchange (ISX). For analysis purposes, we set out 10 classifiers that can be used to build classification models and test their performance on the LQ45 dataset. The performance criteria chosen to measure this effect are accuracy, recall, and precision. The results showed that the random forest algorithm had the best performance for predicting the LQ45 index. The classification and regression trees, C4.5, support vector machine, and logistic regression algorithms also perform well. In contrast, the models based on traditional statistical-based learners, namely Naïve Bayes and linear discriminant analysis, seem to underperform in predicting the LQ45 index. These results are not only beneficial for enriching the machine learning techniques literature but also have a significant influence on stock market prediction in terms of the ability to predict the LQ45 index.
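The three performance criteria named above (accuracy, recall, and precision) can be computed from true and predicted labels as in the following minimal sketch. The function name `classification_metrics`, the binary up/down encoding (1 = up, 0 = down), and the sample labels are assumptions for illustration; they are not the study's data or code.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        # Guard against division by zero when no positives are predicted/present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Made-up direction labels: 1 = index went up, 0 = index went down.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
```

Benchmarking several classifiers then amounts to computing this dictionary for each model on the same held-out data and comparing the entries.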

