Big Data Assimilation: Real-time Workflow for 30-second-update Forecasting and Perspectives toward DA-AI Integration

Author(s):  
Takemasa Miyoshi ◽  
Takumi Honda ◽  
Shigenori Otsuka ◽  
Arata Amemiya ◽  
Yasumitsu Maejima ◽  
...  

Japan’s Big Data Assimilation (BDA) project started in October 2013 and ended its 5.5-year period in March 2019. The direct follow-on project was accepted and started in April 2019 under the Japan Science and Technology Agency (JST) AIP (Advanced Intelligence Project) Acceleration Research, with an emphasis on the connection with AI technologies, in particular the integration of DA and AI with high-performance computing (HPC). The BDA project aimed to take full advantage of “big data” from advanced sensors such as the phased array weather radar (PAWR) and the Himawari-8 geostationary satellite, which provide two orders of magnitude more data than previous sensors. We have achieved successful case studies with a newly developed 30-second-update, 100-m-mesh numerical weather prediction (NWP) system based on RIKEN’s SCALE model and the local ensemble transform Kalman filter (LETKF) to assimilate PAWR data in Osaka and Kobe. We have been actively developing the workflow for real-time weather forecasting in Tokyo in summer 2020. In addition, we developed two precipitation nowcasting systems using the 30-second PAWR data: one based on optical flow, the other on deep learning. We chose the convolutional long short-term memory (Conv-LSTM) network as the deep learning algorithm and found it effective for precipitation nowcasting. The use of Conv-LSTM would lead to an integration of DA and AI with HPC. This presentation will include an overview of the BDA project toward a DA-AI-HPC integration under the new AIP Acceleration Research scheme, and recent progress of the project.

2021 ◽  
Author(s):  
Aryaman Sinha ◽  
Mayuna Gupta ◽  
K S S Sai Srujan ◽  
Hariprasad Kodamana ◽  
Sandeep Sukumaran

The synoptic-scale (3-7 days) variability is a dominant contributor to the Indian summer monsoon (ISM) seasonal precipitation. An accurate prediction of ISM precipitation by dynamical or statistical models remains a challenge. Here we show that the sea level pressure (SLP) can be used as a proxy to predict the active-break cycle as well as the genesis of low-pressure systems (LPS), using a deep learning model, namely, convolutional long short-term memory (ConvLSTM) networks. The deep learning model is able to reliably predict the daily SLP anomalies over Central India and the Bay of Bengal at a lead time of 7 days. As the fluctuations in SLP drive the changes in the strength of the atmospheric circulation, the prediction of SLP anomalies is useful in predicting the intensity of ISM. It is demonstrated that the ConvLSTM possesses better prediction skill compared to a conventional numerical weather prediction model, indicating the usefulness of a physics-guided deep learning model in medium-range weather forecasting.


2020 ◽  
pp. 158-161
Author(s):  
Chandraprabha S ◽  
Pradeepkumar G ◽  
Dineshkumar Ponnusamy ◽  
Saranya M D ◽  
Satheesh Kumar S ◽  
...  

This paper presents an artificial-intelligence-based system for forecasting real-time light-dependent resistor (LDR) data, with applications such as indoor lighting, places where an enormous amount of heat is produced, agriculture (to increase crop yield), and solar plants (for solar irradiance tracking). The system uses an LDR sensor to measure light intensity. The sensor data are posted to an Adafruit cloud every two seconds using a NodeMCU ESP8266 module and are also shown on an Adafruit dashboard for monitoring the sensor variables. A long short-term memory (LSTM) network serves as the deep learning model. The LSTM uses the historical data recorded in the Adafruit cloud, which is paired with the NodeMCU, to model the real-time, long-term time series of light intensity. The data are extracted from the cloud for analytics, and the deep learning model is then applied to predict future light intensity values.
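
As a rough illustration of how a recorded sensor series is usually prepared for a sequence model such as an LSTM, the sketch below cuts a series into fixed-length input windows with one target value each. The function name `make_windows`, the `lookback` and `horizon` parameters, and the sample readings are assumptions made for this example; the abstract does not describe the authors' actual preprocessing.

```python
def make_windows(series, lookback, horizon=1):
    """Split a 1-D series into (input window, target) pairs.

    lookback: number of past samples fed to the model (assumed value).
    horizon:  how many steps ahead the target lies (assumed value).
    """
    pairs = []
    for start in range(len(series) - lookback - horizon + 1):
        window = series[start:start + lookback]
        target = series[start + lookback + horizon - 1]
        pairs.append((window, target))
    return pairs


# Hypothetical LDR readings, one every two seconds (not data from the paper).
readings = [512, 515, 520, 518, 530, 541, 539]
pairs = make_windows(readings, lookback=3)

for window, target in pairs:
    print(window, "->", target)
```

With a two-second sampling interval, a `lookback` of 3 corresponds to six seconds of history per prediction; a real deployment would likely use a much longer window.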


Sensors ◽  
2022 ◽  
Vol 22 (1) ◽  
pp. 360
Author(s):  
Theyazn H. H. Aldhyani ◽  
Hasan Alkahtani

Rapid technological development has drastically changed the automotive industry. Network communication has improved, helping vehicles transition from completely machine-controlled to software-controlled technologies. The autonomous vehicle network is controlled by the controller area network (CAN) bus protocol. Nevertheless, the autonomous vehicle network still has cybersecurity issues and weaknesses, because the complexity of its data and traffic behaviors facilitates unauthorized intrusion into the CAN bus and several types of attacks. Therefore, developing systems to rapidly detect message attacks in CAN is one of the biggest challenges. This study presents a high-performance system with an artificial intelligence approach that protects the vehicle network from cyber threats. The system secures the autonomous vehicle from intrusions by using deep learning approaches. The proposed security system was verified using a real automatic vehicle network dataset, including spoofing, flood, and replay attacks as well as benign packets. Preprocessing was applied to convert the categorical data into numerical form. The dataset was processed using a convolutional neural network (CNN) and a hybrid network combining CNN and long short-term memory (CNN-LSTM) models to identify attack messages. The results revealed that the model achieved high performance, as evaluated by the metrics of precision, recall, F1 score, and accuracy. The proposed system achieved high accuracy (97.30%). Along with the empirical demonstration, the proposed system enhanced the detection and classification accuracy compared with the existing systems and was proven to have superior performance for real-time CAN bus security.
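
The four evaluation metrics named above can be computed from confusion-matrix counts. The sketch below shows the standard definitions for a binary attack-versus-benign split; the function name and the counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1 score, and accuracy from confusion counts.

    tp: attacks flagged as attacks     fp: benign packets flagged as attacks
    fn: attacks missed                 tn: benign packets passed as benign
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts for illustration only.
m = classification_metrics(tp=90, fp=10, fn=5, tn=95)
print(m)
```

Accuracy alone can look strong when benign packets dominate the traffic, which is one reason intrusion-detection results are commonly reported with precision, recall, and F1 score as well.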


Electronics ◽  
2020 ◽  
Vol 9 (7) ◽  
pp. 1140
Author(s):  
Jeong-Hee Lee ◽  
Jongseok Kang ◽  
We Shim ◽  
Hyun-Sang Chung ◽  
Tae-Eung Sung

Building a pattern detection model with a deep learning algorithm for data collected from manufacturing sites is an effective way to support decision-making and assess business feasibility for enterprises, because it provides the results and implications of pattern analysis of the big data generated at manufacturing sites. Identifying the threshold of an abnormal pattern requires collaboration between data analysts and manufacturing process experts, which is difficult and time-consuming in practice. This paper suggests how to derive the threshold of the abnormal pattern without manual labelling by process experts, and offers an algorithm that predicts the potential for future failures in advance by using a hybrid Convolutional Neural Network (CNN)–Long Short-Term Memory (LSTM) algorithm and the Fast Fourier Transform (FFT) technique. We found that preprocessing the data set through FFT makes it easier to detect abnormal patterns that cannot be found in the time domain. Our study shows that both training loss and test loss converged to near zero, with the lowest loss compared to existing models such as LSTM. Our proposed model and preprocessing method greatly help in understanding the abnormal patterns of unlabeled big data produced at manufacturing sites, and can be a strong foundation for detecting the threshold of abnormal patterns in such data.
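
To illustrate why a frequency-domain view can expose a pattern that is hard to see in the time domain, the sketch below applies a small radix-2 FFT to two synthetic signals and locates the frequency bin where they differ most. The signals, sampling rate, and function names are assumptions made for this example; this is not the authors' preprocessing pipeline.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(x):
    """Normalized magnitudes for the non-negative frequency bins."""
    return [abs(c) / len(x) for c in fft(x)[: len(x) // 2]]


# Synthetic one-second window sampled at 64 Hz, so bin k corresponds to k Hz.
n = 64
normal = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
# Hypothetical fault: a weak 20 Hz vibration added to the 4 Hz baseline.
faulty = [normal[t] + 0.2 * math.sin(2 * math.pi * 20 * t / n) for t in range(n)]

spec_normal = magnitude_spectrum(normal)
spec_faulty = magnitude_spectrum(faulty)
diff = [f - g for f, g in zip(spec_faulty, spec_normal)]
peak_bin = max(range(len(diff)), key=diff.__getitem__)
print("largest spectral change at bin", peak_bin)
```

In the time domain the added component only perturbs the waveform slightly, whereas in the spectrum it shows up as a separate peak that a simple threshold on `diff` could flag.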


2021 ◽  
Author(s):  
Takemasa Miyoshi ◽  
Takumi Honda ◽  
Arata Amemiya ◽  
Shigenori Otsuka ◽  
Yasumitsu Maejima ◽  
...  

Japan’s Big Data Assimilation (BDA) project started in October 2013 and ended its 5.5-year period in March 2019. Here, we developed a novel numerical weather prediction (NWP) system at 100-m resolution updated every 30 seconds for precise prediction of individual convective clouds. This system was designed to take full advantage of the phased array weather radar (PAWR), which observes reflectivity and Doppler velocity every 30 seconds at 100 elevation angles with 100-m range resolution. By the end of the 5.5-year project period, we achieved a computational time of less than 30 seconds using Japan’s flagship K computer, whose 10-petaflops performance was ranked #1 in the TOP500 list in 2011, for past cases with all input data, such as boundary conditions and observation data, ready to use. The direct follow-on project started in April 2019 under the Japan Science and Technology Agency (JST) AIP (Advanced Intelligence Project) Acceleration Research. We continued the development to achieve real-time operation of this novel 30-second-update NWP system for demonstration at the time of the Tokyo 2020 Olympic and Paralympic Games. The games were postponed, but the project achieved a real-time demonstration of the 30-second-update NWP system at 500-m resolution using a powerful supercomputer called Oakforest-PACS, operated jointly by the University of Tsukuba and the University of Tokyo. The additional developments include parameter tuning for more accurate prediction and a complete workflow to prepare all input data in real time, i.e., fast data transfer from the novel dual-polarization PAWR called MP-PAWR at Saitama University, and real-time nested-domain forecasts at 18-km, 6-km, and 1.5-km resolution to provide lateral boundary conditions for the innermost 500-m-mesh domain. A real-time test was performed between July 31 and August 7, 2020, and resulted in an actual lead time of more than 27 minutes for 30-minute predictions, with very few exceptions of extended delay. Past case experiments showed that this system could capture the rapid intensification and decay of convective rain occurring in less than 10 minutes, whereas the JMA nowcasting by design did not predict such rapid changes. This presentation will summarize the real-time demonstration between August 25 and September 7, when the Tokyo 2020 Paralympic Games were supposed to take place.


2022 ◽  
Author(s):  
Romit Maulik ◽  
Vishwas Rao ◽  
Jiali Wang ◽  
Gianmarco Mengaldo ◽  
Emil Constantinescu ◽  
...  

Abstract. Data assimilation (DA) in the geophysical sciences remains the cornerstone of robust forecasts from numerical models. Indeed, DA plays a crucial role in the quality of numerical weather prediction and is a key building block that has allowed dramatic improvements in weather forecasting over the past few decades. DA is commonly framed in a variational setting, where one solves an optimization problem within a Bayesian formulation using raw model forecasts as a prior, and observations as likelihood. This leads to a DA objective function that needs to be minimized, where the decision variables are the initial conditions specified to the model. In traditional DA, the forward model is numerically and computationally expensive. Here we replace the forward model with a low-dimensional, data-driven, and differentiable emulator. Consequently, gradients of our DA objective function with respect to the decision variables are obtained rapidly via automatic differentiation. We demonstrate our approach by performing an emulator-assisted DA forecast of geopotential height. Our results indicate that emulator-assisted DA is faster than traditional equation-based DA forecasts by four orders of magnitude, allowing computations to be performed on a workstation rather than a dedicated high-performance computer. In addition, we describe accuracy benefits of emulator-assisted DA when compared to simply using the emulator for forecasting (i.e., without DA). Our overall formulation is denoted AIEADA (Artificial Intelligence Emulator Assisted Data Assimilation).
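
For orientation, the objective function described above is often written in the textbook strong-constraint 4D-Var form below. The notation is an assumption introduced here, not taken from the abstract: x_0 is the initial condition (the decision variable), x_b the background (prior) state, B the background error covariance, y_k the observations at time k, H_k the observation operator, R_k the observation error covariance, and M_{0→k} the forward model from the initial time to time k. The paper's exact formulation may differ, for example by posing the problem in a reduced-dimensional space.

```latex
J(\mathbf{x}_0) =
  \frac{1}{2}\,(\mathbf{x}_0 - \mathbf{x}_b)^{\mathsf{T}} \mathbf{B}^{-1} (\mathbf{x}_0 - \mathbf{x}_b)
  + \frac{1}{2} \sum_{k=0}^{K}
    \bigl(\mathbf{y}_k - \mathcal{H}_k(\mathcal{M}_{0 \to k}(\mathbf{x}_0))\bigr)^{\mathsf{T}}
    \mathbf{R}_k^{-1}
    \bigl(\mathbf{y}_k - \mathcal{H}_k(\mathcal{M}_{0 \to k}(\mathbf{x}_0))\bigr)
```

In this form, replacing the forward model M with a differentiable emulator means the gradient of J with respect to x_0 follows from the chain rule through the emulator, which is what automatic differentiation provides without a hand-written adjoint model.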


2021 ◽  
Vol 2 (2) ◽  
Author(s):  
Kate Highnam ◽  
Domenic Puzio ◽  
Song Luo ◽  
Nicholas R. Jennings

Abstract. Botnets and malware continue to avoid detection by static rule engines when using domain generation algorithms (DGAs) for callouts to unique, dynamically generated web addresses. Common DGA detection techniques fail to reliably detect DGA variants that combine random dictionary words to create domain names that closely mirror legitimate domains. To combat this, we created a novel hybrid neural network, Bilbo the “bagging” model, that analyses domains and scores the likelihood they are generated by such algorithms and therefore are potentially malicious. Bilbo is the first parallel usage of a convolutional neural network (CNN) and a long short-term memory (LSTM) network for DGA detection. Our unique architecture is found to be the most consistent in performance in terms of AUC, F1 score, and accuracy when generalising across different dictionary DGA classification tasks compared to current state-of-the-art deep learning architectures. We validate using reverse-engineered dictionary DGA domains and detail our real-time implementation strategy for scoring real-world network logs within a large enterprise. In 4 h of actual network traffic, the model discovered at least five potential command-and-control networks that commercial vendor tools did not flag.


Entropy ◽  
2021 ◽  
Vol 23 (7) ◽  
pp. 859
Author(s):  
Abdulaziz O. AlQabbany ◽  
Aqil M. Azmi

We are living in the age of big data, a majority of which is stream data. The real-time processing of this data requires careful consideration from different perspectives. Concept drift, a change in the data’s underlying distribution, is a significant issue, especially when learning from data streams. It requires learners to be adaptive to dynamic changes. Random forest is an ensemble approach that is widely used in classical non-streaming settings of machine learning applications. At the same time, the Adaptive Random Forest (ARF) is a stream learning algorithm that showed promising results in terms of its accuracy and ability to deal with various types of drift. The incoming instances’ continuity allows for their binomial distribution to be approximated by a Poisson(1) distribution. In this study, we propose a mechanism to increase such streaming algorithms’ efficiency by focusing on resampling. Our measure, resampling effectiveness (ρ), fuses the two most essential aspects in online learning: accuracy and execution time. We use six different synthetic data sets, each having a different type of drift, to empirically select the parameter λ of the Poisson distribution that yields the best value for ρ. By comparing the standard ARF with its tuned variations, we show that ARF performance can be enhanced by tackling this important aspect. Finally, we present three case studies from different contexts to test our proposed enhancement method and demonstrate its effectiveness in processing large data sets: (a) Amazon customer reviews (written in English), (b) hotel reviews (in Arabic), and (c) real-time aspect-based sentiment analysis of COVID-19-related tweets in the United States during April 2020. Results indicate that our proposed method of enhancement exhibited considerable improvement in most of the situations.
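
The role of λ in Poisson-based stream resampling can be sketched as follows: each incoming instance is given a weight drawn from Poisson(λ) for each ensemble member, so λ controls both how often a member trains on an instance and how often it skips one. The sampler, the seed, the sample size, and the λ values below are illustrative assumptions; this is not the authors' implementation, and it does not compute their measure ρ.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


rng = random.Random(0)  # fixed seed so the run is repeatable
stats = {}
for lam in (1.0, 3.0, 6.0):  # candidate values chosen for illustration
    draws = [sample_poisson(lam, rng) for _ in range(20000)]
    stats[lam] = {
        # Average number of training repetitions per instance and member.
        "mean_weight": sum(draws) / len(draws),
        # Share of instances a member does not train on at all (weight 0).
        "skip_fraction": draws.count(0) / len(draws),
    }
    print(lam, stats[lam])
```

The mean weight grows linearly with λ while the skip fraction falls roughly as exp(-λ), so a larger λ means more training work per instance. This is the kind of trade-off between accuracy and execution time that a combined measure has to balance.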

