Preliminary assessment of the pragmatic value of information in the classification problem based on deep neural networks

2021
Vol 16 (93)
pp. 9-20
Author(s):
Valery P. Meshalkin
Maxim I. Dli
Andrey Yu. Puchkov
Ekaterina I. Lobaneva
...

A method is proposed for preliminary assessment of the pragmatic value of information in the problem of classifying the state of an object, based on deep recurrent long short-term memory (LSTM) networks. The purpose of the study is to develop a method for predicting the state of a controlled object while minimizing the number of prognostic parameters used, through a preliminary assessment of the pragmatic value of information. This task is especially urgent when processing big data, which is characterized not only by significant volumes of incoming information but also by its arrival rate and variety of formats. Big data is now generated in almost all areas of activity due to the widespread introduction of the Internet of Things. The method is implemented as a two-level scheme for processing input information. At the first level, a Random Forest machine learning algorithm is used; it has significantly fewer adjustable parameters than the recurrent neural network used at the second level for the final, more accurate classification of the state of the controlled object or process. Random Forest was chosen for its ability to assess the importance of variables in regression and classification problems, which is exploited at the first level to determine the pragmatic value of the input information: a parameter reflecting this value is selected, the input variables are ranked by importance, and the top-ranked variables are used to form the training datasets for the recurrent network. The proposed data processing method with a preliminary assessment of the pragmatic value of information was implemented as a MATLAB program and demonstrated its efficiency in an experiment on model data.
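As an illustration of the two-level scheme, the sketch below approximates it in Python rather than the authors' MATLAB implementation: scikit-learn's Random Forest supplies the first-level importance ranking, and a small PyTorch LSTM serves as the second-level classifier. The synthetic data, the number of retained variables, and all hyperparameters are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of the two-level scheme (illustrative, not the authors' MATLAB code).
# Level 1: Random Forest ranks input variables by importance.
# Level 2: an LSTM classifier is trained only on the top-ranked variables.
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20)).astype(np.float32)   # 20 candidate prognostic parameters
y = (X[:, 3] + X[:, 7] > 0).astype(np.int64)         # synthetic object-state labels

# Level 1: estimate the pragmatic value of each input via RF feature importance.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top = np.argsort(rf.feature_importances_)[::-1][:5]  # keep the 5 most important inputs

# Level 2: LSTM over the selected inputs, treated as a length-5 sequence of scalars.
class LSTMClassifier(nn.Module):
    def __init__(self, hidden=16, classes=2):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, classes)
    def forward(self, x):                            # x: (batch, seq_len, 1)
        _, (h, _) = self.lstm(x)
        return self.head(h[-1])

model = LSTMClassifier()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
xb = torch.from_numpy(X[:, top]).unsqueeze(-1)       # (1000, 5, 1)
yb = torch.from_numpy(y)
for _ in range(50):                                  # short full-batch training loop
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(xb), yb)
    loss.backward()
    opt.step()
```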

Entropy
2021
Vol 23 (7)
pp. 859
Author(s):  
Abdulaziz O. AlQabbany
Aqil M. Azmi

We are living in the age of big data, a majority of which arrives as stream data. Real-time processing of this data requires careful consideration from different perspectives. Concept drift, a change in the data's underlying distribution, is a significant issue, especially when learning from data streams; it requires learners to adapt to dynamic changes. Random forest is an ensemble approach widely used in classical non-streaming settings of machine learning applications, while the Adaptive Random Forest (ARF) is a stream learning algorithm that has shown promising results in terms of accuracy and its ability to deal with various types of drift. The continuity of incoming instances allows their binomial resampling distribution to be approximated by a Poisson(1) distribution. In this study, we propose a mechanism to increase such streaming algorithms' efficiency by focusing on resampling. Our measure, resampling effectiveness (ρ), fuses the two most essential aspects of online learning: accuracy and execution time. We use six different synthetic data sets, each exhibiting a different type of drift, to empirically select the parameter λ of the Poisson distribution that yields the best value of ρ. By comparing the standard ARF with its tuned variations, we show that ARF performance can be enhanced by tackling this important aspect. Finally, we present three case studies from different contexts to test our proposed enhancement method and demonstrate its effectiveness in processing large data sets: (a) Amazon customer reviews (written in English), (b) hotel reviews (in Arabic), and (c) real-time aspect-based sentiment analysis of COVID-19-related tweets in the United States during April 2020. Results indicate that the proposed enhancement yields considerable improvement in most situations.
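The core mechanism being tuned here is the Poisson(λ) resampling used by online bagging (and hence ARF): each ensemble member trains on each incoming instance k times, with k drawn from Poisson(λ). The abstract does not define ρ precisely, so the sketch below shows only this resampling step in Python, with placeholder base learners and synthetic stream data.

```python
# Sketch of Poisson(lam) online-bagging resampling, the step tuned in the paper.
# Each ensemble member trains on each incoming instance k times, k ~ Poisson(lam);
# lam trades off how much data each member sees against execution time.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
ensemble = [SGDClassifier() for _ in range(10)]      # placeholder base learners
classes = np.array([0, 1])

def learn_one(x, y, lam=1.0):
    """Feed one stream instance to every ensemble member with Poisson(lam) weight."""
    for member in ensemble:
        k = rng.poisson(lam)                         # times this member sees the instance
        for _ in range(k):
            member.partial_fit(x.reshape(1, -1), [y], classes=classes)

def predict_one(x):
    votes = [m.predict(x.reshape(1, -1))[0] for m in ensemble if hasattr(m, "coef_")]
    return max(set(votes), key=votes.count) if votes else 0

# Simulated stream: larger lam means more resampling (and more computation).
for _ in range(500):
    x = rng.normal(size=4)
    y = int(x[0] + x[1] > 0)
    learn_one(x, y, lam=6.0)                         # lam is the empirically tuned parameter
```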


Author(s):  
V. Fartukov
N. Hanov

A data analysis tree for the formation, preprocessing, storage, and protection of data, based on Big Data and Blockchain technologies, has been developed. The developed algorithm allows for classification of data on the state of the field, split testing of the data, forecasting, and machine learning for the implementation of differential irrigation with sprinklers.
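The abstract does not detail the Blockchain component; one plausible reading is that field-state records are stored in a hash chain so that tampering with stored data is detectable. The record layout and field names in the following Python sketch are hypothetical.

```python
# Illustrative hash-chain (blockchain-style) storage of field-state records.
# Each block stores a record plus the hash of the previous block, so any
# modification of stored data breaks the chain and is detectable.
import hashlib
import json

def block_hash(block: dict) -> str:
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

chain = [{"index": 0, "record": "genesis", "prev": "0" * 64}]

def append_record(record: dict) -> None:
    chain.append({"index": len(chain), "record": record, "prev": block_hash(chain[-1])})

def verify_chain() -> bool:
    return all(chain[i]["prev"] == block_hash(chain[i - 1]) for i in range(1, len(chain)))

# Hypothetical field-state records feeding a differential-irrigation decision.
append_record({"field": "A3", "soil_moisture": 0.21, "ts": "2021-06-01T06:00"})
append_record({"field": "A3", "soil_moisture": 0.18, "ts": "2021-06-01T12:00"})
assert verify_chain()
```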


2019
Vol 11 (5)
pp. 601
Author(s):  
Sajid Pareeth
Poolad Karimi
Mojtaba Shafiei
Charlotte De Fraiture

The increase in irrigated area in the semi-arid regions of Asia and Africa, driven by demand for more food production, is putting pressure on already strained water resources. To manage this situation, monitoring the spatial and temporal dynamics of irrigated land use at the basin level is needed to ensure proper allocation of water. Publicly available satellite data at high spatial resolution and advances in remote sensing techniques offer a viable opportunity. In this study, we developed a new approach using time series of Landsat 8 (L8) data and the Random Forest (RF) machine learning algorithm, introducing a hierarchical post-processing scheme to extract key Land Use Land Cover (LULC) types. We implemented this approach for the Mashhad basin in Iran to develop a LULC map at 15 m spatial resolution with nine classes for the crop year 2015/2016. In addition, five irrigated land use types were extracted for three crop years (2013/2014, 2014/2015, and 2015/2016) using the RF models. The total irrigated area was estimated at 1796.16 km², 1581.7 km², and 1578.26 km² for the cropping years 2013/2014, 2014/2015, and 2015/2016, respectively. The overall accuracy of the final LULC map was 87.2%, with a kappa coefficient of 0.85. The methodology was implemented using open data and open source libraries. The ability of the RF models to extract key LULC types at the basin level shows the usability of such approaches for operational near-real-time monitoring.
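The paper's band composites and hierarchical post-processing scheme are not reproduced here; the Python sketch below shows only the core step under stated assumptions: training an RF on per-pixel Landsat time-series features and reporting overall accuracy and the kappa coefficient. The feature stack and labels are random placeholders.

```python
# Sketch of the core classification step: a Random Forest over per-pixel
# Landsat 8 time-series features, evaluated with overall accuracy and kappa.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Placeholder stack: n_pixels x (bands * acquisition dates), e.g. 6 bands x 12 dates.
X = rng.normal(size=(5000, 72)).astype(np.float32)
y = rng.integers(0, 9, size=5000)        # nine LULC classes, as in the paper

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)

print(f"overall accuracy:  {accuracy_score(y_te, pred):.3f}")
print(f"kappa coefficient: {cohen_kappa_score(y_te, pred):.3f}")
```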


2019
Vol 4 (2)
pp. 910-917
Author(s):  
Chao-Chun Chen
Min-Hsiung Hung
Benny Suryajaya
Yu-Chuan Lin
Haw-Ching Yang
...  

Information
2021
Vol 12 (12)
pp. 517
Author(s):  
Rakib Hossen
Md Whaiduzzaman
Mohammed Nasir Uddin
Md. Jahidul Islam
Nuruzzaman Faruqui
...  

The Internet of Things (IoT) has seen a surge in mobile devices with market and technical expansion. IoT networks provide end-to-end connectivity while keeping latency minimal. To reduce delays, efficient data delivery schemes are required for dispersed fog-IoT network orchestrations. We use a Spark-based big data processing scheme (BDPS), built on Spark's resilient distributed datasets (RDDs), as a delay-efficient technique in the fogs for a decentralized heterogeneous network architecture, reinforcing suitable data allocation across IoT devices. We propose the BDPS, based on Spark RDDs in a fog-IoT overlay architecture, to address performance issues across the network orchestration. We evaluate data processing delays from fog-IoT integrated parts using a depth-first-search-based shortest-path node-finding configuration, which outperforms existing shortest-path algorithms, including the Bellman–Ford (BF) algorithm, the Floyd–Warshall (FW) algorithm, the Dijkstra algorithm (DA), and the Apache Hadoop (AH) algorithm, in algorithmic efficiency. The BDPS exhibits lower latency in packet deliveries as well as lower network-overhead uplink activity than BF, DA, FW, and AH, through a map-reduced resilient data distribution mechanism. Overall, the BDPS supports efficient data delivery across the fog-IoT orchestration, executing nodes faster and producing effective results compared to DA, BF, FW, and AH.
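The abstract does not specify the DFS configuration; one hedged interpretation is a depth-first enumeration of simple paths that prunes any partial path already costlier than the best complete path found. The node names and link delays in the Python sketch below are illustrative, not from the paper.

```python
# Hedged interpretation of a DFS-based shortest-path search: enumerate simple
# paths depth-first, pruning branches whose cost already exceeds the best found.
def dfs_shortest_path(graph, src, dst):
    best = {"cost": float("inf"), "path": None}

    def dfs(node, cost, path):
        if cost >= best["cost"]:          # prune: cannot improve on the best path
            return
        if node == dst:
            best["cost"], best["path"] = cost, path
            return
        for nxt, w in graph.get(node, {}).items():
            if nxt not in path:           # keep paths simple (no cycles)
                dfs(nxt, cost + w, path + [nxt])

    dfs(src, 0, [src])
    return best["cost"], best["path"]

# Illustrative fog-IoT node graph with link delays as edge weights.
graph = {
    "iot":  {"fog1": 2, "fog2": 5},
    "fog1": {"fog2": 1, "cloud": 7},
    "fog2": {"cloud": 3},
}
print(dfs_shortest_path(graph, "iot", "cloud"))  # -> (6, ['iot', 'fog1', 'fog2', 'cloud'])
```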


2014
Vol 12 (6)
pp. 311-316
Author(s):  
Yoon-Su Jeong
Kun-Hee Han
