Using Trend Extraction and Spatial Trends to Improve Flood Modeling and Control

2021 ◽  
Author(s):  
Jacob Hale ◽  
Suzanna Long ◽  
Vinayaka Gude ◽  
Steven Corns

Effective management of flood events depends on a thorough understanding of regional geospatial characteristics, yet data visualization is rarely effectively integrated into the planning tools used by decision makers. This chapter considers publicly available data sets and data visualization techniques that can be adapted for use by all community planners and decision makers. A long short-term memory (LSTM) network is created to generate univariate time series predictions of river stage, improving the temporal resolution and accuracy of forecasts. This prediction is then tied to a corresponding spatial flood inundation profile in a geographic information system (GIS) setting. The intersection of the flood profile and affected road segments can be easily visualized and extracted. Traffic decision makers can use these findings to proactively deploy re-routing measures and warnings to motorists to decrease travel-miles and risks such as loss of property or life.
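
As an illustration of the forecasting component only, the sketch below trains a small univariate LSTM on a sliding window of past river-stage readings and predicts the next reading. The window length, network size, and the synthetic gauge series are assumptions for the example, not values from the chapter.

```python
# Minimal sketch (assumed details): a univariate LSTM that predicts the next
# river-stage reading from a sliding window of past readings.
import numpy as np
import tensorflow as tf

def make_windows(series, window=24):
    """Turn a 1-D stage series into (window, 1) inputs and next-step targets."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X)[..., None], np.array(y)

# Hypothetical gauge readings; in practice these would come from a river gauge feed.
stage = np.sin(np.linspace(0, 20, 1000)) + 0.1 * np.random.randn(1000)
X, y = make_windows(stage)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(X.shape[1], 1)),
    tf.keras.layers.Dense(1),  # predicted stage at the next time step
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

# The predicted stage would then be joined to a flood-inundation profile in GIS.
next_stage = model.predict(X[-1:], verbose=0)[0, 0]
```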

2021 ◽  
Vol 2 (2) ◽  
Author(s):  
Kate Highnam ◽  
Domenic Puzio ◽  
Song Luo ◽  
Nicholas R. Jennings

Abstract Botnets and malware continue to avoid detection by static rule engines when using domain generation algorithms (DGAs) for callouts to unique, dynamically generated web addresses. Common DGA detection techniques fail to reliably detect DGA variants that combine random dictionary words to create domain names that closely mirror legitimate domains. To combat this, we created a novel hybrid neural network, Bilbo the “bagging” model, that analyses domains and scores the likelihood they are generated by such algorithms and therefore are potentially malicious. Bilbo is the first parallel usage of a convolutional neural network (CNN) and a long short-term memory (LSTM) network for DGA detection. Our unique architecture is found to be the most consistent in performance in terms of AUC, $F_1$ score, and accuracy when generalising across different dictionary DGA classification tasks compared to current state-of-the-art deep learning architectures. We validate using reverse-engineered dictionary DGA domains and detail our real-time implementation strategy for scoring real-world network logs within a large enterprise. In 4 h of actual network traffic, the model discovered at least five potential command-and-control networks that commercial vendor tools did not flag.
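
To make the "parallel CNN and LSTM" idea concrete, here is a hedged sketch of such a hybrid over character-encoded domain names; the embedding size, filter widths, and layer widths are illustrative assumptions and this is not the authors' released Bilbo code.

```python
# Illustrative sketch: a CNN branch and an LSTM branch read the same
# character-embedded domain name in parallel; their outputs are concatenated
# and scored with a sigmoid, mirroring the hybrid design described above.
import tensorflow as tf

MAX_LEN, VOCAB = 63, 40          # assumed maximum domain length and charset size

inp = tf.keras.Input(shape=(MAX_LEN,), dtype="int32")
emb = tf.keras.layers.Embedding(VOCAB, 32)(inp)

# CNN branch: n-gram-like character patterns
cnn = tf.keras.layers.Conv1D(64, 4, activation="relu")(emb)
cnn = tf.keras.layers.GlobalMaxPooling1D()(cnn)

# LSTM branch: sequential character structure
lstm = tf.keras.layers.LSTM(64)(emb)

merged = tf.keras.layers.Concatenate()([cnn, lstm])
out = tf.keras.layers.Dense(1, activation="sigmoid")(merged)  # P(domain is DGA-generated)

model = tf.keras.Model(inp, out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
```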


Author(s):  
Anna Ursyn ◽  
Edoardo L'Astorina

This chapter discusses some possible ways in which professionals, researchers and users representing various knowledge domains are collecting and visualizing big data sets. First it describes communication through the senses as a basis for visualization techniques, computational solutions for enhancing the senses, and ways of enhancing the senses by technology. The next part discusses ideas behind the visualization of data sets and ponders what visualization is and what it is not. Further discussion relates to data visualization through art: as visual solutions to science- and mathematics-related problems, as documentation of objects and events, and as a testimony to thoughts, knowledge and meaning. Learning and teaching through data visualization is the concluding theme of the chapter. Edoardo L'Astorina provides a visual analysis of best practices in visualization: an overlay of Google Maps that showed all the arrival times - in real time - of all the buses in your area based on your location, and a visual representation of all the Tweets in the world about TfL (Transport for London) tube lines to predict disruptions.


Author(s):  
Hongguang Pan ◽  
Tao Su ◽  
Xiangdong Huang ◽  
Zheng Wang

To address the problems of high cost, complicated process, and low accuracy of oxygen content measurement in the flue gas of coal-fired power plants, a method based on a long short-term memory (LSTM) network is proposed in this paper to replace the oxygen sensor and estimate the oxygen content in boiler flue gas. Specifically, first, the LSTM model was built with the Keras deep learning framework, and the accuracy of the model was further improved by selecting appropriate hyper-parameters through experiments. Secondly, the flue gas oxygen content was taken as the leading variable and combined with primary auxiliary variables drawn from the process mechanism and boiler operation. Based on actual production data collected from a coal-fired power plant in Yulin, China, the data sets were preprocessed. Moreover, a selection model for auxiliary variables based on grey relational analysis is proposed to construct a new data set and divide the training and testing sets. Finally, this model is compared with traditional soft-sensing modelling methods (i.e., methods based on the support vector machine and BP neural network). The RMSE of the LSTM model is 4.51% lower than that of the GA-SVM model and 3.55% lower than that of the PSO-BP model. The results show that the LSTM-based oxygen content model generalizes better and has practical industrial value.
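
The grey relational analysis step can be illustrated with a short sketch that ranks candidate auxiliary variables against the flue-gas oxygen content; the distinguishing coefficient of 0.5 and the random stand-in data are conventional assumptions, not values taken from the paper.

```python
# Hedged sketch of grey relational analysis (GRA) for auxiliary-variable selection.
import numpy as np

def grey_relational_grade(reference, candidates, rho=0.5):
    """reference: (n,) leading variable; candidates: (n, m) auxiliary variables."""
    def norm(x):  # scale each series to [0, 1] so magnitudes are comparable
        return (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0) + 1e-12)
    ref, cand = norm(reference), norm(candidates)
    diff = np.abs(cand - ref[:, None])                       # absolute differences
    coeff = (diff.min() + rho * diff.max()) / (diff + rho * diff.max())
    return coeff.mean(axis=0)                                # one grade per candidate

# Hypothetical boiler process data: 500 samples, 6 candidate auxiliary variables.
rng = np.random.default_rng(0)
oxygen = rng.normal(size=500)
aux = rng.normal(size=(500, 6))
grades = grey_relational_grade(oxygen, aux)
selected = np.argsort(grades)[::-1][:3]   # keep the top-ranked variables for the LSTM input
```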


2020 ◽  
Vol 20 (4) ◽  
pp. 55-73
Author(s):  
K. Dinesh Kumar ◽  
E. Umamaheswari

Abstract For cloud providers, workload prediction is a challenging task due to irregular incoming workloads from users. Accurate workload prediction is essential for scheduling resources to cloud applications. Thus, in this paper, the authors propose a predictive cloud workload management framework to estimate the needed resources in advance based on a hybrid approach, which combines an improved Long Short-Term Memory (LSTM) network and a multilayer perceptron network. By improving the traditional LSTM architecture using an opposition-based differential evolution algorithm and a dropout technique on recurrent connections without memory loss, the proposed approach is able to perform a better prediction process. This novel hybrid predictive approach aims to enhance the prediction performance of cloud workloads. Finally, the authors measure the proposed approach’s effectiveness on benchmark data sets from the NASA and Saskatchewan servers. The experimental results show that the proposed approach outperforms the other conventional methods.
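
A minimal sketch of the hybrid idea, assuming a fixed look-back window and a simple averaging of the two branch forecasts, is shown below; the paper's specific improvements to the LSTM (opposition-based differential evolution tuning and dropout on recurrent connections without memory loss) are only approximated by standard recurrent dropout here.

```python
# Hedged sketch: an LSTM branch and a multilayer-perceptron branch each forecast
# the next workload value from the same window; the forecasts are averaged.
import tensorflow as tf

WINDOW = 12                                   # assumed look-back of 12 intervals
inp = tf.keras.Input(shape=(WINDOW, 1))       # past request counts

lstm_branch = tf.keras.layers.LSTM(32, recurrent_dropout=0.2)(inp)
lstm_pred = tf.keras.layers.Dense(1)(lstm_branch)

mlp_branch = tf.keras.layers.Flatten()(inp)
mlp_branch = tf.keras.layers.Dense(32, activation="relu")(mlp_branch)
mlp_pred = tf.keras.layers.Dense(1)(mlp_branch)

forecast = tf.keras.layers.Average()([lstm_pred, mlp_pred])  # combined workload forecast
model = tf.keras.Model(inp, forecast)
model.compile(optimizer="adam", loss="mse")
```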


Author(s):  
R. Daniel Bergeron ◽  
Daniel A. Keim ◽  
Ronald M. Pickett

2021 ◽  
Author(s):  
Xiangyuan Li ◽  
Fei Ding ◽  
Suju Ren ◽  
Jianmin Bao ◽  
Ruoyu Su ◽  
...  

Abstract Due to the heterogeneous characteristics of vehicles and user terminals, information in mixed traffic scenarios can be exchanged through the Web protocols of different terminals. A recommendation system can mine users' travel preferences by analyzing the historical travel information of different traffic participants, so as to publish accurate travel information and services to the terminals of traffic participants. Given the diversification of existing road network users and networking modes, as well as the dynamic changes in user interest distribution caused by the high-speed movement of vehicles, traditional collaborative filtering algorithms have limited effectiveness. This paper proposes a novel Hybrid Tag-aware Recommender Model (HTRM). The embedding layer of the model first employs the Word2vec model to represent the tags and ratings of items and users, respectively. The feature layer then introduces an auto-encoder to extract self-similar features of the item, and a long short-term memory (LSTM) network is used to extract user behavior characteristics to provide higher-quality recommendations. The gating layer combines the features of users and items and then makes score recommendations based on a Fully Connected Neural Network (FCNN). Finally, Web data sets of the different service preferences of traffic participants during their trips are used to evaluate the model's recommendation performance in different scenarios. The experimental results show that the HTRM model is reasonable in design and can achieve high recommendation accuracy.
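
The layered structure (embedding, feature, and gating layers) can be sketched roughly as below, with Keras Embedding layers standing in for pre-trained Word2vec tag vectors; all dimensions, vocabulary sizes, and the particular gating form are illustrative assumptions rather than the HTRM configuration.

```python
# Rough sketch of a tag-aware hybrid recommender: item tags pass through an
# auto-encoder, user history passes through an LSTM, a gate fuses the two,
# and a small fully connected network produces the rating score.
import tensorflow as tf

TAG_VOCAB, SEQ_LEN, DIM = 5000, 20, 64

# Item side: tag sequence -> embedding -> auto-encoder bottleneck features.
item_tags = tf.keras.Input(shape=(SEQ_LEN,), name="item_tags")
item_vec = tf.keras.layers.Flatten()(tf.keras.layers.Embedding(TAG_VOCAB, DIM)(item_tags))
encoded = tf.keras.layers.Dense(DIM, activation="relu")(item_vec)                # encoder
decoded = tf.keras.layers.Dense(SEQ_LEN * DIM, name="reconstruction")(encoded)   # decoder head

# User side: sequence of recently interacted tags -> LSTM behaviour features.
user_hist = tf.keras.Input(shape=(SEQ_LEN,), name="user_history")
user_vec = tf.keras.layers.LSTM(DIM)(tf.keras.layers.Embedding(TAG_VOCAB, DIM)(user_hist))

# Gating layer: learn how much of each side to keep, then score with an FCNN.
gate = tf.keras.layers.Dense(DIM, activation="sigmoid")(
    tf.keras.layers.Concatenate()([encoded, user_vec]))
fused = tf.keras.layers.Multiply()([gate, tf.keras.layers.Add()([encoded, user_vec])])
score = tf.keras.layers.Dense(32, activation="relu")(fused)
score = tf.keras.layers.Dense(1, name="rating_score")(score)

model = tf.keras.Model([item_tags, user_hist], [score, decoded])
model.compile(optimizer="adam",
              loss={"rating_score": "mse", "reconstruction": "mse"})
```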


Author(s):  
Evan F. Sinar

Data visualization—a set of approaches for applying graphical principles to represent quantitative information—is extremely well matched to the nature of survey data but often underleveraged for this purpose. Surveys produce data sets that are highly structured and comparative across groups and geographies, that often blend numerical and open-text information, and that are designed for repeated administration and analysis. Each of these characteristics aligns well with specific visualization types, use of which has the potential to—when paired with foundational, evidence-based tenets of high-quality graphical representations—substantially increase the impact and influence of data presentations given by survey researchers. This chapter recommends and provides guidance on data visualization techniques fit to purpose for survey researchers, while also describing key risks and missteps associated with these approaches.


Sensors ◽  
2018 ◽  
Vol 18 (10) ◽  
pp. 3512 ◽  
Author(s):  
Gaoyang Li ◽  
Xiaohua Wang ◽  
Xi Li ◽  
Aijun Yang ◽  
Mingzhe Rong

Partial discharge (PD) is not only an important symptom for monitoring imperfections in the insulation system of a gas-insulated switchgear (GIS), but also a factor that accelerates degradation. At present, monitoring ultra-high-frequency (UHF) signals induced by PDs is regarded as one of the most effective approaches for assessing insulation severity and classifying PDs. Therefore, in this paper, a deep learning-based PD classification algorithm is proposed and realized with a multi-column convolutional neural network (CNN) that incorporates UHF spectra of multiple resolutions. First, three subnetworks, characterized by their specially designed temporal, frequency, and texture filters, are organized and then integrated by a fully-connected neural network. Then, a long short-term memory (LSTM) network is utilized for fusing the embedded multi-sensor information. Furthermore, to alleviate the risk of overfitting, a transfer learning approach inspired by manifold learning is also presented for model training. To demonstrate, 13 modes of defects, considering both the defect types and their relative positions, were designed for a simulated GIS tank. A detailed analysis of the performance reveals the clear superiority of the proposed method compared to 18 typical baselines. Several advanced visualization techniques are also implemented to explore possible qualitative interpretations of the learned features. Finally, a unified framework based on matrix projection is discussed to provide a possible explanation for the effectiveness of the architecture.
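
A loose sketch of the multi-column arrangement is given below: three small CNN columns, each reading a UHF spectrum at a different resolution, are merged by a fully connected layer, and an LSTM fuses a sequence of per-sensor feature vectors. Input lengths, filter sizes, the number of sensors, and the 13-class head are assumptions for illustration, not the authors' configuration.

```python
# Hedged sketch: multi-column CNN over multi-resolution spectra + LSTM sensor fusion.
import tensorflow as tf

def column(length, kernel):
    """One CNN column over a spectrum of the given length."""
    inp = tf.keras.Input(shape=(length, 1))
    x = tf.keras.layers.Conv1D(16, kernel, activation="relu")(inp)
    x = tf.keras.layers.GlobalMaxPooling1D()(x)
    return inp, x

in_a, feat_a = column(256, 3)    # fine-resolution spectrum
in_b, feat_b = column(128, 5)    # medium-resolution spectrum
in_c, feat_c = column(64, 7)     # coarse-resolution spectrum

merged = tf.keras.layers.Concatenate()([feat_a, feat_b, feat_c])
per_sensor = tf.keras.layers.Dense(32, activation="relu")(merged)    # fully connected merge
column_net = tf.keras.Model([in_a, in_b, in_c], per_sensor)

# Fuse a sequence of per-sensor feature vectors (e.g. 4 UHF sensors) with an LSTM.
sensor_seq = tf.keras.Input(shape=(4, 32))
fused = tf.keras.layers.LSTM(32)(sensor_seq)
logits = tf.keras.layers.Dense(13, activation="softmax")(fused)      # 13 defect modes
fusion_head = tf.keras.Model(sensor_seq, logits)
```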


2018 ◽  
Vol 44 (4) ◽  
pp. 755-792 ◽  
Author(s):  
Debanjan Ghosh ◽  
Alexander R. Fabbri ◽  
Smaranda Muresan

Computational models for sarcasm detection have often relied on the content of utterances in isolation. However, the speaker’s sarcastic intent is not always apparent without additional context. Focusing on social media discussions, we investigate three issues: (1) does modeling conversation context help in sarcasm detection? (2) can we identify what part of conversation context triggered the sarcastic reply? and (3) given a sarcastic post that contains multiple sentences, can we identify the specific sentence that is sarcastic? To address the first issue, we investigate several types of Long Short-Term Memory (LSTM) networks that can model both the conversation context and the current turn. We show that LSTM networks with sentence-level attention on context and current turn, as well as the conditional LSTM network, outperform the LSTM model that reads only the current turn. As conversation context, we consider the prior turn, the succeeding turn, or both. Our computational models are tested on two types of social media platforms: Twitter and discussion forums. We discuss several differences between these data sets, ranging from their size to the nature of the gold-label annotations. To address the latter two issues, we present a qualitative analysis of the attention weights produced by the LSTM models (with attention) and discuss the results compared with human performance on the two tasks.
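
As a concrete illustration of one of the variants examined, the sketch below encodes a prior turn and the current turn with separate LSTMs, applies a simple attention pooling over each turn's word states, and classifies the pair as sarcastic or not. The vocabulary size, sequence length, and this particular attention form are assumptions; the paper's conditional LSTM variant is not reproduced here.

```python
# Hedged sketch: attention-pooled LSTM readers for (context turn, current turn).
import tensorflow as tf

VOCAB, MAX_LEN, DIM = 20000, 50, 100

def attentive_encoder(name):
    inp = tf.keras.Input(shape=(MAX_LEN,), name=name)
    emb = tf.keras.layers.Embedding(VOCAB, DIM)(inp)
    states = tf.keras.layers.LSTM(64, return_sequences=True)(emb)
    scores = tf.keras.layers.Dense(1)(states)             # one score per time step
    weights = tf.keras.layers.Softmax(axis=1)(scores)     # attention distribution over words
    summary = tf.keras.layers.Dot(axes=1)([weights, states])
    return inp, tf.keras.layers.Flatten()(summary)

ctx_in, ctx_vec = attentive_encoder("context_turn")
cur_in, cur_vec = attentive_encoder("current_turn")

merged = tf.keras.layers.Concatenate()([ctx_vec, cur_vec])
pred = tf.keras.layers.Dense(1, activation="sigmoid")(merged)   # P(current turn is sarcastic)

model = tf.keras.Model([ctx_in, cur_in], pred)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```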


2019 ◽  
Vol 8 (8) ◽  
pp. 348 ◽  
Author(s):  
Netek ◽  
Brus ◽  
Tomecka

We are now generating exponentially more data from more sources than a few years ago. Big data, an already familiar term, has generally been defined as a massive volume of structured, semi-structured, and/or unstructured data, which may not be effectively managed and processed using traditional databases and software techniques. It can be problematic to visualize a large amount of data easily and quickly via an Internet platform. From this perspective, the main aim of the paper is to test the point data visualization capabilities of selected JavaScript mapping libraries in order to measure their performance and ability to cope with a big amount of data. Nine datasets containing 10,000 to 3,000,000 points were generated from the Nature Conservation Database. Five libraries for marker clustering and two libraries for heatmap visualization were analyzed. Loading time and the ability to visualize large data sets were compared for each dataset and each library. The best-evaluated library was Mapbox GL JS (Graphics Library JavaScript), with the highest overall performance. Some of the tested libraries were not able to handle the desired amount of data. In general, fewer than 100,000 points was indicated as the threshold for implementation without a noticeable slowdown in performance; above this, data volume can become a limiting factor for point data visualization in such a dynamic environment as we live in nowadays.

