Time-Aware and Feature Similarity Self-Attention in Vessel Fuel Consumption Prediction

2021 ◽  
Vol 11 (23) ◽  
pp. 11514
Author(s):  
Hyun Joon Park ◽  
Min Seok Lee ◽  
Dong Il Park ◽  
Sung Won Han

Accurate vessel fuel consumption prediction is essential for constructing a ship route network and for vessel management, leading to efficient sailings. In addition, ship data from monitoring and sensing systems accelerate fuel consumption prediction research. However, ship data have three properties that make the prediction problem challenging: they are sequential, they are recorded at irregular time intervals, and their features differ in importance. In this paper, we propose Time-aware Attention (TA) and Feature-similarity Attention (FA) applied to bi-directional Long Short-Term Memory (BiLSTM). TA derives time importance from the irregular time intervals in each sequence through a nonlinear function and emphasizes data according to that importance. FA emphasizes data based on similarities of features in the sequence by estimating feature importance with learnable parameters. Finally, we propose an ensemble model of TA- and FA-based BiLSTM. The ensemble model, which consists of fully connected layers, is capable of simultaneously capturing different properties of ship data. The experimental results on ship data showed that the proposed model improves the performance in predicting fuel consumption. In addition to model performance, visualization results of attention maps and feature importance help to understand data properties and model characteristics.
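The idea behind Time-aware Attention described above can be illustrated with a minimal sketch. It is not the authors' implementation: the decay function `time_importance`, the softmax normalization, and the weighted-sum pooling are all assumptions chosen for illustration, and in the paper the attention is applied to BiLSTM outputs with learnable parameters.

```python
import math


def time_importance(delta_t):
    """Hypothetical nonlinear map from a time gap to an importance score.

    Larger gaps since the previous record get a lower score. The exact
    function is an assumption, not the one used in the paper.
    """
    return 1.0 / math.log(math.e + delta_t)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def time_aware_pool(hidden_states, intervals):
    """Pool a sequence of hidden vectors using time-derived attention weights.

    hidden_states: list of equal-length lists (one vector per time step).
    intervals: list of non-negative time gaps, one per time step.
    Returns (pooled_vector, attention_weights).
    """
    weights = softmax([time_importance(dt) for dt in intervals])
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return pooled, weights
```

With made-up inputs such as `time_aware_pool([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 10.0, 100.0])`, the record with the smallest gap receives the largest weight, which is the "emphasize data depending on the importance" behavior in its simplest form.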

2021 ◽  
Vol 15 (3) ◽  
pp. 1-16
Author(s):  
Lin Cheng ◽  
Yuliang Shi ◽  
Kun Zhang ◽  
Xinjun Wang ◽  
Zhiyong Chen

In China, with the continuous development of national health insurance policies, more and more people have joined the health insurance system. How to accurately predict patients' future medical treatment behavior has become a hot research issue. The biggest challenge is how to improve prediction performance by modeling health insurance data with high-dimensional time characteristics. At present, most research addresses this issue by using Recurrent Neural Networks (RNNs) to construct an overall prediction model for medical visit sequences. However, RNNs cannot effectively handle long-term dependencies, and they ignore the importance of the time intervals in a medical visit sequence. Additionally, a global model may lose content that is important to different groups. To solve these problems, we propose a Grouping and Global Attention based Time-aware Bidirectional Long Short-Term Memory (GGATB-LSTM) model for medical treatment behavior prediction. The model first constructs a heterogeneous information network based on health insurance data, and uses a tensor CANDECOMP/PARAFAC decomposition method to achieve similarity grouping. For group prediction, global attention and a time factor are introduced to extend the bidirectional LSTM. Finally, the proposed model is evaluated on a real dataset, and we conclude that GGATB-LSTM is better than other methods.


2021 ◽  
Vol 186 (Supplement_1) ◽  
pp. 445-451
Author(s):  
Yifei Sun ◽  
Navid Rashedi ◽  
Vikrant Vaze ◽  
Parikshit Shah ◽  
Ryan Halter ◽  
...  

Introduction: Early prediction of the acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, containing more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performances. Materials and Methods: Five classification methods including K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply a regression method to predict the continuous MAP values using linear regression over the next 60 minutes. Results: Support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves the performance significantly with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP with a root mean square error (a frequently used measure of the differences between the predicted values and the observed values) of 10 mmHg 60 minutes in the future. After converting continuous MAP predictions into AHE binary predictions, we achieve a 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence. Conclusion: We were able to predict AHE with precision and recall above 80% 30 minutes in advance with the large real-world dataset. The prediction of regression model can provide a more fine-grained, interpretable signal to practitioners. Model performance is improved by the inclusion of invasive features in predicting AHE, when compared to predicting the AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.
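The evaluation steps named in this abstract (root mean square error on continuous MAP predictions, then conversion to binary AHE predictions scored by precision and recall) can be sketched as follows. This is only an illustration: the 65 mmHg cutoff and the single-value thresholding rule are assumptions, since the abstract does not state how an AHE is defined, and any numbers passed in would be toy values rather than study data.

```python
import math

# Assumed cutoff for illustration only; the abstract does not give the AHE definition.
AHE_THRESHOLD_MMHG = 65.0


def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def to_ahe_flags(map_values, threshold=AHE_THRESHOLD_MMHG):
    """Convert continuous MAP values (mmHg) into binary AHE flags."""
    return [value < threshold for value in map_values]


def precision_recall(predicted_flags, true_flags):
    """Return (precision, recall) for two equal-length lists of booleans."""
    pairs = list(zip(predicted_flags, true_flags))
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

A typical call would compute `rmse(predicted_map, observed_map)` first, then pass `to_ahe_flags(predicted_map)` and `to_ahe_flags(observed_map)` to `precision_recall`. Keeping the continuous prediction alongside the binary flag is what lets the regression output convey timing and severity as well as occurrence.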


2021 ◽  
Vol 5 (1) ◽  
Author(s):  
Osman Mamun ◽  
Madison Wenzlick ◽  
Arun Sathanur ◽  
Jeffrey Hawk ◽  
Ram Devanathan

The Larson–Miller parameter (LMP) offers an efficient and fast scheme to estimate the creep rupture life of alloy materials for high-temperature applications; however, poor generalizability and dependence on the constant C often result in sub-optimal performance. In this work, we show that direct rupture life parameterization without intermediate LMP parameterization, using a gradient boosting algorithm, can be used to train ML models for very accurate prediction of rupture life in a variety of alloys (Pearson correlation coefficient >0.9 for 9–12% Cr and >0.8 for austenitic stainless steels). In addition, the Shapley value was used to quantify feature importance, making the model interpretable by identifying the effect of various features on the model performance. Finally, a variational autoencoder-based generative model was built by conditioning on the experimental dataset to sample, from the learnt joint distribution, hypothetical synthetic candidate alloys that exist in neither the 9–12% Cr ferritic–martensitic alloy dataset nor the austenitic stainless steel dataset.
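For reference, the Larson–Miller parameter is conventionally written as shown below, where T is the absolute temperature in kelvin, t_r is the rupture time in hours, and C is the material constant mentioned in the abstract. The symbols follow common textbook notation and are not taken from the paper itself.

```latex
\mathrm{LMP} = T \left( C + \log_{10} t_r \right)
\quad\Longrightarrow\quad
t_r = 10^{\,\mathrm{LMP}/T - C}
```

The rearranged form makes the dependence on C explicit: because C sits in the exponent, a small error in the assumed constant changes the estimated rupture life by a multiplicative factor. This is consistent with the abstract's motivation for predicting rupture life directly rather than through the LMP.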


2020 ◽  
Vol 10 (11) ◽  
pp. 3788
Author(s):  
Qi Ouyang ◽  
Yongbo Lv ◽  
Jihui Ma ◽  
Jing Li

With the development of big data and deep learning, bus passenger flow prediction considering real-time data becomes possible. Real-time traffic flow prediction helps to grasp real-time passenger flow dynamics, provide early warning for a sudden passenger flow and data support for real-time bus plan changes, and improve the stability of urban transportation systems. To solve the problem of passenger flow prediction considering real-time data, this paper proposes a novel passenger flow prediction network model based on long short-term memory (LSTM) networks. The model includes four parts: feature extraction based on an XGBoost model, information coding based on historical data, information coding based on real-time data, and decoding based on a multi-layer neural network. In the feature extraction part, the data dimension is increased by fusing bus data and points of interest to increase the number of parameters and improve model accuracy. In the historical information coding part, we use the date as the index in the LSTM structure to encode historical data and provide relevant information for prediction; in the real-time data coding part, the daily half-hour time interval is used as the index to encode real-time data and provide real-time prediction information; in the decoding part, the passenger flow for the next two 30 min intervals is output by decoding all the information. To the best of our knowledge, this is the first time that real-time information has been taken into consideration in passenger flow prediction based on LSTM. The proposed model can achieve better accuracy compared to the LSTM and other baseline methods.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Lei Li ◽  
Desheng Wu

Purpose: The infraction of securities regulations (ISRs) by listed firms in their day-to-day operations and management has become a common problem. This paper proposes several machine learning approaches to forecast the risk of infractions by listed companies, addressing the problem of ineffective and imprecise financial supervision. Design/methodology/approach: The proposed research framework for forecasting ISRs includes data collection and cleaning, feature engineering, data splitting, application of prediction approaches and model performance evaluation. We select Logistic Regression, Naïve Bayes, Random Forest, Support Vector Machines, Artificial Neural Network and Long Short-Term Memory Networks (LSTMs) as ISR prediction models. Findings: The results show that the proposed models predict ISRs significantly better when prior infractions are included than when they are not, especially for large sample sets. The results also indicate that, when judging whether a company has infractions, we should pay attention to novel artificial intelligence methods, previous infractions of the company, and large data sets. Originality/value: The findings could be utilized, to a certain degree, to address the problem of identifying listed companies' ISRs. Overall, the results elucidate the value of prior infractions of securities regulations. This shows the importance of including more data sources when constructing distress models rather than only building increasingly complex models on the same data. This is also beneficial to the regulatory authorities.


2021 ◽  
Vol 6 (11) ◽  
pp. 157
Author(s):  
Gonçalo Pereira ◽  
Manuel Parente ◽  
João Moutinho ◽  
Manuel Sampaio

Decision support and optimization tools to be used in construction often require an accurate estimation of the cost variables to maximize their benefit. Heavy machinery is traditionally one of the greatest costs to consider mainly due to fuel consumption. These typically diesel-powered machines have a great variability of fuel consumption depending on the scenario of utilization. This paper describes the creation of a framework aiming to estimate the fuel consumption of construction trucks depending on the carried load, the slope, the distance, and the pavement type. Having a more accurate estimation will increase the benefit of these optimization tools. The fuel consumption estimation model was developed using Machine Learning (ML) algorithms supported by data, which were gathered through several sensors, in a specially designed datalogger with wireless communication and opportunistic synchronization, in a real context experiment. The results demonstrated the viability of the method, providing important insight into the advantages associated with the combination of sensorization and the machine learning models in a real-world construction setting. Ultimately, this study comprises a significant step towards the achievement of IoT implementation from a Construction 4.0 viewpoint, especially when considering its potential for real-time and digital twins applications.


2018 ◽  
Author(s):  
Seth W. Egger ◽  
Mehrdad Jazayeri

Bayesian models of behavior have advanced the idea that humans combine prior beliefs and sensory observations to minimize uncertainty. How the brain implements Bayes-optimal inference, however, remains poorly understood. Simple behavioral tasks suggest that the brain can flexibly represent and manipulate probability distributions. An alternative view is that the brain relies on simple algorithms that can implement Bayes-optimal behavior only when the computational demands are low. To distinguish between these alternatives, we devised a task in which Bayes-optimal performance could not be matched by simple algorithms. We asked subjects to estimate and reproduce a time interval by combining prior information with one or two sequential measurements. In the domain of time, measurement noise increases with duration. This property puts the integration of multiple measurements beyond the reach of simple algorithms. We found that subjects were able to update their estimates using the second measurement but their performance was suboptimal, suggesting that they were unable to update full probability distributions. Instead, subjects’ behavior was consistent with an algorithm that predicts upcoming sensory signals and applies a nonlinear function to errors in prediction to update estimates. These results indicate that the inference strategies humans deploy may deviate from Bayes-optimal integration when the computational demands are high.


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Juhong Namgung ◽  
Siwoon Son ◽  
Yang-Sae Moon

In recent years, cyberattacks using command and control (C&C) servers have significantly increased. To hide their C&C servers, attackers often use a domain generation algorithm (DGA), which automatically generates domain names for the C&C servers. Accordingly, extensive research on DGA domain detection has been conducted. However, existing methods cannot accurately detect continuously generated DGA domains and can easily be evaded by an attacker. Recently, long short-term memory (LSTM)-based deep learning models have been introduced to detect DGA domains in real time using only domain names without feature extraction or additional information. In this paper, we propose an efficient DGA domain detection method based on bidirectional LSTM (BiLSTM), which learns bidirectional information as opposed to the unidirectional information learned by LSTM. We further maximize the detection performance with a convolutional neural network (CNN) + BiLSTM ensemble model using an attention mechanism, which allows the model to learn both local and global information in a domain sequence. Experimental results show that existing CNN and LSTM models achieved F1-scores of 0.9384 and 0.9597, respectively, while the proposed BiLSTM and ensemble models achieved higher F1-scores of 0.9618 and 0.9666, respectively. In addition, the ensemble model achieved the best performance for most DGA domain classes, enabling more accurate DGA domain detection than existing models.

