scholarly journals An Attention Mechanism Oriented Hybrid CNN-RNN Deep Learning Architecture of Container Terminal Liner Handling Conditions Prediction

2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Bin Li ◽  
Yuqing He

The booming computational thinking and deep learning make it possible to construct agile, efficient, and robust deep learning-driven decision-making support engine for the operation of container terminal handling systems (CTHSs). Within the conceptual framework of computational logistics, an attention mechanism oriented hybrid convolutional neural network and recurrent neural network deep learning architecture (AMO-HCR-DLA) is proposed technically to predict the container terminal liner handling conditions that mainly include liner handling time (LHT) and total working time of quay crane farm (TWT-QCF) for a calling liner. Consequently, the container terminal oriented logistics generalized computation (CTO-LGC) automation and intelligence are established tentatively by AMO-HCR-DLA. A typical regional container terminal hub of China is selected to design, implement, execute, and evaluate the AMO-HCR-DLA with the actual production data. In the case of severe vibration of LHT and TWT-QCF, while forecasting the handling conditions of 210 ships based on the CTO-LGC running log of four years, the forecasting error of LHT within one hour is more than 97% and that of TWT-QCF within six hours accounts for 89.405%. When predicting the operating conditions of 300 liners by the log of five years, the forecasting deviation of LHT within one hour is more than striking 99% and that of TWT-QCF within six hours reaches up to 94.010% as well. All are far superior to the predicting outcomes by the classical algorithms of machine learning and deep learning. Hence, the AMO-HCR-DLA shows excellent performance for the prediction of CTHS with the low and stable computational consuming. It also demonstrates the feasibility, credibility, and realizability of the computing architecture and design paradigm of AMO-HCR-DLA preliminarily.

2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Bin Li ◽  
Yuqing He

The synergy of computational logistics and deep learning provides a new methodology and solution to the operational decisions of container terminal handling systems (CTHS) at the strategic, tactical, and executive levels. Above all, the container terminal logistics tactical operational complexity is discussed by computational logistics, and the liner handling volume (LHV) has important influences on a series of terminal scheduling decision problems. Subsequently, a feature-extraction-based lightweight convolutional and recurrent neural network adaptive computing model (FEB-LCR-ACM) is presented initially to predict the LHV by the fusion of multiple deep learning algorithms and mechanisms, especially for the specific feature extraction package of tsfresh. Consequently, the container-terminal-oriented logistics service scheduling decision support design paradigm is put forward tentatively by FEB-LCR-ACM. Finally, a typical large-scale container terminal of China is chosen to implement, execute, and evaluate the FEB-LCR-ACM based on the terminal running log around the indicator of LHV. In the case of severe vibration of LHV between 2 twenty-foot equivalent units (TEUs) and 4215 TEUs, while forecasting the LHV of 300 liners by the log of five years, the forecasting error within 100 TEUs almost accounts for 80%. When predicting the operation of 350 ships by the log of six years, the forecasting deviation within 100 TEUs reaches up to nearly 90%. The abovementioned two deep learning experimental performances with FEB-LCR-ACM are so far ahead of the forecasting results by the classical machine learning algorithm that is similar to Gaussian support vector machine. Consequently, the FEB-LCR-ACM achieves sufficiently good performance for the LHV prediction with a lightweight deep learning architecture based on the typical small datasets, and then it is supposed to overcome the operational nonlinearity, dynamics, coupling, and complexity of CTHS partially.


2021 ◽  
Author(s):  
Yuqi Wang ◽  
Tianyuan Liu ◽  
Di Zhang

Abstract The research on the supercritical carbon dioxide (S-CO2) Brayton cycle has gradually become a hot spot in recent years. The off-design performance of turbine is an important reference for analyzing the variable operating conditions of the cycle. With the development of deep learning technology, the research of surrogate models based on neural network has received extensive attention. In order to improve the inefficiency in traditional off-design analyses, this research establishes a data-driven deep learning off-design aerodynamic prediction model for a S-CO2 centrifugal turbine, which is based on a deep convolutional neural network. The network can rapidly and adaptively provide dynamic aerodynamic performance prediction results for varying blade profiles and operating conditions. Meanwhile, it can illustrate the mechanism based on the field reconstruction results for the generated aerodynamic performance. The training results show that the off-design aerodynamic prediction convolutional neural network (OAP-CNN) has reduced the mean and maximum error of efficiency prediction compared with the traditional Gaussian Process Regression (GPR) and Artificial Neural Network (ANN). Aiming at the off-design conditions, the pressure and temperature distributions with acceptable error can be obtained without a CFD calculation. Besides, the influence of off-design parameters on the efficiency and power can be conveniently acquired, thus providing the reference for an optimized operation strategy. Analyzing the sensitivity of AOP-CNN to training data set size, the prediction accuracy is acceptable when the percentage of training samples exceeds 50%. The minimum error appears when the training data set size is 0.8. The mean and maximum errors are respectively 1.46% and 6.42%. In summary, this research provides a precise and fast aerodynamic performance prediction model in the analyses of off-design conditions for S-CO2 turbomachinery and Brayton cycle.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Venkateswara Rao Kota ◽  
Shyamala Devi Munisamy

PurposeNeural network (NN)-based deep learning (DL) approach is considered for sentiment analysis (SA) by incorporating convolutional neural network (CNN), bi-directional long short-term memory (Bi-LSTM) and attention methods. Unlike the conventional supervised machine learning natural language processing algorithms, the authors have used unsupervised deep learning algorithms.Design/methodology/approachThe method presented for sentiment analysis is designed using CNN, Bi-LSTM and the attention mechanism. Word2vec word embedding is used for natural language processing (NLP). The discussed approach is designed for sentence-level SA which consists of one embedding layer, two convolutional layers with max-pooling, one LSTM layer and two fully connected (FC) layers. Overall the system training time is 30 min.FindingsThe method performance is analyzed using metrics like precision, recall, F1 score, and accuracy. CNN is helped to reduce the complexity and Bi-LSTM is helped to process the long sequence input text.Originality/valueThe attention mechanism is adopted to decide the significance of every hidden state and give a weighted sum of all the features fed as input.


PLoS ONE ◽  
2021 ◽  
Vol 16 (3) ◽  
pp. e0247984
Author(s):  
Xuyang Wang ◽  
Yixuan Tong

With the rapid development of the mobile internet, people are becoming more dependent on the internet to express their comments on products or stores; meanwhile, text sentiment classification of these comments has become a research hotspot. In existing methods, it is fairly popular to apply a deep learning method to the text classification task. Aiming at solving information loss, weak context and other problems, this paper makes an improvement based on the transformer model to reduce the difficulty of model training and training time cost and achieve higher overall model recall and accuracy in text sentiment classification. The transformer model replaces the traditional convolutional neural network (CNN) and the recurrent neural network (RNN) and is fully based on the attention mechanism; therefore, the transformer model effectively improves the training speed and reduces training difficulty. This paper selects e-commerce reviews as research objects and applies deep learning theory. First, the text is preprocessed by word vectorization. Then the IN standardized method and the GELUs activation function are applied based on the original model to analyze the emotional tendencies of online users towards stores or products. The experimental results show that our method improves by 9.71%, 6.05%, 5.58% and 5.12% in terms of recall and approaches the peak level of the F1 value in the test model by comparing BiLSTM, Naive Bayesian Model, the serial BiLSTM_CNN model and BiLSTM with an attention mechanism model. Therefore, this finding proves that our method can be used to improve the text sentiment classification accuracy and effectively apply the method to text classification.


2020 ◽  
Author(s):  
Chen Chen ◽  
Tianqi Wu ◽  
Zhiye Guo ◽  
Jianlin Cheng

AbstractDeep learning has emerged as a revolutionary technology for protein residue-residue contact prediction since the 2012 CASP10 competition. Considerable advancements in the predictive power of the deep learning-based contact predictions have been achieved since then. However, little effort has been put into interpreting the black-box deep learning methods. Algorithms that can interpret the relationship between predicted contact maps and the internal mechanism of the deep learning architectures are needed to explore the essential components of contact inference and improve their explainability. In this study, we present an attention-based convolutional neural network for protein contact prediction, which consists of two attention mechanism-based modules: sequence attention and regional attention. Our benchmark results on the CASP13 free-modeling (FM) targets demonstrate that the two attention modules added on top of existing typical deep learning models exhibit a complementary effect that contributes to predictive improvements. More importantly, the inclusion of the attention mechanism provides interpretable patterns that contain useful insights into the key fold-determining residues in proteins. We expect the attention-based model can provide a reliable and practically interpretable technique that helps break the current bottlenecks in explaining deep neural networks for contact prediction.


Author(s):  
Luqi Li ◽  
Jie Zhao ◽  
Li Hou ◽  
Yunkai Zhai ◽  
Jinming Shi ◽  
...  

Abstract Background Clinical named entity recognition (CNER) is important for medical information mining and establishment of high-quality knowledge map. Due to the different text features from natural language and a large number of professional and uncommon clinical terms in Chinese electronic medical records (EMRs), there are still many difficulties in clinical named entity recognition of Chinese EMRs. It is of great importance to eliminate semantic interference and improve the ability of autonomous learning of internal features of the model under the small training corpus. Methods From the perspective of deep learning, we integrated the attention mechanism into neural network, and proposed an improved clinical named entity recognition method for Chinese electronic medical records called BiLSTM-Att-CRF, which could capture more useful information of the context and avoid the problem of missing information caused by long-distance factors. In addition, medical dictionaries and part-of-speech (POS) features were also introduced to improve the performance of the model. Results Based on China Conference on Knowledge Graph and Semantic Computing (CCKS) 2017 and 2018 Chinese EMRs corpus, our BiLSTM-Att-CRF model finally achieved better performance than other widely-used models without additional features(F1-measure of 85.4% in CCKS 2018, F1-measure of 90.29% in CCKS 2017), and achieved the best performance with POS and dictionary features (F1-measure of 86.11% in CCKS 2018, F1-measure of 90.48% in CCKS 2017). In particular, the BiLSTM-Att-CRF model had significant effect on the improvement of Recall. Conclusions Our work preliminarily confirmed the validity of attention mechanism in discovering key information and mining text features, which might provide useful ideas for future research in clinical named entity recognition of Chinese electronic medical records. In the future, we will explore the deeper application of attention mechanism in neural network.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Hangxia Zhou ◽  
Qian Liu ◽  
Ke Yan ◽  
Yang Du

Short-term photovoltaic (PV) energy generation forecasting models are important, stabilizing the power integration between the PV and the smart grid for artificial intelligence- (AI-) driven internet of things (IoT) modeling of smart cities. With the recent development of AI and IoT technologies, it is possible for deep learning techniques to achieve more accurate energy generation forecasting results for the PV systems. Difficulties exist for the traditional PV energy generation forecasting method considering external feature variables, such as the seasonality. In this study, we propose a hybrid deep learning method that combines the clustering techniques, convolutional neural network (CNN), long short-term memory (LSTM), and attention mechanism with the wireless sensor network to overcome the existing difficulties of the PV energy generation forecasting problem. The overall proposed method is divided into three stages, namely, clustering, training, and forecasting. In the clustering stage, correlation analysis and self-organizing mapping are employed to select the highest relevant factors in historical data. In the training stage, a convolutional neural network, long short-term memory neural network, and attention mechanism are combined to construct a hybrid deep learning model to perform the forecasting task. In the testing stage, the most appropriate training model is selected based on the month of the testing data. The experimental results showed significantly higher prediction accuracy rates for all time intervals compared to existing methods, including traditional artificial neural networks, long short-term memory neural networks, and an algorithm combining long short-term memory neural network and attention mechanism.


Energies ◽  
2019 ◽  
Vol 12 (20) ◽  
pp. 3937 ◽  
Author(s):  
Tengda Huang ◽  
Sheng Fu ◽  
Haonan Feng ◽  
Jiafeng Kuang

Recently, deep learning technology was successfully applied to mechanical fault diagnosis. The convolutional neural network (CNN), as a prevalent deep learning model, occupies a place in intelligent fault diagnosis, which reduces the need for human feature extraction and prior knowledge, thereby achieving an end-to-end intelligent fault diagnosis model. However, the data for mechanical fault diagnosis in practical application are limited, the CNN model is too deep and too complex, making it prone to overfitting, and a model with too simple a structure and shallow layers cannot fully learn the effective features of the data. Convolutional filters with fixed window sizes are widely used in existing CNN models, which cannot flexibly select variable pivotal features. The model may be interfered with by redundant information in feature maps during training. Therefore, in this paper, a novel shallow multi-scale convolutional neural network with attention is proposed for bearing fault diagnosis. The shallow multi-scale convolutional neural network structure can fully learn the feature information of input data without overfitting. For the first time, a feature attention mechanism is developed for fault diagnosis to adaptively select features for classification more effectively, where the pivotal feature was emphasized, and the redundant feature was weakened through an attention mechanism. The time frequency representations as the input of the model were obtained from the vibration time domain signals, which contain the complete time domain and frequency domain information of the vibration signals. Compared with the current popular diagnostic methods, the results show that the proposed diagnostic method has fairly high accuracy, and its performance is superior to the existing methods. The average recognition accuracy was 99.86%, and the weak recognition rate of I-07 and I-14 labels was improved.


2021 ◽  
Vol 13 (15) ◽  
pp. 3029
Author(s):  
Jimin Yu ◽  
Guangyu Zhou ◽  
Shangbo Zhou ◽  
Jiajun Yin

Automatic target recognition (ATR) in synthetic aperture radar (SAR) images has been widely used in civilian and military fields. Traditional model-based methods and template matching methods do not work well under extended operating conditions (EOCs), such as depression angle variant, configuration variant, and noise corruption. To improve the recognition performance, methods based on convolutional neural networks (CNN) have been introduced to solve such problems and have shown outstanding performance. However, most of these methods rely on continuously increasing the width and depth of networks. This adds a large number of parameters and computational overhead, which is not conducive to deployment on edge devices. To solve these problems, a novel lightweight fully convolutional neural network based on Channel-Attention mechanism, Channel-Shuffle mechanism, and Inverted-Residual block, namely the ASIR-Net, is proposed in this paper. Specifically, we deploy Inverted-Residual blocks to extract features in high-dimensional space with fewer parameters and design a Channel-Attention mechanism to distribute different weights to different channels. Then, in order to increase the exchange of information between channels, we introduce the Channel-Shuffle mechanism into the Inverted-Residual block. Finally, to alleviate the matter of the scarcity of SAR images and strengthen the generalization performance of the network, four approaches of data augmentation are proposed. The effect and generalization performance of the proposed ASIR-Net have been proved by a lot of experiments under both SOC and EOCs on the MSTAR dataset. The experimental results indicate that ASIR-Net achieves higher recognition accuracy rates under both SOC and EOCs, which is better than the existing excellent ATR methods.


Sign in / Sign up

Export Citation Format

Share Document