Sentence similarity evaluation using Sent2Vec and siamese neural network with parallel structure

2021 ◽  
pp. 1-10
Author(s):  
Hye-Jeong Song ◽  
Tak-Sung Heo ◽  
Jong-Dae Kim ◽  
Chan-Young Park ◽  
Yu-Seop Kim

Sentence similarity evaluation is a significant task used in machine translation, classification, and information extraction in the field of natural language processing. When two sentences are given, an accurate judgment should be made whether the meaning of the sentences is equivalent even if the words and contexts of the sentences are different. To this end, existing studies have measured the similarity of sentences by focusing on the analysis of words, morphemes, and letters. To measure sentence similarity, this study uses Sent2Vec, a sentence embedding, as well as morpheme word embedding. Vectors representing words are input to the 1-dimension convolutional neural network (1D-CNN) with various sizes of kernels and bidirectional long short-term memory (Bi-LSTM). Self-attention is applied to the features transformed through Bi-LSTM. Subsequently, vectors undergoing 1D-CNN and self-attention are converted through global max pooling and global average pooling to extract specific values, respectively. The vectors generated through the above process are concatenated to the vector generated through Sent2Vec and are represented as a single vector. The vector is input to softmax layer, and finally, the similarity between the two sentences is determined. The proposed model can improve the accuracy by up to 5.42% point compared with the conventional sentence similarity estimation models.

2021 ◽  
pp. 1-17
Author(s):  
Enda Du ◽  
Yuetian Liu ◽  
Ziyan Cheng ◽  
Liang Xue ◽  
Jing Ma ◽  
...  

Summary Accurate production forecasting is an essential task and accompanies the entire process of reservoir development. With the limitation of prediction principles and processes, the traditional approaches are difficult to make rapid predictions. With the development of artificial intelligence, the data-driven model provides an alternative approach for production forecasting. To fully take the impact of interwell interference on production into account, this paper proposes a deep learning-based hybrid model (GCN-LSTM), where graph convolutional network (GCN) is used to capture complicated spatial patterns between each well, and long short-term memory (LSTM) neural network is adopted to extract intricate temporal correlations from historical production data. To implement the proposed model more efficiently, two data preprocessing procedures are performed: Outliers in the data set are removed by using a box plot visualization, and measurement noise is reduced by a wavelet transform. The robustness and applicability of the proposed model are evaluated in two scenarios of different data types with the root mean square error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE). The results show that the proposed model can effectively capture spatial and temporal correlations to make a rapid and accurate oil production forecast.


2018 ◽  
Vol 10 (11) ◽  
pp. 113 ◽  
Author(s):  
Yue Li ◽  
Xutao Wang ◽  
Pengjian Xu

Text classification is of importance in natural language processing, as the massive text information containing huge amounts of value needs to be classified into different categories for further use. In order to better classify text, our paper tries to build a deep learning model which achieves better classification results in Chinese text than those of other researchers’ models. After comparing different methods, long short-term memory (LSTM) and convolutional neural network (CNN) methods were selected as deep learning methods to classify Chinese text. LSTM is a special kind of recurrent neural network (RNN), which is capable of processing serialized information through its recurrent structure. By contrast, CNN has shown its ability to extract features from visual imagery. Therefore, two layers of LSTM and one layer of CNN were integrated to our new model: the BLSTM-C model (BLSTM stands for bi-directional long short-term memory while C stands for CNN.) LSTM was responsible for obtaining a sequence output based on past and future contexts, which was then input to the convolutional layer for extracting features. In our experiments, the proposed BLSTM-C model was evaluated in several ways. In the results, the model exhibited remarkable performance in text classification, especially in Chinese texts.


Author(s):  
Tao Gui ◽  
Qi Zhang ◽  
Lujun Zhao ◽  
Yaosong Lin ◽  
Minlong Peng ◽  
...  

In recent years, long short-term memory (LSTM) has been successfully used to model sequential data of variable length. However, LSTM can still experience difficulty in capturing long-term dependencies. In this work, we tried to alleviate this problem by introducing a dynamic skip connection, which can learn to directly connect two dependent words. Since there is no dependency information in the training data, we propose a novel reinforcement learning-based method to model the dependency relationship and connect dependent words. The proposed model computes the recurrent transition functions based on the skip connections, which provides a dynamic skipping advantage over RNNs that always tackle entire sentences sequentially. Our experimental results on three natural language processing tasks demonstrate that the proposed method can achieve better performance than existing methods. In the number prediction experiment, the proposed model outperformed LSTM with respect to accuracy by nearly 20%.


Information ◽  
2020 ◽  
Vol 11 (6) ◽  
pp. 312 ◽  
Author(s):  
Asma Baccouche ◽  
Sadaf Ahmed ◽  
Daniel Sierra-Sosa ◽  
Adel Elmaghraby

Identifying internet spam has been a challenging problem for decades. Several solutions have succeeded to detect spam comments in social media or fraudulent emails. However, an adequate strategy for filtering messages is difficult to achieve, as these messages resemble real communications. From the Natural Language Processing (NLP) perspective, Deep Learning models are a good alternative for classifying text after being preprocessed. In particular, Long Short-Term Memory (LSTM) networks are one of the models that perform well for the binary and multi-label text classification problems. In this paper, an approach merging two different data sources, one intended for Spam in social media posts and the other for Fraud classification in emails, is presented. We designed a multi-label LSTM model and trained it on the joint datasets including text with common bigrams, extracted from each independent dataset. The experiment results show that our proposed model is capable of identifying malicious text regardless of the source. The LSTM model trained with the merged dataset outperforms the models trained independently on each dataset.


Author(s):  
J. Shobana ◽  
M. Murali

AbstractSentiment analysis is the process of determining the sentiment polarity (positivity, neutrality or negativity) of the text. As online markets have become more popular over the past decades, online retailers and merchants are asking their buyers to share their opinions about the products they have purchased. As a result, millions of reviews are generated daily, making it difficult to make a good decision about whether a consumer should buy a product. Analyzing these enormous concepts is difficult and time-consuming for product manufacturers. Deep learning is the current research interest in Natural language processing. In the proposed model, Skip-gram architecture is used for better feature extraction of semantic and contextual information of words. LSTM (long short-term memory) is used in the proposed model for understanding complex patterns in textual data. To improve the performance of the LSTM, weight parameters are optimized by the adaptive particle Swarm Optimization algorithm. Extensive experiments were conducted on four datasets proved that our proposed APSO-LSTM model secured higher accuracy over the classical methods such as traditional LSTM, ANN, and SVM. According to simulation results, the proposed model is outperforming other existing models in different metrics.


Batteries ◽  
2021 ◽  
Vol 7 (4) ◽  
pp. 66
Author(s):  
Tadele Mamo ◽  
Fu-Kwun Wang

Monitoring cycle life can provide a prediction of the remaining battery life. To improve the prediction accuracy of lithium-ion battery capacity degradation, we propose a hybrid long short-term memory recurrent neural network model with an attention mechanism. The hyper-parameters of the proposed model are also optimized by a differential evolution algorithm. Using public battery datasets, the proposed model is compared to some published models, and it gives better prediction performance in terms of mean absolute percentage error and root mean square error. In addition, the proposed model can achieve higher prediction accuracy of battery end of life.


Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 5132
Author(s):  
Ping-Huan Kuo ◽  
Ssu-Ting Lin ◽  
Jun Hu ◽  
Chiou-Jye Huang

In aspect of the natural language processing field, previous studies have generally analyzed sound signals and provided related responses. However, in various conversation scenarios, image information is still vital. Without the image information, misunderstanding may occur, and lead to wrong responses. In order to address this problem, this study proposes a recurrent neural network (RNNs) based multi-sensor context-aware chatbot technology. The proposed chatbot model incorporates image information with sound signals and gives appropriate responses to the user. In order to improve the performance of the proposed model, the long short-term memory (LSTM) structure is replaced by gated recurrent unit (GRU). Moreover, a VGG16 model is also chosen for a feature extractor for the image information. The experimental results demonstrate that the integrative technology of sound and image information, which are obtained by the image sensor and sound sensor in a companion robot, is helpful for the chatbot model proposed in this study. The feasibility of the proposed technology was also confirmed in the experiment.


Energies ◽  
2018 ◽  
Vol 11 (12) ◽  
pp. 3493 ◽  
Author(s):  
Chujie Tian ◽  
Jian Ma ◽  
Chunhong Zhang ◽  
Panpan Zhan

Accurate electrical load forecasting is of great significance to help power companies in better scheduling and efficient management. Since high levels of uncertainties exist in the load time series, it is a challenging task to make accurate short-term load forecast (STLF). In recent years, deep learning approaches provide better performance to predict electrical load in real world cases. The convolutional neural network (CNN) can extract the local trend and capture the same pattern, and the long short-term memory (LSTM) is proposed to learn the relationship in time steps. In this paper, a new deep neural network framework that integrates the hidden feature of the CNN model and the LSTM model is proposed to improve the forecasting accuracy. The proposed model was tested in a real-world case, and detailed experiments were conducted to validate its practicality and stability. The forecasting performance of the proposed model was compared with the LSTM model and the CNN model. The Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE) were used as the evaluation indexes. The experimental results demonstrate that the proposed model can achieve better and stable performance in STLF.


Atmosphere ◽  
2019 ◽  
Vol 10 (11) ◽  
pp. 718 ◽  
Author(s):  
Park ◽  
Kim ◽  
Lee ◽  
Kim ◽  
Song ◽  
...  

In this paper, we propose a new temperature prediction model based on deep learning by using real observed weather data. To this end, a huge amount of model training data is needed, but these data should not be defective. However, there is a limitation in collecting weather data since it is not possible to measure data that have been missed. Thus, the collected data are apt to be incomplete, with random or extended gaps. Therefore, the proposed temperature prediction model is used to refine missing data in order to restore missed weather data. In addition, since temperature is seasonal, the proposed model utilizes a long short-term memory (LSTM) neural network, which is a kind of recurrent neural network known to be suitable for time-series data modeling. Furthermore, different configurations of LSTMs are investigated so that the proposed LSTM-based model can reflect the time-series traits of the temperature data. In particular, when a part of the data is detected as missing, it is restored by using the proposed model’s refinement function. After all the missing data are refined, the LSTM-based model is retrained using the refined data. Finally, the proposed LSTM-based temperature prediction model can predict the temperature through three time steps: 6, 12, and 24 h. Furthermore, the model is extended to predict 7 and 14 day future temperatures. The performance of the proposed model is measured by its root-mean-squared error (RMSE) and compared with the RMSEs of a feedforward deep neural network, a conventional LSTM neural network without any refinement function, and a mathematical model currently used by the meteorological office in Korea. Consequently, it is shown that the proposed LSTM-based model employing LSTM-refinement achieves the lowest RMSEs for 6, 12, and 24 h temperature prediction as well as for 7 and 14 day temperature prediction, compared to other DNN-based and LSTM-based models with either no refinement or linear interpolation. Moreover, the prediction accuracy of the proposed model is higher than that of the Unified Model (UM) Local Data Assimilation and Prediction System (LDAPS) for 24 h temperature predictions.


2019 ◽  
Vol 20 (1) ◽  
pp. 129-139 ◽  
Author(s):  
Zahra Bokaee Nezhad ◽  
Mohammad Ali Deihimi

With increasing members in social media sites today, people tend to share their views about everything online. It is a convenient way to convey their messages to end users on a specific subject. Sentiment Analysis is a subfield of Natural Language Processing (NLP) that refers to the identification of users’ opinions toward specific topics. It is used in several fields such as marketing, customer services, etc. However, limited works have been done on Persian Sentiment Analysis. On the other hand, deep learning has recently become popular because of its successful role in several Natural Language Processing tasks. The objective of this paper is to propose a novel hybrid deep learning architecture for Persian Sentiment Analysis. According to the proposed model, local features are extracted by Convolutional Neural Networks (CNN) and long-term dependencies are learned by Long Short Term Memory (LSTM). Therefore, the model can harness both CNN's and LSTM's abilities. Furthermore, Word2vec is used for word representation as an unsupervised learning step. To the best of our knowledge, this is the first attempt where a hybrid deep learning model is used for Persian Sentiment Analysis. We evaluate the model on a Persian dataset that is introduced in this study. The experimental results show the effectiveness of the proposed model with an accuracy of 85%. ABSTRAK: Hari ini dengan ahli yang semakin meningkat di laman media sosial, orang cenderung untuk berkongsi pandangan mereka tentang segala-galanya dalam talian. Ini adalah cara mudah untuk menyampaikan mesej mereka kepada pengguna akhir mengenai subjek tertentu. Analisis Sentimen adalah subfield Pemprosesan Bahasa Semula Jadi yang merujuk kepada pengenalan pendapat pengguna ke arah topik tertentu. Ia digunakan dalam beberapa bidang seperti pemasaran, perkhidmatan pelanggan, dan sebagainya. Walau bagaimanapun, kerja-kerja terhad telah dilakukan ke atas Analisis Sentimen Parsi. Sebaliknya, pembelajaran mendalam baru menjadi popular kerana peranannya yang berjaya dalam beberapa tugas Pemprosesan Bahasa Asli (NLP). Objektif makalah ini adalah mencadangkan senibina pembelajaran hibrid yang baru dalam Analisis Sentimen Parsi. Menurut model yang dicadangkan, ciri-ciri tempatan ditangkap oleh Rangkaian Neural Convolutional (CNN) dan ketergantungan jangka panjang dipelajari oleh Long Short Term Memory (LSTM). Oleh itu, model boleh memanfaatkan kebolehan CNN dan LSTM. Selain itu, Word2vec digunakan untuk perwakilan perkataan sebagai langkah pembelajaran tanpa pengawasan. Untuk pengetahuan yang terbaik, ini adalah percubaan pertama di mana model pembelajaran mendalam hibrid digunakan untuk Analisis Sentimen Persia. Kami menilai model pada dataset Persia yang memperkenalkan dalam kajian ini. Keputusan eksperimen menunjukkan keberkesanan model yang dicadangkan dengan ketepatan 85%.


Sign in / Sign up

Export Citation Format

Share Document