Arabic Semantic Textual Similarity Identification based on Convolutional Gated Recurrent Units

Author(s):  
Adnen Mahmoud ◽  
Mounir Zrigui
2014 ◽  
Author(s):  
Abhay Kashyap ◽  
Lushan Han ◽  
Roberto Yus ◽  
Jennifer Sleeman ◽  
Taneeya Satyapanich ◽  
...  

2020 ◽  
pp. 1-12
Author(s):  
Liping Li ◽  
Zean Tian ◽  
Kenli Li ◽  
Cen Chen

Anomaly detection based on time series data is of great importance in many fields. Time series data produced by man-made systems usually include two parts: monitored data and exogenous data, which are, respectively, the detected object and the control/feedback information. In this paper, a so-called G-CNN architecture is proposed that combines gated recurrent units (GRU) with a convolutional neural network (CNN), which focus on the monitored and exogenous data, respectively. The most important contribution is a complementary double-referenced thresholding approach that processes prediction errors and calculates the threshold, balancing the minimization of false positives against that of false negatives. The outstanding performance and broad applicability of the model are demonstrated by experiments on two public datasets from aerospace and a new server machine dataset from an Internet company. It is also found that the monitored data are closely associated with the exogenous data, if any, and the interpretability of the G-CNN is discussed by visualizing the intermediate output of the neural networks.
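
The abstract does not spell out the double-referenced thresholding rule, so the following is only a generic illustration of the surrounding idea: flagging points whose prediction error exceeds a threshold derived from the errors themselves. The mean-plus-k-standard-deviations rule, the function name `flag_anomalies`, and the parameter `k` are assumptions made for this sketch, not the authors' method.

```python
import statistics
from typing import List, Sequence, Tuple


def flag_anomalies(observed: Sequence[float],
                   predicted: Sequence[float],
                   k: float = 3.0) -> Tuple[List[int], float]:
    """Flag indices whose absolute prediction error exceeds mean + k * std.

    This is a plain statistical threshold (an assumption for illustration),
    NOT the double-referenced thresholding described in the abstract.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    # Absolute prediction error at every time step.
    errors = [abs(o - p) for o, p in zip(observed, predicted)]
    # Threshold computed from the error distribution itself.
    threshold = statistics.mean(errors) + k * statistics.pstdev(errors)
    flagged = [i for i, e in enumerate(errors) if e > threshold]
    return flagged, threshold


if __name__ == "__main__":
    observed = [1.0] * 9 + [10.0]   # one obvious spike at the end
    predicted = [1.0] * 10          # a model that predicts the baseline
    print(flag_anomalies(observed, predicted, k=2.0))
```

A smaller `k` raises the number of flagged points (more false positives), while a larger `k` lowers it (more false negatives), which is the trade-off the abstract says its thresholding approach is designed to balance.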


Electronics ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 1149
Author(s):  
Pedro Oliveira ◽  
Bruno Fernandes ◽  
Cesar Analide ◽  
Paulo Novais

A major challenge of today's society is to make large urban centres more sustainable. Improving the energy efficiency of the infrastructures that make up cities is one way of improving their sustainability, and Wastewater Treatment Plants (WWTPs) are one such infrastructure. Consequently, this study aims to conceive, tune, and evaluate a set of candidate deep learning models for forecasting the energy consumption of a WWTP, following a recursive multi-step approach. Three distinct types of models were tested: Long Short-Term Memory networks (LSTMs), Gated Recurrent Units (GRUs), and uni-dimensional Convolutional Neural Networks (CNNs). Uni- and multi-variate settings were evaluated, as well as different methods for handling outliers. Promising forecasting results were obtained by CNN-based models, and the difference was statistically significant when compared to LSTMs and GRUs; the best model presented an approximate overall error of 630 kWh in a multi-variate setting. Finally, to overcome the problem of data scarcity in WWTPs, transfer learning processes were implemented, with promising results achieved when using a pre-trained uni-variate CNN model, which reduced the overall error to 325 kWh.
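
As a rough illustration of the recursive multi-step approach mentioned in the abstract, the sketch below feeds each one-step prediction back into the input window to produce the next one. The names `recursive_forecast` and `mean_model`, and the use of a simple window average in place of a trained LSTM, GRU, or CNN, are assumptions made for this example.

```python
from typing import Callable, List, Sequence


def recursive_forecast(history: Sequence[float],
                       one_step_model: Callable[[Sequence[float]], float],
                       window: int,
                       horizon: int) -> List[float]:
    """Forecast `horizon` steps ahead by reusing a one-step-ahead model."""
    if len(history) < window:
        raise ValueError("history is shorter than the input window")
    # Start from the most recent `window` observations.
    buffer = list(history[-window:])
    predictions: List[float] = []
    for _ in range(horizon):
        next_value = one_step_model(buffer)
        predictions.append(next_value)
        # Slide the window: drop the oldest value, append the prediction.
        buffer = buffer[1:] + [next_value]
    return predictions


def mean_model(window_values: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained network: average of the window."""
    return sum(window_values) / len(window_values)


if __name__ == "__main__":
    print(recursive_forecast([1.0, 2.0, 3.0], mean_model, window=2, horizon=2))
```

Because later steps are computed from earlier predictions rather than from observations, errors can accumulate over the horizon, which is one reason the choice of one-step model matters in this setting.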


Author(s):  
Alok Debnath ◽  
Nikhil Pinnaparaju ◽  
Manish Shrivastava ◽  
Vasudeva Varma ◽  
Isabelle Augenstein

Author(s):  
Antonio L. Alfeo ◽  
Mario G. C. A. Cimino ◽  
Gigliola Vaglini

In today's manufacturing, each technical assistance operation is digitally tracked. This results in a huge amount of textual data that can be exploited as a knowledge base to improve these operations. For instance, an ongoing problem can be addressed by retrieving potential solutions from those used to cope with similar problems during past operations. To be effective, most approaches to semantic textual similarity need to be supported by a structured semantic context (e.g. an industry-specific ontology), resulting in high development and management costs. We overcome this limitation with a textual similarity approach featuring three functional modules. The data preparation module provides punctuation and stop-word removal, and word lemmatization. The pre-processed sentences are passed to the sentence embedding module, which is based on Sentence-BERT (Bidirectional Encoder Representations from Transformers) and transforms the sentences into fixed-length vectors. Their cosine similarity is processed by the scoring module to match the expected similarity between the two original sentences. Finally, this similarity measure is employed to retrieve the most suitable recorded solutions for the ongoing problem. The effectiveness of the proposed approach is tested (i) against a state-of-the-art competitor and two well-known textual similarity approaches, and (ii) with two case studies, i.e. private company technical assistance reports and a benchmark dataset for semantic textual similarity. Compared with the state of the art, the proposed approach results in comparable retrieval performance and significantly lower management cost: 30-min questionnaires are sufficient to obtain the semantic context knowledge to be injected into our textual search engine.
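
The retrieval step described above can be sketched as follows, assuming the sentences have already been turned into fixed-length vectors. The embedding and scoring modules are omitted, raw cosine similarity is used directly for ranking, and the names `cosine_similarity` and `retrieve_top_k` as well as the toy vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_vec: Sequence[float],
                   records: Sequence[Tuple[Sequence[float], str]],
                   k: int) -> List[Tuple[float, str]]:
    """Rank recorded (embedding, solution) pairs by similarity to the query."""
    scored = [(cosine_similarity(query_vec, vec), solution)
              for vec, solution in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-D vectors standing in for sentence embeddings (assumption).
    records = [([1.0, 0.0], "solution a"),
               ([0.0, 1.0], "solution b"),
               ([1.0, 1.0], "solution c")]
    for score, solution in retrieve_top_k([1.0, 0.0], records, k=2):
        print(f"{score:.3f}  {solution}")
```

In the approach described in the abstract, a scoring module sits between the cosine similarity and the final ranking; this sketch skips that step.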

