attention model
Recently Published Documents





2022 ◽  
Vol 22 (3) ◽  
pp. 1-21
Prayag Tiwari ◽  
Amit Kumar Jaiswal ◽  
Sahil Garg ◽  
Ilsun You

Self-attention mechanisms have recently been embraced for a broad range of text-matching applications. A self-attention model takes only one sentence as input with no extra information, i.e., one can utilize the final hidden state or pooling. However, text-matching problems can be interpreted in either symmetrical or asymmetrical scopes. For instance, paraphrase detection is a symmetrical task, while textual entailment classification and question-answer matching are considered asymmetrical tasks. In this article, we leverage attractive properties of the self-attention mechanism and propose an attention-based network that incorporates three key components for inter-sequence attention: global pointwise features, preceding attentive features, and contextual features, while updating the rest of the components. We evaluate our model on two benchmark datasets that cover the tasks of textual entailment and question-answer matching. The proposed efficient Self-attention-driven Network for Text Matching outperforms the state of the art on the Stanford Natural Language Inference and WikiQA datasets with far fewer parameters.
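
As a rough illustration of the self-attention operation this abstract builds on, the sketch below computes generic scaled dot-product self-attention over one sentence. It is not the network proposed in the article: the function names are hypothetical, and the learned query, key, and value projections are assumed to be the identity to keep the example short.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over one sequence.

    tokens: list of token vectors (lists of floats of equal length).
    Assumption: queries, keys, and values are the token vectors themselves,
    i.e. no learned projection matrices are applied.
    Returns (outputs, weights); weights[i][j] is how much token i attends to token j.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this token to every token in the same sentence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted average of all token vectors.
        outputs.append([sum(wj * tok[i] for wj, tok in zip(w, tokens)) for i in range(dim)])
    return outputs, weights


outputs, weights = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Each row of `weights` sums to 1, so every output vector is a convex combination of the input token vectors.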

2022 ◽  
Vol 11 (2) ◽  
pp. 0-0

In recent times, transfer learning models have been shown to give good results in text classification for question answering, summarization, and next-word prediction, but these models have not yet been used extensively for hate speech detection. We anticipate that these networks may also give better results in another text classification task, namely hate speech detection. This paper introduces a novel method of hate speech detection based on attention networks, using the BERT attention model. We conducted exhaustive experiments and evaluation on publicly available datasets using several evaluation metrics (precision, recall, and F1 score). We show that our model outperforms all the state-of-the-art methods by almost 4%. We also discuss in detail the technical challenges faced during the implementation of the proposed model.

Agriculture ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 73
Kaidong Lei ◽  
Chao Zong ◽  
Ting Yang ◽  
Shanshan Peng ◽  
Pengfei Zhu ◽  

In large-scale sow production, real-time detection and recognition of sows is a key step towards the application of precision livestock farming techniques. In the pig house, the overlap of railings, floors, and sows usually challenges the accuracy of sow target detection. In this paper, a non-contact machine vision method was used for sow target perception in complex scenarios, so that the number and position of sows in the pen could be detected. Two multi-target sow detection and recognition models based on the deep learning algorithms Mask-RCNN and UNet-Attention were developed, and the model parameters were tuned. A field experiment was carried out, and the data set obtained from it was used for algorithm training and validation. The Mask-RCNN model showed a higher recognition rate than the UNet-Attention model, with a final recognition rate of 96.8% and complete object detection outlines. In the process of image segmentation, the area distribution of sows in the pens was analyzed, together with the position of the sow’s head in the pen and the pixel area of the sow segmentation. The feeding, drinking, and lying behaviors of the sow were identified on the basis of image recognition. The results showed that the average daily lying time, standing time, feeding time, and drinking time of sows were 12.67 h (MSE 1.08), 11.33 h (MSE 1.08), 3.25 h (MSE 0.27), and 0.391 h (MSE 0.10), respectively. The proposed method could solve the problem of target perception of sows in complex scenes and would be a powerful tool for the recognition of sows.

2022 ◽  
Vol 9 ◽  
Maoyi Zhang ◽  
Changqing Ding ◽  
Shuli Guo

Tracheobronchial diverticula (TD) is a common cystic lesion that can be easily neglected; hence, accurate and rapid identification is critical for later diagnosis. There is a strong need to automate this diagnostic process because traditional manual observation is time-consuming and laborious. However, most studies have focused only on case reports or on the relationship between the disease and other physiological indicators, and few have adopted advanced technologies such as deep learning for automated identification and diagnosis. To fill this gap, this study treats TD recognition as a semantic segmentation problem and proposes a novel attention-based network for TD semantic segmentation. Since the TD lesion area is small and similar to surrounding organs, we combined atrous spatial pyramid pooling (ASPP) with attention mechanisms, which can efficiently complete the segmentation of TD with robust results. The proposed attention model can selectively gather features from different branches according to the amount of information they contain. To the best of our knowledge, no public research data is available yet, so for efficient network training we constructed a data set containing 218 TD and related ground truth (GT). We evaluated different models on the proposed data set, among which the highest mean intersection over union (MIOU) reaches 0.92. The experiments show that our model can outperform state-of-the-art methods, indicating that deep learning has great potential for TD recognition.
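
For readers unfamiliar with the MIOU metric reported above, the following sketch shows one common way to compute mean intersection over union for a predicted segmentation mask. The function name and the convention of skipping classes absent from both masks are assumptions made here; the abstract does not state how the authors computed the metric.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for two label masks.

    pred, target: 2-D lists of integer class labels with identical shape.
    Assumption: classes that appear in neither mask are skipped when averaging.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # The pixel belongs to both the predicted and the true region.
                intersection[p] += 1
                union[p] += 1
            else:
                # The pixel enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class is present in either mask")
    return sum(ious) / len(ious)


# Toy 2x2 example: 0 = background, 1 = lesion.
score = mean_iou([[0, 1], [1, 1]], [[0, 1], [0, 1]], num_classes=2)
```

In the toy example the background IoU is 1/2 and the lesion IoU is 2/3, so the mean is 7/12.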

2022 ◽  
pp. 102347
Audrey Duran ◽  
Gaspard Dussert ◽  
Olivier Rouviére ◽  
Tristan Jaouen ◽  
Pierre-Marc Jodoin ◽  

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Junjun Huo

Based on deep learning and digital image processing algorithms, we design and implement an accurate automatic recognition system for bank note text and propose an improved ResNet-based recognition method to address difficult image text extraction and insufficient recognition accuracy. First, a depthwise over-parameterized convolution (DO-Conv) is used instead of the traditional convolution in the network to improve the recognition rate while reducing the model parameters. Then, the spatial attention model (SAM) and the squeeze-and-excitation block (SE-Block) are fused and applied to a modified ResNet to extract detailed features of bank note images in the channel and spatial domains. Finally, the label-smoothed cross-entropy (LSCE) loss function is used to train the model, calibrating the network automatically to prevent classification errors. The experimental results demonstrate that the improved model is not easily affected by image quality and performs well in text detection and recognition in specific business ticket scenarios.
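
The label-smoothed cross-entropy loss mentioned above can be sketched as follows. This is a generic formulation, assuming the common variant in which a fraction `epsilon` of the target probability is spread uniformly over all classes; the abstract does not give the smoothing value or the exact variant used, so the default of 0.1 and the function name are illustrative only.

```python
import math


def label_smoothed_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy against a smoothed one-hot target for a single example.

    logits: list of raw class scores; target: index of the true class.
    Assumption: the smoothed target is (1 - epsilon) * one_hot + epsilon / K,
    where K is the number of classes.
    """
    num_classes = len(logits)
    # Log-softmax computed with the log-sum-exp trick for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_z for z in logits]
    loss = 0.0
    for i, log_p in enumerate(log_probs):
        smoothed = epsilon / num_classes + ((1.0 - epsilon) if i == target else 0.0)
        loss -= smoothed * log_p
    return loss


plain = label_smoothed_cross_entropy([5.0, 0.0, 0.0], target=0, epsilon=0.0)
smoothed = label_smoothed_cross_entropy([5.0, 0.0, 0.0], target=0, epsilon=0.1)
```

With `epsilon=0.0` the function reduces to ordinary cross-entropy. For a confident correct prediction such as the one above, the smoothed loss is larger than the plain loss, which is how smoothing discourages over-confident outputs.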

Machines ◽  
2021 ◽  
Vol 9 (12) ◽  
pp. 367
Yan Wang ◽  
Zhikang Li ◽  
Xin Wang ◽  
Hongnian Yu ◽  
Wudai Liao ◽  

To date, several alterations in the gait pattern can be treated through rehabilitative approaches and robot-assisted therapy (RAT). Gait data and gait trajectories are essential in specific exoskeleton control strategies. Nevertheless, the scarcity of human gait data, due to the high cost of data collection or privacy concerns, can hinder the performance of controllers or models. This paper therefore first creates a data augmentation method based on generative adversarial networks (GANs) to generate synthetic human gait data that retain the dynamics of the real gait data. Then, both the real and the synthesized gait data are fed to our two-stage attention model for gait trajectory prediction. The real human gait data are collected from five healthy subjects using an optical motion capture platform. Experimental results indicate that the GAN-based data augmentation model can synthesize realistic-looking multi-dimensional human gait data. In addition, the two-stage attention model performs better than the LSTM model; the attention mechanism shows a higher capacity for learning dependencies in the historical gait data and so predicts the current hip and knee joint angles in the gait trajectory more accurately. The gait trajectories predicted from the historical gait data can further be used in gait trajectory tracking strategies.

2021 ◽  
Vol 11 (24) ◽  
pp. 12023
Hyun-Je Song ◽  
Su-Hwan Yoon ◽  
Seong-Bae Park

This paper addresses question difficulty estimation, whose goal is to estimate the difficulty level of a given question in question-answering (QA) tasks. Since a question in these tasks is composed of a questionary sentence and a set of information components, such as a description and candidate answers, it is important to model the relationship among the information components to estimate the difficulty level of the question. Existing approaches to this task model only a simple relationship, such as that between a questionary sentence and a description, and such simple relationships are insufficient to predict the difficulty level accurately. Therefore, this paper proposes an attention-based model that considers the complicated relationship among the information components. The proposed model first represents bi-directional relationships between a questionary sentence and each information component using a dual multi-head co-attention, since the questionary sentence is a key factor in QA questions and both affects and is affected by the information components. Then, the proposed model considers the inter-information relationship over the bi-directional representations through a self-attention model. The inter-information relationship helps predict accurately the difficulty of questions that require reasoning over multiple kinds of information components. The experimental results on three well-known, real-world QA data sets show that the proposed model outperforms the previous state-of-the-art and pre-trained language model baselines. The proposed model is also shown to be robust as the number of information components increases.
