input sequence — Recently Published Documents

TOTAL DOCUMENTS: 224 (five years: 97) · H-INDEX: 15 (five years: 3)
2022 · Author(s): Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, et al.

Astounding results from Transformer models on natural language tasks have intrigued the vision community to study their application to computer vision problems. Among their salient benefits, Transformers enable the modeling of long-range dependencies between input sequence elements and support parallel processing of sequences, in contrast to recurrent networks such as Long Short-Term Memory (LSTM) networks. Unlike convolutional networks, Transformers require minimal inductive biases in their design and are naturally suited to act as set functions. Furthermore, the straightforward design of Transformers allows multiple modalities (e.g., images, videos, text, and speech) to be processed with similar processing blocks, and it demonstrates excellent scalability to very large capacity networks and huge datasets. These strengths have led to exciting progress on a number of vision tasks using Transformer networks. This survey aims to provide a comprehensive overview of Transformer models in the computer vision discipline. We start with an introduction to the fundamental concepts behind the success of Transformers, i.e., self-attention, large-scale pre-training, and bidirectional feature encoding. We then cover extensive applications of Transformers in vision, including popular recognition tasks (e.g., image classification, object detection, action recognition, and segmentation), generative modeling, multi-modal tasks (e.g., visual question answering, visual reasoning, and visual grounding), video processing (e.g., activity recognition, video forecasting), low-level vision (e.g., image super-resolution, image enhancement, and colorization), and 3D analysis (e.g., point cloud classification and segmentation). We compare the respective advantages and limitations of popular techniques, both in terms of architectural design and experimental value. Finally, we provide an analysis of open research directions and possible future work. We hope this effort will ignite further interest in the community to solve current challenges in the application of Transformer models to computer vision.
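To make the mechanism concrete, here is a minimal NumPy sketch of scaled dot-product self-attention, the core operation behind the Transformer models surveyed above; all names, shapes, and values are illustrative, not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model) input sequence; each W: (d_model, d_k)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Every position attends to every other position in one matrix
    # product, which is what lets Transformers model long-range
    # dependencies and process the whole sequence in parallel.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores) @ V

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 10, 16, 8        # illustrative sizes
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)      # shape (10, 8)
```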


2021 · Vol 13 (1) · pp. 3 · Author(s): Jorge Silvestre, Miguel de Santiago, Anibal Bregon, Miguel A. Martínez-Prieto, Pedro C. Álvarez-Esteban

Predictable operations are the basis of efficient air traffic management. In this context, accurately estimating the arrival time at the destination airport is fundamental for making tactical decisions about an optimal schedule of landing and take-off operations. In this paper, we evaluate different deep learning models based on LSTM architectures for predicting the estimated time of arrival of commercial flights, mainly using surveillance data from the OpenSky Network. We observed that the number of previous flight states used to make the prediction has a great influence on the accuracy of the estimation, independently of the architecture. The best model, with an input sequence length of 50, reported an MAE of 3.33 min and an RMSE of 5.42 min on the test set, with MAE values of 5.67 and 2.13 min at 90 and 15 min before the end of the flight, respectively.
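As a rough illustration of the kind of model evaluated, the sketch below builds an LSTM regressor over a window of the last 50 flight states; the feature set, layer sizes, and training setup are assumptions rather than the authors' exact configuration.

```python
import numpy as np
import tensorflow as tf

SEQ_LEN, N_FEATURES = 50, 6   # e.g. lat, lon, altitude, speed, heading, time (assumed)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    tf.keras.layers.LSTM(64),          # summarize the last 50 flight states
    tf.keras.layers.Dense(1),          # estimated minutes to arrival
])
model.compile(optimizer="adam", loss="mae")  # MAE matches the reported metric

# Dummy arrays standing in for OpenSky surveillance tracks.
X = np.random.rand(256, SEQ_LEN, N_FEATURES).astype("float32")
y = np.random.rand(256, 1).astype("float32")
model.fit(X, y, epochs=1, verbose=0)
```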


Webology · 2021 · Vol 18 (2) · pp. 1011-1022 · Author(s): Saja Naeem Turky, Ahmed Sabah Ahmed AL-Jumaili, Rajaa K. Hasoun

Abstractive summarization is the process of producing a brief and coherent summary that captures the original text's main concepts. For scientific texts, summarization has generally been restricted to extractive techniques. Abstractive methods based on deep learning have proven very effective at summarizing articles in general domains, such as news documents. Because neural frameworks struggle to learn domain-specific knowledge, especially in NLP tasks, they have rarely been applied to documents from specialized domains such as medicine. In this study, an abstractive summarization system is proposed. It is applied to the COVID-19 dataset, a collection of scientific documents related to the coronavirus and associated illnesses; 12,000 samples from this dataset were used in this work. The proposed model reads abstracts of COVID-19 papers and generates summaries in the style of a single-sentence headline. The summarization model is based on an LSTM architecture and uses a GloVe model for word embedding, which converts the input sequence into vector form; these vectors then pass through LSTM layers to produce the summary. The results indicate that combining LSTM layers with GloVe word embeddings improves the summarization system's performance. The system was evaluated with ROUGE metrics, achieving 43.6, 36.7, and 43.6 for ROUGE-1, ROUGE-2, and ROUGE-L, respectively.
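The pipeline described (GloVe embeddings feeding LSTM layers) might look roughly like the following Keras sketch; the vocabulary, file path, and layer sizes are hypothetical placeholders, not the paper's configuration.

```python
import numpy as np
import tensorflow as tf

VOCAB = {"<pad>": 0, "covid": 1, "vaccine": 2}   # toy vocabulary (hypothetical)
EMB_DIM, HEADLINE_LEN = 100, 12

# Embedding matrix filled from a GloVe text file (one word plus
# EMB_DIM floats per line); the file path is a placeholder.
emb = np.zeros((len(VOCAB), EMB_DIM), dtype="float32")
# with open("glove.6B.100d.txt", encoding="utf-8") as f:
#     for line in f:
#         word, *vec = line.split()
#         if word in VOCAB:
#             emb[VOCAB[word]] = np.asarray(vec, dtype="float32")

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(
        len(VOCAB), EMB_DIM, mask_zero=True,
        embeddings_initializer=tf.keras.initializers.Constant(emb),
        trainable=False),                      # frozen GloVe vectors
    tf.keras.layers.LSTM(128),                 # encode the abstract
    tf.keras.layers.RepeatVector(HEADLINE_LEN),
    tf.keras.layers.LSTM(128, return_sequences=True),
    tf.keras.layers.Dense(len(VOCAB), activation="softmax"),  # next-word probabilities
])
```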


2021 · Vol 2021 · pp. 1-10 · Author(s): Jianbin Zhu, Xiaojun Shi, Shuanghua Zhang

The detection of grammatical errors in English composition is an important task in the field of NLP. Its main purpose is to detect grammatical errors in English sentences and correct them. Grammatical error detection and correction are important applications in the automatic proofreading of English texts and in the field of English learning aids. With the increasing global influence of English, huge breakthroughs have been made in detecting English grammatical errors. Based on machine learning, this paper designs a new method for detecting grammatical errors in English composition. First, it implements a grammatical error detection model based on Seq2Seq. Second, it implements a grammatical error detection and correction scheme based on the Transformer model, which performs better than most grammar models. Third, it applies the BERT model to grammatical error detection and correction, which significantly enhances the model's generalization ability and resolves the Transformer's inability to merge forward and backward context when training a language model. Fourth, it proposes a method of grammatical error detection and correction in English composition based on a hybrid model, in which the neural network model appropriate to each application scenario is used for error correction. The Seq2Seq structure is used to encode the input sequence and automate feature engineering. By combining traditional and deep models, their complementary advantages enable grammatical error detection and automatic correction.
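A minimal sketch of the Seq2Seq component, assuming a standard LSTM encoder-decoder: the encoder reads the erroneous sentence, and its final state conditions a decoder that emits the corrected sentence token by token. Vocabulary and layer sizes are illustrative only.

```python
import tensorflow as tf

VOCAB_SIZE, EMB, HID = 5000, 128, 256   # illustrative sizes

src = tf.keras.layers.Input(shape=(None,), name="erroneous_sentence")
tgt = tf.keras.layers.Input(shape=(None,), name="corrected_prefix")

embed = tf.keras.layers.Embedding(VOCAB_SIZE, EMB)   # shared embedding for brevity
# Encoder: its final hidden/cell states summarize the input sequence.
_, h, c = tf.keras.layers.LSTM(HID, return_state=True)(embed(src))
# Decoder: conditioned on the encoder state, predicts corrected tokens.
dec_out = tf.keras.layers.LSTM(HID, return_sequences=True)(
    embed(tgt), initial_state=[h, c])
logits = tf.keras.layers.Dense(VOCAB_SIZE)(dec_out)

model = tf.keras.Model([src, tgt], logits)
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
```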


2021 · Vol 8 (1) · Author(s): Ibtissam Benchaji, Samira Douzi, Bouabid El Ouahidi, Jaafar Jaafari

As credit cards become the most popular payment mode, particularly in the online sector, fraudulent activities using credit card payment technologies are rapidly increasing. To this end, it is essential for financial institutions to continuously improve their fraud detection systems to reduce huge losses. The purpose of this paper is to develop a novel system for credit card fraud detection based on sequential modeling of data, using an attention mechanism and LSTM deep recurrent neural networks. Compared to previous studies, the proposed model considers the sequential nature of transactional data and allows the classifier to identify the transactions in the input sequence that are most predictive of fraud. Precisely, the robustness of our model is built by combining the strengths of three sub-methods: uniform manifold approximation and projection (UMAP) for selecting the most useful predictive features, Long Short-Term Memory (LSTM) networks for incorporating transaction sequences, and an attention mechanism to enhance LSTM performance. Experiments show that our model gives strong results in terms of efficiency and effectiveness.
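A sketch of how the three sub-methods could fit together: UMAP compresses each transaction's raw features, an LSTM reads the resulting transaction sequences, and a simple learned attention layer weights the most informative transactions. All shapes, hyperparameters, and the grouping of transactions into sequences are assumptions.

```python
import numpy as np
import tensorflow as tf
import umap  # pip install umap-learn

N_TX, RAW_DIM, UMAP_DIM, SEQ = 5000, 30, 8, 10   # illustrative sizes

raw = np.random.rand(N_TX, RAW_DIM)              # stand-in transaction features
reduced = umap.UMAP(n_components=UMAP_DIM).fit_transform(raw)
# Group consecutive transactions into fixed-length sequences per account.
X = reduced[: (N_TX // SEQ) * SEQ].reshape(-1, SEQ, UMAP_DIM)

inp = tf.keras.layers.Input(shape=(SEQ, UMAP_DIM))
h = tf.keras.layers.LSTM(64, return_sequences=True)(inp)
# Attention: score each transaction, normalize, take a weighted sum of states.
scores = tf.keras.layers.Dense(1)(h)
weights = tf.keras.layers.Softmax(axis=1)(scores)
context = tf.keras.layers.Lambda(
    lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, weights])
out = tf.keras.layers.Dense(1, activation="sigmoid")(context)  # fraud probability
model = tf.keras.Model(inp, out)
model.compile(optimizer="adam", loss="binary_crossentropy")
```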


2021 · pp. 1-29 · Author(s): Yizhu Liu, Xinyue Chen, Xusheng Luo, Kenny Q. Zhu

Convolutional sequence-to-sequence (CNN seq2seq) models have achieved success in abstractive summarization. However, their outputs often contain repetitive word sequences and logical inconsistencies, limiting their practicality. In this paper, we identify the causes of the repetition problem in CNN-based abstractive summarization by observing the attention maps between summaries containing repetition and their corresponding source documents, and we mitigate the problem. We propose to reduce repetition in summaries with an attention filter mechanism (ATTF) and a sentence-level backtracking decoder (SBD), which dynamically redistribute attention over the input sequence as the output sentences are generated. The ATTF records previously attended locations in the source document directly and prevents the decoder from attending to these locations again. The SBD prevents the decoder from generating similar sentences more than once by backtracking at test time. The proposed model outperforms the baselines in terms of ROUGE score, repeatedness, and readability. The results show that this approach generates high-quality summaries with minimal repetition and improves the reading experience.
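The ATTF idea can be illustrated with a small NumPy sketch: source positions that earlier decoding steps attended to strongly are masked, so later steps cannot re-attend to them and repeat the same content. The threshold and masking rule here are illustrative, not the paper's exact formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def filtered_attention(scores_per_step, threshold=0.5):
    """scores_per_step: (n_steps, n_src) raw attention scores per decode step."""
    attended = np.zeros(scores_per_step.shape[1], dtype=bool)  # positions used so far
    outputs = []
    for scores in scores_per_step:
        scores = np.where(attended, -np.inf, scores)  # block previously attended positions
        attn = softmax(scores)
        attended |= attn > threshold                  # record strongly attended locations
        outputs.append(attn)
    return np.stack(outputs)

steps = np.random.default_rng(1).normal(size=(4, 6))  # 4 decode steps, 6 source positions
print(filtered_attention(steps))
```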


2021 · Vol 2021 · pp. 1-11 · Author(s): Sihai Zhao, Jiangye Xu, Yuyan Zhang

The leaky LMS algorithm has been extensively studied because it controls parameter drift, an unwanted effect linked to inadequate excitation in the input sequence. Leaky LMS algorithms generally use a fixed step size, forcing a compromise between fast convergence and small steady-state misalignment. In this paper, a variable step-size (VSS) leaky LMS algorithm is proposed. The variable step-size method combines a time-average estimate of the error with a time-average estimate of the normalized quantity. Incorporated into the leaky LMS algorithm, the proposed method effectively suppresses noise interference and achieves both fast early convergence and small final misalignment. Simulation results demonstrate that the proposed algorithm outperforms existing variable step-size algorithms in unexcited environments and is comparable in performance to other variable step-size algorithms when excitation is adequate.
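A hedged sketch of a variable step-size leaky LMS filter follows; a common VSS rule based on a time-averaged error statistic stands in for the paper's exact update, which combines two time-average estimates.

```python
import numpy as np

def vss_leaky_lms(x, d, order=8, gamma=1e-3, mu_min=1e-4, mu_max=0.05,
                  alpha=0.97, beta=0.5):
    """Identify a FIR system from input x and desired output d."""
    w = np.zeros(order)
    mu, p = mu_max, 0.0              # step size and smoothed error power
    y_hat = np.zeros(len(d))
    for n in range(order, len(x)):
        u = x[n - order + 1:n + 1][::-1]         # most recent `order` input samples
        y_hat[n] = w @ u
        e = d[n] - y_hat[n]
        p = alpha * p + (1 - alpha) * e * e      # time-averaged error energy
        mu = np.clip(beta * p, mu_min, mu_max)   # larger errors -> larger steps
        w = (1 - mu * gamma) * w + mu * e * u    # leaky LMS update (leak controls drift)
    return w, y_hat

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
h = np.array([0.6, -0.3, 0.1, 0.05, 0.0, 0.0, 0.0, 0.0])  # unknown system (toy)
d = np.convolve(x, h)[:2000] + 0.01 * rng.normal(size=2000)
w, _ = vss_leaky_lms(x, d)   # w converges toward h
```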


2021 · Vol 9 · Author(s): Pan Xiong, Cheng Long, Huiyu Zhou, Roberto Battiston, Angelo De Santis, et al.

During the lithospheric buildup to an earthquake, complex physical changes occur within the earthquake hypocenter. Data on the accompanying changes in the ionosphere can be obtained by satellites, and analysis of data anomalies can help identify earthquake precursors. In this paper, we present a deep-learning model, SeqNetQuake, that uses data from the first China Seismo-Electromagnetic Satellite (CSES) to identify ionospheric perturbations prior to earthquakes. SeqNetQuake achieves the best performance [F-measure (F1) = 0.6792 and Matthews correlation coefficient (MCC) = 0.427] when trained directly on the CSES dataset with a spatial window centered on the earthquake epicenter with the Dobrovolsky radius and an input sequence length of 20 consecutive night-time observations. We further explore a transfer-learning approach, which initially trains the model with the larger Detection of Electro-Magnetic Emissions Transmitted from Earthquake Regions (DEMETER) dataset and then tunes the model with the CSES dataset. The transfer-learning performance is substantially higher than that of direct learning, yielding a 12% improvement in the F1 score and a 29% improvement in the MCC value. Moreover, we compare SeqNetQuake with five other benchmark classifiers on an independent test set; SeqNetQuake demonstrates a 64.2% improvement in MCC and approximately a 24.5% improvement in F1 score over the second-best convolutional neural network model. SeqNetQuake thus achieves a significant improvement in identifying pre-earthquake ionospheric perturbations and improves the performance of earthquake prediction using CSES data.
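The transfer-learning recipe (pretrain on the larger DEMETER-style dataset, then fine-tune on CSES sequences of 20 observations) might be sketched as follows; the architecture and data are stand-ins, since SeqNetQuake's exact design is not given here.

```python
import numpy as np
import tensorflow as tf

SEQ_LEN, N_FEATURES = 20, 4   # 20 consecutive night-time observations; feature count assumed

def build_model():
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(SEQ_LEN, N_FEATURES)),
        tf.keras.layers.Conv1D(32, 3, activation="relu"),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # perturbation vs. background
    ])

model = build_model()
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="binary_crossentropy")
# Stage 1: pretrain on the larger (DEMETER-style) dataset.
X_big = np.random.rand(2048, SEQ_LEN, N_FEATURES)
y_big = np.random.randint(0, 2, 2048)
model.fit(X_big, y_big, epochs=1, verbose=0)

# Stage 2: fine-tune on the smaller CSES set with a lower learning rate.
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="binary_crossentropy")
X_cses = np.random.rand(256, SEQ_LEN, N_FEATURES)
y_cses = np.random.randint(0, 2, 256)
model.fit(X_cses, y_cses, epochs=1, verbose=0)
```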

