End-to-End Pedestrian Trajectory Forecasting with Transformer Network

2022
Vol 11 (1)
pp. 44
Author(s):
Hai-Yan Yao
Wang-Gen Wan
Xiang Li

Analysis of pedestrians' motion is important to real-world applications in public scenes. Due to complex temporal and spatial factors, trajectory prediction is a challenging task. With the recent development of attention mechanisms, the transformer network has been successfully applied in natural language processing, computer vision, and audio processing. We propose an end-to-end transformer network embedded with random deviation queries for pedestrian trajectory forecasting. This self-correcting scheme enhances the robustness of the network. Moreover, we present a co-training strategy to improve the training effect: the whole scheme is trained collaboratively by the original loss and a classification loss, which yields more accurate prediction results. Experimental results on several datasets indicate the validity and robustness of the network. We achieve the best performance in individual forecasting and comparable results in social forecasting. Encouragingly, our approach achieves a new state of the art on the Hotel and Zara2 datasets compared with both social-based and individual-based approaches.
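As a rough illustration of the idea described above, the sketch below builds a small encoder-decoder transformer whose decoder queries are perturbed with random deviations during training, so the network learns to correct noisy queries. This is not the authors' code: the learned-query formulation, the deviation scale `sigma`, the layer counts, and the regression head are all assumptions, and the co-training with an auxiliary classification loss is omitted.

```python
# Minimal PyTorch sketch (assumptions, not the paper's implementation):
# decoder queries are randomly perturbed during training ("random deviation
# queries") so the transformer learns a self-correcting forecast.
import torch
import torch.nn as nn

class TrajectoryTransformer(nn.Module):
    def __init__(self, d_model=64, nhead=4, horizon=12, sigma=0.05):
        super().__init__()
        self.embed = nn.Linear(2, d_model)                 # (x, y) -> d_model
        self.transformer = nn.Transformer(d_model=d_model, nhead=nhead,
                                          num_encoder_layers=2,
                                          num_decoder_layers=2,
                                          batch_first=True)
        self.query = nn.Parameter(torch.randn(horizon, d_model))  # learned future queries
        self.head = nn.Linear(d_model, 2)                  # d_model -> (x, y)
        self.sigma = sigma                                 # assumed deviation scale

    def forward(self, obs):                                # obs: (B, T_obs, 2)
        src = self.embed(obs)
        queries = self.query.unsqueeze(0).expand(obs.size(0), -1, -1)
        if self.training:                                  # random deviation queries
            queries = queries + self.sigma * torch.randn_like(queries)
        out = self.transformer(src, queries)               # (B, horizon, d_model)
        return self.head(out)                              # (B, horizon, 2) positions

# toy usage: predict 12 future steps from 8 observed steps
pred = TrajectoryTransformer()(torch.randn(4, 8, 2))
```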

2020
Vol 34 (04)
pp. 6170-6177
Author(s):
Guo-Hua Wang
Jianxin Wu

Most recent semi-supervised deep learning (deep SSL) methods follow a similar paradigm: iteratively use network predictions to update pseudo-labels and use pseudo-labels to update network parameters. However, they lack theoretical support and cannot explain why predictions are good candidates for pseudo-labels. In this paper, we propose a principled end-to-end framework named deep decipher (D2) for SSL. Within the D2 framework, we prove that pseudo-labels are related to network predictions by an exponential link function, which gives theoretical support for using predictions as pseudo-labels. Furthermore, we demonstrate that updating pseudo-labels by network predictions makes them uncertain. To mitigate this problem, we propose a training strategy called repetitive reprediction (R2). Finally, the proposed R2-D2 method is tested on the large-scale ImageNet dataset and outperforms state-of-the-art methods by 5 percentage points.
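A toy sketch of the two ingredients as stated in the abstract follows; it is an interpretation, not the released R2-D2 code. The exponential link is realized here as a temperature-scaled softmax over predictions, and `repredict_every`, `predict_fn`, and `train_step_fn` are hypothetical placeholders.

```python
# Toy sketch (interpretation of the abstract, not the authors' code):
# (1) pseudo-labels follow an exponential link of network predictions, and
# (2) they are periodically re-predicted ("repetitive reprediction") so they
# do not drift into overly uncertain values.
import numpy as np

def exponential_link(logits, temperature=0.5):
    """Pseudo-label proportional to exp(logit / T), normalized per sample."""
    z = np.exp(logits / temperature)
    return z / z.sum(axis=1, keepdims=True)

def r2_style_training(predict_fn, train_step_fn, epochs=90, repredict_every=30):
    pseudo = exponential_link(predict_fn())
    for epoch in range(epochs):
        if epoch > 0 and epoch % repredict_every == 0:
            pseudo = exponential_link(predict_fn())   # repetitive reprediction
        train_step_fn(pseudo)                          # update network on pseudo-labels
    return pseudo

# toy usage with random "logits" standing in for a real network
rng = np.random.default_rng(0)
r2_style_training(lambda: rng.normal(size=(4, 10)), lambda p: None,
                  epochs=3, repredict_every=2)
```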


Author(s):  
Rexhina Blloshmi
Simone Conia
Rocco Tripodi
Roberto Navigli

Despite the recent great success of the sequence-to-sequence paradigm in Natural Language Processing, the majority of current studies in Semantic Role Labeling (SRL) still frame the problem as a sequence labeling task. In this paper, we go against the flow and propose GSRL (Generating Senses and RoLes), the first sequence-to-sequence model for end-to-end SRL. Our approach benefits from recently proposed decoder-side pretraining techniques to generate both sense and role labels for all the predicates in an input sentence at once, in an end-to-end fashion. Evaluated on standard gold benchmarks, GSRL achieves state-of-the-art results in both dependency- and span-based English SRL, proving empirically that our simple generation-based model can learn to produce complex predicate-argument structures. Finally, we propose a framework for evaluating the robustness of an SRL model in a variety of synthetic low-resource scenarios, which can aid human annotators in the creation of better, more diverse, and more challenging gold datasets. We release GSRL at github.com/SapienzaNLP/gsrl.
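To make the sequence-to-sequence framing concrete, here is a small, hypothetical linearization helper showing how senses and roles for all predicates in a sentence could be emitted as a single target string for an encoder-decoder model to generate. The tag vocabulary below is illustrative and not GSRL's actual output format.

```python
# Illustrative sketch (not the released GSRL code): frame SRL as sequence
# generation by linearizing each predicate's sense and role spans into one
# target string that a pretrained encoder-decoder model is fine-tuned to produce.
def linearize_srl(sentence_tokens, predicates):
    """predicates: list of (pred_index, sense, {role: (start, end)})."""
    parts = []
    for idx, sense, roles in predicates:
        parts.append(f"<pred> {sentence_tokens[idx]} <sense> {sense}")
        for role, (start, end) in sorted(roles.items()):
            span = " ".join(sentence_tokens[start:end])
            parts.append(f"<{role}> {span}")
    return " ".join(parts)

tokens = ["The", "cat", "ate", "the", "fish"]
target = linearize_srl(tokens, [(2, "eat.01", {"ARG0": (0, 2), "ARG1": (3, 5)})])
print(target)  # <pred> ate <sense> eat.01 <ARG0> The cat <ARG1> the fish
```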


2022
Author(s):
Matej Gjurković
Iva Vukojević
Jan Šnajder

Automated text-based personality assessment (ATBPA) methods can analyze large amounts of text data and identify nuanced linguistic personality cues. However, current approaches lack the interpretability, explainability, and validity offered by standard questionnaire instruments. To address these weaknesses, we propose an approach that combines questionnaire-based and text-based approaches to personality assessment. Our Statement-to-Item Matching Personality Assessment (SIMPA) framework uses natural language processing methods to detect self-referencing descriptions of personality in a target’s text and utilizes these descriptions for personality assessment. The core of the framework is the notion of a trait-constrained semantic similarity between the target’s freely expressed statements and questionnaire items. The conceptual basis is provided by the realistic accuracy model (RAM), which describes the process of accurate personality judgments and which we extend with a feedback loop mechanism to improve the accuracy of judgments. We present a simple proof-of-concept implementation of SIMPA for ATBPA on the social media site Reddit. We show how the framework can be used directly for unsupervised estimation of a target’s Big 5 scores and indirectly to produce features for a supervised ATBPA model, demonstrating state-of-the-art results for the personality prediction task on Reddit.
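The following is a proof-of-concept sketch in the spirit of the matching step: score a target's statements against questionnaire items with a sentence encoder and aggregate similarities per trait. The encoder name, the example items, and the max-then-mean aggregation are assumptions; reverse-keyed items, trait constraints, and the RAM feedback loop are omitted.

```python
# Proof-of-concept sketch in the spirit of SIMPA (not the authors' code):
# match freely expressed statements to Big 5 questionnaire items by semantic
# similarity and average the best matches per trait.
from sentence_transformers import SentenceTransformer

items = {  # a couple of illustrative questionnaire items per trait (assumed)
    "Extraversion": ["I am the life of the party.", "I don't talk a lot."],
    "Neuroticism": ["I get stressed out easily.", "I am relaxed most of the time."],
}
statements = ["honestly i get anxious about tiny things", "i love hosting parties"]

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # assumed encoder choice
stmt_emb = encoder.encode(statements, normalize_embeddings=True)

for trait, trait_items in items.items():
    item_emb = encoder.encode(trait_items, normalize_embeddings=True)
    sims = stmt_emb @ item_emb.T            # cosine similarity (unit-norm embeddings)
    # best-matching statement for each item, averaged into a rough trait score
    print(trait, float(sims.max(axis=0).mean()))
```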


Author(s):  
Haoran Huang
Qi Zhang
Xuanjing Huang

In this study, we investigated the problem of recommending usernames when people attempt to use the "@" sign to mention other people in Twitter-like social media. With the extremely rapid development of social networking services, this problem has received considerable attention in recent years. Previous methods have studied the problem from different aspects. Because most Twitter-like microblogging services limit the length of posts, statistical learning methods may be affected by the problems of word sparseness and synonyms. Although recent progress in neural word embedding methods has advanced the state of the art in many natural language processing tasks, the benefits of word embeddings have not been taken into consideration for this problem. In this work, we proposed a novel end-to-end memory network architecture to perform this task. We incorporated the interests of users with an external memory, and a hierarchical attention mechanism was applied to better capture those interests. Experimental results on a dataset we collected from Twitter demonstrated that the proposed method outperforms state-of-the-art approaches.
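A simplified sketch of such a memory-network scorer is given below, assuming PyTorch. It keeps only post-level attention over a candidate user's history encoded as bags of words (the paper's hierarchical word-and-post attention and training details are omitted), and all module names are illustrative.

```python
# Simplified sketch (assumptions, not the paper's exact model): attend over a
# candidate user's historical posts (external memory) given the current post,
# then score the candidate for "@" mention recommendation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MentionMemoryNet(nn.Module):
    def __init__(self, vocab_size=10000, dim=64):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)   # bag-of-words post encoder
        self.score = nn.Bilinear(dim, dim, 1)

    def forward(self, post, history):                   # post: (B, L); history: (B, M, L)
        q = self.embed(post)                            # (B, dim) query from current post
        B, M, L = history.shape
        mem = self.embed(history.reshape(B * M, L)).reshape(B, M, -1)
        attn = F.softmax(torch.bmm(mem, q.unsqueeze(2)).squeeze(2), dim=1)  # (B, M)
        user_interest = torch.bmm(attn.unsqueeze(1), mem).squeeze(1)        # (B, dim)
        return self.score(q, user_interest).squeeze(1)  # mention-likelihood logit

# toy usage
post = torch.randint(0, 10000, (2, 20))
history = torch.randint(0, 10000, (2, 5, 20))
scores = MentionMemoryNet()(post, history)              # (2,) logits
```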


Author(s):  
Jing Li

Identifying discourse structures and coherence relations in a piece of text is a fundamental task in natural language processing. The first step of this process is segmenting sentences into clause-like units called elementary discourse units (EDUs). Traditional solutions to discourse segmentation rely heavily on carefully designed features. In this demonstration, we present SegBot, a system that splits a given piece of text into a sequence of EDUs by using an end-to-end neural segmentation model. Our model does not require hand-crafted features or external knowledge except word embeddings, yet it outperforms state-of-the-art solutions to discourse segmentation.
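As a simplified illustration (not SegBot itself), the sketch below treats segmentation as token-level boundary classification with a BiLSTM over word embeddings and then groups tokens into EDUs from the predicted boundary flags; the architecture and sizes are assumptions.

```python
# Simplified sketch (not SegBot): a token labeled 1 ends the current
# elementary discourse unit (EDU).
import torch
import torch.nn as nn

class BoundarySegmenter(nn.Module):
    def __init__(self, vocab_size=20000, emb_dim=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.clf = nn.Linear(2 * hidden, 2)             # boundary vs. non-boundary

    def forward(self, token_ids):                       # (B, T)
        h, _ = self.lstm(self.embed(token_ids))
        return self.clf(h)                              # (B, T, 2) boundary logits

def split_into_edus(tokens, boundary_flags):
    """Group tokens into EDUs given per-token end-of-unit flags."""
    edus, current = [], []
    for tok, is_end in zip(tokens, boundary_flags):
        current.append(tok)
        if is_end:
            edus.append(" ".join(current))
            current = []
    if current:
        edus.append(" ".join(current))
    return edus
```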


10.29007/dlff
2019
Author(s):
Alena Fenogenova
Viktor Kazorin
Ilia Karpov
Tatyana Krylova

Automatic morphological analysis is one of the fundamental and significant tasks of NLP (Natural Language Processing). Because Internet texts range from normative texts (news, fiction, nonfiction) to less formal texts (such as blogs and posts from social networks), their morphological tagging is a non-trivial and timely task. In this paper, we describe our experiments in tagging Internet texts and present our approach based on deep learning. We also created a new social media test set, which allows our system to be compared with state-of-the-art open-source analyzers on social media texts.


Information
2020
Vol 11 (2)
pp. 74
Author(s):
André Ferreira Cruz
Gil Rocha
Henrique Lopes Cardoso

The task of coreference resolution has attracted considerable attention in the literature due to its importance in deep language understanding and its potential as a subtask in a variety of complex natural language processing problems. In this study, we outline the field's terminology and describe existing metrics, their differences and shortcomings, as well as the available corpora and external resources. We analyze existing state-of-the-art models and approaches, and review recent advances and trends in the field, namely end-to-end systems that jointly model different subtasks of coreference resolution, and cross-lingual systems that aim to overcome the challenges of less-resourced languages. Finally, we discuss the main challenges and open issues faced by coreference resolution systems.


Author(s):  
K Sobha Rani

Collaborative filtering suffers from the problems of data sparsity and cold start, which dramatically degrade recommendation performance. To help resolve these issues, we propose TrustSVD, a trust-based matrix factorization technique. By analyzing the social trust data from four real-world data sets, we conclude that not only the explicit but also the implicit influence of both ratings and trust should be taken into consideration in a recommendation model. Hence, we build on top of SVD++, a state-of-the-art recommendation algorithm that inherently involves the explicit and implicit influence of rated items, by further incorporating both the explicit and implicit influence of trusted users on the prediction of items for an active user. To our knowledge, this work is the first to extend SVD++ with social trust information. Experimental results on the four data sets demonstrate that our approach, TrustSVD, achieves better accuracy than ten other counterparts and can better handle the concerned issues.
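For illustration, a sketch of a TrustSVD-style prediction rule is given below: the SVD++ rating estimate is extended with the implicit influence of trusted users. Variable names and the exact normalization are assumptions paraphrased from the abstract, and the explicit trust-modeling and regularization terms of the full method are omitted.

```python
# Illustrative sketch of a TrustSVD-style rating prediction (assumptions,
# not the paper's implementation): SVD++ plus implicit trusted-user feedback.
import numpy as np

def predict_rating(mu, b_u, b_j, p_u, q_j, Y_rated, W_trusted):
    """mu/b_u/b_j: global/user/item biases; p_u, q_j: latent factors (k,);
    Y_rated: implicit factors of items the user rated (n_items x k);
    W_trusted: factors of users the user trusts (n_trustees x k)."""
    implicit_items = Y_rated.sum(axis=0) / np.sqrt(max(len(Y_rated), 1))
    implicit_trust = W_trusted.sum(axis=0) / np.sqrt(max(len(W_trusted), 1))
    return mu + b_u + b_j + q_j @ (p_u + implicit_items + implicit_trust)

# toy usage with random factors of dimension k = 8
k, rng = 8, np.random.default_rng(0)
r_hat = predict_rating(3.5, 0.1, -0.2,
                       rng.normal(size=k), rng.normal(size=k),
                       rng.normal(size=(5, k)), rng.normal(size=(3, k)))
```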


Author(s):  
Eduardo Manzano Moreno

This chapter addresses a very simple question: is it possible to frame coinage in the Early Middle Ages? The answer is certainly yes, but the chapter also acknowledges that we lack considerable amounts of relevant data potentially available through state-of-the-art methodologies. One problem, though, is that we often do not really know the relevant questions we can pose of coins; another is that we still have not figured out the social role of coinage in the aftermath of the Roman Empire. This chapter shows a number of things that could only be known thanks to the analysis of coins. And, as its title suggests, it also includes some reflections on greed and generosity.


2021
Vol 11 (15)
pp. 6975
Author(s):
Tao Zhang
Lun He
Xudong Li
Guoqing Feng

Lipreading aims to recognize sentences being spoken by a talking face. In recent years, lipreading methods have achieved high accuracy on large datasets and made breakthrough progress. However, lipreading is still far from being solved: existing methods tend to have high error rates on in-the-wild data and suffer from vanishing training gradients and slow convergence. To overcome these problems, we propose an efficient end-to-end sentence-level lipreading model, using an encoder based on a 3D convolutional network, ResNet50, and a Temporal Convolutional Network (TCN), with a CTC objective function as the decoder. More importantly, the proposed architecture incorporates the TCN as a feature learner to decode features. This partly eliminates the vanishing-gradient and limited-performance defects of RNNs (LSTM, GRU), yielding notable performance improvement as well as faster convergence. Experiments show that training and convergence are 50% faster than for the state-of-the-art method, and accuracy improves by 2.4% on the GRID dataset.
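A minimal sketch of this kind of pipeline is shown below, assuming PyTorch: a 3D-convolutional front end over the mouth-region video, dilated 1D convolutions standing in for the TCN, and a CTC objective over character labels. The ResNet50 spatial trunk is omitted and all hyperparameters are assumed, so this is an illustration of the components, not the paper's model.

```python
# Minimal sketch (assumptions, not the paper's full model): 3D-conv front end,
# dilated temporal convolutions as a TCN, and CTC training over characters.
import torch
import torch.nn as nn

class LipReader(nn.Module):
    def __init__(self, n_chars=28, dim=256):
        super().__init__()
        self.frontend = nn.Sequential(                     # input: (B, 1, T, H, W)
            nn.Conv3d(1, dim, kernel_size=(5, 7, 7), stride=(1, 2, 2), padding=(2, 3, 3)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),            # keep time, pool space
        )
        self.tcn = nn.Sequential(                          # dilated temporal convs
            nn.Conv1d(dim, dim, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(dim, dim, 3, padding=2, dilation=2), nn.ReLU(),
        )
        self.head = nn.Linear(dim, n_chars)                # characters + CTC blank

    def forward(self, video):                              # (B, 1, T, H, W)
        x = self.frontend(video).squeeze(-1).squeeze(-1)   # (B, dim, T)
        x = self.tcn(x).transpose(1, 2)                    # (B, T, dim)
        return self.head(x).log_softmax(-1)                # (B, T, n_chars)

# CTC training step (nn.CTCLoss expects log_probs shaped (T, B, C))
model, ctc = LipReader(), nn.CTCLoss(blank=0)
video = torch.randn(2, 1, 75, 64, 128)
log_probs = model(video).transpose(0, 1)
targets = torch.randint(1, 28, (2, 30))
loss = ctc(log_probs, targets,
           torch.full((2,), 75), torch.full((2,), 30))
```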

