Deep Learning for Video Captioning: A Review

Author(s):  
Shaoxiang Chen ◽  
Ting Yao ◽  
Yu-Gang Jiang

Deep learning has recently achieved great success in solving specific artificial intelligence problems, and substantial progress has been made in Computer Vision (CV) and Natural Language Processing (NLP). As a connection between the two worlds of vision and language, video captioning is the task of producing a natural-language utterance (usually a sentence) that describes the visual content of a video. The task naturally decomposes into two sub-tasks. One is to encode the video through a thorough understanding of its content and learn a visual representation. The other is caption generation, which decodes the learned representation into a sentence, word by word. In this survey, we first formulate the problem of video captioning, then review state-of-the-art methods categorized by their emphasis on vision or language, followed by a summary of standard datasets and representative approaches. Finally, we highlight the challenges that are not yet fully understood in this task and present future research directions.

2021 ◽  
Vol 9 ◽  
pp. 1061-1080
Author(s):  
Prakhar Ganesh ◽  
Yao Chen ◽  
Xin Lou ◽  
Mohammad Ali Khan ◽  
Yin Yang ◽  
...  

Pre-trained Transformer-based models have achieved state-of-the-art performance for various Natural Language Processing (NLP) tasks. However, these models often have billions of parameters, and thus are too resource-hungry and computation-intensive to suit low-capability devices or applications with strict latency requirements. One potential remedy for this is model compression, which has attracted considerable research attention. Here, we summarize the research in compressing Transformers, focusing on the especially popular BERT model. In particular, we survey the state of the art in compression for BERT, we clarify the current best practices for compressing large-scale Transformer models, and we provide insights into the workings of various methods. Our categorization and analysis also shed light on promising future research directions for achieving lightweight, accurate, and generic NLP models.


Author(s):  
Martin Atzmueller

Data mining provides approaches for the identification and discovery of non-trivial patterns and models hidden in large collections of data. In the applied natural language processing domain, data mining usually requires preprocessed data that has been extracted from textual documents. Additionally, this data is often integrated with other data sources. This chapter provides an overview of data mining, focusing on approaches for pattern mining, cluster analysis, and predictive model construction. For each of these, we discuss exemplary techniques that are especially useful in the applied natural language processing context. Additionally, we describe how the presented data mining approaches are connected to text mining, text classification, and clustering, and discuss interesting problems and future research directions.


2019 ◽  
Vol 11 (3) ◽  
pp. 937 ◽  
Author(s):  
Xing Shi ◽  
Binghui Si ◽  
Jiangshan Zhao ◽  
Zhichao Tian ◽  
Chao Wang ◽  
...  

The performance gap of buildings is commonly defined as the difference between the performance value predicted in the design stage and that measured in the post-occupancy stage. Knowledge about the performance gap of buildings is valuable in many respects and is thus a research subject drawing much attention. Important questions that should be asked include: (1) Definition: what is the performance gap of buildings? (2) Magnitude: how large is the performance gap of buildings? (3) Techniques: how can the performance gap of buildings be determined? (4) Causes: what are the reasons leading to the performance gap of buildings? (5) Solutions: how can the performance gap of buildings be bridged? These questions are addressed by collecting and analyzing more than 20 published works with reported data on the performance gap of buildings, along with other research articles. Through this review, state-of-the-art knowledge regarding the performance gap of buildings is presented. Major conclusions are drawn and future research directions are pointed out.


2000 ◽  
Vol 6 (2) ◽  
pp. 163-181 ◽  
Author(s):  
QIANG ZHOU ◽  
FUJI REN

In this paper, we propose a new ambiguity representation scheme, the Structure Preference Relation (SPR), which consists of useful quantitative distribution information for ambiguous structures. Two automatic acquisition algorithms are introduced, the first acquiring SPRs from a treebank and the second from raw texts, and some experimental results which prove the availability of the algorithms are also given. Finally, we introduce some SPR applications in linguistics and natural language processing, such as preference-based parsing and the discovery of representative ambiguous structures, and propose some future research directions.


Author(s):  
Nag Nami ◽  
Melody Moh

Intelligent systems are capable of doing tasks on their own with minimal or no human intervention. With the advent of big data and IoT, these intelligent systems have made their way into most industries and homes. With its recent advancements, deep learning has created a niche in the technology space and is being actively used in big data and IoT systems globally. With this wider adoption, deep learning models have unfortunately become susceptible to attacks. Research has shown that many accurate state-of-the-art models can be vulnerable to attacks by well-crafted adversarial examples. This chapter aims to provide a concise, in-depth understanding of attacks on deep learning models and defenses against them. The chapter first presents the key architectures and application domains of deep learning and their vulnerabilities. Next, it illustrates the prominent adversarial examples, including the algorithms and techniques used to generate these attacks. Finally, it describes challenges and mechanisms to counter these attacks, and suggests future research directions.


Author(s):  
Giovanni Da San Martino ◽  
Stefano Cresci ◽  
Alberto Barrón-Cedeño ◽  
Seunghak Yu ◽  
Roberto Di Pietro ◽  
...  

Propaganda campaigns aim at influencing people's mindset with the purpose of advancing a specific agenda. They exploit the anonymity of the Internet, the micro-profiling ability of social networks, and the ease of automatically creating and managing coordinated networks of accounts to reach millions of social network users with persuasive messages, specifically targeted to topics each individual user is sensitive to, and ultimately to influence the outcome on a targeted issue. In this survey, we review the state of the art on computational propaganda detection from the perspective of Natural Language Processing and Network Analysis, arguing for the need for combined efforts between these communities. We further discuss current challenges and future research directions.


Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2312
Author(s):  
Tom Bolton ◽  
Tooska Dargahi ◽  
Sana Belguith ◽  
Mabrook S. Al-Rakhami ◽  
Ali Hassan Sodhro

Since the purchase of Siri by Apple, and its release with the iPhone 4S in 2011, virtual assistants (VAs) have grown in number and popularity. The sophisticated natural language processing and speech recognition employed by VAs enable users to interact with them conversationally, almost as they would with another human. To service user voice requests, VAs transmit large amounts of data to their vendors; these data are processed and stored in the Cloud. The potential data security and privacy issues involved in this process provided the motivation to examine the current state of the art in VA research. In this study, we identify peer-reviewed literature that focuses on security and privacy concerns surrounding these assistants, including current trends in addressing how voice assistants are vulnerable to malicious attacks and worries that the VA is recording without the user’s knowledge or consent. The findings show that not only are these worries manifold, but there is also a gap in the current state of the art, and no current literature reviews on the topic exist. This review sheds light on future research directions, such as providing solutions to perform voice authentication without an external device, and the compliance of VAs with privacy regulations.


2016 ◽  
Vol 26 (3) ◽  
pp. 269-290 ◽  
Author(s):  
Catherine Baethge ◽  
Julia Klier ◽  
Mathias Klier
