Semantic Hyperlapse: a Sparse Coding-based and Multi-Importance Approach for First-Person Videos

2020
Author(s):  
Michel M. Silva ◽  
Mario F. M. Campos ◽  
Erickson R. Nascimento

The availability of low-cost, high-quality wearable cameras, combined with the unlimited storage capacity of video-sharing websites, has evoked a growing interest in First-Person Videos. Such videos are usually composed of long-running unedited streams captured by a device attached to the user's body, which makes them tedious and visually unpleasant to watch. Consequently, there is a need to provide quick access to the information therein. We propose a Sparse Coding-based methodology to adaptively fast-forward First-Person Videos. Experimental evaluations show that the shorter video produced by the proposed method is more stable and retains more semantic information than the state of the art. Visual results and a graphical explanation of the methodology are available at: https://youtu.be/rTEZurH64ME

Author(s):  
Michel M. Silva ◽  
Mario F. M. Campos ◽  
Erickson R. Nascimento

The availability of low-cost, high-quality personal wearable cameras combined with the unlimited storage capacity of video-sharing websites has evoked a growing interest in First-Person Videos (FPVs). Such videos are usually composed of long-running unedited streams captured by a device attached to the user's body, which makes them tedious and visually unpleasant to watch. Consequently, there is a growing need to provide quick access to the information therein. To address this need, efforts have been devoted to techniques such as Hyperlapse and Semantic Hyperlapse, which aim to create visually pleasant shorter videos and to emphasize semantically relevant portions of the video, respectively. The state-of-the-art Semantic Hyperlapse method, SSFF, neglects the level of importance of the relevant information by only evaluating whether it is significant or not. Other limitations of SSFF are the number of input parameters, poor scalability in the number of visual features used to describe the frames, and abrupt changes in the speed-up rate of consecutive video segments. In this dissertation, we propose a parameter-free Sparse Coding-based methodology to adaptively fast-forward First-Person Videos that emphasizes semantic portions through a multi-importance approach. Experimental evaluations show that the proposed method creates a shorter video that retains more semantic information, has fewer abrupt speed-up transitions, and is more stable than the output of SSFF. Visual results and a graphical explanation of the methodology are available at: https://youtu.be/8uStih8P5-Y.
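As a rough illustration of the core selection mechanism, the sketch below treats per-frame descriptors as dictionary atoms and keeps the frames whose atoms receive non-zero coefficients when sparsely reconstructing a segment-level descriptor, using scikit-learn's OMP-based SparseCoder. The random descriptors, budget, and mean-pooling aggregation are placeholder assumptions; the actual formulation additionally weights semantic importance and smooths speed-up transitions.

```python
# Minimal, illustrative sketch of sparse-coding-based frame selection for
# fast-forwarding, NOT the authors' exact formulation. Frame descriptors
# (random stand-ins here) form the dictionary; frames whose atoms receive
# non-zero coefficients when reconstructing the segment descriptor are kept.
import numpy as np
from sklearn.decomposition import SparseCoder

rng = np.random.default_rng(0)
n_frames, feat_dim, budget = 300, 128, 30        # 'budget' ~ desired speed-up

frame_feats = rng.normal(size=(n_frames, feat_dim))          # one row per frame
frame_feats /= np.linalg.norm(frame_feats, axis=1, keepdims=True)

segment_desc = frame_feats.mean(axis=0, keepdims=True)       # what must be "explained"

coder = SparseCoder(dictionary=frame_feats,
                    transform_algorithm="omp",
                    transform_n_nonzero_coefs=budget)
codes = coder.transform(segment_desc)                         # shape (1, n_frames)

selected = np.flatnonzero(codes[0])                           # indices of kept frames
print(f"kept {selected.size}/{n_frames} frames:", selected[:10], "...")
```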


2021
Author(s):  
Xueqiao Li ◽  
Na Sun ◽  
Zhanfeng Li ◽  
Jinbo Chen ◽  
Qinjun Sun ◽  
...  

Perovskite solar cells (PSCs) have reached their highest efficiency with the state-of-the-art hole-transporting material (HTM) spiro-OMeTAD.


Author(s):  
Anastasia Dimou

In this chapter, an overview of the state of the art in knowledge graph generation is provided, with a focus on the two prevalent mapping languages: the W3C-recommended R2RML and its generalisation RML. We look in detail at their differences and explain how knowledge graphs, in the form of RDF graphs, can be generated with each of the two mapping languages. We then assess whether the vocabulary terms were properly applied to the data and whether any violations occurred in their use, whether R2RML or RML is used to generate the desired knowledge graph.
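As a conceptual illustration of what an R2RML or RML processor produces, the sketch below builds an RDF graph from tabular rows with rdflib: a subject template plus per-column predicate-object mappings. The namespace, table, and column names are invented for illustration; a real engine would read a declarative mapping document rather than hard-coding these rules.

```python
# Conceptual sketch of the output of an R2RML/RML processor: RDF triples
# generated from tabular source data via a subject template and per-column
# predicate-object mappings. Uses rdflib directly instead of a real mapping
# engine; the vocabulary and columns are hypothetical.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

EX = Namespace("http://example.org/")                 # hypothetical vocabulary
rows = [{"id": "1", "name": "Alice", "age": "30"},
        {"id": "2", "name": "Bob",   "age": "42"}]

g = Graph()
g.bind("ex", EX)
for row in rows:
    subject = URIRef(f"http://example.org/person/{row['id']}")  # cf. rr:template
    g.add((subject, RDF.type, EX.Person))                        # cf. rr:class
    g.add((subject, EX.name, Literal(row["name"])))              # predicate-object map
    g.add((subject, EX.age, Literal(row["age"], datatype=XSD.integer)))

print(g.serialize(format="turtle"))
```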


Author(s):  
Yongzhi Wang

The application of virtual reality (VR) in higher education has drawn attention. Understanding the state of the art of VR technologies helps educators identify appropriate applications and develop a high-quality, engaging teaching-learning process. This chapter first provides a comprehensive survey of current hardware and software support for VR. Second, important technical metrics in VR technology are considered, with comparisons of different VR devices based on the identified metrics. Third, the focus turns to software tools and an exploration of various development frameworks that facilitate the implementation of VR applications. With this information as a foundation, the use of VR in higher education is examined. Finally, VR applications that can potentially be used in education are discussed.


Author(s):  
Tianxing Wu ◽  
Guilin Qi ◽  
Bin Luo ◽  
Lei Zhang ◽  
Haofen Wang

Extracting knowledge from Wikipedia has attracted much attention in the last ten years. One of the most valuable kinds of knowledge is type information, i.e. the axioms stating that an instance is of a certain type. Current approaches for inferring the types of instances from Wikipedia mainly rely on language-specific rules. Since these rules cannot capture the semantic associations between instances and classes (i.e., candidate types), they may lead to mistakes and omissions in type inference. The authors propose a new approach that leverages attributes to perform language-independent type inference of instances from Wikipedia. The proposed approach is applied to the whole English and Chinese Wikipedia, resulting in the first version of MulType (Multilingual Type Information), a knowledge base describing the types of instances from multilingual Wikipedia. Experimental results show that the proposed approach not only outperforms state-of-the-art comparison methods, but also that MulType contains a large amount of new, high-quality type information.
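A deliberately simplified sketch of the intuition behind attribute-based type inference follows: each candidate class is scored by the overlap between its characteristic attribute set and the attributes observed on an instance's infobox (plain Jaccard here). The attribute profiles are invented, and the article's approach learns far richer instance-class associations; this only illustrates why attributes provide language-independent evidence.

```python
# Simplified, hypothetical sketch of attribute-driven type inference: score
# each candidate class by how well its characteristic attributes match the
# attributes observed on an instance's infobox. Not the article's model.
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

# Made-up attribute profiles for two candidate classes.
class_attributes = {
    "Person": {"birth_date", "birth_place", "occupation", "nationality"},
    "City":   {"population", "area", "country", "mayor"},
}

instance_attributes = {"birth_date", "occupation", "spouse"}   # e.g. from an infobox

scores = {cls: jaccard(instance_attributes, attrs)
          for cls, attrs in class_attributes.items()}
inferred = max(scores, key=scores.get)
print(scores, "->", inferred)          # "Person" scores highest here
```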


2020
Vol 34 (07)
pp. 11394-11401
Author(s):  
Shuzhao Li ◽  
Huimin Yu ◽  
Haoji Hu

In this paper, we propose an Appearance and Motion Enhancement Model (AMEM) for video-based person re-identification to enrich the two kinds of information contained in the backbone network in a more interpretable way. Concretely, human attribute recognition under the supervision of pseudo labels is exploited in an Appearance Enhancement Module (AEM) to help enrich the appearance and semantic information. A Motion Enhancement Module (MEM) is designed to capture identity-discriminative walking patterns by predicting future frames. Although the model is complex during training, with several auxiliary modules, only the backbone plus two small branches are kept for similarity evaluation, constituting a simple but effective final model. Extensive experiments conducted on three popular video-based person ReID benchmarks demonstrate the effectiveness of the proposed model and its state-of-the-art performance compared with existing methods.
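The sketch below illustrates, in PyTorch, the kind of lightweight test-time model described: a backbone feature extractor with two small branches whose concatenated embeddings are compared by cosine similarity. The toy backbone, layer sizes, and temporal average pooling are assumptions for illustration and are not the authors' architecture.

```python
# Hedged sketch of a backbone-plus-two-branches ReID model used only for
# similarity evaluation at test time. All sizes are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReIDModel(nn.Module):
    def __init__(self, feat_dim=256, branch_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(               # stand-in for a CNN backbone
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.appearance_branch = nn.Linear(feat_dim, branch_dim)
        self.motion_branch = nn.Linear(feat_dim, branch_dim)

    def forward(self, clip):                         # clip: (T, 3, H, W)
        frame_feats = self.backbone(clip)            # (T, feat_dim)
        clip_feat = frame_feats.mean(dim=0)          # temporal average pooling
        emb = torch.cat([self.appearance_branch(clip_feat),
                         self.motion_branch(clip_feat)], dim=-1)
        return F.normalize(emb, dim=-1)

model = ReIDModel().eval()
with torch.no_grad():
    q = model(torch.randn(8, 3, 128, 64))            # query clip
    g = model(torch.randn(8, 3, 128, 64))            # gallery clip
    print("cosine similarity:", torch.dot(q, g).item())
```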


Author(s):  
Ziming Li ◽  
Julia Kiseleva ◽  
Maarten De Rijke

The performance of adversarial dialogue generation models relies on the quality of the reward signal produced by the discriminator. The reward signal from a poor discriminator can be very sparse and unstable, which may lead the generator to fall into a local optimum or to produce nonsensical replies. To alleviate the first problem, we first extend a recently proposed adversarial dialogue generation method into an adversarial imitation learning solution. Then, in the framework of adversarial inverse reinforcement learning, we propose a new reward model for dialogue generation that can provide a more accurate and precise reward signal for generator training. We evaluate the performance of the resulting model with automatic metrics and human evaluations in two annotation settings. Our experimental results demonstrate that our model generates higher-quality responses and achieves higher overall performance than the state of the art.
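For context, the sketch below shows the generic adversarial reward signal this line of work builds on: the discriminator's score for a sampled response is fed back to the generator as a REINFORCE reward. The tiny modules are placeholders, and the paper's adversarial inverse-RL reward model is more fine-grained than this single sparse scalar.

```python
# Toy sketch of discriminator-as-reward training for dialogue generation.
# Single-token "responses" and tiny modules keep the example self-contained;
# this is not the paper's reward model.
import torch
import torch.nn as nn

vocab, hidden = 1000, 64
generator = nn.Sequential(nn.Embedding(vocab, hidden), nn.Flatten(),
                          nn.Linear(hidden * 10, vocab))        # 10-token context -> next-token logits
discriminator = nn.Sequential(nn.Embedding(vocab, hidden), nn.Flatten(),
                              nn.Linear(hidden * 11, 1), nn.Sigmoid())

context = torch.randint(0, vocab, (1, 10))                       # toy dialogue context
logits = generator(context)                                      # (1, vocab)
dist = torch.distributions.Categorical(logits=logits)
response_tok = dist.sample()                                     # sampled reply token

exchange = torch.cat([context, response_tok.unsqueeze(0)], dim=1)     # (1, 11)
reward = torch.log(discriminator(exchange) + 1e-8).detach()           # sparse scalar reward

loss = -(reward * dist.log_prob(response_tok)).mean()            # REINFORCE update
loss.backward()
```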


Author(s):  
Xosé López-García ◽  
Ángel Vizoso

High technology is driving much of the innovation and debate in journalism today. Artificial intelligence and journalism are walking hand in hand in the current phase, defined by the digitization of processes. Studies on the state of the art of such technology in the media reveal a clear tendency towards the use of more sophisticated tools. Furthermore, this research highlights how journalists are increasingly using such approaches in challenging situations. This shift has thus led to more debate on the threats and opportunities of introducing such technologies into a communication ecosystem that is already in need of models that can produce high-quality information. This study thus describes the state of the art on the integration of high technology into daily routines in the media. [Abstract, translated from Spanish] So-called "high technology" drives much of the innovation and debate in today's journalism. Artificial intelligence and journalism walk hand in hand in the current phase of the digitization of processes. Studies on the state of technologies in media newsrooms show a clear tendency for journalists to work with more sophisticated tools and to use them to meet the challenges they face in carrying out their professional work. This trend, which shows no sign of reversing, introduces renewed debates about threats and opportunities in a communication ecosystem that is ever more complex and ever more in need of answers for establishing sustainable models that ensure the existence of quality journalistic information. This text offers an approach to the state of the question, analyses experiences, and situates some of the challenges.


2018
Vol 3 (1)
pp. 821
Author(s):  
Antony García ◽  
Yessica Sáez ◽  
José Muñoz ◽  
Ignacio Chang ◽  
Héctor Montes Franceschi

This article presents the state of the art on the use of radiofrequency communication for the detection of objects and vehicles in motion, through the interaction between transmitter and receiver devices using ISM (Industrial, Scientific and Medical) bands. By quantifying parameters such as the absence or presence of signals and their intensity, it is possible to approximate the distance between an emitting device and a receiver, located in the vehicle and at a fixed point, respectively. The methodologies studied in this article aim to support the development of a system to guide people with visual disabilities in the public transportation system, taking advantage of the main characteristics of radiofrequency communication: low cost, easy implementation, and full compatibility with electronic boards built on embedded systems.
Keywords: radiofrequency, ISM bands, detection of vehicles in motion, support for people with visual disabilities, ETA
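A common way to turn signal intensity into an approximate distance is the standard log-distance path-loss model sketched below. The calibration constants (RSSI at one metre and the path-loss exponent) are illustrative assumptions and must be measured for each environment and ISM band.

```python
# Log-distance path-loss model commonly used to approximate distance from a
# received signal strength indicator (RSSI); calibration values are illustrative.
def rssi_to_distance(rssi_dbm, rssi_at_1m=-55.0, path_loss_exponent=2.5):
    """Estimate transmitter-receiver distance (metres) from an RSSI reading."""
    return 10 ** ((rssi_at_1m - rssi_dbm) / (10 * path_loss_exponent))

for reading in (-55, -65, -75):
    print(f"RSSI {reading} dBm -> ~{rssi_to_distance(reading):.1f} m")
```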


Author(s):  
Shuai Yang ◽  
Jiaying Liu ◽  
Wenjing Wang ◽  
Zongming Guo

Text effects transfer technology automatically makes text dramatically more impressive. However, previous style transfer methods either model general style, which cannot handle the highly structured text effects along the glyph, or require manual design of subtle matching criteria for text effects. In this paper, we focus on using the powerful representation abilities of deep neural features for text effects transfer. For this purpose, we propose a novel Texture Effects Transfer GAN (TET-GAN), which consists of a stylization subnetwork and a destylization subnetwork. The key idea is to train our network to accomplish both the objective of style transfer and that of style removal, so that it can learn to disentangle and recombine the content and style features of text effects images. To support the training of our network, we propose a new text effects dataset with as many as 64 professionally designed styles on 837 characters. We show that the disentangled feature representations enable us to transfer or remove all these styles on arbitrary glyphs using one network. Furthermore, the flexible network design empowers TET-GAN to efficiently extend to a new text style via one-shot learning, where only one example is required. We demonstrate the superiority of the proposed method in generating high-quality stylized text over the state-of-the-art methods.
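A schematic sketch of the dual training objective described (stylization plus destylization) is given below; it shows how optimizing both directions encourages the network to disentangle glyph content from text-effect style. Module sizes are placeholders and the adversarial losses of the actual TET-GAN are omitted; this is not the authors' implementation.

```python
# Schematic sketch of the stylization + destylization objectives used to
# disentangle content (glyph) from style (text effects). Toy modules and data.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())

content_enc = conv_block(3, 32)        # encodes glyph structure
style_enc   = conv_block(3, 32)        # encodes text-effect appearance
decoder     = nn.Conv2d(64, 3, 3, padding=1)

l1 = nn.L1Loss()
plain_glyph  = torch.rand(4, 3, 64, 64)    # raw text image (toy data)
styled_glyph = torch.rand(4, 3, 64, 64)    # same text with effects (toy data)

# Stylization: content of the plain glyph + style of the styled example -> styled target.
stylized = decoder(torch.cat([content_enc(plain_glyph), style_enc(styled_glyph)], dim=1))
loss_stylize = l1(stylized, styled_glyph)

# Destylization: content of the styled glyph + a neutral style code -> plain target.
neutral_style = torch.zeros_like(style_enc(styled_glyph))
destylized = decoder(torch.cat([content_enc(styled_glyph), neutral_style], dim=1))
loss_destylize = l1(destylized, plain_glyph)

(loss_stylize + loss_destylize).backward()
```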

