still image Latest Research Papers

Speech-Driven Facial Animations Improve Speech-in-Noise Comprehension of Humans

Frontiers in Neuroscience ◽

10.3389/fnins.2021.781196 ◽

2022 ◽

Vol 15 ◽

Author(s):

Enrico Varano ◽

Konstantinos Vougioukas ◽

Pingchuan Ma ◽

Stavros Petridis ◽

Maja Pantic ◽

...

Keyword(s):

Hearing Impairment ◽

Audiovisual Speech ◽

Speech Comprehension ◽

Generative Adversarial Network ◽

Still Image ◽

Speech In Noise ◽

Adversarial Network ◽

Listening Environments ◽

End To End ◽

Speech Recognizer

Understanding speech becomes a demanding task when the environment is noisy. Comprehension of speech in noise can be substantially improved by looking at the speaker’s face, and this audiovisual benefit is even more pronounced in people with hearing impairment. Recent advances in AI have allowed to synthesize photorealistic talking faces from a speech recording and a still image of a person’s face in an end-to-end manner. However, it has remained unknown whether such facial animations improve speech-in-noise comprehension. Here we consider facial animations produced by a recently introduced generative adversarial network (GAN), and show that humans cannot distinguish between the synthesized and the natural videos. Importantly, we then show that the end-to-end synthesized videos significantly aid humans in understanding speech in noise, although the natural facial motions yield a yet higher audiovisual benefit. We further find that an audiovisual speech recognizer (AVSR) benefits from the synthesized facial animations as well. Our results suggest that synthesizing facial motions from speech can be used to aid speech comprehension in difficult listening environments.

Download Full-text

Arts-Aided Recognition of Citizens’ Perceptions for Urban Open Space Management

Sustainability ◽

10.3390/su14010135 ◽

2021 ◽

Vol 14 (1) ◽

pp. 135

Author(s):

Sari Suomalainen ◽

Helena Kahiluoto ◽

Anne Pässilä ◽

Allan Owens ◽

Clive Holtham

Keyword(s):

Open Space ◽

Well Being ◽

Image Method ◽

Natural Environments ◽

Environment Interaction ◽

Open Spaces ◽

Human Environment ◽

Space Management ◽

Still Image ◽

Urban Open Space

Urban open spaces of local natural environments can promote the health and well-being of both ecosystems and humans, and the management of the urban spaces can benefit from knowledge of individuals’/citizens’ perceptions of such environments. However, such knowledge is scarce and contemporary inquiries are often limited to cognitive observations and focused on built environmental elements rather than encouraged to recognize and communicate comprehensive perceptions. This paper investigates whether arts-based methods can facilitate recognition and understanding perceptions of urban open spaces. Two arts-based methods were used to capture perceptions: drifting, which is a walking method, and theatrical images, which is a still image method and three reflective methods to recognize and communicate the perceptions. The results show related sensations and perceptions enabled by arts-based methods comparing them to a sticker map method. The main findings were perceptions, which included information about human–environment interaction, about relations to other people and about ‘sense of place’ in urban open spaces. The hitherto unidentified perceptions about urban open space were associations, metaphors and memories. The methods used offer initial practical implications for future use.

Download Full-text

Speech-driven Facial Animations Improve Speech-in-Noise Comprehension of Humans

10.1101/2021.12.18.471222 ◽

2021 ◽

Author(s):

Enrico Varano ◽

Konstantinos Vougioukas ◽

Pingchuan Ma ◽

Stavros Petridis ◽

Maja Pantic ◽

...

Keyword(s):

Hearing Impairment ◽

Audiovisual Speech ◽

Speech Comprehension ◽

Generative Adversarial Network ◽

Still Image ◽

Speech In Noise ◽

Adversarial Network ◽

Listening Environments ◽

End To End ◽

Speech Recognizer

Understanding speech becomes a demanding task when the environment is noisy. Comprehension of speech in noise can be substantially improved by looking at the speake's face, and this audiovisual benefit is even more pronounced in people with hearing impairment. Recent advances in AI have allowed to synthesize photorealistic talking faces from a speech recording and a still image of a person's face in an end-to-end manner. However, it has remained unknown whether such facial animations improve speech-in-noise comprehension. Here we consider facial animations produced by a recently introduced generative adversarial network (GAN), and show that humans cannot distinguish between the synthesized and the natural videos. Importantly, we then show that the end-to-end synthesized videos significantly aid humans in understanding speech in noise, although the natural facial motions yield a yet higher audiovisual benefit. We further find that an audiovisual speech recognizer benefits from the synthesized facial animations as well. Our results suggest that synthesizing facial motions from speech can be used to aid speech comprehension in difficult listening environments.

Download Full-text

Depth and edge auxiliary learning for still image crowd density estimation

Pattern Analysis and Applications ◽

10.1007/s10044-021-01017-4 ◽

2021 ◽

Author(s):

Sifan Peng ◽

Baoqun Yin ◽

Xiaoliang Hao ◽

Qianqian Yang ◽

Aakash Kumar ◽

...

Keyword(s):

Density Estimation ◽

Still Image ◽

Crowd Density Estimation ◽

Crowd Density

Download Full-text

Taylor-Series-Based Reconfigurability of Gamma Correction in Hardware Designs

Electronics ◽

10.3390/electronics10161959 ◽

2021 ◽

Vol 10 (16) ◽

pp. 1959

Author(s):

Dat Ngo ◽

Bongsoon Kang

Keyword(s):

Taylor Series ◽

Hardware Implementation ◽

Processing Technique ◽

Image Processing Technique ◽

Gamma Correction ◽

Challenging Problem ◽

Still Image ◽

System On A Chip ◽

Hardware Designs ◽

Dedicated Hardware

Gamma correction is a common image processing technique that is common in video or still image systems. However, this simple and efficient method is typically expressed using the power law, which gives rise to practical difficulties in designing a reconfigurable hardware implementation. For example, the conventional approach calculates all possible outputs for a pre-determined gamma value, and this information is hardwired into memory components. As a result, reconfigurability is unattainable after deployment. This study proposes using the Taylor series to approximate gamma correction to overcome the aforementioned challenging problem, hence, facilitating the post-deployment reconfigurability of the hardware implementation. In other words, the gamma value is freely adjustable, resulting in the high appropriateness for offloading gamma correction onto its dedicated hardware in system-on-a-chip applications. Finally, the proposed hardware implementation is verified on Zynq UltraScale+ MPSoC ZCU106 Evaluation Kit, and the results demonstrate its superiority against benchmark designs.

Download Full-text

Information Hiding in Still Image Based on Variable Steganography Technique to Achieve High Imperceptibility

10.1109/icsgrc53186.2021.9515235 ◽

2021 ◽

Author(s):

Rakan Mohammed Rashid ◽

Baker Khalid Baker ◽

Omar Farook Mohammad ◽

Falah Y H Ahmed

Keyword(s):

Information Hiding ◽

Still Image

Download Full-text

Recognition of Images Formed in Pho on the Eyes of different Subjects

Revista Gestão Inovação e Tecnologias ◽

10.47059/revistageintec.v11i4.2335 ◽

2021 ◽

Vol 11 (4) ◽

pp. 3023-3029

Author(s):

Muhammad Junaid ◽

Luqman Shah ◽

Ali Imran Jehangiri ◽

Fahad Ali Khan ◽

Yousaf Saeed ◽

...

Keyword(s):

Visual Information ◽

Character Recognition ◽

Optical Character Recognition ◽

Good Resolution ◽

Still Images ◽

Still Image ◽

Video Cameras ◽

Video Footage ◽

Video Clips ◽

Human Eyes

With each passing day resolutions of still image/video cameras are on the rise. This amelioration in resolutions has the potential to extract useful information on the view opposite the photographed subjects from their reflecting parts. Especially important is the idea to capture images formed on the eyes of photographed people and animals. The motivation behind this research is to explore the forensic importance of the images/videos to especially analyze the reflections of the background of the camera. This analysis may include extraction/ detection/recognition of the objects in front of the subjects but on the back of the camera. In the national context such videos/photographs are not rare and, specifically speaking, an abductee’s video footage at a good resolution may give some important clues to the identity of the person who kidnapped him/her. Our aim would be to extract visual information formed in human eyes from still images as well as from video clips. After extraction, our next task would be to recognize the extracted visual information. Initially our experiments would be limited on characters’ extraction and recognition, including characters of different styles and font sizes (computerized) as well as hand written. Although varieties of Optical Character Recognition (OCR) tools are available for characters’ extraction and recognition but, the problem is that they only provide results for clear images (zoomed).

Download Full-text

Narrative and philosophical photography focusing on Gilbert Garcin’s works

LAPLAGE EM REVISTA ◽

10.24115/s2446-622020217extra-d1142p.593-599 ◽

2021 ◽

Vol 7 (Extra-D) ◽

pp. 593-599

Author(s):

Alireza Asiaban ◽

Ahmad Ebrahimipour

Keyword(s):

Human Subject ◽

Human Being ◽

Basic Principles ◽

Still Image ◽

Narrative Style ◽

The World

Narrative in photography deals with a subject in which a still image can express a narrative or not. According to the theories expressed by the theorists in this regard, stage photography has emerged, but what is discussed in this research is the narrative style. Now in photography, too, is the expression of the narrator with the philosophical look and idea that manifests itself in the works of Gilbert Garcin. Works that are theoretically close to staged photography but operate in a different form and structure. In his work, Garcin creates a world full of question with photomontage techniques, and by placing himself as the human subject in the photograph, he offers a personal definition that is, of course, Shareable. A world devoid of meaning and a human being trapped in this world, which, like the characters in absurd plays, has a surreal form. Garcin narrates the world with three basic principles: the philosophical spirit, emotions, and meaning - the loss of meaning.

Download Full-text

Skepticism and the Digital Information Environment

SATS ◽

10.1515/sats-2021-0008 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Matthew Carlson

Keyword(s):

Information Environment ◽

The Internet ◽

Digital Information ◽

Popular Imagination ◽

Still Image ◽

The Status ◽

Artificial Intelligence Technology ◽

The Senses ◽

Audio Video ◽

Digital Recordings

Abstract Deepfakes are audio, video, or still-image digital artifacts created by the use of artificial intelligence technology, as opposed to traditional means of recording. Because deepfakes can look and sound much like genuine digital recordings, they have entered the popular imagination as sources of serious epistemic problems for us, as we attempt to navigate the increasingly treacherous digital information environment of the internet. In this paper, I attempt to clarify what epistemic problems deepfakes pose and why they pose these problems, by drawing parallels between recordings and our own senses as sources of evidence. I show that deepfakes threaten to undermine the status of digital recordings as evidence. The existence of deepfakes thus encourages a kind of skepticism about digital recordings that bears important similarities to classic philosophical skepticism concerning the senses. However, the skepticism concerning digital recordings that deepfakes motivate is also importantly different from classical skepticism concerning the senses, and I argue that these differences illuminate some possible strategies for solving the epistemic problems posed by deepfakes.

Download Full-text

OBJECT RECOGNITION FOR AUGMENTED REALITY APPLICATIONS

Azerbaijan Journal of High Performance Computing ◽

10.32010/26166127.2021.4.1.15.28 ◽

2021 ◽

Vol 4 (1) ◽

pp. 15-28

Author(s):

Vladislav Li ◽

◽

Georgios Amponis ◽

Jean-Christophe Nebel ◽

Vasileios Argyriou ◽

...

Keyword(s):

Video Processing ◽

Deep Neural Network ◽

Semantic Information ◽

Super Resolution ◽

Scene Analysis ◽

Analysis Framework ◽

Computing Systems ◽

Still Image ◽

Significant Performance ◽

Network Component

Developments in the field of neural networks, deep learning, and increases in computing systems’ capacity have allowed for a significant performance boost in scene semantic information extraction algorithms and their respective mechanisms. The work presented in this paper investigates the performance of various object classification- recognition frameworks and proposes a novel framework, which incorporates Super-Resolution as a preprocessing method, along with YOLO/Retina as the deep neural network component. The resulting scene analysis framework was fine-tuned and benchmarked using the COCO dataset, with the results being encouraging. The presented framework can potentially be utilized, not only in still image recognition scenarios but also in video processing.

Download Full-text

still image
Recently Published Documents

TOTAL DOCUMENTS

H-INDEX

Speech-Driven Facial Animations Improve Speech-in-Noise Comprehension of Humans

Arts-Aided Recognition of Citizens’ Perceptions for Urban Open Space Management

Speech-driven Facial Animations Improve Speech-in-Noise Comprehension of Humans

Depth and edge auxiliary learning for still image crowd density estimation

Taylor-Series-Based Reconfigurability of Gamma Correction in Hardware Designs

Information Hiding in Still Image Based on Variable Steganography Technique to Achieve High Imperceptibility

Recognition of Images Formed in Pho on the Eyes of different Subjects

Narrative and philosophical photography focusing on Gilbert Garcin’s works

Skepticism and the Digital Information Environment

OBJECT RECOGNITION FOR AUGMENTED REALITY APPLICATIONS

Export Citation Format

still imageRecently Published Documents

TOTAL DOCUMENTS

H-INDEX

Speech-Driven Facial Animations Improve Speech-in-Noise Comprehension of Humans

Arts-Aided Recognition of Citizens’ Perceptions for Urban Open Space Management

Speech-driven Facial Animations Improve Speech-in-Noise Comprehension of Humans

Depth and edge auxiliary learning for still image crowd density estimation

Taylor-Series-Based Reconfigurability of Gamma Correction in Hardware Designs

Information Hiding in Still Image Based on Variable Steganography Technique to Achieve High Imperceptibility

Recognition of Images Formed in Pho on the Eyes of different Subjects

Narrative and philosophical photography focusing on Gilbert Garcin’s works

Skepticism and the Digital Information Environment

OBJECT RECOGNITION FOR AUGMENTED REALITY APPLICATIONS

still image
Recently Published Documents