still image
Recently Published Documents


TOTAL DOCUMENTS

458
(FIVE YEARS 104)

H-INDEX

25
(FIVE YEARS 5)

2022 ◽  
Vol 15 ◽  
Author(s):  
Enrico Varano ◽  
Konstantinos Vougioukas ◽  
Pingchuan Ma ◽  
Stavros Petridis ◽  
Maja Pantic ◽  
...  

Understanding speech becomes a demanding task when the environment is noisy. Comprehension of speech in noise can be substantially improved by looking at the speaker’s face, and this audiovisual benefit is even more pronounced in people with hearing impairment. Recent advances in AI have made it possible to synthesize photorealistic talking faces from a speech recording and a still image of a person’s face in an end-to-end manner. However, it has remained unknown whether such facial animations improve speech-in-noise comprehension. Here we consider facial animations produced by a recently introduced generative adversarial network (GAN), and show that humans cannot distinguish between the synthesized and the natural videos. Importantly, we then show that the end-to-end synthesized videos significantly aid humans in understanding speech in noise, although the natural facial motions yield an even higher audiovisual benefit. We further find that an audiovisual speech recognizer (AVSR) benefits from the synthesized facial animations as well. Our results suggest that synthesizing facial motions from speech can be used to aid speech comprehension in difficult listening environments.


2021 ◽  
Vol 14 (1) ◽  
pp. 135
Author(s):  
Sari Suomalainen ◽  
Helena Kahiluoto ◽  
Anne Pässilä ◽  
Allan Owens ◽  
Clive Holtham

Urban open spaces of local natural environments can promote the health and well-being of both ecosystems and humans, and the management of urban spaces can benefit from knowledge of citizens’ perceptions of such environments. However, such knowledge is scarce, and contemporary inquiries are often limited to cognitive observations and focused on built environmental elements rather than designed to recognize and communicate comprehensive perceptions. This paper investigates whether arts-based methods can facilitate the recognition and understanding of perceptions of urban open spaces. Two arts-based methods were used to capture perceptions: drifting, which is a walking method, and theatrical images, which is a still image method. Three reflective methods were used to recognize and communicate the perceptions. The results show the sensations and perceptions enabled by the arts-based methods and compare them with those from a sticker map method. The main findings were perceptions that included information about human–environment interaction, about relations to other people and about ‘sense of place’ in urban open spaces. The hitherto unidentified perceptions of urban open space were associations, metaphors and memories. The methods used offer initial practical implications for future use.


2021 ◽  
Author(s):  
Enrico Varano ◽  
Konstantinos Vougioukas ◽  
Pingchuan Ma ◽  
Stavros Petridis ◽  
Maja Pantic ◽  
...  

Understanding speech becomes a demanding task when the environment is noisy. Comprehension of speech in noise can be substantially improved by looking at the speaker's face, and this audiovisual benefit is even more pronounced in people with hearing impairment. Recent advances in AI have made it possible to synthesize photorealistic talking faces from a speech recording and a still image of a person's face in an end-to-end manner. However, it has remained unknown whether such facial animations improve speech-in-noise comprehension. Here we consider facial animations produced by a recently introduced generative adversarial network (GAN), and show that humans cannot distinguish between the synthesized and the natural videos. Importantly, we then show that the end-to-end synthesized videos significantly aid humans in understanding speech in noise, although the natural facial motions yield an even higher audiovisual benefit. We further find that an audiovisual speech recognizer benefits from the synthesized facial animations as well. Our results suggest that synthesizing facial motions from speech can be used to aid speech comprehension in difficult listening environments.


Author(s):  
Sifan Peng ◽  
Baoqun Yin ◽  
Xiaoliang Hao ◽  
Qianqian Yang ◽  
Aakash Kumar ◽  
...  

Electronics ◽  
2021 ◽  
Vol 10 (16) ◽  
pp. 1959
Author(s):  
Dat Ngo ◽  
Bongsoon Kang

Gamma correction is a common image processing technique in video and still image systems. However, this simple and efficient method is typically expressed using the power law, which creates practical difficulties in designing a reconfigurable hardware implementation. For example, the conventional approach calculates all possible outputs for a predetermined gamma value and hardwires this information into memory components. As a result, reconfigurability is unattainable after deployment. This study proposes approximating gamma correction with the Taylor series to overcome this problem, thereby facilitating post-deployment reconfigurability of the hardware implementation. In other words, the gamma value is freely adjustable, which makes the approach well suited to offloading gamma correction onto dedicated hardware in system-on-a-chip applications. Finally, the proposed hardware implementation is verified on the Zynq UltraScale+ MPSoC ZCU106 Evaluation Kit, and the results demonstrate its superiority over benchmark designs.
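
To illustrate the general idea of replacing a fixed lookup table with a series whose coefficients depend on gamma, here is a minimal software sketch. It is not the authors' hardware design: the expansion point (x0 = 1), the number of terms, and the function name `gamma_taylor` are assumptions made for illustration only. The sketch uses the Taylor (binomial) series of the power law about 1, which converges for 0 < x < 2 and becomes slow as x approaches 0.

```python
def gamma_taylor(x, gamma, terms=8):
    """Approximate x**gamma with a truncated Taylor (binomial) series about x0 = 1.

    Assumption: the expansion point and term count are illustrative choices,
    not taken from the paper. Accuracy degrades as x approaches 0.
    """
    u = x - 1.0      # offset from the expansion point
    coeff = 1.0      # generalized binomial coefficient C(gamma, 0)
    power = 1.0      # u**k, built up incrementally
    total = 0.0
    for k in range(terms):
        total += coeff * power
        # Recurrence: C(gamma, k+1) = C(gamma, k) * (gamma - k) / (k + 1)
        coeff *= (gamma - k) / (k + 1)
        power *= u
    return total


if __name__ == "__main__":
    x = 0.8  # normalized pixel intensity
    # gamma can change at run time; no per-gamma table has to be rebuilt
    for g in (0.45, 1.0, 2.2):
        print(g, gamma_taylor(x, g), x ** g)
```

Because the coefficients are derived from gamma by a short recurrence, changing gamma only changes a handful of multipliers rather than an entire precomputed table, which is the property the abstract describes as post-deployment reconfigurability.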


Author(s):  
Rakan Mohammed Rashid ◽  
Baker Khalid Baker ◽  
Omar Farook Mohammad ◽  
Falah Y H Ahmed

2021 ◽  
Vol 11 (4) ◽  
pp. 3023-3029
Author(s):  
Muhammad Junaid ◽  
Luqman Shah ◽  
Ali Imran Jehangiri ◽  
Fahad Ali Khan ◽  
Yousaf Saeed ◽  
...  

The resolutions of still image and video cameras rise with each passing day. This improvement in resolution makes it possible to extract useful information about the view facing the photographed subjects from their reflective parts. Especially important is the idea of capturing images formed on the eyes of photographed people and animals. The motivation behind this research is to explore the forensic importance of images and videos, in particular by analyzing reflections of the scene behind the camera. This analysis may include extraction, detection, and recognition of objects that are in front of the subjects but behind the camera. In the national context such videos and photographs are not rare and, specifically, an abductee’s video footage at a good resolution may give important clues to the identity of the kidnapper. Our aim is to extract visual information formed in human eyes from still images as well as from video clips. After extraction, our next task is to recognize the extracted visual information. Initially our experiments are limited to character extraction and recognition, including computer-generated characters of different styles and font sizes as well as handwritten characters. Although a variety of Optical Character Recognition (OCR) tools are available for character extraction and recognition, they only provide results for clear (zoomed) images.


2021 ◽  
Vol 7 (Extra-D) ◽  
pp. 593-599
Author(s):  
Alireza Asiaban ◽  
Ahmad Ebrahimipour

Narrative in photography concerns whether a still image can express a narrative. Staged photography has emerged from the theories put forward on this question, but what this research discusses is narrative style. In photography, too, the narrator's voice can carry a philosophical outlook and idea, as manifested in the works of Gilbert Garcin. His works are theoretically close to staged photography but operate in a different form and structure. In his work, Garcin creates a world full of questions with photomontage techniques, and by placing himself as the human subject in the photograph, he offers a personal definition that is nonetheless shareable. It is a world devoid of meaning with a human being trapped in it, which, like the characters in absurdist plays, has a surreal form. Garcin narrates the world with three basic principles: the philosophical spirit, emotions, and meaning (the loss of meaning).


SATS ◽  
2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Matthew Carlson

Deepfakes are audio, video, or still-image digital artifacts created by the use of artificial intelligence technology, as opposed to traditional means of recording. Because deepfakes can look and sound much like genuine digital recordings, they have entered the popular imagination as sources of serious epistemic problems for us, as we attempt to navigate the increasingly treacherous digital information environment of the internet. In this paper, I attempt to clarify what epistemic problems deepfakes pose and why they pose these problems, by drawing parallels between recordings and our own senses as sources of evidence. I show that deepfakes threaten to undermine the status of digital recordings as evidence. The existence of deepfakes thus encourages a kind of skepticism about digital recordings that bears important similarities to classic philosophical skepticism concerning the senses. However, the skepticism concerning digital recordings that deepfakes motivate is also importantly different from classical skepticism concerning the senses, and I argue that these differences illuminate some possible strategies for solving the epistemic problems posed by deepfakes.


2021 ◽  
Vol 4 (1) ◽  
pp. 15-28
Author(s):  
Vladislav Li ◽  
Georgios Amponis ◽  
Jean-Christophe Nebel ◽  
Vasileios Argyriou ◽  
...  

Developments in neural networks and deep learning, together with increases in computing systems’ capacity, have allowed for a significant performance boost in algorithms that extract semantic information from scenes. The work presented in this paper investigates the performance of various object classification and recognition frameworks and proposes a novel framework that incorporates Super-Resolution as a preprocessing method, along with YOLO/Retina as the deep neural network component. The resulting scene analysis framework was fine-tuned and benchmarked using the COCO dataset, with encouraging results. The presented framework can potentially be used not only in still image recognition scenarios but also in video processing.

