ACM Transactions on Applied Perception
Latest Publications


Total documents: 441 (last five years: 60)
H-index: 41 (last five years: 3)

Published by the Association for Computing Machinery
ISSN: 1544-3558

2022 · Vol. 19(1) · pp. 1-19
Author(s): Anca Salagean, Jacob Hadnett-Hunter, Daniel J. Finnegan, Alexandra A. De Sousa, Michael J. Proulx

Ultrasonic mid-air haptic technologies, which provide haptic feedback through airwaves produced using ultrasound, could be employed to investigate the sense of body ownership and immersion in virtual reality (VR) by inducing the virtual hand illusion (VHI). Ultrasonic mid-air haptic perception has so far been investigated only for glabrous (hairless) skin, which has higher tactile sensitivity than hairy skin. In contrast, the VHI paradigm typically targets hairy skin without comparisons to glabrous skin. The aim of this article was to investigate illusory body ownership, the applicability of ultrasonic mid-air haptics, and perceived immersion in VR using the VHI. Fifty participants viewed a virtual hand being stroked by a feather, synchronously and asynchronously with ultrasonic stimulation applied to the glabrous skin on the palmar surface and the hairy skin on the dorsal surface of their hands. Questionnaire responses revealed that synchronous stimulation induced a stronger VHI than asynchronous stimulation. In the synchronous conditions, the VHI was stronger for palmar stimulation than for dorsal stimulation. The ultrasonic stimulation was also perceived as more intense on the palmar surface than on the dorsal surface. Perceived immersion was not related to illusory body ownership per se but was enhanced by the provision of synchronous stimulation.


2022 · Vol. 19(1) · pp. 1-18
Author(s): Björn Blissing, Fredrik Bruzelius, Olle Eriksson

Driving simulators are established tools in automotive development and research. Most simulators use either monitors or projectors as their primary display system. However, the emergence of a new generation of head-mounted displays has triggered interest in using these as the primary display type. The general benefits and drawbacks of head-mounted displays are well researched, but their effect on driving behavior in a simulator has not been sufficiently quantified. This article presents a study of differences in driving behavior between projector-based graphics and a head-mounted display in a large dynamic driving simulator. The study used five specific driving maneuvers suspected of affecting driving behavior differently depending on the choice of display technology. Some of these maneuvers were chosen to reveal changes in lateral and longitudinal driving behavior; others were picked for their ability to highlight the benefits and drawbacks of head-mounted displays in a driving context. The results show only minor differences in lateral and longitudinal driving behavior between the projectors and the head-mounted display. The most noticeable difference, in favor of the projectors, was seen when display resolution was critical to the driving task. The choice of display type affected neither simulator sickness nor the realism rated by the subjects.


2022 · Vol. 19(1) · pp. 1-21
Author(s): Wanyu Liu, Michelle Agnes Magalhaes, Wendy E. Mackay, Michel Beaudouin-Lafon, Frédéric Bevilacqua

With the increasing interest in movement sonification and expressive gesture-based interaction, it is important to understand which factors contribute to movement learning, and how. We explore the effects of movement sonification and users’ musical background on motor variability in complex gesture learning. We contribute an empirical study in which musicians and non-musicians learn two gesture sequences over three days, with and without movement sonification. Results show how the interaction effects of these factors unfold over the three-day learning process. For gesture 1, which is fast and dynamic with a direct “action-sound” sonification, movement sonification induces higher variability for both musicians and non-musicians on day 1. While musicians reduce this variability on days 2 and 3 to a level similar to the no-auditory-feedback condition, non-musicians continue to show significantly higher variability. Across the three days, musicians also show significantly lower variability than non-musicians. For gesture 2, which is slow and smooth with an “action-music” metaphor, there are virtually no effects. Based on these findings, we recommend that future studies take participants’ musical background into account, consider longitudinal designs to examine these effects on complex gestures, and interpret results with awareness of the specific design of gesture and sound.
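
Motor variability in this context is a dispersion measure over repeated trials of the same gesture. As a generic illustration only (not necessarily the metric used in the study), one common operationalization time-normalizes each trial and averages the pointwise standard deviation across trials; the sketch below assumes trials recorded as (samples × dimensions) arrays:

```python
# Generic sketch of an across-trial motor-variability measure; the study's
# exact metric may differ. Each trial is a (samples, dims) array, e.g. x/y/z.
import numpy as np

def motor_variability(trials, n_points=100):
    """Resample each trial to n_points, then average the pointwise
    standard deviation across trials (lower = more consistent motion)."""
    resampled = []
    for t in trials:
        t = np.asarray(t, dtype=float)
        old = np.linspace(0.0, 1.0, len(t))
        new = np.linspace(0.0, 1.0, n_points)
        # Interpolate each movement dimension onto a common time base.
        resampled.append(np.column_stack(
            [np.interp(new, old, t[:, d]) for d in range(t.shape[1])]))
    stack = np.stack(resampled)            # (trials, n_points, dims)
    return float(np.std(stack, axis=0).mean())
```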


2022 · Vol. 19(1) · pp. 1-17
Author(s): Hye Ji Kim, Michael Neff, Sung-Hee Lee

Laban Movement Analysis (LMA) and its Effort element provide a conceptual framework through which we can observe, describe, and interpret the intention of movement. Effort attributes provide a link between how people move and how their movement communicates to others. It is crucial to investigate the perceptual characteristics of Effort to validate whether it can serve as an effective framework for applications in animation and robotics that require a system for creating or perceiving expressive variation in motion. To this end, we first constructed an Effort motion database of short video clips of five different motions (walk, sit down, pass, put, and wave), each performed in eight ways corresponding to the extremes of the Effort elements. We then performed a perceptual evaluation to examine the perceptual consistency and perceived associations among the Effort elements appearing in the motion stimuli: Space (Indirect/Direct), Time (Sustained/Sudden), Weight (Light/Strong), and Flow (Free/Bound). The results of the perceptual consistency evaluation indicate that although observers do not always perceive the LMA Effort elements as intended, true response rates exceed false response rates for seven of the eight Effort elements, the exception being light Effort. The perceptual consistency results also varied by motion. The perceptual-association analysis showed that a single LMA Effort element tends to co-occur with elements of other factors, correlating significantly with one or two of them (e.g., indirect and free, light and free).


2021 · Vol. 18(4) · pp. 1-15
Author(s): Jonathan Ehret, Andrea Bönsch, Lukas Aspöck, Christine T. Röhr, Stefan Baumann, ...

For conversational agents’ speech, either all possible sentences have to be prerecorded by voice actors, or the required utterances can be synthesized. While synthesizing speech is more flexible and economical in production, it can also reduce the perceived naturalness of the agents, among other reasons due to mistakes at various linguistic levels. In this article, we are interested in the impact of adequate and inadequate prosody, here particularly in terms of accent placement, on the perceived naturalness and aliveness of the agents. We compare (1) inadequate prosody, as generated by off-the-shelf text-to-speech (TTS) engines with synthetic output; (2) the same inadequate prosody imitated by trained human speakers; and (3) adequate prosody produced by those speakers. The speech was presented either as audio only or by embodied, anthropomorphic agents, to investigate the potential masking effect of a simultaneous visual representation of those virtual agents. To this end, we conducted an online study with 40 participants listening to four different dialogues, each presented in the three Speech levels and the two Embodiment levels. Results confirmed that adequate prosody in human speech is perceived as more natural (and the agents as more alive) than inadequate prosody in both human (2) and synthetic speech (1). Thus, it is not sufficient to simply use a human voice for an agent’s speech to be perceived as natural; what is decisive is whether the prosodic realisation is adequate or not. Furthermore, and surprisingly, we found no masking effect of speaker embodiment: neither a human voice with inadequate prosody nor a synthetic voice was judged as more natural when a virtual agent was visible compared to the audio-only condition. On the contrary, the human voice was even judged as less “alive” when accompanied by a virtual agent. In sum, our results emphasize, on the one hand, the importance of adequate prosody for perceived naturalness, especially in terms of accents being placed on important words in the phrase, while showing, on the other hand, that the embodiment of virtual agents plays a minor role in the naturalness ratings of voices.
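
Accent placement of the kind manipulated here can, in principle, be scripted for off-the-shelf TTS engines through SSML’s emphasis and prosody tags. The sketch below is purely illustrative (it is not the markup or engine used in the study, and engine support for these tags varies by vendor):

```python
# Illustrative only: SSML that places a pitch accent on a contextually
# important word. Support for <emphasis>/<prosody> varies by TTS vendor,
# and this is not the markup used in the study.
ssml = """<speak>
  I did not say he stole the <emphasis level="strong">money</emphasis>.
  <prosody pitch="+15%" rate="90%">He</prosody> said it.
</speak>"""

# A typical cloud TTS request would pass this string as SSML input;
# the exact parameter names differ between vendors and are not assumed here.
print(ssml)
```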


2021 · Vol. 18(3) · pp. 1-18
Author(s): Anas Ali Alkasasbeh, Fotios Spyridonis, Gheorghita Ghinea

Current authentication processes overwhelmingly rely on audiovisual data, comprising images, text, or audio. However, the use of olfactory data (scents) has remained unexploited in the authentication process, notwithstanding its verified potential to act as a cue for information recall. Accordingly, in this paper, a new authentication process is proposed in which olfactory media are used as cues in the login phase. To this end, PassSmell, a proof-of-concept authentication application, is developed in which words and olfactory media act as passwords and olfactory passwords, respectively. In order to evaluate the potential of PassSmell, two different versions were developed, namely one which was olfactory-enhanced and another which did not employ olfactory media. Forty-two participants were invited to take part in the experiment, evenly split into a control and an experimental group. For assessment purposes, we recorded the time taken to log on as well as the number of failed/successful login attempts; we also asked users to complete a Quality of Experience (QoE) questionnaire. In terms of time taken, a significant difference was found between the experimental and the control groups, as determined by an independent-samples t-test. Similar results were found with respect to average scores and the number of successful attempts. Regarding user QoE, having olfactory media with words influenced the users positively, emphasizing the potential of this kind of authentication application in the future.
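
The group comparison reported here is a standard independent-samples t-test. A minimal sketch of that analysis on made-up login-time data follows (scipy.stats.ttest_ind is a standard implementation; the numbers are illustrative, not the study’s data):

```python
# Hedged sketch: independent-samples t-test on hypothetical login times.
# The values below are made up for illustration; they are NOT the study's data.
from scipy import stats

# Hypothetical login durations in seconds, 21 participants per group.
control_times = [12.4, 15.1, 11.8, 14.2, 13.5, 16.0, 12.9, 14.8, 13.1,
                 15.6, 12.2, 14.0, 13.7, 15.9, 12.6, 14.4, 13.3, 15.2,
                 12.8, 14.6, 13.9]
olfactory_times = [10.1, 12.3, 9.8, 11.5, 10.9, 12.8, 10.4, 11.9, 10.6,
                   12.5, 9.9, 11.2, 10.8, 12.6, 10.2, 11.7, 10.5, 12.1,
                   10.3, 11.8, 11.0]

# Two-sided independent-samples t-test (equal variances assumed;
# pass equal_var=False for Welch's correction).
t_stat, p_value = stats.ttest_ind(control_times, olfactory_times)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```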


2021 · Vol. 18(3) · pp. 1-19
Author(s): Benjamin Bressolette, Sébastien Denjean, Vincent Roussarie, Mitsuko Aramaki, Sølvi Ystad, ...

Most recent vehicles are equipped with touchscreens, which replace arrays of buttons that control secondary driving functions, such as temperature level, strength of ventilation, GPS, or choice of radio stations. Manipulating such interfaces while driving can be problematic in terms of safety, because they demand the driver’s visual attention. In this article, we develop an innovative interface, MovEcho, which is controlled with gestures and provides sounds as informational feedback. We compare this interface to a touchscreen in a perceptual experiment that took place in a driving simulator. The results show that MovEcho allows for better completion of a traffic-related visual task and is preferred by the participants. These promising simulator results now have to be confirmed in future studies in a real vehicle, with comparable user expertise for both interfaces.


2021 · Vol. 18(3) · pp. 1-22
Author(s): Charlotte M. Reed, Hong Z. Tan, Yang Jiao, Zachary D. Perez, E. Courtenay Wilson

Stand-alone devices for tactile speech reception serve a need as communication aids for persons with profound sensory impairments, as well as in applications such as human-computer interfaces and remote communication when the normal auditory and visual channels are compromised or overloaded. The current research is concerned with perceptual evaluations of a phoneme-based tactile speech communication device in which a unique tactile code was assigned to each of the 24 consonants and 15 vowels of English. The tactile phonemic display was conveyed through an array of 24 tactors that stimulated the dorsal and ventral surfaces of the forearm. Experiments examined the recognition of individual words as a function of the inter-phoneme interval (Study 1) and of two-word phrases as a function of the inter-word interval (Study 2). Following an average training period of 4.3 hours on phoneme and word recognition tasks, mean scores for the recognition of individual words in Study 1 decreased from 87.7% correct to 74.3% correct as the inter-phoneme interval decreased from 300 to 0 ms. In Study 2, following an average of 2.5 hours of training on the two-word phrase task, both words in the phrase were identified with an accuracy of 75% correct using an inter-word interval of 1 s and an inter-phoneme interval of 150 ms. Effective transmission rates achieved on this task were estimated to be on the order of 30 to 35 words/min.
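
As a rough consistency check on that rate, a back-of-the-envelope sketch follows. The inter-word and inter-phoneme intervals are the reported values; the phonemes-per-word count and per-phoneme signal duration are assumptions made for illustration, not values taken from the study:

```python
# Back-of-the-envelope estimate of the tactile transmission rate.
# ASSUMPTIONS (not from the study): ~4 phonemes per word, ~200 ms of
# tactor signal per phoneme. The two intervals are the reported values.
phonemes_per_word = 4          # assumed average
phoneme_signal_s = 0.200       # assumed signal duration per phoneme
inter_phoneme_s = 0.150        # reported inter-phoneme interval (Study 2)
inter_word_s = 1.0             # reported inter-word interval (Study 2)

# Duration of one word: the phoneme signals plus the gaps between them.
word_s = (phonemes_per_word * phoneme_signal_s
          + (phonemes_per_word - 1) * inter_phoneme_s)

# A two-word phrase: two words plus one inter-word gap.
phrase_s = 2 * word_s + inter_word_s

words_per_min = 2 * 60 / phrase_s
print(f"~{words_per_min:.0f} words/min")  # ~34, in line with the 30-35 reported
```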


2021 · Vol. 18(3) · pp. 1-21
Author(s): Hui Wei, Jingmeng Li

The edges of an image contain rich visual cognitive cues. However, the edge information of a natural scene is usually just a set of unorganized pixels to a computer. In psychology, the phenomenon of quickly perceiving global information from a complex pattern is called the global precedence effect (GPE). For example, when one observes the edge map of an image, some contours seem to automatically “pop out” from the complex background. This is a manifestation of the GPE on edge information and is called global contour precedence (GCP). The primary visual cortex (V1) is closely related to the processing of edges. In this article, a neural computational model to simulate GCP based on the mechanisms of V1 is presented. The proposed model has three layers: the representation of line segments, the organization of edges, and the perception of global contours. In experiments, the ability to group edges is tested on the public dataset BSDS500. The results show that the grouping performance, robustness, and time cost of the proposed model are superior to those of other methods. In addition, the outputs of the proposed model can be applied to the generation of object proposals, which indicates that the proposed model can contribute significantly to high-level visual tasks.
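
The paper’s model itself is built on V1 mechanisms; as a much simpler illustration of the underlying grouping principle (good continuation), the sketch below greedily chains edge pixels whose local orientations change smoothly. It is a generic baseline, not the authors’ algorithm; edge_map and orientation are assumed inputs (a binary edge mask and per-pixel edge orientations in radians, modulo pi):

```python
# Generic illustration of contour grouping by good continuation; this is
# NOT the authors' V1 model, just a minimal greedy baseline using NumPy.
import numpy as np

def group_edges(edge_map, orientation, max_turn=np.pi / 6):
    """Greedily chain 8-connected edge pixels whose local orientations
    differ by less than max_turn, yielding one contour per chain."""
    h, w = edge_map.shape
    visited = np.zeros((h, w), dtype=bool)
    contours = []
    for y, x in zip(*np.nonzero(edge_map)):
        if visited[y, x]:
            continue
        chain = [(y, x)]
        visited[y, x] = True
        cy, cx = y, x
        while True:
            best = None
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and edge_map[ny, nx] and not visited[ny, nx]):
                        # Smallest orientation change wins (good continuation).
                        turn = abs(orientation[ny, nx] - orientation[cy, cx])
                        turn = min(turn, np.pi - turn)  # orientations are mod pi
                        if turn < max_turn and (best is None or turn < best[0]):
                            best = (turn, ny, nx)
            if best is None:
                break
            _, cy, cx = best
            visited[cy, cx] = True
            chain.append((cy, cx))
        contours.append(chain)
    return contours
```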


2021 · Vol. 18(2) · pp. 1-16
Author(s): Holly C. Gagnon, Carlos Salas Rosales, Ryan Mileris, Jeanine K. Stefanucci, Sarah H. Creem-Regehr, ...

Augmented reality (AR) is important for training complex tasks, such as navigation, assembly, and medical procedures. The effectiveness of such training may depend on accurate spatial localization of AR objects in the environment. This article presents two experiments that test egocentric distance perception in augmented reality within and at the boundaries of action space (up to 35 m), in comparison with distance perception in a matched real-world (RW) environment. Using the Microsoft HoloLens, in Experiment 1, participants in two different RW settings judged egocentric distances (ranging from 10 to 35 m) to an AR avatar or a real person using a visual matching measure. Distances to augmented targets were underestimated compared to real targets in the two indoor RW contexts. Experiment 2 aimed to generalize the results to an absolute distance measure using verbal reports in one of the indoor environments. As in Experiment 1, distances to augmented targets were underestimated compared to real targets. We discuss these findings with respect to the importance of methodologies that directly compare performance in real and mediated environments, as well as the inherent differences present in mediated environments that are “matched” to the real world.

