Proceedings of the 25th International Conference on Auditory Display (ICAD 2019)
Latest Publications

Total documents: 51 (five years: 51)
H-index: 2 (five years: 2)
Published by: Department of Computer and Information Sciences, Northumbria University (0967090466)

Author(s): Keenan R. May, Briana Sobel, Jeff Wilson, Bruce N. Walker

In both extreme and everyday situations, humans need to find nearby objects that cannot be located visually. In such situations, auditory display technology could be used to display information supporting object targeting. Unfortunately, spatial audio inadequately conveys sound source elevation, which is crucial for locating objects in 3D space. To address this, three auditory display concepts were developed and evaluated in the context of finding objects within a virtual room, in either low or no visibility conditions: (1) a one-time height-denoting “area cue,” (2) ongoing “proximity feedback,” or (3) both. All three led to improvements in performance and subjective workload compared to no sound. Displays (2) and (3) led to the largest improvements. This pattern was smaller, but still present, when visibility was low, compared to no visibility. These results indicate that persons who need to locate nearby objects in limited visibility conditions could benefit from the types of auditory displays considered here.
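As a rough illustration of how a "proximity feedback" display of this kind could be parameterized, the sketch below maps distance to the target onto beep rate and pitch. The distance range, frequencies, and repetition rates are assumptions for illustration, not the parameters used in the study.

```python
# Hypothetical sketch of a proximity-feedback mapping: distance to the
# target controls beep pitch and repetition rate. All parameter values
# are illustrative, not taken from the paper.
import numpy as np

SR = 44100  # sample rate (Hz)

def proximity_beeps(distance_m, duration_s=1.0,
                    min_d=0.1, max_d=3.0,
                    f_near=880.0, f_far=220.0,
                    rate_near=12.0, rate_far=2.0):
    """Synthesize beeps whose pitch and repetition rate rise as the
    listener's hand approaches the target."""
    # Normalize distance to [0, 1], where 0 means "at the target".
    t = np.clip((distance_m - min_d) / (max_d - min_d), 0.0, 1.0)
    freq = f_near + t * (f_far - f_near)            # closer -> higher pitch
    rate = rate_near + t * (rate_far - rate_near)   # closer -> faster beeps
    n = int(SR * duration_s)
    time = np.arange(n) / SR
    tone = 0.3 * np.sin(2 * np.pi * freq * time)
    gate = (np.sin(2 * np.pi * rate * time) > 0).astype(float)  # on/off beeping
    return tone * gate
```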


Author(s): Rébecca Kleinberger, George Stefanakis, Sebastian Franjou

Changing the way one hears one’s own voice, for instance by adding delay or shifting the pitch in real time, can alter vocal qualities such as speed, pitch contour, or articulation. We created new types of auditory feedback called Speech Companions that generate live musical accompaniment to the spoken voice. Our system generates harmonized chorus effects layered on top of the speaker’s voice that change chord at each pseudo-beat detected in the spoken voice. The harmonization variations follow predetermined chord progressions. For the purpose of this study we generated two versions: one following a major chord progression and the other following a minor chord progression. We evaluated the effects of the feedback on speakers and present initial findings on how different musical modulations might affect the speaker’s emotions and mental state, as well as the semantic content of speech and musical vocal parameters.
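A minimal sketch of the chord-following idea, assuming one plausible realization: at each detected pseudo-beat the accompaniment advances to the next chord of a fixed major or minor progression, and the live voice is layered with pitch-shifted copies at that chord's intervals. The progressions and interval choices below are illustrative, not the authors' exact settings.

```python
# Illustrative chord-following harmonization state machine (not the paper's code).
MAJOR_PROGRESSION = [[0, 4, 7], [5, 9, 12], [7, 11, 14], [0, 4, 7]]   # I-IV-V-I
MINOR_PROGRESSION = [[0, 3, 7], [5, 8, 12], [7, 10, 14], [0, 3, 7]]   # i-iv-v-i

def semitones_to_ratio(st):
    """Convert a pitch shift in semitones to a frequency ratio."""
    return 2.0 ** (st / 12.0)

class SpeechCompanion:
    def __init__(self, progression):
        self.progression = progression
        self.step = 0

    def on_pseudo_beat(self):
        """Advance to the next chord when a pseudo-beat is detected in the voice."""
        self.step = (self.step + 1) % len(self.progression)

    def harmony_ratios(self):
        """Pitch-shift ratios to apply to the live voice for the current chord."""
        return [semitones_to_ratio(st) for st in self.progression[self.step]]

# Usage: feed harmony_ratios() into a real-time pitch shifter (not shown)
# every time the beat tracker fires on_pseudo_beat().
companion = SpeechCompanion(MAJOR_PROGRESSION)
companion.on_pseudo_beat()
print(companion.harmony_ratios())
```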


Author(s): Yliess Hati, Francis Rousseaux, Clément Duhart

Personal assistants are becoming more pervasive in our environments but still do not provide natural interactions. Their lack of realism in terms of expressiveness and their lack of visual feedback can create frustrating experiences and make users lose patience. In this sense, we propose an end-to-end trainable neural architecture for text-driven 3D mouth animations. Previous work showed that such architectures provide better realism and could open the door for integrated affective Human-Computer Interfaces (HCI). Our study shows that this visual feedback significantly improves comfort for 78% of the candidates while slightly improving their time perception.
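For readers unfamiliar with this class of model, the PyTorch sketch below shows the general shape of a text-driven mouth-animation network: token embedding, a recurrent encoder, and a head regressing per-frame mouth blendshape weights. The layer types, sizes, and blendshape count are assumptions for illustration, not the architecture proposed in the paper.

```python
# A hypothetical text-to-mouth-animation model skeleton (not the paper's architecture).
import torch
import torch.nn as nn

class TextToMouth(nn.Module):
    def __init__(self, vocab_size=40, embed_dim=64,
                 hidden_dim=128, n_blendshapes=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)      # phoneme/char ids -> vectors
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_blendshapes)      # per-frame mouth pose

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer phoneme/character ids
        x = self.embed(tokens)
        h, _ = self.encoder(x)
        # One set of 3D mouth blendshape weights per input token/frame, in [0, 1].
        return torch.sigmoid(self.head(h))

model = TextToMouth()
dummy = torch.randint(0, 40, (1, 20))   # a 20-token utterance
mouth_frames = model(dummy)             # shape (1, 20, 16)
```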


Author(s): Richard Savery, Madhukesh Ayyagari, Keenan R. May, Bruce N. Walker

We present multiple approaches to soccer sonification, focusing on enhancing the experience for a general audience. For this work, we developed our own soccer data set through computer vision analysis of footage from a tactical overhead camera. This data set included X, Y coordinates for the ball and players throughout, as well as passes, steals, and goals. After a divergent creation process, we developed four main methods of sports sonification for entertainment. For the Tempo Variation and Pitch Variation methods, tempo or pitch is operationalized to convey ball and player movement data. The Key Moments method features only pass, steal, and goal data, while the Musical Moments method takes existing music and attempts to align the track with important data points. Evaluation was done using a combination of qualitative focus groups and quantitative surveys, with 36 participants completing hour-long sessions. Results indicated an overall preference for the Pitch Variation and Musical Moments methods, and revealed a robust trade-off between usability and enjoyability.
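To make the parameter-mapping idea concrete, here is a minimal sketch in the spirit of the Pitch Variation method: the ball's field position is mapped to a musical pitch, one short tone per tracked frame. The field dimensions, pitch range, and frame duration are assumptions, not the authors' settings.

```python
# Illustrative "position -> pitch" sonification of ball-tracking data (assumed values).
import numpy as np

SR = 44100
FIELD_LENGTH = 105.0  # meters (assumed pitch length)

def x_to_frequency(x, f_low=220.0, f_high=880.0):
    """Map the ball's x coordinate (0..field length) to a frequency."""
    t = np.clip(x / FIELD_LENGTH, 0.0, 1.0)
    return f_low * (f_high / f_low) ** t   # exponential mapping for perceptually even steps

def sonify_ball_track(xs, frame_s=0.1):
    """Render a tone sequence, one short tone per tracked ball position."""
    n = int(SR * frame_s)
    time = np.arange(n) / SR
    return np.concatenate(
        [0.3 * np.sin(2 * np.pi * x_to_frequency(x) * time) for x in xs])

audio = sonify_ball_track([10.0, 30.0, 52.5, 80.0, 100.0])
```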


Author(s): Thomas Hermann, Marian Weger

We introduce Auditory Contrast Enhancement (ACE) as a technique to enhance sounds, given a collection of sound or sonification examples that belong to different classes, such as sounds of machines with and without a certain malfunction, or medical data sonifications for different pathologies/conditions. A frequent use case in inductive data mining is the discovery of patterns by which such groups can be discerned, to guide subsequent paths for modelling and feature extraction. ACE provides researchers with a set of methods to render focussed auditory perspectives that accentuate inter-group differences and in turn also enhance intra-group similarity; i.e., it warps sounds so that our built-in human metrics for assessing differences between sounds are better aligned with the systematic differences between sounds belonging to different classes. We unfold and detail the concept along three lines: temporal, spectral, and spectrotemporal auditory contrast enhancement, and we demonstrate their performance on given sound and sonification collections.
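The toy sketch below illustrates the spectral flavour of the idea under strong simplifying assumptions (a single FFT per sound, two classes of equal-length signals, a hypothetical gain factor `alpha`): frequency bins where the two classes' average magnitude spectra differ most are boosted, so between-class differences become easier to hear. The paper's actual temporal, spectral, and spectrotemporal methods are more elaborate.

```python
# Simplified, hypothetical spectral contrast enhancement (not the authors' implementation).
import numpy as np

def mean_spectrum(sounds):
    """Average magnitude spectrum over a list of equal-length signals."""
    return np.mean([np.abs(np.fft.rfft(s)) for s in sounds], axis=0)

def contrast_enhance(sound, class_a, class_b, alpha=4.0):
    """Boost the bins in which class_a and class_b differ most on average.
    class_a/class_b: lists of example signals with the same length as `sound`."""
    diff = np.abs(mean_spectrum(class_a) - mean_spectrum(class_b))
    gain = 1.0 + alpha * diff / (diff.max() + 1e-12)   # emphasize discriminative bins
    spec = np.fft.rfft(sound) * gain
    return np.fft.irfft(spec, n=len(sound))
```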


Author(s): Pontus Larsson, Justyna Maculewicz, Johan Fagerlönn, Max Lachmann

This position paper discusses vital challenges related to the user experience design in unsupervised, highly automated cars. These challenges are: (1) how to avoid motion sickness, (2) how to ensure users’ trust in the automation, (3) how to ensure usability and support the formation of accurate mental models of the automation system, and (4) how to provide a pleasant and enjoyable experience. We argue that auditory displays have the potential to help solve these issues. While auditory displays in modern vehicles typically make use of discrete and salient cues, we argue that less intrusive, continuous sonic interaction could be more beneficial for the user experience.


Author(s): Ivica Ico Bukvic, Gregory Earle, Disha Sardana, Woohun Joo

The Spatial Audio Data Immersive Experience (SADIE) project aims to identify new foundational relationships pertaining to human spatial aural perception, and to validate existing relationships. Our infrastructure consists of an intuitive interaction interface, an immersive exocentric sonification environment, and a layer-based amplitude-panning algorithm. Here we highlight the system’s unique capabilities and provide findings from an initial externally funded study that focuses on the assessment of human aural spatial perception capacity. When compared to the existing body of literature focusing on egocentric spatial perception, our data show that an immersive exocentric environment enhances spatial perception, and that the physical implementation using high density loudspeaker arrays enables significantly improved spatial perception accuracy relative to the egocentric and virtual binaural approaches. The preliminary observations suggest that human spatial aural perception capacity in real-world-like immersive exocentric environments that allow for head and body movement is significantly greater than in egocentric scenarios where head and body movement is restricted. Therefore, in the design of immersive auditory displays, the use of immersive exocentric environments is advised. Further, our data identify a significant gap between physical and virtual human spatial aural perception accuracy, which suggests that further development of virtual aural immersion may be necessary before such an approach may be seen as a viable alternative.
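As background on what a layer-based amplitude-panning scheme generally involves, the sketch below assumes loudspeakers grouped into elevation "layers": the source is cross-faded between the two nearest layers and, within each layer, panned between the two loudspeakers bracketing its azimuth. The geometry, gain law, and function names are illustrative assumptions, not the SADIE implementation.

```python
# Hypothetical, simplified layer-based amplitude panning (not the SADIE algorithm).
import numpy as np

def in_layer_gains(az, speaker_azimuths):
    """Constant-power pan between the two loudspeakers that bracket `az`
    within one elevation layer (azimuths in degrees, sorted ascending)."""
    gains = np.zeros(len(speaker_azimuths))
    idx = int(np.searchsorted(speaker_azimuths, az))
    lo, hi = max(idx - 1, 0), min(idx, len(speaker_azimuths) - 1)
    if lo == hi:                      # source outside the layer's azimuth range
        gains[lo] = 1.0
        return gains
    t = (az - speaker_azimuths[lo]) / (speaker_azimuths[hi] - speaker_azimuths[lo])
    gains[lo], gains[hi] = np.cos(t * np.pi / 2), np.sin(t * np.pi / 2)
    return gains

def layer_based_pan(az, el, lower_el, upper_el, lower_azs, upper_azs):
    """Cross-fade between two adjacent elevation layers and pan within each
    by azimuth; returns per-loudspeaker gain arrays for both layers."""
    w = np.clip((el - lower_el) / (upper_el - lower_el), 0.0, 1.0)
    return ((1.0 - w) * in_layer_gains(az, lower_azs),
            w * in_layer_gains(az, upper_azs))

# Example: a source at 30° azimuth, 20° elevation, between an ear-level layer
# and a 45° height layer of loudspeakers.
low, high = layer_based_pan(30.0, 20.0, 0.0, 45.0,
                            [0.0, 45.0, 90.0, 135.0],
                            [0.0, 90.0, 180.0, 270.0])
```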


Author(s): Ognjen Miljic, Zoltan Bardosi, Wolfgang Freysinger

For patients with an ineffective auditory nerve and complete hearing loss, an Auditory Brainstem Implant (ABI) provides a variety of hearing sensations to help with sound awareness and communication. At present, during the surgical intervention, surgeons use pre-operative patient images to determine the optimal position of the ABI on the cochlear nucleus of the brainstem. When found, the optimal position is marked and mentally mapped by the surgeon. Next, the surgeon tries to locate the optimal position in the patient’s head again and places the ABI. The aim of this project is to provide the surgeon with guidance of maximal clinical application accuracy for storing the optimal implant position, and with intuitive audio guidance for positioning the implant at the stored optimal position. By using three audio methods in combination with the visual information of Image-Guided Surgery (IGS), the surgeon should spend less time looking at the screen and more time focused on the patient.
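One way such audio guidance could in principle be parameterized is sketched below: the distance between the tracked instrument tip and the stored optimal position drives a pulse rate, with a steady tone once the tip is within a tolerance. The tolerance, frequencies, and pulse-rate law are illustrative assumptions, not the project's methods.

```python
# Hypothetical distance-to-target guidance sonification (assumed parameters).
import numpy as np

SR = 44100
TOLERANCE_MM = 1.0  # assumed acceptable positioning error

def guidance_signal(distance_mm, duration_s=0.5):
    """Steady tone when on target; otherwise pulses that speed up as the
    tracked tip approaches the stored optimal position."""
    n = int(SR * duration_s)
    t = np.arange(n) / SR
    if distance_mm <= TOLERANCE_MM:
        return 0.3 * np.sin(2 * np.pi * 880.0 * t)            # "on target" tone
    rate = np.clip(20.0 / distance_mm, 1.0, 15.0)             # closer -> faster pulses
    gate = (np.sin(2 * np.pi * rate * t) > 0).astype(float)
    return 0.3 * np.sin(2 * np.pi * 440.0 * t) * gate
```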


Author(s): Niklas Rönnberg, Jonas Löwgren

Photone is an interactive installation combining color images with musical sonification. The musical expression is generated from the syntactic (as opposed to semantic) features of an image as it is explored with the user’s pointing device, with the intent of catalyzing a holistic user experience we refer to as modal synergy, in which visual and auditory modalities multiply rather than add. We collected and analyzed two months’ worth of data from visitors’ interactions with Photone in a public exhibition at a science center. Our results show that a small proportion of visitors engaged in sustained interaction with Photone, as indicated by session times. Among the most deeply engaged visitors, a majority of the interaction was devoted to visually salient objects, i.e., semantic features of the images. However, the data also contain instances of interactive behavior that are best explained by exploration of the syntactic features of an image, and thus may suggest the emergence of modal synergy.
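A syntactic (pixel-level) mapping of this general kind might look like the sketch below, where the pixel under the pointer contributes hue to a scale degree and brightness to loudness. The scale and mapping are assumptions for illustration, not the installation's actual design.

```python
# Illustrative syntactic image-to-music mapping (assumed scale and parameters).
import colorsys
import numpy as np

SR = 44100
C_MAJOR = [261.63, 293.66, 329.63, 349.23, 392.00, 440.00, 493.88]  # Hz

def pixel_to_tone(r, g, b, duration_s=0.3):
    """r, g, b in 0..255 for the pixel under the pointing device."""
    h, l, _s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    freq = C_MAJOR[int(h * len(C_MAJOR)) % len(C_MAJOR)]   # hue -> scale degree
    amp = 0.1 + 0.4 * l                                    # brightness -> loudness
    t = np.arange(int(SR * duration_s)) / SR
    return amp * np.sin(2 * np.pi * freq * t)

tone = pixel_to_tone(200, 80, 40)
```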


Author(s): Brian Hansen, Leya Breanna Baltaxe-Admony, Sri Kurniawan, Angus G. Forbes

In this paper, we explore how sonic features can be used to represent network data structures that define relationships between elements. Representations of networks are pervasive in contemporary life (social networks, route planning, etc.), and network analysis is an increasingly important aspect of data science (data mining, biological modeling, deep learning, etc.). We present our initial findings on the ability of users to understand, decipher, and recreate sound representations to support primary network tasks, such as counting the number of elements in a network, identifying connections between nodes, determining the relative weight of connections between nodes, and recognizing which category an element belongs to. The results of an initial exploratory study (n=6) indicate that users are able to conceptualize mappings between sounds and visual network features, but that, when asked to produce a visual representation of sounds, users tend to generate outputs that closely resemble familiar musical notation. A more in-depth pilot study (n=26) examined more specifically which sonic parameters (melody, harmony, timbre, rhythm, dynamics) map most effectively to network features (node count, node classification, connectivity, edge weight). Our results indicate that users can conceptualize relationships between sound features and network features, and can create or use mappings between the aural and visual domains.
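One example of a sound-to-network mapping of the kind examined here is sketched below, under assumed mapping choices: each node gets a pitch from a scale (so the melody conveys node count), and each edge is rendered as a dyad of its endpoints' pitches with loudness scaled by edge weight. This is an illustrative mapping, not one of the study's specific conditions.

```python
# Illustrative graph sonification: nodes as a pitch sequence, edges as weighted dyads.
import numpy as np

SR = 44100
SCALE = [261.63, 293.66, 329.63, 392.00, 440.00]  # pentatonic-style pitches (Hz)

def node_pitch(i):
    """Assign node i a pitch, climbing an octave each time the scale wraps."""
    return SCALE[i % len(SCALE)] * (2 ** (i // len(SCALE)))

def sonify_graph(n_nodes, weighted_edges, note_s=0.3):
    """weighted_edges: list of (node_a, node_b, weight in 0..1)."""
    t = np.arange(int(SR * note_s)) / SR
    notes = [0.3 * np.sin(2 * np.pi * node_pitch(i) * t) for i in range(n_nodes)]
    dyads = [w * 0.3 * (np.sin(2 * np.pi * node_pitch(a) * t) +
                        np.sin(2 * np.pi * node_pitch(b) * t))
             for a, b, w in weighted_edges]
    return np.concatenate(notes + dyads)

audio = sonify_graph(4, [(0, 1, 1.0), (1, 2, 0.5), (2, 3, 0.2)])
```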

