Speech, Image, and Language Processing for Human Computer Interaction
Latest Publications


TOTAL DOCUMENTS: 15 (five years: 0)
H-INDEX: 2 (five years: 0)

Published By IGI Global
ISBN: 9781466609549, 9781466609556

Author(s):  
Tanveer J. Siddiqui ◽  
Uma Shanker Tiwary

Spoken dialogue systems are a step forward towards the realization of human-like interaction with computer-based systems. This chapter focuses on issues related to spoken dialogue systems. It presents a general architecture for spoken dialogue systems for human-computer interaction, describes its components, and highlights key research challenges in each. One important variation in the architecture is the modeling of knowledge as a separate component, unlike existing dialogue systems in which knowledge is usually embedded within other components. This separation makes the architecture more general. The chapter also discusses some of the existing evaluation methods for spoken dialogue systems.
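The architecture described above, with knowledge factored out of the other components, can be sketched as a minimal pipeline skeleton. The component names and stub logic below are illustrative assumptions for the sketch, not the chapter's actual implementation.

```python
# Minimal sketch of a spoken dialogue system pipeline with the knowledge
# base kept as a separate component (illustrative stubs only).

class KnowledgeBase:
    """Domain knowledge kept outside the other components."""
    def __init__(self, facts):
        self.facts = facts

    def lookup(self, key):
        return self.facts.get(key, "unknown")

def asr(audio):
    # Speech recognizer stub: audio -> text (passthrough for the sketch).
    return audio

def nlu(text):
    # Language understanding stub: text -> intent/slot frame.
    return {"intent": "query", "topic": text.strip().lower()}

def dialogue_manager(frame, kb):
    # Decides the next action, consulting the shared knowledge base.
    return kb.lookup(frame["topic"])

def nlg(answer):
    # Language generation stub: answer -> surface text.
    return f"The answer is: {answer}"

kb = KnowledgeBase({"weather": "sunny"})
frame = nlu(asr("weather"))
print(nlg(dialogue_manager(frame, kb)))
```

Because the knowledge base is passed in rather than hard-wired into the dialogue manager, the same pipeline can be reused across domains by swapping only the `KnowledgeBase` instance, which is the generality the separation buys.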


Author(s):  
Uma Shanker Tiwary ◽  
Tanveer J. Siddiqui

The objective of this chapter is twofold. On one hand, it tries to introduce and present various components of Human Computer Interaction (HCI), if HCI is modeled as a process of cognition; on the other hand, it tries to underline those representations and mechanisms which are required to develop a general framework for a collaborative HCI. One must try to separate the specific problem solving skills and specific problem related knowledge from the general skills and knowledge acquired in interactive agents for future use. This separation leads to a distributed deep interaction layer consisting of many cognitive processes. A three layer architecture has been suggested for designing collaborative HCI with multiple human and computational agents.


Author(s):  
Andrew Molineux ◽  
Keith Cheverst

In recent years, vision recognition applications have made the transition from desktop computers to mobile phones. This has allowed a new range of mobile interactions and applications to be realised. However, this shift has unearthed new issues in mobile hardware, interactions, and usability. As such, the authors present a survey of mobile vision recognition, outlining a number of academic and commercial applications and analysing what tasks they are able to perform and how they achieve them. The authors conclude with a discussion of the issues and trends found in the survey.


Author(s):  
Omar Farooq ◽  
Sekharjit Datta

The area of speech recognition has been thoroughly researched over the past fifty years; however, robustness remains an important challenge. It is well established that there is a correlation between produced speech and lip motion, which can be exploited under adverse background conditions to improve recognition performance. This chapter presents the main components used in audio-visual speech recognition systems. Results of a prototype experiment conducted on an audio-visual corpus for Hindi speech are reported for a simple phoneme recognition task. The chapter also addresses issues related to visual feature extraction and audio-visual integration, and finally presents future research directions.
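One common way to integrate the two streams is late fusion: combining per-phoneme scores from the audio and visual classifiers with a reliability weight. The sketch below illustrates this general idea only; the weight, phoneme set, and score values are made-up numbers, not results from the chapter.

```python
import numpy as np

# Hypothetical late-fusion sketch for audio-visual phoneme recognition.
# lam weights the audio stream against the visual (lip) stream; in noisy
# audio conditions a system would typically lower lam.
def fuse_scores(audio_logp, visual_logp, lam=0.7):
    # Weighted log-linear combination of the two streams' log-likelihoods.
    return lam * audio_logp + (1.0 - lam) * visual_logp

phonemes = ["a", "i", "u"]
audio_logp = np.array([-1.0, -2.5, -3.0])   # audio-stream log-likelihoods
visual_logp = np.array([-2.0, -0.5, -2.8])  # visual-stream log-likelihoods

fused = fuse_scores(audio_logp, visual_logp)
print(phonemes[int(np.argmax(fused))])
```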


Author(s):  
Pradipta Biswas

This chapter presents a brief survey of the user modelling techniques used in human computer interaction. It traces the history of the development of user modelling techniques and classifies existing models into different categories. In the context of existing modelling approaches, it presents a new user model and its deployment through a simulator to help designers develop accessible systems for people with a wide range of abilities. The chapter will help system analysts and developers select and use the appropriate type of user model for their applications.
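As a flavour of what a predictive user model looks like in practice, the sketch below uses Fitts' law, a classic pointing-time model often found in such simulators. This is a generic illustration, not the chapter's model; the coefficients are hypothetical placeholders that a real simulator would calibrate per user group.

```python
import math

# Illustrative predictive user model: Fitts' law for pointing time.
# a and b are hypothetical regression coefficients (seconds); a real
# simulator would fit them to observed data for each user population.
def fitts_time(distance, width, a=0.1, b=0.15):
    # Movement time = a + b * log2(distance / width + 1), Shannon form.
    return a + b * math.log2(distance / width + 1)

t = fitts_time(distance=300, width=30)  # 300 px to a 30 px target
print(round(t, 3))
```

A simulator for users with motor impairments might, for example, use a larger `b` coefficient to predict the longer pointing times such users need, letting designers test target sizes before building the interface.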


Author(s):  
Rashid Ali ◽  
M. M. Sufyan Beg

Metasearching is the process of combining the search results of different search systems into a single set of ranked results which, in turn, is expected to provide the collective benefit of using each of the participating search systems. Since the user is the direct beneficiary of the search results, researchers in the field of Human Computer Interaction (HCI) are motivated to measure user satisfaction. A user is satisfied if he receives good-quality search results in response to his query. To measure user satisfaction, feedback must be obtained from the user; this feedback might also be used to improve the quality of metasearching. The authors discuss the design of a metasearch system based on human computer interaction and compare their method with two others, Borda's method and the modified Shimura technique, using Spearman's footrule distance as the measure of comparison. Experimentally, the method performs better than Borda's method. The authors argue that the method is significant, as it models user-feedback-based metasearching and has spam-fighting capabilities.
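Two of the ideas mentioned above, Borda's method for merging ranked lists and Spearman's footrule distance for comparing rankings, are standard and simple to sketch. The document lists below are invented examples; the sketch shows the baseline techniques only, not the authors' feedback-based method.

```python
# Borda's method: each document earns (n - position) points per input
# ranking; the merged ranking orders documents by total points.
def borda_merge(rankings):
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for pos, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0) + (n - pos)
    return sorted(scores, key=lambda d: -scores[d])

# Spearman's footrule: sum over documents of the absolute difference
# between their positions in the two rankings.
def footrule_distance(rank_a, rank_b):
    pos_b = {doc: i for i, doc in enumerate(rank_b)}
    return sum(abs(i - pos_b[doc]) for i, doc in enumerate(rank_a))

r1 = ["d1", "d2", "d3"]
r2 = ["d2", "d3", "d1"]
print(borda_merge([r1, r2]))
print(footrule_distance(r1, r2))
```

A smaller footrule distance means the metasearch output stays closer to a reference ranking, which is how such a measure can serve as the yardstick for comparing aggregation methods.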


Author(s):  
Hung-Pin Hsu

In recent years, the Metaverse has become a new type of social network. It provides an integrated platform and interactive environment for users to design artifacts and cooperate with each other. Facing this new type of social network, this chapter focuses on the cognitive and interactive behavior of users in collaborative design activities. The chapter consists of three stages. In stage one, the chapter introduces related theories and previous studies in order to present the features of the Metaverse. In stage two, the author chooses two different design and interaction environments to compare with the Metaverse: a normal face-to-face environment and a regular distance environment. The author then conducts three experiments in these different environments. In stage three, the author analyzes the retrospective data of the three experiments qualitatively, undertaking contextual inquiries in order to structure cognitive and interactive models of the three environments. Furthermore, the author also conducts an in-depth interview to obtain qualitative data on the subjects' opinions. Finally, affinity diagrams can be established from these models and the interview to provide knowledge of the Metaverse for readers who research or develop social network environments.


Author(s):  
David Griol ◽  
Zoraida Callejas ◽  
Ramón López-Cózar ◽  
Gonzalo Espejo ◽  
Nieves Ábalos

Multimodal systems have attracted increased attention in recent years, which has made possible important improvements in the technologies for recognition, processing, and generation of multimodal information. However, many issues related to multimodality are still not clear, for example, the principles that make it possible to resemble human-human multimodal communication. This chapter focuses on some of the most important challenges that researchers have recently envisioned for future multimodal interfaces. It also describes current efforts to develop intelligent, adaptive, proactive, portable, and affective multimodal interfaces.


Author(s):  
Armin Mustafa ◽  
K.S. Venkatesh

This chapter aims to develop an 'accessory-free' or 'minimum-accessory' interface for communication and computation that does not require specialized gadgets such as finger markers, colored gloves, wrist bands, or touch screens. The authors detect various types of gestures by finding fingertip locations in a dynamically changing foreground projection with varying illumination on an arbitrary background, using visual segmentation by reflectance modeling, as opposed to recent approaches that use the IR (invisible) channel to do so. The overall performance of the system was found to be adequately fast, accurate, and reliable. The objective is to facilitate, in the future, direct graphical interaction with mobile computing devices equipped with mini projectors instead of conventional displays. The authors term this a dynamic illumination environment, as the projected light is liable to change continuously in both time and space and also varies with the content displayed on a colored or white surface.


Author(s):  
Navarun Gupta ◽  
Armando Barreto

The role of binaural and immersive sound is becoming crucial in virtual reality and HCI-related systems. This chapter proposes a structural model for the pinna, to be used as a block within structural models for the synthesis of Head-Related Transfer Functions needed for digital audio spatialization. An anthropometrically plausible pinna model is presented, justified, and verified by comparison with measured Head-Related Impulse Responses (HRIRs). Similarity levels better than 90% are found in this comparison. Further, the relationships between key anthropometric features of the listener and the parameters of the model are established as sets of predictive equations. Modeled HRIRs are obtained by substituting anthropometric features measured from 10 volunteers into the predictive equations to find the model parameters. These modeled HRIRs are used in listening tests by the subjects to assess the elevation of spatialized sound sources. The modeled HRIRs yielded a smaller average elevation error (29.9°) than "generic" HRIRs (31.4°), but a higher one than the individually measured HRIRs for the subjects (23.7°).
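Once a pair of HRIRs is available, whether measured or produced by a model such as the one above, spatialization itself reduces to convolving a mono signal with the left- and right-ear impulse responses. The tiny 3-tap HRIRs below are made-up numbers for illustration; real HRIRs are hundreds of samples long.

```python
import numpy as np

# Minimal sketch of HRIR-based spatialization: the mono signal is
# convolved with a left-ear and a right-ear impulse response to produce
# the binaural pair. The 3-tap HRIRs are invented illustration values.
def spatialize(mono, hrir_left, hrir_right):
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return left, right

mono = np.array([1.0, 0.0, 0.0, 0.0])   # unit impulse as the "signal"
hrir_l = np.array([0.9, 0.3, 0.1])      # stronger response at the left ear
hrir_r = np.array([0.4, 0.2, 0.05])     # attenuated, as if the source is left

left, right = spatialize(mono, hrir_l, hrir_r)
print(left[:3].tolist())
```

Feeding a unit impulse through the convolution simply reproduces each HRIR, which is a quick sanity check; the interaural level and time differences encoded in the two responses are what the listener perceives as direction and elevation.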

