Multi-modal Open World User Identification

2022 ◽  
Vol 11 (1) ◽  
pp. 1-50
Author(s):  
Bahar Irfan ◽  
Michael Garcia Ortiz ◽  
Natalia Lyubova ◽  
Tony Belpaeme

User identification is an essential step in creating a personalised long-term interaction with robots. This requires learning the users continuously and incrementally, possibly starting from a state without any known user. In this article, we describe a multi-modal incremental Bayesian network with online learning, which is the first method that can be applied in such scenarios. Face recognition is used as the primary biometric, and it is combined with ancillary information, such as gender, age, height, and time of interaction to improve the recognition. The Multi-modal Long-term User Recognition Dataset is generated to simulate various human-robot interaction (HRI) scenarios and evaluate our approach in comparison to face recognition, soft biometrics, and a state-of-the-art open world recognition method (Extreme Value Machine). The results show that the proposed methods significantly outperform the baselines, with an increase in the identification rate up to 47.9% in open-set and closed-set scenarios, and a significant decrease in long-term recognition performance loss. The proposed models generalise well to new users, provide stability, improve over time, and decrease the bias of face recognition. The models were applied in HRI studies for user recognition, personalised rehabilitation, and customer-oriented service, which showed that they are suitable for long-term HRI in the real world.
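The idea of combining a primary face-recognition score with ancillary soft biometrics, while leaving room for a previously unseen user, can be illustrated with a minimal sketch. This is not the authors' Bayesian network: it assumes a naive-Bayes style product of independent likelihoods, and the function names, the `unknown_mass` parameter, and all numbers are hypothetical.

```python
def fuse_posteriors(face_scores, soft_likelihoods, unknown_mass=0.1):
    """Fuse face scores with soft-biometric likelihoods, naive-Bayes style.

    face_scores: dict mapping user -> face-recognition similarity in [0, 1]
    soft_likelihoods: dict mapping user -> list of P(observation | user),
        e.g. one value each for gender, age, height, time of interaction
    unknown_mass: hypothetical fixed evidence assigned to "unknown user"
    Returns normalised posteriors, including the key "unknown".
    """
    unnormalised = {}
    for user, score in face_scores.items():
        evidence = score
        for likelihood in soft_likelihoods.get(user, []):
            evidence *= likelihood  # independence assumption between modalities
        unnormalised[user] = evidence
    unnormalised["unknown"] = unknown_mass
    total = sum(unnormalised.values())
    return {user: value / total for user, value in unnormalised.items()}


def identify(posteriors):
    """Return the hypothesis (a known user or "unknown") with the highest posterior."""
    return max(posteriors, key=posteriors.get)


if __name__ == "__main__":
    posteriors = fuse_posteriors(
        {"alice": 0.6, "bob": 0.5},
        {"alice": [0.9, 0.8], "bob": [0.2, 0.3]},
    )
    print(identify(posteriors))
```

In this sketch a weak face match can still be accepted when the soft biometrics agree, and a match is rejected as "unknown" when the combined evidence falls below the reserved mass, which is the open-set behaviour the abstract describes in general terms.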

1991 ◽  
Vol 34 (5) ◽  
pp. 1180-1184 ◽  
Author(s):  
Larry E. Humes ◽  
Kathleen J. Nelson ◽  
David B. Pisoni

The Modified Rhyme Test (MRT), recorded using natural speech and two forms of synthetic speech, DECtalk and Votrax, was used to measure both open-set and closed-set speech-recognition performance. Performance of hearing-impaired elderly listeners was compared to two groups of young normal-hearing adults, one listening in quiet, and the other listening in a background of spectrally shaped noise designed to simulate the peripheral hearing loss of the elderly. Votrax synthetic speech yielded significant decrements in speech recognition compared to either natural or DECtalk synthetic speech for all three subject groups. There were no differences in performance between natural speech and DECtalk speech for the elderly hearing-impaired listeners or the young listeners with simulated hearing loss. The normal-hearing young adults listening in quiet out-performed both of the other groups, but there were no differences in performance between the young listeners with simulated hearing loss and the elderly hearing-impaired listeners. When the closed-set identification of synthetic speech was compared to its open-set recognition, the hearing-impaired elderly gained as much from the reduction in stimulus/response uncertainty as the two younger groups. Finally, among the elderly hearing-impaired listeners, speech-recognition performance was correlated negatively with hearing sensitivity, but scores were correlated positively among the different talker conditions. Those listeners with the greatest hearing loss had the most difficulty understanding speech and those having the most trouble understanding natural speech also had the greatest difficulty with synthetic speech.


Author(s):  
Nikolaos Mavridis ◽  
Michael Petychakis ◽  
Alexandros Tsamakos ◽  
Panos Toulis ◽  
Shervin Emami ◽  
...  

The overarching goal of the FaceBots project is to support the achievement of sustainable long-term human-robot relationships through the creation of robots with face recognition and natural language capabilities, which exploit and publish online information, and especially social information available on Facebook, and which achieve two significant novelties. The underlying experimental hypothesis is that such relationships can be significantly enhanced if the human and the robot are gradually creating a pool of episodic memories that they can co-refer to (“shared memories”), and if they are both embedded in a social web of other humans and robots they mutually know (“shared friends”). We present a description of the system architecture, as well as important concrete results regarding face recognition and transferability of training, with training and testing sets coming from either one or a combination of two sources: an onboard camera which can provide sequences of images, and Facebook-derived photos. Furthermore, early interaction-related results are presented, and evaluation methodologies as well as interesting extensions are discussed.


1980 ◽  
Vol 45 (2) ◽  
pp. 223-238 ◽  
Author(s):  
Richard H. Wilson ◽  
June K. Antablin

The Picture Identification Task was developed to estimate the word-recognition performance of nonverbal adults. Four lists of 50 monosyllabic words each were assembled and recorded. Each test word and three rhyming alternatives were illustrated and photographed in a quadrant arrangement. The task of the patient was to point to the picture representing the recorded word that was presented through the earphone. In the first experiment with young adults, no significant differences were found between the Picture Identification Task and the Northwestern University Auditory Test No. 6 materials in an open-set response paradigm. In the second experiment, the Picture Identification Task with the picture-pointing response was compared with the Northwestern University Auditory Test No. 6 in both an open-set and a closed-set response paradigm. The results from this experiment demonstrated significant differences among the three response tasks. The easiest task was a closed-set response to words, the next was a closed-set response to pictures, and the most difficult task was an open-set response. At high stimulus-presentation levels, however, the three tasks produced similar results. Finally, the clinical use of the Picture Identification Task is described along with preliminary results obtained from 30 patients with various communicative impairments.


2020 ◽  
Vol 16 (3) ◽  
pp. 155014772091155
Author(s):  
Zhiqiang Liu ◽  
Wenbo Zhu ◽  
Hongzhou Zhang ◽  
Shengjin Wang ◽  
Lu Fang ◽  
...  

The reliability of a face recognition system is characterized by fuzziness, randomness, and continuity. To measure it in unconstrained scenes, we identify and quantify the key broad-sense and narrow-sense factors influencing reliability by analyzing the operating states of six dynamic face recognition systems in practical use at six public security bureaus. In this article, we propose a novel evaluation method based on the True Positive Identification Rate in dynamic and M:N mode and create a novel evaluation model of system reliability with an improved Fuzzy Dynamic Bayesian Network. Subsequently, we use Netica to infer the fuzzy reliability state probabilities of the six systems and obtain the two most important factors with an improved fuzzy C-means algorithm. We verify the model by comparing the evaluation results with the actual achievements of these systems. Finally, we find several vulnerabilities in the least reliable system and put forward a few optimization strategies. The proposed method combines the advantages of the improved fuzzy C-means model with those of the dynamic Bayesian network to evaluate the reliability of dynamic face recognition systems, making the evaluation results more reasonable and realistic. It opens a new line of research on face recognition systems in unconstrained scenes and contributes to research on face recognition performance evaluation and system reliability analysis. In addition, the proposed method is of practical significance for improving the reliability of systems in use.


2014 ◽  
Vol 548-549 ◽  
pp. 939-942 ◽  
Author(s):  
Mi Young Cho ◽  
Young Sook Jeong ◽  
Byung Tae Chun

With the increasing number of service robots, human-robot interaction (HRI) for natural communication between user and robot is becoming more and more important. Face recognition in particular is a key issue in HRI. Even though robots mainly use face detection and recognition to provide various services, it is still difficult to guarantee performance because test methods designed from the robot's point of view are insufficient. We therefore propose a new performance evaluation method for robots using an LED monitor.


2021 ◽  
Vol 18 (5) ◽  
pp. 6620-6637
Author(s):  
Yan Tang ◽  
Zhijin Zhao ◽  
Chun Li ◽  
Xueyi Ye ◽  
...  

Because existing Closed Set Recognition (CSR) methods mistakenly identify unknown jamming signals as a known class, a Conditional Gaussian Encoder (CG-Encoder) for Open Set Recognition (OSR) of 1-dimensional signals is designed. The network retains the original form of the signal as much as possible, and a deep neural network is used to extract useful information. CG-Encoder adopts a residual network structure, and a new Kullback-Leibler (KL) divergence is defined. In the training phase, the known classes are approximated by different Gaussian distributions in the latent space, and the discrimination between classes is increased to improve the recognition performance on the known classes. In the testing phase, a specific and effective OSR algorithm flow is designed. Simulation experiments are carried out on 9 jamming types. The results show that the CSR and OSR performance of CG-Encoder is better than that of the other three kinds of network structures. When the openness is at its maximum, the open set average accuracy of CG-Encoder is more than 70%, which is about 30% higher than the worst algorithm and about 20% higher than the better one. When the openness is at its minimum, the average accuracy of OSR is more than 95%.
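The general pattern of modelling each known class as a Gaussian in latent space and rejecting samples that fit none of them can be sketched as follows. This is an illustrative stand-in, not the CG-Encoder or its OSR algorithm flow: it assumes latent vectors are already available, uses one isotropic Gaussian per class, and the `max_sigmas` rejection threshold is a hypothetical parameter.

```python
import math


def fit_class_gaussians(latents_by_class):
    """Estimate a mean and an isotropic standard deviation per known class."""
    models = {}
    for label, vectors in latents_by_class.items():
        dim = len(vectors[0])
        mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
        variance = sum(
            (v[i] - mean[i]) ** 2 for v in vectors for i in range(dim)
        ) / (len(vectors) * dim)
        # Fall back to a tiny std so a degenerate class does not divide by zero.
        models[label] = (mean, math.sqrt(variance) or 1e-6)
    return models


def classify_open_set(z, models, max_sigmas=3.0):
    """Return the nearest known class, or "unknown" if z is too far from all of them."""
    best_label, best_distance = None, float("inf")
    for label, (mean, std) in models.items():
        distance = math.dist(z, mean) / std  # distance in units of class std
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label if best_distance <= max_sigmas else "unknown"


if __name__ == "__main__":
    models = fit_class_gaussians({
        "a": [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [-0.2, -0.2]],
        "b": [[5.0, 5.0], [5.2, 5.0], [5.0, 4.8], [4.8, 5.2]],
    })
    print(classify_open_set([2.5, 2.5], models))
```

A closed-set classifier would be forced to assign the midpoint sample to one of the two known classes; the distance threshold is what lets the sketch answer "unknown" instead.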


2020 ◽  
Vol 9 (11) ◽  
pp. 9353-9360
Author(s):  
G. Selvi ◽  
I. Rajasekaran

This paper deals with the concepts of semi-generalized closed sets in strong generalized topological spaces, such as the $sg^{\star \star}_\mu$-closed set, $sg^{\star \star}_\mu$-open set, $g^{\star \star}_\mu$-closed set, and $g^{\star \star}_\mu$-open set, and studies some of their basic properties, including $sg^{\star \star}_\mu$-continuous maps, $sg^{\star \star}_\mu$-irresolute maps, and $T_{\frac{1}{2}}$-spaces in strong generalized topological spaces.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Clara Borrelli ◽  
Paolo Bestagini ◽  
Fabio Antonacci ◽  
Augusto Sarti ◽  
Stefano Tubaro

Several methods for synthetic audio speech generation have been developed in the literature through the years. With the great technological advances brought by deep learning, many novel synthetic speech techniques achieving incredibly realistic results have been recently proposed. As these methods generate convincing fake human voices, they can be used in a malicious way to negatively impact today’s society (e.g., people impersonation, fake news spreading, opinion formation). For this reason, the ability to detect whether a speech recording is synthetic or pristine is becoming an urgent necessity. In this work, we develop a synthetic speech detector. This takes as input an audio recording, extracts a series of hand-crafted features motivated by the speech-processing literature, and classifies them in either a closed-set or an open-set setting. The proposed detector is validated on a publicly available dataset consisting of 17 synthetic speech generation algorithms ranging from old-fashioned vocoders to modern deep learning solutions. Results show that the proposed method outperforms recently proposed detectors in the forensics literature.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Adam Goodwin ◽  
Sanket Padmanabhan ◽  
Sanchit Hira ◽  
Margaret Glancey ◽  
Monet Slinowsky ◽  
...  

With over 3500 mosquito species described, accurate species identification of the few implicated in disease transmission is critical to mosquito-borne disease mitigation. Yet this task is hindered by limited global taxonomic expertise and specimen damage consistent across common capture methods. Convolutional neural networks (CNNs) are promising with limited sets of species, but image database requirements restrict practical implementation. Using an image database of 2696 specimens from 67 mosquito species, we address the practical open-set problem with a detection algorithm for novel species. Closed-set classification of 16 known species achieved 97.04 ± 0.87% accuracy independently, and 89.07 ± 5.58% when cascaded with novelty detection. Closed-set classification of 39 species produces a macro F1-score of 86.07 ± 1.81%. This demonstrates an accurate, scalable, and practical computer vision solution to identify wild-caught mosquitoes for implementation in biosurveillance and targeted vector control programs, without the need for extensive image database development for each new target region.
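The two evaluation ideas mentioned above, cascading a novelty detector with a closed-set classifier and summarising performance with a macro F1-score, can be sketched in a few lines. This is an assumption-laden illustration rather than the authors' pipeline: `is_novel` and `classify_known` are hypothetical callables standing in for trained models, and the macro average here is taken over the labels that occur in the ground truth.

```python
def cascade_predict(sample, is_novel, classify_known):
    """Run novelty detection first; only non-novel samples reach the classifier."""
    return "novel" if is_novel(sample) else classify_known(sample)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print(macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

Because every class contributes equally to the macro average, a rare species that is classified poorly lowers the score as much as a common one, which is why the metric suits imbalanced specimen collections. The cascade also explains why accuracy on known species can drop when novelty detection is added: known specimens wrongly flagged as novel never reach the classifier.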

