On audio recognition performance via robust hashing

Author(s):  
C. Bellettini ◽  
G. Mazzini


1991 ◽
Vol 34 (2) ◽  
pp. 415-426 ◽  
Author(s):  
Richard L. Freyman ◽  
G. Patrick Nerbonne ◽  
Heather A. Cote

This investigation examined the degree to which modification of the consonant-vowel (C-V) intensity ratio affected consonant recognition under conditions in which listeners were forced to rely more heavily on waveform envelope cues than on spectral cues. The stimuli were 22 vowel-consonant-vowel utterances, which had been mixed at six different signal-to-noise ratios with white noise that had been modulated by the speech waveform envelope. The resulting waveforms preserved the gross speech envelope shape, but spectral cues were limited by the white-noise masking. In a second stimulus set, the consonant portion of each utterance was amplified by 10 dB. Sixteen subjects with normal hearing listened to the unmodified stimuli, and 16 listened to the amplified-consonant stimuli. Recognition performance was reduced in the amplified-consonant condition for some consonants, presumably because waveform envelope cues had been distorted. However, for other consonants, especially the voiced stops, consonant amplification improved recognition. Patterns of errors were altered for several consonant groups, including some that showed only small changes in recognition scores. The results indicate that when spectral cues are compromised, nonlinear amplification can alter waveform envelope cues for consonant recognition.
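The consonant-amplification manipulation described above (a fixed 10 dB gain applied to the consonant portion of a vowel-consonant-vowel waveform) can be sketched as follows; the segment boundaries and sample values are purely illustrative, not taken from the study's stimuli:

```python
import numpy as np

def amplify_segment(waveform, start, end, gain_db=10.0):
    """Apply a fixed gain (in dB) to one segment of a waveform,
    leaving the rest of the signal untouched."""
    out = waveform.astype(float).copy()
    # +10 dB corresponds to a ~3.16x amplitude scale (10 ** (10/20))
    out[start:end] *= 10 ** (gain_db / 20.0)
    return out

# Hypothetical VCV utterance: samples 100-199 stand in for the consonant.
vcv = np.ones(300)
modified = amplify_segment(vcv, 100, 200, gain_db=10.0)
```

Only the consonant segment is scaled, which is what distorts the relative waveform envelope shape across the utterance.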


2018 ◽  
Vol 1 (2) ◽  
pp. 34-44
Author(s):  
Faris E Mohammed ◽  
Dr. Eman M ALdaidamony ◽  
Prof. A. M Raid

Individual identification is a significant process that underlies many day-to-day activities. Identification is needed in workplaces, private zones, banks …etc. Individuals possess many characteristics that can be used for recognition, such as finger veins, irises, faces …etc. Finger vein and iris key-points are considered among the most promising biometric authentication modalities for their security and convenience. SIFT is a new and promising technique for pattern recognition. However, many related techniques suffer from shortcomings such as feature loss, difficult feature key-point extraction, and the introduction of noise points. In this manuscript, a new technique, SIFT-based iris and SIFT-based finger vein identification with normalization and enhancement, is proposed to achieve better performance. Compared with other SIFT-based iris or SIFT-based finger vein recognition algorithms, the suggested technique overcomes the difficulty of excessive key-point extraction and excludes noise points without feature loss. Experimental results demonstrate that the normalization and enhancement steps are critical for SIFT-based iris and finger vein recognition, and that the proposed technique achieves satisfactory recognition performance. Keywords: SIFT, Iris Recognition, Finger Vein Identification, Biometric Systems. © 2018 JASET, International Scholars and Researchers Association
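A standard way to exclude ambiguous or noisy SIFT key-points, as the abstract alludes to, is Lowe's nearest-neighbour ratio test on the descriptors. The sketch below is a generic illustration of that test with toy descriptor arrays, not the paper's specific pipeline:

```python
import numpy as np

def ratio_test_match(desc_a, desc_b, ratio=0.75):
    """Match descriptors from image A to image B, keeping only matches
    whose nearest neighbour is clearly closer than the second-nearest
    (Lowe's ratio test), which discards ambiguous/noisy key-points."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches

# Toy 2-D "descriptors" (real SIFT descriptors are 128-D).
desc_a = np.array([[0.0, 0.0], [5.0, 5.0]])
desc_b = np.array([[0.1, 0.0], [10.0, 10.0], [5.0, 5.05]])
matches = ratio_test_match(desc_a, desc_b)
```

Key-points whose best match is not decisively better than the runner-up are dropped, which is one common way to avoid noise points without discarding genuine features.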


2019 ◽  
Author(s):  
Alex Bertrams ◽  
Katja Schlegel

People high in autistic-like traits have been found to have difficulties recognizing emotions from nonverbal expressions. However, findings on the autism-emotion recognition relationship are inconsistent. In the present study, we investigated whether speeded reasoning ability (reasoning performance under time pressure) moderates the inverse relationship between autistic-like traits and emotion recognition performance. We expected the negative correlation between autistic-like traits and emotion recognition to be weaker when speeded reasoning ability was high. MTurkers (N = 217) completed the ten-item version of the Autism Spectrum Quotient (AQ-10), two emotion recognition tests using videos with sound (Geneva Emotion Recognition Test, GERT-S) and pictures (Reading the Mind in the Eyes Test, RMET), and Baddeley's Grammatical Reasoning test to measure speeded reasoning. As expected, the higher the speeded reasoning ability, the weaker the relationship between higher autistic-like traits and lower emotion recognition performance. These results suggest that a high ability to make quick mental inferences may (partly) compensate for the difficulties with intuitive emotion recognition related to autistic-like traits.
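Moderation of this kind is conventionally tested as an ordinary-least-squares regression with an interaction term. The sketch below simulates data with the pattern the study reports (a negative AQ effect that weakens as reasoning increases); the variable names and coefficients are illustrative assumptions, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 217
aq = rng.normal(size=n)          # autistic-like traits (standardized)
reasoning = rng.normal(size=n)   # speeded reasoning ability (standardized)
# Simulated outcome: AQ hurts emotion recognition less when reasoning is high.
emo = (-0.4 * aq + 0.2 * reasoning + 0.3 * aq * reasoning
       + rng.normal(scale=0.5, size=n))

# OLS with intercept, main effects, and the AQ x reasoning interaction.
X = np.column_stack([np.ones(n), aq, reasoning, aq * reasoning])
beta, *_ = np.linalg.lstsq(X, emo, rcond=None)
# A positive interaction coefficient (beta[3]) indicates the negative
# AQ effect on emotion recognition weakens as reasoning increases.
```

With a positive interaction coefficient, the simple slope of AQ at high reasoning is less negative than at low reasoning, which is the moderation pattern the abstract describes.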


Author(s):  
Khamis A. Al-Karawi

Background & Objective: Speaker Recognition (SR) techniques have matured considerably over the past few decades. Existing methods typically use robust features extracted from clean speech signals, and in idealized conditions they can therefore achieve very high recognition accuracy. For critical applications, such as security and forensics, robustness and reliability of the system are crucial. Methods: Background noise and reverberation, as often occur in many real-world applications, are known to compromise recognition performance. To improve the performance of speaker verification systems, an effective and robust feature extraction technique for speech processing is proposed, capable of operating in both clean and noisy conditions. Mel Frequency Cepstral Coefficients (MFCCs) and Gammatone Frequency Cepstral Coefficients (GFCCs) are mature techniques and the most common features used for speaker recognition. MFCCs are calculated from the log energies in frequency bands distributed over a mel scale, while GFCCs are obtained from a bank of Gammatone filters, originally proposed to model human cochlear filtering. This paper investigates the performance of GFCC and the conventional MFCC features in clean and noisy conditions. The effects of the Signal-to-Noise Ratio (SNR) and language mismatch on system performance are also taken into account. Conclusion: Experimental results show significant improvement in system performance in terms of reduced equal error rate and detection error trade-off. Performance in terms of recognition rates under various types of noise and various SNRs was quantified via simulation. Results of the study are also presented and discussed.
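The MFCC computation the abstract summarizes (log energies in mel-spaced bands followed by a discrete cosine transform) can be sketched in a simplified form; the filter counts and FFT size below are common illustrative choices, not values from this paper:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_from_power_spectrum(power, sr, n_filters=26, n_ceps=13):
    """Toy MFCC: triangular mel filterbank -> log energies -> DCT-II."""
    n_bins = len(power)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_bins - 1) * mel_to_hz(mel_points) / (sr / 2.0)).astype(int)
    energies = np.zeros(n_filters)
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, hi):
            # triangular weight rising to 1 at `mid`, falling to 0 at `hi`
            w = (k - lo) / max(mid - lo, 1) if k < mid else (hi - k) / max(hi - mid, 1)
            energies[i] += w * power[k]
        energies[i] = np.log(energies[i] + 1e-10)
    # DCT-II over the log filterbank energies yields cepstral coefficients.
    n = np.arange(n_filters)
    return np.array([np.sum(energies * np.cos(np.pi * q * (2 * n + 1) / (2 * n_filters)))
                     for q in range(n_ceps)])

ceps = mfcc_from_power_spectrum(np.ones(257), sr=16000)
```

A GFCC front end differs mainly in replacing the triangular mel filterbank with a Gammatone filterbank before the cepstral transform.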


2013 ◽  
Vol 37 (3) ◽  
pp. 611-620
Author(s):  
Ing-Jr Ding ◽  
Chih-Ta Yen

The Eigen-FLS approach, which uses an eigenspace-based scheme for fast fuzzy logic system (FLS) establishment, has been applied successfully to speech pattern recognition. However, speech pattern recognition by Eigen-FLS still suffers unsatisfactory recognition performance when the data collected for the eigenvalue calculations of the FLS eigenspace are scarce. To tackle this issue, this paper proposes two improved Eigen-FLS methods, incremental MLED Eigen-FLS and EigenMLLR-like Eigen-FLS, both of which use a linear interpolation scheme to properly adjust the target speaker's Eigen-FLS model derived from an FLS eigenspace. The developed incremental MLED Eigen-FLS and EigenMLLR-like Eigen-FLS are superior to the conventional Eigen-FLS, especially when data from the target speaker are insufficient.
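The linear interpolation idea mentioned above can be illustrated generically: blend a data-derived eigenspace model with a prior (speaker-independent) model, with a weight that grows as more target-speaker data arrive. The weighting rule and the constant `tau` below are my assumptions for illustration, not the paper's formulation:

```python
import numpy as np

def interpolate_model(eigen_model, prior_model, n_samples, tau=50.0):
    """Blend a data-derived eigenspace model with a prior model.
    With scarce adaptation data the prior dominates; with abundant
    data the eigenspace estimate dominates. `tau` is a hypothetical
    smoothing constant, not a value from the paper."""
    w = n_samples / (n_samples + tau)
    return w * np.asarray(eigen_model, dtype=float) + (1.0 - w) * np.asarray(prior_model, dtype=float)

blended = interpolate_model([1.0, 1.0], [0.0, 0.0], n_samples=50)
```

This kind of shrinkage toward a prior is a standard remedy when the target speaker's data are too scarce to estimate the model reliably on their own.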


Sensors ◽  
2020 ◽  
Vol 21 (1) ◽  
pp. 52
Author(s):  
Tianyi Zhang ◽  
Abdallah El Ali ◽  
Chen Wang ◽  
Alan Hanjalic ◽  
Pablo Cesar

Recognizing user emotions while they watch short-form videos anytime and anywhere is essential for facilitating video content customization and personalization. However, most works either classify a single emotion per video stimulus, or are restricted to static, desktop environments. To address this, we propose a correlation-based emotion recognition algorithm (CorrNet) to recognize the valence and arousal (V-A) of each instance (fine-grained segment of signals) using only wearable, physiological signals (e.g., electrodermal activity, heart rate). CorrNet takes advantage of features both inside each instance (intra-modality features) and between different instances for the same video stimulus (correlation-based features). We first test our approach on an indoor-desktop affect dataset (CASE), and thereafter on an outdoor-mobile affect dataset (MERCA) which we collected using a smart wristband and a wearable eye tracker. Results show that for subject-independent binary classification (high-low), CorrNet yields promising recognition accuracies: 76.37% and 74.03% for V-A on CASE, and 70.29% and 68.15% for V-A on MERCA. Our findings show that: (1) instance segment lengths between 1–4 s result in the highest recognition accuracies; (2) accuracies between laboratory-grade and wearable sensors are comparable, even at low sampling rates (≤64 Hz); and (3) large amounts of neutral V-A labels, an artifact of continuous affect annotation, result in varied recognition performance.
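The instance-level feature idea can be illustrated with a minimal sketch: split a physiological signal into fixed-length instances, then compute per-instance (intra) statistics and correlation-based features relating each instance to the other instances of the same stimulus. This is a simplified reading of the CorrNet idea, not the published architecture:

```python
import numpy as np

def instance_features(signal, sr, seg_len_s=2.0):
    """Split a physiological signal into fixed-length instances and compute
    (a) intra-instance features (mean, std) and (b) a correlation-based
    feature: each instance's Pearson correlation with the mean instance
    of the same stimulus."""
    seg = int(sr * seg_len_s)
    n = len(signal) // seg
    instances = np.reshape(signal[:n * seg], (n, seg))
    intra = np.column_stack([instances.mean(axis=1), instances.std(axis=1)])
    ref = instances.mean(axis=0)  # stimulus-level reference instance
    corr = np.array([np.corrcoef(inst, ref)[0, 1] for inst in instances])
    return intra, corr

# Hypothetical 10 s signal sampled at 100 Hz -> five 2 s instances.
sig = np.sin(np.linspace(0.0, 20.0 * np.pi, 1000))
intra, corr = instance_features(sig, sr=100)
```

Per-instance labels of this kind are what allow fine-grained V-A recognition rather than one emotion per video.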


Author(s):  
Wei Jia ◽  
Wei Xia ◽  
Yang Zhao ◽  
Hai Min ◽  
Yan-Xiang Chen

Palmprint recognition and palm vein recognition are two emerging biometric technologies. In the past two decades, many traditional methods have been proposed for palmprint recognition and palm vein recognition and have achieved impressive results. In recent years, in the field of artificial intelligence, deep learning has gradually become the mainstream recognition technology because of its excellent recognition performance. Some researchers have tried to use convolutional neural networks (CNNs) for palmprint recognition and palm vein recognition. However, the architectures of these CNNs have mostly been designed manually by human experts, which is a time-consuming and error-prone process. To overcome the shortcomings of manually designed CNNs, neural architecture search (NAS) has become an important research direction in deep learning. NAS addresses the problem of tuning a deep learning model's architecture automatically, combining optimization and machine learning, and represents an important future direction for deep learning. However, up to now, NAS has not been well studied for palmprint recognition or palm vein recognition. In this paper, to investigate NAS-based 2D and 3D palmprint recognition and palm vein recognition in depth, we conduct a performance evaluation of twenty representative NAS methods on five 2D palmprint databases, two palm vein databases, and one 3D palmprint database. Experimental results show that some NAS methods achieve promising recognition results. Remarkably, among the evaluated NAS methods, ProxylessNAS achieves the best recognition performance.


Electronics ◽  
2021 ◽  
Vol 10 (14) ◽  
pp. 1685
Author(s):  
Sakorn Mekruksavanich ◽  
Anuchit Jitpattanakul

Sensor-based human activity recognition (S-HAR) has become an important and high-impact research topic within human-centered computing. In the last decade, successful applications of S-HAR have been presented through fruitful academic research and industrial applications, including healthcare monitoring, smart home control, and daily sport tracking. However, the growing requirements of many current applications for recognizing complex human activities (CHA), as compared with simple human activities (SHA), have begun to attract the attention of the HAR research field. S-HAR work has shown that deep learning (DL), a type of machine learning based on complicated artificial neural networks, achieves a significant degree of recognition efficiency. Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are two types of DL methods that have been successfully applied to the S-HAR challenge in recent years. In this paper, we focus on four RNN-based DL models (LSTMs, BiLSTMs, GRUs, and BiGRUs) for complex activity recognition tasks. The efficiency of four hybrid DL models that combine convolutional layers with these RNN-based models is also studied. Experimental studies on the UTwente dataset demonstrate that the suggested hybrid RNN-based models achieve a high level of recognition performance across a variety of performance indicators, including accuracy, F1-score, and confusion matrix. The experimental results show that the hybrid DL model called CNN-BiGRU outperformed the other DL models, with a high accuracy of 98.89% when using only complex activity data. Moreover, the CNN-BiGRU model also achieved the highest recognition performance in the other scenarios (99.44% using only simple activity data and 98.78% with a combination of simple and complex activities).
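The performance indicators the abstract lists (accuracy, F1-score, confusion matrix) are related in a standard way: both accuracy and per-class F1 can be derived from the confusion matrix. A minimal sketch, with a toy two-class matrix rather than the paper's results:

```python
import numpy as np

def metrics_from_confusion(cm):
    """Overall accuracy and per-class F1 from a confusion matrix
    (rows = true class, columns = predicted class)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / np.maximum(cm.sum(axis=0), 1e-12)
    recall = tp / np.maximum(cm.sum(axis=1), 1e-12)
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    accuracy = tp.sum() / cm.sum()
    return accuracy, f1

cm = [[50, 2], [3, 45]]  # toy 2-class confusion matrix
acc, f1 = metrics_from_confusion(cm)
```

Reporting F1 alongside accuracy matters here because activity datasets are often class-imbalanced, and accuracy alone can mask poor recognition of rare activities.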

