Multimodal detection and recognition performance of sonar operators

1987 ◽  
Vol 18 (3) ◽  
pp. 249 ◽  
Author(s):  
D.A. Kobus ◽  
J. Russotti ◽  
C. Schlichting ◽  
G. Haskell ◽  
S. Carpenter ◽  
...  
1992 ◽  
Vol 35 (4) ◽  
pp. 942-949 ◽  
Author(s):  
Christopher W. Turner ◽  
David A. Fabry ◽  
Stephanie Barrett ◽  
Amy R. Horwitz

This study examined the possibility that hearing-impaired listeners, in addition to displaying poorer-than-normal recognition of speech presented in background noise, require a larger signal-to-noise ratio for the detection of the speech sounds. Psychometric functions for the detection and recognition of stop consonants were obtained from both normal-hearing and hearing-impaired listeners. Expressing the speech levels in terms of their short-term spectra, the detection of consonants for both subject groups occurred at the same signal-to-noise ratio. In contrast, the hearing-impaired listeners displayed poorer recognition performance than the normal-hearing listeners. These results imply that the higher signal-to-noise ratios required for a given level of recognition by some subjects with hearing loss are not due in part to a deficit in detection of the signals in the masking noise, but rather are due exclusively to a deficit in recognition.


2021 ◽  
Vol 3 (11) ◽  
Author(s):  
Abhra Chaudhuri ◽  
Palaiahnakote Shivakumara ◽  
Pinaki Nath Chowdhury ◽  
Umapada Pal ◽  
Tong Lu ◽  
...  

Abstract: For video images with complex actions, achieving accurate text detection and recognition is very challenging. This paper presents a hybrid model for classifying action-oriented video images, which reduces the complexity of the problem and thereby improves text detection and recognition performance. We consider five genre categories: concert, cooking, craft, teleshopping and yoga. To classify action-oriented video images, we use ResNet50 to learn general pixel-distribution-level information, one VGG16 network to learn features of Maximally Stable Extremal Regions, and a second VGG16 network to learn facial components obtained by a multitask cascaded convolutional network. The approach integrates the outputs of these three models using a fully connected neural network that classifies images into the five action-oriented classes. We demonstrate the efficacy of the proposed method by testing on our own dataset and two standard datasets: the Scene Text Dataset, which contains 10 classes of scene images with text information, and the Stanford 40 Actions dataset, which contains 40 action classes without text information. Our method outperforms related existing work and significantly enhances the class-specific performance of text detection and recognition.

Article highlights: The method uses pixel, stable-region and face-component information in a novel way to solve complex classification problems. The proposed work fuses different deep learning models for successful classification of action-oriented images. Experiments on our own dataset as well as standard datasets show that the proposed model outperforms related state-of-the-art (SOTA) methods.
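The fusion step described in this abstract (three branch outputs combined by a fully connected network into five classes) can be sketched roughly as follows. This is only an illustration under stated assumptions: the branch feature vectors, their length of 4, the single dense layer, and the random weights are all hypothetical placeholders, not the authors' architecture or trained parameters.

```python
import math
import random

# The five genre classes named in the abstract.
CLASSES = ["concert", "cooking", "craft", "teleshopping", "yoga"]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_and_classify(branch_outputs, weights, biases):
    """Concatenate branch feature vectors, apply one dense layer, then softmax."""
    fused = [x for branch in branch_outputs for x in branch]
    scores = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)

rng = random.Random(0)
# Hypothetical stand-ins for the pixel, stable-region and face-component branches.
branches = [[rng.random() for _ in range(4)] for _ in range(3)]
# Hypothetical untrained weights: one row of 12 inputs per class.
weights = [[rng.uniform(-1, 1) for _ in range(12)] for _ in CLASSES]
biases = [0.0] * len(CLASSES)

probs = fuse_and_classify(branches, weights, biases)
print(CLASSES[probs.index(max(probs))])
```

In a real system the weights would be learned jointly or after training the three branches; the sketch only shows the data flow of late fusion.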


Author(s):  
Songjie Wei ◽  
Pengfei Jiang ◽  
Qiuzhuang Yuan ◽  
Meilin Liu

Synthetic aperture radar (SAR) ship target detection plays an increasingly important role in marine monitoring. To address the difficulty of recognizing small ship targets in SAR images and the inability of traditional methods to extract fine target features under external disturbances, we propose an improved SAR small-target detection model based on deep learning. The proposed model consists mainly of two parts: a region proposal network (RPN) and an object detection network. First, a CNN model is designed and trained to accurately identify small ship targets. Then, the model is used to initialize the parameters of the shared feature extraction layer. Finally, we train the proposed object detection model on a self-collected Sentinel-1 SAR small-target dataset. The experimental results show that the proposed model achieves better detection and recognition performance and better anti-interference ability for small-scale ship targets in SAR images, and that it has some reference value for research on small-target detection in SAR images.


2019 ◽  
Vol 8 (4) ◽  
pp. 9372-9376

In a war-prone global arena, where every enemy target on and off land must be kept under close watch and the enemy’s countermeasure technologies must be deceived, there is a strong requirement for a highly secure signal design that offers maximum detection range along with high range resolution. The efficiency of the system depends mainly on how much power the generated pulse carries in the main lobe relative to the side lobes, and on how independent the generated pulses are from one another. The autocorrelation and cross-correlation exhibited by a polyphase coded sequence are determined mathematically. Multiple-input multiple-output (MIMO) operation has the best potential for mitigating the effects of fading, enhancing resolution, and suppressing signal jamming and interference, all of which are useful for improving the target detection and recognition performance of the system. Recently, most optimization research on polyphase codes has been carried out with genetic algorithms (GA), particle swarm optimization (PSO) and simulated annealing (SA), but these techniques require more parameters for optimization. To overcome these difficulties, a modified PSO algorithm is adopted in the present research to optimize the polyphase coded sequence. The resulting codes are also tested for Doppler resilience by including a Doppler shift in the code-generation process.
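The main-lobe versus side-lobe measure mentioned in this abstract can be illustrated with a short sketch. The abstract does not give the optimized codes, so the example assumes a Frank code of length 16 purely as a familiar polyphase sequence; the function names are illustrative and not taken from the paper.

```python
import cmath
import math

def frank_code(m):
    """Frank polyphase code of length m*m (an assumed example sequence)."""
    return [cmath.exp(2j * math.pi * p * q / m)
            for p in range(m) for q in range(m)]

def aperiodic_correlation(a, b):
    """Aperiodic correlation of two equal-length sequences for lags 0..N-1."""
    n = len(a)
    return [sum(a[i + k] * b[i].conjugate() for i in range(n - k))
            for k in range(n)]

def peak_sidelobe_db(code):
    """Peak autocorrelation sidelobe relative to the main lobe, in dB."""
    acf = aperiodic_correlation(code, code)
    mainlobe = abs(acf[0])                       # equals the code length
    peak_sidelobe = max(abs(v) for v in acf[1:])
    return 20 * math.log10(peak_sidelobe / mainlobe)

code = frank_code(4)
print(round(peak_sidelobe_db(code), 2))
```

Passing two different sequences to `aperiodic_correlation` gives the cross-correlation, which is the quantity relevant to how independent the codes in a set are from one another. An optimizer such as the modified PSO named in the abstract would search for phases that lower these sidelobe and cross-correlation peaks.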


1980 ◽  
Vol 24 (1) ◽  
pp. 531-535
Author(s):  
Alice K. Agin

The purpose of this study was to evaluate operator target detection and recognition performance under realistic conditions to provide design information for the Army Mini-RPV (Remotely Piloted Vehicle) system. Bandwidth compression, atmospheric attenuation, target numerosity, target aspect angle, target type, and background complexity were investigated. The tasks included detection and recognition of realistic tactical vehicle targets with a 256 by 262 element video system. There were five bandwidth compression levels with respect to a 3.02 megabit per second uncompressed rate. Detection performance was significantly degraded at a 15:1 compression ratio, and recognition was degraded at a 7.5:1 compression ratio. Both target numerosity and aspect angle were associated with significant differences in performance. Atmospheric attenuation, target type and background complexity effects were not statistically significant.
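As a quick check of what the two compression ratios in this abstract mean in bit-rate terms, the arithmetic can be written out as below. Only the 3.02 megabit per second uncompressed rate and the 7.5:1 and 15:1 ratios come from the abstract; the other three compression levels are not stated and are not guessed here.

```python
# Uncompressed video rate reported in the abstract.
UNCOMPRESSED_MBPS = 3.02

def compressed_rate_mbps(ratio):
    """Bit rate after compression at the given ratio (e.g. 15 for 15:1)."""
    return UNCOMPRESSED_MBPS / ratio

for ratio in (7.5, 15.0):
    print(f"{ratio}:1 -> {compressed_rate_mbps(ratio):.3f} Mbit/s")
# 7.5:1 -> 0.403 Mbit/s  (recognition degraded)
# 15.0:1 -> 0.201 Mbit/s (detection degraded)
```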


Perception ◽  
2021 ◽  
pp. 030100662110559
Author(s):  
Myron Tsikandilakis ◽  
Zhaoliang Yu ◽  
Leonie Kausel ◽  
Gonzalo Boncompte ◽  
Renzo C. Lanfranco ◽  
...  

The theory of universal emotions suggests that certain emotions such as fear, anger, disgust, sadness, surprise and happiness can be encountered cross-culturally. These emotions are expressed using specific facial movements that enable human communication. More recently, theoretical and empirical models have been used to propose that universal emotions could be expressed via discretely different facial movements in different cultures due to the non-convergent social evolution that takes place in different geographical areas. This has prompted the consideration that own-culture emotional faces have distinct, evolutionarily important sociobiological value and can be processed automatically and without conscious awareness. In this paper, we tested this hypothesis using backward masking. In two different experiments per country of origin, we showed backward-masked own-culture and other-culture emotional faces to participants in Britain, Chile, New Zealand and Singapore. We assessed detection and recognition performance, and self-reports for emotionality and familiarity. Using Bayesian assessment of non-parametric receiver operating characteristics and hit-versus-miss detection and recognition response analyses, we presented thorough cross-cultural experimental evidence that masked faces showing own-culture dialects of emotion were rated higher for emotionality and familiarity than other-culture emotional faces, and that this effect involved conscious awareness.


2021 ◽  
Author(s):  
Hao Zheng ◽  
Jianfang Liu ◽  
Xiaogang Ren

Abstract: Although current vehicle detection and recognition frameworks based on deep learning each have their own characteristics and advantages, they struggle to combine multi-scale and multi-category vehicle features effectively, and there is still room for improvement in vehicle detection and recognition performance. On this basis, an improved fast R-CNN convolutional neural network is proposed to detect dim targets in complex traffic environments. The fast R-CNN deep learning model is introduced into image recognition for complex traffic environments, and a structure optimization method is proposed that replaces VGG16 in fast R-CNN with ResNet to make it suitable for small-target recognition against complex backgrounds. Max pooling is used as the downsampling method, and a feature pyramid network is introduced into the RPN to generate target candidate boxes, optimizing the structure of the convolutional neural network. After training with 1497 images, complex traffic environment images are identified and tested.


1991 ◽  
Vol 34 (2) ◽  
pp. 415-426 ◽  
Author(s):  
Richard L. Freyman ◽  
G. Patrick Nerbonne ◽  
Heather A. Cote

This investigation examined the degree to which modification of the consonant-vowel (C-V) intensity ratio affected consonant recognition under conditions in which listeners were forced to rely more heavily on waveform envelope cues than on spectral cues. The stimuli were 22 vowel-consonant-vowel utterances, which had been mixed at six different signal-to-noise ratios with white noise that had been modulated by the speech waveform envelope. The resulting waveforms preserved the gross speech envelope shape, but spectral cues were limited by the white-noise masking. In a second stimulus set, the consonant portion of each utterance was amplified by 10 dB. Sixteen subjects with normal hearing listened to the unmodified stimuli, and 16 listened to the amplified-consonant stimuli. Recognition performance was reduced in the amplified-consonant condition for some consonants, presumably because waveform envelope cues had been distorted. However, for other consonants, especially the voiced stops, consonant amplification improved recognition. Patterns of errors were altered for several consonant groups, including some that showed only small changes in recognition scores. The results indicate that when spectral cues are compromised, nonlinear amplification can alter waveform envelope cues for consonant recognition.
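To put the 10 dB consonant amplification in linear terms, the standard decibel conversions can be sketched as follows. The function names are illustrative; the only figure taken from the abstract is the 10 dB gain.

```python
def db_to_amplitude_ratio(gain_db):
    """Convert a level change in dB to a linear amplitude ratio."""
    return 10 ** (gain_db / 20)

def db_to_power_ratio(gain_db):
    """Convert a level change in dB to a linear power (intensity) ratio."""
    return 10 ** (gain_db / 10)

print(round(db_to_amplitude_ratio(10), 3))  # 3.162
print(db_to_power_ratio(10))                # 10.0
```

A 10 dB boost therefore scales the consonant waveform amplitude by roughly 3.16, or its intensity by 10, relative to the adjacent vowels, which is large enough to reshape the gross envelope that listeners were relying on.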

