Hybrid neural network based on novel audio feature for vehicle type identification

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Haoze Chen ◽  
Zhijie Zhang

Abstract Because the audio signatures of different vehicle types are distinct, a vehicle's type can be identified accurately from its audio signal. In practice, determining the type of a vehicle does not require visual information; audio information alone is enough. In this paper, we extract and stitch together features from different aspects: Mel-frequency cepstral coefficients (perceptual characteristics), pitch class profile (psychoacoustic characteristics), and short-term energy (acoustic characteristics). In addition, we improve the neural network classifier by fusing an LSTM unit into a convolutional neural network. Finally, we feed the novel feature into the hybrid neural network to recognize different vehicles. The results suggest that the novel feature proposed in this paper increases the recognition rate by 7%; that randomly corrupting the training data by superimposing different kinds of noise improves the noise robustness of the identification system; and that, because LSTM has great advantages in modeling time series, adding LSTM to the network improves the recognition rate by 3.39%.
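As a rough illustration of two steps named in the abstract above (short-term energy and corrupting training audio with superimposed noise), here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the frame and hop sizes, and the choice of white Gaussian noise at a target signal-to-noise ratio are all assumptions made for illustration, and the MFCC and pitch class profile features are omitted.

```python
import math
import random


def short_term_energy(signal, frame_len, hop):
    """Sum of squared samples in each frame of frame_len samples, stepping by hop."""
    return [
        sum(x * x for x in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def add_noise(signal, snr_db, rng):
    """Superimpose white Gaussian noise scaled to a target SNR in dB (assumed noise model)."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def stitch(*feature_vectors):
    """Concatenate several feature vectors into one combined vector."""
    combined = []
    for vec in feature_vectors:
        combined.extend(vec)
    return combined


# Hypothetical input: a pure tone standing in for a vehicle recording.
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 5 * n / 100) for n in range(400)]
energy = short_term_energy(tone, frame_len=100, hop=50)
noisy = add_noise(tone, snr_db=10, rng=rng)
```

In a pipeline like the one described, `stitch` would join the per-frame energy with the other two feature vectors before classification; how the authors align frames across features is not stated in the abstract.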

2019 ◽  
Vol 32 (2) ◽  
pp. 87-109 ◽  
Author(s):  
Galit Buchs ◽  
Benedetta Heimler ◽  
Amir Amedi

Abstract Visual-to-auditory Sensory Substitution Devices (SSDs) are a family of non-invasive devices for visual rehabilitation aiming at conveying whole-scene visual information through the intact auditory modality. Although proven effective in lab environments, the use of SSDs has yet to be systematically tested in real-life situations. To start filling this gap, in the present work we tested the ability of expert SSD users to filter out irrelevant background noise while focusing on the relevant audio information. Specifically, nine blind expert users of the EyeMusic visual-to-auditory SSD performed a series of identification tasks via SSDs (i.e., shape, color, and conjunction of the two features). Their performance was compared in two separate conditions: silent baseline, and with irrelevant background sounds from real-life situations, using the same stimuli in a pseudo-random balanced design. Although the participants described the background noise as disturbing, no significant performance differences emerged between the two conditions (i.e., noisy; silent) for any of the tasks. In the conjunction task (shape and color) we found a non-significant trend for a disturbing effect of the background noise on performance. These findings suggest that visual-to-auditory SSDs can indeed be successfully used in noisy environments and that users can still focus on relevant auditory information while inhibiting irrelevant sounds. Our findings take a step towards the actual use of SSDs in real-life situations while potentially impacting rehabilitation of sensory deprived individuals.


2021 ◽  
Vol 13 (0) ◽  
pp. 1-5
Author(s):  
Mantas Tamulionis

Methods based on artificial neural networks (ANN) are widely used in various audio signal processing tasks. This provides opportunities to optimize processes and save resources required for calculations. One of the main quantities needed to capture the acoustics of a room numerically is the room impulse response (RIR). Increasingly, researchers choose not to record these impulse responses in a real room but to generate them using ANN, as this gives them the freedom to prepare training datasets of unlimited size. Neural networks are also used to augment the generated impulse responses so that they resemble actually recorded ones. The widest use of ANN so far is observed in the evaluation of the generated results, for example, in automatic speech recognition (ASR) tasks. This review also describes datasets of recorded RIRs commonly found in various studies that are used as training data for neural networks.


1993 ◽  
Vol 39 (11) ◽  
pp. 2248-2253 ◽  
Author(s):  
P K Sharpe ◽  
H E Solberg ◽  
K Rootwelt ◽  
M Yearworth

Abstract We studied the potential benefit of using artificial neural networks (ANNs) for the diagnosis of thyroid function. We examined two types of ANN architecture and assessed their robustness in the face of diagnostic noise. The thyroid function data we used had previously been studied by multivariate statistical methods and a variety of pattern-recognition techniques. The total data set comprised 392 cases that had been classified according to both thyroid function and 19 clinical categories. All cases had a complete set of results of six laboratory tests (total thyroxine, free thyroxine, triiodothyronine, triiodothyronine uptake test, thyrotropin, and thyroxine-binding globulin). This data set was divided into subsets used for training the networks and for testing their performance; the test subsets contained various proportions of cases with diagnostic noise to mimic real-life diagnostic situations. The networks studied were a multilayer perceptron trained by back-propagation, and a learning vector quantization network. The training data subsets were selected according to two strategies: either training data based on cases with extreme values for the laboratory tests with randomly selected nonextreme cases added, or training cases from very pure functional groups. Both network architectures were efficient irrespective of the type of training data. The correct allocation of cases in test data subsets was 96.4-99.7% when extreme values were used for training and 92.7-98.8% when only pure cases were used.


2020 ◽  
Vol 12 (1) ◽  
pp. 51-59
Author(s):  
A. A. Moskvin ◽  
A.G. Shishkin

Human emotions play a significant role in everyday life. Automatic emotion recognition has many applications in medicine, e-learning, monitoring, marketing, etc. This paper proposes a method and a neural network architecture for real-time human emotion recognition from audio-visual data. To classify one of seven emotions, deep neural networks, namely convolutional and recurrent neural networks, are used. Visual information is represented by a sequence of 16 frames of 96 × 96 pixels, and audio information by 140 features for each of a sequence of 37 temporal windows. An autoencoder is used to reduce the number of audio features. Combining audio information with visual information is shown to increase recognition accuracy by up to 12%. The developed system is not demanding of computing resources and is flexible in terms of parameter selection, reducing or increasing the number of emotion classes, and the ability to easily add, accumulate, and use information from other external devices to further improve classification accuracy.


Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 562
Author(s):  
Marcin Kociołek ◽  
Michał Kozłowski ◽  
Antonio Cardone

The perceived texture directionality is an important, not fully explored image characteristic. In many applications texture directionality detection is of fundamental importance. Several approaches have been proposed, such as the fast Fourier-based method. We recently proposed a method based on the interpolated grey-level co-occurrence matrix (iGLCM), robust to image blur and noise but slower than the Fourier-based method. Here we test the applicability of convolutional neural networks (CNNs) to texture directionality detection. To obtain the large amount of training data required, we built a training dataset consisting of synthetic textures with known directionality and varying perturbation levels. Subsequently, we defined and tested shallow and deep CNN architectures. We present the test results focusing on the CNN architectures and their robustness with respect to image perturbations. We identify the best performing CNN architecture, and compare it with the iGLCM, the Fourier and the local gradient orientation methods. We find that the accuracy of CNN is lower, yet comparable to the iGLCM, and it outperforms the other two methods. As expected, the CNN method shows the highest computing speed. Finally, we demonstrate the best performing CNN on real-life images. Visual analysis suggests that the learned patterns generalize to real-life image data. Hence, CNNs represent a promising approach for texture directionality detection, warranting further investigation.


2020 ◽  
Author(s):  
DEEPA P V ◽  
Joseph Jawhar S ◽  
Mary Geisa

Abstract Background: Nanotechnology has recently become popular because of the increased rate of accurate and effective detection in clinical patients using Computer-Aided Diagnosis (CAD). Nano-scale imaging technology makes it possible to classify brain tumors as benign or malignant with high precision and accuracy. This helps to improve the quality of life of patients with brain tumors. Results: In this work, we propose a novel Semantic nano-segmentation for the detection of brain tumors even in the nano-scale range. The proposed Semantic Nano-segmentation, based on Advanced Convolutional Neural Networks (A-CNN), will help radiologists find brain cancer even at early stages, when nodules are very small. The proposed A-CNN method uses ResNet-50. The nano-image is taken as input and the tumor image is segmented using Semantic Nano-segmentation, which yields average Dice and SSIM values of 0.2133 and 0.9704, respectively. The proposed Semantic nano-segmentation obtains an accuracy of 93.2% and 92.7% for benign and malignant tumor images, respectively. The A-CNN method of automatic classification has an average accuracy of 99.57% and 95.7% for benign and malignant images, respectively. Conclusion: This novel nano-method is created for effective detection of the tumor area in nanometers (nm) and thus evaluates the disease perfectly. The closeness of the proposed method's ROC curve to the true positive values indicates higher performance than other methods. A comparative analysis on ResNet-50 with testing and training data split at ratios of 90%-10%, 80%-20%, and 70%-30% shows the effectiveness of the proposed work.
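The Dice value reported above is an overlap measure between a predicted segmentation mask and a reference mask. A minimal sketch of the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), follows. The function name and the toy masks are hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the abstract.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks given as nested lists."""
    flat_pred = [p for row in pred for p in row]
    flat_truth = [t for row in truth for t in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(1 for p, t in zip(flat_pred, flat_truth) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy 2 x 3 masks: three foreground pixels each, two of them shared.
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
score = dice_coefficient(pred_mask, true_mask)
```

The score ranges from 0 (no overlap) to 1 (identical masks).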


2001 ◽  
Vol 11 (02) ◽  
pp. 167-177 ◽  
Author(s):  
I. M. GALVÁN ◽  
P. ISASI ◽  
R. ALER ◽  
J. M. VALLS

Multilayer feedforward neural networks trained with the backpropagation algorithm have been used successfully in many applications. However, the level of generalization is heavily dependent on the quality of the training data. That is, some of the training patterns can be redundant or irrelevant. It has been shown that with careful dynamic selection of training patterns, better generalization performance may be obtained. Nevertheless, generalization is carried out independently of the novel patterns to be approximated. In this paper, we present a learning method that automatically selects the training patterns most appropriate to the new sample to be predicted. This training method follows a lazy learning strategy, in the sense that it builds approximations centered around the novel sample. The proposed method has been applied to three different domains: two artificial approximation problems and a real time series prediction problem. Results have been compared to standard backpropagation using the complete training data set, and the new method shows better generalization abilities.
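The lazy strategy described above defers training until a query arrives and then uses only the training patterns relevant to that query. The abstract does not state the selection rule, so the sketch below assumes the simplest one: keep the k patterns closest to the query in Euclidean distance. The function name, the parameter `k`, and the toy data are hypothetical.

```python
import math


def select_patterns(train_x, train_y, query, k):
    """Return the k training patterns nearest to the query (assumed Euclidean rule)."""
    ranked = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    chosen = ranked[:k]
    return [train_x[i] for i in chosen], [train_y[i] for i in chosen]


# Toy one-dimensional data; the targets follow y = x^2.
train_x = [(0.0,), (1.0,), (2.0,), (5.0,)]
train_y = [0.0, 1.0, 4.0, 25.0]
local_x, local_y = select_patterns(train_x, train_y, query=(1.2,), k=2)
```

A network would then be trained on `local_x` and `local_y` only, so the approximation is centered on the new sample.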


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Georgios Kantidakis ◽  
Elia Biganzoli ◽  
Hein Putter ◽  
Marta Fiocco

Background. Studies focusing on prediction models are widespread in medicine. There is a trend in applying machine learning (ML) by medical researchers and clinicians. Over the years, multiple ML algorithms have been adapted to censored data. However, the choice of methodology should be motivated by the real-life data and their complexity. Here, the predictive performance of ML techniques is compared with statistical models in a simple clinical setting (small/moderate sample size and small number of predictors) with Monte-Carlo simulations. Methods. Synthetic data (250 or 1000 patients) were generated that closely resembled 5 prognostic factors preselected based on a European Osteosarcoma Intergroup study (MRC BO06/EORTC 80931). Comparison was performed between 2 partial logistic artificial neural networks (PLANNs) and Cox models for 20, 40, 61, and 80% censoring. Survival times were generated from a log-normal distribution. Models were contrasted in terms of the C-index, Brier score at 0-5 years, integrated Brier score (IBS) at 5 years, and miscalibration at 2 and 5 years (usually neglected). The endpoint of interest was overall survival. Results. PLANNs original/extended were tuned based on the IBS at 5 years and the C-index, achieving a slightly better performance with the IBS. Comparison with Cox models showed that PLANNs can reach similar predictive performance on simulated data for most scenarios with respect to the C-index, Brier score, or IBS. However, Cox models were frequently less miscalibrated. Performance was robust in scenario data where censored patients were removed before 2 years or curtailing at 5 years was performed (on training data). Conclusion. Survival neural networks reached a comparable predictive performance with Cox models but were generally less well calibrated. 
All in all, researchers should be aware of burdensome aspects of ML techniques such as data preprocessing, tuning of hyperparameters, and computational intensity that render them disadvantageous against conventional regression models in a simple clinical setting.
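The C-index used above to compare the models measures how often a model ranks two patients in the right order. A minimal sketch of Harrell's concordance index for right-censored data follows: a pair is comparable when the patient with the shorter time had an observed event, and the pair is concordant when that patient also has the higher predicted risk. The function name and the toy data are hypothetical, and this simple version ignores pairs with tied times.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs whose predicted risks are correctly ordered."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four patients, the second one censored at time 4.
times = [2, 4, 6, 8]
events = [1, 0, 1, 1]
risks = [0.9, 0.8, 0.3, 0.4]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.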


Author(s):  
Yongsheng Rao ◽  
Saeed Kosari ◽  
Zehui Shao ◽  
A. A. Talebi ◽  
A. Mahdavi ◽  
...  

Abstract It is known that intuitionistic fuzzy models give more precision, flexibility and compatibility to the system as compared to the classic and fuzzy models. The intuitionistic fuzzy tree has an important role in neural networks, computer networks, and clustering. In the design of a network, it is important to analyze connections between the levels. In addition, the intuitionistic fuzzy tree is becoming increasingly significant as it is applied to different areas in real life. The study proposes the novel concepts of intuitionistic fuzzy graph (IFG) and some basic definitions. We investigate the types of arcs, for example, $\alpha_{\mu}$-strong, $\beta_{\mu}$-strong, and $\delta_{\mu}$-arc in an intuitionistic fuzzy graph, and introduce some of their properties. In particular, the present work develops the concepts of intuitionistic fuzzy bridge (IFB), intuitionistic fuzzy cut nodes (IFCN) and some important properties of an intuitionistic fuzzy bridge. Next, we define an intuitionistic fuzzy cycle (IFC) and an intuitionistic fuzzy tree (IFT). Likewise, we discuss some properties of the IFT and the relationship between an intuitionistic fuzzy tree and an intuitionistic fuzzy cycle. Finally, an application of the intuitionistic fuzzy tree in other sciences is illustrated.


Author(s):  
Suping Li ◽  
Zhanfeng Wang ◽  
Jing Wang

A learning vector quantization (LVQ) network and a back-propagation (BP) network are constructed easily with the MATLAB toolbox while maintaining the recognition rate. Face images are randomly selected from the image set as training data for the LVQ network and the BP network, which are trained with the LVQ algorithm and the BP algorithm, respectively. Automatic recognition of face orientation is achieved once the networks converge. First, all images are processed by edge detection. Then feature vectors representing the position of the eyes are extracted from the edge-detected images. The feature vectors of the training set are fed to the networks to adjust the parameters, which ensures the convergence speed and performance of the networks. Experimental results show that the constructed LVQ network and BP network can judge face orientation from the feature vectors of input images. Generally, the recognition rate of the LVQ network is higher than that of the BP network. Both networks are feasible and effective for face orientation recognition to some extent. The advantage of this work is that the recognition system is efficient and easy to deploy. This paper focuses on how to use MATLAB to design the identification network easily rather than on the complexity of the identification system. Future research will focus on the stability and robustness of the recognition network.
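For readers unfamiliar with LVQ training, here is a minimal sketch of the basic LVQ1 update rule in plain Python rather than MATLAB. It is not the configuration used in the paper: the learning rate, epoch count, two-dimensional feature vectors, and the class names "left" and "right" are hypothetical. In LVQ1 the prototype nearest to a training sample moves toward the sample when their labels match and away from it otherwise.

```python
import math


def lvq1_train(samples, labels, prototypes, proto_labels, lr=0.1, epochs=20):
    """Run LVQ1 updates and return the adjusted prototypes."""
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Find the winning (nearest) prototype for this sample.
            w = min(range(len(protos)), key=lambda i: math.dist(protos[i], x))
            # Attract the winner if its label matches, repel it otherwise.
            sign = 1.0 if proto_labels[w] == y else -1.0
            protos[w] = [p + sign * lr * (xi - p) for p, xi in zip(protos[w], x)]
    return protos


def lvq_predict(x, protos, proto_labels):
    """Classify x by the label of its nearest prototype."""
    w = min(range(len(protos)), key=lambda i: math.dist(protos[i], x))
    return proto_labels[w]


# Hypothetical feature vectors for two face orientations.
samples = [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 0.9)]
labels = ["left", "left", "right", "right"]
proto_labels = ["left", "right"]
trained = lvq1_train(samples, labels, [(0.4, 0.4), (0.6, 0.6)], proto_labels)
```

After training, `lvq_predict((0.05, 0.05), trained, proto_labels)` assigns a new feature vector to the class of its nearest prototype.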

