Digital Audio Scene Recognition Method Based on Machine Learning Technology

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Sihua Sun

Audio scene recognition is a task that enables devices to understand their environment through digital audio analysis. It is a branch of computational auditory scene analysis. At present, this technology is widely used in intelligent wearable devices, robot sensing services, and other application scenarios. To explore the applicability of machine learning technology to digital audio scene recognition, an audio scene recognition method based on optimized audio processing and a convolutional neural network is proposed. First, unlike the traditional audio feature extraction method based on mel-frequency cepstral coefficients, the proposed method uses binaural representation and harmonic-percussive source separation to optimize the original audio and extract the corresponding features, so that the system can exploit the spatial features of the scene and thereby improve recognition accuracy. Then, an audio scene recognition system with a two-layer convolution module is designed and implemented. The network structure borrows from the VGGNet architecture used in image recognition to increase network depth and improve system flexibility. Experimental data analysis shows that, compared with traditional machine learning methods, the proposed method can greatly improve the recognition accuracy of each scene and achieves better generalization on different data.
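The abstract does not say how harmonic-percussive source separation is carried out. One widely used approach is median filtering of a magnitude spectrogram, and the following is a minimal sketch of that idea only, not the authors' implementation. The function names, the kernel size, and the list-of-lists spectrogram layout (rows are frequency bins, columns are time frames) are all assumptions made for illustration.

```python
from statistics import median


def median_filter_1d(values, kernel):
    """Median-filter a sequence with an odd kernel; windows shrink at the edges."""
    half = kernel // 2
    n = len(values)
    return [median(values[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def hpss_masks(spec, kernel=3):
    """Split a magnitude spectrogram into harmonic and percussive parts.

    spec is a list of rows (frequency bins), each a list of time frames.
    Binary masks are used: every bin goes entirely to one component.
    """
    n_freq, n_time = len(spec), len(spec[0])
    # Harmonic energy is smooth along time, so filter each frequency row.
    harm = [median_filter_1d(row, kernel) for row in spec]
    # Percussive energy is smooth along frequency, so filter each time column.
    cols = [median_filter_1d([spec[f][t] for f in range(n_freq)], kernel)
            for t in range(n_time)]
    harmonic = [[spec[f][t] if harm[f][t] >= cols[t][f] else 0.0
                 for t in range(n_time)] for f in range(n_freq)]
    percussive = [[spec[f][t] if harm[f][t] < cols[t][f] else 0.0
                   for t in range(n_time)] for f in range(n_freq)]
    return harmonic, percussive
```

On a toy spectrogram with one sustained tone (a horizontal line) and one click (a vertical line), the tone ends up in the harmonic part and the click in the percussive part. A practical system would use larger kernels and often soft masks instead of binary ones.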

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Xiaoying Shen ◽  
Chao Yuan

With the development of the live broadcast industry, security issues in the live broadcast process have become increasingly apparent. At present, supervision on live broadcast platforms is largely manual, relying mainly on user reports and platform supervision measures. However, a large number of live broadcast rooms are active at the same time, and human supervision alone can no longer meet the monitoring needs. To address this, this study proposes a violation information recognition method for live-broadcasting platforms based on machine learning technology. By analyzing the similarities and differences between normal and violating live broadcasts, together with the characteristics of violation image data, this study focuses on detecting human skin color and sensitive body parts. A prominent feature of violation images is that they contain a large area of naked skin, so the ratio of the naked-skin area to the overall image area exceeds a threshold. Skin color recognition provides the initial target positioning. Its accuracy directly affects the recognition accuracy of the entire system, so it is the most important part of violation information recognition. Although many effective skin color recognition technologies exist, their accuracy and stability still need improvement because of external factors such as light intensity, light source color, and physical equipment. When the skin-colored area in the live screen exceeds the threshold, the video is preliminarily flagged as a suspected violation. To improve recognition accuracy, the suspected video is then checked for sensitive parts. Naked female breasts are a very obvious feature in violation images. This study uses a chest feature extraction method to detect the chest in the image. When the recognition result is a violation image, the live broadcast is determined to involve violation content. The machine learning algorithm is simple to implement, and its parameters are easy to adjust. Classifier training takes little time, which suits live violation information recognition scenarios. The experimental results on the adopted data set show that the method can effectively detect videos with violation content. The recognition rate is as high as 85.98%, which is suitable for a real-life environment and has good practical significance.
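The first screening stage described above, comparing the skin-colored area ratio against a threshold, can be sketched as follows. The abstract does not specify the skin color model or the threshold value. The RGB rule used here is a commonly cited daylight heuristic substituted as a stand-in, and the names `is_skin_rgb`, `skin_ratio`, `is_suspected`, and the default threshold of 0.3 are hypothetical.

```python
def is_skin_rgb(r, g, b):
    """Heuristic RGB skin rule for daylight illumination (an assumed stand-in,
    not the skin color model used in the study)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_ratio(pixels):
    """Fraction of pixels classified as skin; pixels is an iterable of (r, g, b)."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(is_skin_rgb(*p) for p in pixels) / len(pixels)


def is_suspected(pixels, threshold=0.3):
    """Flag a frame for the second-stage check when the skin ratio exceeds
    the threshold (0.3 is a hypothetical value, not taken from the study)."""
    return skin_ratio(pixels) > threshold
```

As the abstract notes, a fixed color rule like this is sensitive to lighting and camera differences, which is why the flagged frames go on to a second, more specific detector instead of being treated as violations directly.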


Author(s):  
Tew Jia Yu ◽  
Chin Poo Lee ◽  
Kian Ming Lim ◽  
Siti Fatimah Abdul Razak

The most common technologies used in targeted advertising are facial recognition and vehicle recognition. Although existing systems serve targeting purposes, most offer limited functionality, and their performance is usually unreported. This paper presents an intelligent targeted advertising system with multiple functionalities, namely facial recognition for gender and age, vehicle recognition, and multiple object detection. The main purpose is to improve the effectiveness of outdoor advertising through biometric approaches and machine learning technology. Machine learning algorithms are implemented for higher recognition accuracy and hence achieve a better targeted advertising effect.


2021 ◽  
Vol 257 ◽  
pp. 01019
Author(s):  
Zhe Li ◽  
Haifeng Su

Using machine learning technology built on neural networks, this paper focuses on the classification and recognition of image data of transformers, circuit breakers, and isolation switches in substations. First, image enhancement is applied to the original images to simulate scenes that may occur in reality. Second, a dual-mode deconvolutional network (DMDN) is used to capture significant in-depth features from visible and infrared images. These features then undergo transfer learning and weighted fusion. The DMDN extracts and highlights the features of the electrical equipment. Compared with the traditional model, the improved model reaches a recognition accuracy of 99.17%.


Sensors ◽  
2019 ◽  
Vol 19 (2) ◽  
pp. 313 ◽  
Author(s):  
Pengbo Gao ◽  
Yan Zhang ◽  
Linhuan Zhang ◽  
Ryozo Noguchi ◽  
Tofael Ahamed

Unmanned aerial vehicle (UAV)-based spraying systems have recently become important for the precision application of pesticides, using machine learning approaches. Therefore, the objective of this research was to develop a machine learning system that has the advantages of high computational speed and good accuracy for recognizing spray and non-spray areas for UAV-based sprayers. A machine learning system was developed by using the mutual subspace method (MSM) for images collected from a UAV. Two types of target land, agricultural croplands and orchard areas, were considered in building two classifiers for distinguishing spray and non-spray areas. The field experiments were conducted in target areas to train and test the system by using a commercial UAV (DJI Phantom 3 Pro) with an onboard 4K camera. The images were collected from low (5 m) and high (15 m) altitudes for croplands and orchards, respectively. The recognition system was divided into offline and online systems. In the offline recognition system, 74.4% accuracy was obtained for the classifiers in recognizing spray and non-spray areas for croplands. In the case of orchards, the average classifier recognition accuracy of spray and non-spray areas was 77%. The online recognition system, by contrast, had an average accuracy of 65.1% for croplands and 75.1% for orchards. The computational time for the online recognition system was minimal, with an average of 0.0031 s for classifier recognition. The developed machine learning system had an average recognition accuracy of 70% and can be implemented in an autonomous UAV spray system for recognizing spray and non-spray areas in real-time applications.


Sensors ◽  
2020 ◽  
Vol 20 (5) ◽  
pp. 1415
Author(s):  
Hirokazu Madokoro ◽  
Kazuhisa Nakasho ◽  
Nobuhiro Shimoi ◽  
Hanwool Woo ◽  
Kazuhito Sato

This paper presents a novel bed-leaving sensor system for real-time recognition of bed-leaving behavior patterns. The proposed system comprises five pad sensors installed on a bed, a rail sensor inserted in a safety rail, and a behavior pattern recognizer based on machine learning. The linear characteristic between loads and output was obtained from a load test to evaluate sensor output characteristics. Moreover, the output values change linearly concomitantly with speed to attain the sensor with the equivalent load. We obtained benchmark datasets of continuous and discontinuous behavior patterns from ten subjects. The recognition targets of our sensor prototype and its monitoring system comprise five behavior patterns: sleeping, longitudinal sitting, lateral sitting, terminal sitting, and leaving the bed. We compared five types of machine learning algorithms for recognizing the five behavior patterns. The experimentally obtained results revealed that the proposed sensor system improved recognition accuracy for both datasets. Moreover, we achieved improved recognition accuracy after integrating the learning datasets into a general discriminator.


Speech recognition technology has been developing very fast lately. One of its applications is looking up the meaning of terms in a geographic dictionary. When a subject speaks a word to the system, it outputs the word together with its meaning and explanation. Many methods have been applied to speech recognition. One method that can improve the accuracy of speech recognition is deep learning, in particular the convolutional neural network (CNN). In this research, CNN speech recognition accuracy for the Indonesian geographic dictionary is analyzed to show that a CNN can improve accuracy compared with speech recognition based on the Gaussian mixture model and hidden Markov model (GMM-HMM). A CNN is a deep learning method that analyzes and finds similarities in the Mel-frequency cepstral coefficients (MFCC) of sound waves. This research builds models of the spoken words using a CNN under Python and TensorFlow. The CNN is trained on speech data collected and prepared from 20 students, consisting of 19 men and one woman aged 19 to 23 years. The vocabulary of the database consists of 50 words. The result of this research is a desktop application with the trained models implemented. Our application recognizes the words spoken by subjects well. The trained models were tested to examine the accuracy of the built speech recognition system. The CNN speech recognition method for the Indonesian geographic dictionary achieves 80% accuracy for isolated words and 72.67% for continuous words in our research.


2021 ◽  
Author(s):  
Linghui Xu ◽  
Jiansong Chen ◽  
Fei Wang ◽  
Yuting Chen ◽  
Wei Yang ◽  
...  

Background: Pathological gaits in children may lead to serious conditions such as osteoarthritis or scoliosis. By monitoring a child's gait pattern, proper therapeutic measures can be recommended to avoid these consequences. However, low-cost systems for automatic pathological gait recognition in children are not yet on the market. Our goal was to design a low-cost gait-recognition system for children that uses only pressure information. Methods: In this study, we design a pathological gait-recognition system (PGRS) with an 8 × 8 pressure-sensor array. An intelligent gait-recognition method (IGRM) based on machine learning and pure plantar pressure information is also proposed, in static and dynamic sections, to achieve high accuracy and good real-time performance. To verify the recognition effect, seventeen children were recruited to wear the PGRS in experiments recognizing three pathological gaits (toe in, toe out, and flat) and normal gait. The children were asked to walk naturally on level ground in the dynamic section or to stand naturally and comfortably in the static section. Recognition performance was evaluated with stratified 10-fold cross-validation, using recall, precision, and time cost as metrics. Results: The experimental results show that all of the IGRMs reach a practically applicable average accuracy in both the dynamic and static sections. The IGRM achieves 92.41% and 97.79% recognition accuracy in the static and dynamic sections, respectively. Methods in the static section are less accurate because of the unnatural posture of children when standing. Conclusions: In this study, a low-cost PGRS has been verified to be feasible, with high average precision and good real-time performance for gait recognition. The experimental results reveal the potential for computer supervision of non-pathological and pathological gaits from the plantar-pressure patterns of children and for providing feedback in gait-abnormality rectification.
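The evaluation protocol named in the abstract, stratified k-fold cross-validation scored with precision and recall, can be illustrated with a short sketch. This is not the authors' code. The helper names are hypothetical, and samples are assigned to folds round-robin without shuffling to keep the example deterministic, which a real evaluation would usually not do.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign sample indices to k folds so each fold keeps roughly the
    same class proportions as the full data set (no shuffling)."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In each round, one fold serves as the test set and the remaining folds as training data. With four gait classes, precision and recall would be computed per class and then averaged.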


Music is a widely used data format amid the explosion of Internet information. Automatically identifying the style of online music is an important and active topic in music information retrieval and music production. Recently, automatic music style recognition has been used in many real-life scenarios, and the emergence of machine learning provides a good foundation for it. This paper adopts machine learning technology to establish an automatic music style recognition system. First, the online music is processed by waveform analysis to remove noise. Second, the denoised music signals are represented as sample entropy features by using empirical mode decomposition. Lastly, the extracted features are used to train a relative margin support vector machine model that predicts the style of new music. The experimental results demonstrate the effectiveness of the proposed framework.
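Sample entropy, the feature named in the abstract, measures how irregular a signal is: regular signals score near zero and irregular ones score higher. The sketch below follows the standard definition and does not reproduce the paper's pipeline. In particular, the decomposition step is omitted, and the tolerance `r` is treated as an absolute value, whereas it is often set relative to the signal's standard deviation.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy SampEn(m, r) of a sequence x.

    Counts pairs of templates that match within tolerance r (Chebyshev
    distance) at lengths m and m + 1, excluding self-matches, and returns
    -ln(A / B). Returns infinity when no matches are found.
    """
    n = len(x)

    def count_matches(length):
        # Use the same n - m starting positions for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)
```

A perfectly periodic signal such as `[0, 1] * 10` gives zero, because every match at length `m` also extends to length `m + 1`. The pairwise comparison is quadratic in the signal length, so long recordings would need windowing or a faster algorithm.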


2011 ◽  
Vol 268-270 ◽  
pp. 82-87
Author(s):  
Zhi Peng Zhao ◽  
Yi Gang Cen ◽  
Xiao Fang Chen

In this paper, we propose a new noisy speech recognition method based on compressive sensing theory. Through compressive sensing, our method greatly increases the noise robustness of the speech recognition system, which improves recognition accuracy. In our experiments, the proposed method achieved better recognition performance than the traditional isolated word recognition method based on the dynamic time warping (DTW) algorithm.
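For context, the DTW baseline mentioned in the abstract aligns two sequences of different lengths by allowing frames to stretch or repeat. The sketch below is a minimal version of textbook DTW with a nearest-template classifier. It assumes one scalar feature per frame for brevity, whereas a real isolated word recognizer would compare feature vectors, and the names `dtw_distance` and `classify` are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two scalar sequences,
    using absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def classify(query, templates):
    """Return the label of the stored template closest to the query."""
    return min(templates, key=lambda word: dtw_distance(query, templates[word]))
```

Because the warping path can repeat frames, `[1, 2, 3]` and `[1, 2, 2, 3]` have distance zero even though their lengths differ, which is what makes DTW suitable for words spoken at different speeds.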

