Automatic Multiscale-based Peak Detection on Short Time Energy and Spectral Centroid Feature Extraction for Conversational Speech Segmentation

2021 ◽  
Author(s):  
Barlian Henryranu Prasetio ◽  
Edita Rosana Widasari ◽  
Hiroki Tamura
2014 ◽  
Vol 931-932 ◽  
pp. 1397-1401
Author(s):  
Pathasu Doungmala ◽  
Anupap Meesomboon

In this paper we present a new system that classifies TV programs into predefined categories by analyzing their audio and video content. Such a system is useful in intelligent display and storage devices that select channels and record or skip content according to the consumer's preferences. Different categories of TV programs show distinguishable patterns in their human faces and audio. Four categories are of interest here: news, cartoon, variety, and sport. News and variety programs show smaller differences between frames than sport and cartoon programs. For audio features, we apply short-time energy, zero crossing, spectral centroid, and the short-time Fourier transform. For face features, Haar-like features are first used for face detection, and eigenfaces are then applied for feature extraction. A neural network then performs the classification. Experimental results show that a classification accuracy of 95% is achievable, which is better than that reported in the compared paper.
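The audio features named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame length, hop size, Hann window, and the helper names `frame_signal`, `short_time_energy`, `zero_crossing_rate`, and `spectral_centroid` are all assumptions made for illustration, and the input is a synthetic 440 Hz tone rather than TV audio.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def spectral_centroid(frames, sample_rate):
    """Magnitude-weighted mean frequency of each frame, in Hz."""
    window = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    return (mag @ freqs) / (mag.sum(axis=1) + 1e-12)

# Synthetic test signal (assumption): one second of a 440 Hz sine at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

frames = frame_signal(tone, frame_len=400, hop=200)
ste = short_time_energy(frames)
zcr = zero_crossing_rate(frames)
sc = spectral_centroid(frames, sample_rate)
```

For a pure tone the three features are nearly constant across frames; on real program audio they would vary from frame to frame, and it is that variation a classifier would learn from.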


2012 ◽  
Vol 4 (1) ◽  
Author(s):  
David David

Abstract. Voice recognition technology is currently growing, especially in the area of speech processing. Speech processing is a way to extract desired information from a voice signal. This study discusses a system that classifies human voices as male or female. Extracting the characteristics of the voice signal for each frame, in both the time domain and the frequency domain, helps simplify and speed up the calculations. Features for voice or other audio include Short Time Energy, Zero Crossing Rate, Spectral Centroid, and others. Test results show that classifying human voices with a backpropagation neural network, using the Levenberg-Marquardt algorithm to update the weight matrix, works very well and quickly because the computational complexity is not too high. The voice database contains 40 samples, with 5 voices used as test data. The output of the system is a classification result, with a similarity value >= 0.5 recognized as male and < 0.5 as female. Testing with the artificial neural network produced an average classification success rate of 91%.
Keywords: Feature Extraction, Classification, Backpropagation, Levenberg-Marquardt Algorithm, Human Voice
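The decision rule stated in the abstract (a network output >= 0.5 is read as male, < 0.5 as female) can be sketched in a few lines of Python. The function names and the example scores and labels below are hypothetical values chosen for illustration, not data from the study.

```python
def label_from_score(score, threshold=0.5):
    """Map a network output to a class label using the stated 0.5 cut-off."""
    return "male" if score >= threshold else "female"

def success_rate(scores, true_labels, threshold=0.5):
    """Fraction of samples whose thresholded label matches the true label."""
    predicted = [label_from_score(s, threshold) for s in scores]
    hits = sum(p == t for p, t in zip(predicted, true_labels))
    return hits / len(true_labels)

# Hypothetical network outputs and ground-truth labels for five test voices.
scores = [0.91, 0.12, 0.50, 0.47, 0.73]
labels = ["male", "female", "male", "female", "female"]
rate = success_rate(scores, labels)
```

Note that a score of exactly 0.5 falls on the male side of the rule, as the abstract specifies.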


2021 ◽  
Vol 11 (15) ◽  
pp. 6748
Author(s):  
Hsun-Ping Hsieh ◽  
Fandel Lin ◽  
Jiawei Jiang ◽  
Tzu-Ying Kuo ◽  
Yu-En Chang

Public bike-sharing systems are flourishing and have been widely studied in recent years. Many existing works focus on accurately predicting demand at individual stations over the short term. This work instead aims to predict long-term bike rental/drop-off demands at given bike station locations in expansion areas, where real-world bike stations are mainly built in batches. To address the problem, we propose LDA (Long-Term Demand Advisor), a framework to estimate the long-term characteristics of newly established stations. In LDA, several engineering strategies are proposed to extract discriminative and representative features for long-term demands. Moreover, for original and newly established stations, we propose several feature extraction methods and an algorithm to model the correlations between urban dynamics and long-term demands. Our work is the first to address the long-term demand of new stations, providing the government with a tool to pre-evaluate the bike flow of new stations before deployment; this can avoid wasting resources such as personnel expense or budget. We evaluate the framework on real-world data from New York City’s bike-sharing system and show that it outperforms baseline approaches.


Author(s):  
Christian Herff ◽  
Dean J. Krusienski

Abstract. Clinical data is often collected and processed as time series: a sequence of data indexed by successive time points. Such time series can come from sources that are sampled over short time intervals to represent continuous biophysical waveforms, such as the voltage measurements representing the electrocardiogram, or from measurements that are sampled daily, weekly, yearly, etc., such as patient weight or blood triglyceride levels. When analyzing clinical data or designing biomedical systems for measurements, interventions, or diagnostic aids, it is important to represent the information contained within such time series in a more compact or meaningful form (e.g., noise filtering), amenable to interpretation by a human or computer. This process is known as feature extraction. This chapter will discuss some fundamental techniques for extracting features from time series representing general forms of clinical data.
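As a minimal sketch of what such feature extraction can look like, the Python example below smooths a slowly sampled series with a moving average (a simple form of noise filtering) and reduces it to a few summary features. The chapter's actual techniques are not listed in this abstract, so the choice of features, the function names, and the daily weight values are assumptions for illustration only.

```python
import numpy as np

def moving_average(x, width):
    """Smooth a series with a flat window; output has len(x) - width + 1 points."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="valid")

def summary_features(x):
    """Reduce a series to a compact feature set: level, spread, range, and trend."""
    x = np.asarray(x, dtype=float)
    slope = np.polyfit(np.arange(len(x)), x, 1)[0]  # least-squares linear trend
    return {
        "mean": x.mean(),
        "std": x.std(),
        "min": x.min(),
        "max": x.max(),
        "slope": slope,
    }

# Hypothetical daily patient weights in kg (illustrative values only).
weights = np.array([80.0, 80.4, 79.8, 80.6, 80.2, 80.9, 80.5, 81.0])
smoothed = moving_average(weights, width=3)
features = summary_features(smoothed)
```

Here eight raw measurements become five numbers, with the slope capturing the overall trend per sample after filtering.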


Author(s):  
Dinesh Bhatia ◽  
Animesh Mishra

The role of ECG analysis in the diagnosis of cardiovascular ailments has been significant in recent times. Although effective, the present computational algorithms lack accuracy, and no technique to date is capable of predicting the onset of a CVD condition with precision. In this chapter, the authors attempt to formulate a novel mapping technique based on feature extraction using the fractional Fourier transform (FrFT) and map generation using self-organizing maps (SOM). FrFT feature extraction from the ECG data has been performed in a manner reminiscent of the short time Fourier transform (STFT). Results show the capability to generate maps from the isolated ECG wavetrains with better prediction capability to ascertain the onset of CVDs, which is not possible using conventional algorithms. These promising results provide the ability to visualize the data in a time-evolution manner with the help of maps and histograms to predict the onset of different CVD conditions, and the ability to generate the required output with unsupervised training, which helps achieve greater generalization than previously reported techniques.


1973 ◽  
Vol 28 (1) ◽  
pp. 105-109 ◽  
Author(s):  
H. Jäger ◽  
R. Schöfer

For shock waves produced by special wire explosions, the short time energy input condition of the theories of Lin, Sakurai and Vlases-Jones is fairly well fulfilled. In these cases the shock wave energies can easily be determined from the expansion velocity of the waves. Variation of the parameters of the discharge circuit shows how these parameters should be chosen in order to get a maximum transfer of energy either to the shock waves or to the wire material.


2019 ◽  
Vol 9 (18) ◽  
pp. 3642
Author(s):  
Lin Liang ◽  
Haobin Wen ◽  
Fei Liu ◽  
Guang Li ◽  
Maolin Li

Incipient damage in mechanical equipment excites weak impulse vibration, which is hidden and almost unobservable in the collected signal, making fault detection and failure prevention at the inchoate stage rather challenging. Traditional feature extraction techniques, such as bandpass filtering and time-frequency analysis, are suitable for matrix processing but are challenged by higher-order data. To tackle these problems, a novel method of impulse feature extraction for vibration signals, based on sparse non-negative tensor factorization, is presented in this paper. First, phase space reconstruction and the short time Fourier transform are successively employed to convert the original signal into time-frequency distributions, which are further arranged into a three-way tensor to obtain a time-frequency multi-aspect array. The tensor is decomposed by sparse non-negative tensor factorization via the hierarchical alternating least squares algorithm, after which the latent components are reconstructed from the factors by the inverse short time Fourier transform and eventually help extract the impulse feature through envelope analysis. For performance verification, experimental analysis on the bearing datasets and the swashplate piston pump has confirmed the effectiveness of the proposed method. Comparisons with traditional methods, including maximum correlated kurtosis deconvolution, singular value decomposition, and maximum spectrum kurtosis, also suggest that it extracts features better.
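The first stage of this pipeline, turning a 1-D signal into a non-negative three-way tensor, can be sketched as follows. This is one plausible reading and not the paper's implementation: it assumes phase space reconstruction means time-delay embedding, that each embedded row gets its own STFT magnitude, and that the results are stacked along a third axis. The embedding dimension, delay, frame length, hop, and all function names are illustrative assumptions, and the factorization step itself is omitted.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Time-delay embedding: row i is the signal shifted by i * tau samples."""
    n = len(x) - (dim - 1) * tau
    return np.stack([x[i * tau : i * tau + n] for i in range(dim)])

def stft_magnitude(x, frame_len, hop):
    """Magnitude STFT with a Hann window, returned as (frequency, time)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hanning(frame_len)
    return np.abs(np.fft.rfft(frames, axis=1)).T

def build_tensor(x, dim, tau, frame_len, hop):
    """Stack one time-frequency map per embedded row: (frequency, time, row)."""
    rows = delay_embed(x, dim, tau)
    return np.stack([stft_magnitude(r, frame_len, hop) for r in rows], axis=-1)

# Synthetic stand-in for a vibration signal (assumption): seeded noise.
rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)
tensor = build_tensor(signal, dim=3, tau=4, frame_len=128, hop=64)
```

Because the entries are magnitudes, the tensor is non-negative by construction, which is the precondition for applying a non-negative tensor factorization to it.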

