Models and a Technique for Determining the Speech Activity of a User of a Socio-Cyberphysical System

2020 ◽  
Vol 23 (6) ◽  
pp. 225-240
Author(s):  
E. E. Usina ◽  
A. R. Shabanova ◽  
I. V. Lebedev

Purpose of research. The article presents the development of model-algorithmic support for determining the speech activity of a user of a socio-cyberphysical system. A topological model of a distributed audio recording subsystem deployed in limited physical spaces (rooms) is proposed; the model makes it possible to assess the quality of perceived audio signals for a given distribution of microphones in such a room. Based on this model, a technique for determining the speech activity of a user of a socio-cyberphysical system has been developed; it maximizes the quality of perceived audio signals as the user moves around a room by determining the installation coordinates of the microphones. Methods. The mathematical tools of graph theory and set theory were used for the most complete analysis and formal description of the distributed audio recording subsystem. To determine the coordinates for placing microphones in a room, a dedicated technique was developed; it involves emitting a speech signal in the room using acoustic equipment and measuring signal levels with a noise meter at the places intended for installing microphones. Results. The dependences of the correlation coefficient between the combined signal and the initial test signal on the distance to the signal source were calculated for different numbers of microphones. The obtained dependences allow us to determine the minimum required number of spaced microphones to ensure high-quality recording of the user’s speech. The results of testing the developed technique in a particular room indicate that determining the speech activity of a user of a socio-cyberphysical system is feasible and highly efficient. Conclusion. Application of the proposed technique for determining the speech activity of a user of a socio-cyberphysical system will improve the recording quality of the audio signal and, as a consequence, its subsequent processing, taking into account the possible movement of the user.
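The quality measure named in the Results, the correlation coefficient between the combined signal and the initial test signal, can be illustrated with a minimal sketch. The abstract does not say how the microphone signals are combined or how distance enters the model, so the sample-wise averaging in `combine`, the additive-noise microphone in `simulate_microphone`, and every parameter value here are assumptions made for illustration only.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_a * var_b)


def combine(signals):
    """Combine microphone signals by sample-wise averaging (an assumption)."""
    return [sum(samples) / len(signals) for samples in zip(*signals)]


def simulate_microphone(test_signal, noise_sigma, rng):
    """Hypothetical microphone model: the test signal plus Gaussian noise."""
    return [s + rng.gauss(0.0, noise_sigma) for s in test_signal]


def correlation_for(num_mics, test_signal, noise_sigma=1.0, seed=0):
    """Correlation of the averaged signal of num_mics noisy microphones
    with the clean test signal."""
    rng = random.Random(seed)
    mics = [simulate_microphone(test_signal, noise_sigma, rng)
            for _ in range(num_mics)]
    return pearson(combine(mics), test_signal)


# Illustrative test signal: five periods of a sine wave over 2000 samples.
test_signal = [math.sin(2 * math.pi * 5 * n / 2000) for n in range(2000)]
```

Under this toy model, averaging more microphones suppresses independent noise, so `correlation_for(8, test_signal)` comes out higher than `correlation_for(1, test_signal)`. Attenuation with distance to the source is not modeled here.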

Author(s):  
Kazuhiro Kondo

This chapter proposes two data-hiding algorithms for stereo audio signals. The first algorithm embeds data into a stereo audio signal by adding data-dependent mutual delays to the host stereo audio signal. The second algorithm adds fixed delay echoes with polarities that are data dependent and amplitudes that are adjusted such that the interchannel correlation matches the original signal. The robustness and the quality of the data-embedded audio will be given and compared for both algorithms. Both algorithms were shown to be fairly robust against common distortions, such as added noise, audio coding, and sample rate conversion. The embedded audio quality was shown to be “fair” to “good” for the first algorithm and “good” to “excellent” for the second algorithm, depending on the input source.
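The idea behind the second algorithm, a fixed-delay echo whose polarity carries the data, can be sketched roughly as follows. This is a single-channel simplification and not the chapter's method: the amplitude adjustment that matches the interchannel correlation is omitted, the extraction is non-blind (it needs the host signal), and the names `embed_bits` and `extract_bits`, the frame length, the delay, and the echo amplitude `alpha` are all hypothetical.

```python
def embed_bits(host, bits, frame_len, delay, alpha=0.2):
    """Add a fixed-delay echo to each frame; each bit selects the echo polarity.

    Simplified single-channel illustration (assumption), one bit per frame.
    """
    if len(host) < frame_len * len(bits):
        raise ValueError("host signal too short for the payload")
    out = list(host)
    for i, bit in enumerate(bits):
        polarity = 1.0 if bit else -1.0
        start = i * frame_len
        for n in range(start, start + frame_len):
            if n - delay >= 0:
                out[n] += polarity * alpha * host[n - delay]
    return out


def extract_bits(marked, host, num_bits, frame_len, delay):
    """Non-blind extraction: correlate the residual with the delayed host.

    A positive correlation means a positive-polarity echo (bit 1).
    """
    bits = []
    for i in range(num_bits):
        start = i * frame_len
        score = 0.0
        for n in range(start, start + frame_len):
            if n - delay >= 0:
                score += (marked[n] - host[n]) * host[n - delay]
        bits.append(1 if score > 0 else 0)
    return bits
```

A practical detector would have to work without the host signal and survive the distortions listed in the abstract; this sketch only shows how polarity can encode a bit.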


2007 ◽  
Vol 01 (03) ◽  
pp. 307-318 ◽  
Author(s):  
ATMAN JBARI ◽  
ABDELLAH ADIB ◽  
DRISS ABOUTAJDINE

In this paper, we address the problem of Blind Audio Separation (BAS) by content evaluation of audio signals in the time-scale domain. Most of the proposed techniques rely on the assumption that the source signals are independent, or at least uncorrelated, and exploit mutual information or second- or higher-order statistics. Here, we present a new algorithm for instantaneous mixtures that considers only the different time-scale signature properties of the sources. Our approach builds on the advantages of the wavelet transform and proposes a new representation, Spatial Time Scale Distributions (STSD), to characterize the energy and interference of the observed data. The BAS is achieved by joint diagonalization, without a prior orthogonality constraint, of a set of selected diagonal STSD matrices. Several criteria are proposed, in the transformed time-scale space, to assess the separated audio signal contents. We describe the logistics of the separation and the content rating; an exemplary implementation on synthetic signals and real audio recordings shows the high efficiency of the proposed technique in restoring the audio signal contents.


Author(s):  
Katja M. Hynynen ◽  
Juho Ratava ◽  
Tuomo Lindh ◽  
Mikko Rikkonen ◽  
Ville Ryynänen ◽  
...  

Chatter is an unfavorable phenomenon in turning operation causing poor surface quality. Active chatter elimination methods require the chatter to be detected before the control reacts. In this paper, a chatter detection method based on a coherence function of the acceleration of the tool in the x direction and an audio signal is proposed. The method was experimentally tested on longitudinal turning of a stock bar and facing of a hollow bar. The results show that the proposed method detects the chatter in an early stage and allows correcting control actions before the chatter influences the surface quality of the workpiece. The method is applicable both to facing and longitudinal turning.


2021 ◽  
Vol 23 (07) ◽  
pp. 62-70
Author(s):  
Nagesh B ◽  
Dr. M. Uttara Kumari ◽  

Audio processing is an important branch of the signal processing domain. It deals with the manipulation of audio signals to accomplish tasks such as filtering, data compression, speech processing, and noise suppression, which improve the quality of the audio signal. For applications such as natural language processing, speech generation, and automatic speech recognition, conventional algorithms aren’t sufficient. Machine learning or deep learning algorithms are needed so that audio signal processing can be achieved with good results and accuracy. This paper reviews the various algorithms used by researchers in the past and identifies the appropriate algorithm for each application.


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 676
Author(s):  
Andrej Zgank

Animal activity acoustic monitoring is becoming one of the necessary tools in agriculture, including beekeeping. It can assist in the control of beehives in remote locations. It is possible to classify bee swarm activity from audio signals using such approaches. An IoT-based acoustic swarm classification using deep neural networks is proposed in this paper. Audio recordings were obtained from the Open Source Beehive project. Mel-frequency cepstral coefficient features were extracted from the audio signal. The lossless WAV and lossy MP3 audio formats were compared for IoT-based solutions. An analysis was made of the impact of the deep neural network parameters on the classification results. The best overall classification accuracy with uncompressed audio was 94.09%, but MP3 compression degraded the DNN accuracy by over 10%. The evaluation of the proposed IoT-based bee activity acoustic classification using deep neural networks showed improved results when compared to the previous hidden Markov models system.


Electronics ◽  
2021 ◽  
Vol 10 (11) ◽  
pp. 1349
Author(s):  
Stefan Lattner ◽  
Javier Nistal

Lossy audio codecs compress (and decompress) digital audio streams by removing information that tends to be inaudible in human perception. Under high compression rates, such codecs may introduce a variety of impairments in the audio signal. Many works have tackled the problem of audio enhancement and compression artifact removal using deep-learning techniques. However, only a few works tackle the restoration of heavily compressed audio signals in the musical domain. In such a scenario, there is no unique solution for the restoration of the original signal. Therefore, in this study, we test a stochastic generator of a Generative Adversarial Network (GAN) architecture for this task. Such a stochastic generator, conditioned on highly compressed musical audio signals, could one day generate outputs indistinguishable from high-quality releases. Therefore, the present study may yield insights into more efficient musical data storage and transmission. We train stochastic and deterministic generators on MP3-compressed audio signals with 16, 32, and 64 kbit/s. We perform an extensive evaluation of the different experiments utilizing objective metrics and listening tests. We find that the models can improve the quality of the audio signals over the MP3 versions for 16 and 32 kbit/s and that the stochastic generators are capable of generating outputs that are closer to the original signals than those of the deterministic generators.


Author(s):  
Roberto Barcala-Furelos ◽  
Cristian Abelairas-Gómez ◽  
Alejandra Alonso-Calvete ◽  
Francisco Cano-Noguera ◽  
Aida Carballo-Fazanes ◽  
...  

Abstract Introduction: On-boat resuscitation can be applied by lifeguards in an inflatable rescue boat (IRB). Due to Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-COV-2) and recommendations for the use of personal protective equipment (PPE), prehospital care procedures need to be re-evaluated. The objective of this study was to determine how the use of PPE influences the amount of preparation time needed before beginning actual resuscitation and the quality of cardiopulmonary resuscitation (CPR; QCPR) on an IRB. Methods: Three CPR tests were performed by 14 lifeguards, in teams of two, wearing different PPE: (1) Basic PPE (B-PPE): gloves, a mask, and protective glasses; (2) Full PPE (F-PPE): B-PPE + a waterproof apron; and (3) Basic PPE + plastic blanket (B+PPE). On-boat resuscitation using a bag-valve-mask (BVM) and high efficiency particulate air (HEPA) filter was performed sailing at 20 km/hour. Results: Putting on B-PPE takes significantly less time than F-PPE (B-PPE 17 [SD = 2] seconds versus F-PPE 69 [SD = 17] seconds; P = .001), and B+PPE takes slightly longer than B-PPE (B-PPE 17 [SD = 2] seconds versus B+PPE 34 [SD = 6] seconds; P = .002). The QCPR remained similar in all three scenarios (P >.05), reaching values over 79%. Conclusion: The use of PPE during on-board resuscitation is feasible and does not interfere with quality when performed by trained lifeguards. The use of a plastic blanket could be a quick and easy alternative to offer extra protection to lifeguards during CPR on an IRB.


2013 ◽  
Vol 842 ◽  
pp. 634-638
Author(s):  
Yan Jing ◽  
Feng Zhao

By comparison, this paper determines the inner bore processing technique for the hydraulic cylinder block of engineering machinery and analyzes the rolling process and the issues that arise, in order to propose a reasonable and feasible process route and process parameters and to ensure the quality of cylinder processing. It also presents the design of high-efficiency, high-precision boring-rolling compound tools for given cylinders.


2014 ◽  
Vol 989-994 ◽  
pp. 1631-1634
Author(s):  
Ping Wang ◽  
Bin Wang

Product data is the source data of the product lifecycle in a manufacturing enterprise. The quality of product data largely determines the effectiveness of applications such as engineering analysis, simulation assembly, and CNC programming. To solve the problems of existing product data quality validation, such as trivial validation customization and a lack of efficiency and flexibility, a class-based validation method for product data quality (PDQ) is proposed in the NX software environment. The representation of the validation rule classes for product data quality, the customization of validation rules, and the implementation of the validation process are introduced in detail in this study. Finally, an application case is used to verify the practicability and effectiveness of the proposed method.


2014 ◽  
Vol 08 (02) ◽  
pp. 229-243
Author(s):  
Sachin Deshpande

The newly approved High Efficiency Video Coding (HEVC) standard includes a temporal sub-layering feature, which provides temporal scalability. Two types of pictures, Temporal Sub-layer Access Pictures and Step-wise Temporal Sub-layer Access Pictures, are provided for this purpose. This paper utilizes the temporal scalability in HEVC to provide bandwidth-adaptive HTTP streaming. We describe our HTTP streaming algorithm, which is media timeline aware and which dynamically switches temporal sub-layers on the server side. We performed subjective tests to determine user perception regarding acceptable frame rates when using the temporal scalability of HEVC. These results are used to control the algorithm's temporal switching behavior to provide a good quality of experience to the user. We applied Internet and 3GPP error-delay patterns to validate the performance of our algorithm.
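The core decision in bandwidth-adaptive temporal switching can be sketched as picking the highest temporal sub-layer whose cumulative bitrate fits the available bandwidth, subject to a floor on the frame rate. This is only an illustration under stated assumptions: the function `select_temporal_layer`, the bitrate figures in the usage note, and the `min_layer` floor (standing in for the minimum acceptable frame rate found in the subjective tests) are hypothetical, and the sketch does not model media-timeline awareness or switching at sub-layer access pictures.

```python
def select_temporal_layer(cumulative_kbps, bandwidth_kbps, min_layer=0):
    """Return the index of the highest temporal sub-layer that fits the bandwidth.

    cumulative_kbps[i] is the assumed bitrate needed to send sub-layers 0..i.
    min_layer is a hypothetical floor: the stream never drops below it,
    even if the estimated bandwidth is lower than that layer's bitrate.
    """
    if not cumulative_kbps:
        raise ValueError("at least one temporal sub-layer is required")
    if not 0 <= min_layer < len(cumulative_kbps):
        raise ValueError("min_layer is out of range")
    chosen = min_layer
    for layer, rate in enumerate(cumulative_kbps):
        if layer >= min_layer and rate <= bandwidth_kbps:
            chosen = layer
    return chosen
```

For example, with assumed cumulative bitrates of `[400, 700, 1200]` kbit/s, an estimated bandwidth of 800 kbit/s selects sub-layer 1, while anything below 400 kbit/s falls back to the floor.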

