A Backward Compatible Multichannel Audio Compression Method

This paper proposes a backward-compatible multichannel audio codec based on downmix and upmix operations. The codec represents a multichannel audio input signal as a downmixed mono signal plus spatial parametric data. The encoding method consists of three parts: spatial-temporal analysis of the audio signal, downmixing the multichannel audio to mono, and encoding the mono signal. The proposed codec combines high audio quality with a low parameter coding rate, and the method is simpler and more effective than conventional methods. With this method, it is possible to transmit or store multichannel audio signals as mono audio signals.
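The downmix/upmix idea can be illustrated with a minimal sketch. This is not the codec from the paper: the abstract does not specify the spatial parameters, so the example assumes a plain average as the downmix and a single RMS gain per channel as the "spatial parametric data". The function names `downmix`, `channel_gains`, and `upmix` are hypothetical.

```python
import math

def downmix(channels):
    """Average N equal-length channels into one mono signal (assumed downmix rule)."""
    n = len(channels)
    return [sum(frame) / n for frame in zip(*channels)]

def channel_gains(channels, mono, eps=1e-12):
    """One assumed spatial parameter per channel: RMS ratio of channel to mono."""
    mono_energy = sum(x * x for x in mono) + eps
    return [math.sqrt(sum(x * x for x in ch) / mono_energy) for ch in channels]

def upmix(mono, gains):
    """Rebuild each channel by scaling the mono signal with its gain."""
    return [[g * x for x in mono] for g in gains]

left = [1.0, -1.0, 1.0, -1.0]
right = [0.5, -0.5, 0.5, -0.5]
mono = downmix([left, right])
gains = channel_gains([left, right], mono)
rebuilt = upmix(mono, gains)
```

A mono-only decoder can play `mono` directly, which is what makes such a scheme backward compatible. In this toy case the channels are scaled copies of each other, so the gains alone recover them; real signals would need richer parameters per time-frequency tile.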



Proceedings of the 2012 2nd International Conference on Computer and Information Applications (ICCIA 2012) · 10.2991/iccia.2012.146 · 2012

Author(s): Xuefei Gao, Guo Yang, Jing Wang, Xiang Xie, Jingming Kuang

Keyword(s): Audio Compression, Compression Method, Multichannel Audio


Data Hiding for Stereo Audio Signals

Advances in Multimedia and Interactive Technologies - Multimedia Information Hiding Technologies and Methodologies for Controlling Data · 10.4018/978-1-4666-2217-3.ch006 · 2013 · pp. 104-128

Author(s): Kazuhiro Kondo

Keyword(s): Data Hiding, Audio Signal, Audio Coding, Original Signal, Audio Signals, Audio Quality, Input Source, Rate Conversion, Fixed Delay

This chapter proposes two data-hiding algorithms for stereo audio signals. The first algorithm embeds data into a stereo audio signal by adding data-dependent mutual delays to the host stereo audio signal. The second algorithm adds fixed-delay echoes whose polarities are data dependent and whose amplitudes are adjusted so that the interchannel correlation matches that of the original signal. The robustness and quality of the data-embedded audio are reported and compared for both algorithms. Both algorithms were shown to be fairly robust against common distortions, such as added noise, audio coding, and sample rate conversion. The embedded audio quality was shown to be “fair” to “good” for the first algorithm and “good” to “excellent” for the second algorithm, depending on the input source.
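The polarity-coded echo idea in the second algorithm can be sketched roughly as follows. This is only an illustration under stated assumptions, not the chapter's algorithm: it works on a single channel, uses a fixed echo amplitude, and omits the adjustment that matches the interchannel correlation. The function name `embed_bit` and its parameters are hypothetical.

```python
def embed_bit(frame, bit, delay, amplitude):
    """Add one echo of `frame`, delayed by `delay` samples.

    The echo polarity carries the bit: positive for 1, negative for 0.
    The amplitude is fixed here (an assumption); the chapter adapts it.
    """
    if delay <= 0 or delay >= len(frame):
        raise ValueError("delay must be between 1 and len(frame) - 1")
    polarity = 1.0 if bit else -1.0
    out = list(frame)
    for n in range(delay, len(frame)):
        out[n] += polarity * amplitude * frame[n - delay]
    return out

impulse = [1.0, 0.0, 0.0, 0.0]
marked_one = embed_bit(impulse, 1, delay=2, amplitude=0.5)
marked_zero = embed_bit(impulse, 0, delay=2, amplitude=0.5)
```

With an impulse as the host frame, the embedded echo shows up as a single sample at the chosen delay whose sign reflects the bit.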


An Effective Watermarking Method Based on Energy Averaging in Audio Signals

Mathematical Problems in Engineering · 10.1155/2018/6420314 · 2018 · Vol 2018 · pp. 1-8 · Cited By ~ 1

Author(s): S. E. Tsai, S. M. Yang

Keyword(s): Signal Processing, Data Compression, Error Correcting Code, Audio Signal, Segment Length, Signal Quality, Audio Signals, Segment Sequence, Audio Quality, Dct Coefficients

Methods based on the discrete cosine transform (DCT) have been proposed for digital watermarking of audio signals; however, the watermark is often vulnerable to data compression and signal processing. This paper presents an effective audio watermarking method that averages the energy of DCT coefficients so that the watermarked audio signal is robust to data processing. The method divides an audio signal into segments using three parameters that define the segment length, the segment sequence of the watermark location, and the frequency range of DCT coefficients for the watermark location. An error-correcting code is also integrated to improve audio signal quality after watermarking. Experimental results show that the method is robust to data compression and many other kinds of signal processing. The original signal is not required for decoding the watermark. A comparison of watermarking performance with a recent work shows that the method has better audio quality and higher robustness.


A novel multichannel audio signal compression method based on tensor representation and decomposition

China Communications · 10.1109/cc.2014.6825261 · 2014 · Vol 11 (3) · pp. 80-90 · Cited By ~ 2

Author(s): Wang Jing, Xie Xiang, Kuang Jingming

Keyword(s): Audio Signal, Signal Compression, Tensor Representation, Compression Method, Multichannel Audio, Signal Compression Method


Codificação perceptiva de áudio por meio de decomposições atômicas em exponenciais complexas [Perceptual audio coding using atomic decompositions into complex exponentials]

Revista Principia - Divulgação Científica e Tecnológica do IFPB · 10.18265/1517-03062015v1n46p196-212 · 2019 · Vol 1 (46) · pp. 196

Author(s): Valmir Dos Santos Nogueira Junior, Michel Pompeu Tcheou, Flávio Rainho Ávila

Keyword(s): Matching Pursuit, Rate Distortion, Compact Representation, Audio Signals, Audio Compression, Audio Quality, Perceptual Evaluation, Layer I, Analysis System, Complex Exponential

The atomic decomposition of signals using algorithms of the “Matching Pursuit” (MP) class has been applied to audio compression. The literature suggests that the use of psychoacoustic criteria allows a more compact representation of the signal without loss of perceived quality. This work presents the implementation of an analysis-by-synthesis system for audio signals that uses MP together with a psychoacoustic global masking threshold, inspired by MPEG Layer I, as well as Complex Exponential Dictionaries (DEC). To compress the signal, we used rate-distortion optimization based on operational curves, adjusting the Lagrange multiplier. The performance of the compression method for different types of signals is evaluated with PEAQ (Perceptual Evaluation of Audio Quality), an objective measure standardized by the International Telecommunication Union (ITU), as a function of the bit rate per sample, obtaining satisfactory results.


Performance Evaluation of Multichannel Audio Compression

Indonesian Journal of Electrical Engineering and Computer Science · 10.11591/ijeecs.v10.i1.pp146-153 · 2018 · Vol 10 (1) · pp. 146

Author(s): Teddy Surya Gunawan, Mira Kartiwi

Keyword(s): Performance Evaluation, Performance Metrics, Lossless Compression, Audio Signals, Audio Quality, Multichannel Audio, Performance Metric, Audio Files, Channel Configuration, Integrated Performance

In recent years, multichannel audio systems have been widely used in modern sound devices because they can provide a more realistic and engaging experience to the listener. This paper focuses on the performance evaluation of three lossy codecs (AAC, Ogg Vorbis, and Opus) and three lossless codecs (FLAC, TrueAudio, and WavPack) for multichannel audio signals, including stereo, 5.1, and 7.1 channels. Experiments were conducted on the same three audio files but with different channel configurations. The performance of each encoder was evaluated based on its encoding time (averaged over 100 runs), data reduction, and audio quality. There is usually a trade-off between the three metrics. To simplify the evaluation, a new integrated performance metric was proposed that combines all three performance metrics. Using the new measure, FLAC was found to be the best lossless codec, while Ogg Vorbis and Opus were found to be the best lossy codecs, depending on the channel configuration. This result could be used in determining the proper audio format for multichannel audio systems.


Utterance Clustering Using Stereo Audio Channels

Computational Intelligence and Neuroscience · 10.1155/2021/6151651 · 2021 · Vol 2021 · pp. 1-8

Author(s): Yingjun Dong, Neil G. MacLaren, Yiding Cao, Francis J. Yammarino, Shelley D. Dionne, ...

Keyword(s): Gaussian Mixture Model, Mixture Model, Audio Signal, Gaussian Mixture, Audio Signal Processing, Audio Signals, Multichannel Audio, Audio Recordings, Left And Right, Complicated Conditions

Utterance clustering is one of the actively researched topics in audio signal processing and machine learning. This study aims to improve the performance of utterance clustering by processing multichannel (stereo) audio signals. Processed audio signals were generated by combining left- and right-channel audio signals in a few different ways and then by extracting the embedded features (also called d-vectors) from those processed audio signals. This study applied the Gaussian mixture model for supervised utterance clustering. In the training phase, a parameter-sharing Gaussian mixture model was obtained to train the model for each speaker. In the testing phase, the speaker with the maximum likelihood was selected as the detected speaker. Results of experiments with real audio recordings of multiperson discussion sessions showed that the proposed method that used multichannel audio signals achieved significantly better performance than a conventional method with mono-audio signals in more complicated conditions.
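As a rough illustration of "combining left- and right-channel audio signals in a few different ways", the sketch below builds three candidate signals from a stereo pair. The abstract does not say which combinations the study used, so the sum, difference, and average shown here are assumptions, and the function name `combine_stereo` is hypothetical. Feature (d-vector) extraction is not shown.

```python
def combine_stereo(left, right):
    """Return several assumed combinations of two equal-length channels."""
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    return {
        "sum": [l + r for l, r in zip(left, right)],
        "difference": [l - r for l, r in zip(left, right)],
        "average": [(l + r) / 2 for l, r in zip(left, right)],
    }

combined = combine_stereo([1.0, 2.0], [0.5, -1.0])
```

Each resulting signal could then be passed to the same feature extractor, so that the clustering model sees more than one view of the stereo recording.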


IoT-Based Bee Swarm Activity Acoustic Classification Using Deep Neural Networks

Sensors · 10.3390/s21030676 · 2021 · Vol 21 (3) · pp. 676

Author(s): Andrej Zgank

Keyword(s): Neural Networks, Deep Neural Networks, Markov Models, Audio Signal, Audio Signals, Mel Frequency Cepstral Coefficients, Animal Activity, The Impact, Acoustic Classification, Swarm Activity

Acoustic monitoring of animal activity is becoming one of the necessary tools in agriculture, including beekeeping. It can assist in the control of beehives in remote locations. It is possible to classify bee swarm activity from audio signals using such approaches. This paper proposes an IoT-based acoustic swarm classification system using deep neural networks. Audio recordings were obtained from the Open Source Beehive project. Mel-frequency cepstral coefficient features were extracted from the audio signal. The lossless WAV and lossy MP3 audio formats were compared for IoT-based solutions. The impact of the deep neural network parameters on the classification results was analyzed. The best overall classification accuracy with uncompressed audio was 94.09%, but MP3 compression degraded the DNN accuracy by over 10%. The evaluation of the proposed system showed improved results compared with the previous hidden Markov model system.


Stochastic Restoration of Heavily Compressed Musical Audio Using Generative Adversarial Networks

Electronics · 10.3390/electronics10111349 · 2021 · Vol 10 (11) · pp. 1349

Author(s): Stefan Lattner, Javier Nistal

Keyword(s): Data Storage, Audio Signal, Human Perception, Generative Adversarial Networks, Audio Signals, Generative Adversarial Network, Adversarial Network, Extensive Evaluation, Listening Tests, Musical Audio

Lossy audio codecs compress (and decompress) digital audio streams by removing information that tends to be inaudible in human perception. Under high compression rates, such codecs may introduce a variety of impairments in the audio signal. Many works have tackled the problem of audio enhancement and compression artifact removal using deep-learning techniques. However, only a few works tackle the restoration of heavily compressed audio signals in the musical domain. In such a scenario, there is no unique solution for the restoration of the original signal. Therefore, in this study, we test a stochastic generator of a Generative Adversarial Network (GAN) architecture for this task. Such a stochastic generator, conditioned on highly compressed musical audio signals, could one day generate outputs indistinguishable from high-quality releases. Therefore, the present study may yield insights into more efficient musical data storage and transmission. We train stochastic and deterministic generators on MP3-compressed audio signals with 16, 32, and 64 kbit/s. We perform an extensive evaluation of the different experiments utilizing objective metrics and listening tests. We find that the models can improve the quality of the audio signals over the MP3 versions for 16 and 32 kbit/s and that the stochastic generators are capable of generating outputs that are closer to the original signals than those of the deterministic generators.


Enhancing the green efficiency of fundamental sectors in China’s industrial system: A spatial-temporal analysis

Journal of Management Science and Engineering · 10.1016/j.jmse.2021.03.002 · 2021

Author(s): Jiangxue Zhang, Xu Liu, Xue Zhang, Yuan Chang, Changbo Wang, ...

Keyword(s): Temporal Analysis, Industrial System, System A, Spatial Temporal Analysis
