Watermarking Based on Compressive Sensing for Digital Speech Detection and Recovery †

Sensors ◽  
2018 ◽  
Vol 18 (7) ◽  
pp. 2390 ◽  
Author(s):  
Wenhuan Lu ◽  
Zonglei Chen ◽  
Ling Li ◽  
Xiaochun Cao ◽  
Jianguo Wei ◽  
...  

In this paper, a novel imperceptible, fragile, and blind watermarking scheme is proposed for speech tampering detection and self-recovery. The embedded watermark data used for content recovery is calculated from the original discrete cosine transform (DCT) coefficients of the host speech. The watermark information is shared across a group of frames rather than stored in a single frame, which trades off the data-waste problem against the tampering-coincidence problem. When part of a watermarked speech signal is tampered with, the tampered area can be accurately localized, and the watermark data in the unmodified areas can still be extracted. A compressive sensing technique is then employed to retrieve the coefficients by exploiting sparseness in the DCT domain: the smaller the tampered area, the better the quality of the recovered signal. Experimental results show that the watermarked signal is imperceptible and that the recovered signal remains intelligible for tampering rates of up to 47.6%. A deep learning-based enhancement method is also proposed and implemented to increase the SNR of the recovered speech signal.
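A minimal sketch of the recovery idea only: a DCT-sparse speech frame is reconstructed from a reduced set of random measurements via Orthogonal Matching Pursuit. The frame length, measurement matrix, and sparsity level are illustrative assumptions, not the parameters of the paper's scheme.

```python
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(0)

n, m, k = 256, 128, 10          # frame length, measurements, assumed sparsity
coeffs = np.zeros(n)
coeffs[rng.choice(n, k, replace=False)] = rng.normal(size=k)
frame = idct(coeffs, norm="ortho")           # synthetic DCT-sparse speech frame

phi = rng.normal(size=(m, n)) / np.sqrt(m)   # random measurement matrix
psi = idct(np.eye(n), axis=0, norm="ortho")  # DCT synthesis basis
A = phi @ psi                                # sensing matrix in the sparse domain
y = phi @ frame                              # compressed measurements

def omp(A, y, k):
    """Greedy OMP: pick the k atoms most correlated with the residual."""
    residual, support = y.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x_s
    x = np.zeros(A.shape[1])
    x[support] = x_s
    return x

recovered = idct(omp(A, y, k), norm="ortho")
print("recovery error:", np.linalg.norm(recovered - frame))
```

With exact sparsity and enough measurements the error is near zero; in the paper's setting the measurements are the surviving watermark bits of the untampered frames.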

2021 ◽  
Vol 11 (1) ◽  
pp. 480-490
Author(s):  
Asha Gnana Priya Henry ◽  
Anitha Jude

Abstract: Retinal image analysis is one of the important diagnostic methods in modern ophthalmology, because much of the information about the eye is present in the retina. The image acquisition process can degrade image quality, which can be remedied by better image enhancement techniques combined with a computer-aided diagnosis system. Deep learning is an important computational technique for medical imaging applications. The main aim of this article is to find the best enhancement techniques for the identification of diabetic retinopathy (DR); the enhanced images are tested with commonly used deep learning techniques, and the performances are measured. The input images are taken from the Indian Diabetic Retinopathy Image Dataset, and 13 filters, including smoothing and sharpening filters, are used to enhance the images. The quality of the enhancement techniques is then compared using performance metrics; the Median, Gaussian, Bilateral, Wiener, and partial differential equation filters give better results and are combined to further improve enhancement. The output images from all the enhancement filters are given as input to a convolutional neural network, and the results are compared to find the better enhancement method.
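An illustrative sketch of the enhancement stage: several of the better-performing filters named above are applied to a fundus image before it is fed to the CNN. The file name and filter parameters are placeholder assumptions.

```python
import cv2
import numpy as np
from scipy.signal import wiener

img = cv2.imread("fundus_sample.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file

enhanced = {
    "median":    cv2.medianBlur(img, 5),
    "gaussian":  cv2.GaussianBlur(img, (5, 5), 0),
    "bilateral": cv2.bilateralFilter(img, d=9, sigmaColor=75, sigmaSpace=75),
    "wiener":    np.clip(wiener(img.astype(np.float64), mysize=5),
                         0, 255).astype(np.uint8),
}

# Each enhanced image would then be resized and stacked into a batch for the
# CNN input, and the resulting accuracies compared across filters.
batch = np.stack([cv2.resize(e, (224, 224)) for e in enhanced.values()])
print(batch.shape)  # (4, 224, 224)
```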


2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
Dongmei Niu ◽  
Hongxia Wang ◽  
Minquan Cheng ◽  
Canghong Shi

This paper presents a self-embedding watermarking scheme based on a reference sharing mechanism. The host image is embedded with watermark bits comprising the reference data for content recovery and the authentication data for tampering localization. A special encoding matrix derived from the generator matrix of a selected systematic Maximum Distance Separable (MDS) code is adopted. The reference data is generated by encoding the representative data of all the original image blocks. On the receiver side, tampered image blocks can be located using the authentication data. The reference data embedded in one image block can be shared by all the image blocks to restore the tampered content, so the tampering-coincidence problem is avoided to the greatest extent possible. The maximal tampering rate is deduced theoretically. Experimental results show that, as long as the tampering rate is below the maximal tampering rate, content recovery is deterministic, and the quality of the recovered content does not degrade as the tampering rate approaches that maximum.
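A toy sketch of the reference-sharing idea: representative data from all blocks is mixed through an encoding matrix, and every block carries a few of the resulting reference bits, so the bits needed to restore a tampered block survive in the blocks left intact. A random binary matrix stands in here for the generator matrix of the systematic MDS code used in the paper, and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

n_blocks, bits_per_block, n_reference = 64, 8, 256   # illustrative sizes
representative = rng.integers(0, 2, n_blocks * bits_per_block)

G = rng.integers(0, 2, (n_reference, representative.size))  # stand-in matrix
reference = (G @ representative) % 2        # reference bits over GF(2)

# Each block embeds an equal share of the reference bits; recovery solves the
# linear system restricted to the shares extracted from untampered blocks.
shares = reference.reshape(n_blocks, -1)
print(shares.shape)  # (64, 4): 4 reference bits embedded per block
```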


Author(s):  
Kuangfeng Ning ◽  
Guojun Qin

The proposed compressive sensing (CS) method is a new alternative approach for eliminating noise from an input signal; it enhances the quality of the speech signal while requiring fewer samples for reconstruction than methods based on the Nyquist sampling theorem. The basic idea is that speech signals are sparse in nature, whereas most noise signals are non-sparse; CS discards the non-sparse components and reconstructs only the sparse components of the input signal. Experimental results show that the average segmental SNR (signal-to-noise ratio) and PESQ (perceptual evaluation of speech quality) scores are better in the compressed domain.
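A minimal sketch of the underlying idea: speech is sparse in the DCT domain while broadband noise is not, so keeping only the largest coefficients suppresses the noise. Hard thresholding is used here as a simple stand-in for the full compressive sensing reconstruction, and the sparsity ratio and test signal are assumptions.

```python
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 8000, endpoint=False)
clean = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)
noisy = clean + 0.3 * rng.normal(size=t.size)      # non-sparse additive noise

c = dct(noisy, norm="ortho")
keep = int(0.02 * c.size)                          # assume ~2% sparsity
threshold = np.sort(np.abs(c))[-keep]
c[np.abs(c) < threshold] = 0.0                     # drop non-sparse components
denoised = idct(c, norm="ortho")

snr = lambda ref, x: 10 * np.log10(np.sum(ref**2) / np.sum((ref - x)**2))
print(f"SNR before: {snr(clean, noisy):.1f} dB, after: {snr(clean, denoised):.1f} dB")
```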


Author(s):  
Ashish Dwivedi ◽  
Nirupma Tiwari

Image enhancement (IE) is very important in fields where the visual appearance of an image is the main concern. Image enhancement is the process of improving an image so that the resulting output is more suitable than the original for a specific task. With the help of image enhancement, image quality can be improved so that images are clearer for human perception or for further analysis by machines. Image enhancement methods improve quality, visual appearance, and clarity; remove blurring and noise; increase contrast; and reveal details. The aim of this paper is to study the existing IE techniques and determine their limitations, and it provides an overview of the different IE techniques commonly used. We apply the DWT to the original RGB image and then apply FHE (Fuzzy Histogram Equalization); after the DWT, wavelet shrinkage is performed on the three detail bands (LH, HL, HH). Finally, the shrinkage image and the FHE image are fused together to obtain the enhanced image.
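A sketch of the pipeline described above: DWT on the image, soft shrinkage on the detail bands (LH, HL, HH), histogram equalization standing in for the fuzzy variant (FHE), then a simple averaging fusion. The wavelet choice, threshold, file name, and fusion weights are assumptions.

```python
import cv2
import numpy as np
import pywt

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file

# DWT, then wavelet shrinkage on the three detail bands
LL, (LH, HL, HH) = pywt.dwt2(gray.astype(np.float64), "haar")
LH, HL, HH = (pywt.threshold(b, value=10.0, mode="soft") for b in (LH, HL, HH))
shrunk = pywt.idwt2((LL, (LH, HL, HH)), "haar")[:gray.shape[0], :gray.shape[1]]

# Histogram equalization (stand-in for FHE), then fuse the two results
equalized = cv2.equalizeHist(gray).astype(np.float64)
fused = np.clip(0.5 * shrunk + 0.5 * equalized, 0, 255).astype(np.uint8)
cv2.imwrite("enhanced.png", fused)
```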


2020 ◽  
Vol 71 (7) ◽  
pp. 868-880
Author(s):  
Nguyen Hong-Quan ◽  
Nguyen Thuy-Binh ◽  
Tran Duc-Long ◽  
Le Thi-Lan

Along with the strong development of camera networks, video analysis systems have become more and more popular and have been applied in various practical applications. In this paper, we focus on the person re-identification (person ReID) task, a crucial step in video analysis systems. The purpose of person ReID is to associate multiple images of a given person moving through a non-overlapping camera network. Many efforts have been devoted to person ReID. However, most studies deal only with well-aligned bounding boxes that are detected manually and considered perfect inputs for person ReID. In fact, when building a fully automated person ReID system, the quality of the two preceding steps, person detection and tracking, may strongly affect person ReID performance. The contributions of this paper are twofold. First, a unified framework for person ReID based on deep learning models is proposed. In this framework, a deep neural network for person detection is coupled with a deep-learning-based tracking method. In addition, features extracted from an improved ResNet architecture are proposed for person representation to achieve higher ReID accuracy. Second, our self-built dataset is introduced and employed to evaluate all three steps of the fully automated person ReID framework.
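A sketch of the re-identification stage only: person crops from the detection/tracking stages are embedded with a ResNet backbone, and a gallery is ranked by cosine similarity. A stock torchvision ResNet-50 stands in for the paper's improved ResNet, and the crop size is an assumption.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T

backbone = models.resnet50(weights=None)
backbone.fc = torch.nn.Identity()          # use pooled features as embeddings
backbone.eval()

preprocess = T.Compose([T.Resize((256, 128)), T.ToTensor()])

@torch.no_grad()
def embed(crops):
    """crops: list of PIL person images from the detection/tracking stage."""
    batch = torch.stack([preprocess(c) for c in crops])
    feats = backbone(batch)
    return torch.nn.functional.normalize(feats, dim=1)

# Ranking: the gallery identity with the highest cosine similarity wins.
# query, gallery = embed([query_crop]), embed(gallery_crops)
# ranking = (query @ gallery.T).argsort(dim=1, descending=True)
```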


2020 ◽  
Author(s):  
Saeed Nosratabadi ◽  
Amir Mosavi ◽  
Puhong Duan ◽  
Pedram Ghamisi ◽  
Ferdinand Filip ◽  
...  

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis covers novel data science methods in four classes: deep learning models, hybrid deep learning models, hybrid machine learning models, and ensemble models. Application domains include a wide and diverse range of economics research, from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. The PRISMA method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, based on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward advancements in sophisticated hybrid deep learning models.


Author(s):  
Mourad Talbi ◽  
Med Salim Bouhlel

Background: In this paper, we propose a secure image watermarking technique applied to grayscale and color images. It consists in applying the SVD (Singular Value Decomposition) in the Lifting Wavelet Transform domain to embed a speech image (the watermark) into the host image. Methods: It also uses a signature in the embedding and extraction steps. Its performance is justified by the computation of the PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity), SNR (Signal-to-Noise Ratio), SegSNR (Segmental SNR), and PESQ (Perceptual Evaluation of Speech Quality). Results: The PSNR and SSIM are used to evaluate the perceptual quality of the watermarked image compared to the original image. The SNR, SegSNR, and PESQ are used to evaluate the perceptual quality of the reconstructed or extracted speech signal compared to the original speech signal. Conclusion: The results obtained from the computation of the PSNR, SSIM, SNR, SegSNR, and PESQ show the performance of the proposed technique.
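A sketch of SVD-based embedding in the wavelet domain. A standard DWT stands in here for the lifting wavelet transform, alpha is an assumed embedding strength, and the signature step of the paper is omitted.

```python
import numpy as np
import pywt

def embed_watermark(host, watermark, alpha=0.05):
    """Embed the watermark's singular values into the host's LL subband."""
    LL, details = pywt.dwt2(host.astype(np.float64), "haar")
    U, S, Vt = np.linalg.svd(LL, full_matrices=False)
    Sw = np.linalg.svd(watermark.astype(np.float64), compute_uv=False)
    k = min(S.size, Sw.size)
    S[:k] += alpha * Sw[:k]                  # additive embedding rule
    LL_marked = U @ np.diag(S) @ Vt
    return pywt.idwt2((LL_marked, details), "haar")

rng = np.random.default_rng(3)
host = rng.integers(0, 256, (256, 256)).astype(np.float64)
mark = rng.integers(0, 256, (128, 128)).astype(np.float64)
watermarked = embed_watermark(host, mark)
```

Extraction would invert the same steps, subtracting the host's singular values and rebuilding the watermark from its stored U and Vt factors.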


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 863
Author(s):  
Vidas Raudonis ◽  
Agne Paulauskaite-Taraseviciene ◽  
Kristina Sutiene

Background: Cell detection and counting is essential in evaluating the quality of an early-stage embryo. Full automation of this process remains challenging due to differences in cell size and shape, the presence of incomplete cell boundaries, and partially or fully overlapping cells. Moreover, the algorithm must process a large amount of image data of varying quality in a reasonable amount of time. Methods: A multi-focus image fusion approach based on the deep learning U-Net architecture is proposed, which reduces the amount of data by up to 7 times without losing the spectral information required for embryo enhancement in the microscopic image. Results: The experiment includes visual and quantitative analysis, estimating image similarity metrics and processing times, which are compared to the results achieved by two well-known techniques: the Inverse Laplacian Pyramid Transform and Enhanced Correlation Coefficient Maximization. Conclusion: The image fusion time is substantially improved for different image resolutions, while the high quality of the fused image is ensured.
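A compact U-Net sketch for multi-focus fusion: the stack of focal planes enters as channels and a single fused image comes out. The depth, channel widths, and 7-plane input are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class FusionUNet(nn.Module):
    def __init__(self, n_planes=7):
        super().__init__()
        self.enc1, self.enc2 = block(n_planes, 32), block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = block(64, 32)             # 64 = upsampled 32 + skip 32
        self.out = nn.Conv2d(32, 1, 1)       # single fused image

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d = self.dec(torch.cat([self.up(e2), e1], dim=1))
        return self.out(d)

model = FusionUNet()
planes = torch.randn(1, 7, 256, 256)         # one stack of 7 focal planes
fused = model(planes)                        # -> (1, 1, 256, 256)
```

Collapsing a 7-plane stack into one image is where the stated up-to-7x data reduction comes from.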


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3279
Author(s):  
Maria Habib ◽  
Mohammad Faris ◽  
Raneem Qaddoura ◽  
Manal Alomari ◽  
Alaa Alomari ◽  
...  

Maintaining a high quality of conversation between doctors and patients is essential in telehealth services, where efficient and competent communication is important to promote patient health. Assessing the quality of medical conversations is often handled through human auditory-perceptual evaluation. Typically, trained experts are needed for such tasks, as they follow systematic evaluation criteria. However, the rapid daily increase in consultations makes this evaluation process inefficient and impractical. This paper investigates the automation of the quality assessment process for patient–doctor voice-based conversations in a telehealth service using a deep-learning-based classification model. The data consist of audio recordings obtained from Altibbi, a digital health platform that provides telemedicine and telehealth services in the Middle East and North Africa (MENA). The objective is to assist Altibbi's operations team in evaluating the provided consultations in an automated manner. The proposed model is developed using three sets of features: features extracted at the signal level, at the transcript level, and at the combined signal and transcript levels. At the signal level, various statistical and spectral descriptors are calculated to characterize the spectral envelope of the speech recordings. At the transcript level, a pre-trained embedding model is utilized to capture the semantic and contextual features of the textual information. Additionally, the hybrid of the signal and transcript levels is explored and analyzed. The designed classification model relies on stacked layers of deep neural networks and convolutional neural networks. Evaluation results show that the model achieved a higher level of precision when compared with the manual evaluation approach followed by Altibbi's operations team.
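A sketch of the hybrid feature idea: spectral statistics summarize the audio signal, a pre-trained text embedding summarizes the transcript, and the concatenation feeds a small stacked classifier. The feature choices, the 384-dimension embedding size, and the three quality classes are assumptions; the actual Altibbi features and model are not reproduced here.

```python
import librosa
import numpy as np
import torch.nn as nn

def signal_features(path):
    """Statistical/spectral summary of one recording (27-dim, assumed)."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1),
                           [centroid.mean()]])

# transcript_vec would come from a pre-trained embedding model (384-dim is a
# placeholder); the signal and transcript levels are simply concatenated.
classifier = nn.Sequential(
    nn.Linear(27 + 384, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 3),            # e.g., low / medium / high quality
)
```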


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Clara Borrelli ◽  
Paolo Bestagini ◽  
Fabio Antonacci ◽  
Augusto Sarti ◽  
Stefano Tubaro

Abstract: Several methods for synthetic audio speech generation have been developed in the literature over the years. With the great technological advances brought by deep learning, many novel synthetic speech techniques achieving incredibly realistic results have recently been proposed. As these methods generate convincing fake human voices, they can be used maliciously to harm today's society (e.g., people impersonation, fake news spreading, opinion formation). For this reason, the ability to detect whether a speech recording is synthetic or pristine is becoming an urgent necessity. In this work, we develop a synthetic speech detector. It takes an audio recording as input, extracts a series of hand-crafted features motivated by the speech-processing literature, and classifies them in either a closed-set or an open-set setup. The proposed detector is validated on a publicly available dataset comprising 17 synthetic speech generation algorithms, ranging from old-fashioned vocoders to modern deep learning solutions. Results show that the proposed method outperforms recently proposed detectors in the forensics literature.
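A sketch of the closed-set versus open-set decision on hand-crafted features. A random forest stands in for the detector's classifier; in the open-set variant, a sample whose top class probability falls below a threshold is rejected as coming from an unseen generation algorithm. The synthetic feature vectors, the 17-class setup sizes, and the threshold value are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
X_train = rng.normal(size=(1700, 40))            # hand-crafted feature vectors
y_train = np.repeat(np.arange(17), 100)          # 17 known generation methods

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

def classify(x, open_set=True, threshold=0.5):
    proba = clf.predict_proba(x.reshape(1, -1))[0]
    if open_set and proba.max() < threshold:
        return -1                                # reject: unknown algorithm
    return int(proba.argmax())                   # closed-set decision

print(classify(rng.normal(size=40)))
```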

