Watermarking Based on Compressive Sensing for Digital Speech Detection and Recovery †

Sensors ◽  
2018 ◽  
Vol 18 (7) ◽  
pp. 2390 ◽  
Author(s):  
Wenhuan Lu ◽  
Zonglei Chen ◽  
Ling Li ◽  
Xiaochun Cao ◽  
Jianguo Wei ◽  
...  

In this paper, a novel imperceptible, fragile, and blind watermarking scheme is proposed for speech tampering detection and self-recovery. The embedded watermark data used for content recovery is calculated from the original discrete cosine transform (DCT) coefficients of the host speech. The watermark information is shared across a group of frames rather than stored in a single frame, which trades off the data-waste problem against the tampering-coincidence problem. When part of a watermarked speech signal is tampered with, the tampered area can be accurately localized, and the watermark data in the unmodified areas can still be extracted. A compressive sensing technique is then employed to retrieve the coefficients by exploiting sparseness in the DCT domain: the smaller the tampered area, the better the quality of the recovered signal. Experimental results show that the watermarked signal is imperceptible and that the recovered signal remains intelligible for tampering rates of up to 47.6%. A deep learning-based enhancement method is also proposed and implemented to increase the SNR of the recovered speech signal.
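A minimal sketch of the recovery idea only: a DCT-sparse speech frame is reconstructed from a reduced set of random measurements via Orthogonal Matching Pursuit. The frame length, measurement matrix, and sparsity level are illustrative assumptions, not the parameters of the paper's scheme.

```python
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(0)

n, m, k = 256, 128, 10          # frame length, measurements, assumed sparsity
coeffs = np.zeros(n)
coeffs[rng.choice(n, k, replace=False)] = rng.normal(size=k)
frame = idct(coeffs, norm="ortho")           # synthetic DCT-sparse speech frame

phi = rng.normal(size=(m, n)) / np.sqrt(m)   # random measurement matrix
psi = idct(np.eye(n), axis=0, norm="ortho")  # DCT synthesis basis
A = phi @ psi                                # sensing matrix in the sparse domain
y = phi @ frame                              # compressed measurements

def omp(A, y, k):
    """Greedy OMP: pick the k atoms most correlated with the residual."""
    residual, support = y.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x_s
    x = np.zeros(A.shape[1])
    x[support] = x_s
    return x

recovered = idct(omp(A, y, k), norm="ortho")
print("recovery error:", np.linalg.norm(recovered - frame))
```

With exact sparsity and enough measurements the error is near zero; in the paper's setting the measurements are the surviving watermark bits of the untampered frames.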

2021 ◽  
Vol 11 (1) ◽  
pp. 480-490
Author(s):  
Asha Gnana Priya Henry ◽  
Anitha Jude

Abstract: Retinal image analysis is one of the important diagnostic methods in modern ophthalmology, because much of the information about the eye is present in the retina. The image acquisition process can degrade image quality, which can be remedied by better image enhancement techniques combined with a computer-aided diagnosis system. Deep learning is an important computational technique for medical imaging applications. The main aim of this article is to find the best enhancement techniques for the identification of diabetic retinopathy (DR); the enhanced images are tested with commonly used deep learning techniques, and the performances are measured. The input images are taken from the Indian Diabetic Retinopathy Image Dataset, and 13 filters, including smoothing and sharpening filters, are used to enhance the images. The quality of the enhancement techniques is then compared using performance metrics; the Median, Gaussian, Bilateral, Wiener, and partial differential equation filters give better results and are combined to further improve enhancement. The output images from all the enhancement filters are given as input to a convolutional neural network, and the results are compared to find the better enhancement method.
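An illustrative sketch of the enhancement stage: several of the better-performing filters named above are applied to a fundus image before it is fed to the CNN. The file name and filter parameters are placeholder assumptions.

```python
import cv2
import numpy as np
from scipy.signal import wiener

img = cv2.imread("fundus_sample.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file

enhanced = {
    "median":    cv2.medianBlur(img, 5),
    "gaussian":  cv2.GaussianBlur(img, (5, 5), 0),
    "bilateral": cv2.bilateralFilter(img, d=9, sigmaColor=75, sigmaSpace=75),
    "wiener":    np.clip(wiener(img.astype(np.float64), mysize=5),
                         0, 255).astype(np.uint8),
}

# Each enhanced image would then be resized and stacked into a batch for the
# CNN input, and the resulting accuracies compared across filters.
batch = np.stack([cv2.resize(e, (224, 224)) for e in enhanced.values()])
print(batch.shape)  # (4, 224, 224)
```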


2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
Dongmei Niu ◽  
Hongxia Wang ◽  
Minquan Cheng ◽  
Canghong Shi

This paper presents a self-embedding watermarking scheme based on a reference sharing mechanism. The host image is embedded with watermark bits comprising the reference data for content recovery and the authentication data for tampering localization. A special encoding matrix derived from the generator matrix of a selected systematic Maximum Distance Separable (MDS) code is adopted. The reference data is generated by encoding the representative data of all the original image blocks. On the receiver side, tampered image blocks can be located using the authentication data. The reference data embedded in one image block can be shared by all the image blocks to restore the tampered content, so the tampering-coincidence problem is avoided to the greatest extent possible. The maximal tampering rate is deduced theoretically. Experimental results show that, as long as the tampering rate is below the maximal tampering rate, content recovery is deterministic, and the quality of the recovered content does not degrade as the tampering rate approaches that maximum.
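A toy sketch of the reference-sharing idea: representative data from all blocks is mixed through an encoding matrix, and every block carries a few of the resulting reference bits, so the bits needed to restore a tampered block survive in the blocks left intact. A random binary matrix stands in here for the generator matrix of the systematic MDS code used in the paper, and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

n_blocks, bits_per_block, n_reference = 64, 8, 256   # illustrative sizes
representative = rng.integers(0, 2, n_blocks * bits_per_block)

G = rng.integers(0, 2, (n_reference, representative.size))  # stand-in matrix
reference = (G @ representative) % 2        # reference bits over GF(2)

# Each block embeds an equal share of the reference bits; recovery solves the
# linear system restricted to the shares extracted from untampered blocks.
shares = reference.reshape(n_blocks, -1)
print(shares.shape)  # (64, 4): 4 reference bits embedded per block
```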


Author(s):  
Kuangfeng Ning ◽  
Guojun Qin

The proposed compressive sensing (CS) method is a new alternative approach for eliminating noise from an input signal; it enhances the quality of the speech signal while requiring fewer samples for reconstruction than methods based on the Nyquist sampling theorem. The basic idea is that speech signals are sparse in nature, whereas most noise signals are non-sparse; CS discards the non-sparse components and reconstructs only the sparse components of the input signal. Experimental results show that the average segmental SNR (signal-to-noise ratio) and PESQ (perceptual evaluation of speech quality) scores are better in the compressed domain.
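A minimal sketch of the underlying idea: speech is sparse in the DCT domain while broadband noise is not, so keeping only the largest coefficients suppresses the noise. Hard thresholding is used here as a simple stand-in for the full compressive sensing reconstruction, and the sparsity ratio and test signal are assumptions.

```python
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 8000, endpoint=False)
clean = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)
noisy = clean + 0.3 * rng.normal(size=t.size)      # non-sparse additive noise

c = dct(noisy, norm="ortho")
keep = int(0.02 * c.size)                          # assume ~2% sparsity
threshold = np.sort(np.abs(c))[-keep]
c[np.abs(c) < threshold] = 0.0                     # drop non-sparse components
denoised = idct(c, norm="ortho")

snr = lambda ref, x: 10 * np.log10(np.sum(ref**2) / np.sum((ref - x)**2))
print(f"SNR before: {snr(clean, noisy):.1f} dB, after: {snr(clean, denoised):.1f} dB")
```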


Author(s):  
Ashish Dwivedi ◽  
Nirupma Tiwari

Image enhancement (IE) is very important in fields where the visual appearance of an image is the main concern. Image enhancement is the process of improving an image so that the resulting output is more suitable than the original for a specific task. With the help of image enhancement, image quality can be improved so that images are clearer for human perception or for further analysis by machines. Image enhancement methods improve quality, visual appearance, and clarity; remove blurring and noise; increase contrast; and reveal details. The aim of this paper is to study the existing IE techniques and determine their limitations, and it provides an overview of the different IE techniques commonly used. We apply the DWT to the original RGB image and then apply FHE (Fuzzy Histogram Equalization); after the DWT, wavelet shrinkage is performed on the three detail bands (LH, HL, HH). Finally, the shrinkage image and the FHE image are fused together to obtain the enhanced image.
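A sketch of the pipeline described above: DWT on the image, soft shrinkage on the detail bands (LH, HL, HH), histogram equalization standing in for the fuzzy variant (FHE), then a simple averaging fusion. The wavelet choice, threshold, file name, and fusion weights are assumptions.

```python
import cv2
import numpy as np
import pywt

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file

# DWT, then wavelet shrinkage on the three detail bands
LL, (LH, HL, HH) = pywt.dwt2(gray.astype(np.float64), "haar")
LH, HL, HH = (pywt.threshold(b, value=10.0, mode="soft") for b in (LH, HL, HH))
shrunk = pywt.idwt2((LL, (LH, HL, HH)), "haar")[:gray.shape[0], :gray.shape[1]]

# Histogram equalization (stand-in for FHE), then fuse the two results
equalized = cv2.equalizeHist(gray).astype(np.float64)
fused = np.clip(0.5 * shrunk + 0.5 * equalized, 0, 255).astype(np.uint8)
cv2.imwrite("enhanced.png", fused)
```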


2020 ◽  
Vol 71 (7) ◽  
pp. 868-880
Author(s):  
Nguyen Hong-Quan ◽  
Nguyen Thuy-Binh ◽  
Tran Duc-Long ◽  
Le Thi-Lan

Along with the strong development of camera networks, video analysis systems have become more and more popular and have been applied in various practical applications. In this paper, we focus on the person re-identification (person ReID) task, a crucial step in video analysis systems. The purpose of person ReID is to associate multiple images of a given person moving through a non-overlapping camera network. Many efforts have been devoted to person ReID. However, most studies deal only with well-aligned bounding boxes that are detected manually and considered perfect inputs for person ReID. In fact, when building a fully automated person ReID system, the quality of the two preceding steps, person detection and tracking, may strongly affect person ReID performance. The contributions of this paper are twofold. First, a unified framework for person ReID based on deep learning models is proposed. In this framework, a deep neural network for person detection is coupled with a deep-learning-based tracking method. In addition, features extracted from an improved ResNet architecture are proposed for person representation to achieve higher ReID accuracy. Second, our self-built dataset is introduced and employed to evaluate all three steps of the fully automated person ReID framework.
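A sketch of the re-identification stage only: person crops from the detection/tracking stages are embedded with a ResNet backbone, and a gallery is ranked by cosine similarity. A stock torchvision ResNet-50 stands in for the paper's improved ResNet, and the crop size is an assumption.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T

backbone = models.resnet50(weights=None)
backbone.fc = torch.nn.Identity()          # use pooled features as embeddings
backbone.eval()

preprocess = T.Compose([T.Resize((256, 128)), T.ToTensor()])

@torch.no_grad()
def embed(crops):
    """crops: list of PIL person images from the detection/tracking stage."""
    batch = torch.stack([preprocess(c) for c in crops])
    feats = backbone(batch)
    return torch.nn.functional.normalize(feats, dim=1)

# Ranking: the gallery identity with the highest cosine similarity wins.
# query, gallery = embed([query_crop]), embed(gallery_crops)
# ranking = (query @ gallery.T).argsort(dim=1, descending=True)
```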


2020 ◽  
Author(s):  
Saeed Nosratabadi ◽  
Amir Mosavi ◽  
Puhong Duan ◽  
Pedram Ghamisi ◽  
Ferdinand Filip ◽  
...  

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis covers novel data science methods in four classes: deep learning models, hybrid deep learning models, hybrid machine learning models, and ensemble models. Application domains include a wide and diverse range of economics research, from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. The PRISMA method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, based on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward advancements in sophisticated hybrid deep learning models.


Author(s):  
Mourad Talbi ◽  
Med Salim Bouhlel

Background: In this paper, we propose a secure image watermarking technique applied to grayscale and color images. It consists in applying the SVD (Singular Value Decomposition) in the Lifting Wavelet Transform domain to embed a speech image (the watermark) into the host image. Methods: It also uses a signature in the embedding and extraction steps. Its performance is justified by the computation of the PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity), SNR (Signal-to-Noise Ratio), SegSNR (Segmental SNR), and PESQ (Perceptual Evaluation of Speech Quality). Results: The PSNR and SSIM are used to evaluate the perceptual quality of the watermarked image compared to the original image. The SNR, SegSNR, and PESQ are used to evaluate the perceptual quality of the reconstructed or extracted speech signal compared to the original speech signal. Conclusion: The results obtained from the computation of the PSNR, SSIM, SNR, SegSNR, and PESQ show the performance of the proposed technique.
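A sketch of SVD-based embedding in the wavelet domain. A standard DWT stands in here for the lifting wavelet transform, alpha is an assumed embedding strength, and the signature step of the paper is omitted.

```python
import numpy as np
import pywt

def embed_watermark(host, watermark, alpha=0.05):
    """Embed the watermark's singular values into the host's LL subband."""
    LL, details = pywt.dwt2(host.astype(np.float64), "haar")
    U, S, Vt = np.linalg.svd(LL, full_matrices=False)
    Sw = np.linalg.svd(watermark.astype(np.float64), compute_uv=False)
    k = min(S.size, Sw.size)
    S[:k] += alpha * Sw[:k]                  # additive embedding rule
    LL_marked = U @ np.diag(S) @ Vt
    return pywt.idwt2((LL_marked, details), "haar")

rng = np.random.default_rng(3)
host = rng.integers(0, 256, (256, 256)).astype(np.float64)
mark = rng.integers(0, 256, (128, 128)).astype(np.float64)
watermarked = embed_watermark(host, mark)
```

Extraction would invert the same steps, subtracting the host's singular values and rebuilding the watermark from its stored U and Vt factors.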


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 863
Author(s):  
Vidas Raudonis ◽  
Agne Paulauskaite-Taraseviciene ◽  
Kristina Sutiene

Background: Cell detection and counting is essential in evaluating the quality of an early-stage embryo. Full automation of this process remains challenging due to differences in cell size and shape, the presence of incomplete cell boundaries, and partially or fully overlapping cells. Moreover, the algorithm must process a large amount of image data of varying quality in a reasonable amount of time. Methods: A multi-focus image fusion approach based on the deep learning U-Net architecture is proposed, which reduces the amount of data by up to 7 times without losing the spectral information required for embryo enhancement in the microscopic image. Results: The experiment includes visual and quantitative analysis, estimating image similarity metrics and processing times, which are compared to the results achieved by two well-known techniques: the Inverse Laplacian Pyramid Transform and Enhanced Correlation Coefficient Maximization. Conclusion: The image fusion time is substantially improved for different image resolutions, while the high quality of the fused image is ensured.
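A compact U-Net sketch for multi-focus fusion: the stack of focal planes enters as channels and a single fused image comes out. The depth, channel widths, and 7-plane input are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class FusionUNet(nn.Module):
    def __init__(self, n_planes=7):
        super().__init__()
        self.enc1, self.enc2 = block(n_planes, 32), block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = block(64, 32)             # 64 = upsampled 32 + skip 32
        self.out = nn.Conv2d(32, 1, 1)       # single fused image

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d = self.dec(torch.cat([self.up(e2), e1], dim=1))
        return self.out(d)

model = FusionUNet()
planes = torch.randn(1, 7, 256, 256)         # one stack of 7 focal planes
fused = model(planes)                        # -> (1, 1, 256, 256)
```

Collapsing a 7-plane stack into one image is where the stated up-to-7x data reduction comes from.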


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3279
Author(s):  
Maria Habib ◽  
Mohammad Faris ◽  
Raneem Qaddoura ◽  
Manal Alomari ◽  
Alaa Alomari ◽  
...  

Maintaining a high quality of conversation between doctors and patients is essential in telehealth services, where efficient and competent communication is important to promote patient health. Assessing the quality of medical conversations is often handled through human auditory-perceptual evaluation. Typically, trained experts are needed for such tasks, as they follow systematic evaluation criteria. However, the rapid daily increase in consultations makes this evaluation process inefficient and impractical. This paper investigates the automation of the quality assessment process for patient–doctor voice-based conversations in a telehealth service using a deep-learning-based classification model. The data consist of audio recordings obtained from Altibbi, a digital health platform that provides telemedicine and telehealth services in the Middle East and North Africa (MENA). The objective is to assist Altibbi's operations team in evaluating the provided consultations in an automated manner. The proposed model is developed using three sets of features: features extracted at the signal level, at the transcript level, and at the combined signal and transcript levels. At the signal level, various statistical and spectral descriptors are calculated to characterize the spectral envelope of the speech recordings. At the transcript level, a pre-trained embedding model is utilized to capture the semantic and contextual features of the textual information. Additionally, the hybrid of the signal and transcript levels is explored and analyzed. The designed classification model relies on stacked layers of deep neural networks and convolutional neural networks. Evaluation results show that the model achieved a higher level of precision when compared with the manual evaluation approach followed by Altibbi's operations team.
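A sketch of the hybrid feature idea: spectral statistics summarize the audio signal, a pre-trained text embedding summarizes the transcript, and the concatenation feeds a small stacked classifier. The feature choices, the 384-dimension embedding size, and the three quality classes are assumptions; the actual Altibbi features and model are not reproduced here.

```python
import librosa
import numpy as np
import torch.nn as nn

def signal_features(path):
    """Statistical/spectral summary of one recording (27-dim, assumed)."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1),
                           [centroid.mean()]])

# transcript_vec would come from a pre-trained embedding model (384-dim is a
# placeholder); the signal and transcript levels are simply concatenated.
classifier = nn.Sequential(
    nn.Linear(27 + 384, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 3),            # e.g., low / medium / high quality
)
```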


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Clara Borrelli ◽  
Paolo Bestagini ◽  
Fabio Antonacci ◽  
Augusto Sarti ◽  
Stefano Tubaro

Abstract: Several methods for synthetic audio speech generation have been developed in the literature over the years. With the great technological advances brought by deep learning, many novel synthetic speech techniques achieving incredibly realistic results have recently been proposed. As these methods generate convincing fake human voices, they can be used maliciously to harm today's society (e.g., people impersonation, fake news spreading, opinion formation). For this reason, the ability to detect whether a speech recording is synthetic or pristine is becoming an urgent necessity. In this work, we develop a synthetic speech detector. It takes an audio recording as input, extracts a series of hand-crafted features motivated by the speech-processing literature, and classifies them in either a closed-set or an open-set setup. The proposed detector is validated on a publicly available dataset comprising 17 synthetic speech generation algorithms, ranging from old-fashioned vocoders to modern deep learning solutions. Results show that the proposed method outperforms recently proposed detectors in the forensics literature.
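A sketch of the closed-set versus open-set decision on hand-crafted features. A random forest stands in for the detector's classifier; in the open-set variant, a sample whose top class probability falls below a threshold is rejected as coming from an unseen generation algorithm. The synthetic feature vectors, the 17-class setup sizes, and the threshold value are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
X_train = rng.normal(size=(1700, 40))            # hand-crafted feature vectors
y_train = np.repeat(np.arange(17), 100)          # 17 known generation methods

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

def classify(x, open_set=True, threshold=0.5):
    proba = clf.predict_proba(x.reshape(1, -1))[0]
    if open_set and proba.max() < threshold:
        return -1                                # reject: unknown algorithm
    return int(proba.argmax())                   # closed-set decision

print(classify(rng.normal(size=40)))
```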

