Improving TDWZ Correlation Noise Estimation: A Deep Learning based Approach

Author(s):  
Vũ Hữu Tiến ◽  
Thao Nguyen Thi Huong ◽  
San Vu Van ◽  
Xiem HoangVan

Transform domain Wyner-Ziv video coding (TDWZ) has shown its benefits in compressing video for applications with limited resources, such as visual surveillance systems, remote sensing, and wireless sensor networks. In TDWZ, the correlation noise model (CNM) plays a vital role since it directly affects the number of bits that must be sent from the encoder and thus the overall TDWZ compression performance. To achieve a highly accurate CNM for TDWZ, we propose in this paper a novel CNM estimation approach in which the CNM with Laplacian distribution is adaptively estimated based on a deep learning (DL) mechanism. The proposed DL-based CNM includes two hidden layers and a linear activation function to adaptively update the Laplacian parameter. Experimental results show that the proposed TDWZ codec significantly outperforms the relevant benchmarks, notably by around 35% bitrate saving when compared to the DISCOVER codec and around 22% bitrate saving when compared to the HEVC Intra benchmark, while providing a similar perceptual quality.
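
For orientation, the Laplacian model that such an estimator parameterizes can be written down directly. The sketch below is not the authors' deep learning estimator. It only shows the Laplacian density and the conventional variance-based estimate of its parameter (alpha = sqrt(2 / variance), which follows from the Laplacian variance 2 / alpha^2), assuming zero-mean residuals between the Wyner-Ziv frame and the side information. The function names are hypothetical.

```python
import math


def laplacian_alpha(residuals):
    """Variance-based estimate of the Laplacian parameter alpha.

    Assumes the residuals are zero-mean, so the variance is the mean of
    the squared samples and alpha = sqrt(2 / variance).
    """
    if not residuals:
        raise ValueError("residuals must not be empty")
    variance = sum(r * r for r in residuals) / len(residuals)
    if variance == 0.0:
        raise ValueError("variance is zero; alpha is undefined")
    return math.sqrt(2.0 / variance)


def laplacian_pdf(r, alpha):
    """Zero-mean Laplacian density f(r) = (alpha / 2) * exp(-alpha * |r|)."""
    return 0.5 * alpha * math.exp(-alpha * abs(r))
```

A larger alpha means a narrower density, that is, side information that is closer to the original frame; a learned estimator would replace `laplacian_alpha` while the density itself stays the same.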

2015 ◽  
Vol 752-753 ◽  
pp. 1110-1115
Author(s):  
Rui Cai ◽  
Deng Yin Zhang

In the transform domain distributed video coding scheme, we found a certain deviation between the Laplacian statistical distribution and the actual distribution of small and large residual coefficients. To reduce this deviation, this paper proposes a hybrid distribution correlation noise model (HDCNM) based on K-Medoids, which models small coefficients with an improved Laplacian distribution and large ones with a Cauchy distribution. The parameter estimation algorithm is also given. The experimental results show that the proposed hybrid model accurately describes the distribution of residual coefficients between the WZ frame and the side information, thereby effectively improving the distortion performance of transform domain distributed video coding and reducing the computational complexity of the decoder.


Author(s):  
V. Vinodhini ◽  
B. Sathiyabhama ◽  
S. Sankar ◽  
Ramasubbareddy Somula

Video captions help people follow content in a noisy environment or when the sound is muted, and they help people with impaired hearing understand it much better. Captions not only support content creators and translators but also boost search engine optimization. With the successful growth of deep learning techniques, advanced areas such as computer vision and human-computer interaction play a vital role. Numerous surveys on deep learning models have appeared, covering different methods, architectures, and metrics. Working with video subtitles is still challenging in terms of activity recognition in video. This paper proposes a deep structured model that is effective for activity recognition and that automatically classifies and captions the activity in a single architecture. The first step subtracts the foreground from the background by building a 3D convolutional neural network (CNN) model. A Gaussian mixture model is used to remove the backdrop. Classification is done using long short-term memory (LSTM) networks. A hidden Markov model (HMM) is used to generate high-quality data. Next, a nonlinear activation function is used to perform the normalization process. Finally, video captioning is achieved using natural language.


Face recognition plays a vital role in security. In recent years, researchers have focused on pose, illumination, face recognition, etc. Traditional face recognition methods focus on OpenCV's Fisherfaces, which analyze facial expressions and attributes. The deep learning method used in the proposed system is a convolutional neural network (CNN). The proposed work includes the following modules: [1] Face Detection [2] Gender Recognition [3] Age Prediction. The results obtained from this work show that real-time age and gender detection using a CNN provides better accuracy than other existing approaches.


2019 ◽  
Vol 9 (22) ◽  
pp. 4871 ◽  
Author(s):  
Quan Liu ◽  
Chen Feng ◽  
Zida Song ◽  
Joseph Louis ◽  
Jian Zhou

Earthmoving is an integral civil engineering operation of significance, and tracking its productivity requires the statistics of loads moved by dump trucks. Since current truck loads’ statistics methods are laborious, costly, and limited in application, this paper presents the framework of a novel, automated, non-contact field earthmoving quantity statistics (FEQS) for projects with large earthmoving demands that use uniform and uncovered trucks. The proposed FEQS framework utilizes field surveillance systems and adopts vision-based deep learning for full/empty-load truck classification as the core work. Since convolutional neural network (CNN) and its transfer learning (TL) forms are popular vision-based deep learning models and numerous in type, a comparison study is conducted to test the framework’s core work feasibility and evaluate the performance of different deep learning models in implementation. The comparison study involved 12 CNN or CNN-TL models in full/empty-load truck classification, and the results revealed that while several provided satisfactory performance, the VGG16-FineTune provided the optimal performance. This proved the core work feasibility of the proposed FEQS framework. Further discussion provides model choice suggestions that CNN-TL models are more feasible than CNN prototypes, and models that adopt different TL methods have advantages in either working accuracy or speed for different tasks.


2021 ◽  
pp. 136943322098663
Author(s):  
Diana Andrushia A ◽  
Anand N ◽  
Eva Lubloy ◽  
Prince Arulraj G

Health monitoring of concrete, including the detection of defects such as cracking and spalling in fire-affected concrete structures, plays a vital role in the maintenance of reinforced cement concrete structures. However, this process mostly uses human inspection and relies on the subjective knowledge of the inspectors. To overcome this limitation, a deep learning based automatic crack detection method is proposed. Deep learning is a vibrant strategy in the computer vision field. The proposed method consists of a U-Net architecture with an encoder and decoder framework. It performs pixel-wise classification to detect the thermal cracks accurately. A Binary Cross Entropy (BCE) based loss function is selected as the evaluation function. The trained U-Net is capable of detecting major and minor thermal cracks under various heating durations. The proposed U-Net crack detection is a novel method that can be used to detect the thermal cracks developed on fire-exposed concrete structures. The proposed method is compared with other state-of-the-art methods and found to be accurate, with 78.12% Intersection over Union (IoU).
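
The two quantities named in this abstract, binary cross entropy and Intersection over Union, have standard definitions that a short sketch can make concrete. The code below is an illustration on flattened pixel lists, not the authors' implementation. The function names, the clipping constant `eps`, and the convention of returning 1.0 when both masks are empty are assumptions made here.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross entropy over pixels.

    y_true holds 0/1 labels and y_prob holds predicted crack
    probabilities; probabilities are clipped to avoid log(0).
    """
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)


def intersection_over_union(y_true, y_pred):
    """IoU between two binary masks given as flat lists of 0/1 values."""
    intersection = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    union = sum(1 for t, p in zip(y_true, y_pred) if t == 1 or p == 1)
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


truth = [1, 1, 0, 0]
predicted_mask = [1, 0, 1, 0]
iou = intersection_over_union(truth, predicted_mask)  # one shared pixel out of three
```

The loss works on probabilities during training, while IoU is computed on thresholded masks at evaluation time, which is why the two functions take different kinds of input.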


2021 ◽  
Vol 7 (2) ◽  
pp. 12
Author(s):  
Yousef I. Mohamad ◽  
Samah S. Baraheem ◽  
Tam V. Nguyen

Automatic event recognition in sports photos is both an interesting and valuable research topic in the field of computer vision and deep learning. With the rapid increase and explosive spread of data, which is being captured every moment, fast and precise access to the right information has become a challenging task of considerable importance for multiple practical applications, e.g., sports image and video search, sport data analysis, healthcare monitoring applications, monitoring and surveillance systems for indoor and outdoor activities, and video captioning. In this paper, we evaluate different deep learning models in recognizing and interpreting the sport events in the Olympic Games. To this end, we collect a dataset dubbed Olympic Games Event Image Dataset (OGED) including 10 different sport events scheduled for the Olympic Games Tokyo 2020. Then, transfer learning is applied to three popular deep convolutional neural network architectures, namely AlexNet, VGG-16, and ResNet-50, along with various data augmentation methods. Extensive experiments show that ResNet-50 with the proposed photobombing guided data augmentation achieves 90% in terms of accuracy.


Author(s):  
Chen-Xu Liu ◽  
Gui-Lan Yu

This study presents a deep learning based approach to designing layered periodic wave barriers that takes a typical range of soil parameters into account. Three cases are considered, in which the P wave and S wave exist separately or simultaneously. The deep learning model is composed of an autoencoder with a pretrained decoder that has three branches to output frequency attenuation domains for the three cases. A periodic activation function is used to improve the design accuracy, and condition variables are applied in the code layer of the autoencoder to meet the requirements of multiple practical working conditions. Forty thousand sets of data are generated to train, validate, and test the model, and the designed results are highly consistent with the targets. The presented approach offers great generality, feasibility, rapidity, and accuracy in designing layered periodic wave barriers that exhibit good performance in wave suppression in the targeted frequency range.


2021 ◽  
Author(s):  
Shwetank Krishna ◽  
Syahrir Ridha ◽  
Suhaib Umer Ilyas ◽  
Scott Campbell ◽  
Uday Bhan ◽  
...  

Accurate prediction of the downhole pressure differential (surge/swab pressure gradient) in the eccentric annulus of ultra-deep wells during tripping operation is necessary to optimize well geometry, reduce drilling anomalies, and prevent hazardous drilling accidents. Therefore, a new predictive model is developed to forecast the surge/swab pressure gradient by using feed-forward and backpropagation deep neural networks (FFBP-DNN). A theoretical model is developed that follows the physical and mechanical aspects of surge/swab pressure generation in the eccentric annulus during tripping operation. The data generated from this model, field data, and experimental data are used to train and test the FFBP-DNN networks. The network is developed using the Keras deep learning framework. After testing the models, the optimal FFBP-DNN arrangement is found to use ReLU as the activation function, 4 hidden layers, a learning rate of 0.003, and a training number of 2300. The optimum FFBP-DNN model is validated by comparing it with field data (Wells K 470 and K 480, North Sea). It shows excellent agreement between predicted data and field data, with an error range of ±7.68%.
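
To make the reported arrangement concrete, the sketch below runs the forward pass of a feed-forward network with four ReLU hidden layers and a linear output in plain Python. It is an illustration only, not the authors' Keras model. The input count, the hidden-layer width of 8, the weight initialization, and all function names are assumptions, and backpropagation training (with the reported learning rate of 0.003) is not shown.

```python
import random


def relu(z):
    """Rectified linear unit: passes positive values, zeroes the rest."""
    return z if z > 0.0 else 0.0


def init_layer(n_in, n_out, rng):
    """Create one dense layer as (weights, biases) with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_network(n_inputs, hidden_sizes, n_outputs, seed=0):
    """Stack dense layers: one per hidden size plus a final output layer."""
    rng = random.Random(seed)
    sizes = [n_inputs] + list(hidden_sizes) + [n_outputs]
    return [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def forward(network, inputs):
    """Apply ReLU after every hidden layer and keep the output layer linear."""
    activations = list(inputs)
    last = len(network) - 1
    for index, (weights, biases) in enumerate(network):
        pre = [sum(w * a for w, a in zip(row, activations)) + b
               for row, b in zip(weights, biases)]
        activations = pre if index == last else [relu(z) for z in pre]
    return activations


# Hypothetical sizes: 3 input features, four hidden layers of width 8, 1 output.
network = build_network(3, [8, 8, 8, 8], 1)
prediction = forward(network, [0.2, 0.5, 0.1])
```

The linear output layer reflects that the target is a continuous pressure gradient; in a framework such as Keras the same structure would be expressed as stacked dense layers.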


Agriculture is the backbone of many Asian countries and plays a vital role in them. Farmers mainly depend on their agricultural produce for their living. A report says that agricultural loss, which is primarily due to plant diseases, amounts to one-third of farmers' income. To combat this, farmers need an early plant disease identification mechanism. Observing individual plants on the farm to detect disease is labor-intensive and time-consuming, and it is even worse when the farm is vast and multiple plants are cultivated. To solve such issues, current technologies such as the Internet of Things (IoT), artificial intelligence (AI), and machine learning (ML) are used to predict diseases more effectively. Farmers usually detect plant diseases with the help of images captured manually and analyzed separately by experts. The proposed system renders an efficient solution for detecting multiple diseases in several plant varieties. The system is designed to detect and recognize several plant varieties, specifically pepper, grapes, and strawberry. It identifies the diseases of these plants from images captured by a built-in camera on the autonomous rover. The rover also records its GPS location and makes a map of the entire farm traced and checked by the robot. The images are processed and classified into their respective categories using deep learning algorithms. Convolutional neural networks, a powerful methodology for image classification, are the underlying principle applied. Two deep learning architectures, namely VGG16 and InceptionResNetV2, are used to train the model. These models are primarily made of convolutional layers. On testing, we recorded an accuracy of 93.21% with VGG16 and 95.24% with InceptionResNetV2.

