Video Forensics: Identifying Colorized Images Using Deep Learning

2021 ◽  
Vol 11 (2) ◽  
pp. 476
Author(s):  
Carlos Ulloa ◽  
Dora M. Ballesteros ◽  
Diego Renza

In recent years there has been a significant increase in images and videos circulating on social networks and in the media, edited with different techniques, including colorization. This has a negative impact on the forensic field because it is increasingly difficult to discern what is original content and what is fake. To address this problem, we propose two CNN-based models (a custom architecture and a transfer-learning-based model) that allow fast recognition of colorized images (or videos). In the experimental tests, the effect of three hyperparameters on the performance of the classifier was analyzed in terms of HTER (Half Total Error Rate). The best result was found for the Adam optimizer, with a dropout of 0.25 and an input image size of 400 × 400 pixels. Additionally, the proposed models are compared with each other in terms of performance and inference times, and with some state-of-the-art approaches. In terms of inference time per image, the proposed custom model is 12x faster than the transfer-learning-based model; however, in terms of precision (P), recall and F1-score, the transfer-learning-based model is better than the custom model. Both models generalize better than other models reported in the literature.
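For reference, HTER averages the false acceptance rate (originals flagged as colorized) and the false rejection rate (colorized images missed). A minimal sketch, with the label convention and sample values chosen purely for illustration:

```python
def hter(y_true, y_pred):
    """Half Total Error Rate: mean of the false acceptance rate (FAR)
    and false rejection rate (FRR). Label 1 = colorized, 0 = original."""
    fa = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fr = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    far = fa / negatives
    frr = fr / positives
    return (far + frr) / 2

# one error in each class of four samples -> FAR = FRR = 0.25
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]
print(hter(y_true, y_pred))  # 0.25
```

Averaging the two per-class rates keeps the metric meaningful even when originals and colorized samples are imbalanced in the test set.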

2011 ◽  
Vol 194-196 ◽  
pp. 248-254
Author(s):  
Shao Jun Chu ◽  
Pei Xiao Liu ◽  
Pei Xian Chen

The burden of ferromanganese alloy was calculated based on the slag composition and the designed product. The results showed that the total error rate of this method was 1.53%, and the error rates for ore, coke and silicon were 4.66%, 1.71% and 5.66%, respectively, which is much better than the traditional element-recovery method, whose total error rate was 8.00% and whose silicon error rate reached 18.55%. This new method is more accurate than the traditional method and much closer to actual production data. It can be applied to different ferroalloy factories because it is based on phase diagrams and the law of mass conservation. At the same time, the calculation result can reflect the gap between an enterprise's actual production level and the ideal production level. This method has reference value for improving the production technology, product quality and economic profit of an enterprise.
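The error rates above are relative deviations between the calculated burden and actual plant data. A minimal sketch of that comparison; the material names and masses below are hypothetical, chosen only to illustrate the calculation:

```python
def error_rate(calculated, actual):
    """Relative error (%) of a calculated burden mass vs. actual plant data."""
    return abs(calculated - actual) / actual * 100

# hypothetical burden masses (kg per tonne of alloy), for illustration only
actual     = {"ore": 2000.0, "coke": 410.0, "silica": 180.0}
calculated = {"ore": 2093.2, "coke": 417.0, "silica": 190.2}

errors = {k: round(error_rate(calculated[k], actual[k]), 2) for k in actual}
print(errors)  # {'ore': 4.66, 'coke': 1.71, 'silica': 5.67}
```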


2020 ◽  
Vol 30 (10) ◽  
pp. 2050060
Author(s):  
Pankaj Mishra ◽  
Claudio Piciarelli ◽  
Gian Luca Foresti

Image anomaly detection is an application-driven problem where the aim is to identify novel samples which differ significantly from normal ones. We propose Pyramidal Image Anomaly DEtector (PIADE), a deep reconstruction-based pyramidal approach in which image features are extracted at different scale levels to better capture the peculiarities that can help discriminate between normal and anomalous data. The features are dynamically routed to a reconstruction layer, and anomalies can be identified by comparing the input image with its reconstruction. Unlike similar approaches, the comparison uses structural similarity and perceptual loss rather than a trivial pixel-by-pixel comparison. The proposed method performed on par with or better than state-of-the-art methods when tested on publicly available datasets such as CIFAR10, COIL-100 and MVTec.
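The reconstruction comparison can be sketched with a structural rather than pixel-wise score. A minimal numpy illustration using a simplified single-window SSIM; the paper's windowed SSIM and learned perceptual loss are more involved, and the function names here are hypothetical:

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Simplified SSIM computed over the whole image (no sliding window)."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def anomaly_score(image, reconstruction):
    """Higher score = reconstruction is structurally further from the input."""
    return 1.0 - global_ssim(image, reconstruction)

rng = np.random.default_rng(0)
img = rng.random((32, 32))
print(anomaly_score(img, img))               # ~0.0 (perfect reconstruction)
print(anomaly_score(img, 1.0 - img) > 0.5)   # True: structure inverted
```

A pixel-wise MSE would also flag the inverted image, but SSIM additionally discounts small brightness and contrast shifts that are not structural anomalies.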


2021 ◽  
Vol 12 (2) ◽  
pp. 1-24
Author(s):  
Md Abul Bashar ◽  
Richi Nayak

The language model (LM) has become a common means of transfer learning in Natural Language Processing (NLP) tasks when working with small labeled datasets. An LM is pretrained on an easily available large unlabelled text corpus and is fine-tuned on the labelled data of the target (i.e., downstream) task. As an LM is designed to capture the linguistic aspects of semantics, it can be biased toward linguistic features. We argue that exposing an LM during fine-tuning to instances that capture the diverse semantic aspects (e.g., topical, linguistic, semantic relations) present in the dataset will improve its performance on the underlying task. We propose a Mixed Aspect Sampling (MAS) framework to sample instances that capture different semantic aspects of the dataset and use an ensemble classifier to improve classification performance. Experimental results show that MAS performs better than random sampling as well as state-of-the-art active learning models on abuse detection tasks, where it is hard to collect labelled data for building an accurate classifier.
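The ensemble step can be sketched as a majority vote over classifiers fine-tuned on samples drawn for different semantic aspects. The toy rule-based stand-ins below are entirely hypothetical; in MAS each voter would be a fine-tuned LM:

```python
from collections import Counter

def ensemble_predict(classifiers, text):
    """Majority vote over classifiers trained on different semantic
    aspects of the dataset (topical, linguistic, relational)."""
    votes = [clf(text) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# toy stand-ins for aspect-specific fine-tuned LMs (entirely hypothetical)
topical    = lambda t: "abuse" if "idiot" in t.lower() else "ok"
linguistic = lambda t: "abuse" if t.isupper() else "ok"
relational = lambda t: "ok"

print(ensemble_predict([topical, linguistic, relational], "YOU IDIOT"))        # abuse
print(ensemble_predict([topical, linguistic, relational], "have a nice day"))  # ok
```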


2020 ◽  
Author(s):  
Abhinav Sagar ◽  
J Dheeba

Abstract: In this work, we address the problem of skin cancer classification using convolutional neural networks. Many cancer cases are misdiagnosed early on as something else, leading to severe consequences, including the death of the patient. There are also cases in which patients have some other problem and doctors suspect skin cancer, leading to unnecessary time and money spent on further diagnosis. In this work, we address both of the above problems using deep neural networks and a transfer learning architecture. We used publicly available ISIC databases for both training and testing our model. Our work achieves an accuracy of 0.935, precision of 0.94, recall of 0.77, F1 score of 0.85 and ROC-AUC of 0.861, which is better than previous state-of-the-art approaches.
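Precision, recall and F1 follow directly from confusion-matrix counts. A sketch with hypothetical counts chosen to roughly reproduce the reported precision and recall:

```python
def prf1(tp, fp, fn):
    """Precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# hypothetical counts: 77 true positives, 5 false positives, 23 false negatives
p, r, f1 = prf1(77, 5, 23)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.94 0.77 0.85
```

The gap between precision (0.94) and recall (0.77) in the reported results means the model misses more melanomas than it raises false alarms, which matters clinically.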


Author(s):  
Rajni Sethi ◽  
Sreedevi Indu

Optical properties of water distort the quality of underwater images. Underwater images are characterized by poor contrast, color cast, noise and haze, and need to be pre-processed to recover useful information. In this paper, a novel technique named Fusion of Underwater Image Enhancement and Restoration (FUIER) is proposed, which both enhances and restores underwater images, targeting all major issues in underwater images: color cast removal, contrast enhancement and dehazing. It generates two versions of the single input image, and these two versions are fused using Laplacian pyramid-based fusion to obtain the enhanced image. The proposed method works efficiently for all types of underwater images captured under different conditions (turbidity, depth, salinity, etc.). Results obtained with the proposed method are better than those of state-of-the-art methods.
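Laplacian pyramid fusion keeps the stronger detail coefficient of the two derived versions at each scale. A simplified numpy sketch, assuming a box filter and a max-magnitude fusion rule as stand-ins for the paper's choices (image sides must be divisible by 2^levels):

```python
import numpy as np

def down(img):  # 2x2 average pooling
    return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

def up(img):    # nearest-neighbour upsampling
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, levels=3):
    pyr = []
    for _ in range(levels):
        small = down(img)
        pyr.append(img - up(small))  # band-pass residual
        img = small
    pyr.append(img)                  # low-frequency base
    return pyr

def fuse(version_a, version_b, levels=3):
    """Per level, keep the detail coefficient with the larger magnitude;
    average the low-frequency bases, then collapse the pyramid."""
    pa = laplacian_pyramid(version_a, levels)
    pb = laplacian_pyramid(version_b, levels)
    fused = [np.where(np.abs(a) >= np.abs(b), a, b)
             for a, b in zip(pa[:-1], pb[:-1])]
    out = (pa[-1] + pb[-1]) / 2
    for lap in reversed(fused):
        out = up(out) + lap
    return out
```

Fusing an image with itself reconstructs it (up to rounding), which is a quick sanity check that the collapse step inverts the decomposition.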


Sensors ◽  
2019 ◽  
Vol 19 (17) ◽  
pp. 3809 ◽  
Author(s):  
Yuda Song ◽  
Yunfang Zhu ◽  
Xin Du

Deep convolutional neural networks have achieved great performance on various image restoration tasks. In particular, the residual dense network (RDN) has achieved great results on image noise reduction by cascading multiple residual dense blocks (RDBs) to make full use of hierarchical features. However, the RDN only performs well in denoising at a single noise level, and its computational cost increases significantly with the number of RDBs while only slightly improving denoising. To overcome this, we propose the dynamic residual dense network (DRDN), a dynamic network that can selectively skip some RDBs based on the noise amount of the input image. Moreover, the DRDN allows the denoising strength to be adjusted manually to obtain the best output, which makes the network more effective for real-world denoising. Our proposed DRDN performs better than the RDN while reducing the computational cost by 40–50%. Furthermore, we surpass the state-of-the-art CBDNet by 1.34 dB on the real-world noise benchmark.
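The dynamic skipping can be illustrated as noise-conditioned routing: cleaner inputs exit after fewer blocks. In the actual DRDN the decision is learned by a gating sub-network, so the hand-written mapping and toy blocks below are stand-ins:

```python
def blocks_to_run(noise_level, num_blocks=8, max_noise=50.0):
    """Map an estimated noise level to how many residual dense blocks
    to execute; cleaner inputs exit after fewer blocks."""
    frac = min(max(noise_level / max_noise, 0.0), 1.0)
    return max(1, round(frac * num_blocks))

def drdn_forward(x, blocks, noise_level):
    """Run only the first k blocks selected by the noise estimate."""
    k = blocks_to_run(noise_level, num_blocks=len(blocks))
    for block in blocks[:k]:
        x = block(x)
    return x

# with toy +1 "blocks", a mid-range noise estimate runs half of them
blocks = [lambda v: v + 1 for _ in range(8)]
print(drdn_forward(0, blocks, noise_level=25.0))  # 4
```

Raising `max_noise` (or exposing `frac` directly) is one way to realize the manual denoising-strength control described in the abstract.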


2021 ◽  
Author(s):  
Akila Pemasiri ◽  
Kien Nguyen ◽  
Sridha Sridha ◽  
Clinton Fookes

Abstract: This work addresses hand mesh recovery from a single RGB image. In contrast to most existing approaches, where parametric hand models are employed as the prior, we show that the hand mesh can be learned directly from the input image. We propose a new type of GAN, called Im2Mesh GAN, to learn the mesh through end-to-end adversarial training. By interpreting the mesh as a graph, our model is able to capture the topological relationships among the mesh vertices. We also introduce a 3D surface descriptor into the GAN architecture to further capture the associated 3D features. We conduct experiments with the proposed Im2Mesh GAN architecture in two settings: one in which we can reap the benefits of coupled ground-truth availability of the images and the corresponding meshes, and another that tackles the more challenging problem of mesh estimation without corresponding ground truth. Through extensive evaluations we demonstrate that, even without using any hand priors, the proposed method performs on par with or better than the state-of-the-art.
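Interpreting the mesh as a graph means operating over vertex adjacency derived from the triangle faces. A minimal sketch of that derivation, with toy faces made up for illustration:

```python
import numpy as np

def mesh_adjacency(faces, num_vertices):
    """Symmetric vertex adjacency matrix from triangle faces: the graph
    structure that graph-based layers can convolve over."""
    adj = np.zeros((num_vertices, num_vertices), dtype=int)
    for a, b, c in faces:
        for i, j in ((a, b), (b, c), (a, c)):
            adj[i, j] = adj[j, i] = 1
    return adj

# two triangles sharing the edge (1, 2)
adj = mesh_adjacency([(0, 1, 2), (1, 2, 3)], num_vertices=4)
print(adj)
```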


2020 ◽  
Author(s):  
Pathikkumar Patel ◽  
Bhargav Lad ◽  
Jinan Fiaidhi

During the last few years, RNN models have been extensively used, and they have proven to be better for sequence and text data. RNNs have achieved state-of-the-art performance levels in several applications such as text classification, sequence-to-sequence modelling and time series forecasting. In this article we review different Machine Learning and Deep Learning based approaches for text data and examine the results obtained with these methods. This work also explores the use of transfer learning in NLP and how it affects the performance of models on a specific application: sentiment analysis.


2020 ◽  
Vol 34 (03) ◽  
pp. 2594-2601
Author(s):  
Arjun Akula ◽  
Shuai Wang ◽  
Song-Chun Zhu

We present CoCoX (short for Conceptual and Counterfactual Explanations), a model for explaining decisions made by a deep convolutional neural network (CNN). In Cognitive Psychology, the factors (or semantic-level features) that humans zoom in on when they imagine an alternative to a model prediction are often referred to as fault-lines. Motivated by this, our CoCoX model explains decisions made by a CNN using fault-lines. Specifically, given an input image I for which a CNN classification model M predicts class c_pred, our fault-line based explanation identifies the minimal semantic-level features (e.g., stripes on zebra, pointed ears of dog), referred to as explainable concepts, that need to be added to or deleted from I in order to alter the classification category of I by M to another specified class c_alt. We argue that, due to the conceptual and counterfactual nature of fault-lines, our CoCoX explanations are practical and more natural for both expert and non-expert users to understand the internal workings of complex deep learning models. Extensive quantitative and qualitative experiments verify our hypotheses, showing that CoCoX significantly outperforms the state-of-the-art explainable AI models. Our implementation is available at https://github.com/arjunakula/CoCoX
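The fault-line search can be illustrated as greedily selecting the fewest explainable concepts needed to flip the prediction. All concept names, weights and the toy linear scorer below are hypothetical; CoCoX itself operates on CNN feature maps:

```python
def minimal_concept_edit(predict, margin, present, addable, target, max_edits=3):
    """Greedily add the concept that most raises the target class's
    margin until the prediction flips (an approximate fault-line)."""
    edit = []
    while predict(present + edit) != target and len(edit) < max_edits:
        candidates = [c for c in addable if c not in edit]
        if not candidates:
            break
        edit.append(max(candidates,
                        key=lambda c: margin(present + edit + [c], target)))
    return edit if predict(present + edit) == target else None

# toy "CNN": a class's score is the sum of weights of concepts present
WEIGHTS = {"zebra": {"stripes": 2.0, "mane": 1.0},
           "horse": {"pointed_ears": 0.5, "mane": 1.5}}

def scores(concepts):
    return {cls: sum(w.get(c, 0.0) for c in concepts) for cls, w in WEIGHTS.items()}

def predict(concepts):
    s = scores(concepts)
    return max(s, key=s.get)

def margin(concepts, target):
    s = scores(concepts)
    return s[target] - max(v for cls, v in s.items() if cls != target)

# an image with only a mane is read as "horse"; adding stripes flips it
edit = minimal_concept_edit(predict, margin, ["mane"],
                            ["stripes", "pointed_ears"], "zebra")
print(edit)  # ['stripes']
```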


Electronics ◽  
2021 ◽  
Vol 10 (15) ◽  
pp. 1807
Author(s):  
Sascha Grollmisch ◽  
Estefanía Cano

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. What recent SSL methods have in common is that they rely strongly on the augmentation of unannotated data, which remains largely unexplored for audio. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks, covering music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNN) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance on the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only on the most challenging dataset, acoustic scene classification, showing that there is still room for improvement.
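FixMatch's core mechanism is a confidence-gated pseudo-label: the prediction on a weakly augmented view supervises the strongly augmented view only when it is confident. A minimal numpy sketch of that gating; the probabilities are made up for illustration:

```python
import numpy as np

def fixmatch_targets(weak_probs, threshold=0.95):
    """Keep only unlabeled samples whose weak-view prediction is confident;
    its argmax becomes the pseudo-label for the strong view."""
    weak_probs = np.asarray(weak_probs)
    mask = weak_probs.max(axis=1) >= threshold
    return mask, weak_probs.argmax(axis=1)

def unlabeled_loss(strong_probs, weak_probs, threshold=0.95):
    """Masked cross-entropy between strong-view predictions and pseudo-labels."""
    strong_probs = np.asarray(strong_probs)
    mask, pseudo = fixmatch_targets(weak_probs, threshold)
    if not mask.any():
        return 0.0
    ce = -np.log(strong_probs[np.arange(len(pseudo)), pseudo] + 1e-12)
    return float((ce * mask).sum() / mask.sum())

weak   = [[0.97, 0.03], [0.60, 0.40]]   # only the first sample is confident
strong = [[0.90, 0.10], [0.50, 0.50]]
print(round(unlabeled_loss(strong, weak), 3))  # 0.105
```

This is why the choice of augmentations matters so much for audio: the weak view must stay recognizable enough to clear the confidence threshold, while the strong view must be distorted enough to make the consistency constraint informative.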

