Deep Learning-based Face Super-resolution: A Survey

2023 ◽  
Vol 55 (1) ◽  
pp. 1-36
Author(s):  
Junjun Jiang ◽  
Chenyang Wang ◽  
Xianming Liu ◽  
Jiayi Ma

Face super-resolution (FSR), also known as face hallucination, is a domain-specific image super-resolution problem that aims to enhance low-resolution (LR) face images into high-resolution face images. Recently, FSR has received considerable attention and witnessed dazzling advances with the development of deep learning techniques. To date, few summaries of the studies on deep learning-based FSR are available. In this survey, we present a comprehensive review of deep learning-based FSR methods in a systematic manner. First, we summarize the problem formulation of FSR and introduce popular assessment metrics and loss functions. Second, we elaborate on the facial characteristics and popular datasets used in FSR. Third, we roughly categorize existing methods according to their utilization of facial characteristics. For each category, we begin with a general description of the design principles, present an overview of representative approaches, and then discuss their pros and cons. Fourth, we evaluate the performance of some state-of-the-art methods. Fifth, we briefly introduce joint FSR-and-other-task methods and FSR-related applications. Finally, we envision the prospects of further technological advancement in this field.
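Among the assessment metrics surveys of this kind cover, peak signal-to-noise ratio (PSNR) is the most widely reported; below is a minimal sketch in plain Python, assuming 8-bit grayscale images given as flat pixel lists (the function name and toy values are illustrative, not taken from the survey):

```python
import math

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized images,
    given as flat lists of pixel intensities with maximum value `peak`."""
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

hr = [10, 20, 30, 40]   # toy high-resolution ground truth
sr = [12, 18, 30, 44]   # toy super-resolved output
print(round(psnr(hr, sr), 2))  # ≈ 40.35
```

Higher PSNR means the reconstruction is closer to the ground truth; identical images yield infinity.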

2018 ◽  
Author(s):  
Andre Lamurias ◽  
Luka A. Clarke ◽  
Francisco M. Couto

Abstract
Recent studies have proposed deep learning techniques, namely recurrent neural networks, to improve biomedical text mining tasks. However, these techniques rarely take advantage of existing domain-specific resources, such as ontologies. In Life and Health Sciences there is a vast and valuable set of such resources publicly available, which are continuously being updated. Biomedical ontologies are nowadays a mainstream approach to formalizing existing knowledge about entities, such as genes, chemicals, phenotypes, and disorders. These resources contain supplementary information that may not yet be encoded in training data, particularly in domains with limited labeled data.

We propose a new model, BO-LSTM, that takes advantage of domain-specific ontologies by representing each entity as the sequence of its ancestors in the ontology. We implemented BO-LSTM as a recurrent neural network with long short-term memory units, using an open biomedical ontology, which in our case study was Chemical Entities of Biological Interest (ChEBI). We assessed the performance of BO-LSTM on detecting and classifying drug-drug interactions in a publicly available corpus from an international challenge, composed of 792 drug descriptions and 233 scientific abstracts. By using the domain-specific ontology in addition to word embeddings and WordNet, BO-LSTM improved the F1-score of both the detection and classification of drug-drug interactions, particularly in a document set with a limited number of annotations. Our findings demonstrate that, besides the high performance of current deep learning techniques, domain-specific ontologies can still be useful to mitigate the lack of labeled data.

Author summary
A high quantity of biomedical information is only available in documents such as scientific articles and patents. Due to the rate at which new documents are produced, we need automatic methods to extract useful information from them.
Text mining is a subfield of information retrieval which aims at extracting relevant information from text. Scientific literature is a challenge to text mining because of the complexity and specificity of the topics approached. In recent years, deep learning has obtained promising results in various text mining tasks by exploring large datasets. On the other hand, ontologies provide a detailed and sound representation of a domain and have been developed for diverse biomedical domains. We propose a model that combines deep learning algorithms with biomedical ontologies to identify relations between concepts in text. We demonstrate the potential of this model to extract drug-drug interactions from abstracts and drug descriptions. The model can be applied to other biomedical domains by training a new classifier on an annotated corpus of documents and an ontology related to that domain.
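The core input representation is easy to sketch: each entity is replaced by the sequence of its ancestors in the ontology, from the root down to the concept itself, before being fed to the LSTM. A minimal illustration with a toy is-a hierarchy (the concept names and `parent` table are invented for illustration, not taken from ChEBI):

```python
def ancestor_sequence(concept, parent):
    """Return the path from the ontology root down to `concept`,
    following is-a links in the child-to-parent mapping."""
    path = [concept]
    while concept in parent:        # walk up until the root is reached
        concept = parent[concept]
        path.append(concept)
    return list(reversed(path))     # root-first order, as fed to the LSTM

# Toy is-a hierarchy (hypothetical, for illustration only)
parent = {
    "aspirin": "benzoic acid",
    "benzoic acid": "aromatic compound",
    "aromatic compound": "chemical entity",
}
print(ancestor_sequence("aspirin", parent))
# ['chemical entity', 'aromatic compound', 'benzoic acid', 'aspirin']
```

Encoding an entity this way lets the network see shared ancestors of two drugs even when neither drug appears in the training annotations.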


2019 ◽  
Vol 63 (11) ◽  
pp. 1658-1667
Author(s):  
M J Castro-Bleda ◽  
S España-Boquera ◽  
J Pastor-Pellicer ◽  
F Zamora-Martínez

Abstract
This paper presents the ‘NoisyOffice’ database. It consists of images of printed text documents with noise mainly caused by the uncleanliness of a generic office, such as coffee stains and footprints on documents, or folded and wrinkled sheets with degraded printed text. This corpus is intended to train and evaluate supervised learning methods for the cleaning, binarization and enhancement of noisy images of grayscale text documents. As an example, several image enhancement and binarization experiments using deep learning techniques are presented. Double-resolution images are also provided for testing super-resolution methods. The corpus is freely available at the UCI Machine Learning Repository. Finally, a challenge organized by Kaggle Inc. to denoise images using the database is described, in order to show its suitability for benchmarking image processing systems.
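To make the binarization task concrete, here is a deliberately simple global-threshold baseline in plain Python (the threshold rule and sample values are illustrative only; the learned methods evaluated on the corpus are far stronger):

```python
def binarize(pixels, threshold=None):
    """Map grayscale pixels (0-255) to 0 (ink) or 255 (background)
    using one global threshold; defaults to the mean intensity."""
    if threshold is None:
        threshold = sum(pixels) / len(pixels)
    return [0 if p < threshold else 255 for p in pixels]

noisy = [30, 200, 180, 25, 220, 40]   # dark text on a stained page
print(binarize(noisy))
# [0, 255, 255, 0, 255, 0]
```

A global threshold fails exactly where NoisyOffice is hard (stains darker than faded ink), which is why the corpus targets supervised, learned cleaning instead.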


Author(s):  
Bertrand Bessagnet ◽  
Maxime Beauchamp ◽  
Laurent Menut ◽  
Ronan Fablet ◽  
Enrico Pisoni ◽  
...  

2020 ◽  
Vol 79 (47-48) ◽  
pp. 36063-36075 ◽  
Author(s):  
Valentina Franzoni ◽  
Giulio Biondi ◽  
Alfredo Milani

Abstract
Crowds express emotions as a collective individual, which is evident from the sounds that a crowd produces in particular events, e.g., collective booing, laughing or cheering in sports matches, movies, theaters, concerts, political demonstrations, and riots. A critical question concerning the innovative concept of crowd emotions is whether the emotional content of crowd sounds can be characterized by frequency-amplitude features, using analysis techniques similar to those applied to individual voices, where deep learning classification is applied to spectrogram images derived from sound transformations. In this work, we present a technique based on the generation of sound spectrograms from fragments of fixed length, extracted from original audio clips recorded at high-attendance events, where the crowd acts as a collective individual. Transfer learning techniques are used on a convolutional neural network, pre-trained on low-level features using the well-known ImageNet dataset of visual knowledge. The original sound clips are filtered and normalized in amplitude for correct spectrogram generation, on which we fine-tune the domain-specific features. Experiments on the final trained convolutional neural network show the promising performance of the proposed model in classifying the emotions of the crowd.
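The fragment-to-spectrogram step can be sketched in plain Python: split the clip into fixed-length frames and take the magnitude spectrum of each. This uses a naive DFT on an invented toy signal; a real pipeline would use an FFT library and log-mel scaling before rendering the image the CNN consumes:

```python
import cmath

def spectrogram(samples, frame_len):
    """Split a clip into non-overlapping fixed-length frames and compute
    the magnitude spectrum of each frame via a naive DFT."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    spec = []
    for frame in frames:
        mags = []
        for k in range(frame_len):
            s = sum(x * cmath.exp(-2j * cmath.pi * k * n / frame_len)
                    for n, x in enumerate(frame))
            mags.append(abs(s))
        spec.append(mags)        # one magnitude row per time frame
    return spec

clip = [0.0, 1.0, 0.0, -1.0] * 2   # toy 8-sample "audio clip"
rows = spectrogram(clip, frame_len=4)
```

Stacking the rows as image intensities gives the spectrogram picture that transfer learning then treats like any other ImageNet-style input.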


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1601
Author(s):  
Rujing Wang ◽  
Liu Liu ◽  
Chengjun Xie ◽  
Po Yang ◽  
Rui Li ◽  
...  

The recent explosion of large volumes of standardized datasets of annotated images has offered promising opportunities for deep learning techniques in effective and efficient object detection applications. However, owing to the huge gap in quality between these standardized datasets and practical raw data, how to maximize the utilization of deep learning techniques in practical agriculture applications remains a critical problem. Here, we introduce a domain-specific benchmark dataset, called AgriPest, for tiny wild pest recognition and detection, providing researchers and communities with a standard large-scale dataset of practical wild pest images and annotations, as well as evaluation procedures. Over the past seven years, AgriPest has captured 49.7K images of four crops containing 14 species of pests, using our designed image collection equipment in the field environment. All of the images are manually annotated by agricultural experts, with up to 264.7K bounding boxes locating the pests. This paper also offers a detailed analysis of AgriPest, in which the validation set is split into four types of scenes that are common in practical pest monitoring applications. We explore and evaluate the performance of state-of-the-art deep learning techniques on AgriPest. We believe that the scale, accuracy, and diversity of AgriPest can offer great opportunities to researchers in computer vision as well as pest monitoring applications.
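Detection over bounding-box annotations like AgriPest's is typically scored by intersection-over-union (IoU) between predicted and ground-truth boxes; a minimal sketch with boxes given as (x1, y1, x2, y2) corners (the sample boxes are invented, not from the dataset):

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection corners
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

gt = (0, 0, 10, 10)
pred = (5, 5, 15, 15)
print(iou(gt, pred))  # 25 / 175 ≈ 0.143
```

Tiny pests make this metric punishing: a few pixels of localization error on a small box can drop the IoU below the usual 0.5 acceptance threshold.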


2021 ◽  
Vol 15 ◽  
Author(s):  
Karl M. Kuntzelman ◽  
Jacob M. Williams ◽  
Phui Cheng Lim ◽  
Ashok Samal ◽  
Prahalada K. Rao ◽  
...  

In recent years, multivariate pattern analysis (MVPA) has been hugely beneficial for cognitive neuroscience by making new experiment designs possible and by increasing the inferential power of functional magnetic resonance imaging (fMRI), electroencephalography (EEG), and other neuroimaging methodologies. In a similar time frame, “deep learning” (a term for the use of artificial neural networks with convolutional, recurrent, or similarly sophisticated architectures) has produced a parallel revolution in the field of machine learning and has been employed across a wide variety of applications. Traditional MVPA also uses a form of machine learning, but most commonly with much simpler techniques based on linear calculations; a number of studies have applied deep learning techniques to neuroimaging data, but we believe that those have barely scratched the surface of the potential deep learning holds for the field. In this paper, we provide a brief introduction to deep learning for those new to the technique, explore the logistical pros and cons of using deep learning to analyze neuroimaging data – which we term “deep MVPA,” or dMVPA – and introduce a new software toolbox (the “Deep Learning In Neuroimaging: Exploration, Analysis, Tools, and Education” package, DeLINEATE for short) intended to facilitate dMVPA for neuroscientists (and indeed, scientists more broadly) everywhere.
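The "much simpler techniques based on linear calculations" of traditional MVPA can be as small as a nearest-centroid decoder over voxel patterns; a toy sketch with synthetic two-condition data (the condition names and feature values are invented):

```python
def nearest_centroid(train, labels, pattern):
    """Classify a voxel pattern by its closest class mean (centroid),
    a linear decoder of the kind traditional MVPA commonly uses."""
    centroids = {}
    for lab in set(labels):
        rows = [p for p, l in zip(train, labels) if l == lab]
        centroids[lab] = [sum(col) / len(rows) for col in zip(*rows)]
    def sq_dist(c):
        return sum((x - y) ** 2 for x, y in zip(pattern, c))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))

# Synthetic two-voxel "fMRI" patterns for two conditions
train = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
labels = ["faces", "faces", "houses", "houses"]
print(nearest_centroid(train, labels, [0.95, 0.15]))  # faces
```

dMVPA replaces this linear decision rule with a deep network while keeping the same decode-the-condition-from-the-pattern framing.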


2021 ◽  
Vol 7 ◽  
pp. e621
Author(s):  
Syed Muhammad Arsalan Bashir ◽  
Yi Wang ◽  
Mahrukh Khan ◽  
Yilong Niu

Image super-resolution (SR) is one of the vital image processing methods that improve the resolution of an image in the field of computer vision. In the last two decades, significant progress has been made in the field of super-resolution, especially by utilizing deep learning methods. This article provides a detailed survey of recent progress in single-image super-resolution from the perspective of deep learning, while also reviewing the initial classical methods used for image super-resolution. The survey classifies image SR methods into four categories, i.e., classical methods, supervised learning-based methods, unsupervised learning-based methods, and domain-specific SR methods. We also introduce the SR problem to provide intuition about image quality metrics, available reference datasets, and SR challenges. Deep learning-based SR approaches are evaluated using a reference dataset. Some of the reviewed state-of-the-art image SR methods include the enhanced deep SR network (EDSR), cycle-in-cycle GAN (CinCGAN), multiscale residual network (MSRN), meta residual dense network (Meta-RDN), recurrent back-projection network (RBPN), second-order attention network (SAN), SR feedback network (SRFBN) and the wavelet-based residual attention network (WRAN). Finally, the survey concludes with future directions and trends in SR, and open problems to be addressed by researchers.
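The classical-methods category begins with interpolation baselines; a minimal nearest-neighbor upscaling sketch on a grayscale grid (a toy illustration of the baseline idea, not one of the surveyed methods):

```python
def upscale_nearest(img, scale):
    """Nearest-neighbor upscaling of a 2-D grayscale image (list of rows):
    every pixel is repeated `scale` times horizontally and vertically."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(scale)]  # repeat across columns
        for _ in range(scale):                         # repeat across rows
            out.append(list(wide))
    return out

lr = [[10, 20],
      [30, 40]]
print(upscale_nearest(lr, 2))
# [[10, 10, 20, 20], [10, 10, 20, 20], [30, 30, 40, 40], [30, 30, 40, 40]]
```

Such interpolation adds no new detail, which is exactly the gap the learned methods (EDSR, RBPN, and the rest) are designed to close.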

