Image Synthesis Based On Feature Description

Author(s):  
Rohan Bolusani

Abstract: Generating realistic images from text is an innovative and interesting goal, but modern machine learning models are still far from achieving it. Through research and development in the field of natural language processing, neural network architectures have been developed to learn discriminative text feature representations. Meanwhile, in the field of machine learning, generative adversarial networks (GANs) have begun to generate extremely accurate images, especially in categories such as faces, album covers, and room interiors. The main goal of this work is to develop a neural network that bridges these advances in text and image modelling. By essentially translating characters to pixels, the project will demonstrate the capability of generative models to take detailed text descriptions and generate plausible images. Keywords: Deep Learning, Computer Vision, NLP, Generative Adversarial Networks

2021 ◽  
Vol 8 (1) ◽  
pp. 3-31
Author(s):  
Yuan Xue ◽  
Yuan-Chen Guo ◽  
Han Zhang ◽  
Tao Xu ◽  
Song-Hai Zhang ◽  
...  

Abstract: In many applications of computer graphics, art, and design, it is desirable for a user to provide intuitive non-image input, such as text, sketch, stroke, graph, or layout, and have a computer system automatically generate photo-realistic images according to that input. While classically, works that allow such automatic image content generation have followed a framework of image retrieval and composition, recent advances in deep generative models such as generative adversarial networks (GANs), variational autoencoders (VAEs), and flow-based methods have enabled more powerful and versatile image generation approaches. This paper reviews recent works for image synthesis given intuitive user input, covering advances in input versatility, image generation methodology, benchmark datasets, and evaluation metrics. This motivates new perspectives on input representation and interactivity, cross-fertilization between major image generation paradigms, and evaluation and comparison of generation methods.


Algorithms ◽  
2018 ◽  
Vol 11 (10) ◽  
pp. 164 ◽  
Author(s):  
Aggeliki Vlachostergiou ◽  
George Caridakis ◽  
Phivos Mylonas ◽  
Andreas Stafylopatis

The ability to learn robust, resizable feature representations from unlabeled data has potential applications in a wide variety of machine learning tasks. One way to create such representations is to train deep generative models that can learn to capture the complex distribution of real-world data. Generative adversarial network (GAN) approaches have shown impressive results in producing generative models of images, but relatively little work has been done on evaluating the performance of these methods for learning representations of natural language, in both supervised and unsupervised settings, at the document, sentence, and aspect level. Extensive validation experiments were performed using the 20 Newsgroups corpus, the Movie Review (MR) Dataset, and the Finegrained Sentiment Dataset (FSD). Our experimental analysis suggests that GANs can successfully learn representations of natural language texts at all three aforementioned levels.


2017 ◽  
Author(s):  
Takafumi Arakaki ◽  
G. Barello ◽  
Yashar Ahmadian

Abstract: Tuning curves characterizing the response selectivities of biological neurons often exhibit large degrees of irregularity and diversity across neurons. Theoretical network models that feature heterogeneous cell populations or random connectivity also give rise to diverse tuning curves. However, a general framework for fitting such models to experimentally measured tuning curves is lacking. We address this problem by proposing to view mechanistic network models as generative models whose parameters can be optimized to fit the distribution of experimentally measured tuning curves. A major obstacle for fitting such models is that their likelihood function is not explicitly available or is highly intractable to compute. Recent advances in machine learning provide ways for fitting generative models without the need to evaluate the likelihood and its gradient. Generative Adversarial Networks (GAN) provide one such framework which has been successful in traditional machine learning tasks. We apply this approach in two separate experiments, showing how GANs can be used to fit commonly used mechanistic models in theoretical neuroscience to datasets of measured tuning curves. This fitting procedure avoids the computationally expensive step of inferring latent variables, e.g., the biophysical parameters of individual cells or the particular realization of the full synaptic connectivity matrix, and directly learns model parameters which characterize the statistics of connectivity or of single-cell properties. Another strength of this approach is that it fits the entire, joint distribution of experimental tuning curves, instead of matching a few summary statistics picked a priori by the user. More generally, this framework opens the door to fitting theoretically motivated dynamical network models directly to simultaneously or non-simultaneously recorded neural responses.


Electronics ◽  
2019 ◽  
Vol 8 (3) ◽  
pp. 292 ◽  
Author(s):  
Md Zahangir Alom ◽  
Tarek M. Taha ◽  
Chris Yakopcic ◽  
Stefan Westberg ◽  
Paheding Sidike ◽  
...  

In recent years, deep learning has garnered tremendous success in a variety of application domains. This new field of machine learning has been growing rapidly and has been applied to most traditional application domains, as well as some new areas that present more opportunities. Different methods have been proposed based on different categories of learning, including supervised, semi-supervised, and unsupervised learning. Experimental results show state-of-the-art performance using deep learning when compared to traditional machine learning approaches in the fields of image processing, computer vision, speech recognition, machine translation, art, medical imaging, medical information processing, robotics and control, bioinformatics, natural language processing, cybersecurity, and many others. This paper presents a brief survey of the advances that have occurred in the area of Deep Learning (DL), starting with the Deep Neural Network (DNN). The survey goes on to cover Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), including Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), Auto-Encoder (AE), Deep Belief Network (DBN), Generative Adversarial Network (GAN), and Deep Reinforcement Learning (DRL). Additionally, we discuss recent developments, such as advanced variant DL techniques based on these DL approaches. This work considers most of the papers published after 2012, when the history of deep learning began. Furthermore, DL approaches that have been explored and evaluated in different application domains are also included in this survey. We also include recently developed frameworks, SDKs, and benchmark datasets that are used for implementing and evaluating deep learning approaches. Some surveys have been published on DL using neural networks, as well as a survey on Reinforcement Learning (RL). However, those papers have not discussed individual advanced techniques for training large-scale deep learning models or the recently developed methods of generative models.


2020 ◽  
Vol 10 (4) ◽  
pp. 1449
Author(s):  
Hansoo Lee ◽  
Jonggeun Kim ◽  
Eun Kyeong Kim ◽  
Sungshin Kim

Ground-based weather radars can observe a wide range with high spatial and temporal resolution. They benefit meteorological research and services by providing valuable information. Recent research on weather radar data has focused on applying machine learning and deep learning to solve complicated problems. It is well known that an adequate amount of data is a necessary condition in machine learning and deep learning. Generative adversarial networks (GANs), with their fascinating competitive structure, have received extensive attention for their remarkable data generation capacity since they were proposed. Consequently, a massive number of variants have been proposed, and which model is adequate to solve a given problem is an inevitable concern. In this paper, we explore the problem of radar image synthesis and evaluate different GANs with authentic radar observation results. The experimental results showed that the improved Wasserstein GAN is more capable of generating similar radar images while achieving higher structural similarity results.
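The abstract above reports structural similarity as its comparison measure but does not say how it was configured. As a rough illustration only, the sketch below computes a simplified, single-window SSIM over two equally sized grayscale images given as flat lists of pixel values. The function name `global_ssim`, the 8-bit dynamic range, and the stabilising constants (0.01 and 0.03 times the range, squared) are assumptions for this sketch, not details taken from the paper; practical implementations usually average SSIM over local sliding windows.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Simplified single-window SSIM between two equally sized images.

    x and y are flat sequences of pixel intensities. This is an
    illustrative sketch, not the configuration used in the paper.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population variance and covariance (divide by n) for both images.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Small constants that keep the ratio stable when means or variances are near zero.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Identical inputs score 1, and the score drops as luminance, contrast, or structure diverge, which is why a higher value is read as a closer match between generated and observed radar images.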


Electronics ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 1216
Author(s):  
Sung-Wook Park ◽  
Jae-Sub Ko ◽  
Jun-Ho Huh ◽  
Jong-Chan Kim

The emergence of the deep learning model GAN (Generative Adversarial Networks) is an important turning point in generative modeling. GAN is more powerful in feature and expression learning compared to machine learning-based generative model algorithms. Nowadays, it is also used to generate non-image data, such as voice and natural language. Typical technologies include BERT (Bidirectional Encoder Representations from Transformers), GPT-3 (Generative Pretrained Transformer-3), and MuseNet. GAN differs from machine learning-based generative models in its objective function. Training is conducted by two networks: a generator and a discriminator. The generator converts random noise into a true-to-life image, whereas the discriminator distinguishes whether the input image is real or synthetic. As the training continues, the generator learns more sophisticated synthesis techniques, and the discriminator grows into a more accurate differentiator. GAN has problems, such as mode collapse, training instability, and a lack of evaluation metrics, and many researchers have tried to solve these problems. For example, solutions such as one-sided label smoothing, instance normalization, and minibatch discrimination have been proposed. The field of application has also expanded. This paper provides an overview of GAN and application solutions for researchers in computer vision and artificial intelligence healthcare. The structure and principle of operation of GAN, the core models of GAN proposed to date, and the theory of GAN were analyzed. Application examples of GAN such as image classification and regression, image synthesis and inpainting, image-to-image translation, super-resolution and point registration were then presented. The discussion tackled GAN’s problems and solutions, and the future research direction was finally proposed.
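The abstract above lists one-sided label smoothing among the proposed remedies for unstable GAN training. A minimal sketch of the idea follows, assuming a discriminator that outputs probabilities and an illustrative smoothed target of 0.9 for real samples; neither detail comes from the abstract, and `bce` and `discriminator_loss` are hypothetical helper names.

```python
import math


def bce(prediction, target, eps=1e-12):
    """Binary cross-entropy for one probability and one (possibly soft) target."""
    # Clamp the prediction so that log() never receives 0.
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(real_probs, fake_probs, real_target=0.9):
    """Mean discriminator loss with one-sided label smoothing.

    Only the target for real samples is softened (1.0 -> real_target);
    the target for generated samples stays at 0.0. The value 0.9 is an
    assumed, illustrative choice.
    """
    real_loss = sum(bce(p, real_target) for p in real_probs) / len(real_probs)
    fake_loss = sum(bce(p, 0.0) for p in fake_probs) / len(fake_probs)
    return real_loss + fake_loss
```

With the softened target, a discriminator that outputs 0.999 on real images incurs a larger loss than one that outputs 0.9, so it is discouraged from becoming overconfident. The target for generated images is left at 0, which is what makes the smoothing one-sided.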


2021 ◽  
Vol 13 (19) ◽  
pp. 4011
Author(s):  
Husam A. H. Al-Najjar ◽  
Biswajeet Pradhan ◽  
Raju Sarkar ◽  
Ghassan Beydoun ◽  
Abdullah Alamri

Landslide susceptibility mapping has significantly progressed with improvements in machine learning techniques. However, the inventory / data imbalance (DI) problem remains one of the challenges in this domain. This problem exists because a good-quality landslide inventory map, including a complete record of historical data, is difficult or expensive to collect. As such, this can considerably affect one’s ability to obtain a sufficient inventory or representative samples. This research developed a new approach based on generative adversarial networks (GAN) to correct imbalanced landslide datasets. The proposed method was tested at Chukha Dzongkhag, Bhutan, one of the most landslide-prone areas in the Himalayan region. The proposed approach was then compared with standard methods such as the synthetic minority oversampling technique (SMOTE), dense imbalanced sampling, and sparse sampling (i.e., producing as many non-landslide samples as landslide samples). The comparisons were based on five machine learning models: artificial neural networks (ANN), random forests (RF), decision trees (DT), k-nearest neighbours (kNN), and the support vector machine (SVM). The model evaluation was carried out based on overall accuracy (OA), Kappa Index, F1-score, and area under receiver operating characteristic curves (AUROC). The spatial database was established with a total of 269 landslides and 10 conditioning factors: altitude, slope, aspect, total curvature, slope length, lithology, distance from the road, distance from the stream, topographic wetness index (TWI), and sediment transport index (STI). The findings of this study show that both the GAN and SMOTE data balancing approaches helped to improve the accuracy of machine learning models. According to AUROC, the GAN method was able to boost the models, reaching maximum accuracies of ANN (0.918), RF (0.933), DT (0.927), kNN (0.878), and SVM (0.907) when default parameters were used. With the optimum parameters, all models performed best with GAN, at their highest accuracies of ANN (0.927), RF (0.943), DT (0.923), and kNN (0.889), except SVM, which obtained its highest accuracy (0.906) with SMOTE. Our findings suggest that RF balanced with GAN can provide the most reasonable criterion for landslide prediction. This research indicates that landslide data balancing may substantially affect the predictive capabilities of machine learning models. Therefore, the issue of DI in the spatial prediction of landslides should not be ignored. Future studies could explore other generative models for landslide data balancing. By using state-of-the-art GAN, the proposed model can be considered in areas where the data are limited or imbalanced.
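Three of the evaluation measures named in the abstract above (overall accuracy, the Kappa index, and the F1-score) can all be derived from a binary confusion matrix. The sketch below shows the standard formulas, assuming landslide is the positive class; the function name `binary_metrics` and the counts in the usage note are illustrative and are not data from the study. AUROC is omitted because it needs ranked scores rather than counts.

```python
def binary_metrics(tp, fp, fn, tn):
    """Overall accuracy, F1-score, and Cohen's kappa from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. Illustrative sketch with no handling of empty classes.
    """
    total = tp + fp + fn + tn
    overall_accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Agreement expected by chance, from the row and column marginals.
    chance_positive = ((tp + fp) / total) * ((tp + fn) / total)
    chance_negative = ((fn + tn) / total) * ((fp + tn) / total)
    expected_agreement = chance_positive + chance_negative
    kappa = (overall_accuracy - expected_agreement) / (1 - expected_agreement)
    return {"oa": overall_accuracy, "f1": f1, "kappa": kappa}
```

For example, `binary_metrics(40, 10, 10, 40)` gives an overall accuracy and F1-score of 0.8 and a kappa of 0.6. Kappa is lower because it discounts the 0.5 agreement expected by chance on a balanced test set.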


Author(s):  
R Wisnu Prio Pamungkas ◽  
Rakhmi Khalida ◽  
Siti Setiawati

Abstract: Recently, computers have become able to produce realistic photos from text. This is one of the ways machine learning can be used creatively. Machine learning is the field of solving problems that require an understanding equivalent to human intelligence. This study uses the Generative Adversarial Networks (GAN) algorithm to create images from text descriptions. The basic GAN architecture consists of two networks, called the Generator and the Discriminator. The results of this study are images that still lack detail in interpreting a text description, but the authors try to produce images that inspire; the images can be more poetic when the input is poetry, lyrics, or book quotes. Keywords: GAN, Image Synthesis, Text Description


2022 ◽  
Vol 54 (8) ◽  
pp. 1-49
Author(s):  
Abdul Jabbar ◽  
Xi Li ◽  
Bourahla Omar

Generative models have gained considerable attention in unsupervised learning via a new and practical framework called Generative Adversarial Networks (GAN) due to their outstanding data generation capability. Many GAN models have been proposed, and several practical applications have emerged in various domains of computer vision and machine learning. Despite GANs' excellent success, there are still obstacles to stable training. The problems include Nash equilibrium, internal covariate shift, mode collapse, vanishing gradient, and lack of proper evaluation metrics. Therefore, stable training is a crucial issue in different applications for the success of GANs. Herein, we survey several training solutions proposed by different researchers to stabilize GAN training. We discuss (I) the original GAN model and its modified versions, (II) a detailed analysis of various GAN applications in different domains, and (III) a detailed study of the various GAN training obstacles as well as training solutions. Finally, we reveal several issues as well as research outlines for the topic.


2021 ◽  
Vol 54 (2) ◽  
pp. 1-38
Author(s):  
Yashar Deldjoo ◽  
Tommaso Di Noia ◽  
Felice Antonio Merra

Latent-factor models (LFM) based on collaborative filtering (CF), such as matrix factorization (MF) and deep CF methods, are widely used in modern recommender systems (RS) due to their excellent performance and recommendation accuracy. However, this success has been accompanied by a major new challenge: many applications of machine learning (ML) are adversarial in nature [146]. In recent years, it has been shown that these methods are vulnerable to adversarial examples, i.e., subtle but non-random perturbations designed to force recommendation models to produce erroneous outputs. The goal of this survey is two-fold: (i) to present recent advances in adversarial machine learning (AML) for the security of RS (i.e., attacking and defending recommendation models) and (ii) to show another successful application of AML in generative adversarial networks (GANs) for generative applications, thanks to their ability to learn (high-dimensional) data distributions. In this survey, we provide an exhaustive literature review of 76 articles published in major RS and ML journals and conferences. This review serves as a reference for the RS community working on the security of RS or on generative models using GANs to improve their quality.

