CAPTCHA recognition based on a generative-adversarial network

Author(s):  
Anatoliy Parfenov ◽  
Peter Sychov

CAPTCHA recognition is not a new research topic. Over the past decade, researchers have demonstrated various ways to recognize text-based CAPTCHAs automatically. However, setting up such methods requires substantial expert involvement and a laborious process of collecting and labeling data. This article presents a general, low-cost, but effective deep-learning approach to solving text-based CAPTCHAs automatically. The approach is built on a generative adversarial network architecture, which significantly reduces the number of real CAPTCHAs required.

Information ◽  
2021 ◽  
Vol 12 (6) ◽  
pp. 249
Author(s):  
Xin Jin ◽  
Yuanwen Zou ◽  
Zhongbing Huang

The cell cycle is an important process in cellular life. In recent years, some image processing methods have been developed to determine the cell cycle stages of individual cells. However, in most of these methods, cells have to be segmented, and their features need to be extracted. During feature extraction, some important information may be lost, resulting in lower classification accuracy. Thus, we used a deep learning method to retain all cell features. To address the insufficient number and imbalanced distribution of original images, we used the Wasserstein generative adversarial network with gradient penalty (WGAN-GP) for data augmentation. At the same time, a residual network (ResNet), one of the most widely used deep learning classification networks, was used for image classification. With our method, the classification accuracy of cell cycle images reached 83.88%. Compared with an accuracy of 79.40% in previous experiments, this is an increase of 4.48 percentage points. Another dataset was used to verify the effect of our model and, compared with the accuracy from previous results, our accuracy increased by 12.52%. The results showed that our new cell cycle image classification system based on WGAN-GP and ResNet is useful for the classification of imbalanced images. Moreover, our method could potentially solve the low classification accuracy in biomedical images caused by insufficient numbers of original images and the imbalanced distribution of original images.
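As a rough illustration of the balancing step, the sketch below computes how many synthetic images a generator would have to supply per class so that every class matches the largest one. The function name, class names, and counts are hypothetical and are not taken from the paper, whose actual augmentation targets are not stated in the abstract.

```python
def synthetic_needed(class_counts):
    """Return, per class, how many generated images would bring it up to the largest class."""
    target = max(class_counts.values())
    return {name: target - count for name, count in class_counts.items()}


# Hypothetical, imbalanced counts of cell cycle images per stage (not from the paper).
counts = {"G1": 900, "S": 400, "G2": 250}
print(synthetic_needed(counts))
```

Balancing to the largest class is only one possible target; a real pipeline might cap the share of synthetic images per class instead.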


Author(s):  
Xinyi Li ◽  
Liqiong Chang ◽  
Fangfang Song ◽  
Ju Wang ◽  
Xiaojiang Chen ◽  
...  

This paper focuses on a fundamental question in Wi-Fi-based gesture recognition: "Can we use the knowledge learned from some users to perform gesture recognition for others?" This problem is also known as cross-target recognition. It arises in many practical deployments of Wi-Fi-based gesture recognition where it is prohibitively expensive to collect training data from every single user. We present CrossGR, a low-cost cross-target gesture recognition system. As a departure from existing approaches, CrossGR does not require prior knowledge of the target user (such as who is currently performing a gesture). Instead, CrossGR employs a deep neural network to extract user-agnostic but gesture-related Wi-Fi signal characteristics to perform gesture recognition. To provide sufficient training data to build an effective deep learning model, CrossGR employs a generative adversarial network to automatically generate a large amount of synthetic training data from a small set of real-world examples collected from a small number of users. Such a strategy allows CrossGR to minimize the user involvement and the associated cost of collecting training examples for building an accurate gesture recognition system. We evaluate CrossGR by applying it to perform gesture recognition across 10 users and 15 gestures. Experimental results show that CrossGR achieves an accuracy of over 82.6% (up to 99.75%). We demonstrate that CrossGR delivers comparable recognition accuracy but uses an order of magnitude fewer training samples collected from end users than state-of-the-art recognition systems.


2021 ◽  
Author(s):  
James Howard ◽  
Joe Tracey ◽  
Mike Shen ◽  
Shawn Zhang ◽  
...  

Borehole image logs are used to identify the presence and orientation of fractures, both natural and induced, found in reservoir intervals. The contrast in electrical or acoustic properties between the rock matrix and fluid-filled fractures is large enough that sub-resolution features can be detected by these image logging tools. The resolution of these image logs depends on the design and operation of the tools, and is generally in the millimeter-per-pixel range. Hence, the quantitative measurement of actual fracture width remains problematic. An artificial intelligence (AI)-based workflow combines the statistical information obtained from a machine-learning (ML) segmentation process with a multiple-layer neural network that defines a Deep Learning process that enhances fractures in a borehole image. These new images allow for a more robust analysis of fracture widths, especially those that are sub-resolution. The images from a BHTV log were first segmented into rock and fluid-filled fractures using an ML segmentation tool that applied multiple image processing filters to capture information describing patterns in fracture-rock distribution based on nearest-neighbor behavior. The ML analysis was trained by users to identify these two components over a short interval in the well, and the resulting regression-model coefficients were then applied to the rest of the log. Based on the training, each pixel was assigned a probability value between 1.0 (fracture) and 0.0 (pure rock), with most of the pixels assigned one of these two values. Intermediate probabilities represented pixels on the edge of the rock-fracture interface or the presence of one or more sub-resolution fractures within the rock. The probability matrix produced a map, or image, of the distribution of probabilities that determined whether a given pixel in the image was a fracture or was partially filled with a fracture.
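The per-pixel probability map described above can be sketched as follows. This is a minimal illustration only: the function name `label_pixels` and the cut-off values `low` and `high` are assumptions introduced here, not parameters reported by the authors.

```python
def label_pixels(prob_map, low=0.05, high=0.95):
    """Label each pixel 'rock', 'fracture', or 'mixed' from its fracture probability.

    The cut-offs `low` and `high` are illustrative assumptions.
    """
    labels = []
    for row in prob_map:
        labels.append([
            "fracture" if p >= high else "rock" if p <= low else "mixed"
            for p in row
        ])
    return labels


# Tiny hypothetical probability map (1.0 = fracture, 0.0 = pure rock).
prob_map = [
    [0.0, 0.0, 0.4],
    [0.0, 0.6, 1.0],
]
for row in label_pixels(prob_map):
    print(row)
```

Here the "mixed" label stands in for the intermediate probabilities, which the abstract attributes to interface pixels or sub-resolution fractures.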
The Deep Learning neural network was based on a Conditional Generative Adversarial Network (cGAN) approach in which the probability map was first encoded and combined with a noise vector that acted as a seed for diverse feature generation. This combination was used to generate new images that represented the BHTV response. The second part of the neural network, the adversarial or discriminator portion, determined whether the generated images were representative of the actual BHTV by comparing them with actual images from the log and producing an output probability of whether each image was real or fake. This probability was then used to train the generator and discriminator models, which were then applied to the entire log. Several scenarios were run with different probability maps. The enhanced BHTV images brought out fractures observed in the core photos that were less obvious in the original BHTV log, through enhanced continuity and improved resolution of fracture widths.
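To make the role of the discriminator's output probability concrete, the sketch below shows the standard binary cross-entropy GAN losses computed from that probability. This is an assumption for illustration: the abstract does not say which loss functions were used, and the function names are hypothetical.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy for the discriminator: real images should score 1, generated ones 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating generator loss: the generator wants generated images to score 1."""
    return -math.log(d_fake)


# Hypothetical discriminator outputs (probability that an image is real).
print(discriminator_loss(d_real=0.9, d_fake=0.1))
print(generator_loss(d_fake=0.1))
```

A confident, correct discriminator gives a low discriminator loss and a high generator loss, which is the signal that pushes the generator toward more realistic images.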


Sensors ◽  
2018 ◽  
Vol 18 (11) ◽  
pp. 3913 ◽  
Author(s):  
Mingxuan Li ◽  
Ou Li ◽  
Guangyi Liu ◽  
Ce Zhang

With the recent explosive growth of deep learning, automatic modulation recognition has undergone rapid development. Most of the newly proposed methods depend on large numbers of labeled samples. We aim to use fewer labeled samples to perform automatic modulation recognition in the cognitive radio domain. Here, we propose a semi-supervised learning method based on adversarial training, called the signal classifier generative adversarial network. Most prior methods based on this technology involve computer vision applications. We improve the existing network structure of a generative adversarial network by adding an encoder network and a signal spatial transform module, allowing our framework to address radio signal processing tasks more efficiently. These two technical improvements effectively avoid the nonconvergence and mode collapse problems caused by the complexity of radio signals. Simulation results show that, compared with well-known deep learning methods, our method improves the classification accuracy on a synthetic radio frequency dataset by 0.1% to 12%. In addition, we verify the advantages of our method in a semi-supervised scenario and obtain a significant increase in accuracy compared with traditional semi-supervised learning methods.


Cataract is a degenerative condition whose global prevalence is estimated to rise. Even though there are various proposals for its diagnosis, some problems remain to be solved. This paper aims to identify the current state of recent research on cataract diagnosis, using a framework to conduct the literature review, with the intention of answering the following research questions: RQ1) Which are the existing methods for cataract diagnosis? RQ2) Which are the features considered for the diagnosis of cataracts? RQ3) Which is the existing classification when diagnosing cataracts? RQ4) Which obstacles arise when diagnosing cataracts? Additionally, a cross-analysis of the results was made. The results showed that new research is required in (1) the classification of “congenital cataract” and (2) portable solutions, which are necessary to make cataract diagnoses easily and at a low cost.


Author(s):  
S. M. Tilon ◽  
F. Nex ◽  
D. Duarte ◽  
N. Kerle ◽  
G. Vosselman

Abstract. Degradation and damage detection provides essential information to maintenance workers in routine monitoring and to first responders in post-disaster scenarios. Despite advances in Earth Observation (EO), image analysis and deep learning techniques, the quality and quantity of training data for deep learning are still limited. As a result, no robust method has yet been found that can transfer and generalize well over a variety of geographic locations and damage typologies. Since damage can be seen as an anomaly, occurring sparingly over time and space, we propose to use an anomaly-detecting Generative Adversarial Network (GAN) to detect it. The main advantages of using GANs are that only healthy, unannotated images are needed, and that a variety of damage types, including never-before-seen damage, can be detected. In this study we aimed to investigate 1) the ability of anomaly-detecting GANs to detect degradation (potholes and cracks) in asphalt road infrastructure using Mobile Mapper imagery and building damage (collapsed buildings, rubble piles) using post-disaster aerial imagery, and 2) the sensitivity of this method to various types of pre-processing. Our results show that we can detect damage in urban scenes at satisfactory levels but not on asphalt roads. Future work will investigate how to further classify the detected damage and how to improve damage detection for asphalt roads.
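One common way for an anomaly-detecting GAN to score an image is by how poorly a generator trained only on healthy images can reconstruct it. The sketch below illustrates that idea on flattened pixel lists. The scoring rule, the function names, and the threshold value are assumptions for illustration, since the abstract does not state how anomalies were scored.

```python
def anomaly_score(image, reconstruction):
    """Mean absolute pixel difference between an image and its reconstruction."""
    diffs = [abs(a - b) for a, b in zip(image, reconstruction)]
    return sum(diffs) / len(diffs)


def is_damaged(image, reconstruction, threshold=0.2):
    """Flag an image as anomalous when its score exceeds a (hypothetical) threshold."""
    return anomaly_score(image, reconstruction) > threshold


# Hypothetical flattened pixel intensities in [0, 1].
healthy = [0.2, 0.4, 0.6, 0.8]
damaged = [0.2, 0.9, 0.1, 0.8]
reconstruction = [0.2, 0.4, 0.6, 0.8]  # what a generator trained on healthy scenes might return
print(is_damaged(healthy, reconstruction), is_damaged(damaged, reconstruction))
```

In practice the threshold would be tuned on held-out healthy images, and published anomaly-detecting GANs often combine pixel differences with feature-space or latent-space differences.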


2020 ◽  
Author(s):  
Yang Zhang ◽  
Ning Yue ◽  
Min‐Ying Su ◽  
Bo Liu ◽  
Yi Ding ◽  
...  

2020 ◽  
Vol 245 (7) ◽  
pp. 597-605 ◽  
Author(s):  
Tri Vu ◽  
Mucong Li ◽  
Hannah Humayun ◽  
Yuan Zhou ◽  
Junjie Yao

With balanced spatial resolution, penetration depth, and imaging speed, photoacoustic computed tomography (PACT) is promising for clinical translation such as in breast cancer screening, functional brain imaging, and surgical guidance. Typically using a linear ultrasound (US) transducer array, PACT has great flexibility for hand-held applications. However, the linear US transducer array has a limited detection angle range and frequency bandwidth, resulting in limited-view and limited-bandwidth artifacts in the reconstructed PACT images. These artifacts significantly reduce the imaging quality. To address these issues, existing solutions often have to pay the price of system complexity, cost, and/or imaging speed. Here, we propose a deep-learning-based method that explores the Wasserstein generative adversarial network with gradient penalty (WGAN-GP) to reduce the limited-view and limited-bandwidth artifacts in PACT. Compared with existing reconstruction and convolutional neural network approaches, our model has shown improvement in imaging quality and resolution. Our results on simulation, phantom, and in vivo data have collectively demonstrated the feasibility of applying WGAN-GP to improve PACT’s image quality without any modification to the current imaging set-up.
Impact statement
This study has the following main impacts. It offers a promising solution for removing limited-view and limited-bandwidth artifacts in PACT using a linear-array transducer and conventional image reconstruction, which have long hindered its clinical translation. Our solution shows unprecedented artifact removal ability for in vivo images, which may enable important applications such as imaging tumor angiogenesis and hypoxia. The study reports, for the first time, the use of an advanced deep-learning model based on a stabilized generative adversarial network. Our results have demonstrated its superiority over other state-of-the-art deep-learning methods.
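For reference, the critic objective of WGAN-GP in its standard published form is given below. The abstract does not report the penalty weight or any modification to this form, so the symbols are the generic ones: $D$ is the critic, $\mathbb{P}_r$ and $\mathbb{P}_g$ are the real and generated image distributions, and $\lambda$ is the penalty weight.

```latex
L = \mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}\left[ D(\tilde{x}) \right]
  - \mathbb{E}_{x \sim \mathbb{P}_r}\left[ D(x) \right]
  + \lambda \, \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}
    \left[ \left( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \right)^2 \right],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The last term penalizes critic gradients whose norm departs from 1 at points interpolated between real and generated samples, which is the stabilizing ingredient the name refers to.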


Author(s):  
Mohammad Shahab Uddin ◽  
Jiang Li

Deep learning models are data driven. For example, the most popular convolutional neural network (CNN) models used for image classification or object detection require large labeled databases for training to achieve competitive performance. This requirement is not difficult to satisfy in the visible domain, since many labeled video and image databases are available nowadays. However, because infrared (IR) cameras are less widely used, the availability of labeled infrared video or image databases is limited. Therefore, training deep learning models in the infrared domain is still challenging. In this chapter, we applied the pix2pix generative adversarial network (Pix2Pix GAN) and cycle-consistent GAN (Cycle GAN) models to convert visible videos to infrared videos. The Pix2Pix GAN model requires visible-infrared image pairs for training, while the Cycle GAN relaxes this constraint and requires only unpaired images from both domains. We applied the two models to an open-source database of visible and infrared videos provided by the Signal, Multimedia and Telecommunications Laboratory at the Federal University of Rio de Janeiro. We evaluated conversion results with performance metrics including Inception Score (IS), Frechet Inception Distance (FID) and Kernel Inception Distance (KID). Our experiments suggest that the cycle-consistent GAN is more effective than the pix2pix GAN for generating IR images from optical images.
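To give a feel for what FID measures, the sketch below computes the Fréchet distance between two Gaussians fitted to scalar features, where the general formula reduces to the squared difference of means plus the squared difference of standard deviations. This is a simplified one-dimensional illustration: the real FID uses multivariate Inception-network features and covariance matrices, and the function name and sample values here are hypothetical.

```python
import statistics


def fid_1d(real, generated):
    """Frechet distance between 1-D Gaussians fitted to two samples of scalar features."""
    mu_r, mu_g = statistics.fmean(real), statistics.fmean(generated)
    # Population standard deviation is used here as a simplifying assumption.
    sd_r, sd_g = statistics.pstdev(real), statistics.pstdev(generated)
    return (mu_r - mu_g) ** 2 + (sd_r - sd_g) ** 2


# Hypothetical scalar features extracted from real and generated IR frames.
real_features = [0.0, 2.0]
generated_features = [1.0, 3.0]
print(fid_1d(real_features, generated_features))
```

Lower values mean the generated features are statistically closer to the real ones, and identical samples give a distance of zero.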

