Deep Learning Exoplanets Detection by Combining Real and Synthetic Data

Author(s):  
Sara Cuéllar ◽  
Paulo Granados ◽  
Ernesto Fabregas ◽  
Michel Curé ◽  
Hector Vargas ◽  
...  

Scientists and astronomers attach great importance to the task of discovering new exoplanets, even more so if they are in the habitable zone. To date, more than 4300 exoplanets have been confirmed by NASA using various discovery techniques, including planetary transits, together with databases provided by space- and ground-based telescopes. This article proposes the development of a deep learning system for detecting planetary transits in Kepler Telescope light curves. The approach builds on related work from the literature and is enhanced for validation with real light curves. A CNN classification model is trained on a mixture of real and synthetic data and validated only on real data distinct from those used in the training stage. The best ratio of synthetic data is determined by means of an optimisation technique and a sensitivity analysis. The precision, accuracy and true positive rate of the best model obtained are determined and compared with other similar works. The results demonstrate that the use of synthetic data in the training stage can improve transit detection performance on real light curves.
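
The ratio-mixing step lends itself to a short illustration. Below is a minimal sketch of combining real and synthetic light curves at a chosen synthetic fraction, plus a small 1D CNN classifier, assuming PyTorch; the array names and network shape are hypothetical, not the authors' implementation.

```python
# Minimal sketch, not the authors' code: mix real and synthetic samples
# at a given synthetic fraction, then train a small 1D CNN on the result.
import numpy as np
import torch.nn as nn

def build_training_set(real_x, real_y, synth_x, synth_y, synth_ratio):
    """synth_ratio is the fraction of synthetic samples in the final set."""
    assert 0.0 <= synth_ratio < 1.0
    n_synth = int(len(real_x) * synth_ratio / (1.0 - synth_ratio))
    idx = np.random.choice(len(synth_x), size=n_synth, replace=False)
    x = np.concatenate([real_x, synth_x[idx]])
    y = np.concatenate([real_y, synth_y[idx]])
    perm = np.random.permutation(len(x))   # shuffle real and synthetic together
    return x[perm], y[perm]

# Simple 1D CNN over fixed-length light-curve windows (shapes assumed).
model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=5), nn.ReLU(), nn.MaxPool1d(4),
    nn.Conv1d(16, 32, kernel_size=5), nn.ReLU(), nn.MaxPool1d(4),
    nn.Flatten(), nn.LazyLinear(1), nn.Sigmoid(),
)
```

Sweeping `synth_ratio` while validating only on held-out real light curves reproduces the sensitivity analysis the abstract describes.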

Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7785
Author(s):  
Jun Mao ◽  
Change Zheng ◽  
Jiyan Yin ◽  
Ye Tian ◽  
Wenbin Cui

Training a deep learning-based classification model for early wildfire smoke images requires a large amount of rich data. However, due to the episodic nature of fire events, it is difficult to obtain wildfire smoke image data, and most of the samples in public datasets suffer from a lack of diversity. To address these issues, a method using synthetic images to train a deep learning classification model for real wildfire smoke was proposed in this paper. Firstly, we constructed a synthetic dataset by simulating a large amount of morphologically rich smoke in 3D modeling software and rendering the virtual smoke against many virtual wildland background images with rich environmental diversity. Secondly, to better use the synthetic data to train a wildfire smoke image classifier, we applied both pixel-level and feature-level domain adaptation. A CycleGAN-based pixel-level domain adaptation method for image translation was employed. On top of this, a feature-level domain adaptation method combining ADDA with DeepCORAL was adopted to further reduce the domain shift between the synthetic and real data. The proposed method was evaluated and compared on a test set of real wildfire smoke and achieved an accuracy of 97.39%. The method is applicable to wildfire smoke classification tasks based on RGB single-frame images and would also contribute to training image classification models without sufficient data.
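
The feature-level alignment term is compact enough to sketch. The following is the published Deep CORAL loss (Sun & Saenko, 2016) in PyTorch, shown as a hedged illustration of how the synthetic (source) and real (target) feature covariances are pulled together; it is not the authors' exact code.

```python
# Hedged sketch of the CORAL alignment term used alongside ADDA.
import torch

def coral_loss(source_feats, target_feats):
    """Frobenius distance between source and target feature covariances,
    scaled by 1/(4 d^2) as in the Deep CORAL formulation."""
    d = source_feats.size(1)

    def cov(x):
        xm = x - x.mean(dim=0, keepdim=True)
        return (xm.t() @ xm) / (x.size(0) - 1)

    diff = cov(source_feats) - cov(target_feats)
    return (diff * diff).sum() / (4.0 * d * d)

# total_loss = cls_loss + lambda_coral * coral_loss(f_synthetic, f_real)
```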


2019 ◽  
Vol 486 (3) ◽  
pp. 4158-4165 ◽  
Author(s):  
Dmitry A Duev ◽  
Ashish Mahabal ◽  
Quanzhi Ye ◽  
Kushal Tirumala ◽  
Justin Belicki ◽  
...  

ABSTRACT We present DeepStreaks, a convolutional-neural-network, deep-learning system designed to efficiently identify streaking fast-moving near-Earth objects that are detected in the data of the Zwicky Transient Facility (ZTF), a wide-field, time-domain survey using a dedicated 47 deg² camera attached to the Samuel Oschin 48-inch Telescope at the Palomar Observatory in California, United States. The system demonstrates a 96–98 per cent true positive rate, depending on the night, while keeping the false positive rate below 1 per cent. The sensitivity of DeepStreaks is quantified by the performance on the test data sets as well as using known near-Earth objects observed by ZTF. The system is deployed and adapted for usage within the ZTF Solar system framework and has significantly reduced human involvement in the streak identification process, from several hours to typically under 10 min per day.
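
The reported operating point (96–98 per cent TPR at under 1 per cent FPR) amounts to picking a score threshold on a ROC curve. A hedged sketch of that selection, assuming scikit-learn and placeholder score arrays rather than ZTF data:

```python
# Pick the threshold that maximises TPR subject to FPR <= 1%.
import numpy as np
from sklearn.metrics import roc_curve

def threshold_for_max_fpr(y_true, scores, max_fpr=0.01):
    fpr, tpr, thr = roc_curve(y_true, scores)
    admissible = fpr <= max_fpr
    best = np.argmax(tpr[admissible])   # highest TPR among admissible points
    return thr[admissible][best], tpr[admissible][best], fpr[admissible][best]
```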


2021 ◽  
Vol 11 (9) ◽  
pp. 3863
Author(s):  
Ali Emre Öztürk ◽  
Ergun Erçelebi

A large amount of training image data is required for solving image classification problems using deep learning (DL) networks. In this study, we aimed to train DL networks with synthetic images generated using a game engine and to determine the effects on performance when solving real-image classification problems. The study presents the results of using corner detection and nearest three-point selection (CDNTS) layers to classify bird and rotary-wing unmanned aerial vehicle (RW-UAV) images, provides a comprehensive comparison of two different experimental setups, and emphasizes the significant performance improvements in deep learning-based networks due to the inclusion of a CDNTS layer. Experiment 1 corresponds to training commonly used deep learning-based networks with synthetic data and an image classification test on real data. Experiment 2 corresponds to training the CDNTS layer and commonly used deep learning-based networks with synthetic data and an image classification test on real data. In experiment 1, the best area under the curve (AUC) value for image classification test accuracy was measured as 72%. In experiment 2, using the CDNTS layer, the AUC value for image classification test accuracy was measured as 88.9%. A total of 432 different training combinations were investigated in the experimental setups: various DL networks were trained using four different optimizers and all combinations of batch size, learning rate, and dropout hyperparameters. The test accuracy AUC values for networks in experiment 1 ranged from 55% to 74%, whereas the test accuracy AUC values for experiment 2 networks with a CDNTS layer ranged from 76% to 89.9%. The CDNTS layer was observed to have a considerable effect on the image classification accuracy of deep learning-based networks. AUC, F-score, and test accuracy measures were used to validate the success of the networks.
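
As a rough illustration of the corner-detection and nearest-three-point idea behind a CDNTS-style preprocessing step, the sketch below detects corners with OpenCV and gathers each corner's three nearest neighbours; the paper's actual layer definition may differ in detail.

```python
# Hedged sketch: corner detection plus nearest-three-point triplets.
import cv2
import numpy as np

def cdnts_triplets(gray_img, max_corners=50):
    corners = cv2.goodFeaturesToTrack(gray_img, maxCorners=max_corners,
                                      qualityLevel=0.01, minDistance=5)
    if corners is None or len(corners) < 4:
        return np.zeros((0, 3, 2), dtype=np.float32)
    pts = corners.reshape(-1, 2)
    triplets = []
    for p in pts:
        dists = np.linalg.norm(pts - p, axis=1)
        nearest = pts[np.argsort(dists)[1:4]]   # skip p itself, keep 3 nearest
        triplets.append(nearest)
    return np.asarray(triplets, dtype=np.float32)
```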


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Chiaki Kuwada ◽  
Yoshiko Ariji ◽  
Yoshitaka Kise ◽  
Takuma Funakoshi ◽  
Motoki Fukuda ◽  
...  

Abstract Although panoramic radiography has a role in the examination of patients with cleft alveolus (CA), its appearance is sometimes difficult to interpret. The aims of this study were to develop a computer-aided diagnosis system for diagnosing CA status on panoramic radiographs using a deep learning object detection technique with and without normal data in the learning process, to verify its performance in comparison to human observers, and to clarify some characteristic appearances probably related to the performance. The panoramic radiographs of 383 CA patients with cleft palate (CA with CP) or without cleft palate (CA only) and 210 patients without CA (normal) were used to create two models on DetectNet. Models 1 and 2 were developed based on the data without and with normal subjects, respectively, to detect the CAs and classify them as with or without CP. Model 2 reduced the false positive rate (1/30) compared with Model 1 (12/30). The overall accuracy of Model 2 was higher than that of Model 1 and the human observers. The model created in this study appears to have the potential to detect and classify CAs on panoramic radiographs, and might be useful to assist human observers.
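
The difference between the two models comes down to whether normal radiographs enter training as pure negatives. A minimal sketch of that dataset construction, with hypothetical record fields, is:

```python
# Sketch of the model-2 idea: normal images join the detection training
# set with empty target boxes, acting as hard negatives.
def make_detection_records(ca_images, ca_boxes, normal_images):
    records = [{"image": img, "boxes": boxes}   # CA with or without CP
               for img, boxes in zip(ca_images, ca_boxes)]
    records += [{"image": img, "boxes": []}     # normal: no findings
                for img in normal_images]
    return records
```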


2014 ◽  
Author(s):  
Andreas Tuerk ◽  
Gregor Wiktorin ◽  
Serhat Güler

Quantification of RNA transcripts with RNA-Seq is inaccurate due to positional fragment bias, which is not represented appropriately by current statistical models of RNA-Seq data. This article introduces the Mix² (read "mix square") model, which uses a mixture of probability distributions to model the transcript-specific positional fragment bias. The parameters of the Mix² model can be efficiently trained with the Expectation Maximization (EM) algorithm, resulting in simultaneous estimates of the transcript abundances and transcript-specific positional biases. Experiments are conducted on synthetic data and on the Universal Human Reference (UHR) and Human Brain Reference (HBR) samples from the MicroArray Quality Control (MAQC) data set. Comparing the correlation between qPCR and FPKM values to the state-of-the-art methods Cufflinks and PennSeq, we obtain an increase in R² value from 0.44 to 0.6 and from 0.34 to 0.54. In the detection of differential expression between UHR and HBR, the true positive rate increases from 0.44 to 0.71 at a false positive rate of 0.1. Finally, the Mix² model is used to investigate biases present in the MAQC data. This reveals 5 dominant biases which deviate from the common assumption of a uniform fragment distribution. The Mix² software is available at http://www.lexogen.com/fileadmin/uploads/bioinfo/mix2model.tgz.
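
To make the EM idea concrete, here is a toy Gaussian-mixture EM over fragment positions rescaled to [0, 1], assuming NumPy. It is far simpler than the actual Mix² model, which ties the mixture to transcript-specific bias, but it shows the alternating responsibility/re-estimation structure.

```python
# Toy EM for a k-component Gaussian mixture over fragment positions.
import numpy as np

def em_position_mixture(pos, k=5, iters=100):
    mu = np.linspace(0.1, 0.9, k)          # initial component centres
    sigma = np.full(k, 0.1)
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each fragment
        dens = w * np.exp(-0.5 * ((pos[:, None] - mu) / sigma) ** 2) / sigma
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and spreads
        nk = resp.sum(axis=0)
        w = nk / len(pos)
        mu = (resp * pos[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (pos[:, None] - mu) ** 2).sum(axis=0) / nk)
        sigma = np.maximum(sigma, 1e-3)    # keep components from collapsing
    return w, mu, sigma
```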


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Shenghua Cheng ◽  
Sibo Liu ◽  
Jingya Yu ◽  
Gong Rao ◽  
Yuwei Xiao ◽  
...  

Abstract Computer-assisted diagnosis is key for scaling up cervical cancer screening. However, current recognition algorithms perform poorly on whole slide image (WSI) analysis, fail to generalize for diverse staining and imaging, and show sub-optimal clinical-level verification. Here, we develop a progressive lesion cell recognition method combining low- and high-resolution WSIs to recommend lesion cells and a recurrent neural network-based WSI classification model to evaluate the lesion degree of WSIs. We train and validate our WSI analysis system on 3,545 patient-wise WSIs with 79,911 annotations from multiple hospitals and several imaging instruments. On multi-center independent test sets of 1,170 patient-wise WSIs, we achieve 93.5% Specificity and 95.1% Sensitivity for classifying slides, comparing favourably to the average performance of three independent cytopathologists, and obtain an 88.5% true positive rate for highlighting the top 10 lesion cells on 447 positive slides. After deployment, our system recognizes a one giga-pixel WSI in about 1.5 min.
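
The slide-level stage can be pictured as a recurrent model reading the feature vectors of the recommended lesion cells. A hedged sketch in PyTorch, with the GRU choice, feature dimension, and top-k input all assumed rather than taken from the paper:

```python
# Sketch: score a WSI from the features of its top-k recommended cells.
import torch
import torch.nn as nn

class SlideClassifier(nn.Module):
    def __init__(self, feat_dim=256, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, cell_feats):                # (batch, k, feat_dim)
        _, h = self.rnn(cell_feats)
        return torch.sigmoid(self.head(h[-1]))    # probability slide is positive
```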


2021 ◽  
Author(s):  
Shenghua Cheng ◽  
Sibo Liu ◽  
Jingya Yu ◽  
Gong Rao ◽  
Yuwei Xiao ◽  
...  

Abstract Computer-assisted diagnosis is key for popularizing cervical cancer screening. However, current recognition algorithms are insufficient in accuracy and generalization for cervical lesion cells, especially when facing diverse data in clinical applications. Inspired by manual slide reading under microscopes, we develop a progressive lesion cell recognition method combining low- and high-resolution WSIs to recommend lesion cells and a recurrent neural network-based WSI classification model to evaluate the lesion degree of WSIs. After validating our system on 3,545 patient-wise WSIs with 79,218 annotations from multiple hospitals and several imaging instruments, on multi-center independent test sets of 1,170 patient-wise WSIs we achieve 93.5% Specificity and 95.1% Sensitivity for classifying slides, closely equivalent to the average level of three independent cytopathologists, and obtain an 88.5% true positive rate (TPR) for recommending the top 10 lesion cells on 447 positive slides. After deployment, our system recognizes one giga-pixel WSI in about 1.5 minutes using one Nvidia 1080Ti GPU.


Author(s):  
Du Chunqi ◽  
Shinobu Hasegawa

In computer vision and computer graphics, 3D reconstruction is the process of capturing real objects' shapes and appearances. 3D models can be constructed by active methods, which use high-quality scanner equipment, or by passive methods, which learn from a dataset. However, both of these methods aim only to construct the 3D models, without showing which elements affect their generation. Therefore, the goal of this research is to apply deep learning to automatically generate 3D models and to find the latent variables which affect the reconstruction process. Existing research shows that GANs can be trained on little data with two networks, called the Generator and the Discriminator: the Generator produces synthetic data, and the Discriminator discriminates between the Generator's output and real data. Existing research also shows that InFoGAN can maximize the mutual information between latent variables and observations. In our approach, we generate the 3D models based on InFoGAN and design two constraints, a shape constraint and a parameters constraint. The shape constraint utilizes the data augmentation method to limit the synthetic data generated to the models' profiles. At the same time, we also employ the parameters constraint to find the relationship between the 3D models and the latent variables. Furthermore, our approach takes on the challenge of building an architecture for generating 3D models on top of InFoGAN. Finally, in the process of generation, we might discover the contribution of the latent variables influencing the 3D models to the whole network.
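
The mutual-information objective that InFoGAN adds to a plain GAN can be sketched briefly. In the continuous-code case, an auxiliary head Q is trained to recover the latent code from the generated output, which lower-bounds the mutual information; the function below is an illustration under assumed shapes, not this project's code.

```python
# Sketch of the InFoGAN-style mutual-information term for continuous codes.
import torch.nn as nn

def info_loss(q_net, fake_output, latent_code):
    """Train Q to regress the code c from G(z, c); minimising this MSE
    maximises a variational lower bound on I(c; G(z, c))."""
    c_hat = q_net(fake_output)
    return nn.functional.mse_loss(c_hat, latent_code)

# generator_loss = adversarial_loss + lambda_info * info_loss(Q, G(z, c), c)
```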


2021 ◽  
Vol 11 (24) ◽  
pp. 11938
Author(s):  
Denis Zherdev ◽  
Larisa Zherdeva ◽  
Sergey Agapov ◽  
Anton Sapozhnikov ◽  
Artem Nikonorov ◽  
...  

Estimating human poses and behaviour for different activities in virtual reality/augmented reality (VR/AR) could have numerous beneficial applications. Human fall monitoring is especially important for elderly people and for non-typical activities with VR/AR applications. There are many approaches to improving the fidelity of fall monitoring systems through the use of novel sensors and deep learning architectures; however, there is still a lack of detailed and diverse datasets for training deep learning fall detectors using monocular images. A synthetic data generation approach based on digital human simulation was implemented and examined using the Unreal Engine. The proposed pipeline provides automatic "playback" of various scenarios for digital human behaviour simulation, and the result of the proposed modular pipeline for synthetic data generation of digital human interaction with 3D environments is demonstrated in this paper. We used the generated synthetic data to train a Mask R-CNN-based segmentation of the falling person's interaction area. It is shown that, by training the model with simulation data, it is possible to recognize a falling person with an accuracy of 97.6% and to classify the type of the person's interaction impact. The proposed approach also covers a variety of scenarios that can have a positive effect in the deep learning training stage of other human action estimation tasks in a VR/AR environment.
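
Training a segmentation model on such rendered data follows the usual fine-tuning recipe. A minimal sketch with torchvision's Mask R-CNN, where the class count and dataset wiring are assumptions:

```python
# Sketch: adapt a pretrained Mask R-CNN to the synthetic fall dataset.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

num_classes = 2  # background + falling person (assumed)
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box and mask heads for the new class count.
in_feat = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feat, num_classes)
in_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_mask, 256, num_classes)
```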


Author(s):  
Hajar Danesh ◽  
Keivan Maghooli ◽  
Alireza Dehghani ◽  
Rahele Kafieh

Abstract Nowadays, retinal optical coherence tomography (OCT) plays an important role in ophthalmology, and automatic analysis of OCT is of real importance: image denoising facilitates a better diagnosis, and image segmentation and classification are undeniably critical in treatment evaluation. Synthetic OCT was recently considered to provide a benchmark for quantitative comparison of automatic algorithms and to be utilized in the training stage of novel solutions based on deep learning. Due to the complicated data structure in retinal OCTs, only a limited number of delineated OCT datasets are available in the presence of abnormalities; furthermore, the intrinsic three-dimensional (3D) structure of OCT is ignored in many public 2D datasets. We propose a new synthetic method, applicable to 3D data and feasible in the presence of abnormalities like diabetic macular edema (DME). In this method, a limited number of OCT data is used during the training step, and the Active Shape Model is used to produce synthetic OCTs along with the delineation of retinal boundaries and the location of abnormalities. Statistical comparison of thickness maps showed that the synthetic dataset can be used as a statistically acceptable representative of the original dataset (p > 0.05). Visual inspection of the synthesized vessels was also promising. Regarding the texture features of the synthesized datasets, Q-Q plots were used, and even in cases where the points slightly digressed from the straight line, the p-values of the Kolmogorov–Smirnov test did not reject the null hypothesis, indicating the same distribution of texture features in the real and the synthetic data. The proposed algorithm provides a unique benchmark for the comparison of OCT enhancement methods and a tailored augmentation method to overcome the limited number of OCTs in deep learning algorithms.
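
The distribution check on texture features reduces to a two-sample test. A short sketch using SciPy, with the feature extraction itself left out as an assumption:

```python
# Two-sample Kolmogorov-Smirnov test on a texture feature from real vs
# synthetic OCT images; p > alpha fails to reject a common distribution.
from scipy import stats

def consistent_distributions(real_feature, synth_feature, alpha=0.05):
    stat, p = stats.ks_2samp(real_feature, synth_feature)
    return p > alpha, p
```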

