Fully Automated DCNN-Based Thermal Images Annotation Using Neural Network Pretrained on RGB Data

Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1552
Author(s):  
Adam Ligocki ◽  
Ales Jelinek ◽  
Ludek Zalud ◽  
Esa Rahtu

One of the biggest challenges in training deep neural networks is the need for massive data annotation. Training a neural network for object detection requires millions of annotated images. However, while voluminous RGB image datasets are available, there are currently no large-scale thermal image datasets suitable for training state-of-the-art neural networks. This paper presents a method for creating hundreds of thousands of annotated thermal images using an object detector pre-trained on RGB data. A dataset created in this way can be used to train object detectors with improved performance. The main contribution of this work is a novel method for fully automatic thermal image labeling. The proposed system uses an RGB camera, a thermal camera, a 3D LiDAR, and a pre-trained neural network that detects objects in the RGB domain. With this setup, a fully automated process annotates the thermal images and builds the automatically annotated thermal training dataset. As a result, we created a dataset containing hundreds of thousands of annotated objects. This approach allows deep learning models to be trained with performance similar to that of common human-annotation-based methods. The paper also proposes several improvements that fine-tune the results with minimal human intervention. Finally, the evaluation of the proposed solution shows that the method gives significantly better results than training the neural network on standard small-scale hand-annotated thermal image datasets.
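As a rough illustration of the cross-modal label transfer idea above, the NumPy sketch below re-projects boxes from an RGB-pretrained detector into the thermal frame using LiDAR depth. The matrix layout (3x3 intrinsics `K`, 4x4 world-to-camera extrinsics `T`) and the box format are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch: transfer RGB detections to a thermal camera via LiDAR.
import numpy as np

def project(points_3d, K, T):
    """Project Nx3 world points into an image with intrinsics K, extrinsics T."""
    pts_h = np.hstack([points_3d, np.ones((len(points_3d), 1))])  # homogeneous coords
    cam = (T @ pts_h.T).T[:, :3]                   # world -> camera frame
    uv = (K @ cam.T).T                             # pinhole projection
    return uv[:, :2] / uv[:, 2:3], cam[:, 2]       # pixel coords, depth

def transfer_box(box_rgb, lidar_pts, K_rgb, T_rgb, K_thermal, T_thermal):
    """Map one RGB bounding box into the thermal image via the LiDAR points."""
    uv_rgb, depth = project(lidar_pts, K_rgb, T_rgb)
    x0, y0, x1, y1 = box_rgb
    inside = ((uv_rgb[:, 0] >= x0) & (uv_rgb[:, 0] <= x1) &
              (uv_rgb[:, 1] >= y0) & (uv_rgb[:, 1] <= y1) & (depth > 0))
    if not inside.any():
        return None                                # no depth support for this box
    uv_t, _ = project(lidar_pts[inside], K_thermal, T_thermal)
    return (uv_t[:, 0].min(), uv_t[:, 1].min(),    # tight box in the thermal frame
            uv_t[:, 0].max(), uv_t[:, 1].max())
```

In a pipeline of this kind, the recovered thermal boxes would then be written out as the automatically annotated training set.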

2020 ◽  
Vol 2020 ◽  
pp. 1-9 ◽  
Author(s):  
Manhuai Lu ◽  
Yuanxiang Mou

Postproduction defect classification and detection of bearings still relies on manual inspection, which is time-consuming and tedious. To address this, we propose a bearing defect classification network based on an autoencoder to enhance the efficiency and accuracy of bearing defect detection. An improved autoencoder performs dimensionality-reducing feature extraction, compressing large-scale images into small-scale representations through the encoder. Defect classification is completed by feeding the extracted features into a convolutional classification network. Comparative experiments show that the network effectively performs feature selection and substantially improves classification accuracy while avoiding the laborious algorithms of conventional methods.
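A minimal Keras sketch of this two-stage idea: a convolutional autoencoder compresses large bearing images into a small latent map, and a compact classification head then predicts the defect class from the encoder output. The input size, layer widths, and number of defect classes are illustrative assumptions.

```python
# Hypothetical sketch: autoencoder feature extraction + CNN defect classifier.
from tensorflow.keras import layers, Model

inp = layers.Input(shape=(256, 256, 1))
x = layers.Conv2D(16, 3, strides=2, padding="same", activation="relu")(inp)
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)
latent = layers.Conv2D(8, 3, strides=2, padding="same", activation="relu")(x)  # 32x32x8

# Decoder, used only for reconstruction pre-training on unlabeled images.
y = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(latent)
y = layers.Conv2DTranspose(16, 3, strides=2, padding="same", activation="relu")(y)
recon = layers.Conv2DTranspose(1, 3, strides=2, padding="same", activation="sigmoid")(y)
autoencoder = Model(inp, recon)
autoencoder.compile(optimizer="adam", loss="mse")

# Classification head fed by the encoder features (encoder layers can be frozen).
c = layers.Conv2D(32, 3, activation="relu")(latent)
c = layers.GlobalAveragePooling2D()(c)
out = layers.Dense(4, activation="softmax")(c)     # 4 defect classes, assumed
classifier = Model(inp, out)
classifier.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```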


2019 ◽  
Author(s):  
Yosuke Toda ◽  
Fumio Okura ◽  
Jun Ito ◽  
Satoshi Okada ◽  
Toshinori Kinoshita ◽  
...  

Incorporating deep learning in the image analysis pipeline has opened the possibility of introducing precision phenotyping in the field of agriculture. However, to train the neural network, a sufficient amount of training data must be prepared, which requires a time-consuming manual data annotation process that often becomes the limiting step. Here, we show that an instance segmentation neural network (Mask R-CNN) aimed at phenotyping the barley seed morphology of various cultivars can be sufficiently trained purely on a synthetically generated dataset. Our approach is based on the concept of domain randomization, in which a large number of images is generated by randomly placing and orienting seed objects on a virtual canvas. After training on such a dataset, recall and average precision on the real-world test dataset reached 96% and 95%, respectively. Applying our pipeline enables extraction of morphological parameters at a large scale, allowing precise characterization of the natural variation of barley from a multivariate perspective. Importantly, we show that our approach is effective not only for barley seeds but also for various crops, including rice, lettuce, oat, and wheat, supporting the generality of the technique's performance benefits. We propose that constructing and utilizing such synthetic data can be a powerful way to alleviate the human labor cost of preparing training datasets for deep learning in the agricultural domain.
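A minimal NumPy/SciPy sketch of a domain-randomization generator of this kind: single-seed RGBA crops are randomly rotated and pasted onto a virtual canvas, and each paste directly yields an instance mask, so annotation comes for free. The `seed_images` input and all size parameters are assumptions for illustration.

```python
# Hypothetical sketch: synthetic seed-scattering generator with free masks.
import numpy as np
from scipy.ndimage import rotate

def synthesize(seed_images, canvas_size=(512, 512), n_seeds=30, rng=None):
    """seed_images: list of RGBA uint8 crops, each containing one seed."""
    rng = rng or np.random.default_rng()
    canvas = np.zeros((*canvas_size, 3), dtype=np.uint8)
    masks = []
    for _ in range(n_seeds):
        seed = seed_images[rng.integers(len(seed_images))]
        seed = rotate(seed, rng.uniform(0, 360), reshape=True, order=0)  # random pose
        h, w = seed.shape[:2]                      # assumes crops fit on the canvas
        y = rng.integers(0, canvas_size[0] - h)
        x = rng.integers(0, canvas_size[1] - w)
        alpha = seed[..., 3] > 0                   # alpha channel marks seed pixels
        canvas[y:y+h, x:x+w][alpha] = seed[..., :3][alpha]
        mask = np.zeros(canvas_size, dtype=bool)   # per-instance mask for Mask R-CNN
        mask[y:y+h, x:x+w] = alpha
        masks.append(mask)
    return canvas, masks
```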


2021 ◽  
Author(s):  
François-Marie Bréon ◽  
Leslie David ◽  
Pierre Chatelanaz ◽  
Frédéric Chevallier

Abstract. In David et al. (2021), we introduced a neural network (NN) approach for estimating the column-averaged dry air mole fraction of CO2 (XCO2) and the surface pressure from the reflected solar spectra acquired by the OCO-2 instrument. The results indicated great potential for the technique, as comparison against both model estimates and independent Total Carbon Column Observing Network (TCCON) measurements showed accuracy and precision similar to or better than those of NASA's operational Atmospheric CO2 Observations from Space (ACOS) algorithm. Yet, subsequent analysis showed that the neural network estimate often mimics the training dataset and is unable to retrieve small-scale features such as CO2 plumes from industrial sites. Importantly, we found that, with the same inputs as those used to estimate XCO2 and surface pressure, the NN technique is able to estimate latitude and date with unexpected skill, i.e., with errors whose standard deviations are only 7° and 61 days, respectively. The information about the date mainly comes from the weak CO2 band, which is influenced by the well-mixed and increasing concentration of CO2 in the stratosphere. The availability of such information in the measured spectrum may therefore allow the NN to exploit it, rather than the direct CO2 imprint in the spectrum, to estimate XCO2. Thus, the first version of the NN performed well mostly because the XCO2 fields used for training were remarkably accurate, but it did not bring any added value. Following this analysis, we designed a second version of the NN that excludes the weak CO2 band from the input. This new version behaves differently: it does retrieve XCO2 enhancements downwind of emission hotspots, a feature that is not in the training dataset. The comparison against the reference TCCON and the surface-air-sample-driven inversion of the Copernicus Atmosphere Monitoring Service (CAMS) remains very good, as for the first version of the NN. In addition, the differences with the CAMS model (also called innovation in a data assimilation context) for the ACOS and NN estimates are significantly correlated. These results confirm the potential of the NN approach for operational processing of satellite observations aimed at monitoring CO2 concentrations and fluxes.
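As a rough illustration of the revised setup, the Keras sketch below wires only the O2-A and strong CO2 bands into the network, so the weak-band shortcut described above is unavailable. The 1016-channel band length matches OCO-2's spectrometers, but the layer widths and overall layout are assumptions, not the authors' configuration.

```python
# Hypothetical sketch: retrieval MLP with the weak CO2 band excluded.
from tensorflow.keras import layers, Model

o2_band = layers.Input(shape=(1016,), name="o2_a_band")
strong_co2 = layers.Input(shape=(1016,), name="strong_co2_band")
x = layers.Concatenate()([o2_band, strong_co2])    # weak CO2 band deliberately absent
for width in (512, 256, 64):                       # assumed hidden-layer widths
    x = layers.Dense(width, activation="relu")(x)
xco2 = layers.Dense(1, name="xco2")(x)             # column-averaged CO2 estimate
psurf = layers.Dense(1, name="surface_pressure")(x)
model = Model([o2_band, strong_co2], [xco2, psurf])
model.compile(optimizer="adam", loss="mse")
```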


2020 ◽  
Vol 2020 (10) ◽  
pp. 181-1-181-7
Author(s):  
Takahiro Kudo ◽  
Takanori Fujisawa ◽  
Takuro Yamaguchi ◽  
Masaaki Ikehara

Image deconvolution has been an important issue recently. It has two kinds of approaches: non-blind and blind. Non-blind deconvolution is a classic image deblurring problem that assumes the point spread function (PSF) is known and spatially invariant. Recently, convolutional neural networks (CNNs) have been used for non-blind deconvolution. Although CNNs can deal with complex variations in unknown images, some conventional CNN-based methods can only handle small PSFs and do not consider the large PSFs encountered in the real world. In this paper, we propose a CNN-based non-blind deconvolution framework that can remove large-scale ringing in a deblurred image. Our method has three key points. The first is that our network architecture preserves both large and small features in the image. The second is that the training dataset is created so as to preserve details. The third is that we extend the images to minimize the effects of large ringing at the image borders. In our experiments, we used three kinds of large PSFs and observed high-precision results from our method both quantitatively and qualitatively.
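A minimal NumPy sketch of the third key point, border extension: the image is mirrored outward before deconvolution so that boundary ringing lands in the padding, which is cropped away afterwards. Tying the pad width to the PSF size is an assumed rule here; the deconvolution CNN itself is omitted.

```python
# Hypothetical sketch: extend the image so border ringing falls in the margin.
import numpy as np

def extend_for_deconvolution(image, psf_size):
    pad = psf_size // 2 + 1                        # cover the PSF support (assumed rule)
    padded = np.pad(image, pad, mode="reflect")    # mirrored edges avoid discontinuities
    return padded, pad

def crop_back(deblurred, pad):
    return deblurred[pad:-pad, pad:-pad]           # discard the ringing-prone margin
```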


Sensors ◽  
2019 ◽  
Vol 19 (11) ◽  
pp. 2636 ◽  
Author(s):  
Xia Fang ◽  
Wang Jie ◽  
Tao Feng

In the field of machine-vision defect detection for micro workpieces, it is very important for the neural network to preserve the integrity of the mask in the segmented regions of the analyte. In the recognition of small workpieces, fatal defects are often contained in borderline areas that are difficult to demarcate. Non-maximum suppression (NMS) based on intersection over union (IoU) loses crucial texture information, especially in cluttered and occluded detection areas. In this paper, simple linear iterative clustering (SLIC) is used to augment the mask and to calibrate the mask score. We propose an SLIC head for object instance segmentation in proposal regions (Mask R-CNN) containing a network block that learns the quality of the predicted masks. We found that the parallel K-means within limited regions mechanism in the SLIC head improved the confidence of the mask score in the context of our workpiece. A continuous fine-tuning mechanism was used to steadily improve model robustness on a large-scale production line. We established a detection system comprising an optical fiber locator, a telecentric lens system, matrix stereoscopic lighting, a rotating platform, and a neural network with an SLIC head. The accuracy of defect detection is effectively improved for micro workpieces with clutter and borderline areas.
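A minimal scikit-image sketch of the SLIC-based mask augmentation idea: the predicted mask is snapped to superpixel boundaries, recovering borderline texture that IoU-based NMS tends to discard. The 0.5 coverage threshold and the superpixel parameters are illustrative assumptions, not the paper's calibration rule.

```python
# Hypothetical sketch: refine a predicted mask with SLIC superpixels.
import numpy as np
from skimage.segmentation import slic

def refine_mask(image, pred_mask, n_segments=400, compactness=10.0):
    segments = slic(image, n_segments=n_segments, compactness=compactness)
    refined = np.zeros_like(pred_mask, dtype=bool)
    for label in np.unique(segments):
        region = segments == label
        # Keep a superpixel if the predicted mask covers most of it.
        if pred_mask[region].mean() > 0.5:
            refined |= region
    return refined
```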


Electronics ◽  
2021 ◽  
Vol 10 (22) ◽  
pp. 2868
Author(s):  
Wenxuan Zhao ◽  
Yaqin Zhao ◽  
Liqi Feng ◽  
Jiaxi Tang

The purpose of image dehazing is to reduce the image degradation caused by suspended particles in order to support high-level visual tasks. Besides the atmospheric scattering model, convolutional neural networks (CNNs) have been used for image dehazing. However, existing image dehazing algorithms struggle with unevenly distributed haze and dense haze in real-world scenes. In this paper, we propose a novel end-to-end convolutional neural network called the attention-enhanced serial Unet++ dehazing network (AESUnet) for single image dehazing. We build a serial Unet++ structure that chains two pruned Unet++ blocks through residual connections. Compared with a simple encoder-decoder structure, the serial Unet++ module makes better use of the features extracted by the encoders and promotes fusion of contextual information at different resolutions. In addition, we make several improvements to the Unet++ module, such as pruning, introducing a convolutional module with a ResNet structure, and a residual learning strategy. Thus, the serial Unet++ module can generate more realistic images with less color distortion. Furthermore, following the serial Unet++ blocks, an attention mechanism is introduced to pay different attention to haze regions of different concentrations by learning weights in the spatial and channel domains. Experiments are conducted on two representative datasets: the large-scale synthetic dataset RESIDE and the small-scale real-world datasets I-HAZY and O-HAZY. The experimental results show that the proposed dehazing network is not only comparable to state-of-the-art methods on the RESIDE synthetic dataset but also surpasses them by a very large margin on the I-HAZY and O-HAZY real-world datasets.
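A rough Keras sketch of the attention stage that follows the serial Unet++ blocks: channel weights come from global pooling and spatial weights from a 7x7 convolution over channel-pooled maps. This is a CBAM-style layout assumed for illustration; the paper's exact design may differ.

```python
# Hypothetical sketch: channel + spatial attention over dehazing features.
import tensorflow as tf
from tensorflow.keras import layers

def channel_spatial_attention(x, reduction=8):
    c = x.shape[-1]
    # Channel attention: squeeze spatially, re-weight each feature channel.
    w = layers.GlobalAveragePooling2D()(x)
    w = layers.Dense(c // reduction, activation="relu")(w)
    w = layers.Dense(c, activation="sigmoid")(w)
    x = layers.Multiply()([x, layers.Reshape((1, 1, c))(w)])
    # Spatial attention: pool across channels, re-weight each pixel.
    avg = layers.Lambda(lambda t: tf.reduce_mean(t, axis=-1, keepdims=True))(x)
    mx = layers.Lambda(lambda t: tf.reduce_max(t, axis=-1, keepdims=True))(x)
    s = layers.Conv2D(1, 7, padding="same", activation="sigmoid")(
        layers.Concatenate()([avg, mx]))
    return layers.Multiply()([x, s])
```

Weighting haze-dense regions more heavily in this way is what lets a single network handle unevenly distributed haze.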


Author(s):  
Shaolei Wang ◽  
Zhongyuan Wang ◽  
Wanxiang Che ◽  
Sendong Zhao ◽  
Ting Liu

Spoken language is fundamentally different from written language in that it contains frequent disfluencies, i.e., parts of an utterance that are corrected by the speaker. Disfluency detection (removing these disfluencies) is desirable to clean the input for use in downstream NLP tasks. Most existing approaches to disfluency detection rely heavily on human-annotated data, which is scarce and expensive to obtain in practice. To tackle this training data bottleneck, we investigate methods for combining self-supervised learning and active learning for disfluency detection. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled data, and we propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noisy words and (ii) sentence classification to distinguish original sentences from grammatically incorrect sentences. We then combine these two tasks to jointly pre-train a neural network, which is subsequently fine-tuned on human-annotated disfluency detection data. The self-supervised learning method can capture task-specific knowledge for disfluency detection and achieves better performance when fine-tuning on a small annotated dataset than other supervised methods. However, because the pseudo training data are generated with simple heuristics and cannot fully cover all disfluency patterns, a performance gap remains compared to supervised models trained on the full training dataset. We further explore how to bridge this gap by integrating active learning into the fine-tuning process. Active learning strives to reduce annotation costs by choosing the most critical examples to label, and it can address the weakness of self-supervised learning with a small annotated dataset. We show that by combining self-supervised learning with active learning, our model matches state-of-the-art performance with only about 10% of the original training data on both the commonly used English Switchboard test set and a set of in-house annotated Chinese data.
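A minimal sketch of the pseudo-data construction described above: words are randomly inserted from a vocabulary or deleted, yielding token-level tags for the detection task and a sentence-level label for the classification task. The corruption rates and uniform sampling are illustrative assumptions.

```python
# Hypothetical sketch: generate pseudo training data for both pre-training tasks.
import random

def make_pseudo_example(tokens, vocab, add_rate=0.15, del_rate=0.15, rng=random):
    noisy, tags = [], []
    for tok in tokens:
        if rng.random() < add_rate:
            noisy.append(rng.choice(vocab))  # inserted word -> tag as added noise
            tags.append("ADD")
        if rng.random() < del_rate:
            continue                          # deletion: the word is simply dropped
        noisy.append(tok)
        tags.append("KEEP")
    is_corrupted = int(noisy != tokens)       # label for the sentence-level task
    return noisy, tags, is_corrupted
```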


2019 ◽  
Vol 10 (15) ◽  
pp. 4129-4140 ◽  
Author(s):  
Kyle Mills ◽  
Kevin Ryczko ◽  
Iryna Luchak ◽  
Adam Domurad ◽  
Chris Beeler ◽  
...  

We present a physically motivated topology of a deep neural network that can efficiently infer extensive parameters (such as energy, entropy, or number of particles) of arbitrarily large systems, doing so with O(N) scaling.
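A rough Keras sketch of the extensive-topology idea: one shared subnetwork scores each local tile of the system, and the output is the sum of the tile contributions, so both cost and output scale linearly with system size. Tile size and layer widths are assumptions for illustration.

```python
# Hypothetical sketch: extensivity by summing shared per-tile contributions.
import tensorflow as tf
from tensorflow.keras import layers, Model

tile = layers.Input(shape=(16, 16, 1))               # one local patch of the system
h = layers.Conv2D(16, 3, activation="relu", padding="same")(tile)
h = layers.Flatten()(h)
contribution = layers.Dense(1)(h)                    # local share of the extensive quantity
subnet = Model(tile, contribution)

tiles = layers.Input(shape=(None, 16, 16, 1))        # variable number of tiles
per_tile = layers.TimeDistributed(subnet)(tiles)     # same weights applied to every tile
total = layers.Lambda(lambda t: tf.reduce_sum(t, axis=1))(per_tile)
model = Model(tiles, total)
model.compile(optimizer="adam", loss="mse")
```

Because the subnetwork weights are shared, a model trained on small systems can be evaluated on arbitrarily large ones by feeding it more tiles.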


2012 ◽  
Vol 542-543 ◽  
pp. 1398-1402
Author(s):  
Guo Zhong Cheng ◽  
Wei Feng ◽  
Fang Song Cui ◽  
Shi Lu Zhang

This study improves the neural network algorithm presented by J. J. Hopfield for solving the travelling salesman problem (TSP) and obtains an effective algorithm with time complexity O(n²), which allows TSP instances of more than 500 cities to be solved quickly on a microcomputer. The algorithm is based on the replacement function of the V value. The improved algorithm greatly reduces the time and space complexity of the Hopfield method. TSP examples show that the proposed algorithm efficiently finds a satisfactory solution and converges quickly.
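A loose NumPy sketch of the classic Hopfield-Tank relaxation that the paper starts from: an n-by-n neuron matrix V (city by tour position) evolves under gradient dynamics on a constraint-plus-tour-length energy. The penalty weights are conventional illustrative values, the decay term is omitted, and the paper's V-value replacement scheme is not reproduced here.

```python
# Hypothetical sketch: one update step of the Hopfield-Tank network for TSP.
import numpy as np

def hopfield_tsp_step(U, D, A=500.0, B=500.0, C=200.0, Dw=500.0, dt=1e-5):
    """U: n x n neuron inputs; D: n x n symmetric city-distance matrix."""
    n = U.shape[0]
    V = 0.5 * (1 + np.tanh(U / 0.02))           # neuron outputs V[x, i] in [0, 1]
    row = V.sum(axis=1, keepdims=True) - V      # row penalty: one position per city
    col = V.sum(axis=0, keepdims=True) - V      # column penalty: one city per position
    tour = D @ (np.roll(V, -1, axis=1) + np.roll(V, 1, axis=1))  # tour-length term
    dU = -(A * row + B * col + C * (V.sum() - n) + Dw * tour)
    return U + dt * dU                          # one step of the gradient dynamics
```

Iterating this update from small random U until V saturates yields a permutation-like matrix that encodes the tour.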


2020 ◽  
Vol 2020 ◽  
pp. 1-13 ◽  
Author(s):  
Jordan Ott ◽  
Mike Pritchard ◽  
Natalie Best ◽  
Erik Linstead ◽  
Milan Curcic ◽  
...  

Implementing artificial neural networks is commonly achieved via high-level programming languages such as Python and easy-to-use deep learning libraries such as Keras. These software libraries come preloaded with a variety of network architectures, provide autodifferentiation, and support GPUs for fast and efficient computation. As a result, deep learning practitioners favor training neural network models in Python, where these tools are readily available. However, many large-scale scientific computing projects are written in Fortran, which makes it difficult for them to integrate modern deep learning methods. To alleviate this problem, we introduce a software library, the Fortran-Keras Bridge (FKB). This two-way bridge connects environments where deep learning resources are plentiful with those where they are scarce. The paper describes several unique features offered by FKB, such as customizable layers, loss functions, and network ensembles. The paper concludes with a case study that applies FKB to address open questions about the robustness of an experimental approach to global climate simulation, in which subgrid physics are outsourced to deep neural network emulators. In this context, FKB enables more than one hundred candidate models of subgrid cloud and radiation physics from a hyperparameter search, initially implemented in Keras, to be transferred to and used in Fortran. This process allows the models' emergent behavior to be assessed, i.e., what happens when fit imperfections are coupled to explicit planetary-scale fluid dynamics. The results reveal a previously unrecognized strong relationship between offline validation error and online performance, in which the choice of optimizer proves unexpectedly critical. This in turn reveals many new neural network architectures that yield considerable improvements in climate model stability, including some with reduced error, for an especially challenging training dataset.
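A minimal sketch of the Keras side of an FKB-style workflow: build a dense network in Python, save it to HDF5, and convert it for the Fortran side. The network shape and the converter invocation in the trailing comment are assumptions to be checked against the FKB release, not verified API.

```python
# Hypothetical sketch: train a dense emulator in Keras, then hand it to Fortran.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(94,)),            # assumed input dimension of the emulator
    layers.Dense(128, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(65),                     # assumed output dimension
])
model.compile(optimizer="adam", loss="mse")
# ... model.fit(...) on the subgrid-physics training data ...
model.save("subgrid_emulator.h5")

# Then, roughly (FKB ships a weight converter; exact script name and flags
# may differ from this assumed form):
#   python convert_weights.py --weights_file subgrid_emulator.h5 \
#                             --output_file subgrid_emulator.txt
# and load subgrid_emulator.txt from Fortran via FKB's network type.
```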

