A Deep Learning Approach on Building Detection from Unmanned Aerial Vehicle-Based Images in Riverbank Monitoring

Sensors ◽  
2018 ◽  
Vol 18 (11) ◽  
pp. 3921 ◽  
Author(s):  
Wuttichai Boonpook ◽  
Yumin Tan ◽  
Yinghua Ye ◽  
Peerapong Torteeka ◽  
Kritanai Torsri ◽  
...  

Buildings along riverbanks are likely to be affected by rising water levels, so acquiring accurate building information is of great importance not only for riverbank environmental protection but also for dealing with emergencies such as flooding. UAV-based photographs are flexible and cloud-free compared to satellite images and can provide very high-resolution imagery down to the centimeter level, yet quickly and accurately detecting and extracting buildings from UAV images remains challenging because such images usually contain many fine details and distortions. In this paper, a deep learning (DL)-based approach is proposed to extract building information more accurately, in which the SegNet network architecture is used for semantic segmentation after training on a fully labeled UAV image dataset covering varied urban settlement appearances along a riverbank area in Chongqing. The experimental results show excellent performance in detecting buildings at untrained locations, with an average overall accuracy of more than 90%. To verify the generality and advantage of the proposed method, the procedure is further evaluated by training and testing on two other open standard datasets with a variety of building patterns and styles; the final overall accuracies of building extraction exceed 93% and 95%, respectively.
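The overall-accuracy figure reported above is a simple pixel-level agreement rate between the predicted and ground-truth masks. A minimal sketch of how it can be computed, using tiny hypothetical masks rather than the paper's data:

```python
import numpy as np

def overall_accuracy(pred, truth):
    """Fraction of pixels whose predicted class matches the ground truth."""
    pred = np.asarray(pred)
    truth = np.asarray(truth)
    return float((pred == truth).mean())

# Tiny 3x3 example: 1 = building, 0 = background.
truth = np.array([[1, 1, 0],
                  [1, 0, 0],
                  [0, 0, 0]])
pred  = np.array([[1, 1, 0],
                  [1, 1, 0],
                  [0, 0, 0]])
acc = overall_accuracy(pred, truth)  # 8 of 9 pixels agree
```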

Author(s):  
S. Ham ◽  
Y. Oh ◽  
K. Choi ◽  
I. Lee

Detecting unregistered buildings from aerial images is an important task for urban management, such as inspecting illegal buildings in green belts or updating GIS databases. Moreover, the data acquisition platform of photogrammetry is evolving from manned aircraft to UAVs (Unmanned Aerial Vehicles). However, it is very costly and time-consuming to detect unregistered buildings from UAV images, since the interpretation of aerial images still relies on manual effort. To overcome this problem, we propose a system that automatically detects unregistered buildings from UAV images based on deep learning methods. Specifically, we train a deconvolutional network with publicly available geospatial data, semantically segment a given UAV image into a building probability map, and compare the building map with existing GIS data. Through this procedure, we can detect unregistered buildings from UAV images automatically and efficiently. We expect that the proposed system can be applied to various urban management tasks such as monitoring illegal buildings or illegal land-use change.
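The comparison step described above, between the network's building probability map and the existing GIS data, can be sketched as a mask difference. The arrays and threshold below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def unregistered_mask(building_prob, registered, threshold=0.5):
    """Pixels the network considers building but the GIS layer does not.

    building_prob : per-pixel building probability from the segmentation net
    registered    : binary mask rasterised from the existing GIS database
    """
    detected = np.asarray(building_prob) >= threshold
    return detected & ~np.asarray(registered, dtype=bool)

prob = np.array([[0.9, 0.8, 0.1],
                 [0.7, 0.2, 0.1],
                 [0.1, 0.1, 0.1]])
gis = np.array([[1, 0, 0],
                [0, 0, 0],
                [0, 0, 0]])
flagged = unregistered_mask(prob, gis)  # True only where detected but not registered
```

In practice the flagged pixels would then be grouped into candidate building footprints for manual review.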


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4442
Author(s):  
Zijie Niu ◽  
Juntao Deng ◽  
Xu Zhang ◽  
Jun Zhang ◽  
Shijia Pan ◽  
...  

It is important to obtain accurate information about kiwifruit vines to monitor their physiological states and undertake precise orchard operations. However, because vines are small, cling to trellises, and have branches lying on the ground, numerous challenges exist in acquiring accurate data on kiwifruit vines. In this paper, a kiwifruit canopy distribution prediction model is proposed on the basis of low-altitude unmanned aerial vehicle (UAV) images and deep learning techniques. First, the locations of the kiwifruit plants and the vine distribution are extracted from high-precision images collected by UAV. Canopy gradient distribution maps with different noise reduction and distribution effects are generated by modifying the threshold and sampling size using the resampling normalization method. The results showed that the accuracies of vine segmentation using PSPNet, support vector machine, and random forest classification were 71.2%, 85.8%, and 75.26%, respectively. However, the segmentation image obtained using deep semantic segmentation had a higher signal-to-noise ratio and was closer to the real situation. The average intersection over union of the deep semantic segmentation was greater than or equal to 80% in the distribution maps, whereas in traditional machine learning the average intersection over union was between 20% and 60%. This indicates that the proposed model can quickly extract the vine distribution and plant positions, and is thus able to perform dynamic monitoring of orchards to provide real-time operation guidance.
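Intersection over union, the metric used above to compare the segmentation methods, is computed per class and averaged to give the mean IoU. A minimal numpy sketch on hypothetical masks:

```python
import numpy as np

def class_iou(pred, truth, cls):
    """Intersection over union for one class label."""
    p = np.asarray(pred) == cls
    t = np.asarray(truth) == cls
    union = (p | t).sum()
    return float((p & t).sum() / union) if union else 1.0

def mean_iou(pred, truth, classes):
    """Average the per-class IoU over the given class labels."""
    return sum(class_iou(pred, truth, c) for c in classes) / len(classes)

# Tiny 2x3 example: 1 = vine, 0 = background.
truth = np.array([[1, 1, 0],
                  [1, 0, 0]])
pred  = np.array([[1, 0, 0],
                  [1, 0, 0]])
miou = mean_iou(pred, truth, classes=[0, 1])  # mean of 3/4 and 2/3
```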


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 8072
Author(s):  
Yu-Bang Chang ◽  
Chieh Tsai ◽  
Chang-Hong Lin ◽  
Poki Chen

As autonomous driving techniques become increasingly valued and widespread, real-time semantic segmentation has become very popular and challenging in the field of deep learning and computer vision in recent years. However, in order to deploy deep learning models on the edge devices that accompany sensors on vehicles, we need to design a structure with the best trade-off between accuracy and inference time. In previous works, several methods sacrificed accuracy to obtain a faster inference time, while others aimed to find the best accuracy under the constraint of real-time operation. Nevertheless, the accuracies of previous real-time semantic segmentation methods still have a large gap compared to general semantic segmentation methods. As a result, we propose a network architecture based on a dual encoder and a self-attention mechanism. Compared with preceding works, we achieved a 78.6% mIoU at a speed of 39.4 FPS at 1024 × 2048 resolution on a Cityscapes test submission.
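Inference speed in FPS, as reported above, is typically measured by timing repeated forward passes after a few warm-up calls. A minimal timing harness; `dummy_infer` is a stand-in for a real model's forward pass, not the authors' network:

```python
import time

def measure_fps(infer, frame, warmup=3, runs=20):
    """Average frames per second of `infer` over `runs` timed calls."""
    for _ in range(warmup):          # warm-up calls are excluded from timing
        infer(frame)
    start = time.perf_counter()
    for _ in range(runs):
        infer(frame)
    elapsed = time.perf_counter() - start
    return runs / elapsed

# Stand-in for a segmentation model's forward pass.
def dummy_infer(frame):
    return [v * 2 for v in frame]

fps = measure_fps(dummy_infer, frame=[0.0] * 1000)
```

With a GPU model, synchronisation (e.g. waiting for the device to finish) would also be needed before stopping the clock.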


Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6540
Author(s):  
Qian Pan ◽  
Maofang Gao ◽  
Pingbo Wu ◽  
Jingwen Yan ◽  
Shilei Li

Yellow rust is a widespread disease that causes great damage to wheat. The traditional method of manually identifying wheat yellow rust is very inefficient. To improve this situation, this study proposed a deep-learning-based method for identifying wheat yellow rust from unmanned aerial vehicle (UAV) images. The method was based on the pyramid scene parsing network (PSPNet) semantic segmentation model to classify healthy wheat, yellow rust wheat, and bare soil in small-scale UAV images, and to investigate the spatial generalization of the model. In addition, it was proposed to use the high-accuracy classification results of traditional algorithms as weak samples for wheat yellow rust identification. The recognition accuracy of the PSPNet model in this study reached 98%. On this basis, the trained semantic segmentation model was used to recognize another wheat field. The results showed that the method had a certain generalization ability, with an accuracy of 98%. In addition, the high-accuracy classification result of a support vector machine was used as a weak label under weak supervision, which better solved the labeling problem of large-size images; the final recognition accuracy reached 94%. Therefore, the method of the present study facilitates timely control measures to reduce economic losses.
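The weak-supervision step above keeps only the high-confidence outputs of a conventional classifier as pseudo-labels for the segmentation network. A minimal sketch, assuming normalised per-class scores are already available; the score array, margin, and class order are illustrative assumptions, not the study's values:

```python
import numpy as np

UNLABELLED = -1  # pixels left out of the weak label map

def weak_labels(scores, margin=0.8):
    """Keep only confident classifier outputs as pseudo-labels.

    scores : (H, W, C) per-class scores from a conventional classifier
             (e.g. an SVM decision function), assumed normalised to [0, 1].
    Pixels whose top score is below `margin` stay unlabelled.
    """
    scores = np.asarray(scores)
    labels = scores.argmax(axis=-1)
    confident = scores.max(axis=-1) >= margin
    return np.where(confident, labels, UNLABELLED)

# 2x2 image, 3 classes: healthy wheat (0), yellow rust (1), bare soil (2).
scores = np.array([[[0.9, 0.05, 0.05], [0.4, 0.35, 0.25]],
                   [[0.1, 0.85, 0.05], [0.1, 0.05, 0.85]]])
labels = weak_labels(scores)  # ambiguous pixel (0, 1) stays unlabelled
```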


2021 ◽  
Vol 12 ◽  
Author(s):  
Prabhakar Maheswari ◽  
Purushothaman Raja ◽  
Orly Enrique Apolo-Apolo ◽  
Manuel Pérez-Ruiz

Smart farming employs intelligent systems in every domain of agriculture to achieve sustainable economic growth with the available resources using advanced technologies. Deep Learning (DL) is a sophisticated artificial neural network architecture that provides state-of-the-art results in smart farming applications. One of the main tasks in this domain is yield estimation. Manual yield estimation faces many hurdles, such as labor intensity, long processing times, and imprecise results. These issues motivate the development of an intelligent fruit yield estimation system that offers more benefits to farmers in deciding on harvesting, marketing, and so on. Semantic segmentation combined with DL delivers promising results in fruit detection and localization by performing pixel-based prediction. This paper reviews the literature employing various techniques for fruit yield estimation using DL-based semantic segmentation architectures. It also discusses the challenging issues that occur during intelligent fruit yield estimation, such as sampling, collection, annotation and data augmentation, fruit detection, and counting. Results show that fruit yield estimation employing DL-based semantic segmentation techniques performs better than earlier techniques because of the human-like cognition incorporated into the architecture. Future directions, such as customizing DL architectures for smartphone applications to predict yield and developing more comprehensive models encompassing challenging situations like occlusion, overlapping, and illumination variation, are also discussed.
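Counting fruit after segmentation, one of the steps discussed above, often reduces to counting connected foreground regions in the predicted mask. A minimal 4-connectivity sketch on a hypothetical mask:

```python
from collections import deque

import numpy as np

def count_blobs(mask):
    """Count 4-connected foreground regions in a binary fruit mask."""
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros_like(mask)
    count = 0
    h, w = mask.shape
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                count += 1                      # new region found
                queue = deque([(i, j)])
                seen[i, j] = True
                while queue:                    # BFS flood fill over neighbours
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
    return count

mask = np.array([[1, 1, 0, 0],
                 [0, 0, 0, 1],
                 [1, 0, 0, 1]])
n = count_blobs(mask)  # three separate fruit regions
```

Real pipelines usually add a minimum-area filter and handle touching fruit with instance segmentation rather than plain connectivity.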


2021 ◽  
Vol 11 (20) ◽  
pp. 9691
Author(s):  
Nur Atirah Muhadi ◽  
Ahmad Fikri Abdullah ◽  
Siti Khairunniza Bejo ◽  
Muhammad Razif Mahadi ◽  
Ana Mijic

The interest in visual-based surveillance systems, especially in natural disaster applications such as flood detection and monitoring, has increased due to the blooming of surveillance technology. In this work, semantic segmentation based on convolutional neural networks (CNN) was proposed to identify water regions in surveillance images. This work presented two well-established deep learning algorithms, the DeepLabv3+ and SegNet networks, and evaluated their performance using several evaluation metrics. Overall, both networks attained high accuracy when compared to the measurement data, but the DeepLabv3+ network performed better than the SegNet network, achieving over 90% for the overall accuracy and IoU metrics and around 80% for the boundary F1 score (BF score). When predicting new images using both trained networks, the results show that both networks successfully distinguished water regions from the background, but the outputs from DeepLabv3+ were more accurate than the results from the SegNet network. Therefore, the DeepLabv3+ network was used for practical application on a set of images captured on five consecutive days in the study area. The segmentation result and water level markers extracted from light detection and ranging (LiDAR) data were overlaid to estimate river water levels and observe water fluctuation. River water levels were predicted based on the elevation of the predefined markers. The proposed water level framework was evaluated using Spearman’s rank-order correlation coefficient. The correlation coefficient was 0.91, which indicates a strong relationship between the estimated and observed water levels. Based on these findings, it can be concluded that the proposed approach has high potential as an alternative monitoring system that offers water region information and water level estimation for flood management and related activities.
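Spearman's rank-order correlation coefficient, used above to evaluate the water-level framework, compares the ranks of the two series rather than their raw values. A minimal no-ties implementation on hypothetical water levels (not the study's measurements):

```python
def spearman_rho(x, y):
    """Spearman's rank-order correlation, no-ties formula:
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(x)
    rank = lambda v: [sorted(v).index(e) + 1 for e in v]  # 1-based ranks
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank(x), rank(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical estimated vs. observed water levels (metres).
estimated = [1.2, 1.5, 1.4, 1.8, 2.0]
observed  = [1.1, 1.3, 1.6, 1.9, 2.1]
rho = spearman_rho(estimated, observed)  # one swapped rank pair -> 0.9
```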


2021 ◽  
Author(s):  
Martin Mundt

Deep learning with neural networks seems to have largely replaced traditional design of computer vision systems. Automated methods to learn a plethora of parameters are now used in favor of previously practiced selection of explicit mathematical operators for a specific task. The entailed promise is that practitioners no longer need to take care of every individual step, but rather focus on gathering large amounts of data for neural network training. As a consequence, both a shift in mindset towards a focus on big datasets, as well as a wave of conceivable applications based exclusively on deep learning can be observed. This PhD dissertation aims to uncover some of the only implicitly mentioned or overlooked deep learning aspects, highlight unmentioned assumptions, and finally introduce methods to address respective immediate weaknesses. In the author’s humble opinion, these prevalent shortcomings can be tied to the fact that the involved steps in the machine learning workflow are frequently decoupled. Success is predominantly measured based on accuracy measures designed for evaluation with static benchmark test sets. Individual machine learning workflow components are assessed in isolation with respect to available data, choice of neural network architecture, and a particular learning algorithm, rather than viewing the machine learning system as a whole in the context of a particular application. Correspondingly, in this dissertation, three key challenges have been identified: 1. Choice and flexibility of a neural network architecture. 2. Identification and rejection of unseen unknown data to avoid false predictions. 3. Continual learning without forgetting of already learned information. These challenges have already been crucial topics in older literature, but, alas, they seem to require a renaissance in modern deep learning literature.
Initially, it may appear that they pose independent research questions, however, the thesis posits that the aspects are intertwined and require a joint perspective in machine learning based systems. In summary, the essential question is thus how to pick a suitable neural network architecture for a specific task, how to recognize which data inputs belong to this context, which ones originate from potential other tasks, and ultimately how to continuously include such identified novel data in neural network training over time without overwriting existing knowledge. Thus, the central emphasis of this dissertation is to build on top of existing deep learning strengths, yet also acknowledge mentioned weaknesses, in an effort to establish a deeper understanding of interdependencies and synergies towards the development of unified solution mechanisms. For this purpose, the main portion of the thesis is in cumulative form. The respective publications can be grouped according to the three challenges outlined above. Correspondingly, chapter 1 is focused on choice and extendability of neural network architectures, analyzed in context of popular image classification tasks. An algorithm to automatically determine neural network layer width is introduced and is first contrasted with static architectures found in the literature. The importance of neural architecture design is then further showcased on a real-world application of defect detection in concrete bridges. Chapter 2 is comprised of the complementary ensuing questions of how to identify unknown concepts and subsequently incorporate them into continual learning. A joint central mechanism to distinguish unseen concepts from what is known in classification tasks, while enabling consecutive training without forgetting or revisiting older classes, is proposed. Once more, the role of the chosen neural network architecture is quantitatively reassessed. 
Finally, chapter 3 culminates in an overarching view, where developed parts are connected. Here, an extensive survey further serves the purpose to embed the gained insights in the broader literature landscape and emphasizes the importance of a common frame of thought. The ultimately presented approach thus reflects the overall thesis’ contribution to advance neural network based machine learning towards a unified solution that ties together choice of neural architecture with the ability to learn continually and the capability to automatically separate known from unknown data.


Author(s):  
D. M. Huang ◽  
H. R. Zhao ◽  
Y. Yao

Abstract. Buildings, where most human activities happen, are among the most important objects in remote sensing images. Extracting building information is of great significance for conducting research related to sustainable development. The extracted building information is a fundamental data source for further research, including evaluating people’s living conditions, monitoring building conditions, predicting disaster risks, and so on. In recent years, convolutional neural networks have been widely employed in building detection and have made significant progress. However, in these automatic detection procedures, the critical brightness information is often neglected, with all buildings simply classified into the same category. To make building detection more efficient and precise, we propose a simple yet efficient multitask method employing several lightness detectors, each of which is dedicated to building detection in a specific brightness interval. Experimental results show that the building detection accuracy can be improved by 8.1% with the assistance of the additional lightness information.
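Routing each image to the detector for its brightness interval, as in the multitask method above, can be sketched by thresholding mean brightness. The interval bounds below are illustrative assumptions, not the authors' values:

```python
import numpy as np

def pick_detector(image, bounds=(85, 170)):
    """Return the index of the brightness interval an image falls into.

    `bounds` split the 0-255 range into dark / medium / bright; each
    interval is assumed to have its own dedicated building detector.
    """
    mean = float(np.mean(image))
    for idx, upper in enumerate(bounds):
        if mean < upper:
            return idx
    return len(bounds)

dark   = np.full((4, 4), 30)    # hypothetical shadowed patch
bright = np.full((4, 4), 220)   # hypothetical sunlit patch
dark_idx, bright_idx = pick_detector(dark), pick_detector(bright)
```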


Author(s):  
C. Najjaj ◽  
H. Rhinane ◽  
A. Hilali

Abstract. Researchers in computer vision and machine learning are becoming increasingly interested in image semantic segmentation. Many methods based on convolutional neural networks (CNNs) have been proposed and have made considerable progress in the building extraction task, but other methods can still produce suboptimal segmentation outcomes. To extract buildings with high precision, we propose a model that recognizes all the buildings and presents them as a mask, with buildings in white and the other classes in black. This network, which is based on U-Net, boosts the model’s sensitivity. This paper provides a deep learning approach for building detection on satellite imagery applied to the city of Casablanca. First, we describe the terminology of this field. Next, we present the main dataset used in this project, which consists of 1000 satellite images. Then, we train the U-Net model for 25 epochs on the training and validation datasets and test the pretrained model on unseen satellite images. Finally, the experimental results show that the proposed model performs well, producing a binary mask that extracts all the buildings in the Casablanca region with high accuracy and completeness, achieving an average F1 score of 0.91 on the test data.
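The F1 score reported above combines the pixel-wise precision and recall of the predicted building mask against the ground truth. A minimal sketch on hypothetical masks:

```python
import numpy as np

def mask_f1(pred, truth):
    """Pixel-wise F1 score of a binary building mask:
    F1 = 2*TP / (2*TP + FP + FN)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = (pred & truth).sum()    # building pixels correctly predicted
    fp = (pred & ~truth).sum()   # background predicted as building
    fn = (~pred & truth).sum()   # building pixels missed
    return float(2 * tp / (2 * tp + fp + fn)) if tp else 0.0

truth = np.array([[1, 1, 0],
                  [1, 0, 0]])
pred  = np.array([[1, 1, 0],
                  [0, 1, 0]])
f1 = mask_f1(pred, truth)  # 2 TP, 1 FP, 1 FN
```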


2021 ◽  
Vol 12 (25) ◽  
pp. 85
Author(s):  
Giacomo Patrucco ◽  
Francesco Setragno

<p class="VARAbstract">Digitisation processes of movable heritage are becoming increasingly popular to document the artworks stored in our museums. A growing number of strategies for the three-dimensional (3D) acquisition and modelling of these invaluable assets have been developed in the last few years. Their objective is to respond efficiently to this documentation need and contribute to deepening the knowledge of the masterpieces constantly investigated by researchers operating in many fields. Nowadays, one of the most effective solutions is represented by the development of image-based techniques, usually connected to a Structure-from-Motion (SfM) photogrammetric approach. However, while image acquisition is relatively rapid, the processes connected to data processing are very time-consuming and require the operator’s substantial manual involvement. Developing deep learning-based strategies can be an effective way to enhance the level of automatism. In this research, carried out in the framework of the digitisation of a wooden maquette collection stored in the ‘Museo Egizio di Torino’ using a photogrammetric approach, an automatic masking strategy using deep learning techniques is proposed to increase the level of automatism and therefore optimise the photogrammetric pipeline. Starting from a manually annotated dataset, a neural network was trained to automatically perform a semantic classification to isolate the maquettes from the background. The proposed methodology allowed the researchers to obtain automatically segmented masks with a high degree of accuracy. The workflow is described (as regards acquisition strategies, dataset processing, and neural network training). In addition, the accuracy of the results is evaluated and discussed. 
Finally, the researchers proposed the possibility of performing a multiclass segmentation on the digital images to recognise different object categories in the images, as well as to define a semantic hierarchy to perform automatic classification of different elements in the acquired images.</p><p><strong>Highlights:</strong></p><ul><li><p>In the framework of movable heritage digitisation processes, many procedures are very time-consuming, and they still require the operator’s substantial manual involvement.</p></li><li><p>This research proposes using deep learning techniques to enhance the automatism level in the generation of exclusion masks, improving the optimisation of the photogrammetric procedures.</p></li><li><p>Following this strategy, the possibility of performing a multiclass semantic segmentation (on the 2D images and, consequently, on the 3D point cloud) is also discussed, considering the accuracy of the obtainable results.</p></li></ul>
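Applying a predicted exclusion mask, as in the masking strategy above, amounts to zeroing out background pixels so the photogrammetric pipeline ignores them. A minimal sketch; the array shapes and values are illustrative, not the project's data:

```python
import numpy as np

def apply_exclusion_mask(image, mask):
    """Zero out background pixels of an RGB image.

    image : (H, W, 3) array; mask : (H, W) binary array, 1 = object.
    """
    image = np.asarray(image)
    mask = np.asarray(mask, dtype=bool)
    return image * mask[..., None]  # broadcast mask over the colour channels

img = np.full((2, 2, 3), 200, dtype=np.uint8)  # hypothetical uniform image
mask = np.array([[1, 0],
                 [0, 1]])
masked = apply_exclusion_mask(img, mask)  # background pixels become black
```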

