Parsing of Urban Facades from 3D Point Clouds Based on a Novel Multi-View Domain

2021 ◽  
Vol 87 (4) ◽  
pp. 283-293
Author(s):  
Wei Wang ◽  
Yuan Xu ◽  
Yingchao Ren ◽  
Gang Wang

Recent performance improvements in facade parsing from 3D point clouds have been achieved by designing ever more complex network structures, which consume large amounts of computing resources and do not take full advantage of prior knowledge about facade structure. Instead, from the perspective of data distribution, we construct a new hierarchical mesh multi-view data domain based on the characteristics of facade objects to fuse deep-learning models with prior knowledge, thereby significantly improving segmentation accuracy. We comprehensively evaluate current mainstream methods on the RueMonge 2014 data set and demonstrate the superiority of our approach: the mean intersection-over-union (mIoU) on the facade-parsing task reaches 76.41%, which is 2.75% higher than the previous best result. In addition, comparative experiments further analyze the reasons for the performance improvement of the proposed method.
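Both this abstract and several below report segmentation quality as mean intersection-over-union. For reference, a minimal sketch of how per-class IoU and its mean are typically computed from predicted and ground-truth label arrays; the class count and random labels are illustrative assumptions:

```python
import numpy as np

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    """Mean intersection-over-union across classes present in the data."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return float(np.mean(ious))

# Illustrative example: 7 facade classes (window, wall, door, ...)
pred = np.random.randint(0, 7, size=10000)
gt = np.random.randint(0, 7, size=10000)
print(f"mIoU = {mean_iou(pred, gt, num_classes=7):.4f}")
```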

Author(s):  
Shengbo Liu ◽  
Pengyuan Fu ◽  
Lei Yan ◽  
Jian Wu ◽  
Yandong Zhao

Deep learning classification based on 3D point clouds has gained considerable research interest in recent years. The classification and quantitative analysis of wood defects are of great significance to the wood-processing industry. To address the slow processing and low robustness of 3D data, this paper proposes an improvement to the lightweight LittlePoint CNN deep learning network by adding a batch normalization (BN) layer, and tests it on a data set we built ourselves. The new network, BN-LittlePoint CNN, improves both speed and recognition rate: the recognition accuracy for non-defect logs and defect logs, as well as for loose knots and dead knots, reaches 95.6%. Finally, the "dead knot" and "loose knot" defects are quantitatively analyzed based on the "integral" idea, obtaining the volume and surface area of each defect with an error of no more than 1.5%, and the defect surface is reconstructed based on the triangulation idea.
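A hedged sketch of the "integral" idea described above: estimating a defect's volume by slicing its point cloud along one axis and summing cross-sectional areas. The slice thickness, the convex-hull area proxy, and the synthetic cylinder are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np
from scipy.spatial import ConvexHull

def defect_volume(points: np.ndarray, slice_height: float = 0.5) -> float:
    """Approximate volume by integrating cross-sectional slice areas along z."""
    z_min, z_max = points[:, 2].min(), points[:, 2].max()
    volume = 0.0
    for z in np.arange(z_min, z_max, slice_height):
        sl = points[(points[:, 2] >= z) & (points[:, 2] < z + slice_height)]
        if len(sl) >= 3:
            area = ConvexHull(sl[:, :2]).volume  # a 2D hull's "volume" is its area
            volume += area * slice_height
    return volume

# Illustrative synthetic defect: points filling a rough cylinder of radius 2, height 10
rng = np.random.default_rng(0)
r, theta, z = rng.uniform(0, 2, 5000), rng.uniform(0, 2 * np.pi, 5000), rng.uniform(0, 10, 5000)
pts = np.column_stack([r * np.cos(theta), r * np.sin(theta), z])
print(f"estimated volume: {defect_volume(pts):.1f} (analytic: {np.pi * 4 * 10:.1f})")
```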


2021 ◽  
Vol 13 (14) ◽  
pp. 2822
Author(s):  
Zhe Lin ◽  
Wenxuan Guo

An accurate stand count is a prerequisite to determining the emergence rate, assessing seedling vigor, and facilitating site-specific management for optimal crop production. Traditional manual counting methods in stand assessment are labor intensive and time consuming for large-scale breeding programs or production field operations. This study aimed to apply two deep learning models, MobileNet and CenterNet, to detect and count cotton plants at the seedling stage in unmanned aerial system (UAS) images. These models were trained with two datasets containing 400 and 900 images with variations in plant size and soil background brightness. The performance of these models was assessed with two testing datasets of different dimensions: testing dataset 1 with 300 by 400 pixels and testing dataset 2 with 250 by 1200 pixels. The model validation results showed that the mean average precision (mAP) and average recall (AR) were 79% and 73% for the CenterNet model, and 86% and 72% for the MobileNet model, with 900 training images. The accuracy of cotton plant detection and counting was higher with testing dataset 1 for both the CenterNet and MobileNet models. The results showed that the CenterNet model had the better overall performance for cotton plant detection and counting with 900 training images. The results also indicated that more training images are required when applying object detection models to images whose dimensions differ from those of the training datasets. The mean absolute percentage error (MAPE), coefficient of determination (R2), and root mean squared error (RMSE) of the cotton plant counts were 0.07%, 0.98, and 0.37, respectively, with testing dataset 1 for the CenterNet model trained with 900 images. Both the MobileNet and CenterNet models have the potential to detect and count cotton plants accurately and in a timely manner from high-resolution UAS images at the seedling stage. This study provides valuable information for selecting the right deep learning tools and the appropriate number of training images for object detection projects in agricultural applications.
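A brief sketch of the counting-accuracy metrics reported above (MAPE, R2, RMSE), computed from per-image predicted and observed plant counts; the count arrays are illustrative assumptions:

```python
import numpy as np

def count_metrics(y_true: np.ndarray, y_pred: np.ndarray):
    """MAPE (%), coefficient of determination R^2, and RMSE for stand counts."""
    mape = 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return mape, r2, rmse

# Illustrative per-image ground-truth and detected counts
y_true = np.array([52, 48, 61, 55, 49], dtype=float)
y_pred = np.array([52, 47, 61, 56, 49], dtype=float)
print("MAPE=%.2f%%  R2=%.3f  RMSE=%.3f" % count_metrics(y_true, y_pred))
```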


Sensors ◽  
2020 ◽  
Vol 20 (12) ◽  
pp. 3568 ◽  
Author(s):  
Takayuki Shinohara ◽  
Haoyi Xiu ◽  
Masashi Matsuoka

In the computer vision field, many 3D deep learning models that directly process 3D point clouds (proposed after PointNet) have been published. Moreover, deep learning-based techniques have demonstrated state-of-the-art performance on supervised learning tasks for 3D point cloud data, such as classification and segmentation tasks on open competition datasets. Furthermore, many researchers have attempted to apply these deep learning-based techniques to 3D point clouds observed by aerial laser scanners (ALSs). However, most of these studies were developed for 3D point clouds without radiometric information. In this paper, we investigate the possibility of using a deep learning method to solve the semantic segmentation task for airborne full-waveform light detection and ranging (lidar) data, which consists of geometric information and radiometric waveform data. Thus, we propose a data-driven semantic segmentation model called the full-waveform network (FWNet), which handles full-waveform lidar data without any conversion process, such as projection onto a 2D grid or calculation of handcrafted features. Our FWNet is based on a PointNet-based architecture, which extracts the local and global features of each input waveform along with its corresponding geographical coordinates. Subsequently, a classifier consisting of 1D convolutional layers predicts the class vector corresponding to the input waveform from the extracted local and global features. Our trained FWNet achieved higher recall, precision, and F1 scores on unseen test data than previously proposed methods in the full-waveform lidar data analysis domain: a mean recall of 0.73, a mean precision of 0.81, and a mean F1 score of 0.76. We further performed an ablation study, assessing the contribution of each component of our proposed method to the above-mentioned metrics. Moreover, we investigated the effectiveness of our PointNet-based local and global feature extraction by visualizing the feature vectors. In this way, we show that our network for local and global feature extraction enables training for semantic segmentation without requiring expert knowledge of full-waveform lidar data or translation into 2D images or voxels.
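A minimal PyTorch sketch of the PointNet-style pattern the abstract describes: shared 1x1 convolutions extract per-point (local) features, a max-pool yields a global feature, and their concatenation feeds a 1D convolutional classifier. The layer widths, waveform channel count, and class count are illustrative assumptions, not the published FWNet configuration:

```python
import torch
import torch.nn as nn

class FWNetSketch(nn.Module):
    def __init__(self, in_channels: int = 163, num_classes: int = 6):
        super().__init__()
        # Shared per-point MLP implemented as 1x1 convolutions (local features)
        self.local = nn.Sequential(
            nn.Conv1d(in_channels, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
        )
        # Classifier over concatenated local + global features
        self.head = nn.Sequential(
            nn.Conv1d(128 + 128, 128, 1), nn.ReLU(),
            nn.Conv1d(128, num_classes, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, num_points); channels = xyz + waveform samples
        local = self.local(x)                          # (B, 128, N) per-point features
        glob = local.max(dim=2, keepdim=True).values   # (B, 128, 1) global feature
        glob = glob.expand(-1, -1, local.size(2))      # broadcast to every point
        return self.head(torch.cat([local, glob], 1))  # (B, num_classes, N)

# Illustrative forward pass: 8 clouds of 1024 points, 3 coords + 160 waveform bins
logits = FWNetSketch()(torch.randn(8, 163, 1024))
print(logits.shape)  # torch.Size([8, 6, 1024])
```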


2020 ◽  
Vol 9 (9) ◽  
pp. 535
Author(s):  
Francesca Matrone ◽  
Eleonora Grilli ◽  
Massimo Martini ◽  
Marina Paolanti ◽  
Roberto Pierdicca ◽  
...  

In recent years, semantic segmentation of 3D point clouds has been a topic involving many different fields of application. Cultural heritage scenarios have become the subject of such studies mainly thanks to the development of photogrammetry and laser scanning techniques. Classification algorithms based on machine and deep learning methods make it possible to process huge amounts of data such as 3D point clouds. In this context, the aim of this paper is to compare machine and deep learning methods for large-scale 3D cultural heritage classification. Then, considering the best performance of each technique, it proposes an architecture named DGCNN-Mod+3Dfeat that combines the positive aspects and advantages of the two methodologies for semantic segmentation of cultural heritage point clouds. To demonstrate the validity of our idea, several experiments on the ArCH benchmark are reported and commented on.
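A hedged sketch of the general idea behind feature-augmented models like DGCNN-Mod+3Dfeat: enriching each point's coordinates with handcrafted geometric and radiometric features before feeding a learned network. The specific features (normal z-component, verticality, relative height, color) are illustrative assumptions, not the paper's exact feature set:

```python
import numpy as np

def augment_points(xyz: np.ndarray, normals: np.ndarray, rgb: np.ndarray) -> np.ndarray:
    """Concatenate handcrafted per-point features onto raw coordinates.

    The enriched (N, 3 + k) array becomes the input tensor of the deep network,
    so the learned model starts from geometry-aware descriptors.
    """
    verticality = 1.0 - np.abs(normals[:, 2:3])  # 0 = horizontal, 1 = vertical surface
    height = xyz[:, 2:3] - xyz[:, 2].min()       # height above the lowest point
    return np.hstack([xyz, normals[:, 2:3], verticality, height, rgb / 255.0])

# Illustrative cloud of 4096 points with unit normals and 8-bit colors
rng = np.random.default_rng(1)
xyz = rng.uniform(0, 10, (4096, 3))
normals = rng.normal(size=(4096, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
rgb = rng.integers(0, 256, (4096, 3)).astype(float)
print(augment_points(xyz, normals, rgb).shape)  # (4096, 9)
```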


Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6187
Author(s):  
Milena F. Pinto ◽  
Aurelio G. Melo ◽  
Leonardo M. Honório ◽  
André L. M. Marcato ◽  
André G. S. Conceição ◽  
...  

Three-dimensional (3D) point clouds, usually generated by photogrammetry or laser scanning, are a common resource in structural inspection. However, a significant drawback for complete inspection is the presence of covering vegetation, which hides possible structural problems and makes it difficult to acquire proper object surfaces for a reliable diagnostic. Therefore, this research's main contribution is an effective vegetation-removal methodology based on a deep learning structure capable of identifying and extracting covering vegetation from 3D point clouds. The proposed approach uses pre- and post-processing filtering stages that take advantage of colored point clouds when they are available, or operate independently otherwise. The results showed high classification accuracy and good effectiveness compared with similar methods in the literature. After the classification step, if color is available, a color filter is applied, further enhancing the results. The results are also analyzed against real Structure from Motion (SfM) reconstruction data, which further validates the proposed method. This research also presents a colored point cloud library of bushes, built for this work and usable by other studies in the field.
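A minimal sketch of the post-classification filtering described above: dropping points labeled as vegetation and, when color is available, additionally removing strongly green points via an excess-green index. The label convention, ExG formula, and threshold are illustrative assumptions:

```python
import numpy as np

VEGETATION = 1  # assumed class id assigned by the point-wise classifier

def remove_vegetation(xyz, labels, rgb=None, exg_thresh=0.1):
    """Keep points not classified as vegetation; optionally refine with a color filter."""
    keep = labels != VEGETATION
    if rgb is not None:
        r, g, b = (rgb / 255.0).T
        exg = 2 * g - r - b          # excess-green index: high for green vegetation
        keep &= exg < exg_thresh     # also drop strongly green points the net missed
    return xyz[keep]

# Illustrative data: 1000 points with random labels and colors
rng = np.random.default_rng(2)
xyz = rng.uniform(0, 5, (1000, 3))
labels = rng.integers(0, 2, 1000)
rgb = rng.integers(0, 256, (1000, 3)).astype(float)
print(f"{len(remove_vegetation(xyz, labels, rgb))} of 1000 points kept")
```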


Author(s):  
A. Georgopoulos ◽  
C. Oikonomou ◽  
E. Adamopoulos ◽  
E. K. Stathopoulou

Large-scale mapping of limited areas, especially cultural heritage sites, is a critical task. Optical and non-optical sensors, such as LiDAR units, have been developed to sizes and weights that unmanned aerial platforms can lift. At the same time, there is growing emphasis on solutions that give users access to 3D information faster and more cheaply. Considering the multitude of platforms and cameras, the advancement of algorithms, and the increase in available computing power, this challenge should be, and indeed is, further investigated. In this paper a short review of today's UAS technologies is attempted, followed by a discussion of their applicability and advantages, which depend on their widely varying specifications. The available on-board cameras are also compared and evaluated for large-scale mapping. Furthermore, a thorough analysis, review, and experimental comparison of different software implementations of Structure from Motion and Multiple View Stereo algorithms, able to process such dense and mostly unordered sequences of digital images, is conducted and presented. As a test data set, we use a rich optical and thermal data set from both fixed-wing and multi-rotor platforms over an archaeological excavation with adverse height variations, captured with different cameras. Dense 3D point clouds, digital terrain models and orthophotos have been produced and evaluated for their radiometric as well as metric qualities.


2021 ◽  
Vol 11 ◽  
Author(s):  
Guotao Yin ◽  
Ziyang Wang ◽  
Yingchao Song ◽  
Xiaofeng Li ◽  
Yiwen Chen ◽  
...  

Objective: The purpose of this study was to develop a deep learning-based system to automatically predict epidermal growth factor receptor (EGFR)-mutant lung adenocarcinoma in 18F-fluorodeoxyglucose (FDG) positron emission tomography/computed tomography (PET/CT).

Methods: Three hundred and one lung adenocarcinoma patients with known EGFR mutation status were enrolled in this study. Two deep learning models (SECT and SEPET) were developed with the Squeeze-and-Excitation Residual Network (SE-ResNet) module for the prediction of EGFR mutation from CT and PET images, respectively. The deep learning models were trained with a training data set of 198 patients and tested with a testing data set of 103 patients. Stacked generalization was used to integrate the results of SECT and SEPET.

Results: The AUCs of SECT and SEPET were 0.72 (95% CI, 0.62–0.80) and 0.74 (95% CI, 0.65–0.82) in the testing data set, respectively. After integrating SECT and SEPET with stacked generalization, the AUC further improved to 0.84 (95% CI, 0.75–0.90), significantly higher than SECT alone (p<0.05).

Conclusion: The stacking model based on 18F-FDG PET/CT images can predict the EGFR mutation status of patients with lung adenocarcinoma automatically and non-invasively. The proposed model shows potential to help clinicians identify advanced lung adenocarcinoma patients suitable for EGFR-targeted therapy.
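A hedged sketch of stacked generalization as used above: the probability outputs of two base models (stand-ins for SECT and SEPET here) become features for a logistic-regression meta-learner. The synthetic probabilities and the choice of meta-learner are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
y = rng.integers(0, 2, 198)  # training labels (mutant vs. wild type)
# Stand-ins for SECT/SEPET probabilities: weakly informative, label-correlated noise
p_ct = np.clip(0.5 + 0.20 * (y - 0.5) + rng.normal(0, 0.25, 198), 0, 1)
p_pet = np.clip(0.5 + 0.25 * (y - 0.5) + rng.normal(0, 0.25, 198), 0, 1)

# Meta-learner stacks the two base-model outputs.
# NB: in a real stacking setup these meta-features should be out-of-fold
# predictions to avoid information leakage.
features = np.column_stack([p_ct, p_pet])
stack = LogisticRegression().fit(features, y)
p_stack = stack.predict_proba(features)[:, 1]

for name, p in [("CT", p_ct), ("PET", p_pet), ("stacked", p_stack)]:
    print(f"{name:>7s} AUC = {roc_auc_score(y, p):.3f}")
```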


2021 ◽  
Author(s):  
Zhifei Hu

In this paper, a sentiment analysis model fusing a bi-directional GRU, Attention, and Capsule networks (BI-GRU+Attention+Capsule) is designed and implemented for the sentiment analysis task on the open film review data set IMDB. It is compared with six deep learning models: LSTM, CNN, GRU, BI-GRU, CNN+GRU, and GRU+CNN. The experimental results show that the accuracy of the BI-GRU model combined with Attention and Capsule is higher than that of the other six models; the accuracy of the GRU+CNN model is higher than that of the CNN+GRU model, which is in turn higher than that of the CNN model; and the accuracy of the CNN model is in turn higher than that of the LSTM, BI-GRU, and GRU models. The BI-GRU+Attention+Capsule fusion model adopted in this paper has the highest accuracy of all the models. In conclusion, the BI-GRU+Attention+Capsule fusion model effectively improves the accuracy of text sentiment classification.
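A minimal PyTorch sketch of the BI-GRU-with-attention backbone described above (the capsule head is omitted; vocabulary size, embedding width, and hidden size are illustrative assumptions):

```python
import torch
import torch.nn as nn

class BiGRUAttention(nn.Module):
    def __init__(self, vocab=20000, emb=128, hidden=64, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.gru = nn.GRU(emb, hidden, bidirectional=True, batch_first=True)
        self.attn = nn.Linear(2 * hidden, 1)   # scores each timestep
        self.fc = nn.Linear(2 * hidden, classes)

    def forward(self, tokens):                  # tokens: (batch, seq_len)
        h, _ = self.gru(self.embed(tokens))     # (B, T, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights over timesteps
        context = (w * h).sum(dim=1)            # weighted sum -> sentence vector
        return self.fc(context)

# Illustrative batch: 4 reviews of 50 token ids each
logits = BiGRUAttention()(torch.randint(0, 20000, (4, 50)))
print(logits.shape)  # torch.Size([4, 2])
```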


Author(s):  
L. Markelin ◽  
E. Honkavaara ◽  
R. Näsi ◽  
N. Viljanen ◽  
T. Rosnell ◽  
...  

Novel miniaturized multi- and hyperspectral imaging sensors on board unmanned aerial vehicles have recently shown great potential in various environmental monitoring and measuring tasks, such as precision agriculture and forest management. These systems can be used to collect dense 3D point clouds and spectral information over small areas such as single forest stands or sample plots. Accurate radiometric processing and atmospheric correction are required when data sets from different dates and sensors, collected in varying illumination conditions, are combined. The performance of a novel radiometric block adjustment method, developed at the Finnish Geospatial Research Institute, is evaluated with a multitemporal hyperspectral data set of seedling stands collected during spring and summer 2016. Illumination conditions during the campaigns varied from bright to overcast. We use two different methods to produce homogeneous image mosaics and hyperspectral point clouds: image-wise relative correction, and image-wise relative correction with BRDF. Radiometric datasets are converted to reflectance using reference panels, and changes in the reflectance spectra are analysed. The tested methods improved image mosaic homogeneity by 5% to 25%. The results show that the evaluated method can produce consistent reflectance mosaics and consistent reflectance spectrum shapes across different areas and dates.
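A hedged sketch of the reference-panel reflectance conversion mentioned above: the empirical line method fits a linear map from image digital numbers (DNs) to known panel reflectances, per band. The panel values and DNs are illustrative assumptions, not the study's calibration data:

```python
import numpy as np

def empirical_line(panel_dn, panel_refl, image_dn):
    """Fit DN -> reflectance as refl = a*DN + b from reference panels (one band)."""
    a, b = np.polyfit(panel_dn, panel_refl, deg=1)
    return a * image_dn + b

# Illustrative single-band calibration: dark (10%), mid (25%), bright (50%) panels
panel_dn = np.array([820.0, 2050.0, 4110.0])  # mean image DNs over each panel
panel_refl = np.array([0.10, 0.25, 0.50])     # lab-measured panel reflectances
image_dn = np.array([1500.0, 3000.0, 900.0])  # arbitrary scene pixels
print(empirical_line(panel_dn, panel_refl, image_dn))
```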


SOIL ◽  
2020 ◽  
Vol 6 (2) ◽  
pp. 565-578
Author(s):  
Wartini Ng ◽  
Budiman Minasny ◽  
Wanderson de Sousa Mendes ◽  
José Alexandre Melo Demattê

Abstract. The number of samples used in the calibration data set affects the quality of predictive models generated with visible, near- and shortwave-infrared (VIS–NIR–SWIR) spectroscopy for soil attributes. Recently, the convolutional neural network (CNN) has been regarded as a highly accurate model for predicting soil properties on large databases. However, it has not yet been ascertained how large the sample size should be for the CNN model to be effective. This paper investigates the effect of training sample size on the accuracy of deep learning and machine learning models. It aims to estimate how many calibration samples are needed for the CNN to improve on conventional machine learning models when predicting soil properties. In addition, this paper also looks at a way to interpret CNN models, which are commonly labelled as a black box. It is hypothesised that the performance of the machine learning models will increase with the number of training samples but plateau beyond a certain number, while the performance of the CNN will keep improving. The performance of two machine learning models (partial least squares regression, PLSR; Cubist) is compared against the CNN model. A VIS–NIR–SWIR spectral library from Brazil, containing 4251 unique sites with an average of two to three samples per depth (12 044 samples in total), was divided into calibration (3188 sites) and validation (1063 sites) sets. Subsets of the calibration data set were then created to represent smaller calibration sets of 125, 300, 500, 1000, 1500, 2000, 2500 and 2700 unique sites, equivalent to sample sizes of approximately 350, 840, 1400, 2800, 4200, 5600, 7000 and 7650. All three models (PLSR, Cubist and CNN) were generated for each sample size of unique sites for the prediction of five soil properties: cation exchange capacity, organic carbon, sand, silt and clay content. The subset sampling and modelling were repeated 10 times to better represent the model performances. Learning curves showed that accuracy increased with the number of training samples. At lower sample counts (< 1000), PLSR and Cubist performed better than the CNN. The performance of the CNN surpassed that of the PLSR and Cubist models at sample sizes of 1500 and 1800, respectively. Deep learning can therefore be recommended as most efficient for spectral modelling at sample sizes above 2000. The accuracy of the PLSR and Cubist models appears to plateau above sample sizes of 4200 and 5000, respectively, while the accuracy of the CNN had not yet plateaued. A sensitivity analysis of the CNN model demonstrated its ability to identify the important wavelength regions that affected the predictions of the various soil attributes.
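A minimal sketch of the learning-curve protocol described above: repeatedly subsample the calibration set at increasing sizes, fit a model on each subset, and score it on a fixed validation set. PLSR stands in for all three models, and the synthetic spectra are illustrative assumptions:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(4)
X = rng.normal(size=(4000, 200))                       # stand-in spectra (200 bands)
y = X[:, :10].sum(axis=1) + rng.normal(0, 0.5, 4000)   # stand-in soil property
X_cal, y_cal, X_val, y_val = X[:3000], y[:3000], X[3000:], y[3000:]

for n in [125, 300, 500, 1000, 1500, 2000, 2500]:
    scores = []
    for _ in range(10):  # 10 repeats per sample size, as in the study's protocol
        idx = rng.choice(len(X_cal), size=n, replace=False)
        model = PLSRegression(n_components=10).fit(X_cal[idx], y_cal[idx])
        scores.append(r2_score(y_val, model.predict(X_val)))
    print(f"n={n:4d}  mean R2={np.mean(scores):.3f}")
```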

