Novel Hybrid Neural Network for Dense Depth Estimation using On-Board Monocular Images

Depth information from still 2D images plays an important role in automated driving, driving safety, and robotics. Monocular depth estimation is considered as an ill-posed and inherently ambiguous problem in general, and a tight issue is how to obtain global information efficiently since pure convolutional neural networks (CNNs) merely extract the local information. To end that, some previous works utilized conditional random fields (CRFs) to obtain the global information, but it is notoriously difficult to optimize. In this paper, a novel hybrid neural network is proposed to solve that, and concurrently a dense depth map is predicted from the monocular still image. Specifically: first, the deep residual network is utilized to obtain multi-scale local information and then feature correlation (FCL) blocks are used to correlate these features. Finally, the feature selection attention-based mechanism is adopted to fuse the multi-layer features, and the multi-layer recurrent neural networks (RNNs) are utilized with bidirectional long short-term memory (Bi-LSTM) unit as the output layer. Furthermore, a novel logarithm exponential average error (LEAE) is proposed to overcome over-weighted problem. The multi-scale feature correlation network (MFCN) is evaluated on large-scale KITTI benchmarks (LKT), which is a subset of KITTI raw dataset, and NYU depth v2. The experiments indicate that the proposed unified network outperforms existing methods. This method also updates the state-of-the-art performance on LKT datasets. Importantly, the depth estimation method can be widely used for collision risk assessment and avoidance in driving assistance systems or automated pilot systems to achieve safety in a more economical and convenient way.

Download Full-text

A Study of OWA Operators Learned in Convolutional Neural Networks

Applied Sciences ◽

10.3390/app11167195 ◽

2021 ◽

Vol 11 (16) ◽

pp. 7195

Author(s):

Iris Dominguez-Catena ◽

Daniel Paternain ◽

Mikel Galar

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Practical Method ◽

Local Information ◽

Weighted Averaging ◽

Global Information ◽

Owa Operator ◽

Owa Operators ◽

Ordered Weighted Averaging

Ordered Weighted Averaging (OWA) operators have been integrated in Convolutional Neural Networks (CNNs) for image classification through the OWA layer. This layer lets the CNN integrate global information about the image in the early stages, where most CNN architectures only allow for the exploitation of local information. As a side effect of this integration, the OWA layer becomes a practical method for the determination of OWA operator weights, which is usually a difficult task that complicates the integration of these operators in other fields. In this paper, we explore the weights learned for the OWA operators inside the OWA layer, characterizing them through their basic properties of orness and dispersion. We also compare them to some families of OWA operators, namely the Binomial OWA operator, the Stancu OWA operator and the exponential RIM OWA operator, finding examples that are currently impossible to generalize through these parameterizations.

Download Full-text

Automated vessel segmentation in lung CT and CTA images via deep neural networks

Journal of X-Ray Science and Technology ◽

10.3233/xst-210955 ◽

2021 ◽

pp. 1-15

Author(s):

Wenjun Tan ◽

Luyu Zhou ◽

Xiaoshuo Li ◽

Xiaoyu Yang ◽

Yufei Chen ◽

...

Keyword(s):

Neural Network ◽

Computed Tomography ◽

Neural Networks ◽

Deep Neural Network ◽

Deep Neural Networks ◽

Vessel Segmentation ◽

Multi Scale ◽

Vascular Segmentation ◽

Study Results ◽

Lung Ct

BACKGROUND: The distribution of pulmonary vessels in computed tomography (CT) and computed tomography angiography (CTA) images of lung is important for diagnosing disease, formulating surgical plans and pulmonary research. PURPOSE: Based on the pulmonary vascular segmentation task of International Symposium on Image Computing and Digital Medicine 2020 challenge, this paper reviews 12 different pulmonary vascular segmentation algorithms of lung CT and CTA images and then objectively evaluates and compares their performances. METHODS: First, we present the annotated reference dataset of lung CT and CTA images. A subset of the dataset consisting 7,307 slices for training and 3,888 slices for testing was made available for participants. Second, by analyzing the performance comparison of different convolutional neural networks from 12 different institutions for pulmonary vascular segmentation, the reasons for some defects and improvements are summarized. The models are mainly based on U-Net, Attention, GAN, and multi-scale fusion network. The performance is measured in terms of Dice coefficient, over segmentation ratio and under segmentation rate. Finally, we discuss several proposed methods to improve the pulmonary vessel segmentation results using deep neural networks. RESULTS: By comparing with the annotated ground truth from both lung CT and CTA images, most of 12 deep neural network algorithms do an admirable job in pulmonary vascular extraction and segmentation with the dice coefficients ranging from 0.70 to 0.85. The dice coefficients for the top three algorithms are about 0.80. CONCLUSIONS: Study results show that integrating methods that consider spatial information, fuse multi-scale feature map, or have an excellent post-processing to deep neural network training and optimization process are significant for further improving the accuracy of pulmonary vascular segmentation.

Download Full-text

A multi-scale hybrid neural network retrieval model for dust storm detection, a study in Asia

Atmospheric Research ◽

10.1016/j.atmosres.2015.02.006 ◽

2015 ◽

Vol 158-159 ◽

pp. 89-106 ◽

Cited By ~ 18

Author(s):

Man Sing Wong ◽

Fei Xiao ◽

Janet Nichol ◽

Jimmy Fung ◽

Jhoon Kim ◽

...

Keyword(s):

Neural Network ◽

Dust Storm ◽

Retrieval Model ◽

Hybrid Neural Network ◽

Multi Scale

Download Full-text

Deep Multi-scale Convolutional Neural Network Method for Depth Estimation from a Single Image

2020 Chinese Control And Decision Conference (CCDC) ◽

10.1109/ccdc49329.2020.9164182 ◽

2020 ◽

Author(s):

Zhaowei Ma ◽

Yifeng Niu ◽

Jia Hu

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Depth Estimation ◽

Single Image ◽

Neural Network Method ◽

Multi Scale ◽

Network Method

Download Full-text

An Integration Model for Text Classification using Graph Convolutional Network and BERT

Journal of Physics Conference Series ◽

10.1088/1742-6596/2137/1/012052 ◽

2021 ◽

Vol 2137 (1) ◽

pp. 012052

Author(s):

Bingxin Xue ◽

Cui Zhu ◽

Xuan Wang ◽

Wenjun Zhu

Keyword(s):

Neural Network ◽

Text Classification ◽

Semantic Information ◽

Contextual Information ◽

Local Information ◽

Data Sets ◽

Global Information ◽

Convolutional Network ◽

Integration Model ◽

Classification Tasks

Abstract Recently, Graph Convolutional Neural Network (GCN) is widely used in text classification tasks, and has effectively completed tasks that are considered to have a rich relational structure. However, due to the sparse adjacency matrix constructed by GCN, GCN cannot make full use of context-dependent information in text classification, and cannot capture local information. The Bidirectional Encoder Representation from Transformers (BERT) has been shown to have the ability to capture the contextual information in a sentence or document, but its ability to capture global information about the vocabulary of a language is relatively limited. The latter is the advantage of GCN. Therefore, in this paper, Mutual Graph Convolution Networks (MGCN) is proposed to solve the above problems. It introduces semantic dictionary (WordNet), dependency and BERT. MGCN uses dependency to solve the problem of context dependence and WordNet to obtain more semantic information. Then the local information generated by BERT and the global information generated by GCN are interacted through the attention mechanism, so that they can influence each other and improve the classification effect of the model. The experimental results show that our model is more effective than previous research reports on three text classification data sets.

Download Full-text

Online Capacity Estimation for Lithium-Ion Batteries Based on Semi-Supervised Convolutional Neural Network

World Electric Vehicle Journal ◽

10.3390/wevj12040256 ◽

2021 ◽

Vol 12 (4) ◽

pp. 256

Author(s):

Yi Wu ◽

Wei Li

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Lithium Ion Batteries ◽

Estimation Method ◽

Lithium Ion ◽

Estimation Methods ◽

Estimation Accuracy ◽

Capacity Estimation ◽

Practical Applications

Accurate capacity estimation can ensure the safe and reliable operation of lithium-ion batteries in practical applications. Recently, deep learning-based capacity estimation methods have demonstrated impressive advances. However, such methods suffer from limited labeled data for training, i.e., the capacity ground-truth of lithium-ion batteries. A capacity estimation method is proposed based on a semi-supervised convolutional neural network (SS-CNN). This method can automatically extract features from battery partial-charge information for capacity estimation. Furthermore, a semi-supervised training strategy is developed to take advantage of the extra unlabeled sample, which can improve the generalization of the model and the accuracy of capacity estimation even in the presence of limited labeled data. Compared with artificial neural networks and convolutional neural networks, the proposed method is demonstrated to improve capacity estimation accuracy.

Download Full-text

CrowdTC: Crowd-powered Learning for Text Classification

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3457216 ◽

2021 ◽

Vol 16 (1) ◽

pp. 1-23

Author(s):

Keyu Yang ◽

Yunjun Gao ◽

Lei Liang ◽

Song Bian ◽

Lu Chen ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Text Classification ◽

Deep Neural Networks ◽

Semantic Information ◽

Human Beings ◽

Hybrid Neural Network ◽

Public Datasets ◽

Almost All ◽

The Cost

Text classification is a fundamental task in content analysis. Nowadays, deep learning has demonstrated promising performance in text classification compared with shallow models. However, almost all the existing models do not take advantage of the wisdom of human beings to help text classification. Human beings are more intelligent and capable than machine learning models in terms of understanding and capturing the implicit semantic information from text. In this article, we try to take guidance from human beings to classify text. We propose Crowd-powered learning for Text Classification (CrowdTC for short). We design and post the questions on a crowdsourcing platform to extract keywords in text. Sampling and clustering techniques are utilized to reduce the cost of crowdsourcing. Also, we present an attention-based neural network and a hybrid neural network to incorporate the extracted keywords as human guidance into deep neural networks. Extensive experiments on public datasets confirm that CrowdTC improves the text classification accuracy of neural networks by using the crowd-powered keyword guidance.

Download Full-text

Multi-stream neural network fused with local information and global information for HOI detection

Applied Intelligence ◽

10.1007/s10489-020-01794-1 ◽

2020 ◽

Vol 50 (12) ◽

pp. 4495-4505

Author(s):

Limin Xia ◽

Rui Li

Keyword(s):

Neural Network ◽

Local Information ◽

Global Information

Download Full-text

A Multi-Scale Water Extraction Convolutional Neural Network (MWEN) Method for GaoFen-1 Remote Sensing Images

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi9040189 ◽

2020 ◽

Vol 9 (4) ◽

pp. 189 ◽

Cited By ~ 3

Author(s):

Hongxiang Guo ◽

Guojin He ◽

Wei Jiang ◽

Ranyu Yin ◽

Lei Yan ◽

...

Keyword(s):

Neural Network ◽

Remote Sensing ◽

Neural Networks ◽

Convolutional Neural Network ◽

Water Body ◽

Semantic Segmentation ◽

Water Bodies ◽

Water Extraction ◽

Remote Sensing Images ◽

Multi Scale

Automatic water body extraction method is important for monitoring floods, droughts, and water resources. In this study, a new semantic segmentation convolutional neural network named the multi-scale water extraction convolutional neural network (MWEN) is proposed to automatically extract water bodies from GaoFen-1 (GF-1) remote sensing images. Three convolutional neural networks for semantic segmentation (fully convolutional network (FCN), Unet, and Deeplab V3+) are employed to compare with the water bodies extraction performance of MWEN. Visual comparison and five evaluation metrics are used to evaluate the performance of these convolutional neural networks (CNNs). The results show the following. (1) The results of water body extraction in multiple scenes using the MWEN are better than those of the other comparison methods based on the indicators. (2) The MWEN method has the capability to accurately extract various types of water bodies, such as urban water bodies, open ponds, and plateau lakes. (3) By fusing features extracted at different scales, the MWEN has the capability to extract water bodies with different sizes and suppress noise, such as building shadows and highways. Therefore, MWEN is a robust water extraction algorithm for GaoFen-1 satellite images and has the potential to conduct water body mapping with multisource high-resolution satellite remote sensing data.

Download Full-text

Online supervised attention-based recurrent depth estimation from monocular video

PeerJ Computer Science ◽

10.7717/peerj-cs.317 ◽

2020 ◽

Vol 6 ◽

pp. e317

Author(s):

Dmitrii Maslov ◽

Ilya Makarov

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Short Term Memory ◽

Depth Estimation ◽

Autonomous Driving ◽

Temporal Information ◽

Depth Information ◽

Safe Driving ◽

Monocular Video ◽

Depth Reconstruction

Autonomous driving highly depends on depth information for safe driving. Recently, major improvements have been taken towards improving both supervised and self-supervised methods for depth reconstruction. However, most of the current approaches focus on single frame depth estimation, where quality limit is hard to beat due to limitations of supervised learning of deep neural networks in general. One of the way to improve quality of existing methods is to utilize temporal information from frame sequences. In this paper, we study intelligent ways of integrating recurrent block in common supervised depth estimation pipeline. We propose a novel method, which takes advantage of the convolutional gated recurrent unit (convGRU) and convolutional long short-term memory (convLSTM). We compare use of convGRU and convLSTM blocks and determine the best model for real-time depth estimation task. We carefully study training strategy and provide new deep neural networks architectures for the task of depth estimation from monocular video using information from past frames based on attention mechanism. We demonstrate the efficiency of exploiting temporal information by comparing our best recurrent method with existing image-based and video-based solutions for monocular depth reconstruction.

Download Full-text