Boosting-GNN: Boosting Algorithm for Graph Networks on Imbalanced Node Classification

Frontiers in Neurorobotics ◽

10.3389/fnbot.2021.775688 ◽

2021 ◽

Vol 15 ◽

Author(s):

Shuhao Shi ◽

Kai Qiao ◽

Shuai Yang ◽

Linyuan Wang ◽

Jian Chen ◽

...

Keyword(s):

Computational Cost ◽

Data Representation ◽

Imbalanced Datasets ◽

Convolutional Network ◽

Resampling Methods ◽

Convolutional Networks ◽

Multi Scale ◽

Training Samples ◽

Higher Weights ◽

Boosting Algorithm

Graph neural networks (GNNs) have been widely used for graph data representation. However, existing research considers only ideal balanced datasets and rarely addresses imbalanced ones. Traditional methods for handling imbalanced datasets, such as resampling, reweighting, and synthetic sample generation, are no longer applicable to GNNs. This study proposes an ensemble model called Boosting-GNN, which uses GNNs as the base classifiers during boosting. In Boosting-GNN, higher weights are set for the training samples that are not correctly classified by the previous classifiers, thus achieving higher classification accuracy and better reliability. In addition, transfer learning is used to reduce computational cost and increase fitting ability. Experimental results indicate that the proposed Boosting-GNN model achieves better performance than graph convolutional network (GCN), GraphSAGE, graph attention network (GAT), simplifying graph convolutional networks (SGC), multi-scale graph convolution networks (N-GCN), and the most advanced reweighting and resampling methods on synthetic imbalanced datasets, with an average performance improvement of 4.5%.
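
The reweighting idea described in this abstract (higher weights for samples the previous classifiers got wrong) can be illustrated with a generic multi-class AdaBoost (SAMME) update. This is a minimal sketch under stated assumptions, not the procedure from the paper: the function name `update_sample_weights` is hypothetical, the predictions are made-up toy values, and the GNN base classifier is left out entirely.

```python
import math

def update_sample_weights(weights, y_true, y_pred, n_classes):
    """One SAMME (multi-class AdaBoost) round: return (new_weights, alpha).

    Hypothetical helper for illustration; y_pred stands in for the output
    of whatever base classifier was trained in this round.
    """
    total = sum(weights)
    # Weighted error rate of the current base classifier.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    # Clamp so that a perfect or useless classifier does not cause log(0).
    err = min(max(err, 1e-10), 1 - 1e-10)
    # Classifier weight; positive whenever err < 1 - 1/n_classes.
    alpha = math.log((1 - err) / err) + math.log(n_classes - 1)
    # Up-weight misclassified samples, keep the others, then renormalize.
    new_w = [w * math.exp(alpha) if t != p else w
             for w, t, p in zip(weights, y_true, y_pred)]
    norm = sum(new_w)
    return [w / norm for w in new_w], alpha

# Toy data: four training nodes, three classes, last node misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = [0, 0, 1, 2]
y_pred = [0, 0, 1, 0]
new_weights, alpha = update_sample_weights(weights, y_true, y_pred, n_classes=3)
```

After the update, the misclassified node carries most of the total weight, so the next base classifier in the ensemble is pushed to fit it. In an imbalanced setting, minority-class nodes tend to be the ones misclassified, which is why this kind of reweighting helps them.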


Shallow Graph Convolutional Network for Skeleton-Based Action Recognition

Sensors ◽

10.3390/s21020452 ◽

2021 ◽

Vol 21 (2) ◽

pp. 452

Author(s):

Wenjie Yang ◽

Jianlin Zhang ◽

Jingju Cai ◽

Zhiyong Xu

Keyword(s):

Action Recognition ◽

State Of The Art ◽

Computational Cost ◽

Receptive Fields ◽

Recognition Task ◽

Convolutional Network ◽

Convolutional Networks ◽

Spatial Graph ◽

Graph Size ◽

Skeleton Graph

Graph convolutional networks (GCNs) have brought considerable improvement to the skeleton-based action recognition task. Existing GCN-based methods usually use a fixed spatial graph size across all layers. This severely limits the model's ability to exploit global and semantic discriminative information because the receptive fields are limited. Furthermore, a fixed graph size causes many redundancies in the representation of actions, which is inefficient for the model. The redundancies can also hinder the model from focusing on beneficial features. To address these issues, we propose a plug-and-play channel adaptive merging module (CAMM) specific to the human skeleton graph, which can merge the vertices from the same part of the skeleton graph adaptively and efficiently. The merge weights differ across channels, so every channel has the flexibility to integrate the joints in its own way. We then build a novel shallow graph convolutional network (SGCN) based on the module, which achieves state-of-the-art performance with less computational cost. Experimental results on NTU-RGB+D and Kinetics-Skeleton illustrate the superiority of our methods.


Flood Detection in Gaofen-3 SAR Images via Fully Convolutional Networks

Sensors ◽

10.3390/s18092915 ◽

2018 ◽

Vol 18 (9) ◽

pp. 2915 ◽

Cited By ~ 11

Author(s):

Wenchao Kang ◽

Yuming Xiang ◽

Feng Wang ◽

Ling Wan ◽

Hongjian You

Keyword(s):

Detection Method ◽

State Of The Art ◽

Sar Images ◽

Convolutional Network ◽

Training Time ◽

Convolutional Networks ◽

Fully Convolutional Networks ◽

Training Samples ◽

Flood Detection ◽

Fine Tune

Emergency flood monitoring and rescue first require detecting flooded areas. This paper presents a fast, novel flood detection method and applies it to Gaofen-3 SAR images. A fully convolutional network (FCN), a variant of VGG16, is used for flood mapping. Given the requirements of flood detection, we fine-tune the model to obtain more accurate results with shorter training time and fewer training samples. Compared with state-of-the-art methods, our proposed algorithm not only gives robust and accurate detection results but also significantly reduces the detection time.


Partial Least Squares: A Deep Space Odyssey

10.5753/ctd.2021.15753 ◽

2021 ◽

Author(s):

Artur Jordão Lima Correia ◽

William Robson Schwartz

Keyword(s):

Least Squares ◽

Partial Least Squares ◽

High Performance ◽

Computational Cost ◽

Low Complexity ◽

Data Representation ◽

Large Datasets ◽

Compact Representation ◽

Convolutional Networks ◽

Memory Constraints

Modern visual pattern recognition models are based on deep convolutional networks. Such models are computationally expensive, hindering applicability on resource-constrained devices. To handle this problem, we propose three strategies. The first removes unimportant structures (neurons or layers) of convolutional networks, reducing their computational cost. The second inserts structures to design architectures automatically, enabling us to build high-performance networks. The third combines multiple layers of convolutional networks, enhancing data representation at negligible additional cost. These strategies are based on Partial Least Squares (PLS) which, despite promising results, is infeasible on large datasets due to memory constraints. To address this issue, we also propose a discriminative and low-complexity incremental PLS that learns a compact representation of the data using a single sample at a time, thus enabling applicability on large datasets.


Lithology identification from well log curves via neural networks with additional geological constraint

Geophysics ◽

10.1190/geo2020-0676.1 ◽

2021 ◽

pp. 1-77

Author(s):

Chunbi Jiang ◽

Dongxiao Zhang ◽

Shifeng Chen

Keyword(s):

Neural Networks ◽

Identification Problem ◽

Classification Problem ◽

Well Log ◽

Convolutional Network ◽

Stratigraphic Unit ◽

Convolutional Networks ◽

Multi Scale ◽

The North ◽

Lithology Identification

We propose a machine learning framework to solve the lithology classification problem from well log curves by incorporating an additional geological constraint. The constraint is a stratigraphic unit, and we use it as an additional feature. This method demonstrates the possibility of solving the lithology identification problem from a multi-scale data source because stratigraphic unit information can be obtained through tying well logs to seismic data. Our experiments show that adding the geological constraint improves the performance of models significantly. Currently, most researchers use their own well log curves to solve the lithology classification problem. The well log data used in our experiment, which are from the North Sea area, are publicly available, and thus future studies can continue to utilize them to perform further comparisons. We evaluated different types of recurrent neural networks, i.e., bidirectional long short-term memory (Bi-LSTM), bidirectional gated recurrent unit (Bi-GRU), and GRU-based encoder-decoder architecture with attention (ABi-GRU); one-dimensional convolutional networks, i.e., temporal convolutional network (TCN) and multi-scale residual network (MsRNet); and multi-layer perceptron (MLP) on the task. Our experiments revealed that the overall performance of RNN-based networks is better and more consistent. Since our experiments are based on one single dataset, additional experiments are required in the future to better elucidate how each network works on the lithofacies classification problem.


Efficient End-to-End Sentence-Level Lipreading with Temporal Convolutional Networks

Applied Sciences ◽

10.3390/app11156975 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6975

Author(s):

Tao Zhang ◽

Lun He ◽

Xudong Li ◽

Guoqing Feng

Keyword(s):

Performance Improvement ◽

State Of The Art ◽

Error Rates ◽

Convolutional Network ◽

Convolutional Networks ◽

Sentence Level ◽

End To End ◽

High Level ◽

Improved Accuracy ◽

Talking Face

Lipreading aims to recognize sentences being spoken by a talking face. In recent years, lipreading methods have achieved a high level of accuracy on large datasets and made breakthrough progress. However, lipreading is still far from being solved: existing methods tend to have high error rates on in-the-wild data and suffer from vanishing gradients during training and slow convergence. To overcome these problems, we propose an efficient end-to-end sentence-level lipreading model, using an encoder based on a 3D convolutional network, ResNet50, and a Temporal Convolutional Network (TCN), with a CTC objective function as the decoder. More importantly, the proposed architecture incorporates TCN as a feature learner to decode features. It can partly eliminate the vanishing gradients and insufficient performance of RNNs (LSTM, GRU), which yields notable performance improvement as well as faster convergence. Experiments show that training and convergence are 50% faster than the state-of-the-art method and that accuracy improves by 2.4% on the GRID dataset.
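
For readers unfamiliar with TCNs, the core building block is usually a causal, dilated 1D convolution: each output depends only on the current and earlier time steps, and the dilation widens the receptive field without a recurrent state. The sketch below shows that generic operation for a single channel. It is an illustration under stated assumptions, not the architecture from the paper; the function name `causal_dilated_conv1d` and the toy input are hypothetical.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Positions before the start of the sequence are treated as zeros,
    so the output has the same length as the input and never looks ahead.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that fall before the sequence start
                acc += w * x[idx]
        out.append(acc)
    return out

# Toy single-channel sequence; a two-tap kernel with dilation 2
# combines each step with the step two positions earlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = causal_dilated_conv1d(x, kernel=[1.0, 1.0], dilation=2)
```

Because every time step is computed independently of the others, such layers can be evaluated in parallel over the sequence, which is one commonly cited reason TCNs train faster than LSTM or GRU layers.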


Semantic Relation Model and Dataset for Remote Sensing Scene Understanding

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10070488 ◽

2021 ◽

Vol 10 (7) ◽

pp. 488

Author(s):

Peng Li ◽

Dezheng Zhang ◽

Aziguli Wulamu ◽

Xin Liu ◽

Peng Chen

Keyword(s):

Remote Sensing ◽

Scene Understanding ◽

Deep Understanding ◽

Remote Sensing Images ◽

Convolutional Network ◽

Scene Graph ◽

Multi Scale ◽

Relationship Extraction ◽

High Level ◽

Graph Generation

A deep understanding of our visual world is more than an isolated perception of a series of objects; the relationships between them also contain rich semantic information. Especially in satellite remote sensing images, the span is so large that objects vary widely in size and have complex spatial compositions. Therefore, recognizing semantic relations is conducive to strengthening the understanding of remote sensing scenes. In this paper, we propose a novel multi-scale semantic fusion network (MSFN). In this framework, dilated convolution is introduced into a graph convolutional network (GCN) based on an attentional mechanism to fuse and refine multi-scale semantic context, which is crucial to strengthening the cognitive ability of our model. Besides, based on the mapping between visual features and semantic embeddings, we design a sparse relationship extraction module to remove meaningless connections among entities and improve the efficiency of scene graph generation. Meanwhile, to further promote research on scene understanding in the remote sensing field, this paper also proposes a remote sensing scene graph dataset (RSSGD). We carry out extensive experiments, and the results show that our model significantly outperforms previous methods on scene graph generation. In addition, RSSGD effectively bridges the huge semantic gap between low-level perception and high-level cognition of remote sensing images.


Data-Informed Decomposition for Localized Uncertainty Quantification of Dynamical Systems

Vibration ◽

10.3390/vibration4010004 ◽

2020 ◽

Vol 4 (1) ◽

pp. 49-63

Author(s):

Waad Subber ◽

Sayan Ghosh ◽

Piyush Pandita ◽

Yiming Zhang ◽

Liping Wang

Keyword(s):

Dynamical Systems ◽

Computational Cost ◽

Three Dimensional ◽

Region Of Interest ◽

System Response ◽

Computational Domain ◽

User Interest ◽

Numerical Resolution ◽

Multi Scale ◽

Operation Conditions

Industrial dynamical systems often exhibit multi-scale responses due to material heterogeneity and complex operation conditions. The smallest length-scale of the system dynamics controls the numerical resolution required to resolve the embedded physics. In practice, however, high numerical resolution is only required in a confined region of the domain where fast dynamics or localized material variability is exhibited, whereas a coarser discretization can be sufficient in the remaining majority of the domain. Partitioning the complex dynamical system into smaller, easier-to-solve problems based on the localized dynamics and material variability can reduce the overall computational cost. The region of interest can be specified based on the localized features of the solution, user interest, and correlation length of the material properties. For problems where a region of interest is not evident, Bayesian inference can provide a feasible solution. In this work, we employ a Bayesian framework to update the prior knowledge of the localized region of interest using measurements of the system response. Once the region of interest is identified, the localized uncertainty is propagated forward through the computational domain. We demonstrate our framework using numerical experiments on a three-dimensional elastodynamic problem.
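
The Bayesian step mentioned in this abstract, updating prior knowledge of the region of interest from a measured response, can be pictured with a small discrete Bayes' rule example. This is only a sketch under strong assumptions that are not taken from the paper: a finite set of candidate regions, a single scalar measurement, and Gaussian measurement noise. The function name `posterior_over_regions` and all numbers are hypothetical.

```python
import math

def posterior_over_regions(prior, predicted, measured, noise_std):
    """Discrete Bayes update over candidate regions of interest.

    prior[i]     : prior probability that candidate region i is the right one
    predicted[i] : response the model would predict if region i were right
    measured     : the observed system response (a single scalar here)
    noise_std    : assumed standard deviation of Gaussian measurement noise
    """
    # Gaussian likelihood of the measurement under each candidate
    # (the normalizing constant cancels in the ratio below).
    likelihoods = [math.exp(-0.5 * ((measured - p) / noise_std) ** 2)
                   for p in predicted]
    unnormalized = [pr * lk for pr, lk in zip(prior, likelihoods)]
    evidence = sum(unnormalized)
    return [u / evidence for u in unnormalized]

# Three equally likely candidate regions with made-up predicted responses.
prior = [1 / 3, 1 / 3, 1 / 3]
predicted = [0.2, 0.5, 0.9]
posterior = posterior_over_regions(prior, predicted, measured=0.55, noise_std=0.1)
```

The candidate whose predicted response is closest to the measurement receives the largest posterior probability. A continuous parameterization of the region would replace the finite list with a sampling or quadrature scheme.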


Attention Multi-Scale Network for Automatic Layer Extraction of Ice Radar Topological Sequences

Remote Sensing ◽

10.3390/rs13122425 ◽

2021 ◽

Vol 13 (12) ◽

pp. 2425

Author(s):

Yiheng Cai ◽

Dan Liu ◽

Jin Xie ◽

Jingxian Yang ◽

Xiangbin Cui ◽

...

Keyword(s):

Deep Learning ◽

Global Climate ◽

Ice Sheets ◽

Ice Sheet ◽

Sheet Thickness ◽

Learning Methods ◽

Convolutional Network ◽

Multi Scale ◽

Ice Surface ◽

Layer Extraction

Analyzing the surface and bedrock locations in radar imagery enables the computation of ice sheet thickness, which is important for the study of ice sheets, their volume, and how they may contribute to global climate change. However, traditional handcrafted methods cannot quickly provide quantitative, objective, and reliable extraction of information from radargrams. Most traditional handcrafted methods, designed to detect ice-surface and ice-bed layers from ice sheet radargrams, require complex human involvement and are difficult to apply to large datasets, while deep learning methods can obtain better results in a generalized way. In this study, an end-to-end multi-scale attention network (MsANet) is proposed to realize the estimation and reconstruction of layers in sequences of ice sheet radar tomographic images. First, we use an improved 3D convolutional network, C3D-M, as the backbone; its first fully connected layer is replaced by a convolution unit to better maintain the spatial relativity of ice layer features. Then, an adjustable multi-scale module uses filters of different scales to learn scale information and enhance the feature extraction capabilities of the network. Finally, an attention module extended to 3D space removes a redundant bottleneck unit to better fuse and refine ice layer features. Radar sequential images collected by the Center of Remote Sensing of Ice Sheets in 2014 are used as training and testing data. Compared with state-of-the-art deep learning methods, the MsANet shows a 10% reduction (2.14 pixels) in the average mean absolute column-wise error for detecting the ice-surface and ice-bottom layers, runs faster, and uses approximately 12 million fewer parameters.


A Multi-Scale Feature Extraction-Based Normalized Attention Neural Network for Image Denoising

Electronics ◽

10.3390/electronics10030319 ◽

2021 ◽

Vol 10 (3) ◽

pp. 319

Author(s):

Yi Wang ◽

Xiao Song ◽

Guanghong Gong ◽

Ni Li

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Image Denoising ◽

Color Image ◽

Rapid Development ◽

Similarity Index ◽

Structural Similarity ◽

Convolutional Network ◽

Scale Feature ◽

Multi Scale

With the rapid development of deep learning and artificial intelligence techniques, denoising via neural networks has drawn great attention owing to its flexibility and excellent performance. However, for most convolutional network denoising methods, the convolution kernel is only one layer deep, and features of distinct scales are neglected. Moreover, in the convolution operation, all channels are treated equally; the relationships between channels are not considered. In this paper, we propose a multi-scale feature extraction-based normalized attention neural network (MFENANN) for image denoising. In MFENANN, we define a multi-scale feature extraction block to extract and combine features at distinct scales of the noisy image. In addition, we propose a normalized attention network (NAN) to learn the relationships between channels, which smooths the optimization landscape and speeds up the convergence process for training an attention model. Moreover, we introduce the NAN to convolutional network denoising, in which each channel receives a gain, so channels can play different roles in the subsequent convolution. To verify the effectiveness of the proposed MFENANN, we ran experiments on both grayscale and color image sets with noise levels ranging from 0 to 75. The experimental results show that, compared with some state-of-the-art denoising methods, the images restored by MFENANN have larger peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) values and a better overall appearance.
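
PSNR, one of the two metrics reported above, has a simple closed form: 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value and MSE is the mean squared error between the reference and restored images. The sketch below computes it for flat lists of 8-bit pixel values. The function name `psnr` and the pixel values are hypothetical and are not data from the paper.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as flat sequences of pixel intensities.
    Returns math.inf when the two images are identical.
    """
    if len(reference) != len(restored) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy 4-pixel "image" and a restored version that is off by 5 at every pixel.
clean = [10, 128, 250, 64]
restored = [15, 123, 245, 69]
score = psnr(clean, restored)
```

A higher PSNR means the restored image is closer to the reference in a pixel-wise sense. SSIM complements it by comparing local structure, which is why denoising papers typically report both.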
