STRUCTURALLY ENHANCED INCREMENTAL NEURAL LEARNING FOR IMAGE CLASSIFICATION WITH SUBGRAPH EXTRACTION

2014 ◽  
Vol 24 (07) ◽  
pp. 1450024 ◽  
Author(s):  
YU-BIN YANG ◽  
YA-NAN LI ◽  
YANG GAO ◽  
HUJUN YIN ◽  
YE TANG

In this paper, a structurally enhanced incremental neural learning technique is proposed to learn a discriminative codebook representation of images for effective image classification. To incorporate relationships among visual words, such as their structures and distributions, into the codebook learning process, we develop an online codebook graph learning method based on a novel structurally enhanced incremental learning technique called the "visualization-induced self-organized incremental neural network (ViSOINN)". The hidden structural information in the images is embedded into a graph representation that evolves dynamically under an adaptive, competitive learning mechanism. Image features are then coded by a subgraph extraction process based on the learned codebook graph, and a classifier subsequently completes the image classification task. Compared with other codebook learning algorithms derived from the classical Bag-of-Features (BoF) model, ViSOINN has the following advantages: (1) it learns the codebook efficiently and effectively from a small training set; (2) it models the relationships among visual words in a metric scaling fashion, preserving high discriminative power; (3) it learns the codebook automatically, without a fixed pre-defined size; and (4) it better preserves and enhances the structure of the data. These characteristics improve image classification performance and make the method more suitable for large-scale image classification tasks. Experimental results on the widely used Caltech-101 and Caltech-256 benchmark datasets demonstrate that ViSOINN achieves markedly improved performance while considerably reducing computational cost.
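The adaptive, competitive growth rule behind SOINN-style networks can be sketched as follows. This is a minimal illustration, not the paper's ViSOINN: the fixed `threshold` and learning rate `lr` are assumptions (the real algorithm uses adaptive, per-node similarity thresholds and edge aging).

```python
import numpy as np

def soinn_step(nodes, edges, x, threshold=1.0, lr=0.05):
    """One SOINN-style update: insert x as a new codebook node if it is
    far from all existing nodes; otherwise move the winner toward x and
    link the two nearest nodes. (Sketch with a fixed threshold.)"""
    if len(nodes) < 2:
        nodes.append(np.asarray(x, float))
        return nodes, edges
    d = [np.linalg.norm(x - n) for n in nodes]
    order = np.argsort(d)
    w1, w2 = int(order[0]), int(order[1])
    if d[w1] > threshold:          # novel input: grow the codebook graph
        nodes.append(np.asarray(x, float))
    else:                          # familiar input: adapt winner, connect
        nodes[w1] = nodes[w1] + lr * (x - nodes[w1])
        edges.add((min(w1, w2), max(w1, w2)))
    return nodes, edges

nodes, edges = [], set()
for p in [(0, 0), (0.1, 0), (5, 5), (0.05, 0.1), (5.1, 4.9)]:
    nodes, edges = soinn_step(nodes, edges, np.array(p, float))
# two clusters of inputs yield a compact codebook graph of 3 nodes
```

Because nodes are created only when an input is dissimilar to all existing words, the codebook size emerges from the data rather than being fixed in advance, which is the property the abstract highlights.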

Cancers ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 2111
Author(s):  
Bo-Wei Zhao ◽  
Zhu-Hong You ◽  
Lun Hu ◽  
Zhen-Hao Guo ◽  
Lei Wang ◽  
...  

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with time-consuming and labor-intensive in vivo experimental methods, computational models can provide high-quality DTI candidates almost instantly. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI captures both the local and the global structural information of the graph. Specifically, the first-order neighbor information of nodes is aggregated by a graph convolutional network (GCN), while the high-order neighbor information of nodes is learned by the graph embedding method DeepWalk. Finally, the two kinds of features are fed into a random forest classifier to train and predict potential DTIs. The results show that our method obtained an area under the receiver operating characteristic curve (AUROC) of 0.9455 and an area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. We also compare the presented method with several existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. The proposed model is expected to bring new inspiration and provide novel perspectives to relevant researchers.
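The first-order aggregation step the abstract attributes to the GCN can be sketched with the standard symmetrically normalized propagation rule. The toy graph, features, and weights below are illustrative only; DeepWalk embeddings and the random forest are separate stages not shown here.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} X W).
    This is only the first-order neighbor aggregation; higher-order
    structure would come from a separate embedding such as DeepWalk."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)  # 3-node path graph
X = np.eye(3)                                            # one-hot features
W = np.ones((3, 2))                                      # toy weights
H = gcn_layer(A, X, W)   # each row mixes a node with its neighbors
```

In an LGDTI-style pipeline, these local features would be concatenated with DeepWalk embeddings before classification.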


2021 ◽  
Vol 24 (2) ◽  
pp. 78-86
Author(s):  
Zainab N. Sultani ◽  
Ban N. Dhannoon

Image classification is acknowledged as one of the most critical and challenging tasks in computer vision. The bag of visual words (BoVW) model has proven very efficient for image classification because it can effectively represent distinctive image features in vector space. In this paper, BoVW using Scale-Invariant Feature Transform (SIFT) and Oriented FAST and Rotated BRIEF (ORB) descriptors is adapted for image classification. We propose a novel image classification system using local feature information obtained from both the SIFT and ORB descriptors. The resulting SO-BoVW model yields highly discriminative features, enhancing classification performance. Experiments on Caltech-101 and the Flowers dataset prove the effectiveness of the proposed method.
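The core BoVW encoding step can be sketched as follows. The random "codebooks" and "descriptors" stand in for k-means vocabularies and real SIFT/ORB outputs, and concatenating the two histograms is one plausible fusion; the abstract does not specify how SO-BoVW combines them (note also that real ORB descriptors are binary and matched with Hamming distance).

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest visual word and
    return an L1-normalized word-frequency histogram."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)                 # nearest-word index
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
sift_cb, orb_cb = rng.random((8, 128)), rng.random((8, 32))     # toy codebooks
sift_desc, orb_desc = rng.random((50, 128)), rng.random((40, 32))
# SO-BoVW-style representation: concatenate the per-descriptor histograms
feat = np.concatenate([bovw_histogram(sift_desc, sift_cb),
                       bovw_histogram(orb_desc, orb_cb)])
```

The concatenated vector would then be fed to any standard classifier (e.g. an SVM).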


2018 ◽  
Vol 2018 ◽  
pp. 1-14 ◽  
Author(s):  
Zhihang Ji ◽  
Sining Wu ◽  
Fan Wang ◽  
Lijuan Xu ◽  
Yan Yang ◽  
...  

In the context of image classification, the bag-of-visual-words model is widely used for image representation, and in recent years several works have aimed at exploiting color or spatial information to improve it. In this paper two representation vectors, the Global Color Co-occurrence Vector (GCCV) and the Local Color Co-occurrence Vector (LCCV), are proposed. Both make use of the color and co-occurrence information of the superpixels in an image. GCCV describes the global statistical distribution of the colored superpixels while embedding the spatial information between them, so it captures color and structure information at large scale. Unlike GCCV, LCCV, which is embedded in a Riemannian manifold space, reflects the color information within the superpixels in detail: it records the higher-order distribution of color between superpixels within a neighborhood by aggregating the co-occurrence information via second-order pooling. In the experiments, we combine the two proposed representation vectors with feature vectors such as LLC or CNN features using Multiple Kernel Learning (MKL). The framework is evaluated on several challenging visual classification datasets, and experimental results demonstrate the effectiveness of the proposed method.
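A global color co-occurrence statistic over superpixels can be sketched as below. The quantized color labels and the adjacency list are assumed inputs (in practice they would come from a superpixel segmentation such as SLIC plus color quantization); the paper's GCCV is richer than this minimal version.

```python
import numpy as np

def color_cooccurrence_vector(labels, adjacency, n_bins):
    """Global color co-occurrence over adjacent superpixels (sketch):
    labels[i] is the quantized dominant color of superpixel i, and
    adjacency lists pairs of spatially neighboring superpixels."""
    M = np.zeros((n_bins, n_bins))
    for i, j in adjacency:
        M[labels[i], labels[j]] += 1
        M[labels[j], labels[i]] += 1   # symmetric co-occurrence
    return M.flatten() / max(M.sum(), 1)

labels = [0, 0, 1, 2]                  # toy: 4 superpixels, 3 color bins
adjacency = [(0, 1), (1, 2), (2, 3)]   # toy spatial neighbors
v = color_cooccurrence_vector(labels, adjacency, 3)
```

Flattening the co-occurrence matrix embeds which colors sit next to which, i.e. the spatial structure the abstract says GCCV captures at large scale.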


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Yang He ◽  
Ling Tian ◽  
Lizong Zhang ◽  
Xi Zeng

Autonomous object detection powered by cutting-edge artificial intelligence techniques has been an essential component for sustaining complex smart city systems. Fine-grained image classification focuses on recognizing subcategories within a broader category. Because images from different subcategories can be highly similar while images within the same subcategory can vary widely, it has always been a challenging problem in computer vision. Traditional approaches usually rely on exploring only the visual information in images. This paper therefore proposes a novel Knowledge Graph Representation Fusion (KGRF) framework to introduce prior knowledge into the fine-grained image classification task. Specifically, a Graph Attention Network (GAT) is employed to learn the knowledge representation from a constructed knowledge graph modeling the category-subcategory and subcategory-attribute associations. By introducing a Multimodal Compact Bilinear (MCB) module, the framework can fully integrate the knowledge representation and visual features to learn high-level image features. Extensive experiments on the Caltech-UCSD Birds-200-2011 dataset verify the superiority of our proposed framework over several existing state-of-the-art methods.
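The MCB fusion step has a compact standard form: count-sketch both feature vectors and multiply their sketches in the Fourier domain, which approximates the outer product of the two modalities without materializing it. The toy dimensions and inputs below are illustrative, and this sketch omits the signed-sqrt/L2 normalization usually applied in practice.

```python
import numpy as np

def count_sketch(x, h, s, d):
    """Project x into d dims using fixed hash h and signs s (Count Sketch)."""
    y = np.zeros(d)
    np.add.at(y, h, s * x)
    return y

def mcb(x, y, d=64, seed=0):
    """Multimodal Compact Bilinear pooling (sketch): the circular
    convolution of two count sketches approximates the outer product
    x ⊗ y, computed efficiently via FFT."""
    rng = np.random.default_rng(seed)
    hx, hy = rng.integers(0, d, len(x)), rng.integers(0, d, len(y))
    sx = rng.choice([-1.0, 1.0], len(x))
    sy = rng.choice([-1.0, 1.0], len(y))
    fx = np.fft.rfft(count_sketch(x, hx, sx, d))
    fy = np.fft.rfft(count_sketch(y, hy, sy, d))
    return np.fft.irfft(fx * fy, n=d)

vis = np.random.default_rng(1).random(32)   # toy visual features
kg = np.random.default_rng(2).random(16)    # toy knowledge (GAT) features
fused = mcb(vis, kg, d=64)                  # joint multimodal feature
```

The fused vector is what a KGRF-style head would pass to the final classifier.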


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Honghong Liao ◽  
Jinhai Xiang ◽  
Weiping Sun ◽  
Shengsheng Yu

The Bag of Visual Words (BoW) model is one of the most popular and effective image classification frameworks in the recent literature. The optimal formation of a visual vocabulary remains unclear, and the size of the vocabulary also affects classification performance. Empirically, a larger vocabulary leads to higher classification accuracy, but it also requires more memory and computation. In this paper, we propose a multiresolution feature coding (MFC) framework that aggregates feature codings obtained from a set of small visual vocabularies of different sizes, where each vocabulary is obtained by a clustering algorithm and different clustering algorithms discover different aspects of the image features. In MFC, feature codings from different visual vocabularies are aggregated adaptively by a modified Online Passive-Aggressive Algorithm under the histogram intersection kernel, which leads to a closed-form solution. Experiments demonstrate that the proposed method (1) obtains classification accuracy equal to or higher than the BoW model with a large visual vocabulary and (2) needs much less memory and computation.
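The histogram intersection kernel at the heart of the aggregation is simple to state. The weighting scheme below is only an illustration of adaptively combining codings from several vocabularies; the paper's modified passive-aggressive update with its closed-form solution is more involved and is not reproduced here.

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Histogram intersection kernel: sum of bin-wise minima."""
    return np.minimum(h1, h2).sum()

# Toy adaptive weighting of codings from two vocabularies (assumption:
# score each resolution by HIK similarity to a reference histogram and
# normalize the scores into aggregation weights).
ref = np.array([0.25, 0.25, 0.25, 0.25])
codings = [np.array([0.4, 0.1, 0.4, 0.1]),
           np.array([0.25, 0.3, 0.2, 0.25])]
scores = np.array([histogram_intersection(ref, c) for c in codings])
weights = scores / scores.sum()             # adaptive per-vocabulary weights
combined = sum(w * c for w, c in zip(weights, codings))
```

Because each small vocabulary contributes a short histogram, the combined representation stays far cheaper than a single very large vocabulary.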


2021 ◽  
Author(s):  
Min Chen

Abstract Deep learning (DL) techniques, more specifically Convolutional Neural Networks (CNNs), have become increasingly popular in advancing the field of data science and have had great success in a wide array of applications, including computer vision, speech, and natural language processing. However, training CNNs is computationally intensive and costly, especially when the dataset is huge. To overcome these obstacles, this paper takes advantage of distributed frameworks and cloud computing to develop a parallel CNN algorithm. MapReduce is a scalable and fault-tolerant data processing tool developed to provide significant improvements in large-scale data-intensive applications in clusters. A MapReduce-based CNN (MCNN) is developed in this work to tackle the task of image classification. In addition, the proposed MCNN adopts dropout layers to address overfitting. The implementation of MCNN, and how the proposed algorithm accelerates learning, are examined and demonstrated through experiments. Results reveal high classification accuracy and significant improvements in speedup, scaleup, and sizeup compared to the standard algorithms.
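The data-parallel pattern behind a MapReduce-trained network can be sketched on a toy model: mappers compute gradients on their data shards, the reducer averages them, and the driver applies the update. This is an assumption-laden stand-in, with a one-parameter linear model instead of a CNN and plain Python lists instead of a Hadoop cluster.

```python
# MapReduce-style data-parallel training step (sketch).
def map_gradient(shard, w):
    """Mapper: per-example least-squares gradients d/dw of (w*x - y)^2
    on one data shard."""
    return [2 * (w * x - y) * x for x, y in shard]

def reduce_gradients(grads):
    """Reducer: average all emitted gradients into one update direction."""
    flat = [g for gs in grads for g in gs]
    return sum(flat) / len(flat)

shards = [[(1.0, 2.0), (2.0, 4.0)],
          [(3.0, 6.0), (4.0, 8.0)]]          # data follows y = 2x
w = 0.0
for _ in range(50):
    grads = [map_gradient(s, w) for s in shards]   # "map" phase (parallel)
    w -= 0.05 * reduce_gradients(grads)            # "reduce" phase + update
# w converges to the true slope 2.0
```

In an MCNN-style system, the same map/reduce split would be applied to the gradients of each CNN layer, with dropout applied inside the mappers' forward passes.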


2012 ◽  
Vol 263-266 ◽  
pp. 2627-2630
Author(s):  
Yu Feng Chen ◽  
Zhi Zhong Yang ◽  
Gao Jie Yan ◽  
Fei Fei Li

The bag-of-words algorithm aggregates visual words, which describe the local features of an object, into a bag model; it considers only local features and ignores the object's structural information. This paper presents an image classification algorithm based on structural information, which adds the object's structure following the idea of the generalized Hough transform. We apply the generalized Hough transform with the stored structural information to the local features of the test image to obtain candidate positions of the object's center. We then analyze the template size based on the dispersion of the voting points, and finally refine the voting results. Experimental results demonstrate that our proposed method is superior to methods that ignore structural information and classify using only low-level features.
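The generalized-Hough voting step can be sketched as follows. Each matched local feature carries an offset to the object center learned from training; at test time every feature votes, and the densest accumulator cell is the predicted center. The keypoints, offsets, and grid below are toy assumptions, and the dispersion analysis and vote refinement from the paper are omitted.

```python
import numpy as np

def vote_center(features, offsets, grid=(10, 10)):
    """Generalized-Hough-style voting: each (keypoint, stored offset)
    pair casts one vote for a candidate object center; return the
    accumulator peak and the accumulator itself."""
    acc = np.zeros(grid, int)
    for (x, y), (dx, dy) in zip(features, offsets):
        cx, cy = int(x + dx), int(y + dy)
        if 0 <= cx < grid[0] and 0 <= cy < grid[1]:
            acc[cx, cy] += 1
    return np.unravel_index(acc.argmax(), grid), acc

features = [(1, 1), (3, 5), (7, 2)]   # toy keypoint positions
offsets = [(4, 4), (2, 0), (-2, 3)]   # toy stored offsets to the center
center, acc = vote_center(features, offsets)
# all three consistent votes land on cell (5, 5)
```

A tight cluster of voting points signals a consistent object structure; the spread of the votes is what the paper uses to analyze template size.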


2017 ◽  
Vol 8 (4) ◽  
pp. 45-58 ◽  
Author(s):  
Mohammed Amin Belarbi ◽  
Saïd Mahmoudi ◽  
Ghalem Belalem

Dimensionality reduction plays an important role in the performance of large-scale image retrieval across different applications. In this paper, we explore Principal Component Analysis (PCA) as a dimensionality reduction method. First, Scale-Invariant Feature Transform (SIFT) features and Speeded Up Robust Features (SURF) are extracted as image features. Second, PCA is applied to reduce the dimensions of the SIFT and SURF feature descriptors. By comparing multiple sets of experimental data on different image databases, we conclude that PCA with an appropriately chosen reduced dimensionality can effectively lower the computational cost of image features while maintaining high retrieval performance.
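The PCA reduction step is standard and can be written directly with an SVD. The random matrix below stands in for a real stack of 128-dimensional SIFT descriptors; the target dimensionality 32 is an arbitrary illustrative choice.

```python
import numpy as np

def pca_reduce(X, k):
    """Project descriptors onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                    # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                       # k-dimensional projection

rng = np.random.default_rng(0)
sift = rng.random((200, 128))    # toy stand-in for 200 SIFT descriptors
reduced = pca_reduce(sift, 32)   # 128-D -> 32-D, variance-preserving
```

Distance computations in the 32-dimensional space are then roughly four times cheaper than on the raw descriptors, which is where the retrieval speedup comes from.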


Author(s):  
Guangtao Wang ◽  
Rex Ying ◽  
Jing Huang ◽  
Jure Leskovec

Self-attention mechanisms in graph neural networks (GNNs) have led to state-of-the-art performance on many graph representation learning tasks. Currently, at every layer, attention is computed between connected pairs of nodes and depends solely on the representations of the two nodes. However, such an attention mechanism does not account for nodes that are not directly connected but that provide important network context. Here we propose the Multi-hop Attention Graph Neural Network (MAGNA), a principled way to incorporate multi-hop context information into every layer of attention computation. MAGNA diffuses the attention scores across the network, which increases the receptive field of every GNN layer. Unlike previous approaches, MAGNA places a diffusion prior on attention values to efficiently account for all paths between pairs of disconnected nodes. We demonstrate in theory and experiments that MAGNA captures large-scale structural information in every layer and has a low-pass effect that eliminates noisy high-frequency information from graph data. Experimental results on node classification and knowledge graph completion benchmarks show that MAGNA achieves state-of-the-art results: up to 5.7% relative error reduction over the previous state of the art on Cora, Citeseer, and Pubmed. MAGNA also obtains the best performance on a large-scale Open Graph Benchmark dataset, and on knowledge graph completion it advances the state of the art on WN18RR and FB15k-237 across four different performance metrics.
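The attention-diffusion idea can be sketched as a geometric series over powers of the one-hop attention matrix, with weights θ_k = α(1-α)^k. The toy attention values and the truncation at K hops are assumptions for illustration; MAGNA's actual parameterization and its approximation scheme differ in detail.

```python
import numpy as np

def diffuse_attention(A, alpha=0.15, K=10):
    """Attention diffusion (sketch): sum_k alpha*(1-alpha)^k * A^k,
    so each layer attends over multi-hop paths instead of only
    direct neighbors."""
    d = A.shape[0]
    out, Ak = np.zeros((d, d)), np.eye(d)
    for k in range(K + 1):
        out += alpha * (1 - alpha) ** k * Ak
        Ak = Ak @ A                     # next power of the attention matrix
    return out

# Row-stochastic one-hop attention on a 3-node path graph (toy values)
A = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
D = diffuse_attention(A)
# D[0, 2] > 0 even though nodes 0 and 2 share no edge: two-hop paths
# now contribute attention, enlarging the receptive field per layer.
```

The geometric decay is also what produces the low-pass effect the abstract mentions: contributions from long, noisy paths are exponentially down-weighted.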

