Research on Indoor Scene Classification Mechanism Based on Multiple Descriptors Fusion

2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Ping Ji ◽  
Danyang Qin ◽  
Pan Feng ◽  
Tingting Lan ◽  
Guanyu Sun

This study addresses the severe limitations that non-ROI (region of interest) information imposes on traditional scene classification algorithms, including scale changes, varying viewing angles, and high inter-class similarity. An effective indoor scene classification mechanism based on multiple descriptors fusion is proposed, which introduces depth images to improve descriptor efficiency. A greedy descriptor filter algorithm (GDFA) is proposed to obtain valuable descriptors, and a multiple-descriptor combination method is also given to further improve descriptor performance. Performance analysis and simulation results show that multiple descriptors fusion not only achieves higher classification accuracy than principal components analysis (PCA) for medium and large descriptor sets, but also outperforms other existing algorithms.
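The abstract does not spell out the GDFA; as a rough illustration of greedy descriptor filtering, the sketch below repeatedly keeps the remaining descriptor with the best class-separability score. The Fisher-style score and the two-class setting are assumptions for illustration, not the authors' algorithm.

```python
# Hypothetical greedy descriptor selection: at each step keep the descriptor
# whose separability score is best among those not yet selected. The score
# (squared mean gap over pooled variance) is an illustrative stand-in.

def separability(values_a, values_b):
    """Fisher-style ratio for one descriptor across two classes."""
    mean_a = sum(values_a) / len(values_a)
    mean_b = sum(values_b) / len(values_b)
    var_a = sum((v - mean_a) ** 2 for v in values_a) / len(values_a)
    var_b = sum((v - mean_b) ** 2 for v in values_b) / len(values_b)
    return (mean_a - mean_b) ** 2 / (var_a + var_b + 1e-9)

def greedy_select(class_a, class_b, k):
    """class_a, class_b: lists of samples, each a list of descriptor values.
    Returns the indices of the k greedily chosen descriptors."""
    n_desc = len(class_a[0])
    selected = []
    for _ in range(k):
        best, best_gain = None, -1.0
        for j in range(n_desc):
            if j in selected:
                continue
            gain = separability([s[j] for s in class_a],
                                [s[j] for s in class_b])
            if gain > best_gain:
                best, best_gain = j, gain
        selected.append(best)
    return selected
```

Because this toy score is evaluated per descriptor, the greedy loop reduces to top-k ranking; a real filter would score descriptors jointly with those already selected.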

Sensors ◽  
2020 ◽  
Vol 20 (4) ◽  
pp. 1188 ◽  
Author(s):  
Jianming Zhang ◽  
Chaoquan Lu ◽  
Jin Wang ◽  
Xiao-Guang Yue ◽  
Se-Jung Lim ◽  
...  

Many remote sensing scene classification algorithms improve classification accuracy through additional modules, which increase the parameters and computing overhead of the model at the inference stage. In this paper, we explore how to improve the classification accuracy of the model without adding modules at the inference stage. First, we propose a network training strategy of training with multi-size images. Then, we introduce more supervision information via a triplet loss and design a dedicated branch for it. In addition, dropout is introduced between the feature extractor and the classifier to avoid over-fitting. These modules only work at the training stage and do not increase model parameters at the inference stage. We use ResNet18 as the baseline and add the three modules to it. We perform experiments on three datasets: AID, NWPU-RESISC45, and OPTIMAL. Experimental results show that our model combined with the three modules is more competitive than many existing classification algorithms. In addition, ablation experiments on OPTIMAL show that dropout, triplet loss, and training with multi-size images improve the overall accuracy of the model on the test set by 0.53%, 0.38%, and 0.7%, respectively, while the combination of the three modules improves overall accuracy by 1.61%. Thus, the three modules improve classification accuracy without increasing model parameters at the inference stage; training with multi-size images brings the largest single gain, and combining all three is better still.
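The triplet-loss branch mentioned above relies on the standard formulation, which penalizes an anchor embedding that sits closer to a different-class (negative) sample than to a same-class (positive) one. A minimal sketch with Euclidean distance; the 0.3 margin is an arbitrary choice for illustration, not necessarily the paper's:

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet loss: zero once the positive is closer to the anchor
    than the negative by at least `margin`, positive otherwise."""
    return max(0.0, euclidean(anchor, positive)
                    - euclidean(anchor, negative) + margin)
```

In training, the loss from this branch is summed with the usual cross-entropy; at inference the branch is simply dropped, which is why it adds no parameters to the deployed model.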


2021 ◽  
Vol 13 (10) ◽  
pp. 1950
Author(s):  
Cuiping Shi ◽  
Xin Zhao ◽  
Liguo Wang

In recent years, with the rapid development of computer vision, increasing attention has been paid to remote sensing image scene classification. To improve classification performance, many studies have increased the depth of convolutional neural networks (CNNs) and expanded the width of the network to extract more deep features, thereby increasing the complexity of the model. To solve this problem, in this paper we propose a lightweight convolutional neural network based on attention-oriented multi-branch feature fusion (AMB-CNN) for remote sensing image scene classification. Firstly, we propose two convolution combination modules for feature extraction, through which deep image features can be fully extracted by the cooperation of multiple convolutions. Then, feature weights are calculated, and the extracted deep features are passed to an attention mechanism for further feature extraction. Next, all of the extracted features are fused across multiple branches. Finally, depthwise separable convolution and asymmetric convolution are employed to greatly reduce the number of parameters. The experimental results show that, compared with some state-of-the-art methods, the proposed method retains a great advantage in classification accuracy while using very few parameters.
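The parameter savings from the last step can be made concrete with simple counting. The sketch below compares a standard k×k convolution against its depthwise separable and asymmetric (k×1 then 1×k) factorizations; the layer sizes used in the example are arbitrary, not the AMB-CNN configuration.

```python
def standard_conv_params(k, c_in, c_out):
    # one k x k kernel across all input channels, per output channel
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # one k x k filter per input channel, then a 1x1 pointwise convolution
    return k * k * c_in + c_in * c_out

def asymmetric_conv_params(k, c_in, c_out):
    # a k x k kernel factorized into k x 1 followed by 1 x k
    return (k + k) * c_in * c_out
```

For example, with a 3×3 kernel, 64 input and 128 output channels, the standard convolution needs 73,728 weights, the asymmetric factorization 49,152, and the depthwise separable version only 8,768 (bias terms ignored throughout).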


2015 ◽  
Vol 738-739 ◽  
pp. 598-601
Author(s):  
Han Yang Zhu ◽  
Xin Yu Jin ◽  
Jian Feng Shen

In telemedicine, medical images are important diagnostic evidence, but their large size causes high transmission delay over bandwidth-limited networks. When ROI-based image encoding is used, it is important to balance quality between the Region of Interest (ROI) and the Background Region (BR). Prior work on optimizing ROI encoding with an LS-SVM-based ROI/BR PSNR prediction model performs much better than conventional methods, but at very high computational complexity. We propose a new method using an extreme learning machine (ELM) that lowers computational complexity and improves encoding efficiency compared with the LS-SVM-based model, while achieving the same ROI/BR balancing effect.
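The ELM is what makes the low-complexity training possible: input weights are random and fixed, and only the output weights are solved in closed form by least squares, with no iterative optimization. A minimal numpy sketch; the hidden-layer size and the synthetic regression target are assumptions (the paper's target would be ROI/BR PSNR values):

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_fit(X, y, n_hidden=32):
    """Extreme learning machine: random input weights, output weights solved
    via the Moore-Penrose pseudoinverse (no gradient descent)."""
    W = rng.normal(size=(X.shape[1], n_hidden))  # random, never trained
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                       # hidden activations
    beta = np.linalg.pinv(H) @ y                 # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

Training cost is dominated by one pseudoinverse of the hidden-activation matrix, which is why an ELM fits far faster than an LS-SVM on the same feature set.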


2021 ◽  
Author(s):  
Ahmet Batuhan Polat ◽  
Ozgun Akcay ◽  
Fusun Balik Sanli

Obtaining high accuracy in land cover classification is a non-trivial problem in geosciences for monitoring urban and rural areas. In this study, different classification algorithms were tested on different types of data, and the effects of seasonal changes on these algorithms, as well as the suitability of the data used, were investigated. The study also reveals the effect of increasing the number of training samples on classification accuracy. Sentinel-1 Synthetic Aperture Radar (SAR) images and Sentinel-2 multispectral optical images were used as datasets. An object-based approach was used to classify various fused image combinations, with Support Vector Machines (SVM), Random Forest (RF), and K-Nearest Neighbors (kNN) as the classification algorithms. In addition, the Normalized Difference Vegetation Index (NDVI) was examined separately to quantify its exact contribution to classification accuracy. Overall accuracies were compared across the fused datasets generated by combining optical and SAR images. Increasing the number of training samples was found to improve classification accuracy, while object-based classification of single SAR imagery produced the lowest accuracy among the dataset combinations used in this study. Finally, NDVI data did not increase classification accuracy in the winter season, when trees shed their leaves due to climate conditions.
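NDVI itself is a simple band ratio; for Sentinel-2 it is typically computed from the red (B4) and near-infrared (B8) reflectances. A minimal per-pixel sketch:

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel:
    (NIR - Red) / (NIR + Red), in [-1, 1]; high for leafy vegetation,
    near zero or negative for bare soil, built-up areas, and water."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom
```

The winter result above follows directly from this definition: leafless canopies reflect little extra NIR, so their NDVI drops toward that of bare ground and the index loses its discriminative value.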


Author(s):  
Yaning Wang ◽  
Weifeng Liu ◽  
Jianning Li ◽  
Zhangming Peng

2020 ◽  
Vol 2 (1) ◽  
Author(s):  
Marta M Correia ◽  
Timothy Rittman ◽  
Christopher L Barnes ◽  
Ian T Coyle-Gilchrist ◽  
Boyd Ghosh ◽  
...  

The early and accurate differential diagnosis of parkinsonian disorders is still a significant challenge for clinicians. In recent years, a number of studies have used magnetic resonance imaging data combined with machine learning and statistical classifiers to successfully differentiate between different forms of Parkinsonism. However, several questions and methodological issues remain to be addressed in order to minimize bias and artefact-driven classification. In this study, we compared different approaches for feature selection, as well as different magnetic resonance imaging modalities, with well-matched patient groups and tight control for data quality issues related to patient motion. Our sample was drawn from a cohort of 69 healthy controls, and patients with idiopathic Parkinson’s disease (n = 35), progressive supranuclear palsy Richardson’s syndrome (n = 52) and corticobasal syndrome (n = 36). Participants underwent standardized T1-weighted and diffusion-weighted magnetic resonance imaging. Strict data quality control and group matching reduced the control and patient numbers to 43, 32, 33 and 26, respectively. We compared two different methods for feature selection and dimensionality reduction: whole-brain principal components analysis, and an anatomical region-of-interest based approach. In both cases, support vector machines were used to construct a statistical model for pairwise classification of healthy controls and patients. The accuracy of each model was estimated using a leave-two-out cross-validation approach, as well as an independent validation using a different set of subjects. Our cross-validation results suggest that using principal components analysis for feature extraction provides higher classification accuracies when compared to a region-of-interest based approach.
However, the differences between the two feature extraction methods were significantly reduced when an independent sample was used for validation, suggesting that the principal components analysis approach may be more vulnerable to overfitting with cross-validation. Both T1-weighted and diffusion magnetic resonance imaging data could be used to successfully differentiate between subject groups, with neither modality outperforming the other across all pairwise comparisons in the cross-validation analysis. However, features obtained from diffusion magnetic resonance imaging data resulted in significantly higher classification accuracies when an independent validation cohort was used. Overall, our results support the use of statistical classification approaches for differential diagnosis of parkinsonian disorders. However, classification accuracy can be affected by group size, age, sex and movement artefacts. With appropriate controls and out-of-sample cross validation, diagnostic biomarker evaluation including magnetic resonance imaging based classifiers may be an important adjunct to clinical evaluation.
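Leave-two-out cross-validation refits the classifier once per held-out pair of subjects. As a minimal sketch, the generator below enumerates every unordered pair of sample indices as a test set; note the pairing scheme is an assumption here — in patient/control comparisons the held-out pair is often constrained to one subject from each group.

```python
from itertools import combinations

def leave_two_out_splits(n_samples):
    """Yield (train_indices, test_indices) pairs in which every unordered
    pair of samples is held out exactly once; the model is refit per split."""
    all_idx = set(range(n_samples))
    for pair in combinations(range(n_samples), 2):
        yield sorted(all_idx - set(pair)), list(pair)
```

With n subjects this produces n(n-1)/2 splits, so the accuracy estimate averages over many refits — which is precisely where a flexible feature extractor such as whole-brain PCA can overfit relative to an independent validation cohort.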


2018 ◽  
Vol 11 (1) ◽  
pp. 2 ◽  
Author(s):  
Tao Zhang ◽  
Hong Tang

Detailed information about built-up areas is valuable for mapping complex urban environments. Although a large number of classification algorithms for such areas have been developed, they are rarely tested from the perspective of feature engineering and feature learning. Therefore, we launched a unique investigation to provide a full test of the Operational Land Imager (OLI) imagery for 15-m resolution built-up area classification in 2015, in Beijing, China. Training a classifier requires many sample points, so we proposed a method based on the European Space Agency’s (ESA) 38-m global built-up area data of 2014, OpenStreetMap, and MOD13Q1-NDVI to achieve the rapid and automatic generation of a large number of sample points. Our aim was to examine the influence of a single pixel and image patch under traditional feature engineering and modern feature learning strategies. In feature engineering, we consider spectra, shape, and texture as the input features, and support vector machine (SVM), random forest (RF), and AdaBoost as the classification algorithms. In feature learning, the convolutional neural network (CNN) is used as the classification algorithm. In total, 26 built-up land cover maps were produced. The experimental results show the following: (1) The approaches based on feature learning are generally better than those based on feature engineering in terms of classification accuracy, and the performance of ensemble classifiers (e.g., RF) is comparable to that of CNN. Two-dimensional CNN and the 7-neighborhood RF have the highest classification accuracies at nearly 91%; (2) Overall, the classification effect and accuracy based on image patches are better than those based on single pixels. Features that highlight the information of the target category (e.g., PanTex (texture-derived built-up presence index) and the enhanced morphological building index (EMBI)) can help improve classification accuracy.
The code and experimental results are available at https://github.com/zhangtao151820/CompareMethod.
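The exact sample-generation rules are not given in this abstract; as an illustrative sketch, the function below labels a candidate point built-up only when the ESA mask and OpenStreetMap agree and NDVI rules out dense vegetation. The agreement rule and the 0.3 NDVI threshold are assumptions, not the authors' procedure.

```python
def auto_label(esa_built, osm_built, ndvi, ndvi_max=0.3):
    """Label a candidate sample point by agreement between independent
    sources: 'built-up' only when both masks flag it and NDVI is low,
    'non-built-up' when neither flags it, None (discard) otherwise."""
    if esa_built and osm_built and ndvi < ndvi_max:
        return "built-up"
    if not esa_built and not osm_built:
        return "non-built-up"
    return None  # sources disagree: discard the ambiguous point

# Example: three candidate points as (esa_built, osm_built, ndvi) tuples.
points = [(True, True, 0.1), (True, False, 0.1), (False, False, 0.6)]
labels = [auto_label(e, o, n) for e, o, n in points]
```

Discarding disagreements trades sample quantity for label purity, which matters when the resulting points train every classifier in the comparison.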

