scholarly journals Indoor versus outdoor scene recognition for navigation of a micro aerial vehicle using spatial color gist wavelet descriptors

Author(s):  
Anitha Ganesan ◽  
Anbarasu Balasubramanian

AbstractIn the context of improved navigation for micro aerial vehicles, a new scene recognition visual descriptor, called spatial color gist wavelet descriptor (SCGWD), is proposed. SCGWD was developed by combining proposed Ohta color-GIST wavelet descriptors with census transform histogram (CENTRIST) spatial pyramid representation descriptors for categorizing indoor versus outdoor scenes. A binary and multiclass support vector machine (SVM) classifier with linear and non-linear kernels was used to classify indoor versus outdoor scenes and indoor scenes, respectively. In this paper, we have also discussed the feature extraction methodology of several, state-of-the-art visual descriptors, and four proposed visual descriptors (Ohta color-GIST descriptors, Ohta color-GIST wavelet descriptors, enhanced Ohta color histogram descriptors, and SCGWDs), in terms of experimental perspectives. The proposed enhanced Ohta color histogram descriptors, Ohta color-GIST descriptors, Ohta color-GIST wavelet descriptors, SCGWD, and state-of-the-art visual descriptors were evaluated, using the Indian Institute of Technology Madras Scene Classification Image Database two, an Indoor-Outdoor Dataset, and the Massachusetts Institute of Technology indoor scene classification dataset [(MIT)-67]. Experimental results showed that the indoor versus outdoor scene recognition algorithm, employing SVM with SCGWDs, produced the highest classification rates (CRs)—95.48% and 99.82% using radial basis function kernel (RBF) kernel and 95.29% and 99.45% using linear kernel for the IITM SCID2 and Indoor-Outdoor datasets, respectively. The lowest CRs—2.08% and 4.92%, respectively—were obtained when RBF and linear kernels were used with the MIT-67 dataset. In addition, higher CRs, precision, recall, and area under the receiver operating characteristic curve values were obtained for the proposed SCGWDs, in comparison with state-of-the-art visual descriptors.

2019 ◽  
Vol 73 (1) ◽  
pp. 37-55 ◽  
Author(s):  
B. Anbarasu ◽  
G. Anitha

In this paper, a new scene recognition visual descriptor called Enhanced Scale Invariant Feature Transform-based Sparse coding Spatial Pyramid Matching (Enhanced SIFT-ScSPM) descriptor is proposed by combining a Bag of Words (BOW)-based visual descriptor (SIFT-ScSPM) and Gist-based descriptors (Enhanced Gist-Enhanced multichannel Gist (Enhanced mGist)). Indoor scene classification is carried out by multi-class linear and non-linear Support Vector Machine (SVM) classifiers. Feature extraction methodology and critical review of several visual descriptors used for indoor scene recognition in terms of experimental perspectives have been discussed in this paper. An empirical study is conducted on the Massachusetts Institute of Technology (MIT) 67 indoor scene classification data set and assessed the classification accuracy of state-of-the-art visual descriptors and the proposed Enhanced mGist, Speeded Up Robust Features-Spatial Pyramid Matching (SURF-SPM) and Enhanced SIFT-ScSPM visual descriptors. Experimental results show that the proposed Enhanced SIFT-ScSPM visual descriptor performs better with higher classification rate, precision, recall and area under the Receiver Operating Characteristic (ROC) curve values with respect to the state-of-the-art and the proposed Enhanced mGist and SURF-SPM visual descriptors.


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4283
Author(s):  
Md.-Habibur Rahman ◽  
Md. Shahjalal ◽  
Moh. Khalid Hasan ◽  
Md.-Osman Ali ◽  
Yeong-Min Jang

Embedding optical camera communication (OCC) commercially as a favorable complement of radio-frequency technology has led to the desire for an intelligent receiver system that is eligible to communicate with an accurate light-emitting diode (LED) transmitter. To shed light on this issue, a novel scheme for detecting and recognizing data transmitting LEDs has been elucidated in this paper. Since the optically modulated signal is captured wirelessly by a camera that plays the role of the receiver for the OCC technology, the process to detect LED region and retrieval of exact information from the image sensor is required to be intelligent enough to achieve a low bit error rate (BER) and high data rate to ensure reliable optical communication within limited computational abilities of the most used commercial cameras such as those in smartphones, vehicles, and mobile robots. In the proposed scheme, we have designed an intelligent camera receiver system that is capable of separating accurate data transmitting LED regions removing other unwanted LED regions employing a support vector machine (SVM) classifier along with a convolutional neural network (CNN) in the camera receiver. CNN is used to detect every LED region from the image frame and then essential features are extracted to feed into an SVM classifier for further accurate classification. The receiver operating characteristic curve and other key performance parameters of the classifier have been analyzed broadly to evaluate the performance, justify the assistance of the SVM classifier in recognizing the accurate LED region, and decode data with low BER. To investigate communication performances, BER analysis, data rate, and inter-symbol interference have been elaborately demonstrated for the proposed intelligent receiver. In addition, BER against distance and BER against data rate have also been exhibited to validate the effectiveness of our proposed scheme comparing with only CNN and only SVM classifier based receivers individually. Experimental results have ensured the robustness and applicability of the proposed scheme both in the static and mobile scenarios.


Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2038
Author(s):  
Xi Shao ◽  
Xuan Zhang ◽  
Guijin Tang ◽  
Bingkun Bao

We propose a new end-to-end scene recognition framework, called a Recurrent Memorized Attention Network (RMAN) model, which performs object-based scene classification by recurrently locating and memorizing objects in the image. Based on the proposed framework, we introduce a multi-task mechanism that contiguously attends on the different essential objects in a scene image and recurrently performs memory fusion of the features of object focused by an attention model to improve the scene recognition accuracy. The experimental results show that the RMAN model has achieved better classification performance on the constructed dataset and two public scene datasets, surpassing state-of-the-art image scene recognition approaches.


2019 ◽  
Vol 11 (16) ◽  
pp. 1943 ◽  
Author(s):  
Omid Rahmati ◽  
Saleh Yousefi ◽  
Zahra Kalantari ◽  
Evelyn Uuemaa ◽  
Teimur Teimurian ◽  
...  

Mountainous areas are highly prone to a variety of nature-triggered disasters, which often cause disabling harm, death, destruction, and damage. In this work, an attempt was made to develop an accurate multi-hazard exposure map for a mountainous area (Asara watershed, Iran), based on state-of-the art machine learning techniques. Hazard modeling for avalanches, rockfalls, and floods was performed using three state-of-the-art models—support vector machine (SVM), boosted regression tree (BRT), and generalized additive model (GAM). Topo-hydrological and geo-environmental factors were used as predictors in the models. A flood dataset (n = 133 flood events) was applied, which had been prepared using Sentinel-1-based processing and ground-based information. In addition, snow avalanche (n = 58) and rockfall (n = 101) data sets were used. The data set of each hazard type was randomly divided to two groups: Training (70%) and validation (30%). Model performance was evaluated by the true skill score (TSS) and the area under receiver operating characteristic curve (AUC) criteria. Using an exposure map, the multi-hazard map was converted into a multi-hazard exposure map. According to both validation methods, the SVM model showed the highest accuracy for avalanches (AUC = 92.4%, TSS = 0.72) and rockfalls (AUC = 93.7%, TSS = 0.81), while BRT demonstrated the best performance for flood hazards (AUC = 94.2%, TSS = 0.80). Overall, multi-hazard exposure modeling revealed that valleys and areas close to the Chalous Road, one of the most important roads in Iran, were associated with high and very high levels of risk. The proposed multi-hazard exposure framework can be helpful in supporting decision making on mountain social-ecological systems facing multiple hazards.


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 4024 ◽  
Author(s):  
David Santos ◽  
Eric Lopez-Lopez ◽  
Xosé M. Pardo ◽  
Roberto Iglesias ◽  
Senén Barro ◽  
...  

Scene recognition is still a very important topic in many fields, and that is definitely the case in robotics. Nevertheless, this task is view-dependent, which implies the existence of preferable directions when recognizing a particular scene. Both in human and computer vision-based classification, this actually often turns out to be biased. In our case, instead of trying to improve the generalization capability for different view directions, we have opted for the development of a system capable of filtering out noisy or meaningless images while, on the contrary, retaining those views from which is likely feasible that the correct identification of the scene can be made. Our proposal works with a heuristic metric based on the detection of key points in 3D meshes (Harris 3D). This metric is later used to build a model that combines a Minimum Spanning Tree and a Support Vector Machine (SVM). We have performed an extensive number of experiments through which we have addressed (a) the search for efficient visual descriptors, (b) the analysis of the extent to which our heuristic metric resembles the human criteria for relevance and, finally, (c) the experimental validation of our complete proposal. In the experiments, we have used both a public image database and images collected at our research center.


2014 ◽  
Vol 14 (05) ◽  
pp. 1450066 ◽  
Author(s):  
MANAB KUMAR DAS ◽  
SAMIT ARI

In this paper, the conventional Stockwell transform is effectively used to classify the ECG arrhythmias. The performance of ECG classification mainly depends on feature extraction based on an efficient formation of morphological and temporal features and the design of the classifier. Feature extraction is the important component of designing the system based on pattern recognition since even the best classifier will not perform better if the good features are not selected properly. Here, the S-transform (ST) is used to extract the morphological features which is appended with temporal features. This feature set is independently classified using artificial neural network (NN) and support vector machine (SVM). In this work, five classes of ECG beats (normal, ventricular, supra ventricular, fusion and unknown beats) from Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) arrhythmia database are classified according to AAMI EC57 1998 standard (Association for the Advancement of Medical Instrumentation). Performance is evaluated on several normal and abnormal ECG signals of MIT-BIH arrhythmias database using two classifier techniques: ST with NN classifier (ST-NN) and other proposed ST with SVM classifier (ST-SVM). The proposed method achieves accuracy of 98.47%. The performance of the proposed technique is compared with ST-NN and earlier reported technique.


2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Loris Nanni ◽  
Sheryl Brahnam ◽  
Stefano Ghidoni ◽  
Alessandra Lumini

We perform an extensive study of the performance of different classification approaches on twenty-five datasets (fourteen image datasets and eleven UCI data mining datasets). The aim is to find General-Purpose (GP) heterogeneous ensembles (requiring little to no parameter tuning) that perform competitively across multiple datasets. The state-of-the-art classifiers examined in this study include the support vector machine, Gaussian process classifiers, random subspace of adaboost, random subspace of rotation boosting, and deep learning classifiers. We demonstrate that a heterogeneous ensemble based on the simple fusion by sum rule of different classifiers performs consistently well across all twenty-five datasets. The most important result of our investigation is demonstrating that some very recent approaches, including the heterogeneous ensemble we propose in this paper, are capable of outperforming an SVM classifier (implemented with LibSVM), even when both kernel selection and SVM parameters are carefully tuned for each dataset.


Author(s):  
Weiwei Yang ◽  
Haifeng Song

Recent research has shown that integration of spatial information has emerged as a powerful tool in improving the classification accuracy of hyperspectral image (HSI). However, partitioning homogeneous regions of the HSI remains a challenging task. This paper proposes a novel spectral-spatial classification method inspired by the support vector machine (SVM). The model consists of spectral-spatial feature extraction channel (SSC) and SVM classifier. SSC is mainly used to extract spatial-spectral features of HSI. SVM is mainly used to classify the extracted features. The model can automatically extract the features of HSI and classify them. Experiments are conducted on benchmark HSI dataset (Indian Pines). It is found that the proposed method yields more accurate classification results compared to the state-of-the-art techniques.


Electronics ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 205
Author(s):  
Hamoud Younes ◽  
Ali Ibrahim ◽  
Mostafa Rizk ◽  
Maurizio Valle

Approximate Computing Techniques (ACT) are promising solutions towards the achievement of reduced energy, time latency and hardware size for embedded implementations of machine learning algorithms. In this paper, we present the first FPGA implementation of an approximate tensorial Support Vector Machine (SVM) classifier with algorithmic level ACTs using High-Level Synthesis (HLS). A touch modality classification framework was adopted to validate the effectiveness of the proposed implementation. When compared to exact implementation presented in the state-of-the-art, the proposed implementation achieves a reduction in power consumption by up to 49% with a speedup of 3.2×. Moreover, the hardware resources are reduced by 40% while consuming 82% less energy in classifying an input touch with an accuracy loss less than 5%.


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4802
Author(s):  
Roger Resmini ◽  
Lincoln Silva ◽  
Adriel S. Araujo ◽  
Petrucio Medeiros ◽  
Débora Muchaluat-Saade ◽  
...  

Breast cancer is one of the leading causes of mortality globally, but early diagnosis and treatment can increase the cancer survival rate. In this context, thermography is a suitable approach to help early diagnosis due to the temperature difference between cancerous tissues and healthy neighboring tissues. This work proposes an ensemble method for selecting models and features by combining a Genetic Algorithm (GA) and the Support Vector Machine (SVM) classifier to diagnose breast cancer. Our evaluation demonstrates that the approach presents a significant contribution to the early diagnosis of breast cancer, presenting results with 94.79% Area Under the Receiver Operating Characteristic Curve and 97.18% of Accuracy.


Sign in / Sign up

Export Citation Format

Share Document