Person Re-Identification by Low-Dimensional Features and Metric Learning

2021 ◽  
Vol 13 (11) ◽  
pp. 289
Author(s):  
Xingyuan Chen ◽  
Huahu Xu ◽  
Yang Li ◽  
Minjie Bian

Person re-identification (Re-ID) has attracted attention due to its wide range of applications. Most recent studies have focused on extracting deep features while ignoring color features, which remain stable even under illumination and pose variations. Few studies combine the powerful learning capabilities of deep learning with color features. We therefore aim to exploit the advantages of both and design a model with low computational resource consumption and excellent performance for the person re-identification task. In this paper, we design a color feature containing relative spatial information, namely the color feature with spatial information. Bidirectional long short-term memory (BLSTM) networks with an attention mechanism are then used to capture the contextual relationships contained in the hand-crafted color features. Finally, experiments demonstrate that the proposed model improves recognition performance compared with traditional methods. At the same time, hand-crafted features based on human prior knowledge not only reduce computational consumption compared with deep learning methods but also make the model more interpretable.
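A color feature "containing relative spatial information" can be illustrated with a minimal sketch: split the image into horizontal stripes and concatenate a normalized per-channel color histogram for each stripe, so the feature encodes where along the body each color occurs. This is an assumed construction for illustration; the paper's exact descriptor may differ.

```python
import numpy as np

def stripe_color_histogram(image, n_stripes=6, bins=8):
    """Color feature with coarse spatial layout: per-stripe,
    per-channel histograms concatenated top-to-bottom."""
    h = image.shape[0]
    feats = []
    for i in range(n_stripes):
        stripe = image[i * h // n_stripes:(i + 1) * h // n_stripes]
        for c in range(image.shape[2]):
            hist, _ = np.histogram(stripe[..., c], bins=bins, range=(0, 256))
            feats.append(hist / max(hist.sum(), 1))  # normalize each histogram
    return np.concatenate(feats)  # length = n_stripes * channels * bins

img = np.random.randint(0, 256, (128, 48, 3), dtype=np.uint8)
f = stripe_color_histogram(img)  # shape (144,) for 6 stripes x 3 channels x 8 bins
```

A sequence of such per-stripe vectors is exactly the kind of ordered input a BLSTM with attention could then consume.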

2018 ◽  
Vol 10 (11) ◽  
pp. 1827 ◽  
Author(s):  
Ahram Song ◽  
Jaewan Choi ◽  
Youkyung Han ◽  
Yongil Kim

Hyperspectral change detection (CD) can be performed effectively using deep-learning networks. Although these approaches require qualified training samples, ground-truth data are difficult to obtain in the real world, and preserving spatial information during training is difficult due to structural limitations. To solve these problems, our study proposed a novel CD method for hyperspectral images (HSIs), comprising sample generation and a deep-learning network called the recurrent three-dimensional (3D) fully convolutional network (Re3FCN), which merges the advantages of a 3D fully convolutional network (FCN) and a convolutional long short-term memory (ConvLSTM). Principal component analysis (PCA) and the spectral correlation angle (SCA) were used to generate training samples with high probabilities of being changed or unchanged; this strategy made it possible to train on fewer samples with representative feature expression. The Re3FCN mainly comprised spectral–spatial and temporal modules. In particular, a spectral–spatial module with a 3D convolutional layer extracts spectral–spatial features from the HSIs simultaneously, whilst a temporal module with ConvLSTM records and analyzes the multi-temporal HSI change information. The study first proposed a simple and effective method to generate samples for network training, which can be applied effectively to cases with no training samples. Re3FCN can perform end-to-end detection for binary and multiple changes and can receive multi-temporal HSIs directly as input without learning the characteristics of multiple changes beforehand. Finally, the network extracts joint spectral–spatial–temporal features and preserves the spatial structure during the learning process through its fully convolutional structure. This study was the first to use a 3D FCN and a ConvLSTM for remote-sensing CD. To demonstrate the effectiveness of the proposed CD method, we performed binary and multi-class CD experiments. Results revealed that the Re3FCN outperformed conventional methods such as change vector analysis, iteratively reweighted multivariate alteration detection, PCA-SCA, FCN, and the combination of 2D convolutional layers with fully connected LSTM.
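The SCA score used for sample generation can be sketched in a few lines. This follows the common definition SCA = arccos((r + 1) / 2), with r the Pearson correlation between the two pixels' spectral vectors, so spectrally similar pixels score near 0 and dissimilar ones near π/2; the paper's exact normalization may differ.

```python
import numpy as np

def spectral_correlation_angle(a, b):
    """Spectral correlation angle between two spectral vectors.
    Correlation-based, so it is insensitive to gain/offset changes."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    r = np.corrcoef(a, b)[0, 1]          # Pearson correlation in [-1, 1]
    return np.arccos((r + 1.0) / 2.0)    # angle in [0, pi/2] radians

# Proportional spectra -> angle ~ 0; reversed spectra -> angle ~ pi/2
same = spectral_correlation_angle([1, 2, 3], [2, 4, 6])
opposite = spectral_correlation_angle([1, 2, 3], [3, 2, 1])
```

Thresholding this angle on bi-temporal pixel pairs yields high-confidence "changed"/"unchanged" pseudo-labels for training.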


F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 1010
Author(s):  
Nouar AlDahoul ◽  
Hezerul Abdul Karim ◽  
Abdulaziz Saleh Ba Wazir ◽  
Myles Joshua Toledo Tan ◽  
Mohammad Faizal Ahmad Fauzi

Background: Laparoscopy is surgery performed in the abdomen, without making large incisions in the skin, with the aid of a video camera, resulting in laparoscopic videos. The laparoscopic video is prone to various distortions such as noise, smoke, uneven illumination, defocus blur, and motion blur. One of the main components in the feedback loop of video enhancement systems is distortion identification, which automatically classifies the distortions affecting the videos and selects the video enhancement algorithm accordingly. This paper aims to address the laparoscopic video distortion identification problem by developing fast and accurate multi-label distortion classification using a deep learning model. Current deep learning solutions based on convolutional neural networks (CNNs) can address laparoscopic video distortion classification, but they learn only spatial information. Methods: In this paper, utilization of both spatial and temporal features in a CNN-long short-term memory (CNN-LSTM) model is proposed as a novel solution to enhance the classification. First, a pre-trained ResNet50 CNN was used to extract spatial features from each video frame by transferring representations from large-scale natural images to laparoscopic images. Next, an LSTM was utilized to consider the temporal relation between the features extracted from the laparoscopic video frames to produce multi-label categories. A novel laparoscopic video dataset proposed in the ICIP2020 challenge was used for training and evaluation of the proposed method. Results: The experiments conducted show that the proposed CNN-LSTM outperforms the existing solutions in terms of accuracy (85%) and F1-score (94.2%). Additionally, the proposed distortion identification model is able to run in real time with low inference time (0.15 s). Conclusions: The proposed CNN-LSTM model is a feasible solution to be utilized in laparoscopic videos for distortion identification.
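Since a video can exhibit several distortions at once, evaluation pools precision and recall over all labels. A minimal sketch of a micro-averaged multi-label F1, assuming sigmoid outputs thresholded at 0.5 (the paper may use a different averaging scheme):

```python
import numpy as np

def multilabel_f1(y_true, y_prob, threshold=0.5):
    """Micro-averaged F1 for multi-label classification: threshold
    per-label probabilities, then pool TP/FP/FN across all labels."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    y_true = np.asarray(y_true)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    return 2 * precision * recall / max(precision + recall, 1e-12)

perfect = multilabel_f1([[1, 0, 1], [0, 1, 1]], [[0.9, 0.1, 0.8], [0.2, 0.7, 0.9]])
partial = multilabel_f1([[1, 1]], [[0.9, 0.1]])  # one hit, one miss
```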


2020 ◽  
Vol 10 (2) ◽  
pp. 615 ◽  
Author(s):  
Tomas Iesmantas ◽  
Agne Paulauskaite-Taraseviciene ◽  
Kristina Sutiene

(1) Background: The segmentation of cell nuclei is an essential task in a wide range of biomedical studies and clinical practices. The full automation of this process remains a challenge due to intra- and internuclear variations across a wide range of tissue morphologies, as well as differences in staining protocols and imaging procedures. (2) Methods: A deep learning model with metric embeddings such as contrastive loss and triplet loss with semi-hard negative mining is proposed in order to accurately segment cell nuclei in a diverse set of microscopy images. The effectiveness of the proposed model was tested on a large-scale multi-tissue collection of microscopy image sets. (3) Results: The use of deep metric learning increased the overall segmentation performance by 3.12% in the average Dice similarity coefficient compared to no metric learning. In particular, the largest gain was observed for segmenting cell nuclei in H&E-stained images when the deep learning network and triplet loss with semi-hard negative mining were used for the task. (4) Conclusion: We conclude that deep metric learning gives an additional boost to the overall learning process and consequently improves segmentation performance. Notably, the improvement ranges approximately between 0.13% and 22.31% in terms of Dice coefficients for different types of images, compared to no metric learning.
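Semi-hard negative mining selects, for each anchor-positive pair, a negative that already lies farther than the positive but still inside the margin, i.e. d(a,p) < d(a,n) < d(a,p) + margin, which keeps the triplet loss informative without chasing outliers. A minimal sketch under assumed Euclidean embeddings (the batch layout and fallback rule are illustrative, not the paper's exact code):

```python
import numpy as np

def semi_hard_negative(anchor, positive, negatives, margin=0.2):
    """Return the index of a semi-hard negative: farther than the
    positive but within the margin. Falls back to the hardest
    (closest) negative if no semi-hard candidate exists."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(negatives - anchor, axis=1)
    mask = (d_an > d_ap) & (d_an < d_ap + margin)
    cand = np.where(mask)[0]
    if cand.size:
        return int(cand[np.argmin(d_an[cand])])
    return int(np.argmin(d_an))  # fallback: hardest negative

anchor = np.array([0.0, 0.0])
positive = np.array([1.0, 0.0])                      # d_ap = 1.0
negatives = np.array([[1.1, 0.0], [5.0, 0.0], [0.5, 0.0]])
idx = semi_hard_negative(anchor, positive, negatives)  # only d=1.1 is semi-hard
loss = max(np.linalg.norm(anchor - positive)
           - np.linalg.norm(anchor - negatives[idx]) + 0.2, 0.0)
```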


2020 ◽  
Vol 8 (10) ◽  
pp. 805
Author(s):  
Ki-Su Kim ◽  
June-Beom Lee ◽  
Myung-Il Roh ◽  
Ki-Min Han ◽  
Gap-Heon Lee

The path planning of a ship requires much information, and one of the essential factors is predicting the ocean environment. Ocean weather can generally be gathered from forecasting information provided by weather centers. However, these data are difficult to obtain when satellite communication is unstable during voyages, or when forecast data for a more extended period of time are needed for the operation of the fleet. Therefore, shipping companies and classification societies have attempted to establish models for predicting ocean weather on their own. Historically, ocean weather has been predicted primarily using empirical and numerical methods. Recently, methods for predicting ocean weather using deep learning have emerged. In this study, a deep learning model combining a denoising AutoEncoder and a convolutional long short-term memory (LSTM) network was proposed to predict ocean weather worldwide. The denoising AutoEncoder is effective for removing noise that hinders the training of deep learning models. While an LSTM can take time series at specific points as inputs, a convolutional LSTM can use time-series images as inputs, making it suitable for predicting ocean weather over a wide area. Using the proposed model, eight ocean weather parameters were predicted. The proposed learning model predicted ocean weather one week ahead with an average error of 6.7%. The results show the applicability of the proposed learning model for predicting ocean weather.
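The denoising part of the pipeline can be sketched simply: the AutoEncoder is trained on corrupted inputs but reconstructs the clean originals, which is what lets it strip noise before the ConvLSTM sees the fields. The Gaussian corruption and its scale are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_denoising_pairs(clean, noise_std=0.1):
    """Build (input, target) pairs for a denoising AutoEncoder:
    the input is a noise-corrupted copy, the target is the clean
    original, so the network learns to remove the noise."""
    noisy = clean + rng.normal(0.0, noise_std, clean.shape)
    return noisy, clean

# e.g. 4 time steps of a 32x32 gridded weather field
field = rng.random((4, 32, 32))
x_in, x_target = make_denoising_pairs(field)
```

The denoised time-series images would then be stacked as the convolutional LSTM's input sequence.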


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Cach N. Dang ◽  
María N. Moreno-García ◽  
Fernando De la Prieta

Sentiment analysis of public opinion expressed in social networks, such as Twitter or Facebook, has developed into a wide range of applications, but many challenges remain to be addressed. Hybrid techniques have shown potential for reducing sentiment errors on increasingly complex training data. This paper aims to test the reliability of several hybrid techniques on various datasets of different domains. Our research questions address whether it is possible to produce hybrid models that outperform single models across different domains and types of datasets. Hybrid deep sentiment analysis models that combine long short-term memory (LSTM) networks, convolutional neural networks (CNN), and support vector machines (SVM) are built and tested on eight textual tweet and review datasets of different domains. The hybrid models are compared against three single models: SVM, LSTM, and CNN. Both reliability and computation time were considered in the evaluation of each technique. The hybrid models increased the accuracy of sentiment analysis compared with single models on all types of datasets, especially the combination of deep learning models with SVM, whose reliability was significantly higher.
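The deep learning + SVM combination typically means feeding features produced by the deep model into a linear SVM. A minimal sketch of that last stage, with toy features standing in for LSTM/CNN outputs and illustrative hyperparameters (not the paper's exact training setup):

```python
import numpy as np

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Subgradient descent on the regularized hinge loss:
    the linear-SVM classification head of a hybrid model."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):               # labels in {-1, +1}
            if yi * (xi @ w + b) < 1:          # margin violated -> full step
                w += lr * (yi * xi - lam * w)
                b += lr * yi
            else:                              # only weight decay
                w -= lr * lam * w
    return w, b

# Toy "deep features" for 4 documents, 2 sentiment classes
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1, 1, -1, -1])
w, b = train_linear_svm(X, y)
pred = np.sign(X @ w + b)
```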


2021 ◽  
Vol 13 (21) ◽  
pp. 4348
Author(s):  
Ghulam Farooque ◽  
Liang Xiao ◽  
Jingxiang Yang ◽  
Allah Bux Sargano

In recent years, deep learning-based models have produced encouraging results for hyperspectral image (HSI) classification. Specifically, Convolutional Long Short-Term Memory (ConvLSTM) has shown good performance for learning valuable features and modeling long-term dependencies in spectral data. However, it is less effective for learning spatial features, which are an integral part of hyperspectral images. Conversely, convolutional neural networks (CNNs) can learn spatial features, but their local feature extraction limits their handling of long-term dependencies. Considering these factors, this paper proposes an end-to-end Spectral–Spatial 3D ConvLSTM-CNN based Residual Network (SSCRN), which combines 3D ConvLSTM and 3D CNN to handle spectral and spatial information, respectively. The contribution of the proposed network is twofold. First, it addresses the long-term dependencies of the spectral dimension using 3D ConvLSTM to capture information related to various ground materials effectively. Second, it learns discriminative spatial features using a 3D CNN, employing residual blocks to accelerate the training process and alleviate overfitting. In addition, SSCRN uses batch normalization and dropout to regularize the network for smooth learning. The proposed framework is evaluated on three benchmark datasets widely used by the research community. The results confirm that SSCRN outperforms state-of-the-art methods with overall accuracies of 99.17%, 99.67%, and 99.31% on the Indian Pines, Salinas, and Pavia University datasets, respectively. Moreover, it is worth mentioning that these excellent results were achieved with comparatively few epochs, which also confirms the fast learning capability of the SSCRN.
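The batch normalization step mentioned above standardizes activations over the batch axis, then rescales and shifts them. A minimal forward-pass sketch with fixed scalar gamma/beta for illustration (in the network these are learnable per-channel parameters):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch normalization (training-mode forward pass):
    standardize over the batch axis, then scale and shift."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * x_hat + beta

x = np.random.default_rng(1).normal(3.0, 2.0, size=(64, 10))
out = batch_norm(x)  # per-feature mean ~0, std ~1
```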


2020 ◽  
Vol 14 (04) ◽  
pp. 501-516
Author(s):  
Joseph R. Barr ◽  
Peter Shaw ◽  
Faisal N. Abu-Khzam ◽  
Tyler Thatcher ◽  
Sheng Yu

We present an empirical analysis of the source code of the Fluoride Bluetooth module, part of the standard Android OS distribution, exhibiting a novel approach for classifying and scoring source code and rating vulnerability. Our workflow combines deep learning, combinatorial optimization, heuristics, and machine learning. A combination of heuristics and deep learning is used to embed function (and method) labels into a low-dimensional Euclidean space. Because the corpus of the Fluoride source code is rather limited (containing approximately 12,000 functions), a straightforward embedding (using, e.g., code2vec) is untenable. To overcome the dearth of data, it is necessary to go through an intermediate step of Byte-Pair Encoding. Subsequently, we embed the tokens, from which we assemble an embedding of function/method labels; a long short-term memory network (LSTM) is used to embed the tokens. The next step is to form a distance matrix consisting of the cosines between every pair of vectors (function embeddings), which in turn is interpreted as a (combinatorial) graph whose vertices represent functions and whose edges correspond to entries whose values exceed some given threshold. Cluster-Editing is then applied to partition the vertex set of the graph into subsets representing "dense graphs," i.e., nearly complete subgraphs. Finally, the vectors representing the components, plus additional heuristic-based features, are used to model the components for vulnerability risk.
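The Byte-Pair Encoding step can be sketched as repeatedly merging the most frequent adjacent symbol pair across the tokenized corpus, which shrinks the vocabulary needed for a small corpus. One merge iteration on toy data (illustrative, not the actual Fluoride identifiers):

```python
from collections import Counter

def most_frequent_pair(corpus):
    """Count adjacent symbol pairs over all (symbols, count) entries
    and return the most frequent pair: the next BPE merge."""
    pairs = Counter()
    for symbols, count in corpus:
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += count
    return pairs.most_common(1)[0][0]

def merge_pair(symbols, pair):
    """Replace each occurrence of the pair with a single merged token."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged

corpus = [(list("hug"), 1), (list("pug"), 1), (list("hugs"), 1)]
pair = most_frequent_pair(corpus)   # ('u', 'g') occurs 3 times
corpus = [(merge_pair(s, pair), c) for s, c in corpus]
```

Iterating this until a target vocabulary size yields the subword tokens that the LSTM then embeds.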


Author(s):  
Long Yu ◽  
Zhiyin Wang ◽  
Shengwei Tian ◽  
Feiyue Ye ◽  
Jianli Ding ◽  
...  

Traditional machine learning methods for water body extraction need complex spectral analysis and feature selection, which rely on a wealth of prior knowledge. They are time-consuming and struggle to meet requirements for accuracy, automation, and a wide range of applications. We present a novel deep learning framework for water body extraction from Landsat imagery that considers both spectral and spatial information. The framework is a hybrid of a convolutional neural network (CNN) and a logistic regression (LR) classifier. CNN, one of the deep learning methods, has achieved great success on various visual tasks; it can hierarchically extract deep features from raw images directly and distill the spectral–spatial regularities of the input data, thus improving classification performance. Experimental results based on three Landsat imagery datasets show that our proposed model achieves better performance than support vector machine (SVM) and artificial neural network (ANN) classifiers.
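The LR classifier at the end of such a hybrid can be sketched directly: features (here a toy one-dimensional stand-in for CNN-extracted spectral-spatial features) are mapped to a water/non-water probability via a sigmoid, trained with the cross-entropy gradient. Hyperparameters are illustrative.

```python
import numpy as np

def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid probabilities
        grad_w = X.T @ (p - y) / len(y)          # cross-entropy gradient
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy per-pixel feature, e.g. a water-index-like value; 1 = water pixel
X = np.array([[0.1], [0.2], [0.8], [0.9]])
y = np.array([0, 0, 1, 1])
w, b = train_logistic_regression(X, y)
pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
```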


2020 ◽  
Vol 10 (15) ◽  
pp. 5293 ◽  
Author(s):  
Rebeen Ali Hamad ◽  
Longzhi Yang ◽  
Wai Lok Woo ◽  
Bo Wei

Human activity recognition has become essential to a wide range of applications, such as smart home monitoring, health care, and surveillance. However, it is challenging to deliver a sufficiently robust human activity recognition system from raw sensor data with noise in a smart environment setting. Moreover, imbalanced human activity datasets with less frequent activities create extra challenges for accurate activity recognition. Deep learning algorithms have achieved promising results on balanced datasets, but their performance on imbalanced datasets without explicit algorithm design cannot be guaranteed. Therefore, we aim to realise an activity recognition system using multi-modal sensors to address the issue of class imbalance in deep learning and improve recognition accuracy. This paper proposes a joint diverse temporal learning framework using Long Short-Term Memory and one-dimensional Convolutional Neural Network models to improve human activity recognition, especially for less represented activities. We extensively evaluate the proposed method for recognition of Activities of Daily Living using binary sensor datasets. A comparative study on five smart home datasets demonstrates that our proposed approach outperforms the existing individual temporal models and their hybridization. This is particularly the case for minority classes, in addition to a reasonable improvement on the majority classes of human activities.
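One common way to address class imbalance (a sketch of the general technique; the paper's exact balancing strategy may differ) is to weight each class inversely to its frequency in the loss, so rare activities contribute as much as frequent ones:

```python
import numpy as np

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so rare classes get proportionally larger loss weights."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

# Toy activity labels: 90 frequent vs 10 rare
labels = np.array(["sleep"] * 90 + ["cook"] * 10)
w = balanced_class_weights(labels)  # "cook" weighted 9x more than "sleep"
```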


2021 ◽  
Author(s):  
Mohammed Jarbou ◽  
Daehan Won ◽  
Jennifer Gillis ◽  
Raymond Romanczyk

Abstract Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder that affects the areas of social communication and behavior. The term "spectrum" refers to the wide range of symptoms observed across individuals with ASD. Many children with ASD experience difficulty with daily functioning at school and home. ASD prevalence is increasing in the United States, with the most recent estimate at 1.9%. Given the wide range of social and learning difficulties experienced by children with ASD, it is paramount that they are able to attend school to receive the appropriate range of interventions. School absenteeism (SA) is a significant concern given its association with many negative consequences, such as school drop-out. Early prediction of SA would help school districts implement effective interventions to ameliorate this issue. Due to the heterogeneity of ASD, students with ASD show within-group differences concerning their SA. This research introduces a deep learning-based framework for predicting short- and long-term SA of students with ASD. The Long Short-Term Memory (LSTM) algorithm is used to predict short-term SA, while Multilayer Perceptron (MLP) and Random Forest (RF) algorithms are used to predict long-term SA. The proposed framework achieves high accuracies of 89% and 90% for predicting short-term and long-term SA, respectively.
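For the LSTM's short-term prediction, per-day attendance records must be framed as supervised sequences. A minimal sliding-window sketch, where each sample is the last few days and the target is the next day's flag (the window length and binary encoding are assumptions for illustration):

```python
import numpy as np

def make_lstm_windows(series, window=5):
    """Frame a daily series for sequence prediction: inputs are
    windows of `window` consecutive days, targets are the next day."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X), np.array(y)

attendance = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]   # 1 = present, 0 = absent
X, y = make_lstm_windows(attendance)           # 5 windows of 5 days each
```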

