Deep supervised hashing for gait retrieval

F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 1038
Author(s):  
Shohel Sayeed ◽  
Pa Pa Min ◽  
Thian Song Ong

Background: Gait recognition is perceived as the most promising biometric approach for future decades, especially because of its efficient applicability in surveillance systems. Due to the recent growth in the use of gait biometrics across surveillance systems, the ability to rapidly search the stored data has become an emerging need. We therefore address the gait retrieval problem: retrieving people whose gaits are similar to a query subject's from a large-scale dataset. Methods: This paper presents the deep gait retrieval hashing (DGRH) model to address the gait retrieval problem for large-scale datasets. Our proposed method is a supervised hashing method built on a deep convolutional network. We use the ability of the convolutional neural network (CNN) to capture semantic gait features for feature representation and learn compact hash codes with a compatible hash function, so the DGRH model combines gait feature learning with binary hash code learning. In addition, the learning objective combines a classification loss that learns to preserve similarity with a quantization loss that controls the quality of the hash codes. Results: The proposed method was evaluated on the CASIA-B, OUISIR-LP, and OUISIR-MVLP benchmark datasets and achieved promising results on gait retrieval tasks. Conclusions: The end-to-end deep supervised hashing model learns discriminative gait features and is efficient in terms of storage memory and speed for gait retrieval.
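The two ingredients the abstract names, a quantization penalty that pulls each real-valued network output toward ±1 and retrieval by Hamming-distance ranking over the resulting binary codes, can be illustrated with a minimal pure-Python sketch. All names and values here are illustrative, not the authors' implementation.

```python
def binarize(z):
    """Turn real-valued network outputs into a +/-1 hash code."""
    return [1 if v >= 0 else -1 for v in z]

def quantization_loss(z):
    """Penalty that pulls each real-valued entry toward +/-1."""
    return sum((abs(v) - 1.0) ** 2 for v in z) / len(z)

def hamming(a, b):
    """Hamming distance between two +/-1 codes."""
    return sum(1 for x, y in zip(a, b) if x != y)

def retrieve(query_z, gallery_z):
    """Rank gallery items by Hamming distance to the query's hash code."""
    q = binarize(query_z)
    dists = [(i, hamming(q, binarize(g))) for i, g in enumerate(gallery_z)]
    return sorted(dists, key=lambda t: t[1])

# Toy example: gallery item 1 matches the query's sign pattern exactly.
query = [0.9, -0.4, 0.7, -0.8]
gallery = [[-0.5, 0.3, -0.9, 0.2],   # opposite signs -> far
           [0.6, -0.1, 0.2, -0.3],   # same signs     -> distance 0
           [0.8, 0.7, -0.6, -0.9]]   # partial match
ranking = retrieve(query, gallery)
```

Outputs that already sit at ±1 incur zero quantization loss, which is what the loss term in the paper encourages.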

Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2158
Author(s):  
Juan Du ◽  
Kuanhong Cheng ◽  
Yue Yu ◽  
Dabao Wang ◽  
Huixin Zhou

Panchromatic (PAN) images contain abundant spatial information that is useful for earth observation, but they often suffer from low resolution (LR) due to sensor limitations and the large-scale field of view. Current super-resolution (SR) methods based on traditional attention mechanisms have shown remarkable advantages but remain imperfect at reconstructing the edge details of SR images. To address this problem, an improved SR model involving a self-attention augmented Wasserstein generative adversarial network (SAA-WGAN) is designed to mine the reference information among multiple features for detail enhancement. We use an encoder-decoder network followed by a fully convolutional network (FCN) as the backbone to extract multi-scale features and reconstruct the high-resolution (HR) results. To exploit the relevance between multi-layer feature maps, we first integrate a convolutional block attention module (CBAM) into each skip-connection of the encoder-decoder subnet, generating weighted maps that automatically enhance both channel-wise and spatial-wise feature representation. In addition, considering that the HR results and LR inputs are highly similar in structure, yet this similarity cannot be fully exploited by traditional attention mechanisms, we designed a self augmented attention (SAA) module in which the attention weights are produced dynamically via a similarity function between hidden features; this design allows the network to flexibly adjust the relevance among multi-layer features and retain long-range inter-feature information, which helps preserve details. Finally, the pixel-wise loss is combined with perceptual and gradient losses to achieve comprehensive supervision. Experiments on benchmark datasets demonstrate that the proposed method outperforms other SR methods in terms of both objective evaluation and visual effect.
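The core idea of the SAA module, attention weights produced dynamically from a similarity function between hidden features, can be sketched in a few lines of plain Python. The dot-product similarity and tiny 2-D feature vectors below are assumptions for illustration, not the paper's exact design.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(features):
    """For each feature vector, weight every vector by a scaled
    dot-product similarity and return (weights, weighted mixture)."""
    out = []
    for q in features:
        scores = [dot(q, k) / math.sqrt(len(q)) for k in features]
        weights = softmax(scores)
        mixed = [sum(w * k[d] for w, k in zip(weights, features))
                 for d in range(len(q))]
        out.append((weights, mixed))
    return out

# Three toy hidden features; the third is most similar to itself.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = self_attention(feats)
```

Because the weights come from a similarity function rather than fixed parameters, each position can attend to distant but structurally related features, which is the long-range behaviour the abstract attributes to the SAA module.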


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 381
Author(s):  
Pia Addabbo ◽  
Mario Luca Bernardi ◽  
Filippo Biondi ◽  
Marta Cimitile ◽  
Carmine Clemente ◽  
...  

The capability of sensors to identify individuals in a specific scenario is a topic of high relevance for sensitive sectors such as public security. A traditional approach involves cameras; however, camera-based surveillance systems lack discretion and have high computational and storage requirements for performing human identification. Moreover, they are strongly influenced by external factors (e.g., light and weather). This paper proposes an approach based on a temporal convolutional deep neural network classifier applied to radar micro-Doppler signatures in order to identify individuals. Both the sensor and processing requirements ensure a low size, weight, and power profile, enabling large-scale deployment of discreet human identification systems. The proposed approach is assessed on real data from 106 individuals. The results show good classifier accuracy (the best obtained accuracy is 0.89, with an F1-score of 0.885) and improved performance compared to other standard approaches.
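The accuracy and F1-score the abstract reports are standard classification metrics; for reference, a small sketch of how per-class precision, recall, and F1 are computed from predicted labels (toy labels, not the paper's data):

```python
def f1_scores(y_true, y_pred, labels):
    """Per-class (precision, recall, F1) from true and predicted labels."""
    out = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        out[c] = (prec, rec, f1)
    return out

# Toy two-class example with one miss in each direction.
y_true = ["a", "a", "b", "b", "b"]
y_pred = ["a", "b", "b", "b", "a"]
scores = f1_scores(y_true, y_pred, ["a", "b"])
```

Averaging the per-class F1 values (macro-averaging) gives a single score comparable to the 0.885 reported; the paper does not state which averaging it uses, so that is an assumption here.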


Author(s):  
Xin Zhao ◽  
Guiguang Ding ◽  
Yuchen Guo ◽  
Jungong Han ◽  
Yue Gao

Cross-view retrieval, which focuses on searching for images in response to text queries or vice versa, has received increasing attention recently. Cross-view hashing solves the cross-view retrieval problem efficiently with binary hash codes. Most existing works on cross-view hashing exploit multi-view embedding methods to tackle this problem, which inevitably causes information loss in both the image and text domains. Inspired by Generative Adversarial Nets (GANs), this paper presents a new model that is able to Turn Cross-view Hashing into single-view hashing (TUCH), thus enabling image information to be preserved as much as possible. TUCH is a novel deep architecture that integrates a language model network T for text feature extraction, a generator network G to generate fake images from text features, and a hashing network H that learns hashing functions to generate compact binary codes. Our architecture effectively unifies joint generative adversarial learning and cross-view hashing. Extensive empirical evidence shows that our TUCH approach achieves state-of-the-art results, especially on text-to-image retrieval, on image-sentence datasets, i.e., the standard IAPRTC-12 and the large-scale Microsoft COCO.
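The key trick, turning a cross-view query into a single-view one, can be sketched as: map a text query through the generator G into image-feature space, hash it with the same function H used for the gallery images, and rank by Hamming distance. Everything below (the fixed projection matrix, the identity generator) is a toy stand-in for the learned networks, purely for illustration.

```python
def hash_code(image_feat):
    """Shared hashing function H: sign of projected features.
    The projection is a fixed toy matrix; in TUCH it is learned."""
    proj = [[0.6, -0.2, 0.1],
            [-0.3, 0.5, 0.4]]
    return [1 if sum(w * x for w, x in zip(row, image_feat)) >= 0 else -1
            for row in proj]

def generate_image_feat(text_feat):
    """Stand-in for the generator G that maps text features into the
    image-feature space (identity here, purely illustrative)."""
    return list(text_feat)

def cross_view_retrieve(text_feat, image_gallery):
    """Text query -> G -> shared hash space -> Hamming ranking."""
    q = hash_code(generate_image_feat(text_feat))

    def dist(img):
        return sum(a != b for a, b in zip(q, hash_code(img)))

    return sorted(range(len(image_gallery)),
                  key=lambda i: dist(image_gallery[i]))

# The second gallery image shares the query's feature pattern.
ranking = cross_view_retrieve([1.0, 0.0, 0.0],
                              [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
```

Because both views pass through the same hashing network H, no separate multi-view embedding is needed, which is how the architecture avoids the information loss the abstract mentions.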



2021 ◽  
Vol 11 (10) ◽  
pp. 4426
Author(s):  
Chunyan Ma ◽  
Ji Fan ◽  
Jinghao Yao ◽  
Tao Zhang

Computer vision-based action recognition of basketball players in basketball training and competition has gradually become a research hotspot. However, owing to complex technical actions, diverse backgrounds, and limb occlusion, it remains a challenging task without effective solutions or public dataset benchmarks. In this study, we defined 32 kinds of atomic actions covering most of the complex actions of basketball players and built the NPU RGB+D dataset (a large-scale basketball action recognition dataset with RGB image data and depth data captured at Northwestern Polytechnical University) for 12 kinds of actions of 10 professional basketball players, with 2169 RGB+D videos and 75 thousand frames, including RGB frame sequences, depth maps, and skeleton coordinates. By extracting spatial features from the distances and angles between the joint points of basketball players, we created a new feature-enhanced skeleton-based method for basketball player action recognition, called LSTM-DGCN, based on the deep graph convolutional network (DGCN) and long short-term memory (LSTM) methods. Many advanced action recognition methods were evaluated on our dataset and compared with our proposed method. The experimental results show that the NPU RGB+D dataset is very competitive with current action recognition algorithms and that our LSTM-DGCN outperforms the state-of-the-art action recognition methods on various evaluation criteria on our dataset. Our action classifications and the NPU RGB+D dataset are valuable for basketball player action recognition techniques. The feature-enhanced LSTM-DGCN achieves more accurate action recognition by improving the motion expression ability of the skeleton data.
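The distance-and-angle joint features the method extracts can be computed directly from skeleton coordinates. A minimal sketch with a toy 2-D arm follows; the joint names and coordinates are made up for illustration and are not the NPU RGB+D skeleton format.

```python
import math

def joint_distance(a, b):
    """Euclidean distance between two joint coordinates."""
    return math.dist(a, b)

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by segments b->a and b->c."""
    v1 = [a[i] - b[i] for i in range(len(b))]
    v2 = [c[i] - b[i] for i in range(len(b))]
    cos = (sum(x * y for x, y in zip(v1, v2))
           / (math.hypot(*v1) * math.hypot(*v2)))
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

# Toy 2-D skeleton: shoulder, elbow, wrist of an arm bent at 90 degrees.
shoulder, elbow, wrist = (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)
dist = joint_distance(shoulder, elbow)
angle = joint_angle(shoulder, elbow, wrist)
```

Stacking such distances and angles per frame yields the enhanced feature sequence that the LSTM-DGCN consumes alongside the raw skeleton.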


Cancers ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 2111
Author(s):  
Bo-Wei Zhao ◽  
Zhu-Hong You ◽  
Lun Hu ◽  
Zhen-Hao Guo ◽  
Lei Wang ◽  
...  

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with time-consuming and labor-intensive in vivo experimental methods, computational models can provide high-quality DTI candidates almost instantly. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture both the local and global structural information of the graph. Specifically, the first-order neighbor information of nodes is aggregated by a graph convolutional network (GCN), while the high-order neighbor information of nodes is learned by the graph embedding method DeepWalk. Finally, the two kinds of features are fed into a random forest classifier to train and predict potential DTIs. The results show that our method obtained an area under the receiver operating characteristic curve (AUROC) of 0.9455 and an area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. Moreover, we compared the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. The proposed model is expected to bring new inspiration and provide novel perspectives to relevant researchers.
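The two complementary views of the graph, first-order aggregation (GCN-style) and high-order structure via truncated random walks (DeepWalk-style), can be sketched on a tiny graph. This is a conceptual illustration with made-up nodes, not the LGDTI pipeline.

```python
import random

def gcn_step(adj, feats):
    """One first-order aggregation: average each node's own feature
    with its neighbours' (self-loop included)."""
    n = len(adj)
    out = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j]] + [i]
        out.append([sum(feats[j][d] for j in nbrs) / len(nbrs)
                    for d in range(len(feats[i]))])
    return out

def random_walk(adj, start, length, rng):
    """A DeepWalk-style truncated random walk; the visited sequence
    captures high-order neighbourhood structure."""
    walk = [start]
    for _ in range(length):
        nbrs = [j for j in range(len(adj)) if adj[walk[-1]][j]]
        if not nbrs:
            break
        walk.append(rng.choice(nbrs))
    return walk

# Tiny 3-node drug-target graph: edges 0-1 and 1-2.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [[1.0], [0.0], [1.0]]
agg = gcn_step(adj, feats)
walk = random_walk(adj, 0, 4, random.Random(0))
```

In LGDTI the walks would be fed to an embedding model and the two feature sets concatenated before the random forest; here the walk itself just shows where the high-order information comes from.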


Mathematics ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 772
Author(s):  
Seonghun Kim ◽  
Seockhun Bae ◽  
Yinhua Piao ◽  
Kyuri Jo

Genomic profiles of cancer patients such as gene expression have become a major source to predict responses to drugs in the era of personalized medicine. As large-scale drug screening data with cancer cell lines are available, a number of computational methods have been developed for drug response prediction. However, few methods incorporate both gene expression data and the biological network, which can harbor essential information about the underlying process of the drug response. We proposed an analysis framework called DrugGCN for prediction of Drug response using a Graph Convolutional Network (GCN). DrugGCN first generates a gene graph by combining a Protein-Protein Interaction (PPI) network and gene expression data with feature selection of drug-related genes, and the GCN model detects the local features such as subnetworks of genes that contribute to the drug response by localized filtering. We demonstrated the effectiveness of DrugGCN using biological data showing its high prediction accuracy among the competing methods.
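The localized filtering DrugGCN relies on can be illustrated with the widely used GCN propagation matrix D^-1/2 (A + I) D^-1/2, which mixes each gene's expression only with its PPI neighbours. The tiny PPI graph and expression values below are invented for illustration.

```python
import math

def normalized_adjacency(adj):
    """D^-1/2 (A + I) D^-1/2: the propagation matrix of the common
    GCN formulation of localized graph filtering."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def propagate(norm_adj, feats):
    """One localized filtering step: features mix only along PPI edges."""
    n = len(norm_adj)
    return [[sum(norm_adj[i][j] * feats[j][d] for j in range(n))
             for d in range(len(feats[i]))] for i in range(n)]

# Two interacting genes plus one isolated gene.
ppi = [[0, 1, 0],
       [1, 0, 0],
       [0, 0, 0]]
expr = [[2.0], [0.0], [5.0]]
smoothed = propagate(normalized_adjacency(ppi), expr)
```

The isolated gene's value is untouched while the interacting pair's values are blended, which is the "subnetwork" locality the abstract describes.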


Life ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 564
Author(s):  
Yen-Bo Liu ◽  
Lu-Ting Kuo ◽  
Chih-Hao Chen ◽  
Woon-Man Kung ◽  
Hsin-Hsi Tsai ◽  
...  

Coagulopathy-related intracerebral hemorrhage (ICH) is life-threatening. Recent studies have shown promising results with minimally invasive neurosurgery (MIN) in the reduction of mortality and improvement of functional outcomes, but no published data have recorded the safety and efficacy of MIN for coagulopathy-related ICH. Seventy-five coagulopathy-related ICH patients were retrospectively reviewed to compare the surgical outcomes between craniotomy (n = 52) and MIN (n = 23). Postoperative rebleeding rates, morbidity rates, and mortality at 1 month were analyzed. Postoperative Glasgow Outcome Scale Extended (GOSE) and modified Rankin Scale (mRS) scores at 1 year were assessed for functional outcomes. Morbidity, mortality, and rebleeding rates were all lower in the MIN group than the craniotomy group (8.70% vs. 30.77%, 8.70% vs. 19.23%, and 4.35% vs. 23.08%, respectively). The 1-year GOSE score was significantly higher in the MIN group than the craniotomy group (3.96 ± 1.55 vs. 3.10 ± 1.59, p = 0.027). Multivariable logistic regression analysis also revealed that MIN contributed to improved GOSE (estimate: 0.99650, p = 0.0148) and mRS scores (estimate: −0.72849, p = 0.0427) at 1 year. MIN, with low complication rates and improved long-term functional outcome, is feasible and favorable for coagulopathy-related ICH. This promising result should be validated in a large-scale prospective study.


2021 ◽  
Vol 11 (10) ◽  
pp. 4678
Author(s):  
Chao Chen ◽  
Weiyu Guo ◽  
Chenfei Ma ◽  
Yongkui Yang ◽  
Zheng Wang ◽  
...  

Since continuous motion control can provide a more natural, fast and accurate man–machine interface than discrete motion control, it has been widely used in human–robot cooperation (HRC). Among various biological signals, the surface electromyogram (sEMG), the signal of action potentials superimposed at the surface of the skin that carries both temporal and spatial information, is one of the best signals from which to extract human motion intentions. However, most current sEMG control methods can only perform discrete motion estimation and thus fail to meet the requirements of continuous motion estimation. In this paper, we propose a novel method that applies a temporal convolutional network (TCN) to sEMG-based continuous estimation. After analyzing the relationship between the convolutional kernel's size and the lengths of atomic segments (defined in this paper), we propose a large-scale temporal convolutional network (LS-TCN) to overcome the TCN's difficulty in fully extracting the sEMG's temporal features. When applying our proposed LS-TCN with a convolutional kernel size of 1 × 31 to continuously estimate the angles of the 10 main joints of the fingers (based on the public dataset Ninapro), it achieves a precision rate of 71.6%. Compared with a TCN (kernel size 1 × 3), the LS-TCN (kernel size 1 × 31) improves the precision rate by 6.6%.
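The benefit of the 1 × 31 kernel can be quantified through the receptive field of a stack of dilated causal convolutions. The four-layer, doubling-dilation schedule below is an assumption for illustration, not necessarily the paper's architecture.

```python
def tcn_receptive_field(kernel_size, dilations):
    """Receptive field of stacked dilated causal convolutions:
    1 + sum over layers of (kernel_size - 1) * dilation."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Four layers with the usual doubling dilations 1, 2, 4, 8.
dilations = [1, 2, 4, 8]
rf_small = tcn_receptive_field(3, dilations)    # 1x3 kernel
rf_large = tcn_receptive_field(31, dilations)   # 1x31 kernel
```

With the same depth and dilation schedule, the 1 × 31 kernel sees 451 input samples against 31 for the 1 × 3 kernel, which is one way to read the claim that the larger kernel extracts the sEMG's temporal features more fully.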

