scholarly journals Multi-Level Fusion Temporal–Spatial Co-Attention for Video-Based Person Re-Identification

Entropy ◽  
2021 ◽  
Vol 23 (12) ◽  
pp. 1686
Author(s):  
Shengyu Pei ◽  
Xiaoping Fan

A convolutional neural network can easily fall into local minima for insufficient data, and the needed training is unstable. Many current methods are used to solve these problems by adding pedestrian attributes, pedestrian postures, and other auxiliary information, but they require additional collection, which is time-consuming and laborious. Every video sequence frame has a different degree of similarity. In this paper, multi-level fusion temporal–spatial co-attention is adopted to improve person re-identification (reID). For a small dataset, the improved network can better prevent over-fitting and reduce the dataset limit. Specifically, the concept of knowledge evolution is introduced into video-based person re-identification to improve the backbone residual neural network (ResNet). The global branch, local branch, and attention branch are used in parallel for feature extraction. Three high-level features are embedded in the metric learning network to improve the network’s generalization ability and the accuracy of video-based person re-identification. Simulation experiments are implemented on small datasets PRID2011 and iLIDS-VID, and the improved network can better prevent over-fitting. Experiments are also implemented on MARS and DukeMTMC-VideoReID, and the proposed method can be used to extract more feature information and improve the network’s generalization ability. The results show that our method achieves better performance. The model achieves 90.15% Rank1 and 81.91% mAP on MARS.

Life ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 582
Author(s):  
Yuchai Wan ◽  
Zhongshu Zheng ◽  
Ran Liu ◽  
Zheng Zhu ◽  
Hongen Zhou ◽  
...  

Many computer-aided diagnosis methods, especially ones with deep learning strategies, of liver cancers based on medical images have been proposed. However, most of such methods analyze the images under only one scale, and the deep learning models are always unexplainable. In this paper, we propose a deep learning-based multi-scale and multi-level fusing approach of CNNs for liver lesion diagnosis on magnetic resonance images, termed as MMF-CNN. We introduce a multi-scale representation strategy to encode both the local and semi-local complementary information of the images. To take advantage of the complementary information of multi-scale representations, we propose a multi-level fusion method to combine the information of both the feature level and the decision level hierarchically and generate a robust diagnostic classifier based on deep learning. We further explore the explanation of the diagnosis decision of the deep neural network through visualizing the areas of interest of the network. A new scoring method is designed to evaluate whether the attention maps can highlight the relevant radiological features. The explanation and visualization make the decision-making process of the deep neural network transparent for the clinicians. We apply our proposed approach to various state-of-the-art deep learning architectures. The experimental results demonstrate the effectiveness of our approach.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Lei Li

The growth and popularity of streaming music have changed the way people consume music, and users can listen to online music anytime and anywhere. By integrating various recommendation algorithms/strategies (user profiling, collaborative filtering, content filtering, etc.), we capture users’ interests and preferences and recommend the content of interest to them. To address the sparsity of behavioral data in digital music marketing, which leads to inadequate mining of user music preference features, a metric ranking learning recommendation algorithm with fused content representation is proposed. Relative partial order relations are constructed using observed and unobserved behavioral data to enable the model to be fully trained, while audio feature extraction submodels related to the recommendation task are constructed to further alleviate the data sparsity problem, and finally, the preference relationships between users and songs are mined through metric learning. Convolutional neural networks are used to extract the high-level semantic features of songs, and then the high-level semantic features of songs extracted from the previous layer are reformed into a session time sequence list according to the time sequence of user listening in order to build a bidirectional recurrent neural network model based on the attention mechanism so that it can reduce the influence of noisy data and learn the strong dependencies between songs.


Author(s):  
Yujia Sun ◽  
Geng Chen ◽  
Tao Zhou ◽  
Yi Zhang ◽  
Nian Liu

Camouflaged object detection (COD) is a challenging task due to the low boundary contrast between the object and its surroundings. In addition, the appearance of camouflaged objects varies significantly, e.g., object size and shape, aggravating the difficulties of accurate COD. In this paper, we propose a novel Context-aware Cross-level Fusion Network (C2F-Net) to address the challenging COD task. Specifically, we propose an Attention-induced Cross-level Fusion Module (ACFM) to integrate the multi-level features with informative attention coefficients. The fused features are then fed to the proposed Dual-branch Global Context Module (DGCM), which yields multi-scale feature representations for exploiting rich global context information. In C2F-Net, the two modules are conducted on high-level features using a cascaded manner. Extensive experiments on three widely used benchmark datasets demonstrate that our C2F-Net is an effective COD model and outperforms state-of-the-art models remarkably. Our code is publicly available at: https://github.com/thograce/C2FNet.


2020 ◽  
Vol 2020 (8) ◽  
pp. 188-1-188-7
Author(s):  
Xiaoyu Xiang ◽  
Yang Cheng ◽  
Jianhang Chen ◽  
Qian Lin ◽  
Jan Allebach

Image aesthetic assessment has always been regarded as a challenging task because of the variability of subjective preference. Besides, the assessment of a photo is also related to its style, semantic content, etc. Conventionally, the estimations of aesthetic score and style for an image are treated as separate problems. In this paper, we explore the inter-relatedness between the aesthetics and image style, and design a neural network that can jointly categorize image by styles and give an aesthetic score distribution. To this end, we propose a multi-task network (MTNet) with an aesthetic column serving as a score predictor and a style column serving as a style classifier. The angular-softmax loss is applied in training primary style classifiers to maximize the margin among classes in single-label training data; the semi-supervised method is applied to improve the network’s generalization ability iteratively. We combine the regression loss and classification loss in training aesthetic score. Experiments on the AVA dataset show the superiority of our network in both image attributes classification and aesthetic ranking tasks.


Genes ◽  
2021 ◽  
Vol 12 (4) ◽  
pp. 572
Author(s):  
Alan M. Luu ◽  
Jacob R. Leistico ◽  
Tim Miller ◽  
Somang Kim ◽  
Jun S. Song

Understanding the recognition of specific epitopes by cytotoxic T cells is a central problem in immunology. Although predicting binding between peptides and the class I Major Histocompatibility Complex (MHC) has had success, predicting interactions between T cell receptors (TCRs) and MHC class I-peptide complexes (pMHC) remains elusive. This paper utilizes a convolutional neural network model employing deep metric learning and multimodal learning to perform two critical tasks in TCR-epitope binding prediction: identifying the TCRs that bind a given epitope from a TCR repertoire, and identifying the binding epitope of a given TCR from a list of candidate epitopes. Our model can perform both tasks simultaneously and reveals that inconsistent preprocessing of TCR sequences can confound binding prediction. Applying a neural network interpretation method identifies key amino acid sequence patterns and positions within the TCR, important for binding specificity. Contrary to common assumption, known crystal structures of TCR-pMHC complexes show that the predicted salient amino acid positions are not necessarily the closest to the epitopes, implying that physical proximity may not be a good proxy for importance in determining TCR-epitope specificity. Our work thus provides an insight into the learned predictive features of TCR-epitope binding specificity and advances the associated classification tasks.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ha Min Son ◽  
Wooho Jeon ◽  
Jinhyun Kim ◽  
Chan Yeong Heo ◽  
Hye Jin Yoon ◽  
...  

AbstractAlthough computer-aided diagnosis (CAD) is used to improve the quality of diagnosis in various medical fields such as mammography and colonography, it is not used in dermatology, where noninvasive screening tests are performed only with the naked eye, and avoidable inaccuracies may exist. This study shows that CAD may also be a viable option in dermatology by presenting a novel method to sequentially combine accurate segmentation and classification models. Given an image of the skin, we decompose the image to normalize and extract high-level features. Using a neural network-based segmentation model to create a segmented map of the image, we then cluster sections of abnormal skin and pass this information to a classification model. We classify each cluster into different common skin diseases using another neural network model. Our segmentation model achieves better performance compared to previous studies, and also achieves a near-perfect sensitivity score in unfavorable conditions. Our classification model is more accurate than a baseline model trained without segmentation, while also being able to classify multiple diseases within a single image. This improved performance may be sufficient to use CAD in the field of dermatology.


Cancers ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 617
Author(s):  
Guoqing Bao ◽  
Xiuying Wang ◽  
Ran Xu ◽  
Christina Loh ◽  
Oreoluwa Daniel Adeyinka ◽  
...  

We have developed a platform, termed PathoFusion, which is an integrated system for marking, training, and recognition of pathological features in whole-slide tissue sections. The platform uses a bifocal convolutional neural network (BCNN) which is designed to simultaneously capture both index and contextual feature information from shorter and longer image tiles, respectively. This is analogous to how a microscopist in pathology works, identifying a cancerous morphological feature in the tissue context using first a narrow and then a wider focus, hence bifocal. Adjacent tissue sections obtained from glioblastoma cases were processed for hematoxylin and eosin (H&E) and immunohistochemical (CD276) staining. Image tiles cropped from the digitized images based on markings made by a consultant neuropathologist were used to train the BCNN. PathoFusion demonstrated its ability to recognize malignant neuropathological features autonomously and map immunohistochemical data simultaneously. Our experiments show that PathoFusion achieved areas under the curve (AUCs) of 0.985 ± 0.011 and 0.988 ± 0.001 in patch-level recognition of six typical pathomorphological features and detection of associated immunoreactivity, respectively. On this basis, the system further correlated CD276 immunoreactivity to abnormal tumor vasculature. Corresponding feature distributions and overlaps were visualized by heatmaps, permitting high-resolution qualitative as well as quantitative morphological analyses for entire histological slides. Recognition of more user-defined pathomorphological features can be added to the system and included in future tissue analyses. Integration of PathoFusion with the day-to-day service workflow of a (neuro)pathology department is a goal. The software code for PathoFusion is made publicly available.


Sign in / Sign up

Export Citation Format

Share Document