visual descriptors
Recently Published Documents

TOTAL DOCUMENTS: 106 (five years: 16)
H-INDEX: 13 (five years: 2)

2021 ◽ Vol 11 (15) ◽ pp. 6783
Author(s): Thanh-Vu Dang ◽ Gwang-Hyun Yu ◽ Jin-Young Kim

Recent empirical work reveals that visual representations learned by deep neural networks can be successfully used as descriptors for image retrieval. A common technique is to leverage pre-trained models, learning visual descriptors with ranking losses and fine-tuning on labeled data. However, retrieval performance decreases significantly when query images have a lower resolution than the training images. This study considers a contrastive learning framework, fine-tuned on features extracted from a pre-trained neural network encoder equipped with an attention mechanism, to address low-resolution image retrieval. Our method is simple yet effective: the contrastive learning framework drives similar samples close to each other in feature space by manipulating variants of their augmentations. To benchmark the proposed framework, we conducted quantitative and qualitative analyses on the CARS196 (mAP = 0.8804), CUB200-2011 (mAP = 0.9379), and Stanford Online Products (mAP = 0.9141) datasets.
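The abstract gives no implementation details, but the contrastive objective it describes is commonly realized as an NT-Xent-style loss over two augmented views of the same images. Below is a minimal PyTorch sketch of such a loss; the temperature value and the encoder producing the embeddings are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch of an NT-Xent-style contrastive loss over two augmented
# views. The encoder, attention module, and temperature are assumptions;
# the paper's exact objective may differ.
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """z1, z2: (N, D) embeddings of two augmentations of the same N images."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                 # (2N, D)
    sim = z @ z.t() / temperature                  # scaled cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float('-inf'))          # exclude self-similarity
    # the positive for sample i is i+n (and vice versa)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```

Minimizing this loss pulls the two augmented views of each image together while pushing apart all other samples in the batch, which is the "drives similar samples close to each other" behavior the abstract describes.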


PLoS ONE ◽ 2021 ◽ Vol 16 (4) ◽ pp. e0249950
Author(s): Rebecca Scheurich ◽ Caroline Palmer ◽ Batu Kaya ◽ Caterina Agostino ◽ Signy Sheldon

Although it is understood that episodic memories of everyday events involve encoding a wide array of perceptual and non-perceptual information, it is unclear how these distinct types of information are recalled. To address this knowledge gap, we examined how perceptual (visual versus auditory) and non-perceptual details described within a narrative, a proxy for everyday event memories, were retrieved. Based on previous work indicating a bias for visual content, we hypothesized that participants would be most accurate at recalling visually described details and would tend to falsely recall non-visual details with visual descriptors. In Study 1, participants watched videos of a protagonist telling narratives of everyday events under three conditions: with visual, auditory, or audiovisual details. All narratives contained the same non-perceptual content. Participants' free recall of these narratives under each condition was scored for the type of details recalled (perceptual, non-perceptual) and whether each detail was recalled with gist or verbatim memory. We found that participants were more accurate at gist and verbatim recall for visual perceptual details. This visual bias was also evident in the errors made during recall: participants tended to incorrectly recall details with visual information, but not with auditory information. Study 2 tested for this pattern of results when the narratives were presented in an auditory-only format. The results conceptually replicated Study 1: a persistent visual bias remained in what was recollected from the complex narratives. Together, these findings indicate a bias toward recruiting visualizable content to construct complex multi-detail memories.


Sensors ◽ 2021 ◽ Vol 21 (3) ◽ pp. 1010
Author(s): Claudio Cusano ◽ Paolo Napoletano ◽ Raimondo Schettini

In this paper we present T1K+, a very large, heterogeneous database of high-quality texture images acquired under variable conditions. T1K+ contains 1129 classes of textures ranging from natural subjects to food, textile samples, construction materials, etc. T1K+ allows the design of experiments specifically aimed at understanding the issues related to texture classification and retrieval. To ease exploration of the database, all 1129 classes are hierarchically organized into 5 thematic categories and 266 sub-categories. To complete our study, we present an evaluation of hand-crafted and learned visual descriptors in supervised texture classification tasks.
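As a concrete illustration of the kind of supervised texture-classification experiment the database supports, the sketch below pairs one hand-crafted descriptor (a uniform LBP histogram) with a linear SVM. This is an assumed baseline, not the descriptor set or evaluation protocol actually used by the authors.

```python
# Sketch of a hand-crafted-descriptor texture-classification baseline:
# uniform LBP histograms fed to a linear SVM. Illustrative only; the
# descriptors and protocol evaluated on T1K+ may differ.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import LinearSVC

def lbp_histogram(gray_image, P=8, R=1.0):
    """Uniform LBP histogram for one grayscale image (2D array)."""
    codes = local_binary_pattern(gray_image, P, R, method='uniform')
    n_bins = P + 2                      # P+1 uniform patterns + 'non-uniform'
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins), density=True)
    return hist

def classify(train_images, train_labels, test_images):
    """train/test images: lists of 2D grayscale arrays; labels: class ids."""
    X_train = np.array([lbp_histogram(im) for im in train_images])
    X_test = np.array([lbp_histogram(im) for im in test_images])
    clf = LinearSVC().fit(X_train, train_labels)
    return clf.predict(X_test)
```

A learned-descriptor counterpart would simply replace lbp_histogram with features taken from a pre-trained CNN, keeping the same classifier.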


Author(s): K. Salvador Aguilar-Dominguez ◽ Manuel Mejia-Lavalle ◽ Gerardo Reyes-Salgado ◽ Osslan Osiris Vergara-Villegas

Author(s): Hung Phuoc Truong ◽ Minh Bao Nguyen-Khoa ◽ Yong-Guk Kim

The local binary pattern is a visual descriptor that can be used as a powerful feature extractor for texture classification. In this paper, a novel representation for face recognition, called Bilateral Line Local Binary Patterns (BL-LBP), is proposed. This scheme extends Line Local Binary Pattern descriptors into a statistical learning subspace. The proposed bilateral descriptors are fused using an ensemble of calibrated SVM models. The performance of the scheme is evaluated on 5 standard face databases. It is found to be robust against illumination variation, diverse facial expressions, and head pose variations, and its recognition accuracy reaches 98 percent while running on a mobile device at 63 ms per face. These results suggest that the proposed method can be very useful for vision systems with limited resources, where computational cost is critical.
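The feature extraction itself is not detailed in the abstract, but the fusion step it describes, an ensemble of calibrated SVM models, can be sketched with scikit-learn as below. The per-stream feature matrices and the cv=3 calibration split are hypothetical stand-ins for the paper's actual setup.

```python
# Sketch of descriptor fusion via calibrated SVMs: one calibrated model per
# descriptor stream, with class probabilities averaged at decision time.
# The BL-LBP extraction is not reproduced; feature_sets are stand-ins.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.calibration import CalibratedClassifierCV

def fit_calibrated_svms(feature_sets, y):
    """feature_sets: list of (N, D_k) arrays, one per descriptor stream."""
    return [CalibratedClassifierCV(LinearSVC(), cv=3).fit(X, y)
            for X in feature_sets]

def fuse_predict(models, feature_sets):
    """Average calibrated probabilities across streams, then take argmax."""
    probs = np.mean([m.predict_proba(X)
                     for m, X in zip(models, feature_sets)], axis=0)
    return models[0].classes_[np.argmax(probs, axis=1)]
```

Calibration matters here because raw SVM decision values from different descriptor streams are not on a common scale; mapping them to probabilities first makes the averaging meaningful.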


Author(s): Anitha Ganesan ◽ Anbarasu Balasubramanian

In the context of improved navigation for micro aerial vehicles, a new scene recognition visual descriptor, called the spatial color gist wavelet descriptor (SCGWD), is proposed. SCGWD was developed by combining the proposed Ohta color-GIST wavelet descriptors with census transform histogram (CENTRIST) spatial pyramid representation descriptors for categorizing indoor versus outdoor scenes. Binary and multiclass support vector machine (SVM) classifiers with linear and non-linear kernels were used to classify indoor versus outdoor scenes and indoor scenes, respectively. In this paper, we also discuss, from an experimental perspective, the feature extraction methodology of several state-of-the-art visual descriptors and of four proposed visual descriptors (Ohta color-GIST descriptors, Ohta color-GIST wavelet descriptors, enhanced Ohta color histogram descriptors, and SCGWDs). The proposed descriptors and the state-of-the-art visual descriptors were evaluated using the Indian Institute of Technology Madras Scene Classification Image Database 2 (IITM SCID2), an Indoor-Outdoor Dataset, and the Massachusetts Institute of Technology indoor scene classification dataset (MIT-67). Experimental results showed that the indoor-versus-outdoor scene recognition algorithm, employing an SVM with SCGWDs, produced the highest classification rates (CRs): 95.48% and 99.82% with the radial basis function (RBF) kernel, and 95.29% and 99.45% with the linear kernel, for the IITM SCID2 and Indoor-Outdoor datasets, respectively. The lowest CRs, 2.08% and 4.92%, were obtained when the RBF and linear kernels were used with the MIT-67 dataset. In addition, higher CRs, precision, recall, and area under the receiver operating characteristic curve values were obtained for the proposed SCGWDs than for state-of-the-art visual descriptors.
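CENTRIST, one building block of the proposed SCGWD, rests on the census transform: each pixel is encoded by comparing it with its eight neighbours, and the histogram of the resulting 8-bit codes serves as the descriptor. The sketch below shows only this base transform; the spatial pyramid pooling and the Ohta color-GIST wavelet components are omitted, and the neighbour ordering is an arbitrary choice.

```python
# Sketch of the census transform underlying CENTRIST. Spatial-pyramid
# pooling and the color-GIST wavelet parts of SCGWD are not shown.
import numpy as np

def census_transform(gray):
    """gray: 2D numpy array. Returns 8-bit census codes for interior pixels."""
    h, w = gray.shape
    center = gray[1:h - 1, 1:w - 1]
    codes = np.zeros(center.shape, dtype=np.int32)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int32) << bit
    return codes

def centrist_histogram(gray):
    """256-bin normalized histogram of census codes (the base descriptor)."""
    hist, _ = np.histogram(census_transform(gray), bins=256, range=(0, 256))
    return hist / max(hist.sum(), 1)
```

In the full pyramid representation, this histogram is computed per cell of a spatial grid at several levels and the per-cell histograms are concatenated, which preserves coarse spatial layout.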

