Robust Image Labeling Using Conditional Random Fields

2021 ◽  
Author(s):  
Maryam Nematollahi Arani

Object recognition has become a central topic in computer vision applications such as image search, robotics, and vehicle safety systems. However, it is a challenging task due to the limited discriminative power of low-level visual features in describing the considerably diverse range of high-level visual semantics of objects. The semantic gap between low-level visual features and high-level concepts is a bottleneck in most systems, and new content analysis models need to be developed to bridge it. In this thesis, algorithms based on conditional random fields (CRF), a class of probabilistic graphical models, are developed to tackle the problem of multiclass image labeling for object recognition. Image labeling assigns a specific semantic category from a predefined set of object classes to each pixel in the image. By capturing the spatial interactions of visual concepts well, CRF modeling has proved to be a successful tool for image labeling. This thesis proposes novel approaches to strengthening CRF modeling for robust image labeling. Our primary contributions are twofold. To better represent the feature distributions of CRF potentials, new feature functions based on generalized Gaussian mixture models (GGMM) are designed and their efficacy is investigated. Thanks to its shape parameter, a GGMM can properly fit the multi-modal and skewed distributions of data in natural images. The new model proves more successful than Gaussian and Laplacian mixture models, and it also outperforms a deep neural network model on the Corel image set by 1% in accuracy. Further in this thesis, we apply scene-level contextual information to integrate the global visual semantics of the image with the pixel-wise dense inference of a fully-connected CRF, both to preserve small objects of foreground classes and to make dense inference robust to initial misclassifications by the unary classifier.
The proposed inference algorithm factorizes the joint probability of the labeling configuration and the image scene type to obtain prediction update equations for labeling individual image pixels as well as the overall scene type of the image. The proposed context-based dense CRF model outperforms the conventional dense CRF model in labeling accuracy by about 2% on the MSRC image set and by 4% on the SIFT Flow image set. The proposed model also obtains the highest scene classification rate, 86%, on the MSRC dataset.
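For readers unfamiliar with generalized Gaussian mixtures, the density family the thesis builds its feature functions on can be sketched as follows. This is an illustrative one-dimensional sketch, not the thesis code; parameter names are my own. The shape parameter beta recovers the Gaussian at beta = 2 and the Laplacian at beta = 1, which is exactly why the GGMM can subsume both baselines mentioned above.

```python
import numpy as np
from scipy.special import gamma

def ggd_pdf(x, mu, alpha, beta):
    """Generalized Gaussian density with location mu, scale alpha,
    and shape beta (beta=2 -> Gaussian, beta=1 -> Laplacian)."""
    coef = beta / (2.0 * alpha * gamma(1.0 / beta))
    return coef * np.exp(-(np.abs(x - mu) / alpha) ** beta)

def ggmm_pdf(x, weights, mus, alphas, betas):
    """Mixture of generalized Gaussians: a weighted sum of components,
    each free to be peaky (small beta) or flat (large beta)."""
    return sum(w * ggd_pdf(x, m, a, b)
               for w, m, a, b in zip(weights, mus, alphas, betas))
```

In a CRF, the negative log of such a mixture evaluated on a pixel's feature vector would serve as (part of) the unary potential; fitting the mixture parameters is typically done with EM, which the abstract does not detail.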


Author(s):  
Xin-Jing Wang ◽  
Mo Yu ◽  
Lei Zhang ◽  
Wei-Ying Ma

In this chapter, we introduce the Argo system, which provides intelligent advertising based on user-generated photos. Building on the intuition that user-generated photos imply user interests, which are key for profitable targeted ads, Argo attempts to learn a user's profile from his shared photos and suggests relevant ads accordingly. To learn user interests, in an offline step, a hierarchical and efficient topic space is constructed based on the ODP ontology; it is used later on for bridging the vocabulary gap between ads and photos as well as reducing the effect of noisy photo tags. In the online stage, Argo proceeds in three steps: 1) understanding the content and semantics of a user's photos and auto-tagging each photo to supplement user-submitted tags (such tags may not be available); 2) learning the user interest from a set of photos based on the learnt hierarchical topic space; and 3) representing ads in the topic space and matching their topic distributions with the target user interest; the top-ranked ads are output as the suggested ads. Two key challenges are tackled in the process: 1) the semantic gap between low-level image visual features and high-level user semantics; and 2) the vocabulary impedance between photos and ads. We conducted a series of experiments with real Flickr users and Amazon.com products (as candidate ads), which show the effectiveness of the proposed approach.
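Step 3 above, matching ads to a user interest via topic distributions, can be sketched minimally. The chapter's actual similarity measure is not stated in this summary; cosine similarity is a common stand-in assumption, and the function and variable names here are illustrative.

```python
import numpy as np

def cosine(p, q):
    """Cosine similarity between two topic-distribution vectors."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))

def rank_ads(user_topics, ad_topics):
    """Rank candidate ads by how closely their topic distributions
    match the user's interest distribution (most similar first)."""
    scores = {ad: cosine(user_topics, dist) for ad, dist in ad_topics.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

With a photography-heavy user profile, an ad whose topic mass sits in the same region of the topic space would rank above an unrelated one; the hierarchical structure of the ODP-based space is what lets photos and ads with disjoint vocabularies still land near each other.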


2016 ◽  
Vol 2 (1) ◽  
pp. 475-478
Author(s):  
Nico Hoffmann ◽  
Edmund Koch ◽  
Uwe Petersohn ◽  
Matthias Kirsch ◽  
Gerald Steiner

Abstract
Intraoperative thermal neuroimaging is a novel intraoperative imaging technique for characterizing perfusion disorders, neural activity, and other pathological changes of the brain. It is based on the correlation of (sub-)cortical metabolism and perfusion with the heat emitted by the cortical surface. To minimize the required computational resources and prevent unwanted artefacts in subsequent data analysis workflows, foreground detection is an important preprocessing technique for differentiating pixels representing the cerebral cortex from background objects. We propose an efficient classification framework that integrates characteristic dynamic thermal behaviour into this classification task to provide additional discriminative features. The first stage of our framework learns a representation of characteristic thermal time-frequency behaviour. This representation models latent interconnections in the time-frequency domain that cover specific, yet a priori unknown, thermal properties of the cortex. In a second stage, these features are used to classify each pixel's state with conditional random fields. We quantitatively evaluate several approaches to learning high-level features and their impact on overall prediction accuracy. The introduction of high-level features leads to a significant accuracy improvement compared to a baseline classifier.


2019 ◽  
Vol 168 ◽  
pp. 100-108 ◽  
Author(s):  
Jose-Raul Ruiz-Sarmiento ◽  
Cipriano Galindo ◽  
Javier Monroy ◽  
Francisco-Angel Moreno ◽  
Javier Gonzalez-Jimenez

2019 ◽  
Author(s):  
Michael B. Bone ◽  
Fahad Ahmad ◽  
Bradley R. Buchsbaum

Abstract
When recalling an experience of the past, many of the component features of the original episode may be, to a greater or lesser extent, reconstructed in the mind's eye. There is strong evidence that the pattern of neural activity that occurred during an initial perceptual experience is recreated during episodic recall (neural reactivation), and that the degree of reactivation is correlated with the subjective vividness of the memory. However, while we know that reactivation occurs during episodic recall, we have lacked a way of precisely characterizing the contents—in terms of their featural constituents—of a reactivated memory. Here we present a novel approach, feature-specific informational connectivity (FSIC), that leverages hierarchical representations of image stimuli derived from a deep convolutional neural network to decode neural reactivation in fMRI data collected while participants performed an episodic recall task. We show that neural reactivation associated with low-level visual features (e.g. edges), high-level visual features (e.g. facial features), and semantic features (e.g. "terrier") occurs throughout the dorsal and ventral visual streams and extends into the frontal cortex. Moreover, we show that reactivation of both low- and high-level visual features correlates with the vividness of the memory, whereas only reactivation of low-level features correlates with recognition accuracy when the lure and target images are semantically similar. In addition to demonstrating the utility of FSIC for mapping feature-specific reactivation, these findings resolve the relative contributions of low- and high-level features to the vividness of visual memories, clarify the role of the frontal cortex during episodic recall, and challenge a strict interpretation of the posterior-to-anterior visual hierarchy.
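The core informational-connectivity idea behind FSIC can be illustrated in miniature: correlate the trial-by-trial decoding evidence for a given feature between two brain regions, so that shared fluctuations indicate shared feature-specific reactivation. This sketch only conveys the general principle; the full FSIC method, with its DNN-derived feature hierarchy and fMRI specifics, is not reproduced here, and the function name is the abstract's term applied to my simplified version.

```python
import numpy as np

def informational_connectivity(evidence_roi1, evidence_roi2):
    """Pearson correlation of trial-wise feature-decoding evidence
    between two regions of interest; a high value suggests the two
    regions reactivate the same feature content together."""
    return float(np.corrcoef(evidence_roi1, evidence_roi2)[0, 1])
```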


2019 ◽  
Author(s):  
Remington Mallett ◽  
Anurima Mummaneni ◽  
Jarrod Lewis-Peacock

Working memory persists in the face of distraction, yet not without consequence. Previous research has shown that memory for low-level visual features is systematically influenced by the maintenance or presentation of a similar distractor stimulus. Responses are frequently biased in stimulus space towards a perceptual distractor, though this has yet to be determined for high-level stimuli. We investigated whether these influences are shared for complex visual stimuli such as faces. To quantify response accuracies for these stimuli, we used a delayed-estimation task with a computer-generated “face space” consisting of eighty faces that varied continuously as a function of age and sex. In a set of three experiments, we found that responses for a target face held in working memory were biased towards a distractor face presented during the maintenance period. The amount of response bias did not vary as a function of distance between target and distractor. Our data suggest that, similar to low-level visual features, high-level face representations in working memory are biased by the processing of related but task-irrelevant information.
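The response-bias measure described above, how far responses are pulled toward the distractor in stimulus space, can be quantified with a simple projection. This is a generic sketch of such a metric, not the authors' analysis code; the 2D coordinates stand in for the age/sex axes of the face space, and all names are illustrative.

```python
import numpy as np

def distractor_bias(targets, distractors, responses):
    """Mean signed response error projected onto the unit vector from
    each target to its distractor, in stimulus-space coordinates.
    Positive values mean responses are biased toward the distractor."""
    t, d, r = (np.asarray(a, float) for a in (targets, distractors, responses))
    direction = d - t
    direction /= np.linalg.norm(direction, axis=1, keepdims=True)
    return float(np.mean(np.sum((r - t) * direction, axis=1)))
```

A significantly positive value over many trials would correspond to the attractive bias the experiments report; the finding that the bias does not scale with target-distractor distance is a separate analysis not captured by this single summary number.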


2019 ◽  
Author(s):  
Kathryn E Schertz ◽  
Omid Kardan ◽  
Marc Berman

It has recently been shown that the perception of visual features of the environment can influence thought content. Both low-level (e.g., fractalness) and high-level (e.g., presence of water) visual features of the environment can influence thought content, in real-world and experimental settings where these features can make people more reflective and contemplative in their thoughts. It remains to be seen, however, whether these visual features retain their influence on thoughts in the absence of overt semantic content, which would indicate a more fundamental mechanism for this effect. In this study, we removed this limitation by creating scrambled-edge versions of images, which maintain the edge content of the original images but prevent scene identification. Non-straight edge density is one visual feature that has been shown to influence many judgements about objects and landscapes, and it has also been associated with thoughts of spirituality. We extend previous findings by showing that non-straight edges retain their influence on the selection of a "Spiritual & Life Journey" topic after scene identification is removed. These results strengthen the case for a causal role of low-level visual feature perception in higher-order cognitive function, by demonstrating that in the absence of overt semantic content, low-level features such as edges influence cognitive processes.

