Fuzzy Model for Human Color Perception and Its Application in E-Commerce

Author(s):  
Pakizar Shamoi ◽  
Atsushi Inoue ◽  
Hiroharu Kawanaka

Although image retrieval for e-commerce has huge commercial potential, e-commerce-oriented content-based image retrieval is still in its infancy. Modern online shopping systems have certain limitations: in particular, they rely on conventional tag-based retrieval and make little use of visual content. This paper presents a methodology for retrieving images of shopping items based on fuzzy dominant colors. People regard color as an aesthetic issue, especially when choosing the colors of their clothing, apartment design, and other objects around them. Color undoubtedly influences purchasing behavior; to a certain extent, it is a reflection of a person's likes and dislikes. The fuzzy color model that we propose is a collection of fuzzy sets providing a conceptual quantization of the crisp HSI space with soft boundaries. The proposed method has two parts: assigning a fuzzy colorimetric profile to the image and processing the user query. We also use underlying mechanisms of attention from a theory of visual attention, such as perceptual categorization. The major issues we tackle in this research are the subjectivity and sensitivity of human color perception and bridging the semantic gap between low-level color visual features and high-level concepts.
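
As an illustrative sketch of the idea (not the authors' exact model, which quantizes the full HSI space), a fuzzy color category can be represented as a fuzzy set with soft boundaries over hue, and an image's fuzzy colorimetric profile as the average membership of its pixels in each category. The prototype hues and trapezoid parameters below are hypothetical:

```python
import numpy as np

def fuzzy_color_membership(hue_deg, prototype_deg, core=15.0, support=40.0):
    """Degree to which a hue belongs to a fuzzy color category.

    Full membership within `core` degrees of the prototype hue, decaying
    linearly to zero at `support` degrees (distance on the hue circle),
    giving the soft category boundaries described above.
    """
    d = np.abs((np.asarray(hue_deg) - prototype_deg + 180.0) % 360.0 - 180.0)
    return np.clip((support - d) / (support - core), 0.0, 1.0)

# Hypothetical prototype hues; a full model would also fuzzify saturation
# and intensity to cover categories such as pink, brown, and grey.
PROTOTYPES = {"red": 0.0, "yellow": 60.0, "green": 120.0,
              "cyan": 180.0, "blue": 240.0, "magenta": 300.0}

def fuzzy_profile(pixel_hues):
    """Fuzzy colorimetric profile: mean membership of the image's pixels
    in each fuzzy color category."""
    return {name: float(np.mean(fuzzy_color_membership(pixel_hues, p)))
            for name, p in PROTOTYPES.items()}
```

Two shopping items could then be matched by comparing their profiles, e.g. with a fuzzy set similarity measure over the category memberships.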

2021 ◽  
Author(s):  
Rui Zhang

This thesis focuses primarily on the combination of information at different levels of a statistical pattern classification framework for image annotation and retrieval. Previous studies in image annotation and retrieval have established that low-level visual features, such as color and texture, and high-level features, such as textual description and context, are distinct yet complementary in their distributions and in their discriminative power for machine-based recognition and retrieval tasks. Effective feature combination is therefore a promising route to further bridging the semantic gap. Motivated by this fact, two statistical pattern classification approaches are developed: one for combining the visual and context modalities, and one for combining different features within the visual domain, since features within one modality and features across modalities exhibit different degrees of heterogeneity and should be treated differently. For cross-modality combination, a Bayesian framework is proposed to integrate visual content and context, and is applied to various image annotation and retrieval frameworks. For the combination of low-level features within the visual domain, a novel method combines texture and color features via a mixture model of their joint distribution. The proposed frameworks are evaluated on several datasets: the COREL database for image retrieval and the MSRC, LabelMe, PASCAL VOC2009, and a self-collected animal image database for image annotation. Under various evaluation criteria, the first framework proves more effective than methods based purely on low-level features or on high-level context alone. The second not only outperforms other feature combination methods but also discovers visual clusters using texture and color simultaneously. Moreover, a demo search engine based on the Bayesian framework is implemented and available online.
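
The abstract does not detail the framework, but the cross-modality combination it describes can be sketched as a generic Bayesian fusion of per-class likelihoods from the visual and context modalities. The conditional-independence assumption below is ours, not necessarily the thesis's exact formulation:

```python
import numpy as np

def fuse_modalities(log_p_visual, log_p_context, log_prior):
    """Posterior over classes from two modalities, assuming the visual
    features v and the context t are conditionally independent given the
    class c:  P(c | v, t) is proportional to P(c) * P(v | c) * P(t | c).

    All arguments are arrays of per-class log probabilities, shape (n_classes,).
    """
    log_post = log_prior + log_p_visual + log_p_context
    log_post -= np.logaddexp.reduce(log_post)   # normalize in log space
    return np.exp(log_post)
```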


Author(s):  
Rhong Zhao ◽  
William I. Grosky

The emergence of multimedia technology and the rapidly expanding image and video collections on the Internet have attracted significant research efforts toward providing tools for effective retrieval and management of visual data. Image retrieval is based on the availability of a representation scheme for image content. Image content descriptors may be visual features such as color, texture, shape, and spatial relationships, or semantic primitives. Conventional information retrieval was based solely on text, and those approaches to textual information retrieval have been transplanted into image retrieval in a variety of ways. However, “a picture is worth a thousand words”: image content is far more versatile than text, and the amount of visual data is already enormous and still expanding rapidly. To cope with these special characteristics of visual data, content-based image retrieval methods have been introduced. It is widely recognized that image retrieval techniques should integrate both low-level visual features, which address the more detailed perceptual aspects, and high-level semantic features, which capture the more general conceptual aspects of visual data. Neither type of feature alone is sufficient to retrieve or manage visual data effectively or efficiently (Smeulders et al., 2000). Although efforts have been devoted to combining these two aspects, the gap between them remains a major barrier for researchers. Intuitive and heuristic approaches have not delivered satisfactory performance. There is therefore an urgent need to find the latent correlation between low-level features and high-level concepts and to merge them from a different perspective. Finding this new perspective and bridging the gap between visual features and semantic features has been a major challenge in this research field. Our chapter addresses these issues.


2011 ◽  
Vol 268-270 ◽  
pp. 1427-1432
Author(s):  
Chang Yong Ri ◽  
Min Yao

This paper presents the key problems in shortening the “semantic gap” between low-level visual features and high-level semantic features in order to implement high-level semantic image retrieval. First, it introduces ontology-based semantic image description and semantic extraction methods based on machine learning. It then illustrates image grammar for high-level semantic image understanding and retrieval, including and-or graph and context-based methods. Finally, we discuss the development directions and research emphases in this field.


2019 ◽  
Author(s):  
John A. Greenwood ◽  
Michael J. Parsons

Our ability to recognise objects in peripheral vision is fundamentally limited by crowding, the deleterious effect of clutter that disrupts the recognition of features ranging from orientation and colour to motion and depth. Prior research is equivocal on whether this reflects a singular process that disrupts all features simultaneously or multiple processes that affect each independently. We examined crowding for motion and colour, two features that allow a strong test of feature independence. ‘Cowhide’ stimuli were presented 15 degrees in peripheral vision, either in isolation or surrounded by flankers to give crowding. Observers reported either the target direction (clockwise/counterclockwise from upwards) or its hue (blue/purple). We first established that both features show systematic crowded errors (predominantly biased towards the flanker identities) and selectivity for target-flanker similarity (with reduced crowding for dissimilar target/flanker elements). The multiplicity of crowding was then tested with observers identifying both features: a singular object-selective mechanism predicts that when crowding is weak for one feature and strong for the other, crowding should be all-or-none for both. In contrast, when crowding was weak for colour and strong for motion, errors were reduced for colour but remained for motion, and vice versa with weak motion and strong colour crowding. This double dissociation reveals that crowding disrupts certain combinations of visual features in a feature-specific manner, ruling out a singular object-selective mechanism. The ability to recognise one aspect of a cluttered scene, like colour, thus offers no guarantees for the correct recognition of other aspects, like motion.

Significance statement: Our peripheral vision is primarily limited by crowding, the disruption to object recognition that arises in clutter. Crowding is widely assumed to be a singular process, affecting all of the features (orientation, motion, colour, etc.) within an object simultaneously. In contrast, we observe a double dissociation whereby observers make errors regarding the colour of a crowded object whilst correctly judging its direction, and vice versa. This dissociation can be reproduced by a population-coding model where the direction and hue of target/flanker elements are pooled independently. The selective disruption of some object features independently of others rules out a singular crowding mechanism, posing problems for high-level crowding theories, and suggesting that the underlying mechanisms may be distributed throughout the visual system.
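
As a rough illustration of the independent-pooling account (all parameters hypothetical; the paper's population-coding model pools tuned neural responses rather than raw feature values), crowded reports of each feature can be simulated as a weighted mixture of target and flanker values, with the mixing weight set separately per feature:

```python
import numpy as np

rng = np.random.default_rng(0)

def crowded_reports(target, flanker, flanker_weight, noise_sd, n=10_000):
    """Simulated reports of one feature under crowding: each trial's report
    is drawn from the target or the flanker value (substitution/pooling),
    plus response noise."""
    substituted = rng.random(n) < flanker_weight
    return np.where(substituted, flanker, target) + rng.normal(0, noise_sd, n)

# Independent pooling: each feature gets its own flanker weight, so strong
# motion crowding can coexist with weak colour crowding on the same trials,
# reproducing the double dissociation described above.
direction = crowded_reports(target=10.0, flanker=-10.0, flanker_weight=0.7, noise_sd=5.0)
hue = crowded_reports(target=0.20, flanker=0.40, flanker_weight=0.1, noise_sd=0.05)
```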


2012 ◽  
Vol 482-484 ◽  
pp. 512-517
Author(s):  
Xian Wen Zeng ◽  
Xue Dong Shen

This paper analyzes the reasons that traditional CBIR cannot support semantics-based image retrieval and proposes a method using an SVM to address this. Through training and classification, with HSV color features as input parameters, the method realizes the connection and mapping between high-level semantics and low-level image features. Retrieval experiments show that this method achieves higher accuracy.
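
A minimal sketch of the described pipeline, assuming a quantized HSV histogram as the low-level feature and scikit-learn's SVC as the classifier (the paper's exact feature encoding and SVM configuration are not specified in the abstract):

```python
import numpy as np
from sklearn.svm import SVC

def hsv_histogram(hsv_image, bins=(8, 4, 4)):
    """Quantized HSV colour histogram used as the low-level feature vector
    (bin counts here are illustrative)."""
    h, s, v = hsv_image[..., 0], hsv_image[..., 1], hsv_image[..., 2]
    hist, _ = np.histogramdd((h.ravel(), s.ravel(), v.ravel()),
                             bins=bins, range=((0, 360), (0, 1), (0, 1)))
    return (hist / hist.sum()).ravel()

# Training maps the 8*4*4 = 128-dimensional colour features to semantic
# class labels; prediction then attaches a semantic label to a query image.
# clf = SVC(kernel="rbf").fit(train_features, train_labels)
# label = clf.predict([hsv_histogram(query_hsv)])
```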


Author(s):  
Silvester Tena ◽  
Rudy Hartanto ◽  
Igi Ardiyanto

In recent years, a great deal of research has been conducted in the area of fabric image retrieval, especially the identification and classification of visual features. One of the challenges associated with the domain of content-based image retrieval (CBIR) is the semantic gap between low-level visual features and high-level human perceptions. Generally, CBIR includes two main components, namely feature extraction and similarity measurement. Therefore, this research aims to determine content-based image retrieval for fabric using feature extraction techniques grouped into traditional methods and convolutional neural networks (CNN). Traditional descriptors deal with low-level features, while CNN addresses high-level, so-called semantic features. Traditional descriptors have the advantage of shorter computation time and reduced system requirements. Meanwhile, CNN descriptors, which handle high-level features tailored to human perceptions, deal with large amounts of data and require a great deal of computation time. In general, the features of a CNN's fully connected layers are used for matching query and database images. In several studies, the extracted features of the CNN's convolutional layers were used for image retrieval. At the end of the CNN layer, hash codes are added to reduce search time.
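
A sketch of the generic pipeline summarized above, using a pretrained ResNet-18's penultimate-layer features and a random-hyperplane binary hash to shorten search; the specific network, layer, and code length are illustrative choices, not those of any particular study reviewed:

```python
import torch
import torchvision.models as models
import torchvision.transforms as T

# Pretrained backbone with its classifier head removed, so a forward pass
# yields the 512-d feature vector that feeds the fully connected layer.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = torch.nn.Identity()
model.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(pil_image):
    return model(preprocess(pil_image).unsqueeze(0)).squeeze(0)

# Random-hyperplane hashing: images whose 64-bit codes agree in most bits
# become candidates for exact feature comparison, shortening search time.
planes = torch.randn(512, 64)

def binary_hash(feature):
    return feature @ planes > 0
```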


Author(s):  
Zewen Xu ◽  
Zheng Rong ◽  
Yihong Wu

In recent years, simultaneous localization and mapping in dynamic environments (dynamic SLAM) has attracted significant attention from both academia and industry. Some pioneering work on this technique has expanded the potential of robotic applications. Compared to standard SLAM under the static world assumption, dynamic SLAM divides features into static and dynamic categories and leverages each type of feature properly. Therefore, dynamic SLAM can provide more robust localization for intelligent robots that operate in complex dynamic environments. Additionally, to meet the demands of some high-level tasks, dynamic SLAM can be integrated with multiple object tracking. This article presents a survey on dynamic SLAM from the perspective of feature choices. A discussion of the advantages and disadvantages of different visual features is provided in this article.
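
One common heuristic for the static/dynamic split the survey describes, sketched under simplifying assumptions (known camera intrinsics and an ego-motion estimate; real systems may instead use semantic segmentation, multi-view geometry checks, or learned cues):

```python
import numpy as np

def split_static_dynamic(landmarks_prev, pixels_curr, ego_motion, K, thresh_px=2.0):
    """Label tracked features as static or dynamic by reprojection error.

    landmarks_prev: (N, 3) 3-D points in the previous camera frame
    pixels_curr:    (N, 2) observed pixel positions in the current frame
    ego_motion:     4x4 transform from previous to current camera frame
    K:              3x3 camera intrinsics
    """
    # Predict where static points should appear under camera motion alone.
    P = ego_motion[:3, :3] @ landmarks_prev.T + ego_motion[:3, 3:4]   # (3, N)
    proj = K @ P
    proj = (proj[:2] / proj[2]).T                                     # (N, 2)
    err = np.linalg.norm(proj - pixels_curr, axis=1)
    static = err <= thresh_px   # large residuals suggest independent motion
    return static, ~static
```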

