Probability Association Approach in Automatic Image Annotation

Author(s): Feng Xu, Yu-Jin Zhang

Content-based image retrieval (CBIR) has wide applications in public life. Either from a static image database or from the Web, one can search for a specific image, browse generally to make an interactive choice, or search for a picture to go with a broad story or to illustrate a document. Although CBIR has been well studied, searching for images in a large image database remains challenging because of the well-acknowledged semantic gap between low-level features and high-level semantic concepts. An alternative solution is to use keyword-based approaches, which usually associate images with keywords either by manual labeling or by automatically extracting surrounding text from Web pages. Although such a solution is widely adopted by most existing commercial image search engines, it is not perfect. First, manual annotation, though precise, is expensive and difficult to extend to large-scale databases. Second, automatically extracted surrounding text may be incomplete and ambiguous in describing images, and in some applications surrounding text may not be available at all. To overcome these problems, automated image annotation is considered a promising approach to understanding and describing the content of images.

2019, Vol. 11 (8), pp. 922
Author(s): Juli Zhang, Junyi Zhang, Tao Dai, Zhanzhuang He

Manually annotating remote sensing images is laborious work, especially on large-scale datasets. To improve the efficiency of this work, we propose an automatic annotation method for remote sensing images. The proposed method formulates the multi-label annotation task as a recommendation problem based on non-negative matrix tri-factorization (NMTF). The labels of remote sensing images can be recommended directly by recovering the image–label matrix. To learn more efficient latent feature matrices, two graph regularization terms are added to NMTF to exploit the relationships encoded in the image graph and the label graph simultaneously. In order to reduce the gap between semantic concepts and visual content, both low-level visual features and high-level semantic features are exploited to construct the image graph. Meanwhile, label co-occurrence information is used to build the label graph, which captures semantic relatedness between labels and enhances label prediction for unlabeled images. By employing information from both images and labels, the proposed method can efficiently deal with the sparsity and cold-start problems caused by limited image–label pairs. Experimental results on the UCMerced and Corel5k datasets show that our model outperforms most baseline algorithms for multi-label annotation of remote sensing images and performs efficiently on large-scale unlabeled datasets.
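
The recommendation step reduces to recovering the image–label matrix from its tri-factorization. Below is a minimal sketch of plain NMTF with multiplicative updates; the two graph-regularization terms described above are omitted, and the matrix names, shapes, and defaults are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def nmtf(X, k1=20, k2=20, iters=200, eps=1e-9, seed=0):
    """Plain non-negative matrix tri-factorization X ~ F @ S @ G.T with
    multiplicative updates (graph-regularization terms omitted)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k1))
    S = rng.random((k1, k2))
    G = rng.random((m, k2))
    for _ in range(iters):
        F *= (X @ G @ S.T) / (F @ S @ (G.T @ G) @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ (F.T @ F) @ S + eps)
        S *= (F.T @ X @ G) / ((F.T @ F) @ S @ (G.T @ G) + eps)
    return F, S, G

# Recovering the image-label matrix yields recommendation scores for
# unlabeled images (X is a hypothetical 0/1 image-label matrix):
# F, S, G = nmtf(X)
# scores = F @ S @ G.T   # rows: images, columns: recommended labels
```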


2019, Vol. 1 (3), pp. 238-270
Author(s): Lei Ji, Yujing Wang, Botian Shi, Dawei Zhang, Zhongyuan Wang, ...

Knowledge is important for text-related applications. In this paper, we introduce Microsoft Concept Graph, a knowledge graph engine that provides concept tagging APIs to facilitate the understanding of human language. Microsoft Concept Graph is built upon Probase, a universal probabilistic taxonomy consisting of instances and concepts mined from the Web. We start by introducing the construction of the knowledge graph through iterative semantic extraction and taxonomy construction procedures, which extract 2.7 million concepts from 1.68 billion Web pages. We then use conceptualization models to represent text in the concept space to empower text-related applications such as topic search, query recommendation, Web table understanding, and ads relevance. Since its release in 2016, Microsoft Concept Graph has received more than 100,000 pageviews, 2 million API calls, and 3,000 registered downloads from 50,000 visitors across 64 countries.
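
Conceptualization maps the instances mentioned in a piece of text to ranked concepts in the taxonomy. The toy sketch below scores concepts with a naive P(concept | instance) estimate from instance–concept co-occurrence counts; the counts, names, and scoring rule are illustrative stand-ins, not the Microsoft Concept Graph API or data.

```python
from collections import defaultdict

# Toy instance->concept co-occurrence counts standing in for the isA pairs
# mined from Web text; real counts come from the Probase / Microsoft
# Concept Graph data, not from this snippet.
ISA_COUNTS = {
    ("python", "programming language"): 980,
    ("python", "snake"): 310,
    ("pandas", "software library"): 450,
    ("pandas", "animal"): 120,
}

def conceptualize(instances):
    """Score candidate concepts for a bag of instances using a naive
    P(concept | instance) estimate from co-occurrence counts."""
    scores = defaultdict(float)
    for inst in instances:
        counts = {c: n for (i, c), n in ISA_COUNTS.items() if i == inst}
        total = sum(counts.values())
        if not total:
            continue  # unknown instance
        for concept, n in counts.items():
            scores[concept] += n / total
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(conceptualize(["python", "pandas"]))
# software/programming concepts outrank the animal senses for this pair
```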


Author(s): Lin Lin, Mei-Ling Shyu

Motivated by the growing use of multimedia services and the explosion of multimedia collections, efficient retrieval from large-scale multimedia data has become very important in multimedia content analysis and management. In this paper, a novel ranking algorithm is proposed for video retrieval. First, video content is represented by global and local features; second, multiple correspondence analysis (MCA) is applied to capture the correlation between video content and semantic concepts. Next, video segments are scored by considering the features with high correlations and the transaction weights converted from those correlations. Finally, a user interface is implemented in a video retrieval system that allows the user to enter a concept of interest, searches videos for the target concept, ranks the retrieved video segments using the proposed ranking algorithm, and displays the top-ranked video segments to the user. Experimental results on 30 concepts from the TRECVID high-level feature extraction task demonstrate that the presented video retrieval system, assisted by the proposed ranking algorithm, retrieves more video segments belonging to the target concepts and displays more relevant results to the users.
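
A minimal sketch of the final scoring step is shown below, assuming the MCA-derived feature-to-concept correlations are already available; the threshold, array shapes, and names are illustrative assumptions rather than the paper's exact weighting scheme.

```python
import numpy as np

def rank_segments(transactions, correlations, corr_threshold=0.3, top_k=10):
    """Rank video segments for one target concept.

    transactions : (n_segments, n_features) 0/1 feature-item matrix
    correlations : (n_features,) feature-to-concept correlation scores,
                   assumed to be precomputed (by MCA in the paper)
    """
    weights = np.where(correlations >= corr_threshold, correlations, 0.0)
    scores = transactions @ weights            # weighted vote per segment
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]

# Toy usage with random data standing in for real TRECVID features
rng = np.random.default_rng(0)
segments = rng.integers(0, 2, size=(100, 50))
corr = rng.random(50)
top_idx, top_scores = rank_segments(segments, corr)
```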


2009, pp. 596-614
Author(s): I. Koffina, G. Serfiotis, V. Christophides, V. Tannen

Semantic Web (SW) technology aims to facilitate the integration of legacy data sources spread worldwide. Despite the plethora of SW languages (e.g., RDF/S, OWL) recently proposed for supporting large-scale information interoperation, the vast majority of legacy sources still rely on relational databases (RDB) published on the Web or corporate intranets as virtual XML. In this article, we advocate a first-order logic framework for mediating high-level queries to relational and/or XML sources using community ontologies expressed in a SW language such as RDF/S. We describe the architecture and reasoning services of our SW integration middleware, termed SWIM, and we present the main design choices and techniques for supporting powerful mappings between different data models, as well as reformulation and optimization of queries expressed against mediator ontologies and views.
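
To give a flavor of reformulating a query posed against a mediator ontology into SQL over a relational source, here is a deliberately simplified sketch in the style of GAV view unfolding; the mapping table, property names, and rewriting strategy are illustrative assumptions and do not reflect SWIM's actual mapping language or optimizer.

```python
# Hypothetical GAV-style mappings from ontology properties to SQL views over
# a relational source; names and structure are illustrative only.
MAPPINGS = {
    "ex:title":  "SELECT id AS s, title  AS o FROM books",
    "ex:author": "SELECT id AS s, author AS o FROM books",
}

def reformulate(triple_patterns):
    """Rewrite a conjunction of (subject, predicate, object) triple patterns
    posed against the mediator ontology into one SQL join over the views."""
    selects, froms = [], []
    for i, (_s, p, o) in enumerate(triple_patterns):
        alias = f"t{i}"
        froms.append(f"({MAPPINGS[p]}) {alias}")
        selects.append(f"{alias}.o AS {o.lstrip('?')}")
    joins = " AND ".join(f"t0.s = t{i}.s" for i in range(1, len(triple_patterns)))
    sql = f"SELECT {', '.join(selects)} FROM {', '.join(froms)}"
    return sql + (f" WHERE {joins}" if joins else "")

print(reformulate([("?b", "ex:title", "?title"),
                   ("?b", "ex:author", "?author")]))
```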


2016, Vol. 10 (04), pp. 503-525
Author(s): Mehdi Allahyari, Krys Kochut

The volume of documents and online resources has been increasing significantly on the Web for many years. Effectively organizing this huge amount of information has become a challenging problem. Tagging is a mechanism to aggregate information and a great step towards the Semantic Web vision. Tagging aims to organize, summarize, share, and search Web resources in an effective way. One important problem facing tagging systems is automatically determining the most appropriate tags for Web documents. In this paper, we propose a probabilistic topic model that incorporates DBpedia knowledge for tagging Web pages and online documents with the topics discovered in them. Our method is based on integrating the DBpedia hierarchical category network with statistical topic models, where DBpedia categories are treated as topics. We have conducted extensive experiments on two different datasets to demonstrate the effectiveness of our method.
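
As a rough illustration of treating DBpedia categories as topics, the sketch below scores a document against toy per-category word distributions and keeps the highest-scoring categories as tags; the distributions, smoothing, and scoring are illustrative stand-ins for the paper's actual topic model.

```python
import math
from collections import Counter

# Toy per-category word distributions; in the paper these come from the
# DBpedia category network combined with a statistical topic model.
CATEGORY_WORDS = {
    "Category:Machine_learning": {"model": 0.05, "training": 0.04, "data": 0.06},
    "Category:Databases":        {"query": 0.06, "table": 0.05, "data": 0.03},
}

def tag_document(text, smoothing=1e-4, top_n=2):
    """Score each DBpedia category (treated as a topic) by the smoothed
    likelihood it assigns to the document's tokens; return the best tags."""
    tokens = Counter(text.lower().split())
    scores = {
        cat: sum(n * math.log(dist.get(w, smoothing)) for w, n in tokens.items())
        for cat, dist in CATEGORY_WORDS.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(tag_document("training a machine learning model on data tables"))
```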


2000, Vol. 29 (547)
Author(s): Kurt Jensen

This booklet contains the proceedings of the Workshop on Practical Use of High-level Petri Nets, June 27, 2000. The workshop is part of the 21st International Conference on Application and Theory of Petri Nets, organised by the CPN group at the Department of Computer Science, University of Aarhus, Denmark. The workshop papers are available in electronic form via the web page: http://www.daimi.au.dk/pn2000/proceedings


2013, Vol. 347-350, pp. 2666-2672
Author(s): Kai Lei, Guang Yu Sun, Lian En Huang

Delta compression techniques are commonly used in the context of version control systems and the World Wide Web. They compactly encode the differences between two files or strings in order to reduce communication or storage costs. In this paper, we study the use of delta compression for compressing massive web pages according to the similarity of their templates. We propose a framework for template-based delta compression which uses template-based clustering techniques to find web pages that have similar templates and then encodes their differences with delta compression techniques to reduce the storage cost. We also propose a filter-based optimization of the Diff algorithm to improve the efficiency of the delta compression approach. To demonstrate the efficiency of our approach, we present experimental results on massive web pages. Our experiments show that template-based delta compression achieves significant improvements in compression ratio compared to compressing each web page individually.
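
A toy sketch of the overall idea follows: group pages by a crude template signature and store each page as a compressed diff against a reference page from its group. Python's difflib and zlib stand in for the paper's filter-optimized Diff algorithm, and the signature function is an illustrative assumption.

```python
import difflib
import re
import zlib

def template_signature(html):
    """Crude template key: the sequence of opening-tag names, ignoring text."""
    return tuple(re.findall(r"<\s*([a-zA-Z][a-zA-Z0-9]*)", html))

def delta_compress(pages):
    """Group pages by template signature and store each page as a compressed
    unified diff against the first page (the reference) in its group.
    Reference pages must still be stored in full to rebuild the others."""
    reference_for, deltas = {}, {}
    for url, html in pages.items():
        ref_url = reference_for.setdefault(template_signature(html), url)
        diff = "\n".join(difflib.unified_diff(
            pages[ref_url].splitlines(), html.splitlines(), lineterm=""))
        deltas[url] = (ref_url, zlib.compress(diff.encode("utf-8")))
    return deltas
```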


Author(s): Uche Ogbuji, Mark Baker

If you search for books and other media on the Web, you find Amazon, Wikipedia, and many other resources long before you see any libraries. This is a historical problem: librarians started ahead of the state of the art in database technologies, yet were unable to keep up with mainstream computing developments, including the Web. As a result, libraries are left with extraordinarily rich catalogs in formats which are unsuited to the Web and which need a lot of work to adapt for it. A first step towards addressing this problem, BIBFRAME is a model developed for representing metadata from libraries and other cultural heritage institutions in linked data form. Libhub is a project building on BIBFRAME to convert traditional library formats, especially MARC/XML, to Web resource pages using BIBFRAME and other vocabulary frameworks. The technology used to implement Libhub transforms MARC/XML to a semi-structured, RDF-like metamodel called Versa, from which various outputs are possible, including data-rich Web pages. The authors developed a pipeline processing technology in Python in order to address the need for high performance and scalability, as well as a prodigious degree of customization to accommodate a half century of variations and nuances in library cataloging conventions. The heart of this pipelining system is in the open-source project pybibframe, and the main way for non-technical librarians to customize the transform is a pattern microlanguage called marcpatterns.py. Using marcpatterns.py recipes specialized for the first Libhub participant, Denver Public Library, and further specialized from patterns common among public libraries, the first prerelease of linked data Web pages has already demonstrated the dramatic improvement in visibility for the library, and in quality, curated content for the Web, made possible through the adaptive, semistructured transform from notoriously abstruse library catalog formats. This paper discusses an unorthodox approach to structured and heuristics-based transformation from a large corpus of XML in a difficult format which does not serve the richness of its content well. It covers some of the pragmatic choices made by the developers of the system, who happen to be pioneering advocates of the Web, markup, and the standards around these, but who had to subordinate purity to the urgent need to effect large-scale exposure of dark cultural heritage data in difficult circumstances for a small development and maintenance team. This is a case study of where proper knowledge of XML and its related standards must combine with agile techniques and "worse-is-better" concessions to solve a stubborn problem in extracting value from cultural heritage markup.
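
To make the pattern-recipe idea concrete, here is a deliberately small sketch of a table-driven MARC-field-to-property transform. It only mimics the spirit of a marcpatterns.py recipe; the property names, field choices, and function signature are illustrative assumptions and not the pybibframe API.

```python
# Illustrative pattern table mapping MARC field/subfield pairs to
# BIBFRAME-style properties; property names are hypothetical.
PATTERNS = {
    ("245", "a"): "bf:title",
    ("100", "a"): "bf:creator",
    ("260", "b"): "bf:publisher",
}

def transform_record(marc_fields):
    """marc_fields: iterable of (tag, subfield_code, value) triples pulled
    from a MARC/XML record; returns a property -> values mapping that could
    feed a Versa-like statement store."""
    statements = {}
    for tag, code, value in marc_fields:
        prop = PATTERNS.get((tag, code))
        if prop:
            statements.setdefault(prop, []).append(value.strip(" /:"))
    return statements

print(transform_record([("245", "a", "Linked data for libraries /"),
                        ("100", "a", "Ogbuji, Uche.")]))
```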


Author(s): Zhiwei Shi, Zhongzhi Shi, Hong Hu

Traditionally, bridging the gap between low-level visual features and high-level semantic concepts has been a tough task for researchers. In this article, we propose a novel, plausible model, namely cellular Bayesian networks (CBNs), to model the process of visual perception. The new model takes advantage of both the low-level visual features of target objects, such as colors, textures, and shapes, and the interrelationships between known objects, and integrates them into a Bayesian framework, which possesses both a firm theoretical foundation and wide practical applications. The novel model successfully overcomes some weaknesses of the traditional Bayesian network (BN) that prevent BNs from being applied to large-scale cognitive problems. The experimental simulation also demonstrates that the CBNs model outperforms a purely bottom-up strategy by 6% or more in the task of shape recognition. Finally, although the CBNs model is designed for visual perception, it has great potential to be applied to other areas as well.
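
The sketch below illustrates the general flavor of combining a bottom-up feature likelihood with contextual evidence from objects already recognized in the scene; it is a generic Bayesian combination under assumed inputs, not the authors' CBN formulation.

```python
import math

def contextual_posterior(feature_likelihood, context_objects, cooccur, prior):
    """Combine bottom-up evidence P(features | class) with a context factor
    derived from already-recognized neighbouring objects.

    feature_likelihood : dict  class -> P(features | class)
    context_objects    : list  recognized object labels in the scene
    cooccur            : dict  (class, context_object) -> compatibility weight
    prior              : dict  class -> base prior P(class)
    """
    scores = {}
    for c, like in feature_likelihood.items():
        context = math.prod(cooccur.get((c, o), 1.0) for o in context_objects)
        scores[c] = like * prior[c] * context
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

# Toy example: a round shape is ambiguous between "plate" and "wheel",
# but a recognized "car" in the scene tips the posterior toward "wheel".
print(contextual_posterior(
    {"plate": 0.5, "wheel": 0.5},
    ["car"],
    {("wheel", "car"): 3.0, ("plate", "car"): 0.5},
    {"plate": 0.5, "wheel": 0.5}))
```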

