An Improved Semantic Annotation Method

2012 ◽  
Vol 198-199 ◽  
pp. 495-499
Author(s):  
Shuai Zhang ◽  
Guang Hong ◽  
Bing Xu

Semantic annotation is the foundation for the development and realization of the Semantic Web; it provides a formal description of the knowledge in web pages and of its semantic meaning within a domain. This paper presents a method for the semantic annotation of web pages guided by a domain ontology. The degree of semantic correlation is measured from two aspects of a word's meaning, using edit distance and WordNet distance, and a mapping between the web page and the ontology is then built. After the web pages are annotated, the annotation results are used to extend the ontology effectively and make it domain-specific. Experimental results show that the tagging method based on weight coefficients obtained from edit distance, WordNet distance, and extended ontology concepts performs best, and that the method is effective and applicable.
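The abstract does not give the formula used to combine the two distances, so the following is only a minimal sketch of one plausible reading: a weighted blend of a string-level similarity derived from edit distance and a semantic similarity score. The weight `alpha`, the function names, and the toy lookup table standing in for a WordNet-based measure are all assumptions made for illustration, not the authors' method.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def string_similarity(a: str, b: str) -> float:
    """Rescale edit distance into [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def combined_score(term, concept, semantic_similarity, alpha=0.5):
    """Blend string and semantic similarity; alpha is a hypothetical weight."""
    lexical = string_similarity(term.lower(), concept.lower())
    return alpha * lexical + (1.0 - alpha) * semantic_similarity(term, concept)


def best_concept(term, concepts, semantic_similarity, alpha=0.5):
    """Map a page term to the ontology concept with the highest blended score."""
    return max(concepts,
               key=lambda c: combined_score(term, c, semantic_similarity, alpha))


# Toy stand-in for a WordNet-based similarity (assumed values, not real data).
TOY_SEMANTIC = {("automobile", "car"): 1.0}


def toy_semantic_similarity(term, concept):
    return TOY_SEMANTIC.get((term, concept), 0.0)


concepts = ["car", "autopilot"]
print(best_concept("automobile", concepts, toy_semantic_similarity, alpha=0.5))
print(best_concept("automobile", concepts, toy_semantic_similarity, alpha=1.0))
```

With `alpha=1.0` only the spelling counts, so a look-alike string can win over a true synonym; lowering `alpha` lets the semantic score correct that. In practice the lookup table would be replaced by a real WordNet-based measure.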

2012 ◽  
Vol 2 (1) ◽  
pp. 1-10 ◽  
Author(s):  
Syarifah Bahiyah Rahayu

Semantic annotation represents the metadata of a document based on a domain ontology. The purpose of this paper is to develop a ranking algorithm for semantic document annotation and to evaluate its performance in a Semantic Web (SW) application. The evaluation compares the ranking algorithm against other algorithms; for this purpose, all the algorithms are applied in the SW application, a research prototype retrieval engine called PicoDoc. The system framework of PicoDoc is based on the OCAS2008 ontology. For the experiments, a real-life dataset is selected from a corpus of news articles from the ABC and BBC websites. The experiment shows promising results in retrieving related information using the ranking algorithm.


2012 ◽  
Vol 263-266 ◽  
pp. 1588-1592
Author(s):  
Jiu Qing Li ◽  
Chi Zhang ◽  
Peng Zhou Zhang

To address inefficient resource tagging and low-precision retrieval in a special field, a method for analyzing tag semantic relevancy based on a controlled database is proposed. The characteristics of the special field and the method for building the controlled database are discussed. A domain ontology correlation calculation is used to obtain semantic correlation. A tag semantic similarity calculation is developed to obtain semantic similarity, with normalization used to increase the accuracy of the similarity. With semantic correlation and similarity as parameters, the semantic relevancy in the special field can be obtained. The method was applied successfully in actual projects in the special field, where it improved resource-tagging and retrieval efficiency.


2021 ◽  
Author(s):  
Enshuai Hou ◽  
Jie Zhu

Tibetan is a low-resource language. To alleviate the shortage of Tibetan-Chinese parallel corpora, this paper uses two monolingual corpora and a small seed dictionary to study a semi-supervised method based on the seed dictionary and a self-supervised adversarial training method, both relying on similarity calculation between word clusters in different embedding spaces, and proposes an improved self-supervised adversarial learning method that aligns Tibetan and Chinese using monolingual data only. The experimental results are as follows. First, results for aligning Tibetan syllables with Chinese characters are poor, which reflects the weak semantic correlation between Tibetan syllables and Chinese characters. Second, the semi-supervised method with a seed dictionary reaches a top-10 prediction accuracy of 66.5 (Tibetan to Chinese) and 74.8 (Chinese to Tibetan), while the improved self-supervised method reaches an accuracy of 53.5 in both language directions.
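The accuracy figures above are most naturally read as precision over the ten nearest candidate translations, although the abstract does not define the metric; that reading is an assumption. Under it, a rough sketch of the evaluation looks as follows: source-word vectors, assumed to be already mapped into the target embedding space, are matched to target words by cosine similarity and checked against a gold dictionary. All names and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 0.0 if norm_u == 0 or norm_v == 0 else dot / (norm_u * norm_v)


def top_k_translations(src_vec, tgt_embeddings, k):
    """Return the k target words whose vectors are closest to src_vec."""
    ranked = sorted(tgt_embeddings,
                    key=lambda w: cosine(src_vec, tgt_embeddings[w]),
                    reverse=True)
    return ranked[:k]


def precision_at_k(src_embeddings, tgt_embeddings, gold, k=10):
    """Percentage of source words with a gold translation among the top k."""
    hits = 0
    for src_word, gold_targets in gold.items():
        candidates = top_k_translations(src_embeddings[src_word],
                                        tgt_embeddings, k)
        if any(c in gold_targets for c in candidates):
            hits += 1
    return 100.0 * hits / len(gold)


# Toy vectors (assumed, not real embeddings), already in a shared space.
src = {"s1": [1.0, 0.0], "s2": [0.0, 1.0]}
tgt = {"t1": [0.9, 0.1], "t2": [0.1, 0.9], "t3": [-1.0, 0.0]}
gold = {"s1": {"t1"}, "s2": {"t3"}}

print(precision_at_k(src, tgt, gold, k=1))
print(precision_at_k(src, tgt, gold, k=3))
```

The mapping step itself (learned from a seed dictionary or adversarially) is outside this sketch; only the retrieval-and-scoring step is shown.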


Author(s):  
Andrew Iliadis ◽  
Wesley Stevens ◽  
Jean-Christophe Plantin ◽  
Amelia Acker ◽  
Huw Davies ◽  
...  

This panel focuses on the way that platforms have become key players in the representation of knowledge. Recently, there have been calls to combine infrastructure and platform-based frameworks to understand the nature of information exchange on the web through digital tools for knowledge sharing. The present panel builds on and extends work in platform and infrastructure studies on what has been referred to as “knowledge as programmable object” (Plantin et al., 2018), specifically focusing on how metadata and semantic information are shaped and exchanged in specific web contexts. As Bucher (2012; 2013) and Helmond (2015) show, data portability in the context of web platforms requires a certain level of semantic annotation. Semantic interoperability is the defining feature of so-called "Web 3.0", traditionally referred to as the semantic web (Antoniou et al., 2012; Szeredi et al., 2014). Since its inception, the semantic web has privileged the status of metadata for providing the fine-grained levels of contextual expressivity needed for machine-readable web data, and such metadata can be found in products as diverse as Google's Knowledge Graph, online research repositories like Figshare, and other sources that engage in platformizing knowledge. The first paper in this panel examines the international Schema.org collaboration. The second paper investigates the epistemological implications when platforms organize data sharing. The third paper argues for the use of patents to inform research methodologies for understanding knowledge graphs. The fourth paper discusses private platforms’ extraction and collection of user metadata and the enclosure of data access.


Author(s):  
Wei Du ◽  
Haiyan Zhu ◽  
Teeraporn Saeheaw

Based on the LDA model, this paper builds a three-layer “document-topic-keyword” semantic model of Web English educational resources, models the semantic topics of resource documents, and obtains the semantic topics and keywords of document resources as the semantic labels of the resources. The experimental results show that LDA topic modeling of documents is useful for the macroscopic classification and cataloging of Web English educational resources, highlighting teaching priorities, difficulties, and interrelationships, while LDA modeling of teaching topics with the same teaching content expands the metadata generation method of resource description based on the basic education metadata standard and provides more information about the inherent characteristics of resources. The semantic information can be used to mine the semantic thematic features and detailed differences inherent in the resources, and the final performance analysis verifies the parallel computing advantages of the LDA model in a big data environment.


Author(s):  
Jie Zhao ◽  
Jianfei Wang ◽  
Jia Yang ◽  
Peiquan Jin

In this chapter, we study the problem of extracting company acquisition relations from huge numbers of webpages, and propose a novel algorithm for company acquisition relation extraction. Our algorithm considers the tense feature of Web content and a classification technique based on semantic strength when extracting company acquisition relations from webpages. It first determines the tense of each sentence in a webpage, using a CRF model. The tense of the sentences is then applied to sentence classification in order to evaluate the semantic strength of the candidate sentences in describing a company acquisition relation. After that, we rank the candidate acquisition relations and return the top-k company acquisition relations. We run experiments on 6144 pages crawled through Google, and measure the performance of our algorithm under different metrics. The experimental results show that our algorithm is effective in determining the tense of sentences as well as the company acquisition relations.


2012 ◽  
pp. 535-578
Author(s):  
Jie Tang ◽  
Duo Zhang ◽  
Limin Yao ◽  
Yi Li

This chapter aims to give a thorough investigation of the techniques for automatic semantic annotation. The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. However, the lack of annotated semantic data is a bottleneck to making the Semantic Web vision a reality. Therefore, it is necessary to automate the process of semantic annotation. In the past few years, there has been a rapid expansion of activities in the semantic annotation area, and many methods have been proposed for automating the annotation process. However, due to the heterogeneity and lack of structure of Web data, automated discovery of targeted or unexpected knowledge still presents many challenging research problems. In this chapter, we study the problems of semantic annotation and introduce the state-of-the-art methods for dealing with them. We also give a brief survey of the systems developed on the basis of these methods and introduce several real-world applications of semantic annotation. Finally, some emerging challenges in semantic annotation are discussed.

