Bulgarian-English Treebank

2012
Vol 7
Author(s):
Kiril Simov
Petya Osenova

The paper describes the construction of a Bulgarian-English treebank aligned at the word and semantic levels. We consider manual word-level alignment easier and more reliable than manual alignment at the syntactic and semantic levels. Thus, after manual word-level alignment, we apply an automatic procedure for constructing the semantic-level alignments. We present the main steps of this automatic procedure, which exploits the syntactic analyses of both sentences, the morphosyntactic annotation, and the manual word-level alignment to produce the semantic annotation of the sentences and the semantic alignment. Finally, we discuss a method for identifying potential errors that compares the automatically constructed semantic analyses of the Bulgarian sentences with the semantic representations of the English sentences.

2013
Vol 849
pp. 298-301
Author(s):
Gui Yang Jin
Fu Zai Lv
Zhan Qin Xiang

Modern enterprises consist of complex business systems that need to be integrated to support enterprise operations. SOA and ESB have become an important architectural style for designing and implementing integration systems. However, today's ESB frameworks have limitations: service interfaces are described only syntactically, semantic mediation is not supported, and process knowledge cannot be managed. As a result, developers need deep and intimate knowledge to develop integration systems. We introduce an ontology-based semantic annotation approach that enriches and reconciles the semantics of data, services and process models on the ESB, enabling their interoperability at the semantic level through common domain ontologies.


2021
Vol 48 (3)
pp. 231-247
Author(s):
Xu Tan
Xiaoxi Luo
Xiaoguang Wang
Hongyu Wang
Xilong Hou

Digital images of cultural heritage (CH) contain rich semantic information. However, today’s semantic representations of CH images fail to fully reveal the content entities and context within these vital surrogates. This paper draws on the fields of image research and digital humanities to propose a systematic methodology and a technical route for semantic enrichment of CH digital images. The methodology applies a series of procedures: semantic annotation, entity-based enrichment, establishing internal relations, event-centric enrichment, defining hierarchy relations between properties, text annotation, and finally named entity recognition, in order to provide fine-grained contextual disclosure of semantic content. The feasibility and advantages of the proposed semantic enrichment methods for semantic representation are demonstrated via a visual display platform for digital images of CH built to represent the Wutai Mountain Map, a typical Dunhuang mural. This study proves that semantic enrichment offers a promising new model for exposing content at a fine-grained level and for establishing a rich semantic network centered on the content of digital images of CH.


Author(s):
Chong Wang
Zheng-Jun Zha
Dong Liu
Hongtao Xie

High-level semantic knowledge, in addition to low-level visual cues, is crucial for co-saliency detection. This paper proposes a novel end-to-end deep learning approach for robust co-saliency detection that simultaneously learns a high-level group-wise semantic representation and deep visual features of a given image group. The inter-image interaction at the semantic level, as well as the complementarity between group semantics and visual features, is exploited to improve the inference of co-salient regions. Specifically, the proposed approach consists of a co-category learning branch and a co-saliency detection branch. The former learns a group-wise semantic vector using the co-category association of an image group as supervision; the latter infers precise co-saliency maps from the ensemble of group semantic knowledge and deep visual cues. The group semantic vector is broadcast to each spatial location of the multi-scale visual feature maps and serves as top-down semantic guidance for the bottom-up inference of co-saliency. The co-category learning and co-saliency detection branches are jointly optimized in a multi-task learning manner, further improving the robustness of the approach. Moreover, we construct a new large-scale co-saliency dataset, COCO-SEG, to facilitate research on co-saliency detection. Extensive experimental results on COCO-SEG and the widely used benchmark Cosal2015 demonstrate the superiority of the proposed approach over state-of-the-art methods.
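The broadcasting step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the tiled vector is combined with the visual features, so channel-wise concatenation is an assumption here, and the function name `broadcast_semantic_vector` and the toy shapes are hypothetical.

```python
def broadcast_semantic_vector(feature_map, semantic_vector):
    """Tile a D-dim vector over an H x W grid and concatenate it channel-wise.

    feature_map: list of C channels, each an H x W list of lists.
    semantic_vector: list of D floats (the group-wise semantic vector).
    Returns a new list of C + D channels, each H x W.
    Concatenation is an assumed fusion rule, not taken from the paper.
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    # One constant H x W plane per semantic dimension.
    tiled = [[[value] * width for _ in range(height)] for value in semantic_vector]
    # Copy the visual channels so the input is left unmodified.
    return [[row[:] for row in channel] for channel in feature_map] + tiled


features = [[[1.0, 2.0], [3.0, 4.0]]]  # C=1, H=2, W=2
group_vector = [0.5, -1.0]             # D=2
fused = broadcast_semantic_vector(features, group_vector)
```

Every spatial location of `fused` carries the same semantic values alongside its own visual features. For multi-scale feature maps, the same function would be applied once per scale.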


2014
Vol 18 (3)
pp. 372-390
Author(s):
DENISA BORDAG
AMIT KIRSCHENBAUM
ERWIN TSCHIRNER
ANDREAS OPITZ

A novel combination of several experimental and non-experimental paradigms was applied to explore initial stages of incidental vocabulary acquisition (IVA) during reading in German as a second language (L2). The results show that syntactic complexity of the context positively affects incidental acquisition of new words, triggering the learner's shift of attention from the text level to the word level. A subsequent semantic priming task revealed that the new words establish associations with semantically related representations in the L2 mental lexicon after just three previous occurrences and without any consolidation period. The semantic inhibition effect for the new words (contrary to semantic facilitation for known L2 words), however, indicates that the memory traces of the new semantic representation are still very weak and that their retrieval is probably hindered by stronger semantically related representations that have much lower activation thresholds and higher potential for being selected.


2021
Vol 72 (2)
pp. 319-329
Author(s):
Aleksei Dobrov
Maria Smirnova

This article presents the current results of an ongoing study on fine-tuning automatic morphosyntactic and semantic annotation by improving the underlying formal grammar and ontology, using one Tibetan text as an example. The ultimate purpose of the work at this stage was to improve linguistic software developed for natural-language processing and understanding, so as to achieve complete annotation of a specific text and a state of the formal model in which all linguistic phenomena observed in the text are explained. This purpose includes the following tasks: analyzing error cases in the annotation of the text from the corpus, eliminating these errors in automatic annotation, developing the formal grammar, and updating the dictionaries. Along with morphosyntactic analysis, the current approach involves simultaneous semantic analysis. The article describes the semantic annotation of the corpus, required by grammar revision and development, which was made using a computer ontology. The work is carried out on one of the corpus texts, the grammatical poetic treatise Sum-cu-pa (VII c.).


2012
Vol 06 (01)
pp. 51-65
Author(s):
MAMDOUH FAROUK
MITSURU ISHIZUKA

Representing web data in a machine-understandable format is an important task for the next generation of the web. Most solutions reported to date rely on ontologies. However, the use of ontologies is associated with many problems. Moreover, representing web data in a machine-understandable format is not sufficient to respond to all user queries. This paper proposes an approach for representing dynamic web page content retrieved from an underlying database in the concept description language (CDL), a semantic format that does not depend on ontologies. Instead, CDL describes the semantic structure of web content based on a set of predefined concepts and semantic relations. Moreover, this work proposes the addition of extra knowledge to the semantic level of the DB schema to improve the query answering process. A prototype of the proposed approach was implemented to demonstrate its feasibility, and a simple query engine was developed to demonstrate the effectiveness of adding extra knowledge to the DB semantic schema.


Terminology
1998
Vol 5 (2)
pp. 203-228
Author(s):
Bernardo Magnini

The role of generic lexical resources as well as specialized terminology is crucial in the design of complex dialogue systems, where a human interacts with the computer using Natural Language. Lexicon and terminology are supposed to store information for several purposes, including the discrimination of semantically inconsistent interpretations, the use of lexical variations, the compositional construction of a semantic representation for a complex sentence and the ability to access equivalencies across different languages. For these purposes it is necessary to rely on representational tools that are both theoretically motivated and operationally well defined. In this paper we propose a solution to lexical and terminology representation which is based on the combination of a linguistically motivated upper model and a multilingual WordNet. The upper model accounts for the linguistic analysis at the sentence level, while the multilingual WordNet accounts for lexical and conceptual relations at the word level.


2021
Vol 11 (1)
Author(s):
Michael P. Broderick
Giovanni M. Di Liberto
Andrew J. Anderson
Adrià Rofes
Edmund C. Lalor

Healthy ageing leads to changes in the brain that impact upon sensory and cognitive processing. It is not fully clear how these changes affect the processing of everyday spoken language. Prediction is thought to play an important role in language comprehension, where information about upcoming words is pre-activated across multiple representational levels. However, evidence from electrophysiology suggests differences in how older and younger adults use context-based predictions, particularly at the level of semantic representation. We investigate these differences during natural speech comprehension by presenting older and younger subjects with continuous, narrative speech while recording their electroencephalogram. We use time-lagged linear regression to test how distinct computational measures of (1) semantic dissimilarity and (2) lexical surprisal are processed in the brains of both groups. Our results reveal dissociable neural correlates of these two measures that suggest differences in how younger and older adults successfully comprehend speech. Specifically, our results suggest that, while younger and older subjects both employ context-based lexical predictions, older subjects are significantly less likely to pre-activate the semantic features relating to upcoming words. Furthermore, across our group of older adults, we show that the weaker the neural signature of this semantic pre-activation mechanism, the lower a subject’s semantic verbal fluency score. We interpret these findings as prediction playing a generally reduced role at a semantic level in the brains of older listeners during speech comprehension and that these changes may be part of an overall strategy to successfully comprehend speech with reduced cognitive resources.
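As a rough illustration of what time-lagged linear regression means in general, the sketch below regresses a response signal on delayed copies of a stimulus feature. It is not the authors' pipeline: the abstract gives no estimator details, so the optional `ridge` term, the zero-padding, the function names and the synthetic data are all assumptions made for illustration.

```python
import random


def lagged_design(stimulus, n_lags):
    """Row t holds stimulus[t], stimulus[t-1], ..., zero-padded before t = 0."""
    return [
        [stimulus[t - lag] if t - lag >= 0 else 0.0 for lag in range(n_lags)]
        for t in range(len(stimulus))
    ]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_lagged_regression(stimulus, response, n_lags, ridge=0.0):
    """Weights w minimising |X w - response|^2 + ridge * |w|^2 (normal equations)."""
    x = lagged_design(stimulus, n_lags)
    xtx = [
        [sum(row[i] * row[j] for row in x) + (ridge if i == j else 0.0)
         for j in range(n_lags)]
        for i in range(n_lags)
    ]
    xty = [sum(row[i] * y for row, y in zip(x, response)) for i in range(n_lags)]
    return solve_linear_system(xtx, xty)


# Synthetic check: a noiseless response built from a known lag kernel.
rng = random.Random(0)
stimulus = [rng.gauss(0.0, 1.0) for _ in range(200)]
true_kernel = [0.0, 1.0, -0.5]
response = [sum(w * v for w, v in zip(true_kernel, row))
            for row in lagged_design(stimulus, len(true_kernel))]
weights = fit_lagged_regression(stimulus, response, n_lags=3)
```

In this toy setting, `weights` approximates `true_kernel`, one weight per lag. In an EEG setting the stimulus would be a word-level measure such as semantic dissimilarity or lexical surprisal, and the fitted weights across lags describe how the response to that measure unfolds over time.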


2012
Vol 2 (4)
pp. 31-44
Author(s):
Mohamed H. Haggag
Bassma M. Othman

Context processing plays an important role in many Natural Language Processing applications. Sentence ordering is one of the critical tasks in text generation. The order of sentences in the raw source texts need not be preserved in the generated text, so chronological sentence ordering is of high importance. Some studies have relied on syntactic analysis and others on statistical approaches. This paper proposes a new model for sentence ordering based on semantic analysis. Word-level semantics forms the seed for sentence-level semantic relations. The model introduces a clustering technique based on the relatedness of sentence senses. Sentences are then ordered chronologically in two main steps: overlap detection and chronological cause-effect rules. Overlap detection drills down into each cluster to step through its sentences in chronological sequence. The cause-effect rules form the linguistic knowledge that governs relations between sentences. Evaluation showed that the proposed model can process texts of any size, is not domain specific, and allows the cause-effect rules to be extended for specific ordering needs.


2020
Vol 47 (7)
pp. 604-615
Author(s):
Xiaoguang Wang
Wanli Chang
Xu Tan

This study employs a knowledge graph approach to represent and associate information resources and to promote the research, teaching, and dissemination of Dunhuang cultural heritage (CH). The Dunhuang Mogao Grottoes is a UNESCO world CH site, and the digitization of Dunhuang CH has produced a large amount of information resources. However, these digitized resources still lack the systematic, granular semantic representation required to correlate Dunhuang cultural heritage information (CHI) and so facilitate efficient research and appreciation. To respond to this need, new approaches for representing CHI are being developed. This study identifies five facets and their semantic relationships to Dunhuang CH, constructs an ontology model to regulate the entities, attributes, and relationships of Dunhuang CH knowledge, and then processes the resulting data using various techniques (such as semantic annotation and entity association) to render the data as a knowledge graph. Finally, we constructed a DH-oriented knowledge graph service platform to provide a user-friendly visual display and semantic retrieval service.

