universal networking language
Recently Published Documents


TOTAL DOCUMENTS

52
(FIVE YEARS 2)

H-INDEX

5
(FIVE YEARS 0)

Author(s):  
Md. Nawab Yousuf Ali ◽  
Md. Lizur Rahman ◽  
Jyotismita Chaki ◽  
Nilanjan Dey ◽  
K. C. Santosh




Author(s):  
Subalalitha C. N.

This chapter discusses how text summaries can be generated using a high-level semantic representation. The semantic representation is built on a discourse structure that combines three text representation techniques: universal networking language (UNL), rhetorical structure theory (RST), and Saṅgatis. Saṅgati is an ancient concept used in Sanskrit literature to capture coherence. This discourse structure is indexed using a concept called sūtra, which has been used in both Tamil and Sanskrit literature. The chapter mainly focuses on how a summary can be generated using this unique discourse structure and the sūtra indexing technique. The Forum for Information Retrieval Evaluation (FIRE) corpus has been used to test the system, and its performance has been compared with one of the state-of-the-art summary generation systems built on discourse structure.



2020 ◽  
pp. 7-17
Author(s):  
Shruti Srivastava ◽  
Sharvari Govilkar

Paraphrases are sentences that differ in their textual content or in the arrangement of their words but convey the same meaning. Identifying paraphrases is important in many real-life applications such as information retrieval, plagiarism detection, text summarization and question answering. A large amount of work on paraphrase detection has been done for English and many Indian languages. However, there is no existing system to identify paraphrases in Marathi; this is the first such endeavor for the Marathi language. Because paraphrases are differently structured sentences and Marathi is a semantically strong language, this system is designed to check both the statistical and the semantic similarity of Marathi sentences. The statistical similarity measure does not need any prior knowledge, as it is based only on the factual data of the sentences. The factual data is calculated from the degree of closeness between the word-set, word-order, word-vector and word-distance. Universal Networking Language (UNL) expresses the semantic significance of a sentence without any syntactic point of interest. Hence, the semantic similarity calculated from the UNL graphs generated for two Marathi sentences indicates their semantic equality. The total paraphrase score is calculated by combining the statistical and semantic similarity scores, which gives the judgement of paraphrase or non-paraphrase for the Marathi sentences.
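
The abstract above does not give the formulas used, so the following is only a minimal sketch of the general idea: one statistical measure (word-set overlap) combined with a semantic score into a single paraphrase decision. The Jaccard measure, the equal weighting, the threshold, and all function names are assumptions made for illustration, and the semantic score is taken as an input rather than computed from UNL graphs.

```python
def word_set_similarity(sentence_a: str, sentence_b: str) -> float:
    """Jaccard overlap of the two sentences' token sets (an assumed measure)."""
    tokens_a = set(sentence_a.split())
    tokens_b = set(sentence_b.split())
    union = tokens_a | tokens_b
    if not union:
        # Two empty sentences are treated as identical.
        return 1.0
    return len(tokens_a & tokens_b) / len(union)


def paraphrase_score(statistical: float, semantic: float, weight: float = 0.5) -> float:
    """Weighted combination of the two scores; the weight is hypothetical."""
    return weight * statistical + (1.0 - weight) * semantic


def is_paraphrase(statistical: float, semantic: float,
                  threshold: float = 0.8, weight: float = 0.5) -> bool:
    """Judge paraphrase or non-paraphrase against a hypothetical threshold."""
    return paraphrase_score(statistical, semantic, weight) >= threshold
```

In a fuller system the statistical score would also fold in word-order, word-vector and word-distance measures, and the semantic score would come from comparing the two UNL graphs.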



Information ◽  
2019 ◽  
Vol 10 (10) ◽  
pp. 324
Author(s):  
Ali ◽  
Rahman ◽  
Sorwar

The people of Bangladesh and of two states in India (Tripura and West Bengal), about 230 million people in total, use Bengali as their first language. However, very few resources and tools are available for this language. This paper presents a Bangla DeConverter to extract Bangla texts from Universal Networking Language (UNL). It explains and illustrates the different phases of the proposed Bangla DeConverter. The syntactic linearization, the implementation and results of the proposed Bangla DeConverter, and the extraction of a Bangla sentence from UNL expressions are presented in this paper. The Bangla DeConverter has been tested on UNL expressions of 300 Bangla sentences using a Russian and English Language Server. The proposed system generates 90% syntactically and semantically correct Bangla sentences, with a UNL Bilingual Evaluation Understudy (BLEU) score of 0.76.
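
For readers unfamiliar with the metric reported above, here is a minimal sketch of standard sentence-level BLEU against a single reference: clipped n-gram precisions combined by a geometric mean and multiplied by a brevity penalty. It assumes whitespace tokenization and no smoothing; the abstract does not describe how its "UNL BLEU" variant differs, so this is not the authors' evaluation code.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed single-reference BLEU with uniform n-gram weights."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates no longer than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1.0 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)
```

Corpus-level BLEU, which is what a figure such as 0.76 over 300 sentences would normally refer to, pools the n-gram counts over all sentences before taking the precisions rather than averaging per-sentence scores.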



2019 ◽  
Vol 15 (1) ◽  
pp. 119-149
Author(s):  
Balaji Jagan ◽  
Ranjani Parthasarathi ◽  
T V Geetha

This article focuses on the use of a bootstrapping approach for the extraction of semantic relations that exist between two different concepts in a Tamil text. The proposed system, bootstrapping approach to semantic UNL relation extraction (BASURE), extracts generic relations that exist between different components of a sentence by exploiting the morphological richness of Tamil. Tamil is essentially a partially free word order language, which means that the semantic relations between concepts can occur anywhere in the sentence, not necessarily in a fixed order. Here, the authors use Universal Networking Language (UNL), an interlingua framework, to represent the word-based features and aim to define the UNL semantic relations that exist between any two constituents in a sentence. The morphological suffix, lexical category and UNL semantic constraints associated with a word are defined as tuples of the pattern used for bootstrapping. Most systems define the initial set of seed patterns manually. However, this article uses a rule-based approach to obtain the word-based features that form the tuples of the patterns. A bootstrapping approach is then applied to extract all possible instances from the corpus and to generate new patterns. The authors also introduce the use of the UNL ontology to discover the semantic similarity between semantic tuples of the pattern and hence to learn new patterns from the text corpus in an iterative manner. The use of the UNL ontology makes this approach general and domain independent. The results obtained are evaluated and compared with existing approaches, and it is shown that this approach is generic, can extract all sentence-based semantic UNL relations, and significantly increases the performance of the generic semantic relation extraction system.
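
The iterative pattern learning described above can be sketched roughly as follows. This is not BASURE itself: the feature tuples, the `tuple_similarity` function (a simple field-match ratio standing in for the ontology-based similarity), the threshold, and the placeholder feature values are all hypothetical and chosen only to show the shape of a bootstrapping loop.

```python
def tuple_similarity(a, b):
    """Fraction of fields on which two feature tuples agree.

    A stand-in for ontology-based similarity, assumed for illustration.
    """
    return sum(x == y for x, y in zip(a, b)) / len(a)


def bootstrap(seed_patterns, corpus, threshold=0.6, max_iterations=10):
    """Grow a pattern -> relation mapping from seeds over a corpus.

    Each pattern is a (suffix, lexical_category, semantic_constraint) tuple.
    """
    patterns = dict(seed_patterns)
    if not patterns:
        return patterns
    for _ in range(max_iterations):
        new_patterns = {}
        for features in corpus:
            if features in patterns or features in new_patterns:
                continue
            # Find the closest known pattern and inherit its relation label
            # when the similarity is high enough.
            best = max(patterns, key=lambda p: tuple_similarity(p, features))
            if tuple_similarity(best, features) >= threshold:
                new_patterns[features] = patterns[best]
        if not new_patterns:
            break  # No new patterns were learned in this pass.
        patterns.update(new_patterns)
    return patterns


# Placeholder data: suffix names and constraints are invented for the example.
seeds = {("suffix_a", "noun", "place"): "plc"}
corpus = [
    ("suffix_a", "noun", "place"),
    ("suffix_b", "noun", "place"),
    ("suffix_b", "verb", "place"),
    ("suffix_c", "verb", "time"),
]
learned = bootstrap(seeds, corpus)
```

Note that the third tuple is only reachable through the pattern learned in the first pass, which is what makes the process iterative; it is also how bootstrapping can drift, so a real system needs a stricter similarity measure than this field-match ratio.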



Author(s):  
Subalalitha C. N. ◽  
Balaji J.

Summary generation systems, when integrated with Information Retrieval (IR), can give the user an idea of the retrieved web pages before the user even opens them. The summary can be generated for a single web page or for a set of web pages retrieved for a given query. When such a system is built for tourism web sites, the user can get a summary of a particular tourist spot or of the tourist spots in a particular place. This chapter describes such a summary generation system, which is built using Rhetorical Structure Theory (RST). RST is a well-known discourse theory used for discourse analysis of text documents. The RST-based system makes use of another semantic representation, namely Universal Networking Language (UNL), to find the coherent text fragments. These coherent text fragments are indexed and linked with an IR system. When a user gives a query, the web pages are returned along with single-document and multi-document summaries.


