A NEW APPROACH FOR TEXT SIMILARITY USING ARTICLES

Conventional approaches to text analysis and information retrieval, which measure document similarity by considering all the information in a text, are relatively inefficient for processing large text collections in heterogeneous subject areas. Previous research showed that evidence from passages can improve retrieval results, but it also raised questions about how passages are defined, how they can be ranked efficiently, and what their proper role is in long structured documents. Moreover, the frequency of the article "the" in important sentences is an effective cue for summarizing a text. We previously proposed an approach that extracts sentences containing the article "the" under a set of restrictive rules to obtain effective passages. Building on that work, this paper presents a new Passage SIMilarity (P-SIM) measure between documents, computed on the effective passages extracted using the article "the". The new approach is shown to be more efficient than traditional methods. Recall and Precision reach 92.6% and 97.5%, respectively, on the extracted passages. Furthermore, Recall and Precision improved significantly, by 38.3% and 44.2%, over the traditional method. The proposed methods are applied to 3,990 articles from a large tagged corpus.
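The abstract does not give the extraction rules or the P-SIM formula, so the following is only a minimal sketch of the general idea, under stated assumptions: keep sentences that contain the article "the", build a bag-of-words vector from them, and compare documents by cosine similarity. The function names (`extract_passages`, `passage_similarity`, `precision_recall`) and the choice of cosine similarity are illustrative assumptions, not the authors' method.

```python
import math
import re
from collections import Counter


def extract_passages(text):
    """Keep sentences containing the article "the".

    Assumption: a plain keyword filter stands in for the paper's
    restrictive extraction rules, which the abstract does not describe.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if re.search(r"\bthe\b", s, flags=re.IGNORECASE)]


def term_vector(passages):
    """Term frequencies over the extracted passages, ignoring "the" itself."""
    tokens = re.findall(r"[a-z]+", " ".join(passages).lower())
    return Counter(t for t in tokens if t != "the")


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def passage_similarity(doc_a, doc_b):
    """Similarity of two documents based only on their extracted passages."""
    return cosine(term_vector(extract_passages(doc_a)),
                  term_vector(extract_passages(doc_b)))


def precision_recall(retrieved, relevant):
    """Standard precision and recall over sets of document identifiers."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Restricting the vectors to the extracted passages is what would make such a measure cheaper than full-text comparison; how much is lost or gained in accuracy depends on the actual extraction rules, which this sketch does not reproduce.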

A Medical Text Analysis System for German - Syntax Analysis

Methods of Information in Medicine ◽

10.1055/s-0038-1634842 ◽

1991 ◽

Vol 30 (04) ◽

pp. 275-283 ◽

Cited By ~ 7

Author(s):

P. M. Pietrzyk

Keyword(s):

Text Analysis ◽

Free Text ◽

New Approach ◽

Syntax Analysis ◽

Medical Text ◽

Medical Language ◽

Computerized Processing ◽

Language Data ◽

Analysis System ◽

German Syntax

Abstract: Much information about patients is stored in free text. Hence, the computerized processing of medical language data has been a well-known goal of medical informatics, resulting in different paradigms. In Göttingen, a Medical Text Analysis System for German (abbr. MediTAS) has been under development for some time, trying to combine and extend these paradigms. This article concentrates on the automated syntax analysis of German medical utterances. The investigated text material consists of 8,790 distinct utterances extracted from the summary sections of about 18,400 cytopathological findings reports. The parsing is based upon a new approach called Left-Associative Grammar (LAG), developed by Hausser. By considerably extending the LAG approach, most of the grammatical constructions occurring in the text material could be covered.

Integrating Semantic Knowledge into Text Similarity and Information Retrieval

International Conference on Semantic Computing (ICSC 2007) ◽

10.1109/icsc.2007.12 ◽

2007 ◽

Cited By ~ 4

Author(s):

Christof Muller ◽

Iryna Gurevych ◽

Max Muhlhauser

Keyword(s):

Information Retrieval ◽

Semantic Knowledge ◽

Text Similarity

Contribution to Semantic Analysis of Arabic Language

Advances in Artificial Intelligence ◽

10.1155/2012/620461 ◽

2012 ◽

Vol 2012 ◽

pp. 1-8 ◽

Cited By ~ 6

Author(s):

Anis Zouaghi ◽

Mounir Zrigui ◽

Georges Antoniadis ◽

Laroussi Merhbene

Keyword(s):

Information Retrieval ◽

Semantic Analysis ◽

String Matching ◽

Ambiguous Word ◽

Arabic Language ◽

New Approach ◽

Matching Algorithm ◽

Ambiguous Words ◽

Lesk Algorithm ◽

Context Of Use

We propose a new approach for determining the appropriate sense of Arabic words. To that end, we propose an algorithm based on information retrieval measures that identifies the context of use closest to the sentence containing the word to be disambiguated. The contexts of use are sets of sentences that indicate a particular sense of the ambiguous word. These contexts are generated using the words that define the senses of the ambiguous word, the exact string-matching algorithm, and the corpus. We use measures employed in information retrieval (Harman, Croft, and Okapi), combined with the Lesk algorithm, to select the correct sense among those proposed.
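As a rough illustration of choosing the sense whose context of use is closest to the input sentence, here is a simplified Lesk-style sketch. It scores each candidate context by plain word overlap; the Harman, Croft, and Okapi weighting measures named in the abstract are not reproduced. The function names, the data layout (`contexts_by_sense`), and the English toy data in the usage note are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase word set; a stand-in for proper Arabic tokenization."""
    return set(re.findall(r"\w+", text.lower()))


def disambiguate(sentence, contexts_by_sense):
    """Return the sense whose best context shares the most words with `sentence`.

    `contexts_by_sense` maps a sense label to a list of context sentences
    (hypothetical layout). Ties keep the first sense encountered.
    """
    target = tokenize(sentence)
    best_sense, best_score = None, -1
    for sense, contexts in contexts_by_sense.items():
        # Score a sense by its single closest context of use.
        score = max((len(target & tokenize(c)) for c in contexts), default=0)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense
```

For example, given the contexts `{"bank_river": ["the river bank was muddy after the flood"], "bank_money": ["she deposited money at the bank"]}`, the sentence "he opened an account to deposit money at the bank" is assigned `bank_money`, since it shares more words with that context. Replacing the raw overlap count with a term-weighting measure is where the retrieval measures mentioned above would come in.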

Multilingual Legal Information Retrieval System for Mapping Recitals and Normative Provisions

Frontiers in Artificial Intelligence and Applications - Legal Knowledge and Information Systems ◽

10.3233/faia200856 ◽

2020 ◽

Author(s):

Rohan Nanda ◽

Llio Humphreys ◽

Lorenzo Grossio ◽

Adebayo Kolawole John

Keyword(s):

Information Retrieval ◽

Retrieval System ◽

Information Retrieval System ◽

Text Similarity ◽

Legal Information ◽

Additional Information ◽

Eu Legislation ◽

Multilingual Corpus ◽

Eu Directives ◽

Develop State

This paper presents a multilingual legal information retrieval system for mapping recitals to articles in European Union (EU) directives and normative provisions in national legislation. Such a system could be useful for purposive interpretation of norms. Previous work on mapping recitals and normative provisions was limited to EU legislation in English and to a single lexical text similarity technique. In this paper, we develop state-of-the-art text similarity models to investigate the interplay between directive recitals, directive (sub-)articles and provisions of national implementing measures (NIMs) on a multilingual corpus (from Ireland, Italy and Luxembourg). Our results indicate that directive recitals do not have a direct influence on NIM provisions, but they sometimes contain additional information that is not present in the transposed directive sub-article, and can therefore facilitate purposive interpretation.

REFINING THE CALCULATION OF THE CAR TRACTION POWER

The Russian Automobile and Highway Industry Journal ◽

10.26518/2071-7296-2019-5-572-579 ◽

2019 ◽

Vol 16 (5) ◽

pp. 572-579 ◽

Cited By ~ 1

Author(s):

E. A. Maksimov ◽

E. P. Chelyabinsk

Keyword(s):

Elastic Modulus ◽

Comparative Analysis ◽

Contact Zone ◽

Traditional Method ◽

Traditional Methods ◽

Friction Forces ◽

Refined Equation ◽

Refined Calculation ◽

Traction Power ◽

Tangential Friction

Introduction. The traction power of a car is used to determine its traction-speed properties. The purpose of the paper is to refine the calculation of the car traction power. Materials and methods. The authors used the methodology of the refined calculation of the car traction power. Results. The authors carried out a comparative analysis of the refined and traditional methods for calculating traction power. As a result, the authors obtained a refined equation for calculating the traction power, taking into account the elastic modulus, the width of the contact track, the free radius of the wheel, the deflection of the tire and the tangential friction forces in the contact zone. The largest discrepancy between the traction power curve calculated by the updated methodology and the one calculated by the traditional method was 26.8%. Discussion and conclusions. The results of the research are useful to specialists of automobile and transport enterprises and to university master's students for comparing the traction and speed properties of various car types.

A new approach to information retrieval systems using fuzzy expressions

Fuzzy Sets and Systems ◽

10.1016/0165-0114(85)90003-x ◽

1985 ◽

Vol 17 (1) ◽

pp. 9-22 ◽

Cited By ~ 16

Author(s):

Rembrand B.R.C. Zenner ◽

Rita M.M. De Caluwe ◽

Etienne E. Kerre

Keyword(s):

Information Retrieval ◽

New Approach ◽

Retrieval Systems ◽

Information Retrieval Systems

Information Retrieval and Text Analysis

Discourse and Communication ◽

10.1515/9783110852141.106 ◽

2011 ◽

Cited By ~ 1

Author(s):

W. JOHN HUTCHINS

Keyword(s):

Information Retrieval ◽

Text Analysis

To improve the Recovery of an Arab Stemmer for Information Retrieval

International Journal of Distributed Artificial Intelligence ◽

10.4018/ijdai.2018010102 ◽

2018 ◽

Vol 10 (1) ◽

pp. 25-33

Author(s):

Khaireddine Bacha

Keyword(s):

Information Retrieval ◽

Error Rate ◽

Automatic Processing ◽

Arabic Language ◽

Extraction Process ◽

Finite State Automata ◽

New Approach ◽

Root Extraction ◽

Finite State

The automatic processing of the Arabic language is a growing discipline, with more and more research and technologies examining the specificities of this language and proposing the tools needed to develop its automatic processing. The older rooting techniques have limits that weaken the process of root extraction. In this article, the author proposes a new approach to rooting that uses two finite state automata in the root extraction process, with the aim of minimizing the error rate and the ambiguity usually caused by the removal of affixes. The author is currently focusing on developing and improving the rooting technique while trying to overcome the various problems encountered. The author is also compiling an evaluation corpus that will make it possible to evaluate the approach and compare it with others.
