Similarity Measure for Polish Short Texts Based on Wordnet-Enhanced Bag-of-words Representation

Author(s):  
Maciej Piasecki ◽  
Anna Gut
Author(s):  
Charbel Azzi ◽  
Daniel Asmar ◽  
Adel Fakih ◽  
John Zelek

3D pose of a camera with respect to a 3D representation of the scene. IBL, despite being a trivial problem for small scenes, becomes quite challenging as the size of the scene grows. Aside from the computational burden, matching against a very large number of 3D keypoints spanning a wide variety of viewpoints, illumination, and areas is a very unreliable process that results in a large number of outliers and ambiguous situations. In recent years, a number of approaches have attempted to address the problem using paradigms such as bag-of-words, feature co-occurrence and others, with varying degrees of success. This paper explores the use of global descriptors, in particular GIST, to tackle this problem. We present a system that relies on a similarity measure derived from GIST to qualify a limited number of 3D points for the matching process, hence reducing the problem to its small-size counterpart. Our results on a standard dataset show that our system can achieve better localization accuracy than the state of the art at a fraction of the computational cost, which can be used towards global localization.
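
The shortlisting idea in this abstract can be illustrated with a minimal sketch. It assumes that a global descriptor vector (GIST in the paper; its computation is not shown here) is already available for the query and for each database image, and that Euclidean distance serves as the similarity measure. The function name `shortlist_points`, its parameters, and the data layout are hypothetical and are not taken from the paper.

```python
import math


def shortlist_points(query_desc, db_descs, image_points, k=2):
    """Return the IDs of 3D points seen in the k database images closest to the query.

    query_desc   -- global descriptor of the query image (list of floats)
    db_descs     -- dict: image id -> global descriptor (hypothetical layout)
    image_points -- dict: image id -> set of 3D point IDs visible in that image
    """
    # Rank database images by Euclidean distance between global descriptors
    # (an assumed stand-in for the GIST-derived similarity measure).
    ranked = sorted(db_descs, key=lambda img: math.dist(query_desc, db_descs[img]))

    # Only points observed in the top-k images are passed on to feature matching.
    candidates = set()
    for img in ranked[:k]:
        candidates.update(image_points[img])
    return candidates
```

Restricting the matcher to this candidate set is what turns the large-scene problem into a small-scene one; the choice of `k` trades recall against matching cost.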


2015 ◽  
Vol 77 (18) ◽  
Author(s):  
Atif Khan ◽  
Naomie Salim ◽  
Waleed Reafee ◽  
Anupong Sukprasert ◽  
Yogan Jaya Kumar

Multi-document abstractive summarization aims to create a compact version of the source text that preserves the important information. Existing graph-based methods rely on the bag-of-words approach, which treats a sentence as a bag of words and relies on a content similarity measure. The obvious limitation of the bag-of-words approach is that it ignores semantic relationships among words, so the summary produced from the source text would not be adequate. This paper proposes a clustered semantic graph-based approach for multi-document abstractive summarization. The approach employs semantic role labeling (SRL) to extract the semantic structure (predicate argument structures) from the document text. The predicate argument structures (PASs) are compared pairwise using the Lin semantic similarity measure to build a semantic similarity matrix, which is then represented as a semantic graph whose vertices represent the PASs and whose edges carry the semantic similarity weight between the vertices. Content selection for the summary is made by ranking the important graph vertices (PASs) with a modified graph-based ranking algorithm. Agglomerative hierarchical clustering is performed to eliminate redundancy, such that the representative PAS with the highest salience score from each cluster is chosen and fed to language generation to produce summary sentences. The experiment is performed on DUC-2002, a standard corpus for text summarization. Experimental results reveal that the proposed approach outperforms other summarization systems.
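
The ranking and selection steps described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' method: the abstract does not specify the modified ranking algorithm, so a plain weighted PageRank-style iteration is swapped in, and the similarity matrix and clusters are taken as given inputs rather than computed with SRL, the Lin measure, or agglomerative clustering. The names `rank_vertices` and `pick_representatives` are hypothetical.

```python
def rank_vertices(sim, damping=0.85, iterations=50):
    """Score graph vertices from a symmetric similarity matrix.

    Plain weighted PageRank-style iteration (an assumption; the paper's
    "modified" ranking algorithm is not described in the abstract).
    """
    n = len(sim)
    scores = [1.0 / n] * n
    # Total edge weight leaving each vertex, ignoring self-similarity.
    out_weight = [sum(sim[j][k] for k in range(n) if k != j) for j in range(n)]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of edge (j, i).
            incoming = sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if j != i and out_weight[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def pick_representatives(scores, clusters):
    """From each cluster (a list of vertex indices), keep the highest-scoring vertex."""
    return [max(cluster, key=lambda v: scores[v]) for cluster in clusters]
```

In the pipeline the abstract describes, each vertex would be a PAS and the selected representatives would be passed on to language generation.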


Author(s):  
Mohana Priya K ◽  
Pooja Ragavi S ◽  
Krishna Priya G

Clustering is the process of grouping objects into subsets that have meaning in the context of a particular problem. It does not rely on predefined classes. It is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed and are used in different applications. Sentence clustering is one of the best clustering techniques. A hierarchical clustering algorithm is applied at multiple levels for accuracy. A POS tagger is used for tagging, together with the Porter stemmer. The WordNet dictionary is used to determine similarity by invoking the Jiang-Conrath and cosine similarity measures. Grouping is performed with respect to the highest similarity value, with a mean threshold. This paper incorporates many parameters for finding the similarity between words. To disambiguate words, sense identification is performed for adjectives and the results are compared. SemCor and machine learning datasets are employed. Compared with previous results for WSD, our work shows a considerable improvement, reaching 91.2%.
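
The grouping rule mentioned above (highest similarity, subject to a mean threshold) could look roughly like the sketch below. It is one possible reading, not the paper's implementation: it uses only bag-of-words cosine similarity (the WordNet-based Jiang-Conrath measure, POS tagging, and stemming are omitted), takes the mean of all pairwise similarities as the threshold, and links each sentence to its most similar neighbour when that similarity exceeds the threshold. The function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between the word-count vectors of two sentences."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def group_sentences(sentences):
    """Group sentence indices by linking each one to its most similar neighbour."""
    n = len(sentences)
    sim = {
        (i, j): cosine(sentences[i], sentences[j])
        for i, j in combinations(range(n), 2)
    }
    # Assumed threshold: mean of all pairwise similarities.
    threshold = sum(sim.values()) / len(sim) if sim else 0.0

    parent = list(range(n))  # union-find forest over sentence indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        others = [j for j in range(n) if j != i]
        if not others:
            continue
        best = max(others, key=lambda j: sim[min(i, j), max(i, j)])
        # Merge only when the best match is strictly above the mean threshold.
        if sim[min(i, best), max(i, best)] > threshold:
            parent[find(i)] = find(best)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

A WordNet-based word similarity could replace `cosine` without changing the grouping logic, since the rule only needs a pairwise score.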


Informatica ◽  
2018 ◽  
Vol 29 (3) ◽  
pp. 399-420
Author(s):  
Alessia Amelio ◽  
Darko Brodić ◽  
Radmila Janković

2012 ◽  
Vol 38 (2) ◽  
pp. 229-235 ◽  
Author(s):  
Wen-Qing LI ◽  
Xin SUN ◽  
Chang-You ZHANG ◽  
Ye FENG

2013 ◽  
Vol 34 (9) ◽  
pp. 2064-2070 ◽  
Author(s):  
Chun-hui Zhao ◽  
Ying Wang ◽  
KANEKO Masahide

2020 ◽  
Vol 10 (1) ◽  
pp. 193-197
Author(s):  
D. Stephen Dinagar ◽  
E. Fany Helena