Partial Matching in the Space of Varifolds

Author(s):  
Pierre-Louis Antonsanti ◽  
Joan Glaunès ◽  
Thomas Benseghir ◽  
Vincent Jugnon ◽  
Irène Kaltenmark
Author(s):  
I.V. Selivanova ◽  
D.V. Kosyakov ◽  
A.E. Guskov ◽  
...  

This study investigates whether the semantic proximity of scientific texts can be established by automatic classification based on compressing their abstracts. The idea of the method is that compression algorithms of the PPM (prediction by partial matching) type compress terminologically close texts substantially better than distant ones. If a core of publications (an analogue of a training set) is formed for each topic being classified, then the best compression ratio indicates that the classified text belongs to the corresponding topic. Thirty thematic categories were defined; for each of them, abstracts of about 500 publications were obtained from the Scopus database, from which 100 abstracts for the core and 20 abstracts for testing were selected in different ways. It was found that building the core from highly cited publications gives up to 12% errors, versus 32% with random selection. Classification quality is also affected by the initial number of categories: the fewer categories take part in the classification and the greater the terminological differences between them, the higher its quality.
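The compression-based classification idea in this abstract can be illustrated with a minimal sketch. Python's standard library has no PPM compressor, so this sketch substitutes `zlib` (DEFLATE), which is an assumption and not the method used in the study. The function names and the toy cores are hypothetical. A text is assigned to the topic whose core lets it compress into the fewest additional bytes.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of `text`."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def extra_bytes(core: str, text: str) -> int:
    """Extra compressed bytes needed when `text` is appended to `core`."""
    return compressed_size(core + " " + text) - compressed_size(core)


def classify(text: str, cores: dict) -> str:
    """Return the topic whose core compresses `text` into the fewest extra bytes."""
    return min(cores, key=lambda topic: extra_bytes(cores[topic], text))


# Hypothetical toy cores; the study built each core from 100 Scopus abstracts.
cores = {
    "physics": (
        "quantum field theory describes particle interactions. "
        "the hadron collider measures particle energy and quantum states."
    ),
    "biology": (
        "gene expression regulates protein synthesis in cells. "
        "the genome encodes protein sequences and cell function."
    ),
}

print(classify("quantum states and particle energy in field theory", cores))
```

One caveat on the stand-in: DEFLATE only looks back over a 32 KB window, so with cores as large as those in the study most of the core would be out of reach. A compressor with a longer context would be closer in spirit to PPM.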


2005 ◽  
Vol 38 (10) ◽  
pp. 1560-1573 ◽  
Author(s):  
Eli Saber ◽  
Yaowu Xu ◽  
A. Murat Tekalp

2015 ◽  
Vol 24 (4) ◽  
pp. 043010
Author(s):  
Shu Wang ◽  
Zhenjiang Miao

2021 ◽  
Author(s):  
Sae Hyong Park ◽  
Yong Yoon Shin ◽  
Namseok Ko

Author(s):  
P. K. Nizar Banu ◽  
H. Inbarani

As websites increase in complexity, locating needed information becomes a difficult task. This difficulty is often related not only to a website's design but also to ineffective and inefficient navigation processes. Research in web mining addresses this problem by applying techniques from data mining and machine learning to web data and documents. In this study, the authors examine web usage mining, which applies data mining techniques to web server logs. Web usage mining has gained much attention as a potential approach to meeting the requirements of web personalization. In this paper, the authors propose K-means biclustering, rough biclustering and fuzzy biclustering approaches to disclose the duality between users and pages by grouping them in both dimensions simultaneously. The simultaneous clustering of users and pages discovers biclusters that correspond to groups of users that exhibit highly correlated ratings on groups of pages. The results indicate that the fuzzy C-means biclustering algorithm performs best and is able to detect partial matching of preferences.


2020 ◽  
Vol 10 (12) ◽  
pp. 4316 ◽  
Author(s):  
Ivan Boban ◽  
Alen Doko ◽  
Sven Gotovac

Sentence retrieval is an information retrieval technique that aims to find sentences corresponding to an information need. It is used for tasks such as question answering (QA) and novelty detection. Since it is similar to document retrieval but with a smaller unit of retrieval, document retrieval methods such as term frequency-inverse document frequency (TF-IDF), BM25, and language modeling-based methods are also used for sentence retrieval. The effect of partial matching of words on sentence retrieval has not been analyzed. We think there is substantial potential for improving sentence retrieval methods if this approach is considered. We adapted TF-ISF, BM25, and language modeling-based methods to test the partial matching of terms by combining sentence retrieval with sequence similarity, which allows matching of words that are similar but not identical. All tests were conducted using data from the novelty tracks of the Text Retrieval Conference (TREC). The scope of this paper was to find out whether such an approach is generally beneficial to sentence retrieval. However, we did not examine in depth how partial matching helps or hinders the finding of relevant sentences.
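As a rough illustration of combining TF-ISF with sequence similarity, the sketch below counts a sentence word as a partial match when its similarity to a query term reaches a threshold. The abstract does not give the authors' formulas, so everything here is an assumption made for illustration: the similarity measure (`difflib.SequenceMatcher`), the 0.7 threshold, the weighting, and the function names.

```python
import math
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Sequence similarity in [0, 1]; identical words score 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def score_sentences(query: str, sentences: list, threshold: float = 0.7) -> list:
    """TF-ISF-style score per sentence, where similar words count as partial matches."""
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    scores = [0.0] * n
    for term in query.lower().split():
        # Partial term frequency: summed similarity of the words that pass the threshold.
        tfs = []
        for words in tokenized:
            sims = [similarity(term, w) for w in words]
            tfs.append(sum(s for s in sims if s >= threshold))
        # Sentence frequency: number of sentences with at least one (partial) match.
        sf = sum(1 for tf in tfs if tf > 0)
        if sf == 0:
            continue
        isf = math.log((n + 1) / (0.5 + sf))
        for i, tf in enumerate(tfs):
            scores[i] += math.log(1 + tf) * isf
    return scores


# Hypothetical toy data: "retrieving" never appears verbatim,
# but it is similar enough to "retrieval" to count as a partial match.
sentences = [
    "the retrieval of documents",
    "cats sleep all day",
    "dogs bark loudly",
]
print(score_sentences("retrieving", sentences))
```

With exact matching, the query term would contribute nothing to any sentence in this toy data. The threshold controls how much spelling variation is tolerated before unrelated words start to match.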

