A Comparison of Text String Similarity Algorithms for POI Name Harmonisation

Text String Detection From Natural Scenes by Structure-Based Partition and Grouping

IEEE Transactions on Image Processing ◽

10.1109/tip.2011.2126586 ◽

2011 ◽

Vol 20 (9) ◽

pp. 2594-2605 ◽

Cited By ~ 186

Author(s):

Chucai Yi ◽

YingLi Tian

Keyword(s):

Natural Scenes ◽

Text String


In-place differential file compression of non-aligned files with applications to file distribution, backups, and string similarity

Data Compression Conference, 2004. Proceedings. DCC 2004 ◽

10.1109/dcc.2004.1281453 ◽

2004 ◽

Cited By ~ 2

Author(s):

D. Shapira ◽

J.A. Storer

Keyword(s):

String Similarity ◽

File Distribution


Comparative analysis of string similarity and corpus-based similarity for automatic essay scoring system on e-learning gamification

2016 International Conference on Advanced Computer Science and Information Systems (ICACSIS) ◽

10.1109/icacsis.2016.7872785 ◽

2016 ◽

Cited By ~ 5

Author(s):

Eko Sakti Pramukantoro ◽

M. Ali Fauzi

Keyword(s):

Comparative Analysis ◽

Scoring System ◽

String Similarity ◽

E Learning ◽

Essay Scoring


MLS-Join: An Efficient MapReduce-Based Algorithm for String Similarity Self-joins with Edit Distance Constraint

Cloud Computing and Security - Lecture Notes in Computer Science ◽

10.1007/978-3-030-00006-6_60 ◽

2018 ◽

pp. 662-674

Author(s):

Decai Sun ◽

Xiaoxia Wang

Keyword(s):

Edit Distance ◽

Distance Constraint ◽

String Similarity


UniMelb at SemEval-2016 Task 3: Identifying Similar Questions by combining a CNN with String Similarity Measures

10.18653/v1/s16-1131 ◽

2016 ◽

Cited By ~ 1

Author(s):

Timothy Baldwin ◽

Huizhi Liang ◽

Bahar Salehi ◽

Doris Hoogeveen ◽

Yitong Li ◽

...

Keyword(s):

Similarity Measures ◽

String Similarity ◽

String Similarity Measures


Efficient and Scalable Processing of String Similarity Join

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2012.195 ◽

2013 ◽

Vol 25 (10) ◽

pp. 2217-2230 ◽

Cited By ~ 28

Author(s):

Chuitian Rong ◽

Wei Lu ◽

Xiaoli Wang ◽

Xiaoyong Du ◽

Yueguo Chen ◽

...

Keyword(s):

Similarity Join ◽

String Similarity


PG-Join: Proximity Graph Based String Similarity Joins

Lecture Notes in Computer Science - Scientific and Statistical Database Management ◽

10.1007/978-3-642-22351-8_17 ◽

2011 ◽

pp. 274-292

Author(s):

Michail Kazimianec ◽

Nikolaus Augsten

Keyword(s):

String Similarity ◽

Proximity Graph ◽

Similarity Joins


A New LCS-Neutrosophic Similarity Measure for Text Information Retrieval

Neutrosophic Sets in Decision Analysis and Operations Research - Advances in Logistics, Operations, and Management Science ◽

10.4018/978-1-7998-2555-5.ch012 ◽

2020 ◽

pp. 258-280

Author(s):

Misturah Adunni Alaran ◽

AbdulAkeem Adesina Agboola ◽

Adio Taofiki Akinwale ◽

Olusegun Folorunso

Keyword(s):

Information Retrieval ◽

Similarity Measure ◽

Information Search ◽

Longest Common Subsequence ◽

Data Set ◽

String Similarity ◽

True Match ◽

Neutrosophic Logic ◽

Common Subsequence ◽

Text Information

Human existence, and people's interactions with the things that surround them, show that the world is imprecise, incomplete, vague, and sometimes even indeterminate. Neutrosophic logic is the only theory that attempts to unify all previous logics in the same global theoretical framework. Extracting data from such an environment is becoming a problem as the volume of data keeps growing day by day. This chapter proposes a new neutrosophic string similarity measure based on the longest common subsequence (LCS) to address uncertainty in string information search. The new method is compared with four existing classical string similarity measures, using a wordlist as the data set. The analyses show that the proposed neutrosophic similarity measure performs better than the existing measures in the information retrieval task, with evaluation based on precision, recall, highest false match, lowest true match, and separation.
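For orientation, the classical (non-neutrosophic) LCS building block that the abstract refers to can be sketched as follows. This is a minimal illustration, not the chapter's measure: the function names `lcs_length` and `lcs_similarity`, and the choice to normalise by the length of the longer string, are assumptions made here for the example.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Standard dynamic programming, keeping only the previous row.
    """
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the subsequence found for the two shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop one character from either string, keep the better result.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """LCS length divided by the longer string's length (assumed normalisation).

    Two empty strings are treated as identical and score 1.0.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return lcs_length(a, b) / longest


if __name__ == "__main__":
    print(lcs_length("kitten", "sitting"))
    print(lcs_similarity("kitten", "sitting"))
```

A score of this kind ranges from 0.0 (no shared characters in order) to 1.0 (identical strings), so a retrieval task can rank candidates by it or apply a cut-off.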


Para-Join: an efficient parallel method for string similarity join

International Journal of High Performance Computing and Networking ◽

10.1504/ijhpcn.2017.086542 ◽

2017 ◽

Vol 10 (4/5) ◽

pp. 381

Author(s):

Cairong Yan ◽

Jian Wang ◽

Bin Zhu ◽

Wenjing Guo

Keyword(s):

Parallel Method ◽

Similarity Join ◽

String Similarity


A Machine Learning Approach to Data Cleaning in Databases and Data Warehouses

Handbook of Research on Fuzzy Information Processing in Databases ◽

10.4018/978-1-59904-853-6.ch030 ◽

2011 ◽

pp. 745-759

Author(s):

Hamid Haidarian Shahri

Keyword(s):

Fuzzy Inference ◽

Data Cleaning ◽

Similarity Measures ◽

Entity Resolution ◽

String Similarity ◽

Domain Specific ◽

Meta Level ◽

Neuro Fuzzy ◽

Resolution Problem ◽

String Similarity Measures

Entity resolution (also known as duplicate elimination) is an important part of the data cleaning process, especially in data integration and warehousing, where data are gathered from distributed and inconsistent sources. Learnable string similarity measures are an active area of research on the entity resolution problem. Our proposed framework builds upon our earlier work on entity resolution, in which fuzzy rules and membership functions are defined by the user. Here, we exploit neuro-fuzzy modeling for the first time to produce a unique adaptive framework for entity resolution, which automatically learns and adapts to the specific notion of similarity at a meta-level. This framework encompasses much of the previous work on trainable and domain-specific similarity measures. By employing fuzzy inference, it removes the repetitive task of hard-coding a program based on a schema, which is usually required in previous approaches. In addition, our extensible framework is very flexible for the end user. Hence, it can be used to build an intelligent tool that increases the quality and accuracy of data.
