Explaining Propagators for String Edit Distance Constraints

2020 ◽  
Vol 34 (02) ◽  
pp. 1676-1683
Author(s):  
Felix Winter ◽  
Nysret Musliu ◽  
Peter Stuckey

The computation of string similarity measures has been thoroughly studied in the scientific literature and has applications in a wide variety of areas. One of the most widely used measures is the so-called string edit distance, which captures the number of edit operations required to transform one string into another given string. Although polynomial-time algorithms are known for calculating the edit distance between two strings, there also exist NP-hard problems from practical applications such as scheduling or computational biology that constrain the minimum edit distance between arrays of decision variables. In this work, we propose a novel global constraint to formulate restrictions on the minimum edit distance for such problems. Furthermore, we describe a propagation algorithm and investigate an explanation strategy for an edit distance constraint propagator that can be incorporated into state-of-the-art lazy clause generation solvers. Experimental results show that the proposed propagator is able to significantly improve the performance of existing exact methods regarding solution quality and computation speed for benchmark problems from the literature.
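As a rough illustration of the polynomial-time computation mentioned above, the following sketch computes the edit distance between two fully assigned sequences with the classic dynamic program. It assumes the Levenshtein variant (unit-cost insertion, deletion, and substitution); the abstract does not state which edit operations the authors use. The names `edit_distance` and `satisfies_min_distance` are hypothetical, and the code only checks a minimum-distance restriction on fixed values. It is not the propagation or explanation algorithm proposed in the paper.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences.

    Assumption: unit-cost insert, delete, and substitute operations.
    Runs in O(len(a) * len(b)) time and O(len(b)) memory.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # substitute x by y (free if equal)
            ))
        prev = curr
    return prev[-1]


def satisfies_min_distance(a, b, d):
    """Hypothetical checker: do two fixed arrays differ by at least d edits?"""
    return edit_distance(a, b) >= d


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(satisfies_min_distance([1, 2, 3], [1, 3, 3], 2))
```

The functions accept any indexable sequences, so arrays of integer values (as a stand-in for assigned decision variables) work the same way as strings.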

2016 ◽  
Author(s):  
Timothy Baldwin ◽  
Huizhi Liang ◽  
Bahar Salehi ◽  
Doris Hoogeveen ◽  
Yitong Li ◽  
...  

Author(s):  
Hamid Haidarian Shahri

Entity resolution (also known as duplicate elimination) is an important part of the data cleaning process, especially in data integration and warehousing, where data are gathered from distributed and inconsistent sources. Learnable string similarity measures are an active area of research in the entity resolution problem. Our proposed framework builds upon our earlier work on entity resolution, in which fuzzy rules and membership functions are defined by the user. Here, we exploit neuro-fuzzy modeling for the first time to produce a unique adaptive framework for entity resolution, which automatically learns and adapts to the specific notion of similarity at a meta-level. This framework encompasses much of the previous work on trainable and domain-specific similarity measures. By employing fuzzy inference, it removes the repetitive task of hard-coding a program based on a schema, which is usually required in previous approaches. In addition, our extensible framework is very flexible for the end user. Hence, it can be utilized in the production of an intelligent tool to increase the quality and accuracy of data.


2009 ◽  
pp. 2245-2260
Author(s):  
Hamid Haidarian Shahri

Entity resolution (also known as duplicate elimination) is an important part of the data cleaning process, especially in data integration and warehousing, where data are gathered from distributed and inconsistent sources. Learnable string similarity measures are an active area of research in the entity resolution problem. Our proposed framework builds upon our earlier work on entity resolution, in which fuzzy rules and membership functions are defined by the user. Here, we exploit neuro-fuzzy modeling for the first time to produce a unique adaptive framework for entity resolution, which automatically learns and adapts to the specific notion of similarity at a meta-level. This framework encompasses much of the previous work on trainable and domain-specific similarity measures. By employing fuzzy inference, it removes the repetitive task of hard-coding a program based on a schema, which is usually required in previous approaches. In addition, our extensible framework is very flexible for the end user. Hence, it can be utilized in the production of an intelligent tool to increase the quality and accuracy of data.


2019 ◽  
Vol 129 ◽  
pp. 169-185 ◽  
Author(s):  
Najlah Gali ◽  
Radu Mariescu-Istodor ◽  
Damien Hostettler ◽  
Pasi Fränti


2007 ◽  
Vol 23 (20) ◽  
pp. 2768-2774 ◽  
Author(s):  
Y. Tsuruoka ◽  
J. McNaught ◽  
J. Tsujii ◽  
S. Ananiadou
