Listing Center Strings Under the Edit Distance Metric

Author(s):  
Hiromitsu Maji ◽  
Taisuke Izumi
2009 ◽  
Vol 20 (06) ◽  
pp. 1047-1068 ◽  
Author(s):  
MANOLIS CHRISTODOULAKIS ◽  
GERHARD BREY

Approximate pattern matching has a wide range of applications and, depending on the type of approximation, numerous algorithms exist for solving it. In this article we focus on texts that originate from OCRed documents, whose errors often have a particular form and are far from random. We introduce a new variant of the edit distance metric in which, apart from the traditional edit operations, two new operations are supported. The combination operation allows two or more symbols from a string x to be interpreted as a single symbol and then "matched" (or aligned) against a single symbol of a second string y. Its dual is the split operation, where a single symbol from x is broken down into a sequence of two or more other symbols, which can then be matched against an equal number of symbols from y. Our algorithm requires O(L) time for preprocessing and O(mnk) time for computing the edit distance, where L is the total length of all the valid combinations/splits, m and n are the lengths of the two strings under comparison, and k is an upper bound on the number of valid splits for any single symbol. The expected running time is O(mn).
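
The combination and split operations described in the abstract can be illustrated with a small dynamic-programming sketch. This is not the authors' algorithm and makes no attempt to reach the stated O(L) preprocessing or O(mnk) bounds. The function name `ocr_edit_distance`, the `combos` and `splits` tables, the unit cost `op_cost`, and the requirement that a combined or split sequence match y exactly are all assumptions made for illustration.

```python
def ocr_edit_distance(x, y, combos, splits, op_cost=1):
    """Edit distance with two extra operations (illustrative sketch).

    combos: dict mapping a sequence of 2+ symbols of x to the single
            symbol it may be read as, e.g. {"rn": "m"} (hypothetical table).
    splits: dict mapping a single symbol of x to a list of sequences of
            2+ symbols it may be broken into, e.g. {"m": ["rn"]}.
    op_cost: assumed cost of one combination or split.
    """
    m, n = len(x), len(y)
    # d[i][j] = distance between the prefixes x[:i] and y[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i
    for j in range(1, n + 1):
        d[0][j] = j

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Traditional operations: deletion, insertion, match/substitution.
            best = min(
                d[i - 1][j] + 1,
                d[i][j - 1] + 1,
                d[i - 1][j - 1] + (x[i - 1] != y[j - 1]),
            )
            # Combination: several symbols ending at x[i-1] read as y[j-1].
            for seq, sym in combos.items():
                k = len(seq)
                if k <= i and x[i - k:i] == seq and y[j - 1] == sym:
                    best = min(best, d[i - k][j - 1] + op_cost)
            # Split: x[i-1] broken into a sequence that ends at y[j-1].
            for seq in splits.get(x[i - 1], ()):
                k = len(seq)
                if k <= j and y[j - k:j] == seq:
                    best = min(best, d[i - 1][j - k] + op_cost)
            d[i][j] = best
    return d[m][n]


# "rn" misread as "m" is a typical OCR confusion.
plain = ocr_edit_distance("modern", "modem", {}, {})
with_combo = ocr_edit_distance("modern", "modem", {"rn": "m"}, {})
print(plain, with_combo)
```

With empty tables the function reduces to the ordinary edit distance, so the two printed values show how a single combination can replace a substitution plus a deletion.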


Author(s):  
Muhammad Marwan Muhammad Fuad ◽  
Pierre-Francois Marteau

2013 ◽  
Vol 32 (12) ◽  
pp. 3529-3533
Author(s):  
Nan JIA ◽  
Xiao-dong FU ◽  
Yuan HUANG ◽  
Xiao-yan LIU ◽  
Zhi-hua DAI

2014 ◽  
Author(s):  
Ryan Cotterell ◽  
Nanyun Peng ◽  
Jason Eisner

2011 ◽  
Vol 36 (12) ◽  
pp. 1661-1673
Author(s):  
Jun GAO ◽  
Shi-Tong WANG ◽  
Xiao-Ming WANG

2021 ◽  
Author(s):  
Tomoki Yoshida ◽  
Ichiro Takeuchi ◽  
Masayuki Karasuyama
