scholarly journals Enhanced Levenshtein Edit Distance Method functioning as a String-to-String Similarity Measure

2016 ◽  
Vol 42 (1) ◽  
pp. 48-54
Author(s):  
Abbas Al-Bakry ◽  
Marwa Al-Rikaby

Levenshtein is a Minimum Edit Distance method; it is usually used in spell checking applications for generatingcandidates. The method computes the number of the required edit operations to transform one string to another and it canrecognize three types of edit operations: deletion, insertion, and substitution of one letter. Damerau modified the Levenshteinmethod to consider another type of edit operations, the transposition of two adjacent letters, in addition to theconsidered three types. However, the modification suffers from the time complexity which was added to the original quadratictime complexity of the original method. In this paper, we proposed a modification for the original Levenshtein toconsider the same four types using very small number of matching operations which resulted in a shorter execution timeand a similarity measure is also achieved to exploit the resulted distance from any Edit Distance method for finding the amountof similarity between two given strings.

2021 ◽  
Vol 25 (2) ◽  
pp. 283-303
Author(s):  
Na Liu ◽  
Fei Xie ◽  
Xindong Wu

Approximate multi-pattern matching is an important issue that is widely and frequently utilized, when the pattern contains variable-length wildcards. In this paper, two suffix array-based algorithms have been proposed to solve this problem. Suffix array is an efficient data structure for exact string matching in existing studies, as well as for approximate pattern matching and multi-pattern matching. An algorithm called MMSA-S is for the short exact characters in a pattern by dynamic programming, while another algorithm called MMSA-L deals with the long exact characters by the edit distance method. Experimental results of Pizza & Chili corpus demonstrate that these two newly proposed algorithms, in most cases, are more time-efficient than the state-of-the-art comparison algorithms.


2021 ◽  
Vol 10 (2) ◽  
pp. 90
Author(s):  
Jin Zhu ◽  
Dayu Cheng ◽  
Weiwei Zhang ◽  
Ci Song ◽  
Jie Chen ◽  
...  

People spend more than 80% of their time in indoor spaces, such as shopping malls and office buildings. Indoor trajectories collected by indoor positioning devices, such as WiFi and Bluetooth devices, can reflect human movement behaviors in indoor spaces. Insightful indoor movement patterns can be discovered from indoor trajectories using various clustering methods. These methods are based on a measure that reflects the degree of similarity between indoor trajectories. Researchers have proposed many trajectory similarity measures. However, existing trajectory similarity measures ignore the indoor movement constraints imposed by the indoor space and the characteristics of indoor positioning sensors, which leads to an inaccurate measure of indoor trajectory similarity. Additionally, most of these works focus on the spatial and temporal dimensions of trajectories and pay less attention to indoor semantic information. Integrating indoor semantic information such as the indoor point of interest into the indoor trajectory similarity measurement is beneficial to discovering pedestrians having similar intentions. In this paper, we propose an accurate and reasonable indoor trajectory similarity measure called the indoor semantic trajectory similarity measure (ISTSM), which considers the features of indoor trajectories and indoor semantic information simultaneously. The ISTSM is modified from the edit distance that is a measure of the distance between string sequences. The key component of the ISTSM is an indoor navigation graph that is transformed from an indoor floor plan representing the indoor space for computing accurate indoor walking distances. The indoor walking distances and indoor semantic information are fused into the edit distance seamlessly. The ISTSM is evaluated using a synthetic dataset and real dataset for a shopping mall. The experiment with the synthetic dataset reveals that the ISTSM is more accurate and reasonable than three other popular trajectory similarities, namely the longest common subsequence (LCSS), edit distance on real sequence (EDR), and the multidimensional similarity measure (MSM). The case study of a shopping mall shows that the ISTSM effectively reveals customer movement patterns of indoor customers.


2021 ◽  
Author(s):  
Ze Xi Xu ◽  
Lei Zhuang ◽  
Meng Yang He ◽  
Si Jin Yang ◽  
Yu Song ◽  
...  

Abstract Virtualization and resource isolation techniques have enabled the efficient sharing of networked resources. How to control network resource allocation accurately and flexibly has gradually become a research hotspot due to the growth in user demands. Therefore, this paper presents a new edge-based virtual network embedding approach to studying this problem that employs a graph edit distance method to accurately control resource usage. In particular, to manage network resources efficiently, we restrict the use conditions of network resources and restrict the structure based on common substructure isomorphism and an improved spider monkey optimization algorithm is employed to prune redundant information from the substrate network. Experimental results showed that the proposed method achieves better performance than existing algorithms in terms of resource management capacity, including energy savings and the revenue-cost ratio.


Author(s):  
Misturah Adunni Alaran ◽  
AbdulAkeem Adesina Agboola ◽  
Adio Taofiki Akinwale ◽  
Olusegun Folorunso

The reality of human existence and their interactions with various things that surround them reveal that the world is imprecise, incomplete, vague, and even sometimes indeterminate. Neutrosophic logic is the only theory that attempts to unify all previous logics in the same global theoretical framework. Extracting data from a similar environment is becoming a problem as the volume of data keeps growing day-in and day-out. This chapter proposes a new neutrosophic string similarity measure based on the longest common subsequence (LCS) to address uncertainty in string information search. This new method has been compared with four other existing classical string similarity measure using wordlist as data set. The analyses show the performance of proposed neutrosophic similarity measure to be better than the existing in information retrieval task as the evaluation is based on precision, recall, highest false match, lowest true match, and separation.


2014 ◽  
Vol 9 (10) ◽  
Author(s):  
Peisen Yuan ◽  
Haoyun Wang ◽  
Jianghua Che ◽  
Shougang Ren ◽  
Huanliang Xu ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document