OpenMP Implementation of Parallel Longest Common Subsequence Algorithm for Mathematical Expression Retrieval

2021 ◽  
pp. 2150007
Author(s):  
Pavan Kumar Perepu

Given a mathematical expression in LaTeX or MathML format, a retrieval algorithm extracts similar expressions from a database. In our previous work, we used the Longest Common Subsequence (LCS) algorithm to match two expressions of lengths m and n, which takes O(mn) time. If there are N database expressions, the total complexity is O(Nmn), and an increase in N increases this complexity further. In the present work, we propose to use a parallel LCS algorithm in our retrieval process. Parallel LCS has O(m + n) time complexity with m processors, so the total complexity can be reduced to O(N(m + n)). For our experimentation, an OpenMP-based implementation has been used on a 4-core Intel processor. However, for smaller expressions, the parallel version takes more time, as the implementation overhead dominates the algorithmic improvement. We have therefore proposed to apply the parallel version selectively, only to larger expressions, in our retrieval algorithm to achieve better performance. We have compared the sequential and parallel versions of our mathematical expression (ME) retrieval algorithm, and the performance results have been reported on a database of 829 mathematical expressions.

2019 ◽  
Vol 35 (1) ◽  
pp. 21-37
Author(s):  
Trường Huy Nguyễn

In this paper, we introduce two algorithms, efficient in practice, for computing the length of a longest common subsequence of two strings using an automata technique, in sequential and parallel ways. For two input strings of lengths m and n with m ≤ n, the parallel algorithm uses k processors (k ≤ m) and runs in O(n) time in the worst case, where k is an upper estimate of the length of a longest common subsequence of the two strings. These results are based on the Knapsack Shaking approach proposed by P. T. Huy et al. in 2002. Experimental results show that, for an alphabet of size 256, our sequential and parallel algorithms are about 65.85 and 3.41m times faster, respectively, than the standard dynamic programming algorithm proposed by Wagner and Fischer in 1974.


2005 ◽  
Vol 16 (06) ◽  
pp. 1099-1109 ◽  
Author(s):  
ABDULLAH N. ARSLAN ◽  
ÖMER EĞECIOĞLU

Given strings S1, S2, and P, the constrained longest common subsequence problem for S1 and S2 with respect to P is to find a longest common subsequence (lcs) of S1 and S2 which contains P as a subsequence. We present an algorithm which improves the time complexity of the problem from the previously known O(rn²m²) to O(rnm), where r, n, and m are the lengths of P, S1, and S2, respectively. As a generalization of this, we extend the definition of the problem so that the lcs sought contains a subsequence whose edit distance from P is less than a given parameter d. For the latter problem, we propose an algorithm whose time complexity is O(drnm).


Author(s):  
Iqra Muneer ◽  
Rao Muhammad Adeel Nawab

Cross-Lingual Text Reuse Detection (CLTRD) has recently attracted the attention of the research community due to the large amount of digital text readily available for reuse in multiple languages through online digital repositories. In addition, efficient machine translation systems are freely and readily available to translate text from one language into another, which makes it quite easy to reuse text across languages and, consequently, difficult to detect it. In the literature, the most prominent and widely used approach for CLTRD is Translation plus Monolingual Analysis (T+MA). To detect CLTR for the English-Urdu language pair, T+MA has been used with lexical approaches, namely N-gram Overlap, Longest Common Subsequence, and Greedy String Tiling. This clearly shows that T+MA has not been thoroughly explored for the English-Urdu language pair. To fill this gap, this study presents an in-depth and detailed comparison of 26 approaches that are based on T+MA. These approaches include semantic similarity approaches (semantic tagger based approaches, WordNet-based approaches), a probabilistic approach (Kullback-Leibler distance approach), monolingual word embedding-based approaches (siamese recurrent architecture), and monolingual sentence transformer-based approaches for the English-Urdu language pair. The evaluation was carried out using the CLEU benchmark corpus, both for the binary and the ternary classification tasks. Our extensive experimentation shows that our proposed approach, a combination of the 26 approaches, obtained an F1 score of 0.77 and 0.61 for the binary and ternary classification tasks, respectively, and outperformed the previously reported approaches [41] (F1 = 0.73 for the binary and F1 = 0.55 for the ternary classification tasks) on the CLEU corpus.

