An error-tolerant approximate matching algorithm for labeled combinatorial maps

The goal of digital forensics is to recover and investigate pieces of data found on digital devices, analysing in the process their relationship with other fragments of data from the same device or from different ones. Approximate matching functions, also called similarity preserving or fuzzy hashing functions, try to achieve that goal by comparing files and determining their resemblance. In this regard, ssdeep, sdhash, and LZJD are currently some of the best-known functions dealing with this problem. However, even though those applications are useful and trustworthy, they also have important limitations, mainly the inability of ssdeep and LZJD to compare files of very different sizes, the excessive size of sdhash and LZJD signatures, and, for all three applications, an occasionally weak relationship between the comparison score obtained and the actual content of the files. In this article, we propose a new signature generation procedure and an algorithm for comparing two files through their digital signatures. Although our design is based on ssdeep, it overcomes some of its limitations and satisfies the requirements that approximate matching applications should fulfil. A set of ad-hoc and standard tests based on the FRASH framework shows that the proposed algorithm presents remarkable overall detection strengths and is suitable for comparing files of very different sizes. A full description of the multi-thread implementation of the algorithm is included, along with all the tests employed for comparing this proposal with ssdeep, sdhash, and LZJD.
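
Comparing two files through their signatures usually comes down to scoring how far apart two short strings are. The sketch below is illustrative only and is not the procedure proposed in the article nor the exact ssdeep scoring: it assumes signatures are plain strings and maps a Levenshtein edit distance onto a 0-100 scale. The function names `edit_distance` and `similarity_score` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def similarity_score(sig_a: str, sig_b: str) -> int:
    """Map the distance to 0-100 (assumed scale; 100 = identical signatures)."""
    longest = max(len(sig_a), len(sig_b))
    if longest == 0:
        return 100
    return round(100 * (1 - edit_distance(sig_a, sig_b) / longest))
```

Normalising by the longer signature is one simple design choice; a real tool may weight operations differently or refuse to compare signatures built with incompatible block sizes.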

On a fast deterministic parallel approximate matching algorithm

Proceedings of the Third IEEE Symposium on Parallel and Distributed Processing ◽

10.1109/spdp.1991.218242 ◽

2002 ◽

Author(s):

D.J. Haglin

Keyword(s):

Matching Algorithm ◽

Approximate Matching

Parallel Scalable Approximate Matching Algorithm for Network Intrusion Detection Systems

The International Arab Journal of Information Technology ◽

10.34028/iajit/18/1/9 ◽

2020 ◽

Vol 18 (1) ◽

Keyword(s):

Intrusion Detection ◽

Intrusion Detection Systems ◽

Network Intrusion Detection ◽

Matching Algorithm ◽

Approximate Matching ◽

Detection Systems ◽

Network Intrusion ◽

Network Intrusion Detection Systems ◽

Computer Processor ◽

Sequential Matching

Matching algorithms find the exact or approximate match between a text “T” and a pattern “P”. Because current computer processors contain multiple cores, several tasks can be performed simultaneously, which allows these algorithms to run in parallel and improves their matching speed. Several exact string matching and approximate matching algorithms have been developed to work in parallel to find the correspondence between text “T” and pattern “P”. This paper proposes two models. First, the Direct Matching Algorithm is parallelized (PDMA) on a multi-core architecture using OpenMP. Second, the PDMA is implemented in Network Intrusion Detection Systems (NIDS) to enhance the speed of the NIDS detection engine. The PDMA achieves an improvement of more than 19.7% in parallel processing time compared with sequential matching. In addition, the performance of the NIDS detection engine improves by more than 8% compared with the current SNORT-NIDS detection engine.
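
The idea of parallelizing a direct (naive) text/pattern scan can be sketched as follows. This is a minimal illustration, not the paper's PDMA and not OpenMP code: it assumes the candidate start positions are split into contiguous ranges, one per worker, and uses Python threads only to show the decomposition (the interpreter lock means it will not show the reported speedups). The names `find_in_chunk` and `parallel_find` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def find_in_chunk(text: str, pattern: str, start: int, end: int) -> list[int]:
    """Naive scan: return match positions p with start <= p < end."""
    m = len(pattern)
    last = min(end, len(text) - m + 1)
    # A comparison may read past `end`, so matches that straddle
    # a range boundary are still found by the range owning their start.
    return [p for p in range(start, last) if text[p:p + m] == pattern]


def parallel_find(text: str, pattern: str, workers: int = 4) -> list[int]:
    """Split the start positions into contiguous ranges, one per worker."""
    if not pattern or len(pattern) > len(text):
        return []
    positions = len(text) - len(pattern) + 1
    step = -(-positions // workers)  # ceiling division
    ranges = [(s, min(s + step, positions)) for s in range(0, positions, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda r: find_in_chunk(text, pattern, *r), ranges))
    # Ranges are ordered, so concatenation yields sorted positions.
    return [p for part in parts for p in part]
```

Partitioning start positions rather than raw text slices avoids the usual boundary problem, because each worker is allowed to read up to one pattern length beyond its own range.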

An Error-Tolerant Approximate Matching Algorithm for Attributed Planar Graphs and Its Application to Fingerprint Classification

Lecture Notes in Computer Science - Structural, Syntactic, and Statistical Pattern Recognition ◽

10.1007/978-3-540-27868-9_18 ◽

2004 ◽

pp. 180-189 ◽

Cited By ~ 32

Author(s):

Michel Neuhaus ◽

Horst Bunke

Keyword(s):

Planar Graphs ◽

Matching Algorithm ◽

Approximate Matching ◽

Fingerprint Classification

Reconstruction of the Mekhilta Deuteronomy Using Philological and Computational Tools

Journal of Ancient Judaism ◽

10.30965/21967954-00901002 ◽

2018 ◽

Vol 9 (1) ◽

pp. 2-25

Author(s):

Michal Bar-Asher Siegal ◽

Avi Shmidman

Keyword(s):

13th Century ◽

Computational Tools ◽

Matching Algorithm ◽

Approximate Matching ◽

Independent Form ◽

The Future ◽

Final Text

The tannaitic legal Midrashim did not all survive and are not all known to us in a complete independent form. David Zvi Hoffman was one of the first scholars to recognize the 13th century Yemenite Midrash, Midrash haGadol, written by R. David of Aden, as a major source of the lost legal Midrashim. He published the Midrash Tannaim, containing all of the tannaitic-looking paragraphs from Midrash haGadol on the book of Deuteronomy. However, the author of Midrash haGadol often introduced changes into the material he borrowed from rabbinic and medieval sources. The resulting passages often seem to be unparalleled tannaitic sources, when in fact they are not. This article proposes a re-examination of the Mekhilta material as found in the Midrash haGadol, in order to reconstruct the tannaitic text more accurately. We propose a methodology for contending with this challenge, via a new approximate-matching algorithm designed to identify modified sources of this sort. Using this algorithm, we first compared Hoffman’s Midrash Tannaim on Deuteronomy to the Sifre, filtering out all parts of the text that are simply reworkings of the Sifre, despite many interpolations, omissions, and modified words. Having removed the Sifre passages from within the Midrash Tannaim text, we then proceeded to the next stage, in which we investigated the presence of reworked Maimonidean excerpts within the remaining text. The Maimonidean excerpts pose a particular challenge, because their reuse in the Midrash haGadol involves not only modifications and interpolations, but also changes of order. We describe the modifications that were necessary to the algorithm in order to handle these out-of-order cases of reuse as well.
We have thus far succeeded in identifying and removing the reworked material appropriated from the Sifre and from Maimonides, and in the future we plan to tweak the algorithm such that it can successfully identify additional rabbinic passages as well, including the Babylonian and Palestinian Talmudic material, and other midrashic compilations. This will ultimately allow us to produce a final text approximating the original Mekhilta, to the greatest extent possible.
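
To give a sense of what detecting reworked passages despite interpolations, omissions, and modified words can look like, here is a rough word-level sketch. It is not the authors' algorithm: it assumes whitespace-separated words and leans on Python's standard `difflib.SequenceMatcher`, which only finds in-order matches and so would not handle the out-of-order reuse described above. The helper names `reuse_ratio` and `shared_runs` are hypothetical.

```python
from difflib import SequenceMatcher


def _matcher(source: str, candidate: str) -> SequenceMatcher:
    """Compare the two passages as sequences of lower-cased words."""
    return SequenceMatcher(None, source.lower().split(),
                           candidate.lower().split(), autojunk=False)


def reuse_ratio(source: str, candidate: str) -> float:
    """Similarity in [0, 1]: twice the matched words over total words."""
    return _matcher(source, candidate).ratio()


def shared_runs(source: str, candidate: str, min_words: int = 2) -> list[str]:
    """Return the in-order runs of words that both passages share."""
    words = source.lower().split()
    blocks = _matcher(source, candidate).get_matching_blocks()
    return [" ".join(words[b.a:b.a + b.size])
            for b in blocks if b.size >= min_words]
```

A passage could then be flagged as a probable reworking when `reuse_ratio` exceeds a chosen threshold; picking that threshold is an empirical question this sketch does not address.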

Research on Wildcard Character of Approximate Matching in Manufacturing Engineering

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.312.502 ◽

2013 ◽

Vol 312 ◽

pp. 502-505

Author(s):

Xi Hong Wu

Keyword(s):

Pattern Matching ◽

Experimental Result ◽

Forward Algorithm ◽

Matching Algorithm ◽

Approximate Matching ◽

Engineering Problems ◽

Flow Charts ◽

Manufacturing Engineering

The paper first analyzes the types and methods of wildcard characters, then discusses the matching problems in manufacturing engineering that arise in pattern matching. To address these problems, it puts forward an algorithm to realize approximate matching. Finally, it provides flow charts and examples of the algorithm. The experimental results show that exact matching can be extended to approximate matching through wildcard characters, so various matching algorithms can be extended and applied more widely.
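
As a small illustration of how wildcard characters turn an exact matcher into an approximate one, the sketch below uses dynamic programming over pattern and text. The wildcard set is an assumption, since the abstract does not list the types the paper analyzes: here `?` stands for exactly one character and `*` for any run of characters, including an empty one. The function name `wildcard_match` is hypothetical.

```python
def wildcard_match(pattern: str, text: str) -> bool:
    """True if the whole text matches ('?' = one char, '*' = any run)."""
    # reach[j] is True when the pattern consumed so far matches text[:j].
    reach = [True] + [False] * len(text)
    for pc in pattern:
        nxt = [False] * (len(text) + 1)
        if pc == "*":
            nxt[0] = reach[0]
            for j in range(1, len(text) + 1):
                # '*' matches nothing (reach[j]) or one more char (nxt[j-1]).
                nxt[j] = reach[j] or nxt[j - 1]
        else:
            for j in range(1, len(text) + 1):
                nxt[j] = reach[j - 1] and (pc == "?" or pc == text[j - 1])
        reach = nxt
    return reach[-1]
```

For instance, a part code pattern such as `gear-*` would accept `gear-42`, while `b?lt` accepts `bolt` but not `boolt`.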

An approximate matching algorithm for finding (sub-)optimal sequences in S-attributed grammars

Bioinformatics ◽

10.1093/bioinformatics/18.suppl_2.s250 ◽

2002 ◽

Vol 18 (Suppl 2) ◽

pp. S250-S259 ◽

Cited By ~ 10

Author(s):

J. Waldispuhl ◽

B. Behzadi ◽

J.-M. Steyaert

Keyword(s):

Matching Algorithm ◽

Approximate Matching ◽

Attributed Grammars

The application of XML query approximate matching algorithm in PDM technology

2012 2nd International Conference on Consumer Electronics, Communications and Networks (CECNet) ◽

10.1109/cecnet.2012.6201894 ◽

2012 ◽

Author(s):

Hui Liu ◽

Zhongyan Liu

Keyword(s):

Matching Algorithm ◽

Approximate Matching

A New Approximate Matching Algorithm and its Application in Internet Music Search by Humming

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.433-440.3662 ◽

2012 ◽

Vol 433-440 ◽

pp. 3662-3668

Author(s):

Yun Feng Dong ◽

Bei Qi

Keyword(s):

Dynamic Time Warping ◽

Large Scale ◽

Time Warping ◽

Similarity Matching ◽

Matching Algorithm ◽

Approximate Matching ◽

Query By Humming ◽

Dynamic Time

This paper proposes a new approximate matching algorithm, similarity matching, and uses its characteristics to build a system for internet music search by humming. Using this system, the authors compare the similarity matching algorithm with the dynamic time warping (DTW) algorithm, which is the most commonly used algorithm for query by humming (QBH). Measured by query hit ratio and query speed, the overall efficiency of the similarity matching algorithm is superior, making it a QBH algorithm applicable to large-scale music libraries such as internet music search.
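
For reference, the DTW baseline mentioned in the abstract can be sketched in a few lines. This is the textbook dynamic time warping distance, not the paper's similarity matching algorithm, and it assumes each melody is represented as a list of numeric pitch values with absolute difference as the local cost. The function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a: list[float], b: list[float]) -> float:
    """Classic DTW cost between two sequences, using a rolling row."""
    inf = float("inf")
    # prev[j] holds the best cost of aligning the processed part of a with b[:j].
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            curr.append(cost + min(prev[j],       # x repeats against y
                                   curr[j - 1],   # y repeats against x
                                   prev[j - 1]))  # both advance
        prev = curr
    return prev[-1]
```

A hummed query that holds one note longer than the stored melody, such as `[1, 2, 2, 3]` against `[1, 2, 3]`, gets a distance of zero because DTW stretches the time axis. The quadratic cost per comparison is the usual reason faster alternatives are sought for large music libraries.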
