inexact matching Latest Research Papers

Linear Approximate Pattern Matching Algorithm

10.21203/rs.3.rs-1021063/v1 ◽

2021 ◽

Author(s):

Anas Al-okaily ◽

Abdelghani Tbakhi

Keyword(s):

Pattern Matching ◽

Linear Time ◽

Search Costs ◽

Exact Matching ◽

Time And Space ◽

Matching Problem ◽

Approximate Matching ◽

Large Length ◽

Reference Stream ◽

Inexact Matching

Pattern matching is a fundamental process in almost every scientific domain. The problem involves finding the positions of a given pattern (usually short) in a reference stream of data (usually long). Matching can be exact or approximate (inexact). Exact matching searches for the pattern without allowing mismatches, insertions, or deletions of one or more characters in the pattern, while approximate matching allows them. For exact matching, several data structures that can be built in linear time and space are in practical use today. For approximate matching, the solutions proposed so far are non-linear and currently impractical. In this paper, we designed and implemented a structure that can be built in linear time and space and solves the approximate matching problem in $O(m + \frac{\log_{\Sigma}^{k} n}{k!} + occ)$ search costs, where m is the length of the pattern, n is the length of the reference, and k is the number of tolerated mismatches (and insertions and deletions).
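
To make the problem statement concrete, here is a minimal sketch of approximate matching restricted to mismatches (substitutions only). It is a naive sliding-window baseline, not the indexed structure described in the abstract: it costs O(n·m) time and does not handle insertions or deletions. The function name `find_with_mismatches` is hypothetical.

```python
def find_with_mismatches(pattern, reference, k):
    """Return start positions where `pattern` occurs in `reference`
    with at most `k` substituted characters (Hamming distance <= k).

    Naive baseline for illustration only: no index is built, and
    insertions/deletions are not considered.
    """
    m, n = len(pattern), len(reference)
    hits = []
    for start in range(n - m + 1):
        mismatches = 0
        for offset in range(m):
            if reference[start + offset] != pattern[offset]:
                mismatches += 1
                if mismatches > k:
                    break  # too many mismatches, give up on this window
        else:
            # The inner loop finished without exceeding k mismatches.
            hits.append(start)
    return hits


reference = "ACGTTCGTACTT"
exact_hits = find_with_mismatches("ACGT", reference, 0)
one_off_hits = find_with_mismatches("ACGT", reference, 1)
```

With k = 0 the function reduces to exact matching; raising k admits windows that differ from the pattern in up to k positions.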

Spatial Location in Integrated Circuits through Infrared Microscopy

Sensors ◽

10.3390/s21062175 ◽

2021 ◽

Vol 21 (6) ◽

pp. 2175

Author(s):

Raphaël Abelé ◽

Jean-Luc Damoiseaux ◽

Redouane El Moubtahij ◽

Jean-Marc Boi ◽

Daniele Fronte ◽

...

Keyword(s):

Integrated Circuits ◽

Integrated Circuit ◽

Graph Matching ◽

Silicon Layer ◽

Spatial Location ◽

Optical Microscope ◽

Automated System ◽

Infrared Microscopy ◽

Speed Increase ◽

Inexact Matching

In this paper, we present an infrared-microscopy-based approach for locating structures in integrated circuits, in order to automate their secure characterization. An infrared sensor is the key device for internal integrated circuit inspection. Two main issues are addressed. The first concerns scanning integrated circuits with a motorized optical system composed of an uncooled infrared camera combined with an optical microscope. An automated system is required to focus on the conductive tracks under the silicon layer. This is solved by an autofocus system that analyzes the infrared images through a discrete polynomial image transform, which allows accurate feature detection and yields a focus metric robust against the specific image degradation inherent to the acquisition context. The second issue concerns locating the structures to be characterized on the conductive tracks. To deal with a large amount of redundancy and noise, a graph-matching method is presented: discriminating graph labels are developed to overcome the redundancy, while a flexible assignment optimizer solves the inexact matching arising from noise in the graphs. The resulting automated location system brings reproducibility to the secure characterization of integrated systems, as well as improved accuracy and speed.

Inexact Matching

10.1007/978-3-030-63416-2_300291 ◽

2021 ◽

pp. 666-666

Keyword(s):

Inexact Matching

Enhancing investigative pattern detection via inexact matching and graph databases

IEEE Transactions on Services Computing ◽

10.1109/tsc.2021.3073145 ◽

2021 ◽

pp. 1-1

Author(s):

Shashika R Muramudalige ◽

Benjamin W. K. Hung ◽

Anura P Jayasumana ◽

Indrakshi Ray ◽

Jytte Klausen

Keyword(s):

Pattern Detection ◽

Graph Databases ◽

Inexact Matching

An inexact matching approach for the comparison of plane curves with general elastic metrics

2019 53rd Asilomar Conference on Signals, Systems, and Computers ◽

10.1109/ieeeconf44664.2019.9049031 ◽

2019 ◽

Author(s):

Yashil Sukurdeep ◽

Martin Bauer ◽

Nicolas Charon

Keyword(s):

Plane Curves ◽

Inexact Matching

Recognizing software names in biomedical literature using machine learning

Health Informatics Journal ◽

10.1177/1460458219869490 ◽

2019 ◽

Vol 26 (1) ◽

pp. 21-33 ◽

Cited By ~ 1

Author(s):

Qiang Wei ◽

Yaoyun Zhang ◽

Muhammad Amith ◽

Rebecca Lin ◽

Jenay Lapeyrolerie ◽

...

Keyword(s):

Machine Learning ◽

Language Processing ◽

Domain Knowledge ◽

Named Entity Recognition ◽

Biomedical Literature ◽

Entity Recognition ◽

Word Representation ◽

Inexact Matching ◽

Matching Criteria ◽

F Measure

Software tools are now essential to research and applications in the biomedical domain. However, existing software repositories are mainly built by manual curation, which is time-consuming and does not scale. This study manually annotated software names in 1,120 MEDLINE abstracts and titles and used this corpus to develop and evaluate machine learning–based named entity recognition systems for biomedical software. Specifically, two strategies were proposed for feature engineering: (1) domain knowledge features and (2) unsupervised word representation features of clustered and binarized word embeddings. Our best system achieved an F-measure of 91.79% for recognizing software from titles and an F-measure of 86.35% for recognizing software from both titles and abstracts using inexact matching criteria. We then created a biomedical software catalog with 19,557 entries using the developed system. This study demonstrates the feasibility of using natural language processing methods to automatically build a high-quality software index from biomedical literature.
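
The abstract does not define its inexact matching criteria. As an illustration only, the sketch below assumes a common relaxed criterion for entity recognition: a predicted span counts as correct if it overlaps a gold span, whereas exact matching requires identical boundaries. The function name `f_measure`, the half-open `(start, end)` span format, and the sample spans are all hypothetical.

```python
def f_measure(gold, predicted, inexact=False):
    """F-measure over entity spans given as half-open (start, end) pairs.

    Assumption for illustration: under inexact matching, any overlap
    between a predicted span and a gold span counts as a match.
    """
    def matches(a, b):
        if inexact:
            return a[0] < b[1] and b[0] < a[1]  # the two spans overlap
        return a == b  # exact matching: identical boundaries

    # Predictions that hit some gold span, and gold spans that were found.
    matched_predictions = sum(any(matches(p, g) for g in gold) for p in predicted)
    matched_gold = sum(any(matches(g, p) for p in predicted) for g in gold)

    precision = matched_predictions / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [(0, 5), (10, 18)]
predicted = [(0, 5), (12, 18), (30, 34)]  # second span has a wrong left boundary
exact_score = f_measure(gold, predicted)
inexact_score = f_measure(gold, predicted, inexact=True)
```

In this toy example the partially correct span `(12, 18)` is rejected under exact matching and accepted under the overlap criterion, so the inexact score is the higher of the two.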

Extensions to AGraP Algorithm for Finding a Reduced Set of Inexact Graph Patterns

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001418600121 ◽

2017 ◽

Vol 32 (01) ◽

pp. 1860012 ◽

Cited By ~ 1

Author(s):

Marisol Flores-Garrido ◽

J. Ariel Carrasco-Ochoa ◽

José Fco. Martínez-Trinidad

Keyword(s):

Image Classification ◽

Traditional Approach ◽

Graph Isomorphism ◽

Isomorphism Problem ◽

Graph Isomorphism Problem ◽

Output Pattern ◽

Inexact Matching

Most algorithms for mining graph patterns require, during the search process, that a pattern be identical to its occurrences, relying on the graph isomorphism problem. However, in recent years, there has been interest in the case in which some differences between a pattern and its occurrences are acceptable, whether these differences are in labels or in structure. Allowing some differences and using inexact matching to measure the similarity between graphs leads to the discovery of new patterns, but some important challenges, such as the increase in the number of patterns found, make post-mining analysis harder. In this work we focus on two extensions of the AGraP algorithm, which mines inexact patterns, addressing the issue of reducing the output pattern set while trying to retain the useful information gained through the use of inexact matching. First, exploring a traditional approach, we propose the CloseAFG algorithm, which focuses on closed patterns. Then, we propose the IntAFG algorithm to find a subset of patterns covering the original pattern set while lessening redundancy among the selected patterns. We show the performance of our approaches through experiments on synthetic databases; additionally, we show the usefulness of the reduced pattern sets for image classification.

An RCT Simulation Study on Performance and Accuracy of Inexact Matching Algorithms for Patient Identity in Ambulatory Care Settings

2015 International Conference on Healthcare Informatics ◽

10.1109/ichi.2015.7 ◽

2015 ◽

Cited By ~ 1

Author(s):

Fieran Mason-Blakley ◽

Jens Weber ◽

Linghong Lu ◽

Morgan Price ◽

Abdul Roudsari

Keyword(s):

Ambulatory Care ◽

Simulation Study ◽

Inexact Matching

Handwritten word spotting by inexact matching of grapheme graphs

2015 13th International Conference on Document Analysis and Recognition (ICDAR) ◽

10.1109/icdar.2015.7333868 ◽

2015 ◽

Cited By ~ 13

Author(s):

Pau Riba ◽

Josep Llados ◽

Alicia Fornes

Keyword(s):

Word Spotting ◽

Inexact Matching

Parameterized Strings: Algorithms and Applications

10.33915/etd.5168 ◽

2015 ◽

Author(s):

Richard Beal

Keyword(s):

String Theory ◽

Data Structures ◽

Structural Similarity ◽

Suffix Array ◽

Plagiarism Detection ◽

Worst Case ◽

Matching Problems ◽

Common Prefix ◽

Inexact Matching ◽

Longest Common Prefix Array

The parameterized string (p-string), a generalization of the traditional string, is composed of constant and parameter symbols. A parameterized match (p-match) exists between two p-strings if the constants match exactly and there exists a bijection between the parameter symbols. Historically, p-strings have been employed in source code cloning, plagiarism detection, and structural similarity between biological sequences. By handling the intricacies of the parameterized suffix, we can efficiently address complex applications with data structures also reusable in traditional matching scenarios. In this dissertation, we extend data structures for p-strings (and variants) to address sophisticated string computations.

We introduce a taxonomy of classes for longest factor problems. Using this taxonomy, we show an interesting connection between the parameterized longest previous factor (pLPF) and familiar data structures in string theory, including the border array, prefix array, longest common prefix array, and analogous p-string data structures. Exploiting this connection, we construct a multitude of data structures using the same general pLPF framework.

Before this dissertation, the p-match was defined predominantly by the matching between uncompressed p-strings. Here, we introduce the compressed parameterized pattern match to find all p-matches between a pattern and a text, using only the pattern and a compressed form of the text. We present parameterized compression (p-compression) as a new way to losslessly compress data to support p-matching. Experimentally, it is shown that p-compression is competitive with standard compression schemes. Using p-compression, we address the compressed p-match independent of the underlying compression routine.

Currently, p-string theory lacks the capability to support indeterminate symbols, a staple of applications involving inexact matching, such as music analysis. In this work, we propose and efficiently address two new types of p-matching with indeterminate symbols. (1) We introduce the indeterminate parameterized match (ip-match) to permit matching with indeterminate holes in a p-string. We support the ip-match by introducing data structures that extend the prefix array. (2) From a different perspective, the equivalence parameterized match (e-match) evolves the p-match to consider intra-alphabet symbol classes as equivalence classes. We propose a method to perform the e-match using the p-string suffix array framework, i.e. the parameterized suffix array (pSA) and parameterized longest common prefix array (pLCP). Historically, direct constructions of the pSA and pLCP have suffered from quadratic time bounds in the worst case. Here, we introduce new p-string theory to efficiently construct the pSA/pLCP and break the theoretical worst-case time barrier.

Biological applications have become a classical use of p-string theory. Here, we introduce the structural border array to provide a lightweight solution to the biologically oriented variant of the p-match, i.e. the structural match (s-match) on structural strings (s-strings). Following the s-match, we show how to use s-string suffix structures to support various pattern matching problems involving RNA secondary structures. Finally, we propose and construct the forward stem matrix (FSM), a data structure to access RNA stem structures, and we apply the FSM to the detection of hairpins and pseudoknots in an RNA sequence.

This dissertation advances the state of the art in p-string theory by developing data structures for p-strings/s-strings and using p-string/s-string theory in new and old contexts to address various applications. Due to the flexibility of the p-string/s-string, the data structures and algorithms in this work are also applicable to the myriad problems in the string community that involve traditional strings.
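
The p-match definition given in this abstract (constants match exactly, parameter symbols are related by a bijection) can be checked directly for two equal-length strings. The sketch below is a straightforward linear scan written for illustration; it is not one of the dissertation's data structures, and the function name `p_match` and the choice of `x`, `y`, `z` as parameter symbols are assumptions.

```python
def p_match(s, t, parameters):
    """Return True if p-strings `s` and `t` p-match.

    `parameters` is the set of parameter symbols; every other symbol is
    treated as a constant. Constants must agree position by position,
    and parameters must be related by a one-to-one mapping.
    """
    if len(s) != len(t):
        return False
    forward, backward = {}, {}  # mapping s -> t and its inverse t -> s
    for a, b in zip(s, t):
        a_is_param, b_is_param = a in parameters, b in parameters
        if a_is_param != b_is_param:
            return False  # a parameter cannot align with a constant
        if not a_is_param:
            if a != b:
                return False  # constants must match exactly
            continue
        # Record the pairing, or verify it against an earlier pairing
        # in both directions so the mapping stays a bijection.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


params = set("xyz")
renamed = p_match("x=y+x", "y=z+y", params)    # x -> y, y -> z
collapsed = p_match("x=y+x", "z=z+z", params)  # x and y both map to z
```

The second call fails because two distinct parameters would map to the same symbol, which violates the bijection requirement even though each position pairs a parameter with a parameter.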
