approximate matching
Recently Published Documents

TOTAL DOCUMENTS: 160 (last five years: 22)
H-INDEX: 16 (last five years: 1)

2021 ◽  
Author(s):  
Anas Al-okaily ◽  
Abdelghani Tbakhi

Abstract Pattern matching is a fundamental process in almost every scientific domain. The problem involves finding the positions of a given pattern (usually short) in a reference stream of data (usually large). The matching can be exact or approximate (inexact). Exact matching searches for the pattern without allowing mismatches (or insertions and deletions) of one or more characters in the pattern, while approximate matching tolerates such differences. For exact matching, several data structures that can be built in linear time and space are in practical use today. For approximate matching, the solutions proposed so far are non-linear and currently impractical. In this paper, we designed and implemented a structure that can be built in linear time and space and solves the approximate matching problem in O(m + (log_Σ n)^k / k! + occ) search cost, where m is the length of the pattern, n is the length of the reference, k is the number of tolerated mismatches (and insertions and deletions), and occ is the number of occurrences.
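The distinction between exact and k-mismatch matching can be illustrated with a naive baseline. This O(n·m) scan is for illustration only, not the linear-time structure the paper proposes, and it handles substitutions only, not insertions or deletions:

```python
def exact_match(text, pattern):
    """Return all start positions where pattern occurs exactly in text."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == pattern]

def approximate_match(text, pattern, k):
    """Return start positions where pattern matches with at most k mismatches
    (Hamming distance; insertions and deletions are not handled here)."""
    m = len(pattern)
    positions = []
    for i in range(len(text) - m + 1):
        mismatches = sum(1 for a, b in zip(text[i:i + m], pattern) if a != b)
        if mismatches <= k:
            positions.append(i)
    return positions

# "abru" does not occur exactly, but occurs twice with one mismatch allowed.
print(exact_match("abracadabra", "abra"))        # [0, 7]
print(approximate_match("abracadabra", "abru", 1))  # [0, 7]
```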


2021 ◽  
Vol 44 (1) ◽  
pp. 101-136
Author(s):  
Lidija Iordanskaja ◽  
Igor Mel’čuk

Abstract A formal linguistic model is presented that produces, for a given conceptual representation of an extralinguistic situation, a corresponding semantic representation [SemR], which in turn underlies the deep-syntactic representations of four near-synonymous Russian sentences expressing the starting information. Two full-fledged lexical entries are given for the lexemes besporjadki ‘disturbance’ and stolknovenie ‘clash(N)’, which appear in these sentences. Some principles of lexicalization – that is, of matching formal lexicographic definitions to the starting semantic representation in order to produce the deep-syntactic structures of the corresponding sentences – are formulated and illustrated; the problem of approximate matching is dealt with in sufficient detail.


JAMIA Open ◽  
2021 ◽  
Vol 4 (3) ◽  
Author(s):  
Abeed Sarker ◽  
Yao Ge

Abstract Our objective was to mine Reddit to discover long-COVID symptoms self-reported by users, compare symptom distributions across studies, and create a symptom lexicon. We retrieved posts from the /r/covidlonghaulers subreddit and extracted symptoms via approximate matching using an expanded meta-lexicon. We mapped the extracted symptoms to standard concept IDs, compared their distributions with those reported in recent literature, and analyzed their distributions over time. From 42,995 posts by 4249 users, we identified 1744 users who expressed at least 1 symptom. The most frequently reported long-COVID symptoms were mental health-related symptoms (55.2%), fatigue (51.2%), general ache/pain (48.4%), brain fog/confusion (32.8%), and dyspnea (28.9%) among users reporting at least 1 symptom. Comparison with recent literature revealed a large variance in reported symptoms across studies. Temporal analysis showed several persistent symptoms up to 15 months after infection. The spectrum of symptoms identified from Reddit may provide early insights into long-COVID.
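As a rough illustration of lexicon-based approximate symptom extraction (not the authors' actual pipeline: the lexicon phrases and concept IDs below are made up, and Python's difflib fuzzy matcher stands in for the expanded meta-lexicon):

```python
import difflib

# Toy lexicon mapping symptom phrases to hypothetical concept IDs.
LEXICON = {
    "fatigue": "SYM001",
    "brain fog": "SYM002",
    "shortness of breath": "SYM003",
    "chest pain": "SYM004",
}

def extract_symptoms(post, cutoff=0.8):
    """Approximately match word n-grams from a post against the lexicon,
    returning the set of matched concept IDs."""
    words = post.lower().split()
    terms = list(LEXICON)
    found = set()
    for n in (1, 2, 3):  # n-gram sizes covering the lexicon phrases
        for i in range(len(words) - n + 1):
            ngram = " ".join(words[i:i + n])
            for hit in difflib.get_close_matches(ngram, terms, n=1, cutoff=cutoff):
                found.add(LEXICON[hit])
    return found

# Tolerates the misspelling "fatigiue" while matching "brain fog" exactly.
print(extract_symptoms("constant fatigiue and brain fog since covid"))
```

Mapping matched phrases to concept IDs (rather than raw strings) is what makes symptom counts comparable across posts and across studies.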




Matching algorithms find exact or approximate matches between a text “T” and a pattern “P”. Because modern processors contain multiple cores, multiple tasks can be performed simultaneously, and this technology allows matching algorithms to run in parallel to improve matching speed. Several exact and approximate string matching algorithms have been parallelized to find correspondences between a text “T” and a pattern “P”. This paper proposes two models: first, a Parallelized Direct Matching Algorithm (PDMA) for multi-core architectures using OpenMP; second, an implementation of the PDMA in Network Intrusion Detection Systems (NIDS) to speed up the NIDS detection engine. The PDMA achieves more than a 19.7% improvement in processing time over sequential matching. In addition, the performance of the NIDS detection engine improves by more than 8% compared to the current SNORT-NIDS detection engine.
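The chunk-and-merge idea behind this kind of parallelization can be sketched as follows. This is an illustrative Python sketch, not the OpenMP/C implementation described in the paper; in CPython, threads will not give a true multi-core speedup for this CPU-bound loop, so a real implementation would use processes or OpenMP as the authors do. The key detail is extending each chunk by pattern length − 1 characters so matches straddling chunk boundaries are not missed:

```python
from concurrent.futures import ThreadPoolExecutor

def match_chunk(args):
    """Find exact occurrences of pattern within one chunk of the text."""
    text, pattern, start, end = args
    m = len(pattern)
    # Extend the chunk by m - 1 characters so an occurrence straddling the
    # boundary is still seen; positions are reported in global coordinates.
    window = text[start:min(end + m - 1, len(text))]
    return [start + i for i in range(len(window) - m + 1)
            if window[i:i + m] == pattern]

def parallel_match(text, pattern, workers=4):
    """Split the text into chunks, match each concurrently, merge results."""
    n = len(text)
    chunk = -(-n // workers)  # ceiling division
    tasks = [(text, pattern, i, min(i + chunk, n)) for i in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(match_chunk, tasks)
    return sorted(p for part in results for p in part)

# The match at position 3 straddles the boundary between chunks [0,2) and [2,4).
print(parallel_match("abcabcab", "abc"))  # [0, 3]
```

Because each chunk only reports positions inside its own half-open range, the merged result contains no duplicates.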


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Van-Kien Bui ◽  
Chaochun Wei

Abstract Background Current taxonomic classification tools use exact string matching algorithms that are effective for data from next-generation sequencing technologies. However, the distinctive error patterns of third-generation sequencing (TGS) technologies can reduce the accuracy of these programs. Results We developed a Classification tool using Discriminative K-mers and an Approximate Matching algorithm (CDKAM). The approximate matching method is used to search for k-mers in two phases: a quick mapping phase and a dynamic programming phase. Simulated datasets as well as real TGS datasets were used to compare the performance of CDKAM with existing methods. We show that CDKAM performs better in many respects, especially when classifying TGS data with an average read length of 1000–1500 bases. Conclusions CDKAM is an effective program with higher accuracy and lower memory requirements for TGS metagenome sequence classification. It achieves high species-level accuracy.
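The core idea of matching k-mers approximately (here, within Hamming distance 1) can be sketched as below. This is an illustrative toy, not CDKAM's discriminative k-mer selection or its two-phase quick-mapping/dynamic-programming search:

```python
from itertools import product

def kmers(seq, k):
    """All k-length substrings of a sequence, as a set."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def neighbors1(kmer, alphabet="ACGT"):
    """All k-mers within Hamming distance 1 of kmer (including itself)."""
    out = {kmer}
    for i, base in product(range(len(kmer)), alphabet):
        out.add(kmer[:i] + base + kmer[i + 1:])
    return out

def approx_kmer_hits(read, ref_kmers, k):
    """Count read k-mers that match some reference k-mer with <= 1 mismatch,
    tolerating the single-base errors typical of long-read (TGS) data."""
    return sum(1 for km in kmers(read, k) if neighbors1(km) & ref_kmers)

# "ACGA" is one substitution away from the reference k-mer "ACGT".
print(approx_kmer_hits("ACGA", {"ACGT"}, 4))  # 1
```

Real tools avoid enumerating neighbor sets per query by indexing the reference k-mers, but the tolerance to single-base sequencing errors is the same.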

