The Improved Hybrid Algorithm for the Atheer and Berry-Ravindran Algorithms

Author(s):  
Atheer Akram Abdul Razzaq ◽  
Nur’Aini Abdul Rashid ◽  
Alaa Ahmed Abbood ◽  
Zurinahni Zainol

Exact string matching is considered one of the important approaches to solving basic problems in computer science. This research proposed a hybrid exact string matching algorithm called E-Atheer. The algorithm combines the strongest features of two existing algorithms: the searching technique of the Atheer algorithm and the shifting technique of the Berry-Ravindran algorithm. The proposed algorithm performed better in number of attempts and character comparisons than the original, recent, and standard algorithms. The E-Atheer algorithm was tested on several types of databases: DNA, Protein, XML, Pitch, English, and Source. Its best performance in number of attempts was on the Pitch dataset, and its worst was on the DNA dataset. In number of character comparisons, the best and worst databases for E-Atheer were the Source and DNA databases, respectively.
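
The shifting technique credited to Berry-Ravindran above, and the two metrics the abstract reports (attempts and character comparisons), can be illustrated with a minimal sketch. This is the standard textbook Berry-Ravindran search with a plain left-to-right comparison, not the E-Atheer hybrid, whose searching phase is not described here. The function name `berry_ravindran_search` and the counting conventions are assumptions made for illustration.

```python
def berry_ravindran_search(text, pattern):
    """Return (match_positions, attempts, comparisons).

    Illustrative sketch only: the shift uses the two text characters
    that immediately follow the current window (Berry-Ravindran rule).
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return [], 0, 0

    # Rightmost start index of each pair of adjacent pattern characters.
    pair_pos = {}
    for i in range(m - 1):
        pair_pos[(pattern[i], pattern[i + 1])] = i

    def shift(a, b):
        # Cases are checked from the smallest shift to the largest.
        if a == pattern[m - 1]:
            return 1
        if (a, b) in pair_pos:
            return m - pair_pos[(a, b)]
        if b == pattern[0]:
            return m + 1
        return m + 2

    def char_at(k):
        # Positions past the end of the text behave like a sentinel.
        return text[k] if k < n else None

    positions, attempts, comparisons = [], 0, 0
    j = 0
    while j <= n - m:
        attempts += 1  # one attempt per window alignment
        i = 0
        while i < m:
            comparisons += 1  # count every character comparison
            if pattern[i] != text[j + i]:
                break
            i += 1
        if i == m:
            positions.append(j)
        j += shift(char_at(j + m), char_at(j + m + 1))
    return positions, attempts, comparisons


if __name__ == "__main__":
    print(berry_ravindran_search("GCATCGCAGAGAGTATACAGTACG", "GCAGAGAG"))
```

Counting one attempt per window and one comparison per character test is one plausible reading of the metrics; the paper may count them differently.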

2018 ◽  
Vol 7 (3) ◽  
pp. 1709
Author(s):  
Atheer Akram Abdulrazzaq ◽  
Nur’Aini Abdul Rashid ◽  
Ahmed Majid Taha

Exact string matching is one of the critical issues in the field of computer science. This study proposed a hybrid string matching algorithm called E-AbdulRazzaq. This algorithm combines the best properties of two original algorithms: the AbdulRazzaq and Berry-Ravindran algorithms. The proposed algorithm performed efficiently in number of attempts and number of character comparisons when compared with the original, recent, and standard algorithms. The proposed algorithm was applied to several types of databases: DNA sequences, Protein sequences, XML structures, Pitch characters, English texts, and Source codes. In number of attempts, for both long and short patterns, the Pitch database was the best match for E-AbdulRazzaq, while the DNA database was the worst. No database was identified as the best or worst for the E-AbdulRazzaq algorithm in terms of character comparisons. The E-AbdulRazzaq algorithm ranked first on most databases for both short and long patterns, in terms of number of attempts and character comparisons.


2020 ◽  
pp. 298-324
Author(s):  
Abdulrakeeb M. Al-Ssulami ◽  
Hassan I. Mathkour ◽  
Mohammed Amer Arafah

Exact string matching is essential in application areas such as Bioinformatics and Intrusion Detection Systems. Speeding up the string matching algorithm therefore accelerates searching in DNA and binary data. Two types of fast algorithms already exist: bit-parallel algorithms and hashing algorithms. Bit-parallel algorithms are efficient on short patterns, of length less than 64, but slow on long patterns. Hashing algorithms, on the other hand, have an optimal sublinear average case on large alphabets and long patterns, but are less efficient on small alphabets such as DNA and binary texts. In this paper, the authors present a hybrid algorithm that overcomes the shortcomings of those previous algorithms. The proposed algorithm is based on q-gram hashing while guaranteeing the maximal shift in advance. Experimental results on random data and the complete human genome confirm that the proposed algorithm is efficient on various pattern lengths and small alphabets.
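
To make the idea of q-gram hashing concrete, here is a minimal sketch of a Horspool-style search that shifts on the last q characters of the window. It is a generic illustration, not the authors' hybrid: it does not implement their maximal-shift guarantee, and the name `qgram_search`, the default `q=3`, and the use of a Python `dict` in place of a compact hash table are all assumptions.

```python
def qgram_search(text, pattern, q=3):
    """Return all start positions of pattern in text.

    Illustrative sketch: the shift is looked up from the q-gram that
    ends the current window, in the spirit of q-gram hashing methods.
    """
    n, m = len(text), len(pattern)
    if q < 1 or m < q:
        raise ValueError("pattern must be at least q characters long")

    # For each q-gram of the pattern, store the distance from its
    # rightmost occurrence to the end of the pattern.
    shift = {}
    for end in range(q - 1, m):
        shift[pattern[end - q + 1:end + 1]] = m - 1 - end

    # A q-gram that never occurs in the pattern allows the largest shift.
    default_shift = m - q + 1

    positions = []
    j = 0
    while j <= n - m:
        gram = text[j + m - q:j + m]
        s = shift.get(gram, default_shift)
        if s == 0:
            # The window ends with the pattern's final q-gram: verify it.
            if text[j:j + m] == pattern:
                positions.append(j)
            s = 1  # conservative shift after a candidate window
        j += s
    return positions


if __name__ == "__main__":
    print(qgram_search("ACGTACGTTACG", "TACG", q=2))
```

A larger q makes absent q-grams more likely on a four-letter alphabet such as DNA, which is why q-gram methods are attractive there, at the cost of a smaller maximum shift of m - q + 1.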


2017 ◽  
Vol 13 (4) ◽  
pp. 198-220
Author(s):  
Abdulrakeeb M. Al-Ssulami ◽  
Hassan Mathkour ◽  
Mohammed Amer Arafah

Exact string matching is essential in application areas such as Bioinformatics and Intrusion Detection Systems. Speeding up the string matching algorithm therefore accelerates searching in DNA and binary data. Two types of fast algorithms already exist: bit-parallel algorithms and hashing algorithms. Bit-parallel algorithms are efficient on short patterns, of length less than 64, but slow on long patterns. Hashing algorithms, on the other hand, have an optimal sublinear average case on large alphabets and long patterns, but are less efficient on small alphabets such as DNA and binary texts. In this paper, the authors present a hybrid algorithm that overcomes the shortcomings of those previous algorithms. The proposed algorithm is based on q-gram hashing while guaranteeing the maximal shift in advance. Experimental results on random data and the complete human genome confirm that the proposed algorithm is efficient on various pattern lengths and small alphabets.
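
The pattern-length limit of bit-parallel algorithms mentioned above comes from packing the search state into one machine word. A minimal sketch of the classic Shift-And algorithm shows this; it stands in for the bit-parallel family in general and is not taken from the paper. Python integers are unbounded, so the limit is modeled by an assumed `word_size` parameter, and the name `shift_and_search` is hypothetical.

```python
def shift_and_search(text, pattern, word_size=64):
    """Return all start positions of pattern in text using Shift-And.

    Illustrative sketch: bit i of the state is set when pattern[:i + 1]
    matches the text ending at the current position.
    """
    m = len(pattern)
    if m == 0:
        return []
    if m > word_size:
        # A real implementation would need several words per state here,
        # which is where bit-parallel methods lose their speed advantage.
        raise ValueError("pattern does not fit in one machine word")

    # masks[c] has bit i set exactly when pattern[i] == c.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)

    accept = 1 << (m - 1)
    state = 0
    positions = []
    for idx, c in enumerate(text):
        # Extend every partial match by one character, and start a new one.
        state = ((state << 1) | 1) & masks.get(c, 0)
        if state & accept:
            positions.append(idx - m + 1)
    return positions


if __name__ == "__main__":
    print(shift_and_search("GCATCGCAGAGAGTATACAGTACG", "GCAGAGAG"))
```

Each text character costs a constant number of word operations regardless of the alphabet, which explains why this family works well on DNA as long as the pattern fits in one word.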


2012 ◽  
Vol 2012 ◽  
pp. 1-8 ◽  
Author(s):  
Anis Zouaghi ◽  
Mounir Zrigui ◽  
Georges Antoniadis ◽  
Laroussi Merhbene

We propose a new approach for determining the appropriate sense of Arabic words. To do so, we propose an algorithm based on information retrieval measures that identifies the context of use closest to the sentence containing the word to be disambiguated. The contexts of use are sets of sentences that each indicate a particular sense of the ambiguous word. These contexts are generated using the words that define the senses of the ambiguous word, the exact string-matching algorithm, and the corpus. We use measures employed in information retrieval (Harman, Croft, and Okapi), combined with the Lesk algorithm, to select the correct sense from those proposed.
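
The selection step described above, choosing the sense whose context of use is closest to the sentence, can be sketched with a simplified Lesk-style word overlap. This swaps a plain overlap count in for the Harman, Croft, and Okapi measures, whose formulas are not given here. The function name `closest_sense` and the English toy data are hypothetical and serve only as an illustration.

```python
def closest_sense(sentence, contexts_by_sense):
    """Return (sense, score) for the sense with the best-overlapping context.

    Illustrative sketch: the score is the number of distinct words shared
    by the sentence and a context sentence (simplified Lesk overlap).
    """
    words = set(sentence.lower().split())
    best_sense, best_score = None, -1
    for sense, contexts in contexts_by_sense.items():
        # Score a sense by its single closest context of use.
        score = max(
            (len(words & set(c.lower().split())) for c in contexts),
            default=0,
        )
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense, best_score


if __name__ == "__main__":
    # Hypothetical toy contexts, not data from the paper.
    contexts = {
        "bank_river": ["the river bank was muddy after the flood"],
        "bank_money": ["she deposited money at the bank"],
    }
    print(closest_sense("he sat on the muddy bank of the river", contexts))
```

A weighted measure would replace the raw overlap count, for example by discounting frequent words such as "the" that contribute to the score without indicating a sense.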

