Combination of Levenshtein Distance and Rabin-Karp to Improve the Accuracy of Document Equivalence Level

2018
Vol 7 (2.27)
pp. 17
Author(s):
Andysah Putera Utama Siahaan
Solly Aryza
Eko Hariyanto
Rusiadi
Andre Hasudungan Lubis
...

The Rabin-Karp algorithm is a search algorithm that locates a substring pattern in a text using hashing, which makes it beneficial for matching words against many patterns. One of its practical applications is the detection of plagiarism. The algorithm was invented by Michael O. Rabin and Richard M. Karp. It performs string search by means of a hash function, whose values are compared between two documents to determine the documents' level of similarity. The Rabin-Karp algorithm is not very good for single-pattern text search, but it is well suited to multiple-pattern search. The Levenshtein algorithm can be used to replace the hash comparison in Rabin-Karp: the plain hash comparison only counts the hashes that have the same value in both documents, whereas calculating the Levenshtein distance between the hash sequences of the two documents yields better accuracy.
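A minimal sketch of how such a combination might look, assuming character-level k-grams and illustrative constants (k = 5, base 256, modulus 101, as in textbook Rabin-Karp); the paper's exact fingerprinting and normalization steps are not specified in the abstract:

```python
# Sketch: fingerprint two documents with a Rabin-Karp rolling hash over
# k-grams, then compare the fingerprint sequences with Levenshtein distance
# instead of only counting identical hashes. Constants are illustrative.
K, BASE, MOD = 5, 256, 101

def rabin_karp_fingerprints(text: str, k: int = K) -> list[int]:
    """Rolling hashes of every k-gram in the text."""
    if len(text) < k:
        return []
    h = 0
    high = pow(BASE, k - 1, MOD)               # weight of the outgoing char
    for ch in text[:k]:
        h = (h * BASE + ord(ch)) % MOD
    hashes = [h]
    for i in range(k, len(text)):
        h = (h - ord(text[i - k]) * high) % MOD  # drop leftmost character
        h = (h * BASE + ord(text[i])) % MOD      # append new character
        hashes.append(h)
    return hashes

def levenshtein(a: list[int], b: list[int]) -> int:
    """Edit distance between two hash sequences (two-row dynamic program)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

def similarity(doc1: str, doc2: str) -> float:
    """1.0 for identical fingerprint sequences, toward 0.0 as they diverge."""
    h1, h2 = rabin_karp_fingerprints(doc1), rabin_karp_fingerprints(doc2)
    longest = max(len(h1), len(h2)) or 1
    return 1.0 - levenshtein(h1, h2) / longest
```

Because Levenshtein distance is order-sensitive, this scores nearby rearrangements and partial overlaps more gradually than an exact hash-count match would.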

2020
Vol 3 (1)
pp. 9
Author(s):
Herman Herman
Lukman Syafie
Tasmil Tasmil
Muhammad Resha

Plagiarism is the use of data, language, or writing without crediting the original author or source. The setting where plagiarism occurs most often is the academic environment, and the material most frequently plagiarized there is scientific work, for example theses. Merely reminding students is not enough to minimize the practice; a system or application is needed that can measure the level of similarity between student thesis proposals. In computer science, the Rabin-Karp algorithm can be used to measure the similarity of texts. Rabin-Karp is a string-matching algorithm that uses a hash function to compare a search string of length m against the substrings of a text of length n, and it works well for large data sizes. The test results show that the value chosen for the k-gram size affects the measured similarity levels. In addition, a k-gram value of 5 executed faster than values of 4 and 6.
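A minimal sketch of the k-gram step, with Python's built-in hash standing in for the Rabin-Karp hash and Jaccard overlap as an illustrative similarity measure (the paper's exact similarity formula is not given in the abstract):

```python
import re

def kgrams(text: str, k: int) -> list[str]:
    """Normalize the text and slide a window of k characters across it."""
    cleaned = re.sub(r"[^a-z0-9]", "", text.lower())  # strip case/punctuation
    return [cleaned[i:i + k] for i in range(len(cleaned) - k + 1)]

def kgram_similarity(doc1: str, doc2: str, k: int = 5) -> float:
    """Overlap of k-gram hash sets (Jaccard, as an illustrative choice)."""
    h1 = {hash(g) for g in kgrams(doc1, k)}
    h2 = {hash(g) for g in kgrams(doc2, k)}
    if not h1 or not h2:
        return 0.0
    return len(h1 & h2) / len(h1 | h2)

# The choice of k trades sensitivity for speed: smaller k produces more
# (and more generic) grams. The paper reports k = 5 as fastest of 4, 5, 6.
print(kgram_similarity("the quick brown fox", "the quick brown dog", k=5))
```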


1988
Vol VIII (3)
pp. 87-97
Author(s):
P. Wood
D. Turcaso

2018
Vol 1 (1)
Author(s):
Danny Steveson
Halim Agung
Fendra Mulia

Plagiarism is a very frequent problem in many settings, including schools, where the content of papers or assignments collected from students is often plagiarized. This reflects declining creativity among students in contributing ideas and personal opinions to the tasks they are given. To address this problem, this research uses the Rabin-Karp algorithm, a string-search algorithm that uses hashing to find any of a set of string patterns in a text. Using this application, the user can compare one document with another; the application reports sentence similarity, then breaks the result down per word and per hash, and computes the average of the resulting percentages. Testing was done by taking 50 samples and comparing the percentage produced by the Rabin-Karp algorithm with a manually obtained percentage, each test comparing one document against another. Based on the results, it can be concluded that the Rabin-Karp algorithm can be implemented in a plagiarism application, as evidenced by the test using 50 samples, in which 43 samples succeeded with an average difference of 14.22%.
Keywords: document, Rabin-Karp algorithm, Dice-Sørensen index, plagiarism, sentence, word
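Since the keywords mention the Dice-Sørensen index, the similarity step might look like the following sketch, where fingerprints is a hypothetical stand-in for the application's Rabin-Karp hashing:

```python
def fingerprints(text: str, k: int = 5) -> set[int]:
    """Hypothetical fingerprint step: hash every k-gram of the text."""
    return {hash(text[i:i + k]) for i in range(len(text) - k + 1)}

def dice_sorensen(a: set[int], b: set[int]) -> float:
    """Dice-Sørensen index: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

doc1 = "students often reuse entire sentences in their assignments"
doc2 = "students often reuse whole sentences in their homework"
print(f"{dice_sorensen(fingerprints(doc1), fingerprints(doc2)):.2%}")
```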


Recent applications of conventional iterative coordinate descent (ICD) algorithms to multislice helical CT reconstruction have shown that conventional ICD can greatly improve image quality by increasing resolution as well as reducing noise and some artifacts. However, high computational cost and long reconstruction times remain a barrier to the use of the conventional algorithm in practical applications. Among the various iterative methods that have been studied for this problem, ICD has been found to have relatively low overall computational requirements due to its fast convergence. This paper presents a fast model-based iterative reconstruction algorithm using spatially nonhomogeneous ICD (NH-ICD) optimization. NH-ICD speeds up convergence by focusing computation where it is most needed, using a mechanism that adaptively selects voxels for update: first, a voxel selection criterion (VSC) determines the voxels in greatest need of update; then, a voxel selection algorithm (VSA) orders successive voxel updates so that some locations are updated repeatedly while the characteristics needed for global convergence are retained. To speed up each voxel update, the paper also proposes a fast 3-D optimization algorithm that upper-bounds the local 3-D objective function with a quadratic substitute function, so that a closed-form solution can be obtained instead of a computationally expensive line search. Experimental results show that the proposed method accelerates reconstruction by roughly a factor of three on average for typical 3-D multislice geometries.
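The paper's CT forward model is beyond the scope of this summary, but the nonhomogeneous update-ordering idea can be sketched on a toy quadratic objective, where each coordinate update has a closed form just as the quadratic substitute function gives a closed-form voxel update. The selection rule below is a crude stand-in for the paper's VSC/VSA, and all names and constants are illustrative:

```python
import numpy as np

# Toy objective: f(x) = 0.5 * x^T A x - b^T x, with A symmetric positive
# definite, so exact coordinate minimization has a closed form.
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)          # SPD system matrix (stand-in)
b = rng.standard_normal(50)

def nh_icd(A, b, sweeps=20, frac=0.2):
    """Nonhomogeneous coordinate descent: one full homogeneous pass, then
    sweeps that concentrate updates on the coordinates whose last update
    moved the most (a crude stand-in for the paper's VSC)."""
    n = len(b)
    x = np.zeros(n)
    change = np.full(n, np.inf)        # "need of update" score per coordinate
    for sweep in range(sweeps):
        k = max(1, int(frac * n))
        # Selection: top `frac` of coordinates by recent change.
        order = np.argsort(change)[::-1][:k] if sweep else range(n)
        for j in order:
            grad_j = A[j] @ x - b[j]   # derivative of f along coordinate j
            step = grad_j / A[j, j]    # closed-form coordinate minimizer
            x[j] -= step
            change[j] = abs(step)
    return x

x = nh_icd(A, b)
print("residual:", np.linalg.norm(A @ x - b))
```

The point of the ordering is the same as in the paper: computation is spent where the objective is still changing, rather than uniformly across all unknowns.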


Author(s):
S. Salehi
M. Karami
R. Fensholt

Lichens are the dominant autotrophs of polar and subpolar ecosystems and commonly encrust rock outcrops. Spectral mixing of lichens and bare rock can shift the diagnostic spectral features of materials of interest, leading to misinterpretation and false positives if mapping is based on exact spectral matching. The ability to distinguish lichen coverage from rock, and to decompose a mixed pixel into a collection of pure reflectance spectra, can therefore improve the applicability of hyperspectral methods for mineral exploration. The objective of this study is to propose a robust lichen index that can be used to estimate lichen coverage regardless of the mineral composition of the underlying rocks. The performance of three index structures (ratio, normalized ratio, and subtraction) is investigated using synthetic linear mixtures of pure rock and lichen spectra with prescribed mixing ratios. Laboratory spectroscopic data are obtained from lichen-covered samples collected from the Karrat, Liverpool Land, and Sisimiut regions in Greenland. The spectra are then resampled to Hyperspectral Mapper (HyMAP) resolution in order to further investigate the suitability of the indices for the airborne platform. At both resolutions, a Pattern Search (PS) algorithm is used to identify the optimal band wavelengths and bandwidths for the lichen index. The band optimization shows that the ratio between R894-1246 and R1110 explains most of the variability in the hyperspectral data at the original laboratory resolution (R² = 0.769), whereas the normalized index incorporating R1106-1121 and R904-1251 yields the best results at HyMAP resolution (R² = 0.765).
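As a sketch of how such an index could be evaluated, the following assumes the common (a - b) / (a + b) form for a normalized ratio and averages reflectance over the band windows reported for HyMAP resolution; the data and function names are illustrative, not from the paper:

```python
import numpy as np

def band_mean(wavelengths, reflectance, lo, hi):
    """Mean reflectance over a wavelength window [lo, hi] in nm."""
    mask = (wavelengths >= lo) & (wavelengths <= hi)
    return reflectance[..., mask].mean(axis=-1)

def normalized_lichen_index(wavelengths, reflectance):
    """Normalized ratio over the HyMAP-resolution bands reported in the
    abstract: R1106-1121 against R904-1251."""
    a = band_mean(wavelengths, reflectance, 1106, 1121)
    b = band_mean(wavelengths, reflectance, 904, 1251)
    return (a - b) / (a + b)

# Illustrative spectrum: 450-2450 nm at 10 nm spacing, random reflectance.
wl = np.arange(450, 2451, 10, dtype=float)
spectrum = np.random.default_rng(1).uniform(0.05, 0.6, wl.shape)
print(normalized_lichen_index(wl, spectrum))
```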

