RABIN-CARP IMPLEMENTATION IN MEASURING SIMALIRITY OF RESEARCH PROPOSAL OF STUDENTS

Plagiarism is the use of data, language, and writing without crediting the original author or source. It occurs most often in the academic environment, where the most frequently plagiarized material is scientific work such as theses. Reminding students is not enough to minimize plagiarism; a system or application is needed that measures the level of similarity between student thesis proposals. In computer science, the Rabin-Karp algorithm can be used to measure the similarity of texts. Rabin-Karp is a string matching algorithm that uses a hash function to compare the search string (m) with substrings of the text (n), and it can work on large data sizes. The test results show that the k-gram value affects the measured similarity level. In addition, a k-gram value of 5 executed faster than values of 4 and 6.
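As a rough illustration of the approach the abstract describes, the sketch below hashes every k-gram of two texts with a rolling hash and compares the resulting hash sets. The abstract does not give the hash parameters or the similarity formula, so the base, the modulus, the function names, and the use of the Dice coefficient are all assumptions made for this example; text preprocessing (case folding, punctuation removal) is left out.

```python
def kgram_hashes(text: str, k: int, base: int = 256, mod: int = 1_000_003) -> set[int]:
    """Return the set of rolling-hash values of every k-gram in text.

    base and mod are illustrative choices, not values from the paper.
    """
    if k <= 0 or len(text) < k:
        return set()
    high = pow(base, k - 1, mod)  # weight of the character that leaves the window
    h = 0
    for ch in text[:k]:  # hash of the first window
        h = (h * base + ord(ch)) % mod
    hashes = {h}
    for i in range(k, len(text)):
        # Drop the oldest character, shift, and append the new character.
        h = ((h - ord(text[i - k]) * high) * base + ord(text[i])) % mod
        hashes.add(h)
    return hashes


def similarity(a: str, b: str, k: int = 5) -> float:
    """Dice coefficient over the two k-gram hash sets (assumed measure)."""
    ha, hb = kgram_hashes(a, k), kgram_hashes(b, k)
    if not ha and not hb:
        return 0.0
    return 2 * len(ha & hb) / (len(ha) + len(hb))
```

Calling `similarity(doc_a, doc_b, k=4)`, `k=5`, and `k=6` on the same pair of documents is one way to observe how the k-gram value changes the score. Because distinct k-grams can share a hash value, a production system would also verify matching k-grams character by character.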


Perbandingan Algoritma N-gram dan Algoritma Knuth Morris Pratt untuk Mengukur Tingkat Akurasi Plagiarisme pada Dokumen Abstrak Skripsi Berbasis Website

JITU : Journal Informatic Technology And Communication ◽

10.36596/jitu.v5i1.390 ◽

2021 ◽

Vol 5 (1) ◽

pp. 30-39

Author(s):

Dwi Krisbiantoro ◽

Sofyan Fathur Rohim ◽

Irfan Santiko

Keyword(s):

Search Algorithm ◽

System Development ◽

Test Results ◽

Academic World ◽

Development Method ◽

Average Percentage ◽

Analysis Design ◽

String Search ◽

N Gram ◽

Better Than

Plagiarism is an offense that often occurs in the academic world: another person's work is taken and illegitimately presented as one's own. N-gram is a technique that cuts a sentence or word into pieces of N characters. The Knuth-Morris-Pratt (KMP) algorithm is a string search algorithm that keeps information used to decide how far to shift whenever the pattern fails to match the text. The purpose of this study is to build a website-based system that compares the accuracy of the N-gram algorithm with that of KMP on thesis abstract documents. The research uses the waterfall system development method, whose stages are analysis, design, coding, and testing. In the tests, KMP performed better than N-gram: KMP had an average percentage of 3.8% versus 3.5% for N-gram, averaged over 10 trials on 5 documents.
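The two techniques named in this abstract can be sketched in a few lines. The code below is a textbook version written for illustration, not the system built in the study; the function names are hypothetical. `char_ngrams` cuts a string into overlapping N-character pieces, and `kmp_search` uses the failure table to decide how far to shift after a mismatch.

```python
def char_ngrams(text: str, n: int) -> list[str]:
    """Cut text into overlapping pieces of n characters."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def failure_table(pattern: str) -> list[int]:
    """table[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it."""
    table = [0] * len(pattern)
    j = 0
    for i in range(1, len(pattern)):
        while j > 0 and pattern[i] != pattern[j]:
            j = table[j - 1]  # fall back to a shorter matching prefix
        if pattern[i] == pattern[j]:
            j += 1
        table[i] = j
    return table


def kmp_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    table = failure_table(pattern)
    matches = []
    j = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while j > 0 and ch != pattern[j]:
            j = table[j - 1]  # shift the pattern without re-reading text
        if ch == pattern[j]:
            j += 1
        if j == len(pattern):
            matches.append(i - j + 1)
            j = table[j - 1]  # keep going to find overlapping matches
    return matches
```

How the study turns these primitives into a percentage score is not stated in the abstract, so no scoring step is shown here.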


Combination of levenshtein distance and rabin-karp to improve the accuracy of document equivalence level

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.27.12084 ◽

2018 ◽

Vol 7 (2.27) ◽

pp. 17 ◽

Cited By ~ 4

Author(s):

Andysah Putera Utama Siahaan ◽

Solly Aryza ◽

Eko Hariyanto ◽

Rusiadi ◽

Andre Hasudungan Lubis ◽

...

Keyword(s):

Hash Function ◽

Search Algorithm ◽

Pattern Search ◽

Levenshtein Distance ◽

Text Search ◽

Practical Applications ◽

Single Pattern ◽

String Search ◽

Good For

The Rabin-Karp algorithm is a search algorithm that uses hashing to find a substring pattern in a text. It is well suited to matching against many patterns, and one of its practical applications is plagiarism detection. The algorithm was invented by Michael O. Rabin and Richard M. Karp. It performs string search using a hash function; the resulting hash values are compared between two documents to determine their level of similarity. Rabin-Karp is not very good for single-pattern text search but is well suited to multiple-pattern search. The Levenshtein algorithm can be used to replace the hash calculation in Rabin-Karp. The hash calculation in Rabin-Karp only counts the number of hashes that have the same value in both documents. Using the Levenshtein algorithm to calculate the hash distance between the two documents results in better accuracy.
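For readers unfamiliar with the Levenshtein distance mentioned above, a minimal sketch follows. It shows only the standard edit-distance recurrence; the abstract does not say exactly how the paper combines it with Rabin-Karp hashes, so the `normalized_similarity` helper is a hypothetical addition for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Map the distance to [0, 1]; this normalization is an assumption."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - levenshtein(a, b) / longest
```

Unlike a count of identical hashes, the distance changes gradually as two strings diverge, which is one plausible reading of why the authors report better accuracy.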


EVALUASI ANR PADA TRANSMISI DATA NETWORK TERHADAP WP-REST API DALAM APLIKASI ANDROID

KOMIK (Konferensi Nasional Teknologi Informasi dan Komputer) ◽

10.30865/.v2i1.901 ◽

2018 ◽

Vol 2 (1) ◽

Author(s):

Agung Riyadi

Keyword(s):

Large Data ◽

Data Retrieval ◽

Test Results ◽

Android Application ◽

Retrieval Process ◽

Request Queue ◽

Plant Data ◽

Android Development ◽

Rest Api ◽

The One

One of the many ways to connect an Android application to a database is to use Volley and a REST API. With a REST API, the Android application does not connect to the database directly; the API acts as an intermediary. In Android development, android-volley has a weakness when making requests for large amounts of data, so an evaluation is needed to test its capabilities. This research tests android-volley retrieving data through a REST API, in the form of an application that retrieves medicinal plant data. The test results show that with Volley an error occurs when the back button is pressed, that is, when another process is started before the previous Volley request has finished loading. The error occurred on several Android versions, such as Lollipop and Marshmallow, and on several device brands. When using android-volley, developers therefore need to check the request queue process triggered by the user: if Volley's data retrieval has not completed, the download must be stopped so that no Android Not Responding (ANR) error occurs.

Keywords: Android, Volley, WP REST API, ANR Error


Implementing a faster string search algorithm in Ada

ACM SIGAda Ada Letters ◽

10.1145/44772.44777 ◽

1988 ◽

Vol VIII (3) ◽

pp. 87-97

Author(s):

P. Wood ◽

D. Turcaso

Keyword(s):

Search Algorithm ◽

String Search


Effects of genotype and lactation number on health and reproductive problems in dairy cows

Proceedings of the British Society of Animal Science ◽

10.1017/s1752756200595842 ◽

1997 ◽

Vol 1997 ◽

pp. 143-143

Author(s):

B.L. Nielsen ◽

R.F. Veerkamp ◽

J.E. Pryce ◽

G. Simm ◽

J.D. Oldham

Keyword(s):

Dairy Cows ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Variation Analysis ◽

Genetic Line ◽

Data Set ◽

Health Events ◽

Use Of Data ◽

Low Incidence

High producing dairy cows have been found to be more susceptible to disease (Jones et al., 1994; Göhn et al., 1995) raising concerns about the welfare of the modern dairy cow. Genotype and number of lactations may affect various health problems differently, and their relative importance may vary. The categorical nature and low incidence of health events necessitates large data-sets, but the use of data collected across herds may introduce unwanted variation. Analysis of a comprehensive data-set from a single herd was carried out to investigate the effects of genetic line and lactation number on the incidence of various health and reproductive problems.


APLIKASI PENDETEKSI PLAGIARISME TUGAS DAN MAKALAH PADA SEKOLAH MENGGUNAKAN ALGORITMA RABIN KARP

Jurnal Algoritma, Logika dan Komputasi ◽

10.30813/j-alu.v1i1.1104 ◽

2018 ◽

Vol 1 (1) ◽

Author(s):

Danny Steveson ◽

Halim Agung ◽

Fendra Mulia

Keyword(s):

Search Algorithm ◽

Sentence Similarity ◽

String Search ◽

Frequent Problem ◽

Sørensen Index

Plagiarism is a very frequent problem in many settings, including schools. The papers and assignments that students hand in often contain plagiarized content, which contributes to declining creativity among students in offering their own ideas and opinions on the assigned task. To address this problem, this research uses the Rabin-Karp algorithm, a string search algorithm that uses hashing to find any one of a set of string patterns in a text. With this application, the user can compare one document with another; the result is given as sentence similarity, then broken down per word and per hash, and calculated from the average of the percentages. The test takes 50 samples and compares the percentage obtained with the Rabin-Karp algorithm against the percentage obtained manually, comparing one document with another each time. Based on the results, it can be concluded that the Rabin-Karp algorithm can be implemented in a plagiarism application, as evidenced by the test on 50 samples, with 43 successful samples and a reported figure of 14.22%.

Keywords: document, Rabin Karp Algorithm, Dice Sorensen Index, Plagiarism, sentence, word


IDPM: An Improved Degenerate Pattern Matching Algorithm for Biological Sequences

International Journal of Foundations of Computer Science ◽

10.1142/s0129054117500307 ◽

2017 ◽

Vol 28 (07) ◽

pp. 889-914

Author(s):

Jie Lin ◽

Yue Jiang ◽

E. James Harner ◽

Bing-Hua Jiang ◽

Don Adjeroh

Keyword(s):

Performance Improvement ◽

Pattern Matching ◽

Linear Time ◽

Computational Cost ◽

Large Data ◽

Biological Sequences ◽

Matching Problem ◽

Practical Utilization ◽

Matching Algorithm ◽

Pattern Matching Algorithm

Let [Formula: see text] be a string, with symbols from an alphabet. [Formula: see text] is said to be degenerate if for some positions, say [Formula: see text], [Formula: see text] can contain a subset of symbols from the symbol alphabet, rather than just one symbol. Given a text string [Formula: see text] and a pattern [Formula: see text], both with symbols from an alphabet [Formula: see text], the degenerate string matching problem is to find positions in [Formula: see text] where [Formula: see text] occurred, such that [Formula: see text], [Formula: see text], or both are allowed to be degenerate. Though some algorithms have been proposed, their huge computational cost poses a significant challenge to their practical utilization. In this work, we propose IDPM, an improved degenerate pattern matching algorithm based on an extension of the Boyer–Moore algorithm. At the preprocessing phase, the algorithm defines an alphabet-independent compatibility rule, and computes the shift arrays using respective variants of the bad character and good suffix heuristics. At the search phase, IDPM improves the matching speed by using the compatibility rule. On average, the proposed IDPM algorithm has a linear time complexity with respect to the text size, and to the overall size of the pattern. IDPM demonstrates significant performance improvement over state-of-the-art approaches. It can be used in fast practical degenerate pattern matching with large data sizes, with important applications in flexible and scalable searching of huge biological sequences.
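To make the problem statement concrete, the sketch below is a brute-force matcher for degenerate strings. It is not IDPM: it has no Boyer–Moore shift arrays and runs in time proportional to text length times pattern length. The representation (one set of symbols per position) and the compatibility test (two positions match when their symbol sets intersect) are assumptions chosen for this illustration, and the function names are hypothetical.

```python
def degenerate(spec: list[str]) -> list[frozenset[str]]:
    """Build a degenerate string: each item lists the symbols allowed
    at one position, e.g. ["A", "CG"] means A followed by C or G."""
    return [frozenset(symbols) for symbols in spec]


def naive_degenerate_search(
    text: list[frozenset[str]], pattern: list[frozenset[str]]
) -> list[int]:
    """Return every start position where each pattern position shares
    at least one symbol with the aligned text position."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    return [
        i
        for i in range(n - m + 1)
        # A non-empty intersection is truthy, so all() checks every position.
        if all(text[i + j] & pattern[j] for j in range(m))
    ]
```

For example, searching `degenerate(["A", "C", "G", "T", "A", "C"])` for the pattern `degenerate(["A", "CG"])` reports positions 0 and 4. An algorithm such as IDPM aims to find the same positions while skipping most alignments.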


First Aid Mobile Application for University Clinic using Predictive String Search Algorithm

2019 IEEE 11th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management ( HNICEM ) ◽

10.1109/hnicem48295.2019.9072749 ◽

2019 ◽

Author(s):

Marjory Faye F. Dando ◽

Evander Tampos ◽

Princess Karen F. De Guzman ◽

Francis F. Balahadia

Keyword(s):

Mobile Application ◽

Search Algorithm ◽

First Aid ◽

String Search


An Efficient Block Matching Algorithm for Fast Motion Estimation Using New Three Step Search and Tabu Search Algorithm

International Journal of Scientific Research and Management ◽

10.18535/ijsrm/v5i7.58 ◽

2017 ◽

Vol 5 (7) ◽

Author(s):

Savita Malik ◽

Vijay Nehra

Keyword(s):

Tabu Search ◽

Motion Estimation ◽

Search Algorithm ◽

Block Matching ◽

Tabu Search Algorithm ◽

Fast Motion Estimation ◽

Matching Algorithm ◽

Block Matching Algorithm ◽

Fast Motion


Learnings from a pragmatic study to evaluate benefit of performing reflex clinical trial matching and providing clinical decision support to physicians.

Journal of Clinical Oncology ◽

10.1200/jco.2019.37.15_suppl.e18006 ◽

2019 ◽

Vol 37 (15_suppl) ◽

pp. e18006-e18006

Author(s):

Neha M Jain ◽

Alison Culley ◽

Travis John Osterman ◽

Mia Alyce Levy

Keyword(s):

Clinical Trial ◽

Response Rate ◽

Clinical Decision ◽

Test Results ◽

Vital Status ◽

Matching Algorithm ◽

Study Results ◽

Clinical Trial Enrollment ◽

To Receive ◽

Pragmatic Study

e18006 Background: Clinical trial enrollment is an assiduous process and requires active initiation and maintenance efforts by providers, patients, trial investigators, or other clinical staff. Setting up automated process triggers to perform a reflex clinical trial matching can kick-start the process without requiring human intervention. Methods: Using a clinical trial matching service developed in collaboration with GenomOncology, we used the receipt of sequencing test results as a process trigger to perform reflex clinical trial matching on oncology patients. A research nurse performed additional refinements to these results using multi-faceted filtering and an initial manual prescreening. In this pragmatic study, providers were randomized to receive results of prescreening events. EMR messages were sent to the intervention cohort of providers with recommendations for clinical trials for their patients and suggested next steps. Provider responses and prescreening outcomes were recorded in a REDCap project. To iteratively refine the trial results, matching algorithm updates were deployed throughout the study. Results: In the pilot deployment of this trial, we performed prescreening on 60 patients. At the time of prescreening vital status of 17% of the patients was outdated. We observed that trial cohort related recruiting statuses were the highest contributors to false matches (44%) and provider response rate was 94%. Conclusions: It is not possible to make substantial improvements to the outcome of clinical trial enrollment events without investing in reliable publicly available resources that host updated recruiting status for trials at the arm/cohort level. Uptake of such efforts by NCI or NLM has the potential to radically change the accuracy of clinical trial matching services and thereby improve enrollment efficiencies.[Table: see text]
