Designing a word recommendation application using the Levenshtein Distance algorithm

2021
Vol 11 (2)
pp. 63-70
Author(s):
Nadhia Nurin Syarafina
Jozua Ferjanus Palandi

Good scriptwriting and reporting require a high level of accuracy, but not every author works with the same care. Lower accuracy leads to mistyped words in a sentence: a typing error turns a word into a non-standard form or, worse, makes it meaningless. A recommendation application addresses this by suggesting the intended spelling whenever a typing error occurs, which reduces the writer's error rate. One method for correcting spelling is Approximate String Matching, which applies an approximate approach to the string search process; the Levenshtein Distance algorithm belongs to this family. In the application, text first goes through a preprocessing stage, and incorrectly written words are then corrected using the Levenshtein Distance algorithm. The application was tested on ten texts of 100 words, ten texts of 100 to 250 words, and ten texts of 250 to 500 words. The average accuracy rates for these three groups were 95%, 94%, and 90%, respectively.
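The idea of recommending a word by Levenshtein Distance can be illustrated with a minimal Python sketch. It is not the authors' application: the function names `levenshtein` and `recommend`, the `max_distance` and `limit` parameters, and the tiny word list are assumptions made for illustration, and the preprocessing stage is reduced to lowercasing.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def recommend(word, dictionary, max_distance=2, limit=3):
    """Return up to `limit` dictionary words within `max_distance` edits of `word`.

    `max_distance` and `limit` are hypothetical tuning parameters.
    """
    scored = [(levenshtein(word.lower(), entry.lower()), entry) for entry in dictionary]
    close = sorted((d, e) for d, e in scored if d <= max_distance)
    return [entry for _, entry in close[:limit]]


if __name__ == "__main__":
    words = ["accuracy", "accurate", "actually"]  # illustrative word list
    print(recommend("acuracy", words))
```

The two-row formulation keeps memory proportional to the shorter word, which is usually enough for word-level correction; a real application would compare against a full dictionary of standard words.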

2016
Vol 7 (2)
Author(s):  
Yeny Rochmawati ◽  
Retno Kusumaningrum

Abstract. Typing errors that turn standard words into non-standard words are often caused by misspelling. This can be addressed by developing a system that identifies typing errors. Approximate string matching is a widely implemented method for this task and is used with several string search algorithms, namely Levenshtein Distance, Hamming Distance, Damerau Levenshtein Distance and Jaro Winkler Distance. However, no study has compared the performance of these four algorithms for the Indonesian language. This research therefore compares the four algorithms to identify which is the most accurate and precise in string search across various kinds of typing errors. Evaluation uses users' relevance judgments, which produce the mean average precision (MAP) used to determine the best algorithm. The results show that the Jaro Winkler Distance algorithm performs best in word checking, with a MAP value of 0.87 when identifying the typing errors in 50 incorrect words.

Keywords: Typing errors, Levenshtein, Hamming, Damerau Levenshtein, Jaro Winkler
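As a rough illustration of the evaluation measure named in this abstract, the sketch below computes mean average precision from ranked suggestion lists and sets of suggestions judged relevant. The function names and sample data are hypothetical, and the sketch assumes the common convention of dividing by the number of relevant items per query; the study's exact variant is not stated in the abstract.

```python
def average_precision(ranked, relevant):
    """Average of precision@k over the ranks k at which a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumption: normalise by the total number of relevant items.
    return precision_sum / len(relevant)


def mean_average_precision(queries):
    """`queries` is a list of (ranked_suggestions, relevant_set) pairs."""
    if not queries:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


if __name__ == "__main__":
    # Hypothetical judgments for two misspelled words.
    queries = [
        (["a", "b", "c"], {"a", "c"}),
        (["x", "y"], {"y"}),
    ]
    print(round(mean_average_precision(queries), 4))
```

Each misspelled word acts as one query, so an algorithm that ranks the intended word first for most misspellings ends up with a MAP close to 1.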


Author(s):  
Bonifacius Vicky Indriyono

Writing is one of the ways a writer expresses ideas to others. However, writing often contains spelling errors, especially in English, which lead readers to misunderstand the meaning of the text. To overcome this problem, a system that can detect spelling errors is needed. The Damerau Levenshtein Distance and Jaro-Winkler Distance algorithms can be used to detect English typing errors. The test results show that both algorithms are able to optimally detect word mismatches and find similarities between the words compared. The Damerau Levenshtein Distance works by finding the smallest distance value, while the Jaro-Winkler Distance works by finding the greatest similarity value between the strings being compared. Using these algorithms, spelling errors can be minimized.

Keywords: Algorithm; Damerau Levenshtein; Jaro Winkler; Spelling Checker; String Matching.
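The contrast between a smallest-distance criterion and a greatest-similarity criterion can be sketched in Python. This is an illustration under stated assumptions, not the author's system: `osa_distance` implements the restricted (optimal string alignment) variant of Damerau Levenshtein, `jaro_winkler` uses the conventional prefix scaling of 0.1 over at most four characters, and the candidate word list is made up.

```python
def osa_distance(a: str, b: str) -> int:
    """Edit distance with adjacent transpositions (optimal string alignment variant)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            # Swap of two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def jaro(a: str, b: str) -> float:
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ca:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [c for c, hit in zip(a, a_hit) if hit]
    b_seq = [c for c, hit in zip(b, b_hit) if hit]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b) + (matches - transpositions) / matches) / 3


def jaro_winkler(a: str, b: str, scaling: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    score = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return score + prefix * scaling * (1 - score)


if __name__ == "__main__":
    typo, candidates = "recieve", ["receive", "recipe", "relieve"]  # illustrative data
    print(min(candidates, key=lambda w: osa_distance(typo, w)))   # smallest distance
    print(max(candidates, key=lambda w: jaro_winkler(typo, w)))   # greatest similarity
```

The first selection minimizes an edit count and the second maximizes a score between 0 and 1, so the two measures cannot be compared directly; only the rankings they produce can.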


2017
Author(s):  
Hongyi Xin ◽  
Jeremie Kim ◽  
Sunny Nahar ◽  
Can Alkan ◽  
Onur Mutlu

Abstract

Motivation: Approximate String Matching is a pivotal problem in the field of computer science. It serves as an integral component of many string algorithms, most notably DNA read mapping and alignment. The improved LV algorithm offers a better dynamic programming strategy than the banded Smith-Waterman algorithm but supports only a limited selection of scoring schemes. In this paper, we propose the Leaping Toad problem, a generalization of the approximate string matching problem, as well as LEAP, a generalization of the Landau-Vishkin algorithm that solves the Leaping Toad problem under a broader selection of scoring schemes.

Results: We benchmarked LEAP against 3 state-of-the-art approximate string matching implementations. We show that when using a bit-vectorized de Bruijn sequence based optimization, LEAP is up to 7.4x faster than the state-of-the-art bit-vector Levenshtein distance implementation and up to 32x faster than the state-of-the-art affine-gap-penalty parallel Needleman Wunsch implementation.

Availability: We provide an implementation of LEAP in C++ at github.com/CMU-SAFARI/[email protected], [email protected] or [email protected]
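For readers unfamiliar with banded dynamic programming, the following Python sketch shows the general idea of restricting the edit-distance table to a diagonal band of half-width k. It is neither LEAP nor the Landau-Vishkin algorithm, and it uses unit costs only; the function name `banded_edit_distance` and the convention of returning `None` when the distance exceeds k are assumptions for illustration.

```python
def banded_edit_distance(a: str, b: str, k: int):
    """Return the Levenshtein distance of a and b if it is at most k, else None.

    Only cells with |i - j| <= k are evaluated, so the work is O(k * len(a))
    instead of O(len(a) * len(b)).
    """
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None  # the length gap alone already needs more than k edits
    inf = k + 1  # stands in for "more than k" outside the band
    previous = [j if j <= k else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        current = [inf] * (m + 1)
        lo, hi = max(0, i - k), min(m, i + k)
        if lo == 0:
            current[0] = i  # deleting the first i characters of a
        for j in range(max(1, lo), hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            current[j] = min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # match or substitution
                inf,                     # cap values so they stay comparable
            )
        previous = current
    return previous[m] if previous[m] <= k else None


if __name__ == "__main__":
    print(banded_edit_distance("kitten", "sitting", 3))
    print(banded_edit_distance("kitten", "sitting", 2))
```

Any alignment with at most k edits never leaves the band, which is why cells outside it can safely be treated as exceeding the threshold.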


Author(s):  
Ni Putu Dian Permata Prasetyaningrum

Surabaya Shipping Polytechnic emphasizes certain areas of expertise that its cadets (taruna) must possess, so that graduates leave with both expertise and skills. The purpose of this study was to examine the effect of inquiry learning, discovery learning, and creativity level on the ability of nautical and technical cadets at Surabaya Shipping Polytechnic to write descriptive essays. The research uses a quantitative, experimental method and was conducted at Surabaya Shipping Polytechnic. The subjects were the cadets of the Nautika A, Nautika B, Teknika A, and Teknika B classes. Based on the results and discussion, the following conclusions were obtained. First, there is a difference in the ability to write descriptive essays between cadets taught with the inquiry method and cadets taught with the discovery learning method; the test results show that the inquiry method performs better. Second, there is a difference in the ability to write descriptive essays between cadets with a high level of creativity and cadets with a low level of creativity; the test results favor cadets with a high level of creativity. Third, there is a relationship between learning method, creativity level, and the ability to write descriptive essays.


2021
Vol 11 (2)
pp. 75
Author(s):  
Jan Amos Jelinek

The Earth’s shape concept develops as consecutive cognitive problems (e.g., the location of people and trees on the spherical Earth) are gradually resolved. Establishing the order of problem solving may be important for the organisation of teaching situations. This study attempted to determine the sequence of problems to be resolved based on tasks included in the EARTH2 test. The study covered a group of 444 children between 5 and 10 years of age. It captured the order in which children solve cognitive problems on the way to constructing a science-like concept. The test results were compared with previous studies. The importance of cultural influences connected to significant differences (24%) in test results was emphasised. Attention was drawn to the problem of the consistency of the mental model approach highlighted in the literature. The analysis of the individual sets of answers provided a high level of consistency of indications referring to the same model (36%), emphasising the importance of the concept of mental models.


Author(s):  
Xiaoling Luo ◽  
Adrian Cottam ◽  
Yao-Jan Wu ◽  
Yangsheng Jiang

Trip purpose information plays a significant role in transportation systems. Existing trip purpose information is traditionally collected through human observation. This manual process requires many personnel and a large amount of resources. Because of this high cost, automated trip purpose estimation is more attractive from a data-driven perspective, as it could improve the efficiency of processes and save time. Therefore, a hybrid-data approach using taxi operations data and point-of-interest (POI) data to estimate trip purposes was developed in this research. POI data, an emerging data source, was incorporated because it provides a wealth of additional information for trip purpose estimation. POI data, an open dataset, has the added benefit of being readily accessible from online platforms. Several techniques were developed and compared to incorporate this POI data into the hybrid-data approach to achieve a high level of accuracy. To evaluate the performance of the approach, data from Chengdu, China, were used. The results show that the incorporation of POI information increases the average accuracy of trip purpose estimation by 28% compared with trip purpose estimation not using the POI data. These results indicate that the additional trip attributes provided by POI data can increase the accuracy of trip purpose estimation.


Information
2021
Vol 12 (1)
pp. 19
Author(s):  
Alexey Semenkov ◽  
Dmitry Bragin ◽  
Yakov Usoltsev ◽  
Anton Konev ◽  
Evgeny Kostuchenko

Modern facial recognition algorithms make it possible to identify system users by their appearance with a high level of accuracy. In such cases, an image of the user’s face is converted to parameters that later are used in a recognition process. On the other hand, the obtained parameters can be used as data for pseudo-random number generators. However, the closeness of the sequence generated by such a generator to a truly random one is questionable. This paper proposes a system which is able to authenticate users by their face, and generate pseudo-random values based on the facial image that will later serve to generate an encryption key. The generator of a random value was tested with the NIST Statistical Test Suite. The subsystem of image recognition was also tested under various conditions of taking the image. The test results of the random value generator show a satisfactory level of randomness, i.e., an average of 0.47 random generation (NIST test), with 95% accuracy of the system as a whole.


2011
Vol 261-263
pp. 989-993
Author(s):  
Anuchit Uchaipichat ◽  
Ekachai Man Koksung

An experimental program of laboratory bearing tests was performed to characterize the bearing capacity of foundations on unsaturated granular soils. All tests were performed by pushing a circular rod on the surface of compacted sand specimens at different values of matric suction until failure. The test results show that ultimate bearing capacity increases with matric suction at low suction values but decreases at high suction levels. Comparisons between the test results and simulations using the expressions proposed in this paper are presented and discussed. Good agreement is achieved for all tested values of suction.

