spelling correction
Recently Published Documents


TOTAL DOCUMENTS: 202 (FIVE YEARS: 51)

H-INDEX: 19 (FIVE YEARS: 2)

2021 ◽  
Author(s):  
Shuai Zhang ◽  
Jiangyan Yi ◽  
Zhengkun Tian ◽  
Ye Bai ◽  
Jianhua Tao ◽  
...  

Author(s):  
Triyas Hevianto Saputro ◽  
Arief Hermawan

Sentiment analysis is a text mining task that extracts information from a sentence or document. This study focuses on text classification for sentiment analysis of hospital reviews, using the criticism and suggestions that customers posted on Google Maps Review. The collected texts still contain many nonstandard words, which cause problems in the preprocessing stage. Selecting and combining preprocessing techniques is therefore crucial for improving the accuracy of the machine learning model. However, not every preprocessing technique improves classification accuracy. The objective of this study is to improve the accuracy of a classification model for sentiment analysis of customers' hospital reviews. Applying a suitable combination of preprocessing techniques can produce a highly accurate classification model. The study experimented with several preprocessing techniques: (1) tokenization, (2) case folding, (3) stop word removal, (4) stemming, and (5) removal of punctuation and numbers. Two further preprocessing methods were then added: (1) spelling correction and (2) slang handling. The results show that the spelling correction and slang methods help improve accuracy. Furthermore, selecting a suitable combination of preprocessing techniques can speed up training and produce a better text classification model.
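The preprocessing steps listed in this abstract can be sketched as a small pipeline. This is a minimal illustration only, not the authors' implementation: the word lists (`SLANG`, `STOP_WORDS`, `VOCABULARY`), the crude suffix stripper standing in for a real stemmer, the use of `difflib` for spelling correction, and the order of the steps are all assumptions made for the example.

```python
import re
from difflib import get_close_matches

# Toy resources, all hypothetical: the study's actual lexicons are not given.
SLANG = {"gr8": "great", "thx": "thanks", "u": "you"}
STOP_WORDS = {"the", "a", "is", "was", "and", "for", "you"}
VOCABULARY = ["doctor", "nurse", "service", "great", "thanks", "friendly", "slow"]


def correct_spelling(token, vocabulary=VOCABULARY, cutoff=0.8):
    """Replace a token with its closest vocabulary entry, if one is close enough."""
    if token in vocabulary:
        return token
    matches = get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def strip_suffix(token):
    """Crude stand-in for a stemmer: drop a few common English suffixes."""
    for suffix in ("ing", "ly", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    text = text.lower()                                     # case folding
    text = re.sub(r"[^a-z0-9\s]", " ", text)                # punctuation removal
    tokens = text.split()                                   # tokenization
    tokens = [SLANG.get(t, t) for t in tokens]              # slang handling
    tokens = [t for t in tokens if not t.isdigit()]         # number removal
    tokens = [t for t in tokens if t not in STOP_WORDS]     # stop word removal
    tokens = [correct_spelling(t) for t in tokens]          # spelling correction
    return [strip_suffix(t) for t in tokens]                # stemming (approximate)


print(preprocess("Thx, the docter was GR8 and frendly!!! 10/10"))
# → ['thank', 'doctor', 'great', 'friend']
```

Slang handling runs before number removal here so that tokens such as "gr8" are expanded rather than discarded; a different ordering may suit other data.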


Author(s):  
Reem Alsadoon

Spelling is an essential skill in EFL writing, especially when writing is used frequently for texting on social media and smartphones. The study surveys Arabic EFL learners about their perceptions of spelling correction tools: spell checkers and auto-correctors. It focuses on the perceived usefulness and ease of use of such tools in social media applications, since the study’s participants use them frequently. The questionnaire was sent online to 84 participants at Aljazeerah Academy in Riyadh, Saudi Arabia. The data were analyzed quantitatively with descriptive and inferential statistics, which indicated that spell checkers are mostly perceived to be easier to use than auto-correctors. However, no significant difference was found in terms of learning the misspelled words.


Electronics ◽  
2021 ◽  
Vol 10 (9) ◽  
pp. 1035
Author(s):  
Miguel Rivera-Acosta ◽  
Juan Manuel Ruiz-Varela ◽  
Susana Ortega-Cisneros ◽  
Jorge Rivera ◽  
Ramón Parra-Michel ◽  
...  

In this paper, we present a novel approach to one of the main challenges in hand gesture recognition in static images: compensating for the accuracy lost when trained models are used to interpret completely unseen data. The model presented here consists of two main data-processing stages. The first is a deep neural network (DNN) that performs handshape segmentation and classification; multiple architectures and input image sizes were tested and compared to derive the best model in terms of accuracy and processing time. For the experiments presented in this work, the DNN models were trained with 24,000 images of 24 signs from the American Sign Language alphabet and fine-tuned with 5200 images of 26 generated signs. The system was tested in real time with a community of 10 persons, yielding a mean average precision of 81.74% and a processing rate of 61.35 frames per second. As the second data-processing stage, a bidirectional long short-term memory neural network was implemented and analyzed to add spelling correction capability to the system. It scored a training accuracy of 98.07% with a dictionary of 370 words, thus increasing robustness on completely unseen data, as shown in our experiments.


10.2196/25530 ◽  
2021 ◽  
Vol 9 (2) ◽  
pp. e25530
Author(s):  
Taehyeong Kim ◽  
Sung Won Han ◽  
Minji Kang ◽  
Se Ha Lee ◽  
Jong-Ho Kim ◽  
...  

Background: Existing bacterial culture test results for infectious diseases are written in unrefined text, resulting in many problems, including typographical errors and stop words. Effective spelling correction processes are needed to ensure the accuracy and reliability of data for the study of infectious diseases, including medical terminology extraction. If a dictionary is established, spelling algorithms using edit distance are efficient. However, in the absence of a dictionary, traditional spelling correction algorithms that utilize only edit distances have limitations.
Objective: In this research, we proposed a similarity-based spelling correction algorithm using pretrained word embedding with the BioWordVec technique. This method uses a character-level N-grams–based distributed representation through unsupervised learning rather than the existing rule-based method. In other words, we propose a framework that detects and corrects typographical errors when a dictionary is not in place.
Methods: For detected typographical errors not mapped to Systematized Nomenclature of Medicine (SNOMED) clinical terms, a correction candidate group with high similarity considering the edit distance was generated using pretrained word embedding from the clinical database. From the embedding matrix in which the vocabulary is arranged in descending order according to frequency, a grid search was used to search for candidate groups of similar words. Thereafter, the correction candidate words were ranked in consideration of the frequency of the words, and the typographical errors were finally corrected according to the ranking.
Results: Bacterial identification words were extracted from 27,544 bacterial culture and antimicrobial susceptibility reports, and 16 types of spelling errors and 914 misspelled words were found. The similarity-based spelling correction algorithm using BioWordVec proposed in this research corrected 12 types of typographical errors and showed very high performance in correcting 97.48% (based on F1 score) of all spelling errors.
Conclusions: This tool corrected spelling errors effectively in the absence of a dictionary based on bacterial identification words in bacterial culture and antimicrobial susceptibility reports. This method will help build a high-quality refined database of vast text data for electronic health records.
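The idea of generating correction candidates by edit distance and ranking them by word frequency can be illustrated with a short sketch. It is a simplification, not the method of the paper: the embedding-similarity step with BioWordVec is omitted entirely, the frequency table `TERM_FREQUENCY` is invented for the example, and ranking by distance first and frequency second is an assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical term counts; a real system would derive them from its corpus.
TERM_FREQUENCY = {
    "escherichia": 120,
    "klebsiella": 80,
    "staphylococcus": 150,
    "streptococcus": 90,
}


def rank_candidates(word, frequencies, max_distance=2):
    """Return known terms within max_distance, nearest first, then most frequent."""
    scored = [(edit_distance(word, term), -count, term)
              for term, count in frequencies.items()]
    return [term for dist, _, term in sorted(scored) if dist <= max_distance]


def correct(word, frequencies=TERM_FREQUENCY, max_distance=2):
    """Return the top-ranked candidate, or the word unchanged if none is close."""
    if word in frequencies:
        return word
    candidates = rank_candidates(word, frequencies, max_distance)
    return candidates[0] if candidates else word


print(correct("eschericha"))     # → escherichia
print(correct("staphylococus"))  # → staphylococcus
```

A word with no candidate inside the distance threshold is returned unchanged, which keeps the sketch from forcing a correction onto unrelated tokens.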

