Character-Based N-gram Model for Uyghur Text Retrieval

Author(s):  
Turdi Tohti ◽  
Lirui Xu ◽  
Jimmy Huang ◽  
Winira Musajan ◽  
Askar Hamdulla
Author(s):  
Abolfazl Aleahmad ◽  
Parsia Hakimian ◽  
Farzad Mahdikhani ◽  
Farhad Oroumchian

2014 ◽  
Vol 12 (8) ◽  
pp. 3758-3767 ◽  
Author(s):  
Mostafa Ezzat ◽  
Tarek Ahmed ElGhazaly ◽  
Mervat Gheith

This paper presents a new model that aims to improve retrieval effectiveness on OCR-degraded Arabic text. The proposed model simulates Arabic OCR recognition mistakes using both word-based and character n-gram approaches, then expands the user's search query with the expected OCR errors. The expanded query yields higher precision and recall than the original query when searching OCR-degraded Arabic text. The proposed model shows a significant increase in degraded-text retrieval effectiveness over previous models: its retrieval effectiveness is 93%, while the best published effectiveness was 84% for the word-based approach and 56% for the character-based approach.
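
The idea of expanding a query with expected OCR errors can be illustrated with a minimal sketch. This is not the authors' implementation: the confusion table `CONFUSIONS`, the helper names `char_ngrams`, `expand_term`, and `expand_query`, and the Latin-script examples are all assumptions chosen for readability. A real system would presumably learn Arabic character confusions from OCR output.

```python
from itertools import product

# Hypothetical confusion table: each character maps to strings an OCR engine
# might produce in its place. Illustrative only, not taken from the paper.
CONFUSIONS = {
    "m": ["rn"],
    "l": ["1"],
    "o": ["0"],
}


def char_ngrams(word, n):
    """Return all overlapping character n-grams of a word."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def expand_term(term, confusions=CONFUSIONS, max_variants=50):
    """Return the term followed by variants with simulated OCR substitutions."""
    # For every position, allow the original character or any confusion of it.
    options = [[ch] + confusions.get(ch, []) for ch in term]
    variants = []
    for combo in product(*options):
        variants.append("".join(combo))
        if len(variants) >= max_variants:  # cap combinatorial growth
            break
    return variants


def expand_query(query):
    """Expand each whitespace-separated term into a group of alternatives."""
    return [expand_term(term) for term in query.split()]
```

Each inner list returned by `expand_query` can be treated as an OR-group when the query is sent to a search engine, so that a document containing a misrecognized form of a term still matches. `char_ngrams` shows the unit a character-based index would operate on.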


Author(s):  
Vitaly Kuznetsov ◽  
Hank Liao ◽  
Mehryar Mohri ◽  
Michael Riley ◽  
Brian Roark

2020 ◽  
Author(s):  
Grant P. Strimel ◽  
Ariya Rastrow ◽  
Gautam Tiwari ◽  
Adrien Piérard ◽  
Jon Webb

2019 ◽  
Vol 1193 ◽  
pp. 012032
Author(s):  
D Purwantoro ◽  
H Akbar ◽  
A Hidayati ◽  
Sfenrianto

2020 ◽  
Vol 12 (1) ◽  
pp. 1-24 ◽  
Author(s):  
Al Hafiz Akbar Maulana Siagian ◽  
Masayoshi Aritsugi
