Fast Phonetic Similarity Search over Large Repositories

Author(s):  
Hegler Tissot ◽  
Gabriel Peschl ◽  
Marcos Didonet Del Fabro
2019 ◽  
Vol 10 (S1) ◽  
Author(s):  
Hegler Tissot ◽  
Richard Dobson

Abstract. Background: There is an increasing amount of unstructured medical data that can be analysed for different purposes. However, information extraction from free-text data may be particularly inefficient in the presence of spelling errors. Existing approaches use string similarity methods, coupled with a supporting dictionary, to search for valid words within a text. However, these methods are not rich enough to encode both typing and phonetic misspellings. Results: Experimental results showed that a joint string and language-dependent phonetic similarity is more accurate than traditional string distance metrics when identifying misspelt names of drugs in a set of medical records written in Portuguese. Conclusion: We present a hybrid approach to efficiently perform similarity matching that overcomes the loss of information inherent in using either exact-match search or string-based similarity search methods.
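The idea of combining a string similarity with a phonetic similarity can be illustrated with a minimal Python sketch. This is not the authors' method: the abstract does not specify their algorithm, and their phonetic component is language-dependent (Portuguese). Here the string part is a normalised Levenshtein distance, and `toy_phonetic_key`, its rewrite rules, the `weight` parameter, and the example drug list are all hypothetical choices made only for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def string_similarity(a: str, b: str) -> float:
    """Edit distance normalised to [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def toy_phonetic_key(word: str) -> str:
    """Hypothetical phonetic normalisation (NOT the paper's rules):
    map a few similar-sounding spellings together and collapse doubled letters."""
    w = word.lower()
    for src, dst in (("ph", "f"), ("th", "t"), ("y", "i"), ("z", "s")):
        w = w.replace(src, dst)
    out = []
    for ch in w:
        if not out or out[-1] != ch:
            out.append(ch)
    return "".join(out)


def hybrid_similarity(a: str, b: str, weight: float = 0.5) -> float:
    """Weighted mix of plain string similarity and similarity of phonetic keys."""
    plain = string_similarity(a.lower(), b.lower())
    phonetic = string_similarity(toy_phonetic_key(a), toy_phonetic_key(b))
    return weight * plain + (1.0 - weight) * phonetic


def best_match(word: str, dictionary: list[str]) -> str:
    """Return the dictionary entry with the highest hybrid similarity."""
    return max(dictionary, key=lambda entry: hybrid_similarity(word, entry))


if __name__ == "__main__":
    drugs = ["amoxicillin", "ampicillin", "atenolol"]  # made-up example dictionary
    print(best_match("amoxycillin", drugs))
```

In this sketch a misspelling that sounds the same as the dictionary entry scores higher under the hybrid measure than under edit distance alone, which is the effect the abstract attributes to joint string and phonetic matching.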


2009 ◽  
Vol 20 (10) ◽  
pp. 2867-2884 ◽  
Author(s):  
Feng WU ◽  
Yan ZHONG ◽  
Quan-Yuan WU ◽  
Yan JIA ◽  
Shu-Qiang YANG

2009 ◽  
Vol 28 (10) ◽  
pp. 2721-2721 ◽  
Author(s):  
Ai-guo LI ◽  
Hua ZHAO

2020 ◽  
Vol 16 (4) ◽  
pp. 473-485
Author(s):  
David Mary Rajathei ◽  
Subbiah Parthasarathy ◽  
Samuel Selvaraj

Background: Coronary heart disease generally occurs due to cholesterol accumulation in the walls of the heart arteries. Statins are the most widely used drugs; they work by inhibiting the active site of the 3-hydroxy-3-methylglutaryl-CoA reductase (HMGCR) enzyme, which is responsible for cholesterol synthesis. A series of atorvastatin analogs with HMGCR inhibition activity has been synthesized experimentally, an approach that is expensive and time-consuming. Methods: In the present study, we employed both a QSAR model and a chemical similarity search to identify novel HMGCR inhibitors for heart-related diseases. To implement this, a 2D QSAR model was developed by correlating the structural properties of a series of atorvastatin analogs reported as HMGCR inhibitors with their biological activity. Then, a chemical similarity search for atorvastatin analogs was performed using the PubChem database. Results and Discussion: A three-descriptor model of charge (GATS1p), connectivity (SCH-7) and distance (VE1_D) of the molecules was obtained for HMGCR inhibition, with statistical values of R² = 0.67, RMSEtr = 0.33, R²ext = 0.64 and CCCext = 0.76. The chemical similarity search yielded 109 novel compounds, and their inhibition activities were predicted using the QSAR model; the predicted values were close to the range of the experimentally observed threshold. Conclusion: The present study suggests that the QSAR model and chemical similarity search could be used in combination to identify novel active compounds in silico with less computation and effort.
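For readers unfamiliar with the reported statistics, the following Python sketch shows commonly used definitions of the coefficient of determination, the root-mean-square error, and the concordance correlation coefficient. The abstract does not state which exact formulas the authors used (QSAR studies sometimes use variants for external validation), so these textbook definitions are an assumption, and the function names are hypothetical.

```python
from math import sqrt


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def ccc(y_true, y_pred):
    """Concordance correlation coefficient (population variances and covariance)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)
```

Unlike a plain correlation, the concordance coefficient penalises a systematic offset between predictions and observations, which is one reason it is often reported alongside R² for external test sets.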


Author(s):  
Yu Chen ◽  
Yong Zhang ◽  
Jin Wang ◽  
Jiacheng Wu ◽  
Chunxiao Xing

2016 ◽  
Vol 51 (8) ◽  
pp. 1-12
Author(s):  
Sandeep R. Agrawal ◽  
Christopher M. Dee ◽  
Alvin R. Lebeck
