SPOT the Drug! An Unsupervised Pattern Matching Method to Extract Drug Names from Very Large Clinical Corpora

Author(s):  
Anni Coden ◽  
Daniel Gruhl ◽  
Neal Lewis ◽  
Michael Tanenblatt ◽  
Joe Terdiman
2021 ◽  
Vol 29 ◽  
pp. 115-124
Author(s):  
Xinlu Wang ◽  
Ahmed A.F. Saif ◽  
Dayou Liu ◽  
Yungang Zhu ◽  
Jon Atli Benediktsson

BACKGROUND: DNA sequence alignment is one of the most fundamental and important operations for identifying which gene family may contain a given sequence, and pattern matching for DNA sequences has been a fundamental issue in biomedical engineering, biotechnology and health informatics. OBJECTIVE: To solve this problem, this study proposes an optimal multi-pattern matching method with wildcards for DNA sequences. METHODS: The proposed method packs the patterns and a sliding window of the text; the window slides along the packed text and is matched against the stored packed patterns. RESULTS: Three data sets are used to test the performance of the proposed algorithm, and the algorithm was found to be more efficient than its competitors because its operations are close to machine language. CONCLUSIONS: Theoretical analysis and experimental results both demonstrate that the proposed method outperforms the state-of-the-art methods and is especially effective for DNA sequences.
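
The packing idea in the METHODS sentence can be illustrated with a minimal sketch. It is not the authors' algorithm: the 2-bit base encoding, the wildcard symbol `N`, and the names `pack_pattern` and `find_all` are assumptions made for illustration. Each pattern is packed into an integer together with a mask that zeroes out wildcard positions, and a packed window of the text is compared against it with one XOR and one AND per position.

```python
# Assumed 2-bit encoding of the four bases (not taken from the paper).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack_pattern(pattern, wildcard="N"):
    """Pack a pattern into (value, mask, length).

    Wildcard positions contribute 00 to the mask, so they are ignored
    when the window is compared with the pattern.
    """
    value = 0
    mask = 0
    for ch in pattern:
        value <<= 2
        mask <<= 2
        if ch != wildcard:
            value |= CODE[ch]
            mask |= 0b11
    return value, mask, len(pattern)


def find_all(text, patterns, wildcard="N"):
    """Return sorted (position, pattern) pairs for every match in text."""
    hits = []
    for pattern in patterns:
        value, mask, m = pack_pattern(pattern, wildcard)
        if m == 0:
            continue  # skip empty patterns
        full = (1 << (2 * m)) - 1  # keeps only the last m bases in the window
        window = 0
        for i, ch in enumerate(text):
            # Slide the packed window one base to the right.
            window = ((window << 2) | CODE[ch]) & full
            # A match means all non-wildcard bit pairs agree.
            if i >= m - 1 and (window ^ value) & mask == 0:
                hits.append((i - m + 1, pattern))
    return sorted(hits)


matches = find_all("ACGTACGA", ["ACG", "GNA"])
```

Here `matches` lists the two occurrences of `ACG` and the single occurrence of `GNA`, where `N` stands for any base. The sketch scans the text once per pattern for clarity; a real multi-pattern method would share work across patterns.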


2015 ◽  
Vol 68 (5) ◽  
pp. 937-950 ◽  
Author(s):  
Lin Wu ◽  
Hubiao Wang ◽  
Hua Chai ◽  
Houtse Hsu ◽  
Yong Wang

A Relative Positions-Constrained Pattern Matching (RPCM) method for underwater gravity-aided inertial navigation is presented in this paper. In this method the gravity patterns are constructed from the relative positions of points in a trajectory, which are calculated from Inertial Navigation System (INS) indications. In these patterns the accumulated errors of the INS-indicated positions cancel out. The newly constructed gravity patterns are therefore more accurate and reliable, the matching process can be constrained, and the probability of mismatching is reduced. Two gravity anomaly maps in the South China Sea were chosen to construct a simulation test. Simulation results show that with the RPCM method, the shape of the trajectory in gravity-aided navigation is not as restricted as in traditional Terrain Contour Matching (TERCOM) algorithms. Moreover, performance, including matching success rates and position accuracies, is greatly improved with the RPCM method, especially for trajectories that are not straight lines. Thus the proposed method is effective and suitable for practical navigation.
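
As a rough illustration of matching a pattern built from relative positions, the sketch below slides a fixed set of offsets over a gridded gravity map and picks the start cell whose map values best agree with the measured values. This is a simplified stand-in, not the RPCM algorithm itself: the grid representation, the mean squared difference criterion, the sample values, and the name `match_trajectory` are all assumptions.

```python
def match_trajectory(gravity_map, offsets, measured):
    """Find the start cell that best explains the measured gravity values.

    gravity_map: list of rows of gravity anomaly values (hypothetical grid).
    offsets: (d_row, d_col) of each trajectory point relative to the start,
             as would be derived from INS-indicated relative positions.
    measured: gravity value measured at each trajectory point.
    Returns (cost, (row, col)) for the best start cell, or None if the
    pattern does not fit anywhere on the map.
    """
    rows, cols = len(gravity_map), len(gravity_map[0])
    best = None
    for r in range(rows):
        for c in range(cols):
            cells = [(r + dr, c + dc) for dr, dc in offsets]
            # Skip start cells that push the pattern off the map.
            if any(not (0 <= rr < rows and 0 <= cc < cols) for rr, cc in cells):
                continue
            # Mean squared difference (an assumed matching criterion).
            cost = sum(
                (gravity_map[rr][cc] - g) ** 2
                for (rr, cc), g in zip(cells, measured)
            ) / len(measured)
            if best is None or cost < best[0]:
                best = (cost, (r, c))
    return best


# Made-up map values; the trajectory is deliberately not a straight line.
gravity_map = [
    [10, 12, 19, 31],
    [14, 21, 25, 40],
    [33, 28, 47, 52],
    [60, 55, 58, 71],
]
offsets = [(0, 0), (1, 1), (1, 2)]
measured = [14, 28, 47]
result = match_trajectory(gravity_map, offsets, measured)
```

Because only the offsets between points enter the pattern, a constant error in the absolute INS position does not change its shape, which is the intuition behind the cancellation described in the abstract.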


2016 ◽  
Vol 12 (4) ◽  
pp. 21-44 ◽  
Author(s):  
R. Hema ◽  
T. V. Geetha

The two main challenges in chemical entity recognition are: (i) new chemical compounds are constantly being synthesized; (ii) chemical representation is highly ambiguous, since a single chemical entity may be described by different nomenclatures. The identification and maintenance of chemical terminologies is therefore a difficult task. Since most existing text mining methods follow term-based approaches, the problems of polysemy and synonymy arise. A Named Entity Recognition (NER) system based on pattern matching in the chemical domain is therefore developed to extract chemical entities from chemical documents. The TF-IDF and PMI association measures are used to filter out non-chemical terms. An F-score of 92.19% is achieved for chemical NER. The proposed method is compared with a baseline method and other existing approaches. As the final step, the filtered chemical entities are classified into sixteen functional groups. The classification uses an SVM one-against-all multiclass approach and achieves an accuracy of 87%. One-way ANOVA is used to compare the quality of the pattern matching method with that of other existing chemical NER methods.
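
The two association measures named in the abstract can be sketched as follows. The abstract does not say how the authors compute or threshold them, so the document-level co-occurrence counts, the logarithm bases, and the names `tf_idf` and `pmi` are assumptions, and the tiny corpus is made up.

```python
import math


def tf_idf(term, doc, docs):
    """TF-IDF of a term in one tokenized document, given the whole corpus."""
    if not doc:
        return 0.0
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in docs if term in d)
    return tf * math.log(len(docs) / df) if df else 0.0


def pmi(x, y, docs):
    """Pointwise mutual information of two terms from document co-occurrence."""
    n = len(docs)
    p_x = sum(1 for d in docs if x in d) / n
    p_y = sum(1 for d in docs if y in d) / n
    p_xy = sum(1 for d in docs if x in d and y in d) / n
    if p_xy == 0:
        return float("-inf")  # the two terms never co-occur
    return math.log2(p_xy / (p_x * p_y))


# Made-up tokenized documents, for illustration only.
docs = [
    ["ethanol", "reacts", "with", "sodium"],
    ["sodium", "chloride", "dissolves"],
    ["the", "reaction", "with", "water"],
]
rare_score = tf_idf("ethanol", docs[0], docs)
common_score = tf_idf("with", docs[0], docs)
association = pmi("sodium", "chloride", docs)
```

A term that appears in few documents, such as `ethanol` here, scores higher than a widespread word such as `with`, and a positive PMI indicates that two terms co-occur more often than chance would predict. A filter could then drop candidates that fall below chosen thresholds on these scores.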


2003 ◽  
Vol 24 (2) ◽  
pp. 449-466 ◽  
Author(s):  
Guoya Dong ◽  
Richard H Bayford ◽  
Shangkai Gao ◽  
Yoshifuru Saito ◽  
Rebecca Yerworth ◽  
...  

2008 ◽  
Author(s):  
Dae-Jin Park ◽  
Jinyoung Choi ◽  
Hyoungsoon Yune ◽  
Jaeseung Choi ◽  
Cheolkyun Kim ◽  
...  

AIChE Journal ◽  
2010 ◽  
Vol 57 (3) ◽  
pp. 671-694 ◽  
Author(s):  
Yaohua He ◽  
Chi-Wai Hui

2006 ◽  
Vol 05 (04) ◽  
pp. 337-343
Author(s):  
Nadia Nedjah ◽  
Luiza De Macedo Mourelle

Pattern matching is essential in many applications such as information retrieval, logic programming, theorem proving, term rewriting and DNA computing. It usually breaks down into two categories: root and complete pattern matching. Root matching determines whether a subject term is an instance of a pattern in a pattern set, while complete matching determines whether a subject term contains a sub-term that is an instance of a pattern in a pattern set. For the sake of efficiency, root pattern matching needs to be deterministic and lazy. Furthermore, complete pattern matching also needs to be parallel. Unlike root pattern matching, complete matching has received little interest from researchers in the field. In this paper, we present a novel deterministic multi-threaded complete matching method. This method subsumes a deterministic lazy root matching technique that was developed by the authors in an earlier work. We evaluate the performance of the proposed method using theorem-proving and DNA-computing applications.
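
The distinction between root and complete matching can be made concrete with a small sketch. It is a naive sequential version, not the deterministic, lazy, or multi-threaded method the abstract describes. The term representation (tuples of the form `(symbol, arg, ...)`, constants as strings, variables as strings starting with `?`) and all function names are assumptions.

```python
def match(pattern, term, bindings=None):
    """Return variable bindings if term is an instance of pattern, else None."""
    bindings = {} if bindings is None else bindings
    if isinstance(pattern, str) and pattern.startswith("?"):
        # A variable matches anything, but repeated variables must agree.
        if pattern in bindings:
            return bindings if bindings[pattern] == term else None
        return {**bindings, pattern: term}
    if isinstance(pattern, str) or isinstance(term, str):
        return bindings if pattern == term else None
    # Both are compound terms: same symbol and same number of arguments.
    if len(pattern) != len(term) or pattern[0] != term[0]:
        return None
    for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
        bindings = match(sub_pattern, sub_term, bindings)
        if bindings is None:
            return None
    return bindings


def root_match(patterns, term):
    """True if the term itself is an instance of some pattern."""
    return any(match(p, term) is not None for p in patterns)


def subterms(term):
    """Yield the term and all of its sub-terms, top-down."""
    yield term
    if isinstance(term, tuple):
        for arg in term[1:]:
            yield from subterms(arg)


def complete_match(patterns, term):
    """True if the term contains a sub-term that is an instance of a pattern."""
    return any(root_match(patterns, s) for s in subterms(term))


patterns = [("plus", "?x", "zero")]
subject = ("succ", ("plus", "a", "zero"))
at_root = root_match(patterns, subject)
anywhere = complete_match(patterns, subject)
```

For this subject term, `at_root` is false because the outermost symbol is `succ`, while `anywhere` is true because the inner `plus` term is an instance of the pattern. Each sub-term can be checked independently of the others, which is what makes complete matching a natural candidate for parallel execution.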

