dictionary matching
Recently Published Documents


TOTAL DOCUMENTS

103
(FIVE YEARS 25)

H-INDEX

14
(FIVE YEARS 1)

2021 ◽  
Author(s):  
Arslan Erdengasileng ◽  
Keqiao Li ◽  
Qing Han ◽  
Shubo Tian ◽  
Jian Wang ◽  
...  

Identification and indexing of chemical compounds in full-text articles are essential steps in biomedical article categorization, information extraction, and biological text mining. The BioCreative Challenge was established to evaluate methods for biological text mining and information extraction. Track 2 of BioCreative VII (summer 2021) consists of two subtasks: chemical identification and chemical indexing in full-text PubMed articles. The chemical identification subtask itself has two parts: chemical named entity recognition (NER) and chemical normalization. In this paper, we present our work on developing a hybrid pipeline for chemical named entity recognition, chemical normalization, and chemical indexing in full-text PubMed articles. Specifically, we applied BERT-based methods for chemical NER and chemical indexing, and a sieve-based dictionary matching method for chemical normalization. For subtask 1, we used PubMedBERT with data augmentation on the chemical NER task. Several chemical-MeSH dictionaries, including MeSH.XML, SUPP.XML, MRCONSO.RRF, and PubTator chemical annotations, were applied in a specific order to get the best performance on chemical normalization. We achieved F1 scores of 0.86 and 0.7668 on chemical NER and chemical normalization, respectively. For subtask 2, we formulated the task as a binary prediction problem for each individual chemical compound name. We then used a BERT-based model with engineered features and achieved a strict F1 score of 0.4825 on the test set, which is substantially higher than the median F1 score (0.3971) of all the submissions.
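
The sieve idea in this abstract (consulting several dictionaries in a fixed order and accepting the first hit) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `normalize_mention`, the toy dictionaries, and the `TOY:` identifiers are all hypothetical, and the only preprocessing assumed here is lowercasing and whitespace trimming.

```python
from typing import Dict, List, Optional


def normalize_mention(mention: str, sieves: List[Dict[str, str]]) -> Optional[str]:
    """Return the identifier from the first dictionary (sieve) containing the mention.

    The sieves are tried in the order given, so earlier dictionaries take
    priority over later ones. Returns None when no dictionary matches.
    """
    key = mention.strip().lower()
    for dictionary in sieves:
        if key in dictionary:
            return dictionary[key]
    return None


# Hypothetical toy dictionaries with placeholder identifiers (not real MeSH IDs).
primary = {"aspirin": "TOY:0001"}
supplementary = {"acetylsalicylic acid": "TOY:0001", "aspirin": "TOY:9999"}

sieves = [primary, supplementary]
print(normalize_mention(" Aspirin ", sieves))             # found in the first sieve
print(normalize_mention("acetylsalicylic acid", sieves))  # falls through to the second
print(normalize_mention("unobtainium", sieves))           # no sieve matches
```

Because the first match wins, the order of the list encodes which source is trusted most, which is consistent with the abstract's remark that the dictionaries are used in a specific order.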


Algorithmica ◽  
2021 ◽  
Author(s):  
Paweł Gawrychowski ◽  
Tatiana Starikovskaya

Mathematics ◽  
2021 ◽  
Vol 9 (13) ◽  
pp. 1549
Author(s):  
Viktor Vegh ◽  
Shahrzad Moinian ◽  
Qianqian Yang ◽  
David C. Reutens

Mathematical models are becoming increasingly important in magnetic resonance imaging (MRI), as they provide a mechanistic approach for making a link between tissue microstructure and signals acquired using the medical imaging instrument. The Bloch equations, which describe spin and relaxation in a magnetic field, are a set of integer order differential equations with a solution exhibiting mono-exponential behaviour in time. Parameters of the model may be estimated using a non-linear solver or by creating a dictionary of model parameters from which MRI signals are simulated and then matched with experiment. We have previously shown the potential efficacy of a magnetic resonance fingerprinting (MRF) approach, i.e., dictionary matching based on the classical Bloch equations, for parcellating the human cerebral cortex. However, this classical model is unable to describe in full the mm-scale MRI signal generated by a heterogeneous and complex tissue micro-environment. The time-fractional order Bloch equations have been shown to provide, as a function of time, a good fit of brain MRI signals. The time-fractional model has solutions in the form of Mittag–Leffler functions that generalise conventional exponential relaxation. Such functions have been shown by others to be useful for describing dielectric and viscoelastic relaxation in complex heterogeneous materials. Hence, we replaced the integer order Bloch equations with the previously reported time-fractional counterpart within the MRF framework and performed experiments to parcellate human gray matter, which consists of cortical brain tissue with different cyto-architecture at different spatial locations. Our findings suggest that the time-fractional order parameters, α and β, potentially associate with the effect of interareal architectonic variability, which hypothetically results in more accurate cortical parcellation.
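
The dictionary-matching step described in this abstract (simulate signals over a grid of model parameters, then pick the entry that best matches the measurement) can be sketched as follows. This is a deliberately simplified illustration, not the authors' MRF implementation: it assumes a single mono-exponential decay with one hypothetical parameter `t2` in place of a full Bloch simulation, and it matches by the largest normalised inner product, which is a common choice but is an assumption here.

```python
import math
from typing import Dict, List


def simulate_signal(t2: float, times: List[float]) -> List[float]:
    """Mono-exponential decay exp(-t / t2), standing in for a full signal model."""
    return [math.exp(-t / t2) for t in times]


def normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit Euclidean norm so matching ignores overall amplitude."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def build_dictionary(t2_values: List[float], times: List[float]) -> Dict[float, List[float]]:
    """Map each candidate parameter value to its normalised simulated signal."""
    return {t2: normalize(simulate_signal(t2, times)) for t2 in t2_values}


def match(signal: List[float], dictionary: Dict[float, List[float]]) -> float:
    """Return the parameter whose dictionary entry has the largest inner product."""
    s = normalize(signal)
    return max(dictionary, key=lambda t2: sum(a * b for a, b in zip(s, dictionary[t2])))


# Hypothetical sampling times and parameter grid (arbitrary units).
times = [10.0, 20.0, 40.0, 80.0, 160.0]
dictionary = build_dictionary([40.0, 60.0, 80.0, 100.0], times)

# A noiseless "measurement" with unknown amplitude; matching recovers its parameter.
measured = [3.0 * x for x in simulate_signal(80.0, times)]
print(match(measured, dictionary))
```

Swapping `simulate_signal` for a function based on the time-fractional model would leave the matching step unchanged, with the dictionary indexed by additional parameters such as α and β.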


2021 ◽  
Author(s):  
Ragnhildur Bjarnadottir ◽  
David Lindberg ◽  
Avirup Chakraborty ◽  
Mattia Prosperi ◽  
Marsha Crane ◽  
...  

BACKGROUND Around 1 million patients fall in US hospitals annually, with an associated direct medical cost of $50 billion. Substantial nationwide efforts to reduce hospital falls in the past decade have yet to produce sustained results. This may in part be due to limited understanding of fall risk factors and overreliance on limited data. OBJECTIVE The purpose of our study was to explore text patterns associated with patient falls in a previously understudied data source: registered nurses’ electronic health record progress notes. METHODS This study employed supervised and unsupervised text-mining methods using data from medical/surgical units in a large academic health center in North Florida between 2013 and 2015. The data corpus consisted of registered nurses’ progress notes for patient cases who fell during their hospitalization and patient controls who were at risk during the same period but did not fall. RESULTS The analytical sample comprised 107,842 progress notes for 2,171 patients (734 fallers and 1,437 non-fallers who had registered nurses’ progress notes documented during their stay). Supervised text-mining with dictionary matching revealed significantly more frequent documentation of cognitive patient factors and environmental factors in fallers’ progress notes compared to non-fallers. Unsupervised text-mining through topic modelling highlighted text patterns indicative of workflow or communication factors. Predictive models for both supervised and unsupervised text-mining features were developed, with F1 scores ranging from 0.184 to 0.591. CONCLUSIONS Findings of this study indicate that registered nurses’ progress notes contain factors associated with risk of falling that may not be captured in structured data. These include environmental factors, cognitive patient factors, and factors related to documentation practices. The findings highlight previously under-examined risk factors for hospital-acquired falls and can be used for hypothesis generation for further clinical research to prevent falls and improve patient safety at the bedside.
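
As a rough illustration of what dictionary matching over free-text notes can look like, the sketch below counts how often terms from each category of a small term dictionary appear in a note. Everything in it is hypothetical: the category names echo the abstract, but the terms, the function name `count_category_mentions`, and the whole-word matching rule are assumptions rather than the study's actual dictionary or method.

```python
import re
from collections import Counter
from typing import Dict, Set

# Hypothetical term dictionary; the study's real dictionary is not given in the abstract.
CATEGORY_TERMS: Dict[str, Set[str]] = {
    "cognitive": {"confused", "disoriented", "agitated"},
    "environmental": {"bed alarm", "call light", "clutter"},
}


def count_category_mentions(note: str, category_terms: Dict[str, Set[str]]) -> Counter:
    """Count case-insensitive whole-word matches of each category's terms in a note."""
    text = note.lower()
    counts: Counter = Counter()
    for category, terms in category_terms.items():
        for term in terms:
            # \b anchors keep "confused" from matching inside a longer word.
            pattern = r"\b" + re.escape(term) + r"\b"
            counts[category] += len(re.findall(pattern, text))
    return counts


note = "Patient confused and agitated overnight; bed alarm on, call light within reach."
print(count_category_mentions(note, CATEGORY_TERMS))
```

Per-note counts like these could then be aggregated per patient and compared between groups or passed to a predictive model as features.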


Algorithmica ◽  
2021 ◽  
Author(s):  
Panagiotis Charalampopoulos ◽  
Tomasz Kociumaka ◽  
Manal Mohamed ◽  
Jakub Radoszewski ◽  
Wojciech Rytter ◽  
...  

2021 ◽  
pp. 1-10
Author(s):  
Wang Dong ◽  
Zhao Yong ◽  
Lin Hong ◽  
Zuo Xin

Chinese fill-in-the-blank questions contain both objective and subjective characteristics, and thus it has always been difficult to score them automatically. In this paper, fill-in-the-blank items are divided into those with word-level or sentence-level granularity; then, the items are automatically scored by different strategies. The automatic scoring framework combines semantic dictionary matching and semantic similarity calculations. First, fill-in-the-blank items with word-level granularity are divided into two types of test sites: the subject term test site and the common word test site. We propose an algorithm for identifying an item’s test site. Then, a subject term dictionary with self-feedback learning ability is constructed to support the scoring of subject term test sites. The Tongyici Cilin semantic dictionary is used for scoring common word test sites. For fill-in-the-blank items with sentence-level granularity, an improved P-means model is used to generate sentence vectors for the standard answer and the examinee’s answer, and then the semantic similarity between the two answers is obtained by calculating the cosine distance between the two sentence vectors. Experimental results on actual test data show that the proposed algorithm has a maximum accuracy of 94.3% and achieves good results.
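
The two scoring routes named in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the paper's algorithm: the synonym groups are a hypothetical stand-in for a semantic dictionary such as Tongyici Cilin, full credit for a synonym is an assumed scoring rule, and the sentence vectors are taken as given instead of being produced by a P-means model. The sketch computes cosine similarity; the corresponding cosine distance is one minus this value.

```python
import math
from typing import List, Set


def score_word_answer(answer: str, reference: str, synonym_groups: List[Set[str]]) -> float:
    """Give full credit for an exact match or for a synonym in the same group."""
    if answer == reference:
        return 1.0
    for group in synonym_groups:
        if answer in group and reference in group:
            return 1.0
    return 0.0


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical synonym groups (English toy words for readability).
groups = [{"quick", "fast", "rapid"}, {"big", "large"}]
print(score_word_answer("fast", "quick", groups))  # synonym in the same group
print(score_word_answer("slow", "quick", groups))  # no shared group

# Hypothetical sentence vectors for a standard answer and an examinee's answer.
print(cosine_similarity([1.0, 2.0, 0.0], [2.0, 4.0, 0.0]))
```

A sentence-level score could then be assigned by comparing the similarity against a threshold, though how the paper maps similarity to a score is not stated in the abstract.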

