Sequence Labeling: Latest Research Papers

A Statistical Language Model for Pre-Trained Sequence Labeling: A Case Study on Vietnamese

By defining a computable word segmentation unit and studying its probability characteristics, we establish an unsupervised statistical language model (SLM) for a new pre-trained sequence labeling framework. The proposed SLM is an optimization model whose objective is to maximize the total binding force of all candidate word segmentation units in a sentence, without annotated datasets or vocabularies. To solve the SLM, we design a recursive divide-and-conquer dynamic programming algorithm. We integrate the SLM with popular sequence labeling models and run Vietnamese word segmentation, part-of-speech tagging, and named entity recognition experiments. The results show that the SLM improves performance on these sequence labeling tasks. Using less than 10% of the training data and no dictionary, our sequence labeling framework outperforms the state-of-the-art Vietnamese word segmentation toolkit VnCoreNLP on the cross-dataset test. The SLM has no hyper-parameters to tune, is completely unsupervised, and is applicable to any other analytic language; it therefore has good domain adaptability.
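The abstract describes segmentation as choosing the split that maximizes a total score, solved by dynamic programming. The sketch below illustrates that general idea only. It is not the paper's algorithm: the abstract does not define the binding force, so `toy_score`, the `TOY_FORCE` table, and the `max_len` limit are hypothetical stand-ins.

```python
from functools import lru_cache
from typing import Callable, List, Tuple

Unit = Tuple[str, ...]


def segment(tokens: List[str],
            score: Callable[[Unit], float],
            max_len: int = 4) -> List[Unit]:
    """Split tokens into contiguous units so the sum of unit scores is maximal."""
    n = len(tokens)

    @lru_cache(maxsize=None)
    def best(i: int) -> Tuple[float, Tuple[int, ...]]:
        # Best (total score, cut positions) for the suffix tokens[i:].
        if i == n:
            return 0.0, ()
        candidates = []
        for j in range(i + 1, min(n, i + max_len) + 1):
            tail_score, tail_cuts = best(j)
            unit_score = score(tuple(tokens[i:j]))
            candidates.append((unit_score + tail_score, (j,) + tail_cuts))
        return max(candidates)

    _, cuts = best(0)
    units, start = [], 0
    for end in cuts:
        units.append(tuple(tokens[start:end]))
        start = end
    return units


# Hypothetical scores: multi-syllable units listed here bind strongly or weakly.
TOY_FORCE = {("sinh", "viên"): 2.0, ("đi", "học"): 0.3}


def toy_score(unit: Unit) -> float:
    # Assumption: single syllables get 0.5, unknown longer units get 0.0.
    return TOY_FORCE.get(unit, 0.5 if len(unit) == 1 else 0.0)


if __name__ == "__main__":
    print(segment(["sinh", "viên", "đi", "học"], toy_score))
```

With these toy scores, "sinh viên" is kept as one unit because its score (2.0) exceeds the sum of its two syllables (1.0), while "đi" and "học" stay separate because 0.3 is less than 1.0.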

Semantic Sequence Labeling Model of Power Dispatching Based on Deep Long Short Term Memory Network

Journal of Interconnection Networks ◽

10.1142/s0219265921430180 ◽

2021 ◽

Author(s):

Hu Feifei ◽

Zeng Shibo ◽

Hong Danke ◽

Zhang Situo ◽

Song Yongwei ◽

...

Keyword(s):

Decision Making ◽

Short Term Memory ◽

Mechanism Analysis ◽

Short Term ◽

Term Memory ◽

Control Center ◽

Fine Grained ◽

Sequence Labeling ◽

Long Short Term Memory ◽

Power Dispatching

As the decision-making brain of power system operation, grid regulation and operation is a comprehensive decision-making and control process that combines large amounts of data, mechanism analysis, operating procedures, and professional experience. This is closely aligned with the direction of the new generation of artificial intelligence, which is characterized by data-driven and knowledge-guided methods. However, current scheduling control still relies on experience and manual analysis. The massive and diverse data of the control center, together with the lack of logical models linking the plans, require control personnel to draw on a large amount of experience and knowledge association. The work involves much repetitive mental labor and a relatively low level of intelligence. Therefore, deep learning is applied to the learning of power control knowledge, and a semantic understanding network based on deep Long Short Term Memory is proposed. It uses sequence labeling to extract deep semantic information relating different keywords and query questions, and finds the key information in natural-language questions in order to achieve fine-grained and precise queries. Experiments show that the proposed network model is superior to previous methods: it achieves better performance in the joint extraction of fine-grained evaluation words and evaluation objects, extracts the key information and deep semantic information of query problems and corresponding cases, and realizes power scheduling based on voice interaction. The model can be effectively applied in the field of power dispatching and can solve a large number of problems in power dispatching and control.

Improving deep learning method for biomedical named entity recognition by using entity definition information

BMC Bioinformatics ◽

10.1186/s12859-021-04236-y ◽

2021 ◽

Vol 22 (S1) ◽

Author(s):

Ying Xiong ◽

Shuai Chen ◽

Buzhou Tang ◽

Qingcai Chen ◽

Xiaolong Wang ◽

...

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Named Entity Recognition ◽

The State ◽

Entity Recognition ◽

Biomedical Text ◽

Learning Methods ◽

Named Entity ◽

Sequence Labeling ◽

Biomedical Named Entity Recognition

Background: Biomedical named entity recognition (NER) is a fundamental task of biomedical text mining that finds the boundaries of entity mentions in biomedical text and determines their entity type. To accelerate the development of biomedical NER techniques in Spanish, the PharmaCoNER organizers launched a competition to recognize pharmacological substances, compounds, and proteins. Biomedical NER is usually treated as a sequence labeling task, and almost all state-of-the-art sequence labeling methods ignore the meaning of the different entity types. In this paper, we investigate methods for introducing the meaning of entity types into deep learning methods for biomedical NER and apply them to the PharmaCoNER 2019 challenge. The meaning of each entity type is represented by its definition information. Material and method: We investigate how to use entity definition information in two kinds of methods: (1) SQuAD-style machine reading comprehension (MRC) methods, which treat entity definition information as the query and biomedical text as the context, and predict answer spans as entities; (2) span-level one-pass (SOne) methods, which predict entity spans one type at a time and introduce the meaning of each entity type through its definition information. All models are trained and tested on the PharmaCoNER 2019 corpus, and their performance is evaluated by strict micro-averaged precision, recall, and F1-score. Results: Entity definition information improves both the SQuAD-style MRC and the SOne methods by about 0.003 in micro-averaged F1-score. The SQuAD-style MRC model using entity definition information as the query achieves the best performance, with a micro-averaged precision of 0.9225, a recall of 0.9050, and an F1-score of 0.9137. It outperforms the best model of the PharmaCoNER 2019 challenge by 0.0032 in F1-score. Compared with the state-of-the-art model that does not use manually crafted features, our model obtains a 1% improvement in F1-score, which is significant. These results indicate that entity definition information is useful for deep learning methods on biomedical NER. Conclusion: Our models enhanced with entity definition information achieve a state-of-the-art micro-averaged F1-score of 0.9137, which implies that entity definition information has a positive impact on biomedical NER. In the future, we will explore more entity definition information from knowledge graphs.
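The evaluation above uses strict micro-averaged precision, recall, and F1-score. The following is a minimal sketch of how such scores are commonly computed, assuming a prediction counts as correct only when its document, boundaries, and type all match a gold span exactly. The `Span` layout and the example labels `TYPE_A` and `TYPE_B` are hypothetical; this is not the official PharmaCoNER scorer.

```python
from typing import Set, Tuple

# (doc_id, start_offset, end_offset, entity_type); a hypothetical layout.
Span = Tuple[str, int, int, str]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def strict_micro_prf(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Pool spans over all documents and types; only exact matches count."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f1_score(precision, recall)


if __name__ == "__main__":
    gold = {("d1", 0, 9, "TYPE_A"), ("d1", 15, 22, "TYPE_B"),
            ("d2", 3, 11, "TYPE_A"), ("d2", 30, 41, "TYPE_A")}
    pred = {("d1", 0, 9, "TYPE_A"), ("d1", 15, 22, "TYPE_B"),
            ("d2", 3, 11, "TYPE_B"),   # right boundaries, wrong type
            ("d2", 30, 41, "TYPE_A")}
    print(strict_micro_prf(gold, pred))
    # Consistency check of the reported figures: P = 0.9225, R = 0.9050.
    print(round(f1_score(0.9225, 0.9050), 4))
```

The last line recomputes the F1-score from the reported precision and recall; it rounds to 0.9137, matching the value given in the abstract.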

Parsing Clinical Trial Eligibility Criteria for Cohort Query by a Multi-Input Multi-Output Sequence Labeling Model

10.1101/2021.11.18.21266533 ◽

2021 ◽

Author(s):

Shubo Tian ◽

Pengfei Yin ◽

Hansi Zhang ◽

Arslan Erdengasileng ◽

Jiang Bian ◽

...

Keyword(s):

Clinical Trial ◽

Clinical Trials ◽

Language Processing ◽

Training Dataset ◽

Free Text ◽

Eligibility Criteria ◽

Database Queries ◽

Output Sequence ◽

Sequence Labeling ◽

Electronic Screening

To enable electronic screening of eligible patients for clinical trials, free-text clinical trial eligibility criteria should be translated to a computable format. Natural language processing (NLP) techniques have the potential to automate this process. In this study, we explored a supervised multi-input multi-output (MIMO) sequence labeling model to parse eligibility criteria into combinations of fact and condition tuples. Our experiments on a small manually annotated training dataset showed that the MIMO framework with a BERT-based encoder using all the input sequences achieved an overall lenient-level AUROC of 0.61. Although the performance is suboptimal, representing eligibility criteria as logical and semantically clear tuples can potentially make the subsequent translation of these tuples into database queries more reliable.

FEE: An Event Extraction Model for Flood and Drought Disaster Based on Sequence Labeling

10.1109/crc52766.2021.9620138 ◽

2021 ◽

Author(s):

Zulong Ma ◽

Jiamin Lu ◽

Wei Wu ◽

Jun Feng

Keyword(s):

Event Extraction ◽

Sequence Labeling ◽

Drought Disaster ◽

Extraction Model

A data processing method based on sequence labeling and syntactic analysis for extracting new sentiment words from product reviews

Soft Computing ◽

10.1007/s00500-021-06228-9 ◽

2021 ◽

Author(s):

Shunxiang Zhang ◽

Hanqing Xu ◽

Guangli Zhu ◽

Xiang Chen ◽

KuanChing Li

Keyword(s):

Data Processing ◽

Processing Method ◽

Syntactic Analysis ◽

Product Reviews ◽

Sequence Labeling ◽

Data Processing Method

Feature Selection and Extraction in Sequence Labeling for Arrhythmia Detection

10.1109/balkancom53780.2021.9593247 ◽

2021 ◽

Author(s):

Minxiang Ye ◽

Vladimir Stankovic ◽

Lina Stankovic ◽

Srdjan Lulic ◽

Andras Anderla ◽

...

Keyword(s):

Feature Selection ◽

Arrhythmia Detection ◽

Sequence Labeling

Auxiliary Sequence Labeling Tasks for Disfluency Detection

10.21437/interspeech.2021-400 ◽

2021 ◽

Author(s):

Dongyub Lee ◽

Byeongil Ko ◽

Myeong Cheol Shin ◽

Taesun Whang ◽

Daniel Lee ◽

...

Keyword(s):

Sequence Labeling

A Data Processing Method Based on Sequence Labelling and Syntactic Analysis for Extracting New Sentiment Words from Product Reviews

10.21203/rs.3.rs-342296/v1 ◽

2021 ◽

Author(s):

Shunxiang Zhang ◽

Han qing Xu ◽

Guang li Zhu ◽

Xiang Chen ◽

Kuang Ching Li

Keyword(s):

Data Processing ◽

Information Service ◽

Recall Rate ◽

Processing Method ◽

Syntactic Analysis ◽

Product Reviews ◽

Sequence Labeling ◽

Data Processing Method ◽

Candidate Set ◽

Sentiment Word

New sentiment words in product reviews are valuable resources that directly reflect users' opinions. Processing data to extract new sentiment words can provide better information services for users and theoretical support for related research on edge computing. Traditional methods for extracting new sentiment words generally ignore context and syntactic information, which leads to low accuracy and recall. To tackle this issue, we propose a data processing method based on sequence labeling and syntactic analysis for extracting new sentiment words from product reviews. First, the probability that a new word is a sentiment word is calculated through location rules derived from the sequence labeling result, and a candidate set of new sentiment words is obtained according to this probability. Then, the candidate set is supplemented by matching appositive words based on edit distance. Finally, the final set of new sentiment words is collected through fine-grained filtering, which includes calculating Pointwise Mutual Information (PMI) and the difference coefficient of positive and negative corpus (DC-PNC). The experimental results demonstrate the effectiveness of the new sentiment words extracted by the proposed method, which clearly improve the accuracy and recall of sentiment analysis.
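Two of the building blocks named above, edit distance and PMI, have standard textbook definitions, sketched below. How the paper combines them (thresholds, the appositive matching rules, DC-PNC) is not stated in the abstract, so the function names, arguments, and example counts here are illustrative assumptions only.

```python
import math


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                    # delete ch_a
                            curr[j - 1] + 1,                # insert ch_b
                            prev[j - 1] + (ch_a != ch_b)))  # substitute or match
        prev = curr
    return prev[-1]


def pmi(count_xy: int, count_x: int, count_y: int, total: int) -> float:
    """Pointwise mutual information in bits: log2(p(x, y) / (p(x) * p(y))).

    All counts must be positive; count_xy == 0 would give minus infinity.
    """
    return math.log2(count_xy * total / (count_x * count_y))


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    # Hypothetical counts: candidate word x, seed sentiment word y, 1000 reviews.
    print(pmi(count_xy=10, count_x=20, count_y=50, total=1000))
```

A positive PMI means the candidate co-occurs with the seed word more often than independence would predict; a value near zero means the two look independent.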

Meta Self-training for Few-shot Neural Sequence Labeling

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining ◽

10.1145/3447548.3467235 ◽

2021 ◽

Author(s):

Yaqing Wang ◽

Subhabrata Mukherjee ◽

Haoda Chu ◽

Yuancheng Tu ◽

Ming Wu ◽

...

Keyword(s):

Sequence Labeling

sequence labeling
Recently Published Documents

A Statistical Language Model for Pre-Trained Sequence Labeling: A Case Study on Vietnamese

Semantic Sequence Labeling Model of Power Dispatching Based on Deep Long Short Term Memory Network

Improving deep learning method for biomedical named entity recognition by using entity definition information

Parsing Clinical Trial Eligibility Criteria for Cohort Query by a Multi-Input Multi-Output Sequence Labeling Model

FEE: An Event Extraction Model for Flood and Drought Disaster Based on Sequence Labeling

A data processing method based on sequence labeling and syntactic analysis for extracting new sentiment words from product reviews

Feature Selection and Extraction in Sequence Labeling for Arrhythmia Detection

Auxiliary Sequence Labeling Tasks for Disfluency Detection

A Data Processing Method Based on Sequence Labelling and Syntactic Analysis for Extracting New Sentiment Words from Product Reviews

Meta Self-training for Few-shot Neural Sequence Labeling
