character vector Latest Research Papers

The medical information carried in electronic medical records has high clinical research value, and medical named entity recognition is the key to extracting valuable information from large-scale medical texts. At present, most studies on Chinese medical named entity recognition are based on either a character vector model or a word vector model. Owing to the complexity and specificity of Chinese text, these methods may fail to achieve good performance. In this study, we propose a Chinese medical named entity recognition method that fuses character and word vectors. The method represents Chinese text as character vectors and word vectors separately and fuses the two in the model as features. The proposed model can effectively avoid both the information missing from character vectors and the inaccurate word segmentation that word vectors depend on. On the CCKS 2019 dataset for named entity recognition in Chinese electronic medical records, the proposed model achieves good performance and improves the accuracy of Chinese medical named entity recognition compared with the baseline models.
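
As a rough illustration of the fusion idea, the sketch below attaches to every character the vector of the word that contains it. The abstract does not say how the two vectors are combined, so plain concatenation is an assumption here, and the embedding tables, their dimensions, and the function name `fuse` are hypothetical.

```python
# Toy embedding tables; the values are made up for illustration only.
CHAR_VECS = {"头": [0.1, 0.2], "痛": [0.3, 0.4], "发": [0.5, 0.6], "热": [0.7, 0.8]}
WORD_VECS = {"头痛": [1.0, 0.0, 0.0], "发热": [0.0, 1.0, 0.0]}
CHAR_DIM, WORD_DIM = 2, 3


def fuse(segmented_words):
    """Return one fused vector per character: char vector + vector of its word.

    Concatenation is an assumed fusion operator, not the paper's stated one.
    """
    fused = []
    for word in segmented_words:
        # Unknown words fall back to a zero vector.
        word_vec = WORD_VECS.get(word, [0.0] * WORD_DIM)
        for ch in word:
            char_vec = CHAR_VECS.get(ch, [0.0] * CHAR_DIM)
            fused.append(char_vec + word_vec)  # list concatenation
    return fused


# "头痛" (headache) and "发热" (fever), already segmented into words.
vectors = fuse(["头痛", "发热"])
```

Each character keeps its own vector, so a segmentation error only corrupts the word half of the fused representation.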


Aspect-based sentiment analysis in Chinese based on mobile reviews for BiLSTM-CRF

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-192078 ◽ 2021 ◽ pp. 1-11

Author(s): Ya Lin Miao ◽ Wen Fang Cheng ◽ Yi Chun Ji ◽ Shun Zhang ◽ Yan Long Kong

Keyword(s): Sentiment Analysis ◽ Conditional Random Fields ◽ Short Term Memory ◽ Recognition Rate ◽ Recall Rate ◽ Short Term ◽ Long Short Term Memory ◽ Improved Model ◽ Character Vector ◽ Precision Rate

To address the low recognition rate of aspect-based sentiment analysis in Chinese, which is caused by its many processing steps, this paper proposes an improved BiLSTM-CRF model that combines Chinese character vectors with Chinese word position features. The model extracts attribute words and sentiment words jointly while also judging the polarity of the sentiment words. Experiments on a self-built dataset of 6357 mobile reviews show that, compared with the Conditional Random Fields (CRF) model and the Long Short Term Memory (LSTM) model, the improved model raises the precision rate by 9.2% to 13.32%, the recall rate by 0.48% to 21.29%, and the F-measure by 7.33% to 15.74%. The results show that the model improves the accuracy of aspect-based sentiment analysis and can effectively obtain the information users need from evaluation texts.
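
The precision rate, recall rate, and F-measure reported above can be computed from tagged sequences roughly as follows. This is a minimal sketch, not the authors' evaluation script: it assumes BIO tagging, exact span matching, and the hypothetical labels `ASP` (attribute word) and `SEN` (sentiment word).

```python
def extract_spans(tags):
    """Collect (label, start, end) spans from a BIO tag sequence; end is exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        # Close the open span unless this tag continues it.
        if start is not None and tag != "I-" + label:
            spans.append((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return set(spans)


def precision_recall_f1(gold, pred):
    """Score predicted spans against gold spans using exact matches."""
    true_pos = len(gold & pred)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


gold_tags = ["B-ASP", "I-ASP", "O", "B-SEN", "I-SEN"]
pred_tags = ["B-ASP", "I-ASP", "O", "B-SEN", "O"]  # sentiment span cut short
p, r, f = precision_recall_f1(extract_spans(gold_tags), extract_spans(pred_tags))
```

Under exact matching, the truncated sentiment span counts as both a false positive and a false negative, which is why boundary errors are costly for this kind of metric.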


Chinese Named Entity Recognition Based on Character-Word Vector Fusion

Wireless Communications and Mobile Computing ◽ 10.1155/2020/8866540 ◽ 2020 ◽ Vol 2020 ◽ pp. 1-7

Author(s): Na Ye ◽ Xin Qin ◽ Lili Dong ◽ Xiang Zhang ◽ Kangkang Sun

Keyword(s): Short Term Memory ◽ Conditional Random Field ◽ Named Entity Recognition ◽ Entity Recognition ◽ Short Term ◽ Term Memory ◽ Named Entity ◽ The Neural Network ◽ Long Short Term Memory ◽ Character Vector

Because Chinese text lacks explicit markers that define word boundaries, identifying named entities is often more difficult in Chinese than in English. At present, Chinese named entity recognition models are trained on input preprocessed with either a character vector model or a word vector model. Taking character vectors as the input of the neural network discards the words’ semantic meanings and explicit boundary information, whereas taking word vectors as the input relies on the accuracy of the segmentation algorithm. To address these problems, this paper proposes CWVF-BiLSTM-CRF (Character Word Vector Fusion-Bidirectional Long-Short Term Memory Networks-Conditional Random Field), a Chinese named entity recognition model based on character-word vector fusion. First, Word2Vec is used to obtain dictionaries that map characters to character vectors and words to word vectors. Second, the fused character-word vector is used as the input unit of the BiLSTM (Bidirectional Long-Short Term Memory) network, and the CRF (conditional random field) then resolves the problem of unreasonable tag sequences. The presented model reduces the dependence on the accuracy of the word segmentation algorithm and makes effective use of the words’ semantic characteristics. The experimental results show that the model based on character-word vector fusion improves Chinese named entity recognition.
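
To show why a CRF layer helps with unreasonable tag sequences, the sketch below runs Viterbi decoding over per-character tag scores. It is a simplification under stated assumptions: a trained CRF learns its transition scores, whereas here they are hand-set to 0 (allowed) or negative infinity (forbidden), and the tag set and emission scores are made up.

```python
TAGS = ["O", "B-PER", "I-PER"]
NEG_INF = float("-inf")


def allowed(prev, curr):
    """In BIO tagging, I-X may only follow B-X or I-X."""
    if curr.startswith("I-"):
        return prev in ("B-" + curr[2:], "I-" + curr[2:])
    return True


def viterbi(emissions):
    """emissions: one {tag: score} dict per character. Returns the best valid path."""
    # An I- tag cannot open a sequence.
    score = {t: (NEG_INF if t.startswith("I-") else emissions[0][t]) for t in TAGS}
    backptrs = []
    for em in emissions[1:]:
        new_score, ptr = {}, {}
        for curr in TAGS:
            # Best predecessor among those allowed to precede `curr`.
            best_prev = max(TAGS, key=lambda p: score[p] if allowed(p, curr) else NEG_INF)
            base = score[best_prev] if allowed(best_prev, curr) else NEG_INF
            new_score[curr] = base + em[curr]
            ptr[curr] = best_prev
        backptrs.append(ptr)
        score = new_score
    # Follow the back-pointers from the best final tag.
    path = [max(TAGS, key=lambda t: score[t])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    return path[::-1]


emissions = [
    {"O": 1.0, "B-PER": 0.2, "I-PER": 0.1},
    {"O": 0.1, "B-PER": 0.2, "I-PER": 2.0},
]
greedy = [max(em, key=em.get) for em in emissions]  # picks each tag independently
decoded = viterbi(emissions)
```

Picking the highest-scoring tag per character yields `O` followed by `I-PER`, which is not a valid BIO sequence; decoding over whole sequences returns a valid path instead.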


Encoder–Decoder Couplet Generation Model Based on ‘Trapezoidal Context’ Character Vector

The Computer Journal ◽ 10.1093/comjnl/bxaa048 ◽ 2020

Author(s): Rui Gao ◽ Yuanyuan Zhu ◽ Mingye Li ◽ Shoufeng Li ◽ Xiaohu Shi

Keyword(s): Real Data ◽ Vector Model ◽ Generation Model ◽ First Line ◽ Sequence Context ◽ Data Set ◽ Sequence Generation ◽ Word Context ◽ Proposed Model ◽ Character Vector

This paper studies a couplet generation model that automatically generates the second line of a couplet given the first line. Unlike other sequence generation problems, couplet generation must consider not only the sequential context within a line but also the relationships between corresponding words of the first and second lines. Therefore, a trapezoidal-context character embedding vector model is first developed, which considers the ‘sequence context’ and the ‘corresponding word context’ simultaneously. We then adopt the typical encoder–decoder framework for sequence-to-sequence problems, with a bidirectional GRU as the encoder and a GRU as the decoder. To further increase the semantic consistency of the first and second lines, the pre-trained sentence vector of the first line is added to the attention mechanism of the model. To verify its effectiveness, the method is applied to a real data set. Experimental results show that the proposed model is competitive with current methods, and that both adding sentence vectors to attention and using trapezoidal-context character vectors improve the effectiveness of the algorithm.
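
One way a sentence vector could enter an attention step is sketched below. The abstract does not specify the mechanism, so everything here is an assumption made for illustration: dot-product scoring, element-wise addition of the sentence vector to the decoder state, and the function name `attend`.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(encoder_states, decoder_state, sentence_vec):
    """Dot-product attention whose query is decoder state + sentence vector (assumed)."""
    query = [d + s for d, s in zip(decoder_state, sentence_vec)]
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: attention-weighted sum of the encoder states.
    context = [sum(w * state[k] for w, state in zip(weights, encoder_states))
               for k in range(dim)]
    return weights, context


encoder_states = [[1.0, 0.0], [0.0, 1.0]]  # one state per first-line character
decoder_state = [0.0, 0.0]
weights, context = attend(encoder_states, decoder_state, [2.0, 0.0])
plain_weights, _ = attend(encoder_states, decoder_state, [0.0, 0.0])
```

With a zero sentence vector the two encoder states receive equal weight; the non-zero sentence vector shifts attention toward the state it is aligned with.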
