Improving the accuracy using pre-trained word embeddings on deep neural networks for Turkish text classification

2020, Vol 541, pp. 123288
Author(s): Murat Aydoğan, Ali Karci

2021, Vol 16 (1), pp. 1-23
Author(s): Keyu Yang, Yunjun Gao, Lei Liang, Song Bian, Lu Chen, ...

Text classification is a fundamental task in content analysis. Deep learning has demonstrated promising performance in text classification compared with shallow models. However, almost no existing models take advantage of human knowledge to help text classification, even though humans are better than machine learning models at understanding and capturing the implicit semantic information in text. In this article, we use guidance from humans to classify text. We propose Crowd-powered learning for Text Classification (CrowdTC for short). We design questions and post them on a crowdsourcing platform to extract keywords from text. Sampling and clustering techniques are used to reduce the cost of crowdsourcing. We also present an attention-based neural network and a hybrid neural network that incorporate the extracted keywords as human guidance into deep neural networks. Extensive experiments on public datasets confirm that CrowdTC improves the text classification accuracy of neural networks by using the crowd-powered keyword guidance.
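
The idea of feeding extracted keywords into an attention mechanism can be illustrated with a small sketch. This is not the CrowdTC architecture, which the abstract does not specify; the function name `keyword_attention_pool`, the `boost` parameter, the additive scoring rule, and the toy vectors are all assumptions made for illustration. Tokens marked as keywords simply receive a higher score before the softmax, so they contribute more to the pooled text representation.

```python
import math

def keyword_attention_pool(tokens, vectors, keywords, boost=2.0):
    """Pool token vectors into one vector using softmax attention weights.

    Tokens found in `keywords` get `boost` added to their score before the
    softmax, so they dominate the pooled vector. The scoring rule is an
    illustrative assumption, not the method from the paper.
    """
    scores = [boost if t in keywords else 0.0 for t in tokens]
    max_s = max(scores)                           # subtract max for numerical stability
    exps = [math.exp(s - max_s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return weights, pooled

# Toy input: four tokens with made-up 2-d vectors and two crowd-chosen keywords.
tokens = ["the", "battery", "drains", "fast"]
vectors = [[0.1, 0.0], [0.9, 0.2], [0.4, 0.8], [0.3, 0.5]]
weights, pooled = keyword_attention_pool(tokens, vectors, {"battery", "drains"})
```

In a trained network the scores would be learned rather than fixed, and the keyword signal would be one input among several; the sketch only shows how human-selected keywords could shift attention weights.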


2017, Vol 23 (5), pp. 322-327
Author(s): Hwiyeol Jo, Jin-Hwa Kim, Kyung-Min Kim, Jeong-Ho Chang, Jae-Hong Eom, ...

2018, Vol 8 (7), pp. 1206
Author(s): Aurelia Bustos, Antonio Pertusa

Interventional cancer clinical trials are generally too restrictive, and some patients are often excluded on the basis of comorbidity, past or concomitant treatments, or the fact that they are over a certain age. The efficacy and safety of new treatments for patients with these characteristics are, therefore, not defined. In this work, we built a model to automatically predict whether short clinical statements were considered inclusion or exclusion criteria. We used protocols from cancer clinical trials that were available in public registries from the last 18 years to train word embeddings, and we constructed a dataset of 6M short free-text statements labeled as eligible or not eligible. A text classifier was trained using deep neural networks, with pre-trained word embeddings as inputs, to predict whether or not short free-text statements describing clinical information were considered eligible. We additionally analyzed the semantic reasoning of the word-embedding representations obtained and were able to identify equivalent treatments for a type of tumor by analogy with the drugs used to treat other tumors. We show that representation learning using deep neural networks can be successfully leveraged to extract medical knowledge from clinical trial protocols, potentially assisting practitioners when prescribing treatments.
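
The pipeline of pre-trained word embeddings feeding a classifier of short statements can be sketched in miniature. Everything concrete here is an assumption for illustration: the 2-d vectors in `EMB` are invented (real embeddings would be trained on the trial protocols), the four training statements are made up, and a single logistic unit stands in for the deep neural networks used in the paper. Labels follow the abstract's scheme, with 1 for eligible and 0 for not eligible.

```python
import math

# Hypothetical 2-d "pre-trained" vectors, invented for this sketch.
EMB = {
    "prior": [0.6, 0.1], "chemotherapy": [0.4, 0.3], "allowed": [0.8, 0.2],
    "brain": [-0.3, 0.2], "metastases": [-0.6, 0.1], "excluded": [-0.9, -0.1],
    "adequate": [0.7, -0.2], "renal": [0.2, 0.1], "function": [0.3, 0.0],
    "pregnant": [-0.5, 0.3], "patients": [0.0, 0.1],
}
DIM = 2

def embed(text):
    """Average the vectors of known words; unknown words are skipped."""
    vecs = [EMB[w] for w in text.lower().split() if w in EMB]
    if not vecs:
        return [0.0] * DIM
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(text, w, b):
    """Return the estimated probability that a statement is labeled eligible."""
    x = embed(text)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train(examples, epochs=200, lr=0.5):
    """Fit a logistic unit with per-example gradient descent on log loss."""
    w, b = [0.0] * DIM, 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = embed(text)
            err = predict(text, w, b) - label     # gradient of log loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Made-up training statements: 1 = eligible, 0 = not eligible.
TRAIN = [
    ("prior chemotherapy allowed", 1),
    ("brain metastases excluded", 0),
    ("adequate renal function", 1),
    ("pregnant patients excluded", 0),
]
w, b = train(TRAIN)
```

Calling `predict("prior treatment allowed", w, b)` scores a new statement; the word "treatment" is not in the toy vocabulary, so only the other two words contribute to its averaged vector.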


Author(s): Jinjing Shi, Zhenhuan Li, Wei Lai, Fangfang Li, Ronghua Shi, ...
