Improving the accuracy using pre-trained word embeddings on deep neural networks for Turkish text classification

Text classification is a fundamental task in content analysis. Deep learning has recently shown promising performance in text classification compared with shallow models. However, almost no existing models take advantage of human knowledge to help with text classification. Humans are better than machine learning models at understanding and capturing implicit semantic information in text. In this article, we use guidance from humans to classify text. We propose Crowd-powered learning for Text Classification (CrowdTC for short). We design questions and post them on a crowdsourcing platform to extract keywords from text. Sampling and clustering techniques are used to reduce the cost of crowdsourcing. We also present an attention-based neural network and a hybrid neural network that incorporate the extracted keywords as human guidance into deep neural networks. Extensive experiments on public datasets confirm that CrowdTC improves the text classification accuracy of neural networks by using the crowd-powered keyword guidance.
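The abstract does not say how the attention-based network uses the crowd-extracted keywords. One plausible reading, sketched below purely as an illustration, is to raise the pre-softmax attention score of tokens that the crowd marked as keywords and then pool the token vectors with the resulting weights. The function name `keyword_guided_attention`, the `bonus` parameter, and the toy tokens, vectors, and scores are all assumptions made for this sketch and are not taken from CrowdTC.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def keyword_guided_attention(tokens, token_vectors, base_scores, keywords, bonus=1.0):
    """Pool token vectors with attention weights biased toward keywords.

    Hypothetical scheme: tokens listed in `keywords` get `bonus` added to
    their attention score before the softmax. This is an assumption of the
    sketch, not the mechanism described in the paper.
    """
    scores = [
        score + (bonus if token in keywords else 0.0)
        for token, score in zip(tokens, base_scores)
    ]
    weights = softmax(scores)
    dim = len(token_vectors[0])
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, token_vectors))
        for i in range(dim)
    ]
    return weights, pooled


# Toy input: made-up tokens, 2-dimensional vectors, and uniform base scores.
tokens = ["the", "match", "ended", "late"]
token_vectors = [[0.1, 0.0], [0.9, 0.4], [0.3, 0.2], [0.0, 0.5]]
base_scores = [0.0, 0.0, 0.0, 0.0]
crowd_keywords = {"match"}

weights, pooled = keyword_guided_attention(
    tokens, token_vectors, base_scores, crowd_keywords
)
```

With uniform base scores, the keyword token "match" receives the largest weight, so the pooled vector that a downstream classifier would see is pulled toward that token's vector.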

Large-Scale Text Classification with Deep Neural Networks

KIISE Transactions on Computing Practices ◽ 10.5626/ktcp.2017.23.5.322 ◽ 2017 ◽ Vol 23 (5) ◽ pp. 322-327

Author(s): Hwiyeol Jo ◽ Jin-Hwa Kim ◽ Kyung-Min Kim ◽ Jeong-Ho Chang ◽ Jae-Hong Eom ◽ ...

Keyword(s): Neural Networks ◽ Text Classification ◽ Large Scale ◽ Deep Neural Networks

Learning Eligibility in Cancer Clinical Trials Using Deep Neural Networks

Applied Sciences ◽ 10.3390/app8071206 ◽ 2018 ◽ Vol 8 (7) ◽ pp. 1206 ◽ Cited By ~ 5

Author(s): Aurelia Bustos ◽ Antonio Pertusa

Keyword(s): Neural Networks ◽ Clinical Trials ◽ Deep Neural Networks ◽ Medical Knowledge ◽ Clinical Information ◽ Representation Learning ◽ Free Text ◽ Cancer Clinical Trials ◽ Word Embeddings ◽ New Treatments

Interventional cancer clinical trials are generally too restrictive, and some patients are often excluded on the basis of comorbidity, past or concomitant treatments, or the fact that they are over a certain age. The efficacy and safety of new treatments for patients with these characteristics are, therefore, not defined. In this work, we built a model to automatically predict whether short clinical statements were considered inclusion or exclusion criteria. We used protocols from cancer clinical trials that were available in public registries from the last 18 years to train word-embeddings, and we constructed a dataset of 6M short free-texts labeled as eligible or not eligible. A text classifier was trained using deep neural networks, with pre-trained word-embeddings as inputs, to predict whether or not short free-text statements describing clinical information were considered eligible. We additionally analyzed the semantic reasoning of the word-embedding representations obtained and were able to identify equivalent treatments for a type of tumor analogous with the drugs used to treat other tumors. We show that representation learning using deep neural networks can be successfully leveraged to extract the medical knowledge from clinical trial protocols for potentially assisting practitioners when prescribing treatments.
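The analogy-style reasoning over word embeddings mentioned in the abstract is commonly done with vector arithmetic followed by a cosine-similarity search. The sketch below shows that general technique only; the abstract does not state which procedure the authors used. The vocabulary entries (`tumor_a`, `drug_a`, and so on) are hypothetical placeholders, and the 3-dimensional vectors are made-up toy values, not embeddings trained on trial protocols.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(embeddings, a, b, c):
    """Answer "a is to b as c is to ?" by nearest neighbour to b - a + c.

    The three query words are excluded from the candidates, as is usual
    for this kind of lookup.
    """
    query = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    candidates = (w for w in embeddings if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(query, embeddings[w]))


# Toy embeddings (assumed values for illustration only).
toy_embeddings = {
    "tumor_a": [1.0, 0.0, 0.0],
    "drug_a": [1.0, 1.0, 0.0],
    "tumor_b": [0.0, 0.0, 1.0],
    "drug_b": [0.0, 1.0, 1.0],
    "unrelated": [1.0, 0.0, 1.0],
}

# "tumor_a is treated with drug_a; what plays that role for tumor_b?"
result = analogy(toy_embeddings, "tumor_a", "drug_a", "tumor_b")
print(result)  # drug_b
```

In this toy space the offset from `tumor_a` to `drug_a` is the same as the offset from `tumor_b` to `drug_b`, so the lookup recovers `drug_b`. Real embeddings are noisier, so in practice several top-ranked candidates are usually inspected rather than only the best one.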
