Research on text summarization classification based on crowdfunding projects

2021 ◽  
Vol 336 ◽  
pp. 06020
Author(s):  
Gang Zhou

In recent years, artificial intelligence technologies represented by deep learning and natural language processing have made major breakthroughs and have begun to emerge in the field of crowdfunding project analysis. Natural language processing technology enables machines to understand and analyze the text of crowdfunding projects and classify them based on the summary description of the project, which can help companies and individuals improve project approval rates, so it has received widespread attention. However, most current research applies these techniques to topic modeling of project texts; few studies have proposed effective solutions for classification prediction based on the abstracts of crowdfunding projects. Therefore, this paper proposes a sequence-enhanced capsule network model for this problem. Specifically, building on prior capsule network work, we propose connecting a BiGRU with CapsNet so that the model considers both the sequential semantic information and the spatial location information of the text. We apply the proposed method to the kickstarter-NLP dataset, and the experimental results show that our model achieves a good classification effect in this setting.
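A distinguishing component of the capsule network mentioned above is the "squash" nonlinearity, which rescales a capsule's output vector so its length lies in (0, 1) while its direction is preserved. A minimal pure-Python sketch of that function alone (illustrative; the paper's full BiGRU-CapsNet architecture with dynamic routing is not reproduced here):

```python
import math

def squash(s, eps=1e-9):
    """Capsule 'squash' nonlinearity: scale vector s so its length lies
    in (0, 1) while its direction stays unchanged."""
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm) + eps       # eps avoids division by zero
    scale = sq_norm / (1.0 + sq_norm)     # maps length into (0, 1)
    return [scale * x / norm for x in s]

v = squash([3.0, 4.0])   # input length 5 -> output length ~25/26, same direction
```

Because the output length stays strictly below 1, capsule lengths can be read as "presence" probabilities while the vector orientation still encodes the detected features.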

2018 ◽  
Vol 54 (3A) ◽  
pp. 64
Author(s):  
Nguyen Chi Hieu

Exact tagging of the words in a text is a very important task in natural language processing. It can support parsing the text, contribute to resolving polysemous words, help in accessing semantic information, etc. One crucial factor in POS (Part-of-Speech) tagging approaches based on statistical methods is the processing time. In this paper, we propose an approach to calculating a pruning threshold that can be applied within the Viterbi algorithm of a Hidden Markov Model for tagging texts in natural language processing. Experiments on 1,000,000 tagged words of the Wall Street Journal corpus showed that our proposed solution is satisfactory.
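The pruning idea can be sketched as a threshold-pruned Viterbi pass: at each time step, states whose path probability falls below a fraction of the current best are discarded before transitions are expanded, trading a little accuracy for speed. A hedged toy sketch with a two-tag HMM (the tags, probabilities, and `beam` value are illustrative, not those produced by the paper's threshold calculation):

```python
def pruned_viterbi(obs, states, start_p, trans_p, emit_p, beam=0.01):
    """Viterbi decoding with threshold pruning: at each step, drop any
    state whose path probability is below beam * (best probability)."""
    V = [{s: (start_p[s] * emit_p[s].get(obs[0], 1e-12), [s]) for s in states}]
    for t in range(1, len(obs)):
        best = max(p for p, _ in V[-1].values())
        # pruning step: keep only states close enough to the current best
        alive = {s: v for s, v in V[-1].items() if v[0] >= beam * best}
        cur = {}
        for s in states:
            p, path = max(
                (pp * trans_p[ps][s] * emit_p[s].get(obs[t], 1e-12), path + [s])
                for ps, (pp, path) in alive.items()
            )
            cur[s] = (p, path)
        V.append(cur)
    return max(V[-1].values())

states = ("N", "V")
start_p = {"N": 0.6, "V": 0.4}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit_p = {"N": {"dogs": 0.5, "bark": 0.1}, "V": {"dogs": 0.1, "bark": 0.6}}
prob, tags = pruned_viterbi(["dogs", "bark"], states, start_p, trans_p, emit_p)
```

With a tighter threshold (e.g. `beam=0.5`), the low-probability "V" state is pruned after the first word, so fewer transitions are evaluated; choosing that threshold well is exactly the problem the paper addresses.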


2021 ◽  
pp. 1-10
Author(s):  
Shuai Zhao ◽  
Fucheng You ◽  
Wen Chang ◽  
Tianyu Zhang ◽  
Man Hu

The BERT pre-trained language model has achieved good results in various subtasks of natural language processing, but its performance in generating Chinese summaries is not ideal. The most intuitive reason is that the BERT model is based on character-level composition, while Chinese text is mostly composed of phrases, so directly fine-tuning the BERT model cannot achieve the expected effect. This paper proposes a novel summary generation model in which BERT is augmented with a pooling layer. In our model, we perform an average pooling operation on token embeddings to improve the model's ability to capture phrase-level semantic information. We use the LCSTS and NLPCC2017 datasets to verify the proposed method. Experimental results show that introducing the average pooling layer effectively improves the quality of the generated summaries. Furthermore, comparative analysis shows that different datasets require different pooling kernel sizes to achieve the best results. In addition, our proposed method generalizes well: it can be applied not only to summary generation but also to other natural language processing tasks.
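The pooling idea can be illustrated in isolation: slide a window of size k over the sequence of token embeddings and average within each window, so adjacent character embeddings blend into phrase-like features. A minimal pure-Python sketch (the kernel size, stride, and vectors are illustrative; the paper's model applies this inside a BERT-based architecture):

```python
def avg_pool_tokens(embeddings, k):
    """Average-pool a sequence of token embedding vectors with window
    size k and stride 1 (no padding): each output vector is the mean of
    k adjacent token embeddings, approximating a phrase-level feature."""
    dim = len(embeddings[0])
    pooled = []
    for i in range(len(embeddings) - k + 1):
        window = embeddings[i:i + k]
        pooled.append([sum(v[d] for v in window) / k for d in range(dim)])
    return pooled

# three 2-d "character" embeddings pooled pairwise into two "phrase" vectors
toks = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
phrases = avg_pool_tokens(toks, k=2)   # -> [[2.0, 1.0], [4.0, 3.0]]
```

The kernel size k controls how many characters are blended into one phrase-level vector, which is why, as the abstract notes, different datasets favor different kernel sizes.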


Author(s):  
Tianlin Liu ◽  
Lyle Ungar ◽  
João Sedoc

Word vectors are at the core of many natural language processing tasks. Recently, there has been interest in post-processing word vectors to enrich their semantic information. In this paper, we introduce a novel word vector post-processing technique based on matrix conceptors (Jaeger 2014), a family of regularized identity maps. More concretely, we propose to use conceptors to suppress those latent features of word vectors having high variances. The proposed method is purely unsupervised: it does not rely on any corpus or external linguistic database. We evaluate the post-processed word vectors on a battery of intrinsic lexical evaluation tasks, showing that the proposed method consistently outperforms existing state-of-the-art alternatives. We also show that post-processed word vectors can be used for the downstream natural language processing task of dialogue state tracking, yielding improved results in different dialogue domains.
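A conceptor for a correlation matrix R is C = R (R + alpha^-2 I)^-1, and post-processing applies the negated map I - C, which damps high-variance directions most strongly. In the simplified case where feature dimensions are uncorrelated, C is diagonal with entries v_i / (v_i + alpha^-2), where v_i is the variance of dimension i. A hedged pure-Python sketch of that diagonal special case (the paper operates on the full correlation matrix; alpha and the vectors here are illustrative):

```python
def negate_conceptor_diag(vectors, alpha=2.0):
    """Suppress high-variance dimensions of word vectors using the
    diagonal special case of conceptor negation: coordinate i is scaled
    by 1 - v_i / (v_i + alpha**-2), so large-variance directions are
    damped toward zero while low-variance directions pass through."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    var = [sum((v[d] - mean[d]) ** 2 for v in vectors) / n for d in range(dim)]
    gain = [1.0 - var[d] / (var[d] + alpha ** -2) for d in range(dim)]
    return [[v[d] * gain[d] for d in range(dim)] for v in vectors]
```

Note the method needs only the vectors themselves, which matches the abstract's claim that the post-processing is purely unsupervised, relying on no corpus or external linguistic database.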


2020 ◽  
pp. 3-17
Author(s):  
Peter Nabende

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to filling this knowledge gap, this paper evaluates the application of well-established machine translation methods to one heavily under-resourced indigenous East African language, Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba, including both rule-based and data-driven methods. We then apply a state-of-the-art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation architecture leads to consistently better BLEU scores than recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and usually correspond to the source-language input.
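For reference, BLEU, the metric used above, combines clipped n-gram precisions with a brevity penalty. A minimal sentence-level sketch (illustrative only; real evaluations typically use a tested implementation such as sacreBLEU, which also smooths zero counts for short sentences):

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of clipped n-gram precisions
    for n = 1..max_n, times a brevity penalty for short candidates."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(c, ref[g]) for g, c in cand.items())  # clipped counts
        total = max(sum(cand.values()), 1)
        if overlap == 0:
            return 0.0          # any zero precision drives unsmoothed BLEU to 0
        log_precisions.append(math.log(overlap / total))
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))  # brevity penalty
    return bp * math.exp(sum(log_precisions) / max_n)
```

An identical candidate and reference score 1.0; a candidate sharing no words with the reference scores 0.0, with translation quality falling in between.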

