F-HMTC: Detecting Financial Events for Investment Decisions Based on Neural Hierarchical Multi-Label Text Classification

The share prices of listed companies are easily influenced by various events. Event detection can help investors identify, in a timely manner, the investment risks and opportunities that accompany these events. Financial events are inherently hierarchical and can be represented as tree-structured schemes in real-life applications, so event detection can be modeled as a hierarchical multi-label text classification problem, where an event is assigned to a tree node with a sequence of hierarchical event category labels. Conventional hierarchical multi-label text classification methods usually ignore the hierarchical relationships in the event classification scheme and treat the hierarchical labels associated with an event as uniform labels, so that every correct or wrong label prediction receives the same reward or penalty. In this paper, we propose a neural hierarchical multi-label text classification method, F-HMTC, for a financial application scenario with a massive number of event category labels. F-HMTC learns latent features with bidirectional encoder representations from transformers (BERT) and maps them directly to hierarchical labels through a carefully designed hierarchy-based loss layer. We conduct extensive experiments on a private financial dataset with carefully annotated labels, and F-HMTC consistently outperforms state-of-the-art baselines by substantial margins. We will release both the source code and the dataset in the first author's repository.
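To illustrate how a hierarchy-based loss can differ from a uniform one, the sketch below weights per-label binary cross-entropy by the label's depth in the tree. The abstract does not specify F-HMTC's actual loss, so the depth-decay weighting, the toy label tree, and the names `LABEL_DEPTH`, `depth_weight`, and `hierarchical_bce` are all assumptions made for illustration.

```python
import math

# Hypothetical toy hierarchy: label -> depth in the event tree (top level = 1).
LABEL_DEPTH = {
    "corporate_action": 1,
    "corporate_action/merger": 2,
    "risk": 1,
    "risk/lawsuit": 2,
}


def depth_weight(depth, decay=0.5):
    """Assumed scheme: errors near the root cost more than errors near the leaves."""
    return decay ** (depth - 1)


def hierarchical_bce(probs, targets, decay=0.5):
    """Depth-weighted binary cross-entropy summed over all labels.

    probs:   label -> predicted probability
    targets: label -> 0 or 1
    decay=1.0 recovers the uniform (hierarchy-agnostic) loss.
    """
    total = 0.0
    for label, depth in LABEL_DEPTH.items():
        p = min(max(probs[label], 1e-12), 1 - 1e-12)  # clamp so log() is defined
        y = targets[label]
        bce = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += depth_weight(depth, decay) * bce
    return total
```

With `decay=1.0` a mistake on a top-level label and the same mistake on a second-level label cost the same; with `decay<1.0` the top-level mistake is penalized more, which is one possible way to make the penalty depend on the hierarchy.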


Learning to Rank for Review Rating Prediction

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.556-562.6286 ◽ 2014 ◽ Vol 556-562 ◽ pp. 6286-6289

Author(s): Nian Li ◽ Li Yin ◽ Qing Xi Peng

Keyword(s): Learning To Rank ◽ Empirical Studies ◽ Real Life ◽ Classification Problem ◽ Machine Learning Method ◽ Regression Problem ◽ The Public ◽ Rating Prediction ◽ Multi Classification ◽ New Machine

The Internet has changed profoundly, and the large amount of user-generated content now provides valuable information to the public. Customers often express their opinions when shopping online: after writing a review, they give the product or service an overall rating. In this paper, we focus on the review rating prediction problem. Previous studies usually treat it as a regression problem; we instead apply a learning-to-rank method. After feature selection, a maximum entropy classifier is employed to solve the multi-class classification problem. A real-life dataset was crawled to verify the proposed method, and empirical studies demonstrate that it outperforms the baseline methods.


The Algorithm Study of Support Vector Machine Based on Social Network of Public Opinions

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.513-517.2394 ◽ 2014 ◽ Vol 513-517 ◽ pp. 2394-2397

Author(s): Hong Biao Xie ◽ Hong Jun Qiu

Keyword(s): Public Opinion ◽ Social Network ◽ Text Classification ◽ Social Groups ◽ Classification Problem ◽ Support Vector ◽ Social Stability ◽ Social Phenomena ◽ Svm Algorithm ◽ Network Link

Public opinion refers to the subjective response of certain social groups to certain social phenomena and realities within a period of time. Tracking public opinion as it changes and actively guiding it are important measures for maintaining social stability and the security of the ruling party. In this paper, the author builds a model of hotspot issues in social network public opinion. The SVM algorithm is adopted to improve information processing, analysis, and testing, effectively solving the text classification problem. The results verify that this method plays an important role in analyzing hot issues on the network.


Multiple weak supervision for short text classification

Applied Intelligence ◽ 10.1007/s10489-021-02958-3 ◽ 2022

Author(s): Li-Ming Chen ◽ Bao-Xin Xiu ◽ Zhao-Yun Ding

Keyword(s): Text Classification ◽ Classification Problem ◽ Experimental Results ◽ Prior Work ◽ Weak Supervision ◽ Short Text ◽ Imbalanced Classification ◽ Distant Supervision ◽ Synthetic Datasets ◽ Independent Model

For short text classification, insufficient labeled data, data sparsity, and imbalanced classes have become three major challenges. To address them, we propose multiple weak supervision, which can label unlabeled data automatically. Unlike prior work, the proposed method generates probabilistic labels through a conditional independent model. Experiments were conducted to verify the effectiveness of multiple weak supervision. According to experimental results on public datasets, real datasets, and synthetic datasets, the unlabeled imbalanced short text classification problem can be solved effectively by multiple weak supervision. Notably, adding distant supervision clustering improves recall and F1-score without reducing precision, which can be used to meet different application needs.
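One common way to turn several weak votes into a probabilistic label is to assume the votes are conditionally independent given the true class and apply Bayes' rule. The sketch below does this in plain Python. It is not the paper's model: the function name `probabilistic_label`, the fixed per-source accuracies, and the uniform spread of errors over the other classes are all assumptions (in practice the accuracies would have to be estimated).

```python
def probabilistic_label(votes, accuracies, classes, prior=None):
    """Return P(class | votes) under a conditional-independence assumption.

    votes:      one entry per weak source; a class name, or None to abstain
    accuracies: assumed probability that each source votes for the true class
    classes:    list of possible class names
    prior:      optional class -> prior probability (uniform if omitted)
    """
    if prior is None:
        prior = {c: 1.0 / len(classes) for c in classes}
    scores = {}
    for c in classes:
        score = prior[c]
        for vote, acc in zip(votes, accuracies):
            if vote is None:
                continue  # an abstaining source carries no evidence
            if vote == c:
                score *= acc
            else:
                # Assumed: a wrong vote is spread evenly over the other classes.
                score *= (1.0 - acc) / (len(classes) - 1)
        scores[c] = score
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}
```

For example, `probabilistic_label(["pos", "pos", "neg"], [0.8, 0.7, 0.6], ["pos", "neg"])` returns a distribution that favors `"pos"` but keeps some mass on `"neg"`, which is the kind of soft label a downstream classifier can be trained on.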


HONNs with Extreme Learning Machine to Handle Incomplete Datasets

Artificial Higher Order Neural Networks for Modeling and Simulation ◽ 10.4018/978-1-4666-2175-6.ch013 ◽ 2013 ◽ pp. 276-292

Author(s): Shuxiang Xu

Keyword(s): Neural Networks ◽ Missing Data ◽ Extreme Learning Machine ◽ Real Life ◽ Bankruptcy Prediction ◽ Share Prices ◽ Fast Learning ◽ Business Cases ◽ Learning Machine ◽ Hidden Layer

An Extreme Learning Machine (ELM) randomly chooses hidden neurons and analytically determines the output weights (Huang et al., 2005, 2006, 2008). With the ELM algorithm, only the connection weights between the hidden layer and the output layer are adjusted. The ELM algorithm tends to generalize better at a very fast learning speed: it can learn thousands of times faster than conventionally popular learning algorithms (Huang et al., 2006). Artificial Neural Networks (ANNs) have been widely used as powerful information processing models and adopted in applications such as bankruptcy prediction, predicting costs, forecasting revenue, forecasting share prices and exchange rates, processing documents, and many more. Higher Order Neural Networks (HONNs) are ANNs in which the net input to a computational neuron is a weighted sum of products of its inputs. Real-life data are rarely perfect: they contain wrong, incomplete, or vague values, so missing data are common in many information sources. Missing data is a common problem in statistical analysis (Little & Rubin, 1987). This chapter uses the ELM algorithm for HONN models and applies it to several significant business cases that involve incomplete datasets. The experimental results demonstrate that HONN models with the ELM algorithm offer significant advantages over standard HONN models, such as faster training and improved generalization.
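The two ELM steps described above (random, fixed hidden neurons; output weights found analytically) can be sketched with the standard library alone. This is a minimal illustration for an ordinary single-hidden-layer network, not a HONN and not the chapter's implementation. The function names, the sigmoid activation, and the small ridge term added to keep the linear system well conditioned are assumptions.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hidden_layer(x, weights, biases):
    """Sigmoid activations of the fixed random hidden layer for one input vector."""
    return [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(ws, x)) + b)))
            for ws, b in zip(weights, biases)]


def train_elm(xs, ys, n_hidden=10, ridge=1e-3, seed=0):
    """Draw hidden weights at random, then solve for output weights in one step."""
    rng = random.Random(seed)
    n_in = len(xs[0])
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    h = [hidden_layer(x, weights, biases) for x in xs]
    # Regularized normal equations: (H^T H + ridge * I) beta = H^T y
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(n_hidden)]
    beta = solve_linear(hth, hty)
    return weights, biases, beta


def predict_elm(model, x):
    """Output = weighted sum of hidden activations; only beta was fitted."""
    weights, biases, beta = model
    return sum(b * hi for b, hi in zip(beta, hidden_layer(x, weights, biases)))
```

Because no iterative weight updates are involved, training reduces to building the hidden-activation matrix and solving one linear system, which is where the speed advantage claimed for ELM comes from.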


Detection of Economy-Related Turkish Tweets Based on Machine Learning Approaches

10.4018/978-1-7998-8413-2.ch008 ◽ 2022 ◽ pp. 171-195

Author(s): Jale Bektaş

Keyword(s): Machine Learning ◽ Text Mining ◽ Text Classification ◽ Integration Method ◽ Classification Problem ◽ Feature Representation ◽ Learning Approaches ◽ Machine Learning Methods ◽ Linguistic Approach ◽ Turkish Language

Conducting NLP for Turkish is considerably harder than for other languages written in the Latin alphabet, such as English. In this study, text mining techniques are used to build a pre-processing framework in which TF-IDF values are calculated in accordance with a linguistic approach on 7,731 tweets shared by 13 famous economists in Turkey and retrieved from Twitter. The classification results of four common machine learning methods (SVM, Naive Bayes, LR, and LR integrated with SVM) are then compared. The TF-IDF features are tested with different n-grams. The findings show that success on a text classification problem depends on the feature representation method, and that SVM outperforms the other ML methods with unigram feature representation. The best results are obtained by integrating SVM with LR, with an accuracy of 82.9%. These results show that these methodologies are satisfactory for the Turkish language.
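As a reminder of what TF-IDF features over different n-grams look like, here is a minimal sketch using one common TF-IDF variant (relative term frequency times log inverse document frequency). It does not reproduce the study's Turkish-specific linguistic pre-processing, and the function names and whitespace tokenization are assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Return one {ngram: weight} dict per document.

    tf  = count of the n-gram in the document / number of n-grams in it
    idf = log(N / df), where df is the number of documents containing the n-gram
    """
    grams_per_doc = [ngrams(doc.lower().split(), n) for doc in docs]
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))  # count each n-gram once per document
    n_docs = len(docs)
    vectors = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams) or 1  # avoid division by zero for very short texts
        vectors.append({g: (c / total) * math.log(n_docs / df[g])
                        for g, c in counts.items()})
    return vectors
```

Calling `tfidf(docs, n=1)` gives unigram features and `tfidf(docs, n=2)` gives bigram features; an n-gram that appears in fewer documents receives a higher weight than one that appears in many.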


TEXT CLASSIFICATION USING MODIFIED MULTI CLASS ASSOCIATION RULE

Jurnal Teknologi ◽ 10.11113/jt.v78.9553 ◽ 2016 ◽ Vol 78 (8-2)

Author(s): Siti Sakira Kamaruddin ◽ Yuhanis Yusof ◽ Husniza Husni ◽ Mohammad Hayel Al Refai

Keyword(s): Text Classification ◽ Association Rule ◽ Classification Accuracy ◽ Classification Problem ◽ Frequent Pattern ◽ Associative Classification ◽ Vertical Data ◽ Rule Method ◽ Class Association Rule ◽ Two Stages

This paper presents text classification using a modified Multi-Class Association Rule Method. The method is based on Associative Classification, which combines classification with association rule discovery. Although previous work showed that Associative Classification produces better classification accuracy than typical classifiers, studies applying it to text classification are limited because the high dimensionality of text data results in an exponential number of generated classification rules. To overcome this problem, the Multi-Class Association Rule Method was enhanced in two stages. In stage one, frequent patterns are represented using a proposed vertical data format to reduce the text dimensionality problem; in stage two, the generated rules are pruned using a proposed Partial Rule Match to reduce their number. The proposed method was tested on a text classification problem, and the results show that it performed better than the existing method in terms of classification accuracy and number of generated rules.
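A vertical data format, in the general sense used in frequent-pattern mining, stores for each term the set of document ids that contain it, so the support of a pattern is the size of a set intersection. The sketch below shows that general idea only. The paper's own proposed format and its Partial Rule Match pruning are not described in the abstract and are not reproduced here; all function names are assumptions.

```python
def to_vertical(docs):
    """Map each term to the set of document ids that contain it."""
    vertical = {}
    for doc_id, terms in enumerate(docs):
        for term in set(terms):
            vertical.setdefault(term, set()).add(doc_id)
    return vertical


def covered_docs(vertical, itemset):
    """Ids of documents containing every term in itemset (id-set intersection)."""
    id_sets = [vertical.get(term, set()) for term in itemset]
    return set.intersection(*id_sets) if id_sets else set()


def support(vertical, itemset):
    """Number of documents that contain the whole itemset."""
    return len(covered_docs(vertical, itemset))


def rule_confidence(vertical, labels, itemset, label):
    """Confidence of the class association rule itemset -> label."""
    covered = covered_docs(vertical, itemset)
    if not covered:
        return 0.0
    hits = sum(1 for doc_id in covered if labels[doc_id] == label)
    return hits / len(covered)
```

With this layout, extending a pattern by one term needs only one more intersection over id sets rather than another pass over every document, which is why vertical layouts are attractive for high-dimensional text data.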


Centroid Estimation Based on Symmetric KL Divergence for Multinomial Text Classification Problem

2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA) ◽ 10.1109/icmla.2018.00189 ◽ 2018 ◽ Cited By ~ 3

Author(s): Jiangning Chen ◽ Heinrich Matzinger ◽ Haoyan Zhai ◽ Mi Zhou

Keyword(s): Text Classification ◽ Classification Problem ◽ Kl Divergence ◽ Centroid Estimation


Ant colony algorithm for text classification in multicore-multithread environment

Indonesian Journal of Electrical Engineering and Computer Science ◽ 10.11591/ijeecs.v18.i3.pp1359-1366 ◽ 2020 ◽ Vol 18 (3) ◽ pp. 1359

Author(s): Ahmad Nazmi Fadzal ◽ Mazidah Puteh ◽ Nurazzah Abd Rahman

Keyword(s): Text Classification ◽ Execution Time ◽ Ant Colony Algorithm ◽ Classification Problem ◽ Ant Colony ◽ Main Concept ◽ Processing Power ◽ Artificial Ants ◽ Positive Time ◽ Time Reduction

This paper presents an Ant Colony Algorithm (ACO) for text classification in a multicore-multithread environment, within the artificial intelligence domain. We developed software that applies the concept of concurrency to multiple artificial ants. Pheromone is the main concept ACO uses to solve the text classification problem: the pheromone value changes depending on the solutions discovered by the pseudo-random heuristic that selects a path through the words of a text. However, ACO can take a long time to process larger training documents. Building on the cooperative behavior of ants living in a colony, the ACO component is adapted to run in a multicore-multithread environment to reduce execution time. In this environment, the modification aims to make artificial ants communicate actively across multiple physical processor cores. Execution time is expected to fall without compromising the original classification accuracy, at the cost of using more processing power. The single-core and multicore-multithreaded versions of ACO were compared statistically by conducting relevant tests, and the results show a reduction in execution time.
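The abstract gives no formulas, so the sketch below falls back on textbook ant colony rules to show what "pheromone changes with the discovered solution" and "pseudo-random selection" can mean in code: a pseudo-random proportional choice among words, followed by evaporation and reinforcement. The parameter names `q0` and `rho`, the function names, and the use of words as path nodes are assumptions; the paper's multithreaded implementation is not reproduced.

```python
import random


def choose_word(pheromone, rng, q0=0.9):
    """Pseudo-random proportional rule.

    With probability q0 exploit the word with the most pheromone;
    otherwise sample a word in proportion to its pheromone level.
    """
    words = list(pheromone)
    if rng.random() < q0:
        return max(words, key=lambda w: pheromone[w])
    return rng.choices(words, weights=[pheromone[w] for w in words], k=1)[0]


def update_pheromone(pheromone, path, quality, rho=0.1):
    """Evaporate every trail by a factor (1 - rho), then reinforce the chosen path."""
    for w in pheromone:
        pheromone[w] *= (1.0 - rho)
    for w in path:
        pheromone[w] += quality  # better solutions deposit more pheromone
```

In a multithreaded variant, several ants would call `choose_word` concurrently and `update_pheromone` would need to be protected (for example with a lock), since all ants share the same pheromone table.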


Same Representation, Different Attentions: Shareable Sentence Representation Learning from Multiple Tasks

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2018/642 ◽ 2018 ◽ Cited By ~ 3

Author(s): Renjie Zheng ◽ Junkun Chen ◽ Xipeng Qiu

Keyword(s): Deep Learning ◽ Language Processing ◽ Text Classification ◽ Representation Learning ◽ Training Data ◽ Specific Information ◽ Distributed Representation ◽ Source Codes ◽ Multiple Tasks ◽ Classification Tasks

Distributed representation plays an important role in deep-learning-based natural language processing. However, the representation of a sentence often varies across tasks; it is usually learned from scratch and suffers from limited amounts of training data. In this paper, we claim that a good sentence representation should be invariant and should benefit various subsequent tasks. To achieve this, we propose a new information-sharing scheme for multi-task learning. More specifically, all tasks share the same sentence representation, and each task selects task-specific information from the shared representation with an attention mechanism. The query vector of each task's attention can be either a static parameter or generated dynamically. We conduct extensive experiments on 16 different text classification tasks, which demonstrate the benefits of our architecture. The source code for this paper is available on GitHub.
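The idea that every task reads the same shared representation through its own attention query can be sketched as follows. This covers only the static-query case and uses plain dot-product scoring, which is an assumption; the paper's encoder, scoring function, and dynamic query generation are not specified in the abstract, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def attend(hidden_states, query):
    """Pool shared per-token vectors using one task's query vector.

    hidden_states: list of per-token vectors from the shared encoder
    query:         the task's query vector (same dimensionality)
    Returns the attention-weighted sum and the attention weights.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states))
              for d in range(dim)]
    return pooled, weights
```

Two tasks calling `attend` on the same `hidden_states` with different `query` vectors obtain different pooled vectors, so the shared encoder output stays the same while each task emphasizes different tokens.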


A comparative study on feature selection in Chinese text classification problem

2012 First National Conference for Engineering Sciences (FNCES 2012) ◽ 10.1109/nces.2012.6544065 ◽ 2012

Author(s): Hu Li ◽ Peng Zou ◽ Weihong Han

Keyword(s): Feature Selection ◽ Comparative Study ◽ Chinese Text ◽ Text Classification ◽ Classification Problem ◽ Chinese Text Classification
