Data Mining Applied to Transportation Mode Classification Problem

This chapter gives a comprehensive view of the links between fuzzy logic and data mining. It shows that knowledge extracted from simple data sets or huge databases can be represented by fuzzy rule-based expert systems. Both the performance and the interpretability of the mined fuzzy models are of major importance, and effort is required to keep the resulting rule bases small and comprehensible. Therefore, in recent years, soft-computing-based data mining algorithms have been developed for feature selection, feature extraction, model optimization, and model reduction (rule base simplification). The application of these techniques is illustrated using the wine data classification problem. The results show that fuzzy tools can be applied in a synergistic manner through the nine steps of knowledge discovery.
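
To make the idea of a fuzzy rule-based classifier concrete, the following is a minimal sketch, not the chapter's method. It assumes triangular membership functions, the minimum as the conjunction operator, and a winner-takes-all decision. The feature names, set boundaries, and class labels in `RULES` are hypothetical and are not taken from the wine data.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical rule base: each rule maps fuzzy sets on features to a class label.
RULES = [
    ({"alcohol": (11.0, 12.0, 13.0), "color": (1.0, 3.0, 5.0)}, "class_A"),
    ({"alcohol": (12.5, 13.5, 14.5), "color": (4.0, 6.0, 8.0)}, "class_B"),
]


def classify(sample):
    """Return (label, firing strength) of the most strongly firing rule."""
    best_label, best_strength = None, 0.0
    for antecedent, label in RULES:
        # Rule strength = minimum membership over all antecedent features.
        strength = min(
            triangular(sample[feature], *abc) for feature, abc in antecedent.items()
        )
        if strength > best_strength:
            best_label, best_strength = label, strength
    return best_label, best_strength


label, strength = classify({"alcohol": 13.5, "color": 6.0})
```

A small rule base of this form stays readable because every rule can be stated in words, which is the interpretability concern the abstract raises.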

Bagging Probit Models for Unbalanced Classification

Strategic Advancements in Utilizing Data Mining and Warehousing Technologies ◽ 10.4018/978-1-60566-717-1.ch017 ◽ 2010 ◽ pp. 290-296

Author(s): Hualin Wang ◽ Xiaogang Su

Keyword(s): Data Mining ◽ Credit Card ◽ Probit Model ◽ Classification Problem ◽ Probit Models ◽ Award Winning ◽ Integral Element ◽ Model Ensembles ◽ Unbalanced Classification ◽ Financial Company

This chapter presents an award-winning algorithm from the PAKDD 2007 data mining competition, in which the goal was to help a financial company predict the likelihood that its credit card customers would take up a home loan. The available data are very limited and characterized by a very low buying rate. To tackle this unbalanced classification problem, the authors apply a bagging algorithm based on ensembles of probit models. One integral element of the algorithm is a special way of resampling when forming the bootstrap samples, for which a brief justification is provided. The method offers a feasible and robust way to solve this difficult yet very common business problem.
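
The abstract does not spell out the resampling scheme, so the sketch below only illustrates one common choice and should not be read as the authors' procedure. It assumes that each bootstrap sample draws equally many buyers and non-buyers, and it averages the scores of the base models. The names `balanced_bootstrap`, `bagged_score`, and `fit` are hypothetical; `fit` stands in for any routine that fits a base model (for example a probit model) and returns a scoring function.

```python
import random


def balanced_bootstrap(labels, rng):
    """Return row indices for one bootstrap sample with equal class counts."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = len(pos)  # assumption: class 1 is the rare class
    # Sample with replacement from each class separately.
    return [rng.choice(pos) for _ in range(n)] + [rng.choice(neg) for _ in range(n)]


def bagged_score(fit, rows, labels, new_row, n_models=25, seed=0):
    """Average the scores of base models fitted on balanced bootstrap samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_models):
        idx = balanced_bootstrap(labels, rng)
        model = fit([rows[i] for i in idx], [labels[i] for i in idx])
        total += model(new_row)
    return total / n_models


# Toy data: 5 buyers among 100 customers (a 5% buying rate).
labels = [1] * 5 + [0] * 95
sample = balanced_bootstrap(labels, random.Random(0))
```

Because the base models see balanced samples, their averaged scores are better suited to ranking customers than to reading off calibrated take-up probabilities.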

Data Mining for Automatic Linguistic Description of Data - Textual Weather Prediction as a Classification Problem

Proceedings of the International Conference on Agents and Artificial Intelligence ◽ 10.5220/0005282905560562 ◽ 2015 ◽ Cited By ~ 3

Author(s): J. Janeiro ◽ I. Rodriguez-Fdez ◽ A. Ramos-Soto ◽ A. Bugarín

Keyword(s): Data Mining ◽ Weather Prediction ◽ Classification Problem ◽ Linguistic Description

Data Mining for Transportation Mode Recognition from Radio-data

10.1145/3460418.3479374 ◽ 2021

Author(s): Yida Zhu ◽ Haiyong Luo ◽ Song Guo ◽ Fang Zhao

Keyword(s): Data Mining ◽ Transportation Mode ◽ Radio Data ◽ Mode Recognition

Classification and Regression Trees

Encyclopedia of Data Warehousing and Mining, Second Edition ◽ 10.4018/978-1-60566-010-3.ch031 ◽ 2011 ◽ pp. 192-195 ◽ Cited By ~ 1

Author(s): Johannes Gehrke

Keyword(s): Data Mining ◽ Linear Models ◽ Regression Tree ◽ Classification Problem ◽ Classification And Regression Tree ◽ Construction Methods ◽ Classification And Regression ◽ Log Linear ◽ Mining Model ◽ Decision Tables

The goal of classification and regression is to build a data mining model that can be used for prediction. To construct such a model, we are given a set of training records, each having several attributes. These attributes can be either numerical (for example, age or salary) or categorical (for example, profession or gender). There is one distinguished attribute, the dependent attribute; the other attributes are called predictor attributes. If the dependent attribute is categorical, the problem is a classification problem. If the dependent attribute is numerical, the problem is a regression problem. The model is then used to predict the value of the dependent attribute for a record where that value is unknown. (We call such a record an unlabeled record.) Classification and regression have a wide range of applications, including scientific experiments, medical diagnosis, fraud detection, credit approval, and target marketing (Hand, 1997). Many classification and regression models have been proposed in the literature. Among the more popular are neural networks, genetic algorithms, Bayesian methods, linear and log-linear models and other statistical methods, decision tables, and tree-structured models, the focus of this chapter (Breiman, Friedman, Olshen, & Stone, 1984). Tree-structured models, so-called decision trees, are easy to understand; they are non-parametric and thus do not rely on assumptions about the data distribution; and they have fast construction methods even for large training datasets (Lim, Loh, & Shih, 2000). Most data mining suites include tools for classification and regression tree construction (Goebel & Gruenwald, 1999).
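
As an illustration of how a classification tree chooses a split on a numerical predictor attribute, the sketch below scores every candidate threshold by weighted Gini impurity, a splitting criterion commonly associated with CART. It is a simplified example under stated assumptions, not the chapter's construction method: it handles a single attribute and a single split, and the function names and the toy records are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted Gini) of the best split 'value <= threshold'.

    The threshold is None if no split improves on the unsplit impurity.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((low + high) / 2, score)
    return best


# Toy training records: predictor attribute "age", dependent attribute "approved".
ages = [22, 25, 31, 40, 47, 52]
approved = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, approved)
```

A full tree builder would apply this search to every predictor attribute, pick the best one, and recurse on the two resulting partitions until a stopping rule is met.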

Recognizing Event in Short Text Based on Decision Tree

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.571-572.237 ◽ 2014 ◽ Vol 571-572 ◽ pp. 237-240

Author(s): Jing Ya Lu ◽ Wan Li Zuo ◽ Liang Zhu

Keyword(s): Data Mining ◽ Decision Tree ◽ Information Age ◽ Classification Problem ◽ Classification Algorithm ◽ Research Field ◽ Short Text ◽ Decision Tree Classification ◽ New Research ◽ Primary Problem

Mining newsworthy events from large volumes of microblog messages is not only a primary problem that several large microblogging websites need to solve, but also a new research field in the micro-information age. Many studies of event recognition have been carried out at home and abroad, but relatively few address short text (microblog messages). This paper treats newsworthy event recognition in short text as a classification problem: it applies the decision tree classification algorithm from data mining, mines the features of events in short text, and then recognizes newsworthy events in microblogs. Finally, the effectiveness of the model is verified.

A Fine-Tuned BERT-Based Transfer Learning Approach for Text Classification

Journal of Healthcare Engineering ◽ 10.1155/2022/3498123 ◽ 2022 ◽ Vol 2022 ◽ pp. 1-17

Author(s): Rukhma Qasim ◽ Waqas Haider Bangyal ◽ Mohammed A. Alqarni ◽ Abdulwahab Ali Almazroi

Keyword(s): Data Mining ◽ Social Media ◽ Transfer Learning ◽ Language Processing ◽ Text Classification ◽ Hate Speech ◽ Classification Problem ◽ Learning Approaches ◽ Fake News ◽ Targeted Marketing

The text classification problem has been thoroughly studied in information retrieval and data mining. It is useful in many tasks, including medical diagnosis in health care departments, targeted marketing, the entertainment industry, and group filtering processes. Recent innovations in both data mining and natural language processing (NLP) have drawn the attention of researchers worldwide to developing automated systems for text classification. NLP allows documents containing different texts to be categorized. Social media users generate a huge amount of data on social media sites. Three datasets are used for the experiments: the COVID-19 fake news dataset, the COVID-19 English tweet dataset, and the extremist-non-extremist dataset, which contain news blogs, posts, and tweets related to coronavirus and hate speech. Transfer learning approaches have not previously been evaluated on the COVID-19 fake news and extremist-non-extremist datasets. Therefore, the proposed work applies transfer learning classification models to both of these datasets to assess their performance. The models are trained and then evaluated on accuracy, precision, recall, and F1-score. Heat maps are also generated for every model. Finally, future directions are proposed.
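
The four evaluation measures named above can be computed from the confusion counts of a binary classifier. The sketch below uses the standard definitions; the function name `binary_metrics` and the toy labels are hypothetical and are not results from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels: 1 = fake news, 0 = genuine news.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters when classes are imbalanced, since a model can reach high accuracy while missing most of the minority class.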
