Locally Adaptive Techniques for Pattern Classification

Encyclopedia of Data Warehousing and Mining, Second Edition

10.4018/978-1-60566-010-3.ch182

2011

pp. 1170-1175

Author(s):

Carlotta Domeniconi

Dimitrios Gunopulos

Keyword(s):

Data Mining

Electronic Commerce

Pattern Classification

Credit Card

Curse Of Dimensionality

Ease Of Use

Practical Significance

High Dimensional

Finite Samples

Recent Developments

Pattern classification is a very general concept with numerous applications, ranging from science, engineering, target marketing, medical diagnosis, and electronic commerce to weather forecasting based on satellite imagery. A typical application of pattern classification is mass mailing for marketing. For example, credit card companies often mail solicitations to consumers. Naturally, they would like to target those consumers who are most likely to respond. Often, demographic information is available for those who have responded previously to such solicitations, and this information may be used to target the most likely respondents. Another application is electronic commerce. E-commerce provides a rich environment for advancing the state of the art in classification because it demands effective means of text classification in order to make rapid product and market recommendations.

Recent developments in data mining have posed new challenges to pattern classification. Data mining is a knowledge discovery process whose aim is to discover unknown relationships and/or patterns in a large set of data, from which it is possible to predict future outcomes. As such, pattern classification becomes one of the key steps in an attempt to uncover the hidden knowledge within the data. The primary goal is usually predictive accuracy, with secondary goals being speed, ease of use, and interpretability of the resulting predictive model. While pattern classification has shown promise in many areas of practical significance, it faces difficult challenges posed by real-world problems, of which the most pronounced is Bellman's curse of dimensionality: the sample size required to perform accurate prediction on high-dimensional problems is beyond feasibility. This is because in high-dimensional spaces data become extremely sparse and lie far apart from each other. As a result, severe bias that affects any estimation process can be introduced in a high-dimensional feature space with finite samples.

Learning tasks in which data are represented by a very large number of features abound. For example, microarrays contain an overwhelming number of genes relative to the number of samples. The Internet is a vast repository of disparate information growing at an exponential rate. Efficient and effective document retrieval and classification systems are required to turn the ocean of bits around us into useful information, and eventually into knowledge. This is a challenging task, since a word-level representation of documents easily leads to 30,000 or more dimensions. This chapter discusses classification techniques that mitigate the curse of dimensionality and reduce bias by estimating feature relevance and selecting features accordingly. This issue has both theoretical and practical relevance, since many applications can benefit from improved prediction performance.
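The sparsity claim in the abstract above can be illustrated with a small experiment. The sketch below is not taken from the chapter: it assumes points drawn uniformly from the unit hypercube, and the function name `relative_contrast` is hypothetical. It measures how much farther the farthest sample is from a query point than the nearest one; as the dimension grows, that gap tends to shrink relative to the nearest distance, so "nearest" neighbors stop being meaningfully near.

```python
import math
import random


def relative_contrast(n_points, dim, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query to n_points uniform samples in the unit cube [0, 1]^dim.

    Assumption for illustration only: uniform data, Euclidean metric.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # Print the contrast for a fixed sample size and growing dimension.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(500, dim), 3))
```

With a fixed sample size, the printed contrast is expected to fall steeply as the dimension increases, which is one concrete way to see why local estimates built from a finite sample become unreliable in high-dimensional feature spaces.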


Distance Based Pattern Driven Mining for Outlier Detection in High Dimensional Big Dataset

ACM Transactions on Management Information Systems

10.1145/3469891

2022

Vol 13 (1)

pp. 1-17

Author(s):

Ankit Kumar

Abhishek Kumar

Ali Kashif Bashir

Mamoon Rashid

V. D. Ambeth Kumar

...

Keyword(s):

Data Mining

Comparative Analysis

Outlier Detection

Credit Card

High Dimensional

Work Efficiency

Average Value

Novel Method

Detection Of Outliers

Better Than

Detection of outliers or anomalies is one of the vital issues in pattern-driven data mining. Outlier detection identifies the inconsistent behavior of individual objects. It is an important area of data mining with several applications, such as detecting credit card fraud, discovering hacking, and uncovering criminal activities. It is necessary to develop tools that uncover the critical information embedded in extensive data. This paper investigates a novel method for detecting cluster outliers in a multidimensional dataset, capable of identifying the clusters and outliers in datasets containing noise. The proposed method can detect the groups and outliers left by the clustering process, like instant irregular sets of clusters (C) and outliers (O), to boost the results. The results obtained after applying the algorithm to the dataset improved in terms of several parameters. For the comparative analysis, the average accuracy and the average recall are computed. The average accuracy is 74.05% for the existing COID algorithm and 77.21% for the proposed algorithm. The average recall is 81.19% for the existing algorithm and 89.51% for the proposed algorithm, which shows that the proposed method performs better than the existing COID algorithm.


Locally Adaptive Techniques for Pattern Classification

Encyclopedia of Data Warehousing and Mining

10.4018/978-1-59140-557-3.ch130

2011

pp. 684-688

Author(s):

Carlotta Domeniconi

Dimitrios Gunopulos

Keyword(s):

Electronic Commerce

Pattern Classification

Text Classification

Medical Diagnosis

Credit Card

Effective Means

Weather Forecast

General Concept

Typical Application

Locally Adaptive

Pattern classification is a very general concept with numerous applications, ranging from science, engineering, target marketing, medical diagnosis, and electronic commerce to weather forecasting based on satellite imagery. A typical application of pattern classification is mass mailing for marketing. For example, credit card companies often mail solicitations to consumers. Naturally, they would like to target those consumers who are most likely to respond. Often, demographic information is available for those who have responded previously to such solicitations, and this information may be used to target the most likely respondents. Another application is electronic commerce. E-commerce provides a rich environment for advancing the state of the art in classification, because it demands effective means of text classification in order to make rapid product and market recommendations.


Classifying High-Dimensional Patterns Using a Fuzzy Logic Discriminant Network

Advances in Fuzzy Systems

10.1155/2012/920920

2012

Vol 2012

pp. 1-7

Cited By ~ 4

Author(s):

Nick J. Pizzi

Witold Pedrycz

Keyword(s):

Fuzzy Logic

Pattern Classification

Curse Of Dimensionality

High Dimensional

Discriminant Functions

Adaptive Network

Classification Techniques

Classification Technique

Although many classification techniques exist to analyze patterns possessing straightforward characteristics, they tend to fail when the ratio of features to patterns is very large. This “curse of dimensionality” is especially prevalent in many complex, voluminous biomedical datasets acquired using the latest spectroscopic modalities. To address this pattern classification issue, we present a technique using an adaptive network of fuzzy logic connectives to combine class boundaries generated by sets of discriminant functions. We empirically evaluate the effectiveness of this classification technique by comparing it against two conventional benchmark approaches, both of which use feature averaging as a preprocessing phase.


Literature Study –Data Mining Techniques on Detecting Fradulent Activities in Credit Card

International Journal of Emerging Research in Management and Technology

10.23956/ijermt.v6i10.68

2017

Vol 6 (10)

pp. 60

Author(s):

S. K. Saravanan

G. N. K. Suresh Babu

Keyword(s):

Data Mining

Credit Card

Data Transfer

Study Data

Time Requirement

Literature Study

Data Mining Techniques

Online Transactions

Secured Data

Credit Card Usage

Today, most secure data transfer takes place over the internet, and the risks to secure data transfer are increasing at the same time. With the rise and progress of e-commerce, the use of credit cards (CC) for online transactions has also increased dramatically. Using a credit card for safe balance transfers has become a necessity of the times. Detecting credit card fraud is of the utmost importance because the number of fraudsters grows every day. The aim of this survey is to examine the issues associated with fraudulent credit card behavior using data mining methodologies. Data mining is a well-defined procedure that takes data as input and produces output in the form of models or patterns. This study is useful to any credit card provider choosing a suitable solution for its problem, and to researchers seeking a comprehensive assessment of the literature in this field.


CLASSIFICATION OF HIGH-DIMENSIONAL MICROARRAY DATA WITH A TWO-STEP PROCEDURE VIA A WILCOXON CRITERION AND MULTILAYER PERCEPTRON

International Journal of Computational Intelligence and Applications

10.1142/s1469026811002969

2011

Vol 10 (01)

pp. 1-14

Author(s):

VLADIMIR NIKULIN

TIAN-HSIANG HUANG

GEOFFREY J. MCLACHLAN

Keyword(s):

Data Mining

Feature Selection

High Dimensional

Second Step

Support Vector

Step Procedure

Leave One Out

Natural Combination

Feature Selection Techniques

The method presented in this paper is novel as a natural combination of two mutually dependent steps. Feature selection is a key element (first step) in our classification system, which was employed during the 2010 International RSCTC data mining (bioinformatics) Challenge. The second step may be implemented using any suitable classifier such as linear regression, support vector machine or neural networks. We conducted leave-one-out (LOO) experiments with several feature selection techniques and classifiers. Based on the LOO evaluations, we decided to use feature selection with the separation type Wilcoxon-based criterion for all final submissions. The method presented in this paper was tested successfully during the RSCTC data mining Challenge, where we achieved the top score in the Basic track.
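The two-step idea described in the abstract above (rank features with a Wilcoxon-type criterion, then train any suitable classifier, and evaluate with leave-one-out) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the score is a plain Wilcoxon rank-sum z-score with a normal approximation and no tie correction in the variance, the second step uses a nearest-centroid classifier as a simple stand-in, feature selection is repeated inside each leave-one-out fold, and all function names and the toy data are hypothetical.

```python
import math
import random


def rank_sum_z(values, labels):
    """Absolute z-score of the Wilcoxon rank-sum statistic for one feature.

    labels are 0/1. Ties get average ranks; the variance uses the standard
    normal approximation without a tie correction (a simplification).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n1 = sum(labels)
    n0 = len(labels) - n1
    w = sum(r for r, lab in zip(ranks, labels) if lab == 1)
    mean = n1 * (n1 + n0 + 1) / 2
    std = math.sqrt(n1 * n0 * (n1 + n0 + 1) / 12)
    return abs(w - mean) / std


def select_features(X, y, k):
    """Step 1: indices of the k features with the largest rank-sum z-score."""
    n_features = len(X[0])
    scores = [rank_sum_z([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def nearest_centroid(X, y, x, feats):
    """Step 2 (stand-in classifier): label of the closest class centroid."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroid = [sum(row[j] for row in rows) / len(rows) for j in feats]
        dist = sum((x[j] - c) ** 2 for j, c in zip(feats, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, k):
    """Leave-one-out accuracy, reselecting features inside every fold."""
    hits = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        feats = select_features(X_train, y_train, k)
        hits += nearest_centroid(X_train, y_train, X[i], feats) == y[i]
    return hits / len(X)


def make_toy_data(seed=0, n_per_class=10, n_features=50, shift=4.0):
    """Synthetic data: only features 0 and 1 differ between the classes."""
    rng = random.Random(seed)
    y = [0] * n_per_class + [1] * n_per_class
    X = [[rng.gauss(0, 1) + (shift if lab == 1 and j < 2 else 0.0)
          for j in range(n_features)] for lab in y]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print("LOO accuracy:", loo_accuracy(X, y, k=2))
```

Reselecting features inside each fold is a design choice made here so that the held-out sample never influences which features are kept; the abstract does not say how the authors arranged this.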
