A SPARSE GREEDY SELF-ADAPTIVE ALGORITHM FOR CLASSIFICATION OF DATA

Kernels have become an integral part of most data classification algorithms. However, the kernel parameters are generally not optimized during learning. In this work a novel adaptive technique called Sequential Function Approximation (SFA) has been developed for classification that determines the values of the control and kernel hyper-parameters during learning. This tool constructs sparse radial basis function networks in a greedy fashion. Experiments were carried out on synthetic and real-world data sets where SFA had comparable performance to other popular classification schemes with parameters optimized by an exhaustive grid search.

Adaptive Linear and Normalized Combination of Radial Basis Function Networks for Function Approximation and Regression

Mathematical Problems in Engineering ◽

10.1155/2014/913897 ◽

2014 ◽

Vol 2014 ◽

pp. 1-14 ◽

Cited By ~ 8

Author(s):

Yunfeng Wu ◽

Xin Luo ◽

Fang Zheng ◽

Shanshan Yang ◽

Suxian Cai ◽

...

Keyword(s):

Radial Basis Function ◽

Basis Function ◽

Function Approximation ◽

Mean Squared Error ◽

Weighted Average ◽

Radial Basis Function Networks ◽

Data Sets ◽

Radial Basis ◽

Synthetic Function ◽

Adaptively Adjusting

This paper presents a novel adaptive linear and normalized combination (ALNC) method that combines component radial basis function networks (RBFNs) to improve function approximation and regression. The fusion weights are optimized by solving a constrained quadratic programming problem. Using the instantaneous errors generated by the component RBFNs, the ALNC performs a selective ensemble of multiple learners by adaptively adjusting the fusion weights from one instance to another. Experiments on eight synthetic function approximation tasks and six benchmark regression data sets show that the ALNC method helps the ensemble system achieve higher accuracy (measured in terms of mean-squared error) and better fidelity (characterized by the normalized correlation coefficient) of approximation than the popular simple average, weighted average, and Bagging methods.

COMBINING REGRESSION TREES AND RADIAL BASIS FUNCTION NETWORKS

International Journal of Neural Systems ◽

10.1142/s0129065700000363 ◽

2000 ◽

Vol 10 (06) ◽

pp. 453-465 ◽

Cited By ~ 18

Author(s):

MARK ORR ◽

JOHN HALLAM ◽

KUNIO TAKEZAWA ◽

ALAN MURRAY ◽

SEISHI NINOMIYA ◽

...

Keyword(s):

Radial Basis Function ◽

Basis Function ◽

Regression Trees ◽

Radial Basis Function Networks ◽

Data Sets ◽

Parametric Regression ◽

Real World Problem ◽

Radial Basis ◽

Soybean Plants

We describe a method for non-parametric regression which combines regression trees with radial basis function networks. The method is similar to that of Kubat,1 who was first to suggest such a combination, but has some significant improvements. We demonstrate the features of the new method, compare its performance with other methods on DELVE data sets and apply it to a real world problem involving the classification of soybean plants from digital images.

Divide and Conquer approach for Genome Classification based on subclass characterization

10.1101/003475 ◽

2014 ◽

Author(s):

Siddanagouda Somanagouda Patil ◽

Narasimha Murty Musti ◽

Ulavappa Basvanneppa Angadi

Keyword(s):

Original Data ◽

Divide And Conquer ◽

Classification Algorithms ◽

Data Sets ◽

Genome Data ◽

Data Mining Algorithms ◽

Functional Behavior ◽

A Genome ◽

Mining Algorithms

Classifying large grass genome sequences poses major challenges in functional genomics. The presence of motifs in grass genome chains can make it possible to predict the functional behavior of a grass genome. The correlation between grass genome properties and their motifs is not always obvious, since more than one motif may exist within a genome chain. Because of the complexity of this association, most pattern classification algorithms are either ineffective or time-consuming. We present an approach to reducing high-dimensional data that uses a divide-and-conquer (DAC) technique. The data are partitioned into multiple disjoint sets of equal size while preserving the original data distribution in each set. Multiple modules are then created by using these data sets as independent training sets, and the data are classified into their respective modules. Finally, the modules are combined to produce the final classification rules, which contain all the previously extracted information. The methodology is tested on various grass genome data sets. The results indicate that our algorithm is more time-efficient than other known data mining algorithms.
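
The partitioning step described in this abstract (equal-sized disjoint sets that each keep the original class distribution) can be illustrated with a small sketch. The abstract does not say how the split is carried out, so the round-robin dealing below, the function name `stratified_partition`, and the toy labels are all assumptions made for illustration, not the authors' procedure.

```python
import random
from collections import defaultdict


def stratified_partition(labels, n_parts, seed=0):
    """Split sample indices into n_parts disjoint sets with a similar class mix.

    Hypothetical helper: one simple way to preserve the class distribution.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    parts = [[] for _ in range(n_parts)]
    offset = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal indices round-robin, continuing where the previous class
        # stopped, so that part sizes stay balanced across classes.
        for k, idx in enumerate(members):
            parts[(offset + k) % n_parts].append(idx)
        offset += len(members)
    return parts


# Toy data: six samples of class "a" and three of class "b".
labels = ["a"] * 6 + ["b"] * 3
parts = stratified_partition(labels, n_parts=3)
```

Each resulting part could then serve as an independent training set for one module; how the per-module rules are combined afterwards is not specified in the abstract and is left out here.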

Predictive aspect-based sentiment classification of online tourist reviews

Journal of Information Science ◽

10.1177/0165551518789872 ◽

2018 ◽

Vol 45 (3) ◽

pp. 341-363 ◽

Cited By ~ 4

Author(s):

Muhammad Afzaal ◽

Muhammad Usman ◽

Alvis Fong

Keyword(s):

Prediction Accuracy ◽

Sentiment Classification ◽

Semantic Relations ◽

Automatic Extraction ◽

Data Sets ◽

Classification Methods ◽

Real World Data ◽

The Past ◽

Negative Orientation

With the growing number of online tourist reviews, discovering the prevailing sentiment about a tourist place from the posted reviews is becoming a challenging task. The variety of aspects discussed in user reviews makes it even harder to accurately extract and classify the sentiments. Aspect-based sentiment analysis aims to extract and classify users' positive or negative orientation towards each aspect. Although several aspect-based sentiment classification methods have been proposed in the past, limited work has targeted the automatic extraction of implicit, infrequent and co-referential aspects. Moreover, existing methods lack the ability to accurately classify the overall polarity of multi-aspect sentiments. This study aims to develop a predictive framework for aspect-based extraction and classification. The proposed framework utilises the semantic relations among review phrases to extract implicit and infrequent aspects for accurate sentiment predictions. Experiments were performed using real-world data sets crawled from prominent tourist websites such as TripAdvisor and OpenTable. Experimental results and comparison with previously reported findings show that the predictive framework not only extracts the aspects effectively but also improves the prediction accuracy of aspects.

A parsimonious SVM model selection criterion for classification of real-world data sets via an adaptive population-based algorithm

Neural Computing and Applications ◽

10.1007/s00521-017-2930-y ◽

2017 ◽

Vol 30 (11) ◽

pp. 3421-3429 ◽

Cited By ~ 2

Author(s):

Omid Naghash Almasi ◽

Mohammad Hassan Khooban

Keyword(s):

Model Selection ◽

Real World ◽

Selection Criterion ◽

Population Based ◽

Data Sets ◽

Real World Data ◽

World Data ◽

Svm Model ◽

Population Based Algorithm

DISCRIMINATION OF URBAN SETTLEMENT TYPES BASED ON SPACE-BORNE SAR DATASETS AND A CONDITIONAL RANDOM FIELDS MODEL

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsannals-ii-3-w4-143-2015 ◽

2015 ◽

Vol II-3/W4 ◽

pp. 143-148 ◽

Cited By ~ 1

Author(s):

T. Novack ◽

U. Stilla

Keyword(s):

Random Fields ◽

Conditional Random Fields ◽

Residential Buildings ◽

Classification Algorithms ◽

Data Sets ◽

Single Family ◽

Urban Settlement ◽

Membership Value ◽

Multi Class Classification

In this work we focused on the classification of Urban Settlement Types (USTs) based on two datasets from the TerraSAR-X satellite acquired at ascending and descending look directions. These data sets comprise the intensity, amplitude and coherence images from the ascending and descending acquisitions. In accordance with most official UST maps, the urban blocks of our study site were considered as the elements to be classified. The UST classes considered in this paper are: Vegetated Areas, Single-Family Houses and Commercial and Residential Buildings. Three different groups of image attributes were used, namely: Relative Areas, Histogram of Oriented Gradients and geometrical and contextual attributes extracted from the nodes of a Max-Tree Morphological Profile. These image attributes were submitted to three powerful soft multi-class classification algorithms, so that each classifier output a membership value for each of the classes. These membership values were then treated as the potentials of the unary factors of a Conditional Random Fields (CRFs) model. The pairwise factors of the CRFs model were parameterised with a Potts function. The reclassification performed with the CRFs model yielded a slight increase in classification accuracy, from 76% to 79% over 1926 urban blocks.
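
To make the CRF formulation above concrete, the following is a minimal sketch of an energy with unary terms taken from classifier membership values and a Potts pairwise term. The abstract does not state the inference method, the neighbourhood definition, or the Potts weight, so the iterated conditional modes (ICM) routine, the chain of three blocks, the parameter `beta`, and all numbers below are illustrative assumptions.

```python
import math


def potts_energy(labels, memberships, edges, beta):
    """Unary costs (-log membership) plus a Potts penalty per disagreeing edge."""
    unary = sum(-math.log(memberships[i][labels[i]]) for i in range(len(labels)))
    pairwise = sum(beta for i, j in edges if labels[i] != labels[j])
    return unary + pairwise


def icm(memberships, edges, beta, sweeps=5):
    """Iterated conditional modes: greedily relabel one node at a time.

    Assumes every membership value is strictly positive (log is taken).
    """
    n = len(memberships)
    n_classes = len(memberships[0])
    neighbours = [[] for _ in range(n)]
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)

    # Start from the per-node most likely class.
    labels = [max(range(n_classes), key=lambda c: m[c]) for m in memberships]
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            def cost(c):
                disagree = sum(labels[j] != c for j in neighbours[i])
                return -math.log(memberships[i][c]) + beta * disagree

            best = min(range(n_classes), key=cost)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


# Hypothetical example: three adjacent blocks, two classes.
memberships = [[0.9, 0.1], [0.45, 0.55], [0.8, 0.2]]
edges = [(0, 1), (1, 2)]
initial = [0, 1, 0]
smoothed = icm(memberships, edges, beta=0.5)
```

In this toy case the weakly classified middle block is pulled towards the label of its two neighbours, which is the kind of contextual reclassification a Potts term encourages.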

Provissional Access For Improving Classification Accuracy On Diabetes Dataset

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f9389.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 5245-5248

Keyword(s):

Data Mining ◽

Experimental System ◽

Classification Algorithms ◽

Data Mining Technique ◽

Real World Data ◽

Mining Technique ◽

Complete Set ◽

The Impact ◽

Speed And Accuracy

Data mining helps to solve many problems in medical diagnosis using real-world data. However, much of the data is unreliable because it lacks desirable features and contains many gaps and errors. A complete set of data is a prerequisite for precise grouping and classification of a dataset. Preprocessing is a data mining technique that transforms an unrefined dataset into reliable and useful data. It resolves these issues and prepares the raw data for the next level of processing. Discretization is a necessary step in the data preprocessing task. It reduces a large range of numeric values to a small group of well-organized values and offers remarkable improvements in classification speed and accuracy. This paper investigates the impact of preprocessing on the classification process. This work implements three techniques, namely Naive Bayes, Logistic Regression, and SVM, to classify the Diabetes dataset. The experimental system is validated using discretization techniques and various classification algorithms.
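
As an illustration of the discretization step, the sketch below maps numeric values to a small number of bins. The abstract does not name the discretization scheme used, so equal-width binning is assumed here as one common choice; the function name `equal_width_bins` and the sample readings are made up for the example.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so a single bin is enough.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin edge,
    # so clamp it into the final bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Made-up numeric attribute values, not taken from the Diabetes dataset.
readings = [85, 99, 117, 140, 168, 199]
bins = equal_width_bins(readings, n_bins=3)
```

The resulting bin indices can be fed to a classifier in place of the raw values; other schemes such as equal-frequency or entropy-based binning would follow the same interface.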

AMS-Net: An Attention-Based Multi-Scale Network for Classification of 3D Terracotta Warrior Fragments

Remote Sensing ◽

10.3390/rs13183713 ◽

2021 ◽

Vol 13 (18) ◽

pp. 3713

Author(s):

Jie Liu ◽

Xin Cao ◽

Pingchuan Zhang ◽

Xueli Xu ◽

Yangyang Liu ◽

...

Keyword(s):

Real World ◽

Data Sets ◽

Semantic Features ◽

Real World Data ◽

Global Features ◽

Data Set ◽

Multi Scale ◽

Public Data ◽

High Level

As an essential step in the restoration of the Terracotta Warriors, fragment classification directly affects the performance of fragment matching and splicing. However, most existing methods are based on traditional techniques and have low classification accuracy, so a practical and effective classification method for fragments is urgently needed. To this end, an attention-based multi-scale neural network named AMS-Net is proposed to extract significant geometric and semantic features. AMS-Net is a hierarchical structure consisting of a multi-scale set abstraction block (MS-BLOCK) and a fully connected (FC) layer. MS-BLOCK consists of a local-global layer (LGLayer) and an improved multi-layer perceptron (IMLP). With a multi-scale strategy, LGLayer extracts local and global features from different scales in parallel. IMLP concatenates the high-level and low-level features for classification tasks. Extensive experiments are conducted on the public data set (ModelNet40/10) and the real-world Terracotta Warrior fragments data set. With normals as input, the accuracy reaches 93.52% and 96.22%, respectively. On the real-world data sets, the accuracy is the best among existing methods. The robustness and effectiveness of the method on the task of 3D point cloud classification are also investigated. The results indicate that the proposed end-to-end learning network is more effective and suitable for the classification of the Terracotta Warrior fragments.

DTO-SMOTE: Delaunay Tessellation Oversampling for Imbalanced Data Sets

Information ◽

10.3390/info11120557 ◽

2020 ◽

Vol 11 (12) ◽

pp. 557

Author(s):

Alexandre M. de Carvalho ◽

Ronaldo C. Prati

Keyword(s):

Machine Learning ◽

Geometric Mean ◽

Imbalanced Data ◽

Sampling Technique ◽

Classification Algorithms ◽

Data Sets ◽

Delaunay Tessellation ◽

Minority Class ◽

Imbalanced Data Sets

One of the significant challenges in machine learning is the classification of imbalanced data. In many situations, standard classifiers cannot learn how to distinguish minority class examples from the others. Because many real-world problems are imbalanced, this issue has become highly relevant and is widely studied. This paper presents a new preprocessing method based on Delaunay tessellation and the preprocessing algorithm SMOTE (Synthetic Minority Over-sampling Technique), which we call DTO-SMOTE (Delaunay Tessellation Oversampling SMOTE). DTO-SMOTE constructs a mesh of simplices (in this paper, we use tetrahedrons) for creating synthetic examples. We compare it against five preprocessing algorithms (GEOMETRIC-SMOTE, SVM-SMOTE, SMOTE-BORDERLINE-1, SMOTE-BORDERLINE-2, and SMOTE), using eight classification algorithms and 61 binary-class data sets. For some classifiers, DTO-SMOTE performs better than the other methods in terms of Area Under the ROC curve (AUC), Geometric Mean (GEO), and Generalized Index of Balanced Accuracy (IBA).
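
The idea of creating a synthetic example inside a simplex can be sketched as a random convex combination of the simplex vertices. The abstract does not describe how DTO-SMOTE builds the mesh, selects a simplex, or draws the weights, so the uniform barycentric sampling below, the function name `sample_in_simplex`, and the unit tetrahedron are assumptions for illustration only.

```python
import random


def sample_in_simplex(vertices, rng):
    """Return a random convex combination of the simplex vertices."""
    # Normalised exponential draws give barycentric weights that are
    # uniformly distributed over the simplex (a Dirichlet(1, ..., 1) sample).
    raw = [rng.expovariate(1.0) for _ in vertices]
    total = sum(raw)
    weights = [r / total for r in raw]
    dim = len(vertices[0])
    return [sum(w * v[d] for w, v in zip(weights, vertices)) for d in range(dim)]


rng = random.Random(0)
# Hypothetical tetrahedron whose corners stand in for minority-class points.
tetrahedron = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]
synthetic = sample_in_simplex(tetrahedron, rng)
```

Because the weights are non-negative and sum to one, the synthetic point always lies inside the tetrahedron, in contrast to plain SMOTE, which interpolates along a line segment between two points.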

Swarm Intelligence Algorithms in Gene Selection Profile Based on Classification of Microarray Data: A Review

Journal of Applied Science and Technology Trends ◽

10.38094/jastt20161 ◽

2021 ◽

Vol 2 (01) ◽

pp. 01-09

Author(s):

Alan Jahwar ◽

Nawzat Ahmed

Keyword(s):

Swarm Intelligence ◽

Microarray Data ◽

Gene Selection ◽

Classification Algorithms ◽

Data Sets ◽

Paper Briefly ◽

Large Gene ◽

Microarray Datasets ◽

Selection Of

Microarray data plays a major role in diagnosing and treating cancer. In several microarray data sets, many gene fragments are not associated with the target diseases. Solving the gene selection problem therefore becomes important when analyzing large gene datasets. The key task is to represent the genes well enough to classify the samples with optimum accuracy. Different gene classification algorithms have been proposed in past studies; however, they struggle with selecting among many genes, especially in high-dimensional microarray data. This paper reviews classification and feature selection on different microarray datasets, with a focus on swarm intelligence algorithms. We briefly explain microarray data and its types, introduce the most common swarm intelligence algorithms, and review their use in gene selection profiles based on the classification of microarray data.
