A SPARSE GREEDY SELF-ADAPTIVE ALGORITHM FOR CLASSIFICATION OF DATA

2010 ◽  
Vol 02 (01) ◽  
pp. 97-114 ◽  
Author(s):  
ANKUR SRIVASTAVA ◽  
ANDREW J. MEADE

Kernels have become an integral part of most data classification algorithms. However, the kernel parameters are generally not optimized during learning. In this work a novel adaptive technique called Sequential Function Approximation (SFA) has been developed for classification that determines the values of the control and kernel hyper-parameters during learning. This tool constructs sparse radial basis function networks in a greedy fashion. Experiments were carried out on synthetic and real-world data sets where SFA had comparable performance to other popular classification schemes with parameters optimized by an exhaustive grid search.
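The idea of building a sparse radial basis function network greedily, while choosing kernel parameters during learning, can be illustrated with a minimal sketch. This is not the authors' SFA algorithm: the function names, the candidate width list, the rule of centring each new basis function on the sample with the largest residual, and the stopping tolerance are all assumptions made for illustration.

```python
import math

def rbf(x, center, width):
    """Gaussian radial basis function on scalar inputs."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def greedy_rbf_fit(xs, ys, widths=(0.25, 0.5, 1.0, 2.0), max_terms=10, tol=1e-3):
    """Add one basis function per step; returns a list of (center, width, coef)."""
    residual = list(ys)
    terms = []
    for _ in range(max_terms):
        # Assumed heuristic: centre the next basis on the worst-fitted sample.
        i = max(range(len(xs)), key=lambda k: abs(residual[k]))
        if abs(residual[i]) < tol:
            break
        best = None
        for w in widths:
            phi = [rbf(x, xs[i], w) for x in xs]
            # Least-squares coefficient for this single basis function.
            coef = sum(p * r for p, r in zip(phi, residual)) / sum(p * p for p in phi)
            new_residual = [r - coef * p for r, p in zip(residual, phi)]
            sse = sum(r * r for r in new_residual)
            if best is None or sse < best[0]:
                best = (sse, w, coef, new_residual)
        _, w, coef, residual = best
        terms.append((xs[i], w, coef))
    return terms

def predict(terms, x):
    """Evaluate the network; the sign of the output can serve as a class label."""
    return sum(coef * rbf(x, center, w) for center, w, coef in terms)
```

Because the width is picked per term from the residual it is fitted to, no separate grid search over kernel parameters is needed in this sketch, and the number of terms stays small when the residual shrinks quickly.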

2014 ◽  
Vol 2014 ◽  
pp. 1-14 ◽  
Author(s):  
Yunfeng Wu ◽  
Xin Luo ◽  
Fang Zheng ◽  
Shanshan Yang ◽  
Suxian Cai ◽  
...  

This paper presents a novel adaptive linear and normalized combination (ALNC) method that can be used to combine the component radial basis function networks (RBFNs) to implement better function approximation and regression tasks. The optimization of the fusion weights is obtained by solving a constrained quadratic programming problem. According to the instantaneous errors generated by the component RBFNs, the ALNC is able to perform the selective ensemble of multiple learners by adaptively adjusting the fusion weights from one instance to another. The results of the experiments on eight synthetic function approximation data sets and six benchmark regression data sets show that the ALNC method can effectively help the ensemble system achieve higher accuracy (measured in terms of mean-squared error) and better fidelity (characterized by the normalized correlation coefficient) of approximation, in relation to the popular simple average, weighted average, and Bagging methods.
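A simplified special case may help make the normalized, error-driven weighting concrete. The sketch below is an assumption, not the ALNC formulation itself: it minimizes the sum of w_i^2 * e_i^2 subject to the weights summing to one, which has the closed-form solution w_i proportional to 1 / e_i^2. The function names and the small `eps` guard against division by zero are hypothetical.

```python
def normalized_inverse_error_weights(errors, eps=1e-12):
    """Weights proportional to 1 / e_i^2, normalized to sum to 1."""
    inverse = [1.0 / (e * e + eps) for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]

def combine(predictions, weights):
    """Linear combination of the component outputs for one instance."""
    return sum(w * p for w, p in zip(weights, predictions))

# Two components with instantaneous errors 0.1 and 0.2 on one instance:
# the more accurate component receives roughly four times the weight.
weights = normalized_inverse_error_weights([0.1, 0.2])
fused = combine([1.0, 2.0], weights)
```

Recomputing the weights for each instance gives the instance-by-instance adaptation described above; a general constrained quadratic program would also account for correlations between component errors.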


2000 ◽  
Vol 10 (06) ◽  
pp. 453-465 ◽  
Author(s):  
MARK ORR ◽  
JOHN HALLAM ◽  
KUNIO TAKEZAWA ◽  
ALAN MURRAY ◽  
SEISHI NINOMIYA ◽  
...  

We describe a method for non-parametric regression which combines regression trees with radial basis function networks. The method is similar to that of Kubat,1 who was first to suggest such a combination, but has some significant improvements. We demonstrate the features of the new method, compare its performance with other methods on DELVE data sets and apply it to a real world problem involving the classification of soybean plants from digital images.


2014 ◽  
Author(s):  
Siddanagouda Somanagouda Patil ◽  
Narasimha Murty Musti ◽  
Ulavappa Basvanneppa Angadi

Classifying large grass genome sequences poses major challenges in functional genomics. The presence of motifs in grass genome chains can make it possible to predict the functional behavior of a grass genome. The correlation between grass genome properties and their motifs is not always obvious, since more than one motif may exist within a genome chain. Due to the complexity of this association, most pattern classification algorithms are either ineffective or time-consuming. A method for reducing high-dimensional data that uses the DAC technique is presented. The data are partitioned into multiple disjoint sets of equal size while preserving the original data distribution in each set. Then, multiple modules are created by using the data sets as independent training sets and classified into respective modules. Finally, the modules are combined to produce the final classification rules, containing all the previously extracted information. The methodology is tested using various grass genome data sets. Results indicate that the time efficiency of our algorithm is improved compared to other known data mining algorithms.
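The step of splitting the data into equal sets that each preserve the original distribution can be sketched as a stratified partition. This is only one plausible reading of the abstract: the function name and the round-robin assignment within each class are assumptions, and the paper's actual partitioning scheme is not given here.

```python
from collections import defaultdict

def stratified_partition(labels, k):
    """Split sample indices into k disjoint parts with similar class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    parts = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class out round-robin so every part gets its share.
        for position, index in enumerate(indices):
            parts[position % k].append(index)
    return parts

labels = ["a"] * 6 + ["b"] * 3
parts = stratified_partition(labels, 3)
```

Each part can then be used as an independent training set for one module, and the per-module rules merged afterwards.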


2018 ◽  
Vol 45 (3) ◽  
pp. 341-363 ◽  
Author(s):  
Muhammad Afzaal ◽  
Muhammad Usman ◽  
Alvis Fong

With the increase in online tourist reviews, discovering the sentiment expressed about a tourist place through the posted reviews is becoming a challenging task. The presence of various aspects discussed in user reviews makes it even harder to accurately extract and classify the sentiments. Aspect-based sentiment analysis aims to extract and classify users’ positive or negative orientation towards each aspect. Although several aspect-based sentiment classification methods have been proposed in the past, limited work has been targeted towards the automatic extraction of implicit, infrequent and co-referential aspects. Moreover, existing methods lack the ability to accurately classify the overall polarity of multi-aspect sentiments. This study aims to develop a predictive framework for aspect-based extraction and classification. The proposed framework utilises the semantic relations among review phrases to extract implicit and infrequent aspects for accurate sentiment predictions. Experiments have been performed using real-world data sets crawled from predominant tourist websites such as TripAdvisor and OpenTable. Experimental results and comparison with previously reported findings prove that the predictive framework not only extracts the aspects effectively but also improves the prediction accuracy of aspects.


Author(s):  
T. Novack ◽  
U. Stilla

In this work we focused on the classification of Urban Settlement Types (USTs) based on two datasets from the TerraSAR-X satellite acquired at ascending and descending look directions. These data sets comprise the intensity, amplitude and coherence images from the ascending and descending datasets. In accordance with most official UST maps, the urban blocks of our study site were considered as the elements to be classified. The UST classes considered in this paper are: Vegetated Areas, Single-Family Houses and Commercial and Residential Buildings. Three different groups of image attributes were utilized, namely: Relative Areas, Histogram of Oriented Gradients and geometrical and contextual attributes extracted from the nodes of a Max-Tree Morphological Profile. These image attributes were submitted to three powerful soft multi-class classification algorithms. In this way, each classifier output a membership value for each of the classes. These membership values were then treated as the potentials of the unary factors of a Conditional Random Fields (CRFs) model. The pairwise factors of the CRFs model were parameterised with a Potts function. The reclassification performed with the CRFs model enabled a slight increase of the classification’s accuracy from 76% to 79% out of 1926 urban blocks.
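How classifier memberships and a Potts function combine into a CRF energy can be shown with a small sketch. The details are assumptions rather than the paper's model: unary costs are taken as negative log memberships, the Potts penalty `beta` is a hypothetical constant, and the three-block chain is toy data.

```python
import math

def potts(label_a, label_b, beta):
    """Potts pairwise cost: zero for equal labels, beta otherwise."""
    return 0.0 if label_a == label_b else beta

def crf_energy(labels, memberships, edges, beta):
    """Sum of unary costs (-log membership) and Potts costs over neighbouring blocks."""
    unary = sum(-math.log(memberships[i][labels[i]]) for i in range(len(labels)))
    pairwise = sum(potts(labels[i], labels[j], beta) for i, j in edges)
    return unary + pairwise

# Toy example: three adjacent blocks, two classes; the middle block is ambiguous.
memberships = [[0.8, 0.2], [0.45, 0.55], [0.9, 0.1]]
edges = [(0, 1), (1, 2)]
independent = crf_energy([0, 1, 0], memberships, edges, beta=0.5)
smoothed = crf_energy([0, 0, 0], memberships, edges, beta=0.5)
```

In this toy case the smoothed labelling has the lower energy, which is the mechanism by which a reclassification of weakly classified blocks can follow their neighbours.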


Data mining helps to solve many problems in the area of medical diagnosis using real-world data. However, much of the data is unreliable, as it lacks desirable features and contains many gaps and errors. A complete set of data is a prerequisite for precise grouping and classification of a dataset. Preprocessing is a data mining technique that transforms the unrefined dataset into reliable and useful data. It resolves these issues and prepares raw data for the next level of processing. Discretization is a necessary step in data preprocessing. It reduces large ranges of numeric values to a small group of well-organized values. It offers remarkable improvements in speed and accuracy in classification. This paper investigates the impact of preprocessing on the classification process. This work implements three techniques, Naive Bayes, Logistic Regression, and SVM, to classify the Diabetes dataset. The experimental system is validated using discretization techniques and various classification algorithms.
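Discretization can be illustrated with equal-width binning, one common scheme. The abstract does not say which discretization method was used, so the choice of equal-width bins and the function name below are assumptions.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in 0 .. n_bins - 1."""
    low, high = min(values), max(values)
    if high == low:
        # All values identical: everything falls into the first bin.
        return [0] * len(values)
    width = (high - low) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - low) / width), n_bins - 1) for v in values]

bins = equal_width_bins([0, 5, 10], 2)  # -> [0, 1, 1]
```

The resulting bin indices can be passed to a classifier in place of the raw numeric attribute.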


2021 ◽  
Vol 13 (18) ◽  
pp. 3713
Author(s):  
Jie Liu ◽  
Xin Cao ◽  
Pingchuan Zhang ◽  
Xueli Xu ◽  
Yangyang Liu ◽  
...  

As an essential step in the restoration of Terracotta Warriors, the results of fragment classification directly affect the performance of fragment matching and splicing. However, most of the existing methods are based on traditional technology and have low classification accuracy. A practical and effective classification method for fragments is urgently needed. To this end, an attention-based multi-scale neural network named AMS-Net is proposed to extract significant geometric and semantic features. AMS-Net is a hierarchical structure consisting of a multi-scale set abstraction block (MS-BLOCK) and a fully connected (FC) layer. MS-BLOCK consists of a local-global layer (LGLayer) and an improved multi-layer perceptron (IMLP). With a multi-scale strategy, LGLayer can extract the local and global features from different scales in parallel. IMLP can concatenate the high-level and low-level features for classification tasks. Extensive experiments on the public data set (ModelNet40/10) and the real-world Terracotta Warrior fragments data set are conducted. The accuracy results with normal can achieve 93.52% and 96.22%, respectively. For real-world data sets, the accuracy is best among the existing methods. The robustness and effectiveness of the performance on the task of 3D point cloud classification are also investigated. It proves that the proposed end-to-end learning network is more effective and suitable for the classification of the Terracotta Warrior fragments.


Information ◽  
2020 ◽  
Vol 11 (12) ◽  
pp. 557
Author(s):  
Alexandre M. de Carvalho ◽  
Ronaldo C. Prati

One of the significant challenges in machine learning is the classification of imbalanced data. In many situations, standard classifiers cannot learn how to distinguish minority class examples from the others. Since many real problems are unbalanced, this problem has become very relevant and deeply studied today. This paper presents a new preprocessing method based on Delaunay tessellation and the preprocessing algorithm SMOTE (Synthetic Minority Over-sampling Technique), which we call DTO-SMOTE (Delaunay Tessellation Oversampling SMOTE). DTO-SMOTE constructs a mesh of simplices (in this paper, we use tetrahedrons) for creating synthetic examples. We compare results with five preprocessing algorithms (GEOMETRIC-SMOTE, SVM-SMOTE, SMOTE-BORDERLINE-1, SMOTE-BORDERLINE-2, and SMOTE), eight classification algorithms, and 61 binary-class data sets. For some classifiers, DTO-SMOTE has higher performance than others in terms of Area Under the ROC curve (AUC), Geometric Mean (GEO), and Generalized Index of Balanced Accuracy (IBA).
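For context, the interpolation step of plain SMOTE, which DTO-SMOTE builds on, can be sketched as follows. This is not DTO-SMOTE: no Delaunay mesh is constructed, and the function name, the neighbour count `k`, and the return of the two parent points alongside the synthetic one are assumptions for illustration. At least two minority points are required.

```python
import random

def smote_like_sample(minority, k=3, rng=None):
    """Create one synthetic minority point by interpolating towards a near neighbour."""
    rng = rng or random.Random()
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to the base point.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # position along the segment, in [0, 1)
    synthetic = [a + gap * (b - a) for a, b in zip(base, neighbour)]
    return base, neighbour, synthetic

minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
base, neighbour, synthetic = smote_like_sample(minority, k=2, rng=random.Random(0))
```

Plain SMOTE places new points on line segments between neighbours; per the abstract, DTO-SMOTE instead generates them from the simplices of a Delaunay mesh.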


2021 ◽  
Vol 2 (01) ◽  
pp. 01-09
Author(s):  
Alan Jahwar ◽  
Nawzat Ahmed

Microarray data plays a major role in diagnosing and treating cancer. In several microarray data sets, many gene fragments are not associated with the target diseases. Solving the gene selection problem therefore becomes important when analyzing large gene datasets. The key task is to find the genes that best represent the data, so that samples are classified with optimum accuracy. Different gene classification algorithms have been proposed in past studies; however, they suffer from selecting too many genes, particularly in high-dimensional microarray data. This paper reviews classification and feature selection on different microarray datasets, with a focus on swarm intelligence algorithms. We briefly explain microarray data and its types. We also introduce the most common swarm intelligence algorithms and review their use in gene selection for the classification of microarray data.

