An Efficient Classification Algorithm for Traditional Textile Patterns from Different Cultures Based on Structures

Textiles play an important role in many cultures, and many have been digitised. They are three-dimensional objects with complex structures, especially archaeological fabric specimens and artifact textiles made by hand by traditional craftsmen. In this article, we propose a novel algorithm for classifying textiles based on their structures. First, a hypergraph is used to represent the textile structure. Second, multisets of k-neighbourhoods are extracted from the hypergraph and converted into a single feature vector that represents each textile. Then, the k-neighbourhood vectors are classified using seven of the most popular supervised learning methods. Finally, we experimentally evaluate the different variants of our approach on a data set of 1,600 textile samples using the 4-fold cross-validation technique. The experimental results indicate that, across the variants, the best classification accuracies are 0.999 with LR, 0.994 with LDA, 0.996 with KNN, 0.994 with CART, 0.998 with NB, 0.974 with SVM, and 0.999 with NNM.


Rancang Bangun Sistem Pakar untuk Deteksi Dini Katarak Menggunakan Algoritma C4.5 (Design and Development of an Expert System for Early Cataract Detection Using the C4.5 Algorithm)

Jurnal ULTIMA Computing ◽

10.31937/sk.v7i2.232 ◽

2016 ◽

Vol 7 (2) ◽

pp. 48-58 ◽

Cited By ~ 1

Author(s):

Ivana Herliana W. Jayawardanu ◽

Seng Hansun

Keyword(s):

Machine Learning ◽

Cross Validation ◽

Training Data ◽

Data Set ◽

Decision Factors ◽

C4.5 Algorithm ◽

Index Terms ◽

Validation Technique ◽

Fold Cross Validation

In 2010, 51% of 39 million cases of blindness were caused by cataract. In 2013, 1.8% of 1,027,763 Indonesian people suffered from cataract. Half of them had not yet been treated because they were unaware of the disease. Therefore, in this research, we tried to build a system that can detect cataract early, as an ophthalmologist would. The system uses the C4.5 algorithm, which receives a 150-sample training data set as input and produces a set of rules that can be used as decision factors. To test the system, the k-fold cross-validation technique was used with k equal to 10. The analysis shows that the system's accuracy is 93.2% in detecting cataract and 80.5% in detecting the type of cataract a patient might have. Index terms: C4.5 algorithm, cataract, k-fold cross validation, machine learning
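C4.5 ranks candidate attributes by gain ratio, that is, information gain divided by the split information of the attribute. The following is a minimal sketch of that criterion for a categorical attribute. The toy records and the function names `entropy` and `gain_ratio` are illustrative assumptions; they are not taken from the paper's data or implementation.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of a categorical attribute, the split criterion used by C4.5."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: entropy before the split minus weighted entropy after it.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records: (blurred vision?, diagnosis).
blurred = ["yes", "yes", "no", "no", "yes", "no"]
diagnosis = ["cataract", "cataract", "healthy", "healthy", "cataract", "healthy"]
ratio = gain_ratio(blurred, diagnosis)
```

In this toy case the attribute separates the two classes perfectly, so the gain ratio is at its maximum; an attribute that carries no information about the class scores zero.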


The Animal Classification: An Evaluation of Different Transfer Learning Pipeline

Mekatronika ◽

10.15282/mekatronika.v3i1.6680 ◽

2021 ◽

Vol 3 (1) ◽

pp. 27-31

Author(s):

Ken-ji Ee ◽

Ahmad Fakhri Bin Ab. Nasir ◽

Anwar P. P. Abdul Majeed ◽

Mohd Azraai Mohd Razman ◽

Nur Hafieza Ismail

Keyword(s):

Transfer Learning ◽

Classification System ◽

Cross Validation ◽

Support Vector ◽

Svm Classifier ◽

Average Classification Accuracy ◽

Validation Technique ◽

Search Approach ◽

Fold Cross Validation

An animal classification system is a technology that automatically classifies the class (type) of an animal and is useful in many applications. Many types of learning models have recently been applied to this technology. Nonetheless, it is worth noting that extracting and classifying animal features is non-trivial for a successful animal classification system, particularly in the deep learning approach. Transfer Learning (TL) has been demonstrated to be a powerful tool for extracting essential features. However, the use of such a method in animal classification applications is somewhat limited. The present study aims to determine a suitable TL-conventional classifier pipeline for animal classification. VGG16 and VGG19 were used to extract features and were then coupled with either a k-Nearest Neighbour (k-NN) or a Support Vector Machine (SVM) classifier. Prior to that, a total of 4000 images were gathered, covering five classes: cows, goats, buffalos, dogs, and cats. The data were split in a ratio of 80:20 for training and testing. The classifiers' hyperparameters were tuned by a grid search approach that uses the five-fold cross-validation technique. The study demonstrated that the best TL pipeline is VGG16 with an optimised SVM, which yielded an average classification accuracy of 0.975. The findings of the present investigation could facilitate animal classification applications, e.g. for monitoring animals in the wild.
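The tuning step described above, a grid search scored by five-fold cross-validation, can be sketched in plain Python. This is only an illustration under stated assumptions: the features are hypothetical one-dimensional values rather than VGG16/VGG19 features, only the neighbour count of a k-NN classifier is tuned, and the helper names `knn_predict`, `cv_accuracy` and `grid_search` are invented for this sketch.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Predict the majority label among the k nearest training points (1-D features)."""
    nearest = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_accuracy(data, k_neighbours, n_folds=5):
    """Mean accuracy of k-NN over n_folds interleaved cross-validation folds."""
    scores = []
    for fold in range(n_folds):
        # Every n_folds-th sample forms the test fold; the rest is training data.
        test = data[fold::n_folds]
        train = [d for i, d in enumerate(data) if i % n_folds != fold]
        hits = sum(knn_predict(train, x, k_neighbours) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / n_folds


def grid_search(data, grid, n_folds=5):
    """Return (best_k, best_score) over the candidate neighbour counts in grid."""
    results = {k: cv_accuracy(data, k, n_folds) for k in grid}
    best = max(results, key=results.get)
    return best, results[best]


# Hypothetical stand-in features: class "a" lies in 0-9, class "b" in 20-29.
data = [(float(x), "a") for x in range(10)] + [(float(x), "b") for x in range(20, 30)]
best_k, best_score = grid_search(data, grid=[1, 3, 5])
```

In a pipeline like the one in the abstract, the grid search would run on the 80% training portion only, with the remaining 20% held back for the final accuracy estimate.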


A Feature Selection Algorithm for Anomaly Detection in Grid Environment Using k-fold Cross Validation Technique

Advances in Intelligent Systems and Computing - Recent Advances on Soft Computing and Data Mining ◽

10.1007/978-3-319-51281-5_62 ◽

2016 ◽

pp. 619-630

Author(s):

Dahliyusmanto ◽

Tutut Herawan ◽

Syefrida Yulina ◽

Abdul Hanan Abdullah

Keyword(s):

Feature Selection ◽

Anomaly Detection ◽

Cross Validation ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Grid Environment ◽

Validation Technique ◽

Fold Cross Validation


Coordinate Transformation between Global and Local Datums Based on Artificial Neural Network with K-Fold Cross-Validation: A Case Study, Ghana

Earth Sciences Research Journal ◽

10.15446/esrj.v23n1.63860 ◽

2019 ◽

Vol 23 (1) ◽

pp. 67-77 ◽

Cited By ~ 3

Author(s):

Yao Yevenyo Ziggah ◽

Hu Youjian ◽

Alfonso Rodrigo Tierra ◽

Prosper Basommi Laari

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Coordinate Transformation ◽

Cross Validation ◽

Data Partitioning ◽

Transformation Model ◽

Data Partition ◽

Data Set ◽

Artificial Neural ◽

Fold Cross Validation

The popularity of the Artificial Neural Network (ANN) methodology has been growing in a wide variety of areas in geodesy and the geospatial sciences. Its ability to perform coordinate transformation between different datums has been well documented in the literature. In applications of ANN methods to coordinate transformation, usually only the train-test (hold-out cross-validation) approach has been used to evaluate performance. Here, the data set is divided into two disjoint subsets, namely training (model building) and testing (model validation). However, one major drawback of the hold-out cross-validation procedure is inappropriate data partitioning. An improper split of the data could lead to high variance and bias in the results. Besides, hold-out cross-validation is not suitable when the data set is sparse. For these reasons, the K-fold cross-validation approach has been recommended. Consequently, this study, for the first time, explored the potential of the K-fold cross-validation method for assessing the performance of a radial basis function neural network (RBFNN) and the Bursa-Wolf model under a data-insufficient situation in the Ghana geodetic reference network. The statistical analysis of the results revealed that incorrect data partitioning could lead to false reporting of the predictive performance of the transformation model. The findings revealed that the RBFNN and the Bursa-Wolf model produced transformation accuracies of 0.229 m and 0.469 m, respectively. The maximum horizontal errors given by the RBFNN and the Bursa-Wolf model were 0.881 m and 2.131 m, respectively. The obtained results are applicable under the cadastral surveying and plan production requirements set by the Ghana Survey and Mapping Division. This study will contribute to the use of the K-fold cross-validation approach in developing countries that have the same sparse data situation as Ghana, as well as in the geodetic sciences, where ANN users seldom apply this statistical resampling technique.
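The contrast between hold-out and K-fold partitioning can be made concrete with a small index splitter. This is a minimal sketch, not the authors' procedure: the function name `k_fold_indices`, the seed, and the sample count of 19 are hypothetical choices made for illustration.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Hypothetical sparse data set of 19 points split into 5 folds.
folds = list(k_fold_indices(19, 5))
tested = sorted(i for _, test_idx in folds for i in test_idx)
```

Unlike a single hold-out split, every sample appears in a test fold exactly once, so the performance estimate does not hinge on one particular partition of a small data set.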


Classification of Brain Tumors from MRI Images Using a Convolutional Neural Network

Applied Sciences ◽

10.3390/app10061999 ◽

2020 ◽

Vol 10 (6) ◽

pp. 1999 ◽

Cited By ~ 7

Author(s):

Milica M. Badža ◽

Marko Č. Barjaktarović

Keyword(s):

Neural Network ◽

Machine Learning ◽

Brain Tumors ◽

Convolutional Neural Network ◽

Cross Validation ◽

Magnetic Resonance Images ◽

Generalization Capability ◽

Data Set ◽

Fold Cross Validation

The classification of brain tumors is performed by biopsy, which is not usually conducted before definitive brain surgery. The improvement of technology and machine learning can help radiologists in tumor diagnostics without invasive measures. A machine-learning algorithm that has achieved substantial results in image segmentation and classification is the convolutional neural network (CNN). We present a new CNN architecture for brain tumor classification of three tumor types. The developed network is simpler than already-existing pre-trained networks, and it was tested on T1-weighted contrast-enhanced magnetic resonance images. The performance of the network was evaluated using four approaches: combinations of two 10-fold cross-validation methods and two databases. The generalization capability of the network was tested with one of the 10-fold methods, subject-wise cross-validation, and the improvement was tested by using an augmented image database. The best result for the 10-fold cross-validation method was obtained for the record-wise cross-validation on the augmented data set, and, in that case, the accuracy was 96.56%. With good generalization capability and good execution speed, the newly developed CNN architecture could be used as an effective decision-support tool for radiologists in medical diagnostics.
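The difference between record-wise and subject-wise cross-validation is that the latter keeps all images of one patient in the same fold, so no patient contributes to both training and testing. A minimal sketch of such a fold assignment follows; the greedy balancing rule, the function name `subject_wise_folds`, and the patient identifiers are assumptions made for illustration and are not the paper's procedure.

```python
from collections import defaultdict


def subject_wise_folds(subject_ids, k):
    """Assign record indices to k folds so all records of a subject share one fold."""
    records_by_subject = defaultdict(list)
    for record, subject in enumerate(subject_ids):
        records_by_subject[subject].append(record)
    folds = [[] for _ in range(k)]
    # Greedy balancing: place the largest subjects first, always into the smallest fold.
    for subject in sorted(records_by_subject, key=lambda s: -len(records_by_subject[s])):
        smallest = min(folds, key=len)
        smallest.extend(records_by_subject[subject])
    return folds


# Hypothetical data: 8 image records from 4 patients, split into 2 folds.
subjects = ["p1", "p1", "p1", "p2", "p2", "p3", "p3", "p4"]
folds = subject_wise_folds(subjects, 2)
```

A record-wise split would shuffle the eight records freely, which can place images of the same patient on both sides of the split and make the reported accuracy look better than the model's ability to generalise to new patients.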


Modeling Baseline Energy Using Artificial Neural Network – A Small Dataset Approach

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v12.i2.pp662-669 ◽

2018 ◽

Vol 12 (2) ◽

pp. 662

Author(s):

Wan Nazirah Wan Md Adnan ◽

Nofri Yenita Dahlan ◽

Ismail Musirin

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Cross Validation ◽

Model Development ◽

Energy Model ◽

Ann Model ◽

Data Set ◽

Validation Technique ◽

Artificial Neural ◽

Small Dataset

In this work, baseline energy model development using an Artificial Neural Network (ANN) with two resampling techniques, Cross Validation (CV) and Bootstrap (BS), is presented. Resampling techniques are used to examine the ability of the ANN model to deal with a small dataset. Working days, class days and Cooling Degree Days (CDD) are used as ANN inputs, while the ANN output is monthly electricity consumption. The coefficient of correlation (R) is used as the performance function to evaluate model accuracy. For this analysis, R is calculated for the entire data set (R_all) and separately for the training set (R_train), validation set (R_valid) and testing set (R_test). The closer R is to 1, the greater the similarity between the targeted and predicted outputs. Two different models with varying numbers of neurons are developed and compared. It can be concluded that all models are capable of training the network. The Artificial Neural Network with Bootstrap Cross Validation technique (ANN-BSCV) outperforms the Artificial Neural Network with Cross Validation technique (ANN-CV). The 3-6-1 ANN-BSCV, with R_train = 0.95668, R_valid = 0.97553, R_test = 0.85726 and R_all = 0.94079, is selected as the baseline energy model to predict energy consumption for Option C IPMVP.
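The performance measure R can be computed directly from targets and predictions. The sketch below assumes that R denotes the Pearson correlation coefficient, which the abstract does not state explicitly; the function name `correlation` and the consumption values are hypothetical.

```python
from math import sqrt


def correlation(targets, predictions):
    """Pearson correlation coefficient R between targets and predictions."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    # Co-variation of the two series around their means.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    return cov / sqrt(var_t * var_p)


# Hypothetical monthly consumption targets and model outputs (kWh).
targets = [100.0, 120.0, 140.0, 160.0]
predictions = [110.0, 130.0, 150.0, 170.0]
r_all = correlation(targets, predictions)
```

Note that the predictions here are offset by a constant 10 kWh yet still give R close to 1, which is why R is usually read alongside an error measure rather than on its own.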


Analyzing performance of classifiers for medical datasets

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.15.11370 ◽

2018 ◽

Vol 7 (2.15) ◽

pp. 136 ◽

Cited By ~ 1

Author(s):

Rosaida Rosly ◽

Mokhairi Makhtar ◽

Mohd Khalid Awang ◽

Mohd Isa Awang ◽

Mohd Nordin Abdul Rahman

Keyword(s):

Breast Cancer ◽

Cross Validation ◽

Ensemble Methods ◽

Data Sets ◽

Ensemble Classifiers ◽

Classification Models ◽

Data Set ◽

Mining Tool ◽

Fold Cross Validation

This paper analyses the performance of classification models using single classifiers and combinations with ensemble methods, with the Breast Cancer Wisconsin and Hepatitis data sets used as training data sets. It presents a comparison of different classifiers based on 10-fold cross validation using a data mining tool. In this experiment, various classifiers are implemented, including three popular ensemble methods (boosting, bagging and stacking) for the combination. The results show that for the Breast Cancer Wisconsin data set, the single Naïve Bayes (NB) classifier and the bagging+NB combination achieved the same, highest accuracy (97.51%) compared to other combinations of ensemble classifiers. For the Hepatitis data set, the results show that the combination of stacking and Multi-Layer Perceptron (MLP) achieved a higher accuracy, at 86.25%. Using ensemble classifiers may therefore improve the results. In future, a multi-classifier approach will be proposed by introducing a fusion at the classification level between these classifiers to obtain classifications with higher accuracy.


Multiclass Kernel Function Evaluation

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.542-543.1438 ◽

2012 ◽

Vol 542-543 ◽

pp. 1438-1442

Author(s):

Ting Hua Wang ◽

Cai Yun Cai ◽

Yan Liao

Keyword(s):

Cross Validation ◽

Selection Criterion ◽

Feature Space ◽

Function Evaluation ◽

Support Vector ◽

Computationally Efficient ◽

Computational Overhead ◽

Vector Machines ◽

Validation Technique ◽

Fold Cross Validation

The kernel is a key component of support vector machines (SVMs) and other kernel methods. Based on the data distributions of classes in the feature space, this paper proposes a model selection criterion to evaluate the goodness of a kernel in a multiclass classification scenario. This criterion is computationally efficient and is differentiable with respect to the kernel parameters. Compared with the k-fold cross validation technique, which is often regarded as a benchmark, this criterion is found to yield about the same performance with much less computational overhead.


Importance of Holidays for Short Term Load Forecasting Using Adaptive Neural Fuzzy Inference System

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.433-440.3959 ◽

2012 ◽

Vol 433-440 ◽

pp. 3959-3963 ◽

Cited By ~ 1

Author(s):

Bayram Akdemir ◽

Nurettin Çetinkaya

Keyword(s):

Energy Level ◽

Cross Validation ◽

Fuzzy Inference ◽

Load Forecasting ◽

Percentage Error ◽

Data Sets ◽

Data Set ◽

Inference System ◽

Peak Energy ◽

Fold Cross Validation

In distribution systems, load forecasting is one of the major management problems for maintaining energy flow, protecting the system, and managing it economically. To manage the system, the next step of the load characteristic must be inferred from historical data sets. For forecasting, not only historical parameters are used; external parameters such as weather conditions, seasons and population are also important for forecasting the next behavior of the load characteristic. Holidays and weekdays have different effects on energy consumption in any country. In this study, the aim is to forecast the peak energy level for the next hour and to compare the effects of weekdays and holidays on peak energy needs. Energy consumption data sets have nonlinear characteristics, and fitting a curve to them is not easy because of this nonlinearity and the large number of parameters. To forecast the peak energy level, an adaptive neural fuzzy inference system is used, and the hourly effects of holidays and weekdays on the peak energy level are discussed. The output values of the artificial intelligence model are evaluated using two-fold cross validation and mean absolute percentage error. The two-fold cross-validation error, expressed as mean absolute percentage error, is 3.51, and the data set that includes holidays gives higher accuracy than the data set without holidays. Total success increased by 2.4%.
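The error measure named above, mean absolute percentage error (MAPE), averages the absolute forecast error relative to the actual value. A minimal sketch follows; the function name `mape` and the load values are hypothetical and are not taken from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical hourly peak loads (MW) and their forecasts.
actual = [200.0, 250.0, 400.0]
forecast = [190.0, 260.0, 400.0]
error = mape(actual, forecast)
```

The three relative errors here are 5%, 4% and 0%, so the MAPE is about 3%. In a two-fold cross-validation the same measure would be computed once per fold, each half of the data serving as the test set in turn, and the two values averaged.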


High Accurate and a Variant of k-fold Cross Validation Technique for Predicting the Decision Tree Classifier Accuracy

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.c8403.0110321 ◽

2021 ◽

Vol 10 (2) ◽

pp. 105-110

Author(s):

D. Mabuni ◽

S. Aquter Babu

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Classification Accuracy ◽

Cross Validation ◽

Training Dataset ◽

Decision Tree Classification ◽

Testing Dataset ◽

Tree Classifier ◽

Validation Technique ◽

Fold Cross Validation

In machine learning, data usage is a more important criterion than the logic of the program. With very large and moderately sized datasets it is possible to obtain robust and high classification accuracies, but not with small and very small datasets. In particular, only large training datasets have the potential to produce robust decision tree classification results. Classification results obtained using only one pair of training and testing datasets are not reliable. The cross validation technique uses many random folds of the same dataset for training and validation. To obtain reliable and statistically correct classification results, the same algorithm must be applied to different pairs of training and validation datasets. To overcome the problem of using only a single training dataset and a single testing dataset, the existing k-fold cross validation technique uses a cross validation plan to obtain improved decision tree classification accuracy results. In this paper, a new cross validation technique called prime fold is proposed, tested thoroughly by experiment, and verified using many benchmark UCI machine learning datasets. It is observed that the decision tree classification accuracy results obtained with prime fold are far better than those of existing techniques for estimating decision tree classification accuracy.
