PENGARUH SELEKSI FITUR CITRA TERHADAP KLASIFIKASI TINGKAT KESEGARAN DAGING SAPI LOKAL (The Effect of Image Feature Selection on the Classification of Local Beef Freshness Levels)

Identifying beef quality manually has drawbacks because human vision is limited and people differ in how they perceive object quality. Several studies have developed beef quality assessment methods based on image feature extraction. However, not all features contribute to highly accurate classification results. Classification is more efficient when it analyzes only the relevant features. Therefore, a feature selection process is required to keep relevant features and eliminate irrelevant ones, yielding more accurate and faster classification. One feature selection algorithm is the F-Score, a simple technique that measures the discrimination between two sets of real numbers. The features ranked lowest by the F-Score are eliminated one by one until the most relevant features remain. The test is carried out by analyzing the classification results in terms of sensitivity, specificity, and accuracy. The results show that, with F-Score feature selection, the most relevant features for classifying the freshness level of local beef with the K-Nearest Neighbor (KNN) method are the average color intensity R and the standard deviation, giving a sensitivity of 0.8, a specificity of 0.93, and an accuracy of 86%. Keywords: Classification, Feature Selection, F-Score, K-Nearest Neighbor, Local beef
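
As a rough illustration of the F-Score ranking this abstract describes, the sketch below uses the commonly cited two-class definition: the squared distances of each class mean from the overall mean, divided by the sum of the two within-class sample variances. The toy data, the function names `f_score` and `rank_features`, and the 0/1 labels are assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """F-Score of one feature, given its values in the positive and negative class."""
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1 denominator)
    return between / within


def rank_features(X, y):
    """Return (feature indices sorted from highest to lowest F-Score, raw scores)."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(f_score(pos, neg))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.9, 0.5], [0.8, 0.1], [1.0, 0.4], [0.2, 0.3], [0.1, 0.6], [0.3, 0.2]]
y = [1, 1, 1, 0, 0, 0]
order, scores = rank_features(X, y)
print(order, scores)
```

In a backward-elimination loop of the kind described above, the last index in `order` would be dropped, the classifier re-evaluated, and the process repeated.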

Health State Classification of a Spherical Tank Using a Hybrid Bag of Features and K-Nearest Neighbor

Applied Sciences ◽

10.3390/app10072525 ◽

2020 ◽

Vol 10 (7) ◽

pp. 2525 ◽

Cited By ~ 4

Author(s):

Md Junayed Hasan ◽

Jaeyoung Kim ◽

Cheol Hong Kim ◽

Jong-Myon Kim

Keyword(s):

Nearest Neighbor ◽

Selection Process ◽

Health State ◽

K Nearest Neighbor ◽

Bag Of Features ◽

State Classification ◽

Spherical Tank ◽

Art Works ◽

Health State Classification

Feature analysis has a great impact on determining the various health conditions of mechanical vessels. To balance traditional feature extraction against an automated feature selection process, this paper designs a hybrid bag of features (HBoF) for multiclass health state classification of spherical tanks. The proposed HBoF is composed of (a) acoustic emission (AE) features and (b) time- and frequency-based statistical features. A wrapper-based feature selection algorithm, Boruta, is used to extract the most intrinsic feature set from the HBoF. The selected feature matrix is passed to the multi-class k-nearest neighbor (k-NN) algorithm to differentiate among the normal condition (NC) and two faulty conditions (FC1 and FC2). Experimental results demonstrate that the proposed methodology achieves an average accuracy of 99.7% across all working conditions. Moreover, it outperforms existing state-of-the-art methods by at least 19.4%.
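
The final classification step can be sketched with a plain majority-vote k-NN. This is a minimal illustration, not the authors' pipeline: the Boruta selection step is omitted, and the two-dimensional feature vectors and the function name `knn_predict` are hypothetical. Only the class labels NC, FC1, and FC2 come from the abstract.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    neighbors = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical selected features for three health states.
train_X = [
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],     # normal condition
    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2],     # faulty condition 1
    [10.0, 0.0], [10.2, 0.1], [9.9, 0.3],   # faulty condition 2
]
train_y = ["NC"] * 3 + ["FC1"] * 3 + ["FC2"] * 3

print(knn_predict(train_X, train_y, [0.1, 0.1]))
print(knn_predict(train_X, train_y, [5.0, 5.1]))
print(knn_predict(train_X, train_y, [10.0, 0.2]))
```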

Feature Selection Algorithm for Hyperlipidemia Classification

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.701-702.110 ◽

2014 ◽

Vol 701-702 ◽

pp. 110-113

Author(s):

Qi Rui Zhang ◽

He Xian Wang ◽

Jiang Wei Qin

Keyword(s):

Feature Selection ◽

Nearest Neighbor ◽

Information Gain ◽

Classification Systems ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Set ◽

Document Frequency ◽

Selection Algorithms ◽

Term Weights

This paper reports a comparative study of feature selection algorithms on a hyperlipidemia data set. Three feature selection methods were evaluated: document frequency (DF), information gain (IG), and the χ² statistic (CHI). The classification systems use a vector to represent a document and use tfidfie (term frequency, inverted document frequency, and inverted entropy) to compute term weights. To compare the effectiveness of feature selection, we used three classification methods: Naïve Bayes (NB), k Nearest Neighbor (kNN), and Support Vector Machines (SVM). The experimental results show that IG and CHI significantly outperform DF, and that SVM and NB are more effective than kNN when the macro-averaged F1 measure is used. DF is suitable for large text classification tasks.
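
For readers unfamiliar with the χ² criterion, the sketch below scores one term against one class from a 2×2 contingency table, using the standard text-categorization form N(AD − CB)² / ((A + C)(B + D)(A + B)(C + D)). The tiny document set, the labels, and the helper names are hypothetical and only illustrate the calculation; they are not the paper's data or code.

```python
def term_class_counts(docs, labels, term, target):
    """Count documents by (contains term?, belongs to target class?)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1          # term present, in class
        elif has_term:
            b += 1          # term present, other class
        elif in_class:
            c += 1          # term absent, in class
        else:
            d += 1          # term absent, other class
    return a, b, c, d


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class table; 0.0 if a margin is empty."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom


# Hypothetical tokenized documents.
docs = [{"cholesterol", "diet"}, {"cholesterol", "statin"},
        {"weather", "diet"}, {"weather", "rain"}]
labels = ["med", "med", "other", "other"]

strong = chi_square(*term_class_counts(docs, labels, "cholesterol", "med"))
weak = chi_square(*term_class_counts(docs, labels, "diet", "med"))
print(strong, weak)
```

Terms would then be ranked by this score and only the top-scoring ones kept as features.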

Classification of Hadith Topic of Indonesian Translation Using K-Nearest Neighbor and Chi-Square

International Journal on Information and Communication Technology (IJoICT) ◽

10.21108/ijoict.v7i2.573 ◽

2021 ◽

Vol 7 (2) ◽

pp. 11-22

Author(s):

Ghinaa Zain Nabiilah ◽

Said Al Faraby ◽

Mahendra Dwifebri Purbolaksono

Keyword(s):

Feature Selection ◽

Everyday Life ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Way Of Life ◽

Chi Square ◽

The Law ◽

Test Scenarios

Hadith is the main way of life for Muslims besides the Qur'an and can be applied in everyday life. Hadith also contains all the words and deeds of the Prophet Muhammad, which are used as a source of Islamic law. Therefore, many readers, especially Muslims, are interested in studying hadith. However, the large number of hadiths makes it difficult for readers, particularly those still unfamiliar with Islam, to read them. We therefore conducted a study to classify hadith texts by type of teaching, so that readers can more easily get an overview or a reference when reading and searching for hadith by type of teaching. This study uses KNN for classification and chi-square for feature selection. We also carried out several test scenarios, including a modified stopword removal step in preprocessing and experiments with different k values for KNN, to determine the best performance. The best performance was obtained with k = 7 on KNN, without chi-square and with the modified stopword removal, giving a Hamming loss of 0.1042, or about 89.58% of the data correctly classified.
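
The link between the two reported figures is simple arithmetic: Hamming loss is the fraction of label slots predicted incorrectly, so 1 − 0.1042 = 0.8958. A minimal sketch of the metric follows, assuming binary indicator vectors per document; the toy labels and the function name are made up for illustration.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that differ between truth and prediction."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


# Hypothetical example: 2 documents, 3 possible labels each, 1 slot wrong.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
loss = hamming_loss(y_true, y_pred)
print(loss, 1 - loss)

# The figures quoted in the abstract.
reported_correct = 1 - 0.1042
print(round(reported_correct, 4))
```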

An Ensemble-Based Feature Selection and Classification of Gene Expression using Support Vector Machine, K-Nearest Neighbor, Decision Tree

2019 International Conference on Communication and Electronics Systems (ICCES) ◽

10.1109/icces45898.2019.9002041 ◽

2019 ◽

Author(s):

Anu J Nair ◽

Rizwana Rasheed ◽

KM Maheeshma ◽

LS Aiswarya ◽

K R Kavitha

Keyword(s):

Gene Expression ◽

Support Vector Machine ◽

Feature Selection ◽

Decision Tree ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor

Product Review Based Customer Sentiment Analysis using an Ensemble of mRMR and Forest Optimization Algorithm (FOA)

International Journal of Applied Metaheuristic Computing ◽

10.4018/ijamc.2022010107 ◽

2022 ◽

Vol 13 (1) ◽

pp. 0-0

Keyword(s):

Feature Selection ◽

Sentiment Analysis ◽

Optimization Algorithm ◽

Nearest Neighbor ◽

Hybrid Approach ◽

Support Vector ◽

K Nearest Neighbor ◽

Feature Selection Technique ◽

Feature Selection Problem

This research presents an approach to the feature selection problem for sentiment classification using an ensemble-based classifier. It includes a hybrid feature selection approach that combines the minimum redundancy and maximum relevance (mRMR) technique with the Forest Optimization Algorithm (FOA), i.e. mRMR-FOA. Before being applied to sentiment analysis, FOA was used as a feature selection technique on 10 different classification datasets publicly available in the UCI machine learning repository. Classifiers such as k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and Naïve Bayes (NB) were combined in an ensemble-based algorithm on the available datasets. The mRMR-FOA uses Blitzer’s dataset (customer reviews from an electronic products survey) to select the significant features. Sentiment classification was observed to improve by 12 to 18%. The results are further enhanced by the ensemble of k-NN, NB, and SVM, which reaches an accuracy of 88.47% on the sentiment classification task.

SELEÇÃO DE VARIÁVEIS PARA CATEGORIZAÇÃO DE AMOSTRAS QUÍMICAS (Variable Selection for the Categorization of Chemical Samples)

Eclética Química Journal ◽

10.26850/1678-4618eqj.v36.4.2011.p28-33 ◽

2017 ◽

Vol 36 (4) ◽

pp. 28

Author(s):

M. J. Anzanello

Keyword(s):

Nearest Neighbor ◽

Selection Process ◽

Pls Regression ◽

K Nearest Neighbor ◽

Wine Analysis ◽

Data Mining Tool ◽

Classification Technique ◽

Mining Tool ◽

Sensitivity Specificity ◽

Neighbor Classification

This paper presents a method to select the best variables for categorizing chemical samples into two classes, for example conforming or non-conforming. To that end, PLS regression is combined with a data mining tool, the k-Nearest Neighbor classification technique, through an iterative variable selection process. The recommended subset of variables is chosen based on several criteria: sensitivity, specificity, and percentage of retained variables. When applied to two datasets related to wine analysis and one associated with QSAR, the proposed method significantly reduced the number of variables required for classification while yielding better categorization performance than using all the original variables.
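
The selection criteria named above follow directly from a binary confusion matrix. The sketch below computes sensitivity, specificity, and accuracy for hypothetical predictions, where 1 stands for one class (say, conforming) and 0 for the other; the data and the function name `binary_metrics` are illustrative assumptions only.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, specificity, and accuracy for a two-class problem."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / len(y_true),
    }


# Hypothetical labels: five positive and five negative samples, one positive missed.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```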

Feature Selection and Classification of Microarray Data using MapReduce based ANOVA and K-Nearest Neighbor

Procedia Computer Science ◽

10.1016/j.procs.2015.06.035 ◽

2015 ◽

Vol 54 ◽

pp. 301-310 ◽

Cited By ~ 16

Author(s):

Mukesh Kumar ◽

Nitish Kumar Rath ◽

Amitav Swain ◽

Santanu Kumar Rath

Keyword(s):

Feature Selection ◽

Microarray Data ◽

Nearest Neighbor ◽

K Nearest Neighbor

Feature Selection Using Genetic Algorithms for the Generation of a Recognition and Classification of Children Activities Model Using Environmental Sound

Mobile Information Systems ◽

10.1155/2020/8617430 ◽

2020 ◽

Vol 2020 ◽

pp. 1-12 ◽

Cited By ~ 3

Author(s):

Antonio García-Dominguez ◽

Carlos E. Galván-Tejada ◽

Laura A. Zanella-Calzada ◽

Hamurabi Gamboa-Rosales ◽

Jorge I. Galván-Tejada ◽

...

Keyword(s):

Feature Selection ◽

Nearest Neighbor ◽

Recursive Partitioning ◽

Feature Model ◽

K Nearest Neighbor ◽

Audio Signals ◽

Environmental Sound ◽

Learning Techniques ◽

Original Size

In the area of recognition and classification of children's activities, numerous works have been proposed that make use of different data sources. Most of them use sensors embedded in children’s garments. This work proposes using environmental sound data to generate a model for recognizing and classifying children's activities through automatic learning techniques, optimized for application on mobile devices. First, a genetic algorithm is used for feature selection, reducing the original size of the dataset, which is an important aspect when working with the limited resources of a mobile device. To evaluate this process, five classification methods are applied: k-nearest neighbor (k-NN), nearest centroid (NC), artificial neural networks (ANNs), random forest (RF), and recursive partitioning trees (Rpart). Finally, the resulting models are compared by accuracy to identify the classification method that performs best in a model that identifies children's activity from audio signals. According to the results, the best performance is achieved by the five-feature model developed through RF, with an accuracy of 0.92, which leads to the conclusion that children's activity can be classified automatically from a reduced set of features with significant accuracy.
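
To make the genetic-algorithm step more concrete, here is a minimal sketch of GA-based feature selection over binary masks, with tournament selection, one-point crossover, bit-flip mutation, and elitism. It is not the authors' implementation: the toy fitness function stands in for a real classifier's accuracy, and every name and parameter value here (`ga_select`, `toy_fitness`, `INFORMATIVE`, population size, mutation rate) is an assumption chosen for illustration.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.1, seed=0):
    """Evolve a 0/1 feature mask; return (best mask, best fitness per generation)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        next_gen = [best[:]]  # elitism: carry the current best forward unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)
        history.append(fitness(best))
    return best, history


INFORMATIVE = {0, 2, 4}  # hypothetical: which toy features carry signal


def toy_fitness(mask):
    """Stand-in for classifier accuracy: reward useful features, penalize the rest."""
    gain = sum(bit for i, bit in enumerate(mask) if i in INFORMATIVE)
    cost = 0.5 * sum(bit for i, bit in enumerate(mask) if i not in INFORMATIVE)
    return gain - cost


best, history = ga_select(n_features=8, fitness=toy_fitness)
print(best, history[-1])
```

In practice the fitness function would train and validate a classifier on the features switched on by the mask, which is far more expensive than this toy score.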

Comparative between optimization feature selection by using classifiers algorithms on spam email

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v9i6.pp5479-5485 ◽

2019 ◽

Vol 9 (6) ◽

pp. 5479

Author(s):

Ghada Rawashdeh ◽

Rabiei Mamat ◽

Zuriana Binti Abu Bakar ◽

Noor Hafhizah Abd Rahim

Keyword(s):

Feature Selection ◽

Nearest Neighbor ◽

Harmony Search ◽

High Growth ◽

Support Vector ◽

Classification Algorithms ◽

K Nearest Neighbor ◽

Spam Filter ◽

Simulating Annealing

Spam mail has become a rising phenomenon in a world that has recently witnessed high growth in the volume of emails. This indicates the need to develop an effective spam filter. At present, classification algorithms for text mining are used to classify emails. This paper describes and evaluates the effectiveness of three popular classifiers combined with optimization-based feature selection methods, namely genetic algorithm, harmony search, particle swarm optimization, and simulated annealing. The research compares the K-nearest Neighbor (KNN), Naïve Bayesian (NB), and Support Vector Machine (SVM) classifiers on spam classification without feature selection, and also improves the reliability of feature selection by proposing optimization-based feature selection to reduce the number of unimportant features.

M-HMOGA: A New Multi-Objective Feature Selection Algorithm for Handwritten Numeral Classification

Journal of Intelligent Systems ◽

10.1515/jisys-2019-0064 ◽

2019 ◽

Vol 29 (1) ◽

pp. 1453-1467 ◽

Cited By ~ 6

Author(s):

Ritam Guha ◽

Manosij Ghosh ◽

Pawan Kumar Singh ◽

Ram Sarkar ◽

Mita Nasipuri

Keyword(s):

Feature Selection ◽

Nearest Neighbor ◽

Selection Process ◽

Indian Subcontinent ◽

Classification Problem ◽

Feature Subset ◽

K Nearest Neighbor ◽

Multi Objective ◽

Statistical Institute ◽

Comparison Of The Results

The feature selection process is very important in the field of pattern recognition: it selects the informative features so as to reduce the curse of dimensionality, thus improving the overall classification accuracy. In this paper, a new feature selection approach named Memory-Based Histogram-Oriented Multi-objective Genetic Algorithm (M-HMOGA) is introduced to identify the informative feature subset to be used for a pattern classification problem. The proposed M-HMOGA approach is applied to two recently used feature sets, namely Mojette transform and Regional Weighted Run Length features. The experiments are carried out on Bangla, Devanagari, and Roman numeral datasets, which are the three most popular scripts used in the Indian subcontinent. In-house Bangla and Devanagari script datasets and the Competition on Handwritten Digit Recognition (HDRC) 2013 Roman numeral dataset are used for evaluating our model. Moreover, as proof of robustness, we have applied an innovative approach of using different datasets for training and testing. We have used in-house Bangla and Devanagari script datasets for training the model, and the trained model is then tested on Indian Statistical Institute numeral datasets. For Roman numerals, we have used the HDRC 2013 dataset for training and the Modified National Institute of Standards and Technology dataset for testing. Comparison of the results obtained by the proposed model with the existing HMOGA and MOGA techniques clearly indicates the superiority of M-HMOGA over both of its ancestors. Moreover, the use of K-nearest neighbor as well as multi-layer perceptron as classifiers speaks for the classifier-independent nature of M-HMOGA.
The proposed M-HMOGA model uses only about 45–50% of the total feature set to achieve an increase of around 1% in classification ability when the same datasets are partitioned for training and testing, and only 35–45% of the features to achieve a 2–3% increase when different datasets are used for training and testing, in both cases relative to using all the features for classification.
