PENGARUH SELEKSI FITUR CITRA TERHADAP KLASIFIKASI TINGKAT KESEGARAN DAGING SAPI LOKAL (The Effect of Image Feature Selection on the Classification of Local Beef Freshness Levels)

Author(s):  
Titin Yulianti ◽  
Mareli Telaumbanua ◽  
Hery Dian Septama ◽  
Helmy Fitriawan ◽  
Afri Yudamson

Assessing beef quality manually has drawbacks because human vision has limitations and people differ in how they perceive object quality. Several studies have developed beef quality assessment methods based on image feature extraction. However, not all features help to obtain highly accurate classification results. Classification is more efficient when it analyzes only the relevant features. Therefore, a feature selection process is required to keep relevant features and eliminate irrelevant ones, so that classification becomes both more accurate and faster. One feature selection algorithm is the F-Score, a simple technique that measures the discrimination between two sets of real numbers. The features with the lowest F-Score ranking are eliminated one by one until the most relevant features are obtained. The test is carried out by analyzing the classification results in terms of sensitivity, specificity, and accuracy. The results show that F-Score feature selection yields the most relevant features for classifying the freshness level of local beef with the K-Nearest Neighbor (KNN) method. These features are the average color intensity R and the standard deviation, which give a sensitivity of 0.8, a specificity of 0.93, and an accuracy of 86%.

Keywords: Classification, Feature Selection, F-Score, K-Nearest Neighbor, Local beef
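The F-Score ranking described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the commonly used two-class F-Score definition (squared distance of each class mean from the overall mean, divided by the sum of the per-class sample variances), and the function names `f_score` and `rank_features` as well as the toy data are hypothetical.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """F-Score of one feature, given its values in the positive and negative class.

    Assumes a two-class problem and at least two samples per class.
    """
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1 denominator)
    # A feature with zero within-class spread is treated as maximally discriminative.
    return between / within if within > 0 else float("inf")


def rank_features(X, y):
    """Return (feature indices sorted from highest to lowest F-Score, raw scores).

    X is a list of feature rows, y a list of 0/1 labels.
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(f_score(pos, neg))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.0], [3.2, 5.0], [2.8, 3.0]]
y = [0, 0, 0, 1, 1, 1]
order, scores = rank_features(X, y)
print(order, scores)
```

Backward elimination then amounts to repeatedly dropping the last index in `order`, retraining the classifier, and keeping the subset with the best sensitivity, specificity, and accuracy.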

2020 ◽  
Vol 10 (7) ◽  
pp. 2525 ◽  
Author(s):  
Md Junayed Hasan ◽  
Jaeyoung Kim ◽  
Cheol Hong Kim ◽  
Jong-Myon Kim

Feature analysis plays an important role in determining the health condition of mechanical vessels. To balance traditional feature extraction against automated feature selection, this paper designs a hybrid bag of features (HBoF) for multiclass health state classification of spherical tanks. The proposed HBoF is composed of (a) acoustic emission (AE) features and (b) time- and frequency-based statistical features. A wrapper-based feature selection algorithm, Boruta, is used to extract the most relevant feature set from the HBoF. The selected feature matrix is passed to a multi-class k-nearest neighbor (k-NN) algorithm to differentiate among the normal condition (NC) and two faulty conditions (FC1 and FC2). Experimental results demonstrate that the proposed methodology achieves an average accuracy of 99.7% across all working conditions. Moreover, it outperforms existing state-of-the-art works by at least 19.4%.


2014 ◽  
Vol 701-702 ◽  
pp. 110-113
Author(s):  
Qi Rui Zhang ◽  
He Xian Wang ◽  
Jiang Wei Qin

This paper reports a comparative study of feature selection algorithms on a hyperlipidemia data set. Three feature selection methods were evaluated: document frequency (DF), information gain (IG), and the χ² statistic (CHI). The classification systems use a vector to represent a document and use tfidfie (term frequency, inverted document frequency, and inverted entropy) to compute term weights. To compare the effectiveness of feature selection, we used three classification methods: Naïve Bayes (NB), k Nearest Neighbor (kNN), and Support Vector Machines (SVM). The experimental results show that IG and CHI significantly outperform DF, and that SVM and NB are more effective than kNN when the macro-averaged F1 measure is used. DF is suitable for large-scale text classification tasks.
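As an illustration of the CHI criterion mentioned above, the sketch below computes the χ² statistic of a term with respect to a category from a 2×2 contingency table of document counts. It assumes the standard text-categorization form of the statistic; the function name `chi_square` and the counts are hypothetical and are not taken from the paper.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic of a term t with respect to a category k.

    a: documents in k that contain t
    b: documents outside k that contain t
    c: documents in k that do not contain t
    d: documents outside k that do not contain t
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the term or the category never varies
    return n * (a * d - c * b) ** 2 / denom


# Made-up counts: a term concentrated in the category vs. an evenly spread term.
strong = chi_square(40, 10, 10, 40)
neutral = chi_square(25, 25, 25, 25)
print(strong, neutral)
```

Terms are then ranked by this score and only the highest-scoring ones are kept as features; DF would rank terms simply by `a + b`.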


Author(s):  
Ghinaa Zain Nabiilah ◽  
Said Al Faraby ◽  
Mahendra Dwifebri Purbolaksono

Hadith is, besides the Qur'an, the main guide to life for Muslims and can be applied in everyday life. Hadith also contains the words and deeds of the Prophet Muhammad, which are used as a source of Islamic law. Therefore, many readers, especially Muslims, are interested in studying hadith. However, the large number of hadiths makes them difficult to read for readers who are still unfamiliar with Islam. We therefore conducted a study to classify hadith texts by type of teaching, so that readers can more easily get an overview or a reference when reading and searching for hadith by type of teaching. This study uses KNN for classification and the chi-square method for feature selection. We also carried out several test scenarios, including a modified stopword removal step in preprocessing and experiments with different values of k for KNN, to determine the best performance. The best performance was obtained with k = 7 for KNN, without chi-square and with the modified stopword removal, giving a Hamming loss of 0.1042, or about 89.58% of the data correctly classified.
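The loss reported above appears to be the Hamming loss, the fraction of individual label assignments that are wrong in a multi-label setting. The sketch below shows how it is computed and how 0.1042 maps to the quoted 89.58%. The function name `hamming_loss` and the toy label matrices are hypothetical.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label positions where the prediction differs from the truth.

    y_true and y_pred are lists of equal-length 0/1 label vectors, one per sample.
    """
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


# Made-up example: two samples, three labels each, one wrong label out of six.
toy_loss = hamming_loss([[1, 0, 1], [0, 1, 0]], [[1, 0, 0], [0, 1, 0]])

# The figure quoted in the abstract: a loss of 0.1042 leaves 1 - 0.1042 correct.
correct_share = 1 - 0.1042
print(toy_loss, round(correct_share * 100, 2))
```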


2022 ◽  
Vol 13 (1) ◽  
pp. 0-0

This research presents an approach to the feature selection problem for sentiment classification using an ensemble-based classifier. It uses a hybrid of the minimum redundancy and maximum relevance (mRMR) technique and the Forest Optimization Algorithm (FOA), i.e. mRMR-FOA, for feature selection. Before FOA was applied to sentiment analysis, it was used as a feature selection technique on 10 different classification datasets publicly available in the UCI machine learning repository. Classifiers such as k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and Naïve Bayes (NB) were combined in an ensemble-based algorithm on the available datasets. mRMR-FOA is applied to Blitzer's dataset (customer reviews of electronic products) to select the significant features. Sentiment classification was observed to improve by 12 to 18%. The results are further improved by the ensemble of k-NN, NB, and SVM, which reaches an accuracy of 88.47% on the sentiment classification task.


2017 ◽  
Vol 36 (4) ◽  
pp. 28
Author(s):  
M. J. Anzanello

This paper presents a method to select the best variables for categorizing chemical samples into two classes, such as conforming and non-conforming. To that end, PLS regression is combined with a data mining tool, the k-Nearest Neighbor classification technique, through an iterative variable selection process. The recommended subset of variables is chosen based on several criteria: sensitivity, specificity, and the percentage of retained variables. When applied to two datasets related to wine analysis and one associated with QSAR, the proposed method significantly reduced the number of variables required for classification while yielding better categorization performance than using all original variables.
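The selection criteria named above can be computed from the confusion counts of a two-class classifier. The following sketch uses the usual definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)); the function name, the label encoding, and the toy predictions are assumptions made for illustration.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a two-class problem.

    Assumes both classes occur at least once in y_true.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive sample correctly detected
            else:
                fn += 1  # positive sample missed
        else:
            if pred == positive:
                fp += 1  # negative sample flagged as positive
            else:
                tn += 1  # negative sample correctly rejected
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels: 4 positive and 6 negative samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

The percentage of retained variables is simply the size of the selected subset divided by the number of original variables.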


2015 ◽  
Vol 54 ◽  
pp. 301-310 ◽  
Author(s):  
Mukesh Kumar ◽  
Nitish Kumar Rath ◽  
Amitav Swain ◽  
Santanu Kumar Rath

2020 ◽  
Vol 2020 ◽  
pp. 1-12 ◽  
Author(s):  
Antonio García-Dominguez ◽  
Carlos E. Galván-Tejada ◽  
Laura A. Zanella-Calzada ◽  
Hamurabi Gamboa-Rosales ◽  
Jorge I. Galván-Tejada ◽  
...  

In the area of recognition and classification of children's activities, numerous works have been proposed that make use of different data sources. Most of them use sensors embedded in children's garments. This work proposes using environmental sound data to build a model for recognizing and classifying children's activities through machine learning techniques, optimized for use on mobile devices. First, a genetic algorithm is used for feature selection, reducing the original size of the dataset, which is an important aspect when working with the limited resources of a mobile device. To evaluate this process, five different classification methods are applied: k-nearest neighbor (k-NN), nearest centroid (NC), artificial neural networks (ANNs), random forest (RF), and recursive partitioning trees (Rpart). Finally, the resulting models are compared on accuracy to identify the classification method that performs best at identifying children's activities from audio signals. According to the results, the best performance is achieved by the five-feature model developed with RF, which obtains an accuracy of 0.92. This supports the conclusion that children's activities can be classified automatically from a reduced set of features with significant accuracy.


Author(s):  
Ghada Rawashdeh ◽  
Rabiei Mamat ◽  
Zuriana Binti Abu Bakar ◽  
Noor Hafhizah Abd Rahim

Spam mail has become a growing phenomenon in a world that has recently seen rapid growth in the volume of email. This indicates the need to develop an effective spam filter. At present, text mining classification algorithms are used to classify emails. This paper describes and evaluates the effectiveness of three popular classifiers combined with optimization-based feature selection methods such as the genetic algorithm, harmony search, particle swarm optimization, and simulated annealing. The research compares K-nearest Neighbor (KNN), Naïve Bayes (NB), and Support Vector Machine (SVM) classifiers on spam classification without feature selection, and also improves the reliability of feature selection by proposing optimization-based feature selection to reduce the number of unimportant features.


2019 ◽  
Vol 29 (1) ◽  
pp. 1453-1467 ◽  
Author(s):  
Ritam Guha ◽  
Manosij Ghosh ◽  
Pawan Kumar Singh ◽  
Ram Sarkar ◽  
Mita Nasipuri

The feature selection process is very important in the field of pattern recognition: it selects the informative features so as to mitigate the curse of dimensionality, thus improving the overall classification accuracy. In this paper, a new feature selection approach named Memory-Based Histogram-Oriented Multi-objective Genetic Algorithm (M-HMOGA) is introduced to identify the informative feature subset to be used for a pattern classification problem. The proposed M-HMOGA approach is applied to two recently used feature sets, namely Mojette transform and Regional Weighted Run Length features. The experiments are carried out on Bangla, Devanagari, and Roman numeral datasets, which represent three of the most popular scripts used in the Indian subcontinent. In-house Bangla and Devanagari script datasets and the Competition on Handwritten Digit Recognition (HDRC) 2013 Roman numeral dataset are used for evaluating our model. Moreover, as proof of robustness, we have applied an innovative approach of using different datasets for training and testing. We have used in-house Bangla and Devanagari script datasets for training the model, and the trained model is then tested on Indian Statistical Institute numeral datasets. For Roman numerals, we have used the HDRC 2013 dataset for training and the Modified National Institute of Standards and Technology dataset for testing. Comparison of the results obtained by the proposed model with the existing HMOGA and MOGA techniques clearly indicates the superiority of M-HMOGA over both of its ancestors. Moreover, the use of K-nearest neighbor as well as multi-layer perceptron as classifiers speaks for the classifier-independent nature of M-HMOGA. Compared with using all features, the proposed M-HMOGA model achieves an increase of around 1% in classification ability with only about 45–50% of the total feature set when the same datasets are partitioned for training and testing, and an increase of 2–3% with only 35–45% of the features when different datasets are used for training and testing.

