A Bonferroni Mean Based Fuzzy K Nearest Centroid Neighbor Classifier

K-nearest neighbor (KNN) is an effective nonparametric classifier that determines the neighbors of a point based only on distance proximity. Its classification performance suffers from outliers in small-sample-size datasets and deteriorates on datasets with class imbalance. We propose a local Bonferroni Mean based Fuzzy K-Nearest Centroid Neighbor (BM-FKNCN) classifier that assigns the class label of a query sample based on the nearest local centroid mean vector, which better represents the underlying statistics of the dataset. The proposed classifier is robust to outliers because the Nearest Centroid Neighborhood (NCN) concept also considers the spatial distribution and symmetrical placement of the neighbors. It can also overcome class domination among its neighbors in datasets with class imbalance because it averages all the centroid vectors from each class to adequately interpret the distribution of the classes. The BM-FKNCN classifier is tested on datasets from the Knowledge Extraction based on Evolutionary Learning (KEEL) repository and benchmarked against classification results from the KNN, Fuzzy-KNN (FKNN), BM-FKNN and FKNCN classifiers. The experimental results show that BM-FKNCN achieves the highest overall average classification accuracy, 89.86%, compared to the other four classifiers.
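
The Bonferroni mean named in the abstract can be illustrated with a minimal sketch. The code below implements only the standard scalar Bonferroni mean with parameters p and q; how BM-FKNCN applies it to local centroid vectors is not stated in the abstract, so the function name `bonferroni_mean` and its use here are illustrative assumptions, not the authors' implementation.

```python
def bonferroni_mean(values, p=1.0, q=1.0):
    """Standard Bonferroni mean of non-negative numbers.

    BM^{p,q}(x) = ( 1/(n(n-1)) * sum_{i != j} x_i^p * x_j^q )^(1/(p+q))
    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(values)
    if n < 2:
        raise ValueError("the Bonferroni mean needs at least two values")
    if p + q == 0:
        raise ValueError("p + q must be non-zero")
    total = 0.0
    for i, xi in enumerate(values):
        for j, xj in enumerate(values):
            if i != j:  # pairs of distinct positions only
                total += (xi ** p) * (xj ** q)
    return (total / (n * (n - 1))) ** (1.0 / (p + q))


if __name__ == "__main__":
    # With q = 0 the Bonferroni mean reduces to a power mean of order p.
    print(bonferroni_mean([1.0, 2.0, 3.0], p=1.0, q=0.0))
    print(bonferroni_mean([1.0, 2.0, 3.0], p=1.0, q=1.0))
```

Setting q = 0 and p = 1 recovers the arithmetic mean, which is a convenient sanity check when choosing p and q.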

A New Nearest Centroid Neighbor Classifier Based on K Local Means Using Harmonic Mean Distance

Information ◽

10.3390/info9090234 ◽

2018 ◽

Vol 9 (9) ◽

pp. 234 ◽

Cited By ~ 9

Author(s):

Sumet Mehta ◽

Xiangjun Shen ◽

Jiangping Gou ◽

Dejiao Niu

Keyword(s):

Small Sample Size ◽

Classification Performance ◽

Error Rates ◽

Small Sample ◽

Harmonic Mean ◽

Local Means ◽

Local Mean ◽

Real World Datasets ◽

Nearest Neighbour Classifier ◽

Query Sample

The K-nearest neighbour classifier is a very effective and simple non-parametric technique in pattern classification; however, it considers only distance closeness, not the geometrical placement of the k neighbors. Also, its classification performance is highly influenced by the neighborhood size k and by existing outliers. In this paper, we propose a new local mean based k-harmonic nearest centroid neighbor (LMKHNCN) classifier in order to consider both distance-based proximity and the spatial distribution of the k neighbors. In our method, the k nearest centroid neighbors in each class are found first and used to compute k different local mean vectors, whose harmonic mean distance to the query sample is then computed. Lastly, the query sample is assigned to the class with the minimum harmonic mean distance. The experimental results on twenty-six real-world datasets show that the proposed LMKHNCN classifier achieves lower error rates, particularly in small sample-size situations, and that it is less sensitive to the parameter k than the four related KNN-based classifiers.
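
The steps described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes Euclidean distance, a greedy nearest centroid neighbor search, local mean vectors formed from the first 1, 2, ..., k centroid neighbors, and a harmonic mean distance of the form k / sum(1 / d_i). All function names are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def nearest_centroid_neighbors(query, points, k):
    """Greedy NCN search: each new neighbor is the point that keeps the
    centroid of the selected set closest to the query."""
    remaining = list(points)
    selected = []
    while remaining and len(selected) < k:
        best = min(
            remaining,
            key=lambda p: euclidean(query, mean_vector(selected + [p])),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


def harmonic_mean_distance(query, vectors, eps=1e-12):
    # eps guards against division by zero when a vector equals the query.
    return len(vectors) / sum(1.0 / (euclidean(query, v) + eps) for v in vectors)


def lmkhncn_predict(query, samples, labels, k):
    """Assign the class whose local mean vectors have the smallest
    harmonic mean distance to the query (assumed reading of the abstract)."""
    scores = {}
    for cls in set(labels):
        pts = [s for s, l in zip(samples, labels) if l == cls]
        ncn = nearest_centroid_neighbors(query, pts, k)
        # Assumption: the i-th local mean is the mean of the first i neighbors.
        local_means = [mean_vector(ncn[:i]) for i in range(1, len(ncn) + 1)]
        scores[cls] = harmonic_mean_distance(query, local_means)
    return min(scores, key=scores.get)


if __name__ == "__main__":
    samples = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
    labels = ["a", "a", "a", "b", "b", "b"]
    print(lmkhncn_predict([0.2, 0.2], samples, labels, k=2))
```

The centroid-based search is what lets the neighborhood reflect geometrical placement: a point on the opposite side of the query can be preferred over a slightly closer point on the same side, because it pulls the centroid toward the query.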

Fuzzy Maximum Scatter Discriminant Analysis and its Application to Face Recognition

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.317-319.150 ◽

2011 ◽

Vol 317-319 ◽

pp. 150-153

Author(s):

Wan Li Feng ◽

Shang Bing Gao

Keyword(s):

Discriminant Analysis ◽

Nearest Neighbor ◽

Small Sample Size ◽

Small Sample ◽

K Nearest Neighbor ◽

Small Sample Size Problem ◽

Fisher Discriminant Analysis ◽

Singularity Problem ◽

Scatter Matrix ◽

Fisher Discriminant

In this paper, a reformative scatter difference discriminant criterion (SDDC) with fuzzy set theory is studied. Using the difference between the between-class and within-class scatter as the discriminant criterion effectively overcomes the singularity problem of the within-class scatter matrix, which arises from the small sample size problem in classical Fisher discriminant analysis. However, the conventional SDDC assumes the same level of relevance of each sample to its corresponding class. Therefore, a fuzzy maximum scatter difference analysis (FMSDA) algorithm is proposed, in which the fuzzy k-nearest neighbor (FKNN) is used to obtain the distribution information of the original samples. This information is used to redefine the corresponding scatter matrices, which differ from those of the conventional SDDC and are effective for extracting discriminative features from overlapping (outlier) samples. Experiments conducted on the FERET face database demonstrate the effectiveness of the proposed method.
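
The singularity argument can be made concrete by comparing the two criteria. The notation below is an assumption, since the abstract gives no formulas: S_b and S_w denote the between-class and within-class scatter matrices, w is a projection direction, and C is a non-negative balancing constant in the commonly used form of the scatter difference criterion.

```latex
% Classical Fisher criterion: maximizing it leads to the eigenproblem
% S_w^{-1} S_b w = \lambda w, which is ill-posed when S_w is singular.
J_F(\mathbf{w}) = \frac{\mathbf{w}^{\top} S_b\, \mathbf{w}}{\mathbf{w}^{\top} S_w\, \mathbf{w}}

% Scatter difference criterion (assumed form): no inverse of S_w is needed.
J_D(\mathbf{w}) = \mathbf{w}^{\top}\left(S_b - C\, S_w\right)\mathbf{w},
\qquad \mathbf{w}^{\top}\mathbf{w} = 1, \quad C \ge 0
```

Under this form the optimal directions are the leading eigenvectors of the symmetric matrix S_b - C S_w, which exist whether or not S_w is invertible.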

Comparative analysis of breast cancer detection in mammograms and thermograms

Biomedical Engineering / Biomedizinische Technik ◽

10.1515/bmt-2014-0047 ◽

2015 ◽

Vol 60 (1) ◽

Cited By ~ 7

Author(s):

Marina Milosevic ◽

Dragan Jankovic ◽

Aleksandar Peulic

Keyword(s):

Nearest Neighbor ◽

Region Of Interest ◽

Texture Features ◽

Classification Performance ◽

Support Vector ◽

K Nearest Neighbor ◽

Characteristic Analysis ◽

Analysis Society ◽

Fold Cross Validation ◽

Neighbor Classifier

In this paper, we present a system based on feature extraction techniques for detecting abnormal patterns in digital mammograms and thermograms. A comparative study of texture-analysis methods is performed for three image groups: mammograms from the Mammographic Image Analysis Society mammographic database; digital mammograms from the local database; and thermography images of the breast. Also, we present a procedure for the automatic separation of the breast region from the mammograms. Computed features based on gray-level co-occurrence matrices are used to evaluate the effectiveness of textural information possessed by mass regions. A total of 20 texture features are extracted from the region of interest. The ability of the feature set to differentiate abnormal from normal tissue is investigated using a support vector machine classifier, a Naive Bayes classifier and a K-Nearest Neighbor classifier. To evaluate the classification performance, five-fold cross-validation and receiver operating characteristic analysis were performed.
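
As a rough illustration of features based on gray-level co-occurrence matrices, the sketch below builds a normalized co-occurrence matrix for one pixel offset and derives three common texture features from it. The abstract does not list its 20 features, so the choice of contrast, energy and homogeneity, the single offset, and the function names are assumptions made for illustration.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for offset (dx, dy).

    `image` is a list of rows of integer gray levels in [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, energy and homogeneity of a normalized GLCM."""
    n = len(p)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    contrast = sum(p[i][j] * (i - j) ** 2 for i, j in pairs)
    energy = sum(p[i][j] ** 2 for i, j in pairs)
    homogeneity = sum(p[i][j] / (1 + (i - j) ** 2) for i, j in pairs)
    return {"contrast": contrast, "energy": energy, "homogeneity": homogeneity}


if __name__ == "__main__":
    region = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3],
    ]
    print(texture_features(glcm(region, levels=4)))
```

In practice such features would be computed for several offsets and directions over the region of interest and then passed to the classifiers being compared.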

Clustering of Cancer Data Based on Stiefel Manifold for Multiple Views

10.21203/rs.3.rs-154286/v1 ◽

2021 ◽

Author(s):

Jing Tian ◽

Jianping Zhao ◽

Chun-hou Zheng

Keyword(s):

Nearest Neighbor ◽

Search Algorithm ◽

Small Sample Size ◽

Small Sample ◽

Stiefel Manifold ◽

K Nearest Neighbor ◽

Cancer Data ◽

Clustering Problem ◽

Multiple Datasets ◽

Cluster Class

Background: In recent years, various sequencing techniques have been used to collect biomedical omics datasets. It is usually possible to obtain multiple types of omics data from a single patient sample. Clustering of these datasets has proved to be valuable for biological and medical research and helpful for revealing data structures from multiple collections. However, such data often have a small sample size and high dimension, and it is difficult to find a suitable integration method for structural analysis of multiple datasets. Results: In this paper, a multi-view clustering method based on the Stiefel manifold (MCSM) is proposed. Firstly, we established a binary optimization model for the simultaneous clustering problem. Secondly, the optimization problem is solved by a linear search algorithm based on the Stiefel manifold. Finally, we integrated the clustering results obtained from three omics by using the k-nearest neighbor method. We applied this approach to four cancer datasets from TCGA. The results show that our method is superior to several state-of-the-art methods, which depend on the hypothesis that the underlying omics cluster class is the same. Conclusion: In particular, our approach performs better when the underlying clusters are inconsistent. For patients with different subtypes, both consistent and differential clusters can be identified at the same time.

A new fuzzy k-nearest neighbor classifier based on the Bonferroni mean

Pattern Recognition Letters ◽

10.1016/j.patrec.2020.10.005 ◽

2020 ◽

Vol 140 ◽

pp. 172-178 ◽

Cited By ~ 1

Author(s):

Mahinda Mailagaha Kumbure ◽

Pasi Luukka ◽

Mikael Collan

Keyword(s):

Nearest Neighbor ◽

K Nearest Neighbor ◽

Nearest Neighbor Classifier ◽

Bonferroni Mean ◽

Neighbor Classifier

Application of the Nearest Centroid Neighbor Classifier Based on K Local Means Using Harmonic Mean Distance (LMKHNCN) Algorithm for Classifying Civil Servant Performance Results

Jurnal Teknologi Informasi dan Ilmu Komputer ◽

10.25126/jtiik.2021834431 ◽

2021 ◽

Vol 8 (6) ◽

pp. 1287

Author(s):

Adam Syarif Hidayatullah ◽

Fitra Abdurrachman Bachtiar ◽

Imam Cholissodin

Keyword(s):

Cross Validation ◽

Nearest Neighbor ◽

Classification Performance ◽

Harmonic Mean ◽

K Nearest Neighbor ◽

Original Algorithm ◽

Local Means ◽

Good Classification Performance ◽

Fold Cross Validation ◽

Neighbor Classifier

A company succeeds when it manages its human resources well, and vice versa. One institution that manages human resources using Talent Management is the Badan Kepegawaian Daerah (BKD) of Malang city, which evaluates its employees annually after their work has been completed. This can leave work results suboptimal, so employees with below-average performance need to be identified early, using a classification technique, so that they can be evaluated and suboptimal results minimized. This study uses the Nearest Centroid Neighbor Classifier Based on K Local Means Using Harmonic Mean Distance (LMKHNCN) classification technique. The method is a modification of K-Nearest Neighbor (KNN) and has been shown to perform better than the original KNN algorithm. F1-score and accuracy are measured using K-fold cross validation to determine the spread of accuracy, and the effect of normalization is also tested because the previous study gives no information about normalization. In this case the method produces good classification performance, reaching an accuracy of 98.8% and an F1-score of 98.1%.
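
The evaluation protocol mentioned here, accuracy and F1-score under K-fold cross validation, can be sketched as follows. This is a generic illustration, not the study's code: the contiguous, unshuffled fold layout and the binary F1 definition for a chosen positive class are assumptions, and the function names are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    for train, test in k_fold_indices(10, 3):
        print(len(train), len(test))
    print(accuracy([1, 1, 0, 0], [1, 0, 1, 0]))
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0], positive=1))
```

Reporting the per-fold scores, rather than only their mean, is what shows the spread of accuracy across folds.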

A High-Voltage Electric Switch Classification System Based on K-Nearest Neighbor Classifier

2020 IEEE 6th International Conference on Computer and Communications (ICCC) ◽

10.1109/iccc51575.2020.9344925 ◽

2020 ◽

Author(s):

Haien Wang ◽

Jing Zhang ◽

Yang Zhao ◽

Jun Wang ◽

Xiaorong Du

Keyword(s):

High Voltage ◽

Classification System ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Nearest Neighbor Classifier ◽

Neighbor Classifier

Quad division prototype selection-based k-nearest neighbor classifier for click fraud detection from highly skewed user click dataset

Engineering Science and Technology an International Journal ◽

10.1016/j.jestch.2021.05.015 ◽

2021 ◽

Author(s):

Deepti Sisodia ◽

Dilip Singh Sisodia

Keyword(s):

Nearest Neighbor ◽

Fraud Detection ◽

Prototype Selection ◽

K Nearest Neighbor ◽

Click Fraud ◽

Nearest Neighbor Classifier ◽

Neighbor Classifier

k-Nearest Neighbor Classifier and Supervised Clustering

Data Mining ◽

10.1201/b15288-7 ◽

2013 ◽

pp. 117-137

Author(s):

Nong Ye

Keyword(s):

Nearest Neighbor ◽

K Nearest Neighbor ◽

Supervised Clustering ◽

Nearest Neighbor Classifier ◽

Neighbor Classifier

Stronger Automation for Flyspeck by Feature Weighting and Strategy Evolution

10.29007/5gzr ◽

2018 ◽

Author(s):

Cezary Kaliszyk ◽

Josef Urban

Keyword(s):

Nearest Neighbor ◽

Feature Weighting ◽

K Nearest Neighbor ◽

Nearest Neighbor Classifier ◽

Hol Light ◽

Distance Weighted ◽

Neighbor Classifier

Two complementary AI methods are used to improve the strength of the AI/ATP service for proving conjectures over the HOL Light and Flyspeck corpora. First, several schemes for frequency-based feature weighting are explored in combination with a distance-weighted k-nearest-neighbor classifier. This results in a 16% improvement (39.0% to 45.5% of Flyspeck problems solved) in the overall strength of the service when using 14 CPUs and 30 seconds. The best premise-selection/ATP combination is improved from 24.2% to 31.4%, i.e. by 30%. A smaller improvement is obtained by evolving targeted E prover strategies on two particular premise selections, using the Blind Strategymaker (BliStr) system. This raises the performance of the best AI/ATP method from 31.4% to 34.9%, i.e. by 11%, and raises the current 14-CPU power of the service to 46.9%.
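
The combination of frequency-based feature weighting with a weighted k-nearest-neighbor vote can be illustrated generically. The sketch below is not the authors' scheme: it assumes inverse document frequency (IDF) as the frequency-based weight, similarity as the summed weight of shared features, and votes weighted by that similarity. The function names and the toy feature sets are hypothetical.

```python
import math
from collections import Counter, defaultdict


def idf_weights(feature_sets):
    """Weight each feature by log(N / document frequency): rare features count more."""
    n = len(feature_sets)
    df = Counter(f for feats in feature_sets for f in set(feats))
    return {f: math.log(n / c) for f, c in df.items()}


def weighted_knn(query_feats, examples, k):
    """examples: list of (feature_set, label). Returns the label with the
    largest similarity-weighted vote among the k most similar examples."""
    weights = idf_weights([feats for feats, _ in examples])
    sims = []
    for feats, label in examples:
        shared = set(query_feats) & set(feats)
        sims.append((sum(weights[f] for f in shared), label))
    sims.sort(key=lambda t: t[0], reverse=True)
    votes = defaultdict(float)
    for sim, label in sims[:k]:
        votes[label] += sim  # closer (more similar) neighbors vote more strongly
    return max(votes, key=votes.get)


if __name__ == "__main__":
    examples = [
        ({"real", "add"}, "arith"),
        ({"real", "mul"}, "arith"),
        ({"real", "set", "union"}, "sets"),
        ({"set", "inter"}, "sets"),
    ]
    print(weighted_knn({"real", "union"}, examples, k=3))
```

In this toy data the common feature "real" carries little weight while the rare feature "union" dominates, which is the intended effect of frequency-based weighting.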
