Review of Gene Subset Selection using Modified K-Nearest Neighbor Clustering Algorithm

In this article we propose a new method of non-hierarchical cluster analysis based on the k-nearest-neighbor graph and discuss it with respect to vegetation classification. The k-nearest neighbor (k-NN) classification method was originally developed in 1951 (Fix, Hodges, 1951). Later, the term “k-NN graph” and several k-NN clustering algorithms appeared (Cover, Hart, 1967; Brito et al., 1997). In biology, k-NN is used in the analysis of protein structures and genome sequences. Most k-NN clustering algorithms first build an “excessive” graph, a so-called hypergraph, and then truncate it into subgraphs by partitioning and coarsening the hypergraph. We developed a different strategy, “upward” clustering, which forms (assembles sequentially) one cluster after another. Until now, graph-based cluster analysis has not been considered for the classification of vegetation datasets.
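For readers unfamiliar with the term, a k-NN graph links each point to its k closest points. The following is a minimal sketch of that construction only, not the authors' “upward” clustering procedure; the function name `knn_graph`, the brute-force search, and the use of Euclidean distance are assumptions made for illustration.

```python
import math

def knn_graph(points, k):
    """Return {index: [indices of its k nearest neighbors]} (brute force)."""
    graph = {}
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with that point's index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        graph[i] = [j for _, j in dists[:k]]
    return graph

# Hypothetical toy data: three points near the origin, two far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
graph = knn_graph(points, k=2)
```

In this toy data, the nearest neighbor of each of the two distant points is the other one, which is the kind of local structure a graph-based clustering method can exploit.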

An improved OPTICS clustering algorithm for discovering clusters with uneven densities

Intelligent Data Analysis ◽

10.3233/ida-205497 ◽

2021 ◽

Vol 25 (6) ◽

pp. 1453-1471

Author(s):

Chunhua Tang ◽

Han Wang ◽

Zhiwen Wang ◽

Xiangkun Zeng ◽

Huaran Yan ◽

...

Keyword(s):

Time Complexity ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Clustering Algorithms ◽

Substantial Improvement ◽

Experimental Results ◽

High Time ◽

Parameter Setting ◽

K Nearest Neighbor ◽

Density Based Clustering

Most density-based clustering algorithms suffer from difficult parameter setting, high time complexity, poor noise recognition, and weak clustering on datasets with uneven density. To solve these problems, this paper proposes the FOP-OPTICS algorithm (Finding of the Ordering Peaks Based on OPTICS), a substantial improvement of OPTICS (Ordering Points To Identify the Clustering Structure). The proposed algorithm finds the demarcation point (DP) in the Augmented Cluster-Ordering generated by OPTICS and uses the reachability-distance of the DP as the neighborhood radius eps of its corresponding cluster. It thereby overcomes the weakness of most algorithms in clustering datasets with uneven densities. By computing the distance to the k-nearest neighbor of each point, it reduces the time complexity of OPTICS; by calculating density-mutation points within the clusters, it can efficiently recognize noise. The experimental results show that FOP-OPTICS has the lowest time complexity and outperforms other algorithms in parameter setting and noise recognition.
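The abstract above relies on the distance from each point to its k-th nearest neighbor. A minimal brute-force sketch of that quantity alone follows; it is not FOP-OPTICS or OPTICS itself, and the function name `kth_neighbor_distances` and the Euclidean metric are assumptions for illustration.

```python
import math

def kth_neighbor_distances(points, k):
    """Distance from each point to its k-th nearest neighbor (brute force)."""
    result = []
    for i, p in enumerate(points):
        # Sorted distances from point i to all other points.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result

# Hypothetical toy data: three close points on a line and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (0.0, 10.0)]
k_dist = kth_neighbor_distances(points, k=2)
```

The isolated point gets a much larger k-distance than the others, which is why this quantity is commonly used as a local density estimate in density-based clustering.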

K-Nearest Neighbor Intervals Based AP Clustering Algorithm for Large Incomplete Data

Mathematical Problems in Engineering ◽

10.1155/2015/535932 ◽

2015 ◽

Vol 2015 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Cheng Lu ◽

Shiji Song ◽

Cheng Wu

Keyword(s):

Clustering Analysis ◽

Incomplete Data ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Interval Data ◽

Similarity Function ◽

K Nearest Neighbor ◽

Partial Data ◽

Missing Attributes ◽

Ap Clustering

The Affinity Propagation (AP) algorithm is an effective algorithm for clustering analysis, but it cannot be applied directly to incomplete data. In view of the prevalence of missing data and the uncertainty of missing attributes, we put forward a modified AP clustering algorithm based on K-nearest neighbor intervals (KNNI) for incomplete data. Based on an Improved Partial Data Strategy, the proposed algorithm estimates the KNNI representation of missing attributes by using the attribute distribution information of the available data. The similarity function is then adapted to handle interval data, so that the improved AP algorithm can be applied to incomplete data. Experiments on several UCI datasets show that the proposed algorithm achieves impressive clustering results.
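One plausible reading of a K-nearest neighbor interval is sketched below: the missing attribute of a record is represented by the range of that attribute over its k nearest records, where nearness is measured only on the attributes the record actually has. This construction, the name `knn_interval`, and the [min, max] form of the interval are assumptions for illustration; the abstract does not specify the paper's exact definition or its Improved Partial Data Strategy.

```python
import math

def knn_interval(records, target, missing_idx, k):
    """Interval (min, max) of attribute `missing_idx` over the k records
    closest to `target` on the attributes that `target` actually has."""
    observed = [i for i, v in enumerate(target) if v is not None]

    def partial_dist(rec):
        # Euclidean distance restricted to the observed attributes of `target`.
        return math.sqrt(sum((rec[i] - target[i]) ** 2 for i in observed))

    # Only records that have all the needed attributes can serve as neighbors.
    candidates = [r for r in records
                  if r[missing_idx] is not None
                  and all(r[i] is not None for i in observed)]
    neighbors = sorted(candidates, key=partial_dist)[:k]
    values = [r[missing_idx] for r in neighbors]
    return min(values), max(values)

# Hypothetical toy data; None marks a missing attribute.
records = [(1.0, 1.0, 5.0), (1.2, 0.9, 6.0), (0.9, 1.1, 5.5), (8.0, 9.0, 20.0)]
target = (1.0, 1.0, None)
interval = knn_interval(records, target, missing_idx=2, k=3)
```

Here the distant record with value 20.0 is excluded from the three nearest neighbors, so it does not widen the interval.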

A Deep Learning Based Method for the Non-Destructive Measuring of Rock Strength through Hammering Sound

Applied Sciences ◽

10.3390/app9173484 ◽

2019 ◽

Vol 9 (17) ◽

pp. 3484

Author(s):

Shuai Han ◽

Heng Li ◽

Mingchao Li ◽

Timothy Rose

Keyword(s):

Clustering Algorithm ◽

Nearest Neighbor ◽

Rock Strength ◽

Support Vector ◽

K Nearest Neighbor ◽

Strength Measurement ◽

Regression Algorithms ◽

Almost All ◽

The Relationship ◽

Non Destructive

Hammering rocks of different strengths produces different sounds. Geological engineers often use this method to approximate the strengths of rocks in geological surveys. The method is quick and convenient but subjective. Inspired by this problem, we present a new, non-destructive method for measuring the surface strengths of rocks based on a deep neural network (DNN) and spectrogram analysis. All the hammering sounds are first transformed into spectrograms, and a clustering algorithm is presented to filter out outliers among the spectrograms automatically. One of the most advanced image classification DNNs, Inception-ResNet-v2, is then re-trained with the spectrograms. The results show that the training accuracy reaches up to 94.5%. Following this, three regression algorithms, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Random Forest (RF), are adopted to fit the relationship between the outputs of the DNN and the strength values. The tests show that KNN has the highest fitting accuracy and SVM has the strongest generalization ability. The strengths (represented by rebound values) of almost all the samples can be predicted within an error of [−5, 5]. Overall, the proposed method has great potential in supporting the implementation of efficient rock strength measurement methods in the field.
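As a reminder of how KNN can be used for regression rather than classification, the sketch below predicts a value by averaging the targets of the k nearest training samples. It is a generic illustration, not the authors' pipeline; the name `knn_regress`, the unweighted average, and the toy numbers are assumptions.

```python
import math

def knn_regress(train_x, train_y, query, k):
    """Predict by averaging the targets of the k nearest training samples."""
    # Indices of training samples ordered by distance to the query.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k

# Hypothetical one-dimensional features and target values.
train_x = [(0.0,), (1.0,), (2.0,), (10.0,)]
train_y = [10.0, 20.0, 30.0, 100.0]
prediction = knn_regress(train_x, train_y, query=(1.2,), k=2)
```

With k=2 the query at 1.2 is averaged over the samples at 1.0 and 2.0, so the distant sample at 10.0 has no influence on the prediction.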

MRI brain tumor detection using optimal possibilistic fuzzy C-means clustering algorithm and adaptive k-nearest neighbor classifier

Journal of Ambient Intelligence and Humanized Computing ◽

10.1007/s12652-020-02444-7 ◽

2020 ◽

Author(s):

D. Maruthi Kumar ◽

D. Satyanarayana ◽

M. N. Giri Prasad

Keyword(s):

Brain Tumor ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Tumor Detection ◽

K Nearest Neighbor ◽

Nearest Neighbor Classifier ◽

Fuzzy C Means ◽

Mri Brain ◽

Fuzzy C Means Clustering ◽

Neighbor Classifier

3D Point Cloud Simplification Based on k-Nearest Neighbor and Clustering

Advances in Multimedia ◽

10.1155/2020/8825205 ◽

2020 ◽

Vol 2020 ◽

pp. 1-10 ◽

Cited By ~ 1

Author(s):

Abdelaaziz Mahdaoui ◽

El Hassan Sbai

Keyword(s):

Point Cloud ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

3D Point Cloud ◽

K Nearest Neighbor ◽

Test Dataset ◽

New Approach ◽

Entropy Estimation ◽

3D Objects ◽

Point Cloud Simplification

As the reconstruction of 3D objects is increasingly used, the simplification of 3D point clouds has become a substantial phase of the reconstruction process. This is due to the huge amounts of dense 3D point cloud data produced by 3D scanning devices. In this paper, a new approach is proposed to simplify 3D point clouds based on k-nearest neighbor (k-NN) and a clustering algorithm. Initially, the 3D point cloud is divided into clusters using the k-means algorithm. Then, an entropy estimation is performed for each cluster to remove the ones that have minimal entropy. MATLAB is used to carry out the simulation, and the performance of our method is verified on a test dataset. Numerous experiments demonstrate the effectiveness of the proposed 3D point cloud simplification method.

Clustering algorithm based on mutual K-nearest neighbor relationships

Statistical Analysis and Data Mining The ASA Data Science Journal ◽

10.1002/sam.10149 ◽

2012 ◽

Vol 5 (2) ◽

pp. 100-113 ◽

Cited By ~ 8

Author(s):

Zhen Hu ◽

Raj Bhatnagar

Keyword(s):

Clustering Algorithm ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Neighbor Relationships

Pear Defect Detection Method Based on ResNet and DCGAN

Information ◽

10.3390/info12100397 ◽

2021 ◽

Vol 12 (10) ◽

pp. 397

Author(s):

Yan Zhang ◽

Shiyun Wa ◽

Pengshuo Sun ◽

Yaojun Wang

Keyword(s):

Defect Detection ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Detection System ◽

Implementation Process ◽

Support Vector ◽

Detection Accuracy ◽

K Nearest Neighbor ◽

Generation Network ◽

Low Efficiency

To address the current situation, in which pear defect detection still relies on manual labor with low efficiency, we propose the use of a CNN model to detect pear defects. Since it is challenging to obtain defect images in the implementation process, a deep convolutional generative adversarial network (DCGAN) was used to augment the defect images. The experimental results indicated that the detection accuracy of the proposed method on the 3000-sample validation set was as high as 97.35%. Several mainstream CNN variants were compared to evaluate the model’s performance thoroughly, and the top performer was selected for further comparative experiments with traditional machine learning methods, such as the support vector machine algorithm, the random forest algorithm, and the k-nearest neighbor clustering algorithm. Moreover, two other varieties of pears that the model had not been trained on were chosen to validate its robustness and generalization capability. The validation results showed that the proposed method is more accurate than the commonly used algorithms for pear defect detection and robust enough to generalize well to other datasets. To allow the method proposed in this paper to be applied in agriculture, an intelligent pear defect detection system was built on an iOS device.

Log-Based Anomaly Detection with the Improved K-Nearest Neighbor

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194020500114 ◽

2020 ◽

Vol 30 (02) ◽

pp. 239-262 ◽

Cited By ~ 1

Author(s):

Bingming Wang ◽

Shi Ying ◽

Guoli Cheng ◽

Rui Wang ◽

Zhe Yang ◽

...

Keyword(s):

Anomaly Detection ◽

Large Scale ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Keyword Search ◽

Mean Shift ◽

Recall Rate ◽

K Nearest Neighbor ◽

Mean Shift Clustering ◽

Negative Effect

Logs play an important role in the maintenance of large-scale systems. The number of logs that indicate normal behavior (normal logs) differs greatly from the number of logs that indicate anomalies (abnormal logs), and the two types of logs have certain differences. Automatically identifying faults with the K-Nearest Neighbor (KNN) algorithm, an outlier detection method with high accuracy, is an effective way to detect anomalies from logs. However, logs are large in scale and their samples are very unevenly distributed, which affects the results of the KNN algorithm in log-based anomaly detection. Thus, we propose an improved KNN-based method that uses the existing mean-shift clustering algorithm to efficiently select the training set from massive logs. We then assign different weights to samples at different distances, which reduces the negative effect of the unbalanced distribution of log samples on the accuracy of the KNN algorithm. Comparative experiments on log sets from five supercomputers show that the proposed method can be effectively applied to log-based anomaly detection, and that its accuracy, recall rate and F measure are higher than those of the traditional keyword search method.
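The idea of weighting neighbors by distance can be illustrated with a short sketch. It uses inverse-distance weights, which is a common choice but an assumption here, since the abstract does not state the weighting formula; the name `weighted_knn_predict`, the `eps` parameter, and the toy feature vectors are likewise hypothetical, and the mean-shift training-set selection step is not shown.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train_x, train_y, query, k, eps=1e-9):
    """Classify `query` by a vote of its k nearest samples, weighted by 1/distance."""
    ranked = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = defaultdict(float)
    for d, label in ranked[:k]:
        # Closer samples contribute more; eps avoids division by zero.
        votes[label] += 1.0 / (d + eps)
    return max(votes, key=votes.get)

# Hypothetical toy data: normal samples outnumber abnormal ones.
train_x = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
train_y = ["normal", "normal", "abnormal"]
prediction = weighted_knn_predict(train_x, train_y, query=(2.9, 3.0), k=3)
```

With k=3 an unweighted majority vote would return "normal" because two of the three neighbors are normal, whereas the distance weighting lets the single very close abnormal sample dominate.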

Received Signal Strength-Based Indoor Localization Using Hierarchical Classification

Sensors ◽

10.3390/s20041067 ◽

2020 ◽

Vol 20 (4) ◽

pp. 1067 ◽

Cited By ~ 6

Author(s):

Chenbin Zhang ◽

Ningning Qin ◽

Yanbo Xue ◽

Le Yang

Keyword(s):

Clustering Algorithm ◽

Nearest Neighbor ◽

Indoor Localization ◽

Hierarchical Classification ◽

Signal Strength ◽

Received Signal Strength ◽

Support Vector ◽

K Nearest Neighbor ◽

Position Information ◽

Area Of Interest

Commercial interest in indoor localization has been increasing over the past decade. The success of many applications relies at least partially on indoor localization that is expected to provide reliable indoor position information. Wi-Fi received signal strength (RSS)-based indoor localization techniques have attracted extensive attention because Wi-Fi access points (APs) are widely deployed and Wi-Fi RSS measurements can be obtained without extra hardware cost. In this paper, we propose a hierarchical classification-based method as a new solution to the indoor localization problem. Within the developed approach, we first adopt an improved K-Means clustering algorithm to divide the area of interest into several zones, which are allowed to overlap with one another to improve the generalization capability of the subsequent indoor positioning process. To find the localization result, the K-Nearest Neighbor (KNN) algorithm and a support vector machine (SVM) with the one-versus-one strategy are employed. The proposed method is implemented on a tablet, and its performance is evaluated in real-world environments. Experimental results reveal that the proposed method offers an improvement of 1.4% to 3.2% in position classification accuracy and a reduction of 10% to 22% in average positioning error compared with several benchmark methods.
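The coarse-to-fine idea described above can be sketched in two steps: pick a zone, then run k-NN only among the fingerprints of that zone. The sketch assumes the zone centroids are already given (for example from a prior K-Means run), ignores zone overlap and the SVM stage, and uses hypothetical names (`locate`, `zone_centroids`, `fingerprints`) and invented RSS values in dBm.

```python
import math
from collections import Counter

def locate(rss, zone_centroids, fingerprints, k):
    """Coarse step: nearest zone centroid. Fine step: k-NN vote inside that zone."""
    zone = min(zone_centroids, key=lambda z: math.dist(zone_centroids[z], rss))
    # fingerprints[zone] is a list of (rss_vector, position_label) pairs.
    ranked = sorted(fingerprints[zone], key=lambda fp: math.dist(fp[0], rss))
    votes = Counter(label for _, label in ranked[:k])
    return zone, votes.most_common(1)[0][0]

# Hypothetical RSS readings from two access points.
zone_centroids = {"A": (-40.0, -70.0), "B": (-70.0, -40.0)}
fingerprints = {
    "A": [((-38.0, -72.0), "A1"), ((-41.0, -69.0), "A1"), ((-45.0, -65.0), "A2")],
    "B": [((-72.0, -38.0), "B1"), ((-68.0, -42.0), "B2"), ((-69.0, -41.0), "B2")],
}
zone, position = locate((-39.0, -71.0), zone_centroids, fingerprints, k=3)
```

Restricting the k-NN search to one zone keeps the fine step cheap, since only that zone's fingerprints are compared against the measurement.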
