Review of Gene Subset Selection using Modified K-Nearest Neighbor Clustering Algorithm

Author(s):  
Abhishek Kumar ◽  
K. Vengatesan ◽  
R Rajesh ◽  
M. Parthibhan ◽  
Achintya Singhal
2015 ◽  
pp. 125-138 ◽  
Author(s):  
I. V. Goncharenko

In this article we propose a new method of non-hierarchical cluster analysis based on the k-nearest-neighbor graph and discuss it with respect to vegetation classification. The method of k-nearest neighbor (k-NN) classification was originally developed in 1951 (Fix, Hodges, 1951). Later the term “k-NN graph” and several k-NN clustering algorithms appeared (Cover, Hart, 1967; Brito et al., 1997). In biology, k-NN is used in the analysis of protein structures and genome sequences. Most k-NN clustering algorithms first build an “excessive” graph, a so-called hypergraph, and then truncate it into subgraphs by partitioning and coarsening the hypergraph. We developed a different strategy, “upward” clustering, which assembles one cluster after another in sequence. Until now, graph-based cluster analysis has not been considered for the classification of vegetation datasets.
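To make the term “k-NN graph” concrete, the following is a minimal sketch of how such a graph can be built. It is not the authors' clustering method: the function name `knn_graph`, the Euclidean metric, and the toy points are all assumptions chosen for illustration.

```python
import math

def knn_graph(points, k):
    """Return {i: [indices of the k nearest other points]} (Euclidean distance)."""
    graph = {}
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with that point's index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()  # ties are broken by the smaller index
        graph[i] = [j for _, j in dists[:k]]
    return graph

# Hypothetical toy data: three points near the origin, two far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
graph = knn_graph(points, k=1)
```

With k = 1, the two distant points link only to each other, which is the kind of local structure a graph-based clustering procedure can exploit.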


2021 ◽  
Vol 25 (6) ◽  
pp. 1453-1471
Author(s):  
Chunhua Tang ◽  
Han Wang ◽  
Zhiwen Wang ◽  
Xiangkun Zeng ◽  
Huaran Yan ◽  
...  

Most density-based clustering algorithms suffer from difficult parameter setting, high time complexity, poor noise recognition, and weak clustering on datasets with uneven density. To solve these problems, this paper proposes the FOP-OPTICS algorithm (Finding of the Ordering Peaks Based on OPTICS), which is a substantial improvement of OPTICS (Ordering Points To Identify the Clustering Structure). The proposed algorithm finds the demarcation point (DP) in the Augmented Cluster-Ordering generated by OPTICS and uses the reachability-distance of the DP as the neighborhood radius eps of its corresponding cluster. This overcomes the weakness of most algorithms in clustering datasets with uneven densities. By computing the distance to the k-nearest neighbor of each point, it reduces the time complexity of OPTICS; by calculating density-mutation points within the clusters, it can efficiently recognize noise. The experimental results show that FOP-OPTICS has the lowest time complexity and outperforms other algorithms in parameter setting and noise recognition.
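The abstract mentions computing the distance to the k-nearest neighbor of each point. A minimal sketch of that single step is shown below, reading it as the distance to the k-th nearest neighbor; that reading, the name `k_distances`, and the sample points are assumptions, and this is not the FOP-OPTICS algorithm itself.

```python
import math

def k_distances(points, k):
    """For each point, return the distance to its k-th nearest other point."""
    result = []
    for i, p in enumerate(points):
        # Sorted distances from point i to all other points.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result

# Hypothetical data: a tight group of three points and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (8.0, 8.0)]
kd = k_distances(points, k=2)
```

Points in the dense group get a small k-distance, while the isolated point gets a large one, which is why such values are commonly used as a local density signal.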


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Cheng Lu ◽  
Shiji Song ◽  
Cheng Wu

The Affinity Propagation (AP) algorithm is an effective algorithm for clustering analysis, but it cannot be applied directly to incomplete data. In view of the prevalence of missing data and the uncertainty of missing attributes, we put forward a modified AP clustering algorithm based on K-nearest neighbor intervals (KNNI) for incomplete data. Based on an Improved Partial Data Strategy, the proposed algorithm estimates the KNNI representation of missing attributes by using the attribute distribution information of the available data. The similarity function is then adapted to handle the interval data, so that the improved AP algorithm can be applied to incomplete data. Experiments on several UCI datasets show that the proposed algorithm achieves impressive clustering results.
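The abstract does not spell out how a nearest-neighbor interval is computed. One plausible reading, sketched here under stated assumptions, is to find the k nearest records using only the attributes both records have, and to take the minimum and maximum of the missing attribute among those neighbors. The names `partial_distance` and `knn_interval`, the rescaling of the partial distance, and the sample records are all hypothetical.

```python
import math

def partial_distance(a, b):
    """Euclidean distance over attributes observed in both records,
    rescaled to the full number of attributes (None marks a missing value)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    sq = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(sq * len(a) / len(shared))

def knn_interval(records, row, col, k):
    """Interval (min, max) of attribute `col` among the k nearest records
    that actually have a value for that attribute."""
    target = records[row]
    candidates = sorted(
        (partial_distance(target, r), r[col])
        for i, r in enumerate(records)
        if i != row and r[col] is not None
    )
    values = [v for _, v in candidates[:k]]
    return (min(values), max(values))

# Hypothetical records; the first one is missing its third attribute.
records = [
    (1.0, 2.0, None),
    (1.1, 2.1, 5.0),
    (0.9, 1.9, 6.0),
    (9.0, 9.0, 50.0),
]
interval = knn_interval(records, row=0, col=2, k=2)
```

Here the missing value is represented by the range spanned by its two closest neighbors rather than by a single imputed number.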


2019 ◽  
Vol 9 (17) ◽  
pp. 3484
Author(s):  
Shuai Han ◽  
Heng Li ◽  
Mingchao Li ◽  
Timothy Rose

Hammering rocks of different strengths produces different sounds. Geological engineers often use this method to approximate the strengths of rocks in geological surveys. The method is quick and convenient but subjective. Inspired by this problem, we present a new, non-destructive method for measuring the surface strengths of rocks based on a deep neural network (DNN) and spectrogram analysis. All the hammering sounds are first transformed into spectrograms, and a clustering algorithm is presented to filter out outlier spectrograms automatically. One of the most advanced image classification DNNs, Inception-ResNet-v2, is then re-trained with the spectrograms. The results show that the training accuracy is up to 94.5%. Following this, three regression algorithms, namely Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Random Forest (RF), are adopted to fit the relationship between the outputs of the DNN and the strength values. The tests show that KNN has the highest fitting accuracy and SVM has the strongest generalization ability. The strengths (represented by rebound values) of almost all the samples can be predicted within an error of [−5, 5]. Overall, the proposed method has great potential in supporting the implementation of efficient rock strength measurement methods in the field.


2020 ◽  
Vol 2020 ◽  
pp. 1-10 ◽  
Author(s):  
Abdelaaziz Mahdaoui ◽  
El Hassan Sbai

As the reconstruction of 3D objects is increasingly used today, the simplification of 3D point clouds has become a substantial phase in the reconstruction process. This is due to the huge amounts of dense 3D point cloud data produced by 3D scanning devices. In this paper, a new approach is proposed to simplify 3D point clouds based on k-nearest neighbor (k-NN) and a clustering algorithm. Initially, the 3D point cloud is divided into clusters using the k-means algorithm. Then, an entropy estimation is performed for each cluster to remove the ones that have minimal entropy. MATLAB is used to carry out the simulation, and the performance of our method is verified on a test dataset. Numerous experiments demonstrate the effectiveness of the proposed simplification method for 3D point clouds.


Information ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 397
Author(s):  
Yan Zhang ◽  
Shiyun Wa ◽  
Pengshuo Sun ◽  
Yaojun Wang

To address the current situation, in which pear defect detection still relies on manual labor with low efficiency, we propose the use of a CNN model to detect pear defects. Since it is challenging to obtain defect images in the implementation process, a deep convolutional adversarial generation network was used to augment the defect images. As the experimental results indicated, the detection accuracy of the proposed method on the 3000-sample validation set was as high as 97.35%. Various mainstream CNNs were compared to evaluate the model’s performance thoroughly, and the top performer was selected for further comparative experiments with traditional machine learning methods, such as the support vector machine algorithm, the random forest algorithm, and the k-nearest neighbor clustering algorithm. Moreover, two other pear varieties that were not used in training were chosen to validate the robustness and generalization capability of the model. The validation results illustrated that the proposed method is more accurate than the commonly used algorithms for pear defect detection and is robust enough to generalize well to other datasets. To allow the method proposed in this paper to be applied in agriculture, an intelligent pear defect detection system was built based on an iOS device.


Author(s):  
Bingming Wang ◽  
Shi Ying ◽  
Guoli Cheng ◽  
Rui Wang ◽  
Zhe Yang ◽  
...  

Logs play an important role in the maintenance of large-scale systems. The number of logs that indicate normal behavior (normal logs) differs greatly from the number of logs that indicate anomalies (abnormal logs), and the two types of logs have certain differences. Automatically identifying faults with the K-Nearest Neighbor (KNN) algorithm, an outlier detection method with high accuracy, is an effective way to detect anomalies from logs. However, logs are large in scale and their samples are very unevenly distributed, which affects the results of the KNN algorithm in log-based anomaly detection. Thus, we propose an improved KNN-based method that uses the existing mean-shift clustering algorithm to efficiently select the training set from massive logs. We then assign different weights to samples at different distances, which reduces the negative effect of the unbalanced distribution of log samples on the accuracy of the KNN algorithm. Comparative experiments on log sets from five supercomputers show that the proposed method can be effectively applied to log-based anomaly detection, and that its accuracy, recall rate and F measure are higher than those of the traditional keyword search method.
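The idea of assigning different weights to samples at different distances can be sketched as follows. The abstract does not give the weighting formula, so the inverse-distance weight used here is an assumption, as are the function name `weighted_knn_predict`, the `eps` guard against division by zero, and the toy feature vectors.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train, query, k, eps=1e-9):
    """Classify `query` by a vote among its k nearest training samples,
    where each vote counts 1 / (distance + eps)."""
    neighbors = sorted((math.dist(x, query), label) for x, label in train)[:k]
    votes = defaultdict(float)
    for d, label in neighbors:
        votes[label] += 1.0 / (d + eps)
    return max(votes, key=votes.get)

# Hypothetical, deliberately unbalanced data: three normal samples, one abnormal.
train = [
    ((0.0, 0.0), "normal"),
    ((0.2, 0.1), "normal"),
    ((0.1, 0.3), "normal"),
    ((5.0, 5.0), "abnormal"),
]
pred_near_anomaly = weighted_knn_predict(train, (4.8, 5.1), k=3)
pred_near_normal = weighted_knn_predict(train, (0.1, 0.1), k=3)
```

In this toy case an unweighted majority vote with k = 3 would label the first query "normal" (two votes to one), whereas the distance weighting lets the single close abnormal sample dominate.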


Sensors ◽  
2020 ◽  
Vol 20 (4) ◽  
pp. 1067 ◽  
Author(s):  
Chenbin Zhang ◽  
Ningning Qin ◽  
Yanbo Xue ◽  
Le Yang

Commercial interest in indoor localization has been increasing over the past decade. The success of many applications relies at least partially on indoor localization that is expected to provide reliable indoor position information. Wi-Fi received signal strength (RSS)-based indoor localization techniques have attracted extensive attention because Wi-Fi access points (APs) are widely deployed and Wi-Fi RSS measurements can be obtained without extra hardware cost. In this paper, we propose a hierarchical classification-based method as a new solution to the indoor localization problem. Within the developed approach, we first adopt an improved K-Means clustering algorithm to divide the area of interest into several zones, which are allowed to overlap with one another to improve the generalization capability of the subsequent indoor positioning process. To find the localization result, the K-Nearest Neighbor (KNN) algorithm and a support vector machine (SVM) with the one-versus-one strategy are employed. The proposed method is implemented on a tablet, and its performance is evaluated in real-world environments. Experimental results reveal that the proposed method offers an improvement of 1.4% to 3.2% in terms of position classification accuracy and a reduction of 10% to 22% in terms of average positioning error compared with several benchmark methods.
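The hierarchical idea, a coarse zone decision followed by a finer classification, can be illustrated with a simplified sketch. It assumes the zones and their centroids already exist (the improved K-Means step is not reproduced), uses plain k-NN for the second stage and omits the SVM, and all names, RSS values, and position labels are hypothetical.

```python
import math
from collections import Counter

def locate(query, zone_centroids, fingerprints, k):
    """Two-stage lookup: nearest zone centroid first, then k-NN inside that zone."""
    # Stage 1 (coarse): pick the zone whose centroid is closest in RSS space.
    zone = min(zone_centroids, key=lambda z: math.dist(zone_centroids[z], query))
    # Stage 2 (fine): majority vote among the k nearest fingerprints of that zone.
    candidates = sorted(
        (math.dist(rss, query), position) for rss, position in fingerprints[zone]
    )[:k]
    position = Counter(p for _, p in candidates).most_common(1)[0][0]
    return zone, position

# Hypothetical RSS readings (dBm) from two access points.
zone_centroids = {"A": (-40.0, -70.0), "B": (-70.0, -40.0)}
fingerprints = {
    "A": [((-38.0, -72.0), "A1"), ((-41.0, -69.0), "A1"), ((-45.0, -66.0), "A2")],
    "B": [((-68.0, -42.0), "B1"), ((-72.0, -39.0), "B1"), ((-66.0, -45.0), "B2")],
}
result = locate((-40.0, -70.0), zone_centroids, fingerprints, k=3)
```

Restricting the second stage to one zone keeps the neighbor search small, which is the main practical appeal of a hierarchical scheme.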

