A new k-nearest neighbor density-based clustering method and its application to hyperspectral images

Biofilms formed on the surfaces of agro-food processing facilities can cause food poisoning by providing an environment in which bacteria can grow. Hygiene management through early detection is therefore important. This study assessed the feasibility of detecting Escherichia coli (E. coli) and Salmonella typhimurium (S. typhimurium) on the surfaces of food processing facilities using fluorescence hyperspectral imaging. E. coli and S. typhimurium were cultured on high-density polyethylene and stainless steel coupons, which are the main materials used in food processing facilities. We obtained fluorescence hyperspectral images over the range of 420–730 nm by illuminating the coupons with a 365 nm UV light source. The images were used to perform discriminant analyses (linear discriminant analysis, k-nearest neighbor analysis, and partial least squares discriminant analysis) to identify and classify coupons on which bacteria had been cultured. Specificity and sensitivity for E. coli (1–4 log CFU·cm−2) and S. typhimurium (1–6 log CFU·cm−2) were over 90% for most of the machine learning models used, and the highest performance was generally obtained with the k-nearest neighbor (k-NN) model. Applying the trained model to the hyperspectral images confirmed that biofilms were detected well. This result indicates that biofilms could be inspected rapidly using fluorescence hyperspectral images.
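The k-NN model named above classifies a spectrum by majority vote among its nearest labeled spectra. The following is a minimal sketch of that idea, assuming Euclidean distance and three-band toy spectra; the function name `knn_predict`, the class labels, and all numbers are illustrative and are not taken from the study.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training spectra."""
    # Rank the labeled spectra by Euclidean distance to the query spectrum.
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy three-band "spectra" (hypothetical values, not measured data).
train_x = [
    (0.10, 0.12, 0.11), (0.11, 0.13, 0.10), (0.12, 0.11, 0.12),
    (0.40, 0.55, 0.35), (0.42, 0.50, 0.33), (0.38, 0.52, 0.36),
]
train_y = ["clean", "clean", "clean", "biofilm", "biofilm", "biofilm"]

prediction = knn_predict(train_x, train_y, (0.41, 0.53, 0.34), k=3)
```

Applied pixel by pixel, a classifier of this kind would produce a detection map of the coupon surface; a real pipeline would use the full 420–730 nm spectrum rather than three bands.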

An improved OPTICS clustering algorithm for discovering clusters with uneven densities

Intelligent Data Analysis ◽

10.3233/ida-205497 ◽

2021 ◽

Vol 25 (6) ◽

pp. 1453-1471

Author(s):

Chunhua Tang ◽

Han Wang ◽

Zhiwen Wang ◽

Xiangkun Zeng ◽

Huaran Yan ◽

...

Keyword(s):

Time Complexity ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Clustering Algorithms ◽

Substantial Improvement ◽

Experimental Results ◽

High Time ◽

Parameter Setting ◽

K Nearest Neighbor ◽

Density Based Clustering

Most density-based clustering algorithms suffer from difficult parameter setting, high time complexity, poor noise recognition, and weak clustering on datasets with uneven density. To solve these problems, this paper proposes the FOP-OPTICS algorithm (Finding of the Ordering Peaks Based on OPTICS), a substantial improvement on OPTICS (Ordering Points To Identify the Clustering Structure). The proposed algorithm finds the demarcation point (DP) in the augmented cluster-ordering generated by OPTICS and uses the reachability-distance of the DP as the neighborhood radius eps of its corresponding cluster. This overcomes the weakness of most algorithms in clustering datasets with uneven densities. By computing the distance to the k-nearest neighbor of each point, it reduces the time complexity of OPTICS; by calculating density-mutation points within the clusters, it can efficiently recognize noise. The experimental results show that FOP-OPTICS has the lowest time complexity and outperforms other algorithms in parameter setting and noise recognition.
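One ingredient mentioned above is the distance from each point to its k-th nearest neighbor. The sketch below computes only that quantity by brute force; it is not the FOP-OPTICS algorithm, and the function name `kth_neighbor_distances` and the toy points are assumptions made for illustration.

```python
import math


def kth_neighbor_distances(points, k):
    """Return, for each point, the Euclidean distance to its k-th nearest neighbor."""
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


# Two tight groups plus one isolated point (toy data, not from the paper).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (30.0, 30.0)]
k_dist = kth_neighbor_distances(points, k=2)
```

Points in dense regions get a small k-distance and isolated points a large one, which is why this value is a common starting point for choosing a neighborhood radius. The brute-force loop is quadratic in the number of points; a spatial index would be needed for large datasets.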

A Novel clustering method based on hybrid K-nearest-neighbor graph

Pattern Recognition ◽

10.1016/j.patcog.2017.09.008 ◽

2018 ◽

Vol 74 ◽

pp. 1-14 ◽

Cited By ~ 19

Author(s):

Yikun Qin ◽

Zhu Liang Yu ◽

Chang-Dong Wang ◽

Zhenghui Gu ◽

Yuanqing Li

Keyword(s):

Nearest Neighbor ◽

K Nearest Neighbor ◽

Clustering Method ◽

Neighbor Graph ◽

Nearest Neighbor Graph

KNN-DBSCAN: Using k-nearest neighbor information for parameter-free density based clustering

2017 International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT) ◽

10.1109/icicict1.2017.8342664 ◽

2017 ◽

Cited By ~ 7

Author(s):

Ankush Sharma ◽

Amit Sharma

Keyword(s):

Nearest Neighbor ◽

K Nearest Neighbor ◽

Density Based Clustering

A Scalable Clustering Method for Categorical Sequence Data

International Journal of Computational Methods ◽

10.1142/s0219876205000417 ◽

2005 ◽

Vol 02 (02) ◽

pp. 167-180

Author(s):

SEUNG-JOON OH ◽

JAE-YEARN KIM

Keyword(s):

Nearest Neighbor ◽

Sequence Data ◽

Clustering Algorithms ◽

K Nearest Neighbor ◽

Clustering Method ◽

Scalable Clustering ◽

Log Files ◽

Web Access ◽

Better Than

Sequence clustering is relatively unexplored, but it is becoming increasingly important in data mining applications such as web usage mining and bioinformatics. The web user segmentation problem uses web access log files to partition a set of users into clusters such that users within one cluster are more similar to one another than to users in other clusters. Similarly, grouping protein sequences that share a similar structure can help to identify sequences with similar functions. However, few clustering algorithms take sequential order into account. In this paper, we study how to cluster sequence datasets. Because hierarchical clustering algorithms have high computational complexity on large datasets, a new clustering method is required. We therefore propose a new scalable clustering method using sampling and a k-nearest-neighbor method. Using a splice dataset and a synthetic dataset, we show that the quality of the clusters generated by our proposed approach is better than that of clusters produced by traditional algorithms.
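One way to read "sampling and a k-nearest-neighbor method" is: cluster a small sample, then assign every remaining sequence to the cluster favored by its k nearest sampled sequences. The sketch below shows only that assignment step. It assumes edit (Levenshtein) distance as the sequence dissimilarity and takes the sample's cluster labels as given; the abstract specifies neither, so the distance measure, the function names, and the toy sequences are all hypothetical.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two sequences (assumed dissimilarity measure)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def assign_by_knn(sample, sample_labels, sequence, k=3):
    """Assign `sequence` to the majority cluster of its k nearest sampled sequences."""
    ranked = sorted(range(len(sample)),
                    key=lambda i: edit_distance(sample[i], sequence))
    votes = Counter(sample_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical sample that has already been clustered into groups 0 and 1.
sample = ["ABCD", "ABCE", "ABDD", "XYZW", "XYZZ", "XYWW"]
sample_labels = [0, 0, 0, 1, 1, 1]

remaining = ["ABCC", "XYZY"]
assigned = [assign_by_knn(sample, sample_labels, s) for s in remaining]
```

Only the sample goes through the expensive clustering step, so the cost for the remaining sequences grows with the sample size rather than with the full dataset size.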

Nearest neighbor-density-based clustering methods for large hyperspectral images

Image and Signal Processing for Remote Sensing XXIII ◽

10.1117/12.2278221 ◽

2017 ◽

Author(s):

Claude Cariou ◽

Kacem Chehdi

Keyword(s):

Nearest Neighbor ◽

Hyperspectral Images ◽

Clustering Methods ◽

Density Based Clustering

Improving K-Nearest Neighbor Approaches for Density-Based Pixel Clustering in Hyperspectral Remote Sensing Images

Remote Sensing ◽

10.3390/rs12223745 ◽

2020 ◽

Vol 12 (22) ◽

pp. 3745

Author(s):

Claude Cariou ◽

Steven Le Moan ◽

Kacem Chehdi

Keyword(s):

Image Analysis ◽

Nearest Neighbor ◽

Hyperspectral Image ◽

Local Density ◽

Hyperspectral Images ◽

K Nearest Neighbor ◽

Kappa Index ◽

Graph Regularization ◽

Hyperspectral Image Analysis ◽

Density Peaks

We investigated nearest-neighbor density-based clustering for hyperspectral image analysis. Four existing techniques were considered that rely on a K-nearest neighbor (KNN) graph to estimate local density and to propagate labels through algorithm-specific labeling decisions. We first improved two of these techniques, a KNN variant of the density peaks clustering method dpc, and a weighted-mode variant of knnclust, so the four methods use the same input KNN graph and only differ by their labeling rules. We propose two regularization schemes for hyperspectral image analysis: (i) a graph regularization based on mutual nearest neighbors (MNN) prior to clustering to improve cluster discovery in high dimensions; (ii) a spatial regularization to account for correlation between neighboring pixels. We demonstrate the relevance of the proposed methods on synthetic data and hyperspectral images, and show they achieve superior overall performances in most cases, outperforming the state-of-the-art methods by up to 20% in kappa index on real hyperspectral images.
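The graph regularization in (i) keeps an edge of the KNN graph only when both endpoints list each other as neighbors. The sketch below illustrates that pruning on toy 2-D points, followed by a simple local density estimate. The density formula (inverse mean distance to the retained neighbors) is an assumption chosen for illustration, not the estimator used in the paper, and all function names are hypothetical.

```python
import math


def knn_lists(points, k):
    """Indices of the k nearest neighbors of each point (brute force)."""
    neighbors = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        neighbors.append(order[:k])
    return neighbors


def mutual_knn_lists(neighbors):
    """Keep neighbor j of point i only if i is also among the neighbors of j."""
    as_sets = [set(n) for n in neighbors]
    return [[j for j in n if i in as_sets[j]] for i, n in enumerate(neighbors)]


def knn_density(points, neighbors):
    """Illustrative density: inverse mean distance to listed neighbors (0.0 if none)."""
    densities = []
    for i, n in enumerate(neighbors):
        if not n:
            densities.append(0.0)
            continue
        mean_dist = sum(math.dist(points[i], points[j]) for j in n) / len(n)
        densities.append(1.0 / mean_dist)
    return densities


# Three nearby points and one outlier (toy data).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
knn = knn_lists(points, k=2)
mnn = mutual_knn_lists(knn)
density = knn_density(points, mnn)
```

In this toy case the outlier still has two nearest neighbors in the plain KNN graph, but neither of them lists it back, so the mutual graph leaves it isolated with zero density.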

Improved Nearest Neighbor Density-Based Clustering Techniques with Application to Hyperspectral Images

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp40776.2020.9053489 ◽

2020 ◽

Author(s):

Claude Cariou ◽

Kacem Chehdi ◽

Steven Le Moan

Keyword(s):

Nearest Neighbor ◽

Hyperspectral Images ◽

Clustering Techniques ◽

Density Based Clustering

Variety Identification of Chinese Walnuts Using Hyperspectral Imaging Combined with Chemometrics

Applied Sciences ◽

10.3390/app11199124 ◽

2021 ◽

Vol 11 (19) ◽

pp. 9124

Author(s):

Hongzhe Jiang ◽

Liancheng Ye ◽

Xingpeng Li ◽

Minghong Shi

Keyword(s):

Hyperspectral Imaging ◽

Nearest Neighbor ◽

Characteristic Curve ◽

Principal Component ◽

Hyperspectral Images ◽

Variety Identification ◽

Support Vector ◽

Identification System ◽

K Nearest Neighbor ◽

Chinese Walnut

Chinese walnuts have extraordinary nutritional and organoleptic qualities, and counterfeit Chinese walnut products are pervasive in the market. The aim of this study was to investigate the feasibility of the hyperspectral imaging (HSI) technique for accurately identifying and visualizing Chinese walnut varieties. Hyperspectral images of 400 Chinese walnuts, including 200 samples of the Ningguo variety and 200 samples of the Lin’an variety, were acquired in the range of 400–1000 nm. Spectra were extracted from representative regions of interest (ROIs), and principal component analysis (PCA) of the spectra showed that the characteristic second principal component (PC2) was potentially effective in variety identification. The PC transformation was also applied to the hyperspectral images to produce an exploratory visualization based on pixel-wise PC scores. Three modeling methods, partial least squares-discriminant analysis (PLS-DA), k-nearest neighbor (KNN), and support vector machine (SVM), were individually employed to develop classification models. Results indicated that the PLS-DA model built on raw full spectra performed best, with correct classification rates (CCRs) of 97.33%, 95.33%, and 92.00% in the calibration, cross-validation, and prediction sets, respectively. The successive projections algorithm (SPA), competitive adaptive reweighted sampling (CARS), and PC loadings were individually used to select effective wavelengths. Subsequently, the simplified PLS-DA model based on wavelengths selected by CARS yielded the best CCRs of 96.33%, 95.67%, and 91.00% in the three sets. This optimal CARS-PLS-DA model achieved a sensitivity of 93.62%, a specificity of 88.68%, an area under the receiver operating characteristic curve (AUC) of 0.91, and a Kappa coefficient of 0.82 in the prediction set. Classification maps were finally generated by classifying the variety of each pixel in multispectral images at the CARS-selected wavelengths, and the general variety was then readily discernible. These results demonstrated that features extracted from HSI had outstanding discriminative ability and could serve as a reliable tool for the further development of an on-line identification system for Chinese walnut varieties.
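Sensitivity, specificity, CCR, and the Kappa coefficient can all be derived from one binary confusion matrix. The sketch below shows the arithmetic. The counts used (44, 3, 6, 47) are hypothetical: they were chosen only because they reproduce the prediction-set figures quoted above for a 100-sample set, and the study's actual confusion matrix is not given here.

```python
def binary_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, accuracy (CCR), and Cohen's kappa from binary counts."""
    total = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / total   # observed agreement
    # Agreement expected by chance, from the row and column marginals.
    p_pos = ((tp + fn) / total) * ((tp + fp) / total)
    p_neg = ((tn + fp) / total) * ((tn + fn) / total)
    expected = p_pos + p_neg
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Hypothetical counts, chosen to be consistent with the reported percentages.
metrics = binary_metrics(tp=44, fn=3, fp=6, tn=47)
```

With these counts the sensitivity is 44/47, the specificity is 47/53, the accuracy is 0.91, and the chance agreement is 0.50, which gives a kappa of (0.91 - 0.50) / (1 - 0.50) = 0.82.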

Machine Learning Verdict of EEG Signals in Brain Computer Interface

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit1838114 ◽

2018 ◽

pp. 429-441

Author(s):

M. Jeyanthi ◽

C. Velayutham

Keyword(s):

Nearest Neighbor ◽

Technology Development ◽

Vital Role ◽

Svm Classifier ◽

K Nearest Neighbor ◽

Data Mining Technique ◽

Data Set ◽

Eeg Data ◽

Irrelevant Attributes

Brain-computer interfaces (BCI) play a vital role in research and in science and technology development. Classification is a data mining technique used to predict group membership for data instances. Analyses of BCI data are challenging because feature extraction and classification of these data are more difficult than for raw data. In this paper, we extract statistical Haralick features from the raw EEG data. The features are then normalized, and binning is used to improve the accuracy of the predictive models by reducing noise and eliminating some irrelevant attributes. Classification is then performed on a BCI dataset using several techniques: naïve Bayes, k-nearest neighbor, and SVM classifiers. Finally, we propose the SVM classification algorithm for the BCI data set.
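The normalization and binning steps described above can be sketched as follows. The abstract does not say which normalization or binning scheme was used, so min-max scaling and equal-width bins are assumptions here, and the function names and feature values are illustrative.

```python
def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Map values in [0, 1] to bin indices 0 .. n_bins - 1."""
    # The value 1.0 would fall into bin n_bins, so clamp it to the last bin.
    return [min(int(v * n_bins), n_bins - 1) for v in values]


# One hypothetical feature column (not real EEG data).
feature = [2.0, 4.0, 6.0, 10.0]
normalized = min_max_normalize(feature)
binned = equal_width_bins(normalized, n_bins=4)
```

Binning replaces each continuous value with a coarse level, which smooths small fluctuations before the values reach a classifier such as naïve Bayes, k-NN, or an SVM.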
