Modeling River Ice Breakup Dates by k-Nearest Neighbor Ensemble

Forecasting the timing of river ice breakup is directly relevant to managing local ice-caused flooding. However, the use of k-nearest neighbor (kNN) algorithms in river ice forecasting has been limited. Thus, a kNN stacking ensemble learning (KSEL) method was developed and applied to forecasting breakup dates (BDs) for the Athabasca River at Fort McMurray in Canada. kNN base models with diverse inputs and distance functions were developed, and their outputs were then combined. The performance of these models was examined using leave-one-out cross-validation based on the historical BDs and corresponding climate and river conditions in 1980–2015. The results indicated that the kNN models with the Chebychev distance function generally outperformed the other kNN base models. With simple averaging, the ensemble kNN models using multiple types of distance functions (Mahalanobis and Chebychev) had the best overall performance among all models. The improved performance indicates that the kNN ensemble is a promising tool for river ice forecasting. The structure of the optimal models also implies that breakup timing is mainly linked with temperature and water flow conditions before breakup as well as during and just after freeze-up.
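
The idea of averaging kNN regressors that differ only in their distance function, and scoring the result with leave-one-out cross-validation, can be sketched as follows. This is a minimal illustration, not the authors' KSEL implementation: the two input features, the data values, k = 3 and all function names are assumptions made up for the example.

```python
import math

def chebyshev(a, b):
    """Chebychev distance: the largest coordinate-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))

def make_mahalanobis(points):
    """Build a Mahalanobis distance for 2-D points from their sample covariance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy  # determinant of the 2x2 covariance matrix

    def dist(a, b):
        dx, dy = a[0] - b[0], a[1] - b[1]
        # d^T * inverse(covariance) * d, written out for the 2x2 case
        return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)

    return dist

def knn_predict(train_x, train_y, query, k, dist):
    """Predict the mean target of the k training points nearest to the query."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    return sum(train_y[i] for i in order[:k]) / k

def loocv_ensemble(xs, ys, k=3):
    """Leave-one-out predictions of a simple-average two-model kNN ensemble."""
    preds = []
    for i in range(len(xs)):
        tx, ty = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        p_cheb = knn_predict(tx, ty, xs[i], k, chebyshev)
        p_maha = knn_predict(tx, ty, xs[i], k, make_mahalanobis(tx))
        preds.append((p_cheb + p_maha) / 2)  # simple average of the base models
    return preds

# Hypothetical records: (temperature index, flow index) -> breakup day of year.
xs = [(120.0, 300.0), (95.0, 280.0), (150.0, 350.0), (110.0, 260.0),
      (135.0, 330.0), (100.0, 310.0), (160.0, 340.0), (125.0, 290.0)]
ys = [112, 118, 105, 116, 108, 115, 103, 111]

preds = loocv_ensemble(xs, ys)
mae = sum(abs(p - y) for p, y in zip(preds, ys)) / len(ys)
print(f"Leave-one-out mean absolute error: {mae:.2f} days")
```

The covariance matrix is re-estimated from the training years inside each fold, so the held-out year does not influence the Mahalanobis distance used to predict it.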

Discrimination of Chinese Liquors Based on Electronic Nose and Fuzzy Discriminant Principal Component Analysis

Foods ◽

10.3390/foods8010038 ◽

2019 ◽

Vol 8 (1) ◽

pp. 38 ◽

Cited By ~ 2

Author(s):

Xiaohong Wu ◽

Jin Zhu ◽

Bin Wu ◽

Chao Zhao ◽

Jun Sun ◽

...

Keyword(s):

Principal Component Analysis ◽

Feature Extraction ◽

Electronic Nose ◽

Nearest Neighbor ◽

Principal Component ◽

Component Analysis ◽

K Nearest Neighbor ◽

Knn Classifier ◽

Extraction Algorithm ◽

Leave One Out

The detection of liquor quality is an important process in the liquor industry, and the quality of Chinese liquors is partly determined by the aromas of the liquors. The electronic nose (e-nose) refers to an artificial olfactory technology. The e-nose system can quickly detect different types of Chinese liquors according to their aromas. In this study, an e-nose system was designed to identify six types of Chinese liquors, and a novel feature extraction algorithm, called fuzzy discriminant principal component analysis (FDPCA), was developed for feature extraction from e-nose signals by combining discriminant principal component analysis (DPCA) and fuzzy set theory. In addition, principal component analysis (PCA), DPCA, K-nearest neighbor (KNN) classifier, leave-one-out (LOO) strategy and k-fold cross-validation (k = 5, 10, 20, 25) were employed in the e-nose system. The maximum classification accuracy of feature extraction for Chinese liquors was 98.378% using FDPCA, showing this algorithm to be extremely effective. The experimental results indicate that an e-nose system coupled with FDPCA is a feasible method for classifying Chinese liquors.
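
The relationship between the leave-one-out strategy and k-fold cross-validation can be sketched in a few lines: leave-one-out is the special case in which the number of folds equals the number of samples. The sketch below is an assumption-laden illustration, not the e-nose pipeline itself: it uses a plain 1-nearest-neighbor classifier on made-up two-dimensional points, with no PCA, DPCA or FDPCA step, and all names are hypothetical.

```python
def kfold_splits(n, k):
    """Yield (train_idx, test_idx) for k contiguous folds; k == n is leave-one-out."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

def nearest_label(train_x, train_y, query):
    """1-nearest-neighbor prediction using squared Euclidean distance."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)))
    return train_y[best]

def cv_accuracy(xs, ys, k):
    """Fraction of samples classified correctly across all k folds."""
    correct = 0
    for train, test in kfold_splits(len(xs), k):
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        correct += sum(nearest_label(tx, ty, xs[i]) == ys[i] for i in test)
    return correct / len(xs)

# Hypothetical feature vectors for two classes, interleaved so that
# contiguous folds contain both classes.
xs = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9),
      (0.1, 0.3), (4.8, 5.0), (0.3, 0.2), (5.1, 5.3)]
ys = ["a", "b", "a", "b", "a", "b", "a", "b"]

loo_accuracy = cv_accuracy(xs, ys, len(xs))  # leave-one-out
four_fold_accuracy = cv_accuracy(xs, ys, 4)  # 4-fold
print(loo_accuracy, four_fold_accuracy)
```

Real data would normally be shuffled before splitting; the folds here are contiguous only to keep the sketch short.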

Optimized Cost Model for k-NN Queries in R*-Trees over Random Distribution

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.403-408.3315 ◽

2011 ◽

Vol 403-408 ◽

pp. 3315-3321

Author(s):

Sirisala Nageswara Rao

Keyword(s):

Nearest Neighbor ◽

Cost Model ◽

Neighbor Join ◽

Multidimensional Data ◽

Analysis Model ◽

K Nearest Neighbor ◽

Storage And Retrieval ◽

Key Issues ◽

Improved Performance ◽

The Impact

Efficient storage and retrieval of large volumes of multidimensional data has become one of the key issues in the design and implementation of commercial and application software. The kinds of queries posed on such data are also varied. Nearest neighbor queries are one such category and are especially significant in GIS-type applications. The R-tree and its successors are data-partitioned, hierarchical, multidimensional indexing structures that serve this purpose. Current research has turned towards the development of powerful analytical methods to predict the performance of such indexing structures for various categories of queries such as range, nearest neighbor, join, etc. This paper focuses on the performance of the R*-tree for k-nearest neighbor (kNN) queries. While general approaches that work better for larger k over uniform data are available in the literature, few have explored the impact of small values of k. This paper proposes an improved performance analysis model for kNN queries for small k over random data. The results are tabulated and compared with existing models; the proposed model significantly outperforms the existing models for small k.

Moment Invariant Features Extraction for Hand Gesture Recognition of Sign Language based on SIBI

EMITTER International Journal of Engineering Technology ◽

10.24003/emitter.v5i1.173 ◽

2017 ◽

Vol 5 (1) ◽

pp. 119-138 ◽

Cited By ~ 2

Author(s):

Angga Rahagiyanto ◽

Achmad Basuki ◽

Riyanto Sigit

Keyword(s):

Sign Language ◽

Nearest Neighbor ◽

Sensor Data ◽

Deaf People ◽

K Nearest Neighbor ◽

Clock Rate ◽

Moment Invariant ◽

Myo Armband ◽

Length Data ◽

Leave One Out

The Myo Armband has become an immersive technology that helps deaf people communicate with each other. The problem with the Myo sensor is its unstable clock rate, which causes data of different lengths for the same period, even for the same gesture. This research proposes the Moment Invariant method to extract features from Myo sensor data. This method reduces the amount of data and produces data of equal length. This research is user-dependent, according to the characteristics of the Myo Armband. Testing was performed using the alphabet A to Z in SIBI, Indonesian Sign Language, with static and dynamic finger movements. There are 26 alphabet classes and 10 variants in each class. We use min-max normalization to guarantee the range of the data and the K-Nearest Neighbor method to classify the dataset. Performance analysis with the leave-one-out validation method produced an accuracy of 82.31%. A more advanced classification method is required to improve the detection results.
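
The two preprocessing ideas in this abstract, turning recordings of different lengths into feature vectors of equal length and then applying min-max normalization, can be illustrated with a short sketch. The abstract does not give the exact moment-invariant formulas, so the sketch substitutes plain statistical moments (mean, variance, skewness, kurtosis) as a stand-in; that substitution, the sample signals and all function names are assumptions.

```python
import math

def moment_features(signal):
    """Map a signal of any length to four moments (a stand-in feature set)."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((x - mean) ** 2 for x in signal) / n
    std = math.sqrt(var)
    if std == 0:
        return [mean, 0.0, 0.0, 0.0]
    skew = sum(((x - mean) / std) ** 3 for x in signal) / n
    kurt = sum(((x - mean) / std) ** 4 for x in signal) / n
    return [mean, var, skew, kurt]

def min_max_normalize(rows):
    """Rescale each feature column to [0, 1]; constant columns become 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in rows]

# Hypothetical recordings of different lengths, as an unstable clock rate
# might produce.
signals = [
    [0.1, 0.4, 0.9, 0.4, 0.1],
    [0.1, 0.2, 0.4, 0.7, 0.9, 0.7, 0.4, 0.2, 0.1],
    [0.5, 0.6, 0.2, 0.1, 0.1, 0.3, 0.8],
]
features = [moment_features(s) for s in signals]
normalized = min_max_normalize(features)
print(normalized)
```

Every recording ends up as a four-element vector regardless of how many samples it contained, which is what lets a distance-based classifier such as kNN compare them.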

On approximate k-nearest neighbor searches based on the earth mover’s distance for efficient content-based multimedia information retrieval

Computer Science and Information Systems ◽

10.2298/csis181010012 ◽

2019 ◽

Vol 16 (2) ◽

pp. 615-638 ◽

Cited By ~ 1

Author(s):

Min-Hee Jang ◽

Sang-Wook Kim ◽

Woong-Kee Loh ◽

Jung-Im Won

Keyword(s):

Nearest Neighbor ◽

Previous Method ◽

Multimedia Databases ◽

Distance Functions ◽

Index Structure ◽

Post Processing ◽

K Nearest Neighbor ◽

Earth Mover's Distance ◽

The Earth

The Earth Mover's Distance (EMD) is one of the most widely used distance functions for measuring the similarity between two multimedia objects. While it provides good search results, the EMD is too time-consuming to be used in large multimedia databases. To solve this problem, we propose an approximate k-nearest neighbor (k-NN) search method based on the EMD. In the proposed method, the overhead of both disk accesses and EMD computations is reduced significantly, thanks to the approximation. First, the proposed method builds an index using the M-tree, a distance-based multi-dimensional index structure, to reduce the disk access overhead. When building the index, we reduce the number of features in the multimedia objects through dimensionality reduction. When performing the k-NN search on the M-tree, we find a small set of candidates from the disk using the index and then perform post-processing on them. Second, the proposed method uses the approximate EMD for index retrieval and post-processing to reduce the computational overhead of the EMD. To compensate for the errors due to the approximation, the method provides a way to improve the accuracy of the approximate EMD. We performed extensive experiments to show the efficiency of the proposed method. The results show that the method achieves a significant improvement in performance with only small errors: it outperforms the previous method by up to 67.3% with only 3.5% error.
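
The general filter-and-refine pattern behind such a search (rank everything with a cheap distance on reduced features, then re-rank only a few candidates) can be sketched as below. This is not the paper's method: there is no M-tree, the refine step uses the exact EMD where the paper uses an approximate one, and the sketch assumes one-dimensional histograms of equal total mass, for which the EMD with ground distance |i - j| is the sum of absolute cumulative differences. The bin-merging reduction, the data and all names are hypothetical.

```python
def emd_1d(p, q):
    """Exact EMD between equal-mass 1-D histograms, ground distance |i - j|."""
    total, carry = 0.0, 0.0
    for a, b in zip(p, q):
        carry += a - b       # mass that still has to move to the right
        total += abs(carry)  # cost of carrying it across one bin boundary
    return total

def reduce_bins(h, factor):
    """Merge every `factor` adjacent bins (a simple dimensionality reduction)."""
    return [sum(h[i:i + factor]) for i in range(0, len(h), factor)]

def approx_knn(database, query, k, factor=2, n_candidates=4):
    """Return indices of k approximate nearest neighbors of the query."""
    reduced_query = reduce_bins(query, factor)
    # Filter step: rank all objects by the cheap EMD on reduced histograms.
    coarse = sorted(range(len(database)),
                    key=lambda i: emd_1d(reduce_bins(database[i], factor),
                                         reduced_query))
    candidates = coarse[:n_candidates]
    # Refine step: full-resolution EMD, computed for the candidates only.
    refined = sorted(candidates, key=lambda i: emd_1d(database[i], query))
    return refined[:k]

# Hypothetical 8-bin histograms, each summing to 1.
database = [
    [0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
    [0.25, 0.25, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0],
    [0.125] * 8,
]
query = [0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
neighbors = approx_knn(database, query, k=2)
print(neighbors)
```

In a real system the reduced histograms would be computed once and stored in the index rather than recomputed for every query, and the result can miss true neighbors whenever the filter step discards them.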

Raga classification based on pitch co-occurrence based features

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v24.i1.pp157-166 ◽

2021 ◽

Vol 24 (1) ◽

pp. 157

Author(s):

Vibhavari Rajadnya ◽

Kalyani R. Joshi

Keyword(s):

Nearest Neighbor ◽

Spectral Power ◽

Multimedia Data ◽

Experimental Result ◽

K Nearest Neighbor ◽

Audio Recordings ◽

Temporal Aspects ◽

Occurrence Pattern ◽

Occurrence Matrix ◽

Leave One Out

Analysis and classification of ragas is a pressing need, especially in the music industry. With the abundance of multimedia data on the internet, it is imperative to develop appropriate tools to classify ragas. In this work, an attempt has been made to use the occurrence pattern of pitch-based svaras (notes) for classification. The sequence of notes is an important cue in raga classification. A pitch-based svara (note) profile is formed. The pattern present in the signal, along with its statistical distribution, can be characterized using a co-occurrence matrix. The proposed note co-occurrence matrix summarizes this aspect. This matrix captures both tonal and temporal aspects of melody. Ragas differ in terms of the distribution of spectral power. K-nearest neighbor (KNN) has been used as the classifier. A publicly available database, part of the CompMusic project, is used; it consists of 300 recordings of 30 Hindustani ragas, totaling 130 hours of audio stored as 160 kbps mp3 files. A leave-one-out validation strategy is used to evaluate the performance. Experimental results indicate the effectiveness of the proposed scheme, which gives an accuracy of 93.7%.
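
A note co-occurrence matrix of the kind described here can be sketched as a table that counts how often one note is followed by another. The abstract does not specify the exact construction, so the choices below (12 note classes, counting only immediately consecutive notes, normalizing by the total count) are assumptions, as are the example sequence and function names.

```python
def note_cooccurrence(notes, n_notes=12):
    """Normalized counts of (current note, next note) pairs in a sequence."""
    counts = [[0] * n_notes for _ in range(n_notes)]
    for current, following in zip(notes, notes[1:]):
        counts[current][following] += 1
    total = sum(map(sum, counts))
    if total == 0:
        return [[0.0] * n_notes for _ in range(n_notes)]
    return [[c / total for c in row] for row in counts]

def flatten(matrix):
    """Turn the matrix into one feature vector for a classifier such as kNN."""
    return [value for row in matrix for value in row]

# Hypothetical note sequence, already quantized to indices 0..11.
sequence = [0, 2, 4, 2, 0, 2, 4, 5, 4, 2, 0]
matrix = note_cooccurrence(sequence)
feature_vector = flatten(matrix)
print(matrix[0][2], matrix[4][5])
```

Because row and column both index notes, the matrix records which notes occur (the tonal aspect) and in what order they follow each other (the temporal aspect).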

Analysis of Braycurtis, Canberra and Euclidean Distance in KNN Algorithm

SinkrOn ◽

10.33395/sinkron.v4i1.10207 ◽

2019 ◽

Vol 4 (1) ◽

pp. 74 ◽

Cited By ~ 1

Author(s):

Annisa Fadhillah Pulungan ◽

Muhammad Zarlis ◽

Saib Suwilo

Keyword(s):

Euclidean Distance ◽

Evaluation Method ◽

Nearest Neighbor ◽

Distance Matrix ◽

Training Data ◽

Distance Functions ◽

Classification Model ◽

K Nearest Neighbor ◽

Distance Method ◽

Canberra Distance

Classification is a technique used to build a classification model from a sample of training data. One of the most popular classification techniques is the K-Nearest Neighbor (KNN) algorithm. The KNN algorithm has important parameters that affect its performance: the value of K and the distance matrix. The distance between two points is determined by calculating the distance matrix before the KNN classification process. The purpose of this study was to analyze and compare the performance of the KNN using different distance functions. The distance functions are Braycurtis Distance, Canberra Distance and Euclidean Distance, compared from an accuracy perspective. This study uses the Iris Dataset from the UCI Machine Learning Repository. The evaluation method used is 10-Fold Cross-Validation. The results showed that the Braycurtis distance method performed better than the Canberra Distance and Euclidean Distance methods at K=6, K=7, K=8 and K=10, with accuracy values of 96%.
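
The three distance functions compared in this study can be written out directly and plugged into the same kNN voting routine. The sketch below uses the standard definitions of the Bray-Curtis, Canberra and Euclidean distances; the training points are made-up values rather than Iris samples, and the function names and k = 3 are assumptions for illustration.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def braycurtis(a, b):
    """Sum of absolute differences divided by sum of absolute sums."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(abs(x + y) for x, y in zip(a, b))
    return num / den if den else 0.0

def canberra(a, b):
    """Sum of per-coordinate |x - y| / (|x| + |y|), skipping 0/0 terms."""
    total = 0.0
    for x, y in zip(a, b):
        den = abs(x) + abs(y)
        if den:
            total += abs(x - y) / den
    return total

def knn_classify(train_x, train_y, query, k, dist):
    """Majority vote among the k training points nearest to the query."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature training set with two classes.
train_x = [(5.0, 3.4), (4.9, 3.0), (5.1, 3.5),
           (6.4, 2.9), (6.6, 3.0), (6.3, 2.8)]
train_y = ["A", "A", "A", "B", "B", "B"]
query = (5.0, 3.3)

predictions = {
    name: knn_classify(train_x, train_y, query, 3, dist)
    for name, dist in [("braycurtis", braycurtis),
                       ("canberra", canberra),
                       ("euclidean", euclidean)]
}
print(predictions)
```

Only the `dist` argument changes between runs, which is the same controlled comparison the study makes: identical data and K, different distance function.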

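Pengenalan Alfabet American Sign Language Menggunakan K-Nearest Neighbors Dengan Ekstraksi Fitur Histogram Of Oriented Gradients [American Sign Language Alphabet Recognition Using K-Nearest Neighbors with Histogram of Oriented Gradients Feature Extraction]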

Jurnal Teknik Informatika dan Sistem Informasi ◽

10.28932/jutisi.v5i3.1936 ◽

2020 ◽

Vol 5 (3) ◽

Author(s):

Muhammad Ezar Al Rivan ◽

Hafiz Irsyad ◽

Kevin Kevin ◽

Arta Tri Narta

Keyword(s):

American Sign Language ◽

Sign Language ◽

Euclidean Distance ◽

Nearest Neighbor ◽

Manhattan Distance ◽

American Sign ◽

K Nearest Neighbor ◽

K Nearest Neighbors ◽

Histogram Of Oriented Gradient ◽

Chebychev Distance

Sign language is used to communicate with people with disabilities. American Sign Language (ASL) is one of the popular sign languages. Histogram of Oriented Gradients (HOG) can be used for feature extraction. The features are then stored in a database. K-Nearest Neighbor is used to measure the distance between training features and test features. Three distances are used in this paper: Euclidean Distance, Manhattan Distance and Chebychev Distance. The best result is 0.99, obtained using Euclidean Distance and Manhattan Distance with k=3 and k=5.

Differential mobility spectrometry classification of bacteria

Future Microbiology ◽

10.2217/fmb-2019-0192 ◽

2020 ◽

Vol 15 (4) ◽

pp. 233-240

Author(s):

Lauri Hokkinen ◽

Artturi Kesti ◽

Jaakko Lepomäki ◽

Osmo Anttalainen ◽

Anton Kontunen ◽

...

Keyword(s):

Nearest Neighbor ◽

Bacterial Species ◽

Rapid Identification ◽

Differential Mobility Spectrometry ◽

K Nearest Neighbor ◽

Differential Mobility ◽

Classification Rate ◽

Timely Initiation ◽

Initiation Of Therapy ◽

Leave One Out

Aim: Rapid identification of bacteria would facilitate timely initiation of therapy and improve the cost-effectiveness of treatment. Traditional methods (culture, PCR) require reagents, consumables and hours to days to complete the identification. In this study, we examined whether differential mobility spectrometry could classify the most common bacterial species and genera, and distinguish Gram status, within minutes. Materials & methods: The gaseous headspaces of cultured bacterial samples were measured with differential mobility spectrometry, and the data were analyzed using k-nearest-neighbor classification and leave-one-out cross-validation. Results: Differential mobility spectrometry achieved a correct classification rate of 70.7% for all bacterial species. For bacterial genera, the rate was 77.6%, and for Gram status, 89.1%. Conclusion: The greatest difficulties arose in distinguishing bacteria of the same genus. Future improvement of the sensor characteristics may improve the classification accuracy.

PREDICTIVE QSAR MODELING OF PYRIDAZINYL DERIVATIVES USING K-NEAREST NEIGHBOR AND PHARMACOPHORE APPROACH

INDIAN DRUGS ◽

10.53879/id.54.07.10951 ◽

2017 ◽

Vol 54 (07) ◽

pp. 10-17

Author(s):

M.C. Sharma ◽

D.V. Kohli

Keyword(s):

Correlation Coefficient ◽

Nearest Neighbor ◽

Predictive Ability ◽

3D Qsar ◽

Qsar Model ◽

Field Analysis ◽

Angiotensin Ii Receptor ◽

K Nearest Neighbor ◽

Qsar Modeling ◽

Leave One Out

This study was carried out to elucidate the structural properties required for pyridazinyl derivatives to exhibit angiotensin II receptor activity. The best 2D-QSAR model selected had a correlation coefficient r² = 0.8156 and a cross-validated squared correlation coefficient q² = 0.7348; the predictive ability of the selected model was also confirmed by the leave-one-out cross-validation method. Further analysis was carried out using the 3D-QSAR k-nearest neighbor molecular field analysis approach; a leave-one-out cross-validated correlation coefficient of 0.7188 and a predictivity for the external test set of 0.7613 were obtained. By studying the QSAR models, one can select suitable substituents for active compounds with maximum potency.

Combining Multiple k-Nearest Neighbor Classifiers Using Different Distance Functions

Lecture Notes in Computer Science - Intelligent Data Engineering and Automated Learning – IDEAL 2004 ◽

10.1007/978-3-540-28651-6_93 ◽

2004 ◽

pp. 634-641 ◽

Cited By ~ 12

Author(s):

Yongguang Bao ◽

Naohiro Ishii ◽

Xiaoyong Du

Keyword(s):

Nearest Neighbor ◽

Distance Functions ◽

K Nearest Neighbor ◽

Nearest Neighbor Classifiers
