Efficiently Producing the K Nearest Neighbors in the Skyline on Vertically Partitioned Tables

2013 ◽  
Vol 3 (2) ◽  
pp. 58-77
Author(s):  
Marlene Goncalves ◽  
Maria-Esther Vidal

Criteria that induce a Skyline naturally represent users' preference conditions that are useful for discarding irrelevant data in large datasets. However, in the presence of high-dimensional Skyline spaces, the size of the Skyline can still be very large, making it infeasible for users to process this set of points. To identify the best points among the Skyline, the Top-k Skyline approach has been proposed. Top-k Skyline uses discriminatory criteria to induce a total order of the points that comprise the Skyline, and recognizes the best or top-k points based on these criteria. In this article the authors model queries as multi-dimensional points that represent bounds of VPT (Vertically Partitioned Table) property values, and datasets as sets of multi-dimensional points; the problem is to locate the k best tuples in the dataset whose distance to the query is minimized. A tuple is among the k best tuples whenever no other tuple is better in all dimensions and closer to the query point, i.e., the k best tuples correspond to the k nearest points to the query that are incomparable or belong to the Skyline. The authors name these tuples the k nearest neighbors in the Skyline. The authors propose a hybrid approach that combines Skyline and Top-k solutions and develop two algorithms: TKSI and k-NNSkyline. The proposed algorithms identify, among the Skyline tuples, the k ones with the lowest values of the distance metric, i.e., the k nearest neighbors to the multi-dimensional query that are incomparable. Empirically, the authors study the performance and quality of TKSI and k-NNSkyline. The experimental results show that TKSI is able to speed up the computation of the Top-k Skyline by at least 50% with respect to state-of-the-art solutions, whenever k is smaller than the size of the Skyline. Additionally, the results suggest that k-NNSkyline outperforms existing solutions by up to three orders of magnitude.
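
The notion of the k nearest neighbors in the Skyline can be illustrated with a minimal, naive sketch (a quadratic pairwise dominance test plus a distance sort, not the authors' TKSI or k-NNSkyline algorithms), assuming lower values are preferred in every dimension:

    # Minimal sketch, not the paper's algorithms: filter the skyline with a
    # pairwise dominance test, then keep the k points closest to the query.
    import math

    def dominates(p, q):
        """p dominates q if p is at least as good in every dimension and strictly better in one."""
        return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

    def k_nearest_in_skyline(points, query, k):
        skyline = [p for p in points
                   if not any(dominates(q, p) for q in points if q != p)]
        skyline.sort(key=lambda p: math.dist(p, query))
        return skyline[:k]

    # Example: tuples as 2-dimensional points, query at the origin.
    data = [(1, 9), (2, 4), (3, 3), (5, 1), (6, 6)]
    print(k_nearest_in_skyline(data, (0, 0), 2))   # -> [(3, 3), (2, 4)]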

Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 2940
Author(s):  
Luciano Ortenzi ◽  
Simone Figorilli ◽  
Corrado Costa ◽  
Federico Pallottino ◽  
Simona Violino ◽  
...  

The degree of olive maturation is a very important factor to consider at harvest time, as it influences the organoleptic quality of the final product, for both oil and table use. The Jaén index, evaluated by measuring the average coloring of olive fruits (peel and pulp), is currently considered one of the most indicative methods for determining the olive ripening stage, but it is a slow assay and its results are not objective. The aim of this work is to identify the ripeness degree of olive lots through a real-time, repeatable, and objective machine vision method that uses RGB image analysis based on a k-nearest neighbors classification algorithm. To compensate for different lighting scenarios, pictures were subjected to an automatic colorimetric calibration method, an advanced 3D algorithm using known values. To check the performance of the automatic machine vision method, a comparison was made with two visual operator image evaluations. For 10 images, the number of black, green, and purple olives was also visually evaluated by these two operators. The accuracy of the method was 60%. The system could easily be implemented in a specific mobile app developed for the automatic assessment of olive ripeness directly in the field, enabling advanced georeferenced data analysis.
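
A toy sketch of the classification step only (the colorimetric calibration is omitted, and the mean-RGB features, colour labels, and values below are invented for illustration, not taken from the paper):

    # Illustrative sketch: classify olive colour classes from mean RGB values
    # with scikit-learn's k-nearest neighbors classifier. Data are hypothetical.
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    X = np.array([[ 60, 120,  40],   # green
                  [ 55, 110,  45],   # green
                  [120,  60,  90],   # purple
                  [110,  55,  85],   # purple
                  [ 30,  30,  35],   # black
                  [ 25,  28,  30]])  # black
    y = ["green", "green", "purple", "purple", "black", "black"]

    knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
    print(knn.predict([[100, 50, 80]]))   # -> ['purple'] on this toy data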


2013 ◽  
Vol 51 ◽  
pp. 27-34 ◽  
Author(s):  
Jesús Bobadilla ◽  
Fernando Ortega ◽  
Antonio Hernando ◽  
Guillermo Glez-de-Rivera

Author(s):  
Ann Nosseir ◽  
Seif Eldin A. Ahmed

Having a system that classifies different types of fruit and identifies their quality would be of value in various areas, especially in the mass production of fruit products. This paper presents a novel system that differentiates between four fruit types and distinguishes decayed fruit from fresh. The algorithms used are based on the colour and texture features of the fruit images: they extract the RGB values together with first-order statistics and second-order Gray Level Co-occurrence Matrix (GLCM) statistics. To discriminate between the fruit types, Fine, Medium, Coarse, Cosine, Cubic, and Weighted K-Nearest Neighbors classifiers are applied; their accuracies are 96.3%, 93.8%, 25%, 83.8%, 90%, and 95%, respectively. These steps are tested with 46 pictures, taken with a mobile phone, of fruits in season at the time, i.e., banana, apple, and strawberry. All types were accurately identified. To tell decayed fruit apart from fresh, linear and quadratic Support Vector Machine (SVM) classifiers differentiated between them based on the colour segmentation and texture feature values of each fruit image. The accuracy of the linear SVM is 96% and that of the quadratic SVM is 98%.
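
A hedged sketch of this kind of feature pipeline (not the paper's code): first-order colour statistics plus second-order GLCM texture features, fed to a polynomial degree-2 SVM as a stand-in for the quadratic SVM; the images and labels below are random placeholders:

    # Sketch: RGB statistics + GLCM texture features, classified with an SVM.
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops
    from sklearn.svm import SVC

    def fruit_features(rgb_image):
        """Mean/std per RGB channel plus GLCM contrast, energy and homogeneity."""
        first_order = [rgb_image[..., c].mean() for c in range(3)] + \
                      [rgb_image[..., c].std() for c in range(3)]
        gray = rgb_image.mean(axis=2).astype(np.uint8)   # simple grayscale
        glcm = graycomatrix(gray, distances=[1], angles=[0], levels=256,
                            symmetric=True, normed=True)
        second_order = [graycoprops(glcm, p)[0, 0]
                        for p in ("contrast", "energy", "homogeneity")]
        return np.array(first_order + second_order)

    # Toy usage with random "images"; labels 0 = fresh, 1 = decayed (invented).
    rng = np.random.default_rng(0)
    images = [rng.integers(0, 256, (64, 64, 3), dtype=np.uint8) for _ in range(6)]
    X = np.stack([fruit_features(img) for img in images])
    y = [0, 0, 0, 1, 1, 1]
    clf = SVC(kernel="poly", degree=2).fit(X, y)
    print(clf.predict(X[:1]))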


2019 ◽  
Vol 16 (10) ◽  
pp. 4425-4430 ◽  
Author(s):  
Devendra Prasad ◽  
Sandip Kumar Goyal ◽  
Avinash Sharma ◽  
Amit Bindal ◽  
Virendra Singh Kushwah

Machine Learning is a growing area of computer science. This article focuses on prediction analysis using the K-Nearest Neighbors (KNN) Machine Learning algorithm. Data in the dataset are processed, analyzed, and predicted using the specified algorithm. Various Machine Learning algorithms, along with their pros and cons, are introduced. The KNN algorithm is studied in detail and implemented on the specified data with certain parameters. The research work elucidates prediction analysis and demonstrates the prediction of restaurant quality.
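
Since the paper's restaurant dataset and parameter choices are not given here, the following is only a generic sketch of a KNN prediction pipeline with feature scaling and a small grid search over k:

    # Generic KNN prediction sketch with hypothetical features and labels.
    import numpy as np
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(42)
    X = rng.normal(size=(200, 5))             # hypothetical restaurant features
    y = (X[:, 0] + X[:, 1] > 0).astype(int)   # hypothetical "good quality" label

    pipe = make_pipeline(StandardScaler(), KNeighborsClassifier())
    grid = GridSearchCV(pipe, {"kneighborsclassifier__n_neighbors": [3, 5, 7, 9]}, cv=5)
    grid.fit(X, y)
    print(grid.best_params_, round(grid.best_score_, 3))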


2019 ◽  
Vol 12 (4) ◽  
pp. 72
Author(s):  
Sara Alomari ◽  
Salha Abdullah

Concept maps have been used as an effective learning method to assist learners in identifying relationships between pieces of information, especially when teaching materials cover many topics or concepts. However, building a concept map manually is a long and tedious task: it is time-consuming and demands intensive effort in reading the full content and reasoning about the relationships among concepts. Due to this inefficiency, many studies have been carried out to develop intelligent algorithms using several data mining techniques. In this research, the authors aim to improve the Text Analysis-Association Rules Mining (TA-ARM) algorithm by using the weighted K-nearest neighbors (KNN) algorithm instead of traditional KNN. The weighted KNN is expected to improve the classification accuracy, which will, in turn, enhance the quality of the generated concept map.
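
The weighted-KNN substitution itself is straightforward to illustrate (independently of TA-ARM): in scikit-learn the only change from traditional KNN is weights="distance", which lets closer neighbors count more heavily in the vote. The toy data below are invented:

    from sklearn.neighbors import KNeighborsClassifier

    X = [[0.0], [0.2], [0.3], [1.0], [1.1]]
    y = ["A", "A", "A", "B", "B"]

    uniform  = KNeighborsClassifier(n_neighbors=5, weights="uniform").fit(X, y)
    weighted = KNeighborsClassifier(n_neighbors=5, weights="distance").fit(X, y)

    query = [[0.95]]
    print(uniform.predict(query))    # majority of all 5 labels -> ['A']
    print(weighted.predict(query))   # the two much closer B's dominate -> ['B']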


2018 ◽  
Vol 8 (10) ◽  
pp. 1927 ◽  
Author(s):  
Zuzana Dankovičová ◽  
Dávid Sovák ◽  
Peter Drotár ◽  
Liberios Vokorokos

This paper addresses the processing of speech data and their utilization in a decision support system. The main aim of this work is to utilize machine learning methods to recognize pathological speech, particularly dysphonia. We extracted 1560 speech features and used these to train the classification model. Three state-of-the-art classifiers were used: K-nearest neighbors, random forests, and support vector machines. We analyzed the performance of the classifiers with and without gender taken into account. The experimental results showed that it is possible to recognize pathological speech with a classification accuracy as high as 91.3%.
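
A hedged sketch of such a three-classifier comparison (the 1560-dimensional dysphonia feature set is not available here, so random placeholder features and labels are used):

    # Compare KNN, random forest and SVM with 5-fold cross-validation.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 1560))    # placeholder for extracted speech features
    y = rng.integers(0, 2, size=100)    # 0 = healthy, 1 = dysphonic (invented labels)

    for name, clf in [("KNN", KNeighborsClassifier()),
                      ("Random forest", RandomForestClassifier(random_state=0)),
                      ("SVM", SVC())]:
        scores = cross_val_score(clf, X, y, cv=5)
        print(f"{name}: mean accuracy {scores.mean():.3f}")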


Author(s):  
Xinzhong Zhu ◽  
Xinwang Liu ◽  
Miaomiao Li ◽  
En Zhu ◽  
Li Liu ◽  
...  

The recently proposed multiple kernel k-means with incomplete kernels (MKKM-IK) optimally integrates a group of pre-specified incomplete kernel matrices to improve clustering performance. Though it demonstrates promising performance in various applications, we observe that it does not sufficiently consider the local structure among the data and indiscriminately forces all pairwise sample similarities to align equally with their ideal similarity values. This could make the incomplete kernels less effectively imputed and, in turn, adversely affect the clustering performance. In this paper, we propose a novel localized incomplete multiple kernel k-means (LI-MKKM) algorithm to address this issue. Different from the existing MKKM-IK, LI-MKKM only requires the similarity of a sample to its k-nearest neighbors to align with the ideal similarity values. This helps the clustering algorithm focus on closer sample pairs that should stay together and avoids relying on unreliable similarity evaluations for farther sample pairs. We carefully design a three-step iterative algorithm to solve the resultant optimization problem and theoretically prove its convergence. Comprehensive experiments on eight benchmark datasets demonstrate that our algorithm significantly outperforms comparable state-of-the-art algorithms proposed in the recent literature, verifying the advantage of considering local structure.
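
A minimal sketch of the localization idea only (not the authors' full three-step LI-MKKM optimization or kernel imputation): build a neighbourhood mask from each sample's k nearest neighbours in kernel-induced distance, so that similarity alignment is evaluated only on those local pairs rather than on all pairs:

    import numpy as np

    def local_alignment_mask(K, k):
        """K: n x n kernel matrix; return a 0/1 mask keeping each row's k nearest neighbours."""
        n = K.shape[0]
        d = np.diag(K)
        # squared kernel-induced distance: ||phi(i) - phi(j)||^2 = K_ii + K_jj - 2 K_ij
        dist2 = d[:, None] + d[None, :] - 2 * K
        mask = np.zeros_like(K)
        for i in range(n):
            nn = np.argsort(dist2[i])[:k + 1]   # includes the sample itself
            mask[i, nn] = 1.0
        return np.maximum(mask, mask.T)         # keep the mask symmetric

    # Toy usage with a linear kernel on random data.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 3))
    K = X @ X.T
    print(local_alignment_mask(K, k=2))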

