nearest neighbors Latest Research Papers

The authorship identification task aims to identify the original author of an anonymous text sample from a set of candidate authors. It has several application domains, such as digital text forensics and information retrieval, and these domains are not limited to a specific language. However, most authorship identification studies focus on English, and limited attention has been paid to Urdu. Moreover, the accuracy of existing Urdu authorship identification solutions drops as the number of training samples per candidate author decreases and as the number of candidate authors increases. Consequently, these solutions are inapplicable to real-world cases. In addition, due to the unavailability of reliable POS taggers and sentence segmenters, all existing authorship identification studies on Urdu text are limited to word n-gram features. To overcome these limitations, we formulate a stylometric feature space that is not limited to word n-gram features. Based on this feature space, we use an authorship identification solution that transforms each text sample into a point set, retrieves candidate text samples, and relies on a nearest neighbors classifier to predict the original author of the anonymous text sample. To evaluate our solution, we create a significantly larger corpus than those used in existing studies and conduct several experiments, which show that our solution overcomes the limitations of existing studies and achieves an accuracy of 94.03%, higher than all previous authorship identification works.
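As a rough illustration of the word n-gram baseline that the abstract says earlier Urdu studies rely on, the sketch below builds a word-bigram frequency profile for each text and assigns an anonymous sample to the author of the closest labeled sample. It is not the paper's point-set method or its stylometric feature space; the function names, the whitespace tokenization, and the cosine distance are assumptions made for this example.

```python
import math
from collections import Counter


def ngram_profile(text, n=2):
    """Relative frequencies of word n-grams in a whitespace-tokenized text."""
    tokens = text.split()
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()} if total else {}


def cosine_distance(p, q):
    """1 - cosine similarity between two sparse profiles (1.0 if either is empty)."""
    dot = sum(v * q.get(g, 0.0) for g, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 1.0
    return 1.0 - dot / (norm_p * norm_q)


def predict_author(anonymous_text, labeled_samples, n=2):
    """1-nearest-neighbor: return the author of the closest labeled sample."""
    target = ngram_profile(anonymous_text, n)
    closest = min(
        labeled_samples,
        key=lambda sample: cosine_distance(target, ngram_profile(sample[1], n)),
    )
    return closest[0]


# Toy data, invented for this example only.
samples = [
    ("author_a", "the cat sat on the mat and the cat slept"),
    ("author_b", "stock prices rose as markets opened higher today"),
]
predicted = predict_author("the cat sat on the sofa", samples)
```

A real system would need many samples per author and richer features; the sketch only shows how a nearest neighbors decision follows once each text is mapped to a feature vector.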

A Large-Scale k-Nearest Neighbor Classification Algorithm Based on Neighbor Relationship Preservation

Wireless Communications and Mobile Computing, 2022, Vol 2022, pp. 1-11
DOI: 10.1155/2022/7409171

Author(s): Yunsheng Song, Xiaohan Kong, Chao Zhang

Keyword(s): Theoretical Analysis, Large Scale, Nearest Neighbor, Nearest Neighbors, Classification Algorithm, Data Partition, Knn Classification, Test Instance, Neighbor Relationship, Neighbor Classification

Because it makes no assumptions about the underlying data distribution and has strong generalization ability, the k-nearest neighbor (kNN) classification algorithm is widely used in face recognition, text classification, emotional analysis, and other fields. However, kNN must compute the similarity between the unlabeled instance and all training instances during prediction, which makes it difficult to apply to large-scale data. To overcome this difficulty, an increasing number of acceleration algorithms based on data partition have been proposed, but they lack theoretical analysis of the effect of data partition on classification performance. This paper analyzes that effect theoretically using empirical risk minimization and proposes a large-scale k-nearest neighbor classification algorithm based on neighbor relationship preservation. The process of searching for the nearest neighbors is converted to a constrained optimization problem, and the paper then estimates the difference in the objective function value at the optimal solution with and without data partition. According to this estimate, minimizing the similarity between instances in different subsets can largely reduce the effect of data partition. The minibatch k-means clustering algorithm is chosen to perform data partition for its effectiveness and efficiency. Finally, the nearest neighbors of the test instance are searched repeatedly in the set generated by successively merging candidate subsets until they no longer change, where the candidate subsets are selected based on the similarity between the test instance and the cluster centers. Experimental results on public datasets show that the proposed algorithm largely preserves the nearest neighbors found by the original kNN classification algorithm, shows no significant difference from it in classification accuracy, and outperforms two state-of-the-art algorithms.
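The search procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain Lloyd k-means stands in for the minibatch variant, Euclidean distance stands in for the similarity measure, and all function names (`kmeans`, `partitioned_knn`, and so on) are hypothetical.

```python
import math
import random
from collections import Counter


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_center(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda c: dist(point, centers[c]))


def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd k-means (assumption: a stand-in for minibatch k-means)."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)
    for _ in range(n_iter):
        assign = [nearest_center(p, centers) for p in points]
        for c in range(n_clusters):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, [nearest_center(p, centers) for p in points]


def partitioned_knn(train, labels, query, k, centers, assign):
    """Merge clusters in order of center distance until the k neighbors stop changing."""
    order = sorted(range(len(centers)), key=lambda c: dist(query, centers[c]))
    candidates = []   # indices of training points merged so far
    previous = None   # k nearest neighbors found in the previous round
    for c in order:
        members = [i for i, a in enumerate(assign) if a == c]
        if not members:
            continue
        candidates.extend(members)
        current = sorted(candidates, key=lambda i: dist(query, train[i]))[:k]
        if len(current) == k and current == previous:
            break  # neighbors unchanged after merging one more subset
        previous = current
    votes = Counter(labels[i] for i in previous)
    return votes.most_common(1)[0][0], previous


# Toy data: two well-separated groups, invented for this example.
train = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["a"] * 4 + ["b"] * 4
centers, assign = kmeans(train, n_clusters=2)
label, neighbors = partitioned_knn(train, labels, (0.2, 0.3), 3, centers, assign)
```

With many clusters, the loop usually stops after visiting only the few subsets nearest to the query, which is where the speed-up over a full scan comes from. The early stop is a heuristic, so the result can differ from exact kNN when a true neighbor sits in a cluster that is never merged.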

Surface roughness prediction in micro-plasma transferred arc metal additive manufacturing process using K-nearest neighbors algorithm

The International Journal of Advanced Manufacturing Technology, 2022
DOI: 10.1007/s00170-021-08639-2

Author(s): Pravin Kumar, Neelesh Kumar Jain

Keyword(s): Surface Roughness, Additive Manufacturing, Manufacturing Process, Nearest Neighbors, K Nearest Neighbors, Metal Additive Manufacturing, Plasma Transferred Arc, Transferred Arc, Roughness Prediction, Metal Additive

Decision Tree vs K-Nearest Neighbors: Machine Learning Based Wind Estimation for Unmanned Aerial Vehicles

2022
DOI: 10.2514/6.2022-2500

Author(s): Ahmed Baraka, Nathan Lindsay, Liang Sun, George Gorospe

Keyword(s): Machine Learning, Decision Tree, Unmanned Aerial Vehicles, Nearest Neighbors, K Nearest Neighbors, Wind Estimation, Aerial Vehicles

A new $k$-nearest neighbors classifier for functional data

Statistics and Its Interface, 2022, Vol 15 (2), pp. 247-260
DOI: 10.4310/20-sii650

Author(s): Jin-Ting Zhang, Tianming Zhu

Keyword(s): Functional Data, Nearest Neighbors, K Nearest Neighbors

Improved Affinity Propagation Clustering Based on K-Nearest Neighbors and Canopy Algorithm

Lecture Notes in Electrical Engineering - Genetic and Evolutionary Computing, 2022, pp. 438-448
DOI: 10.1007/978-981-16-8430-2_40

Author(s): Zhihe Wang, Gang Zhang, Hui Du, Yiyang Ni

Keyword(s): Nearest Neighbors, Affinity Propagation, K Nearest Neighbors, Affinity Propagation Clustering

Parallel Nearest Neighbors in Low Dimensions with Batch Updates

2022, pp. 195-208
DOI: 10.1137/1.9781611977042.16

Author(s): Guy E. Blelloch, Magdalen Dobson

Keyword(s): Nearest Neighbors, Low Dimensions

Development of a weed detection system using machine learning and neural network algorithms

Eastern-European Journal of Enterprise Technologies, 2021, Vol 6 (2 (114))
DOI: 10.15587/1729-4061.2021.246706

Author(s): Baydaulet Urmashev, Zholdas Buribayev, Zhazira Amirgaliyeva, Aisulu Ataniyazova, Mukhtar Zhassuzak, ...

Keyword(s): Neural Network, Machine Learning, Random Forest, Decision Tree, Detection System, Learning Algorithms, Nearest Neighbors, Plant Diseases, Weed Detection, K Nearest Neighbors

The detection of weeds at the stages of cultivation is very important for detecting and preventing plant diseases and eliminating significant crop losses. Traditional methods of performing this process require large costs and human resources and expose workers to the risk of contamination with harmful chemicals. To address these problems, and to save herbicides and pesticides and obtain environmentally friendly products, a program for detecting agricultural pests using the classical K-Nearest Neighbors, Random Forest, and Decision Tree algorithms, as well as the YOLOv5 neural network, is proposed. After analyzing the geographical areas of the country, a proprietary database with more than 1000 images for each class was formed from the images of the collected weeds. A brief review is given of scientific papers describing methods for identifying, classifying, and discriminating weeds based on machine learning algorithms, convolutional neural networks, and deep learning algorithms. As a result of the research, a weed detection system based on the YOLOv5 architecture was developed, and quality estimates of the above algorithms were obtained. According to the assessment, the accuracy of weed detection by the K-Nearest Neighbors, Random Forest, and Decision Tree classifiers was 83.3%, 87.5%, and 80%, respectively. Because the images of weeds of each species differ in resolution and level of illumination, the results of the neural network fall in the interval 0.82–0.92 for each class. Quantitative results obtained on real data demonstrate that the proposed approach can provide good results in classifying low-resolution images of weeds.

The Performance Analysis of K-Nearest Neighbors Based Detection Algorithm in Visible Light Communication Systems

International Journal of Scientific and Research Publications (IJSRP), 2021, Vol 11 (12), pp. 479-483
DOI: 10.29322/ijsrp.11.12.2021.p12069

Author(s): Mehmet Sönmez

Keyword(s): Performance Analysis, Visible Light, Communication Systems, Nearest Neighbors, Visible Light Communication, Detection Algorithm, K Nearest Neighbors

Factorial Analysis for Gas Leakage Risk Predictions from a Vehicle-Based Methane Survey

Applied Sciences, 2021, Vol 12 (1), pp. 115
DOI: 10.3390/app12010115

Author(s): Khongorzul Dashdondov, Mi-Hwa Song

Keyword(s): Air Pollution, Feature Selection, Natural Gas, Open Data, Nearest Neighbors, Factorial Analysis, Test Results, Accuracy Rate, K Nearest Neighbors, Gas Leakage

Natural gas (NG), typically methane, is released into the air, causing significant air pollution and environmental and health problems. There is now a need for widely applicable machine-based methods to predict gas losses. In this article, we propose to predict NG leakage levels through feature selection based on a factorial analysis (FA) of the USA’s urban natural gas open data. The method consists of three modules. First, we select essential features using FA. Then, the dataset is labeled by k-means clustering with OrdinalEncoder (OE)-based normalization. The final module uses several algorithms (extreme gradient boost (XGBoost), K-nearest neighbors (KNN), decision tree (DT), random forest (RF), Naive Bayes (NB), and multilayer perceptron (MLP)) to predict gas leakage levels. The proposed method is evaluated by accuracy, F1-score, mean squared error (MSE), and area under the ROC curve (AUC). The test results indicate that the F-OE-based classification method improves prediction performance. F-OE-based XGBoost (F-OE-XGBoost) showed the best performance, with 95.14% accuracy, an F1-score of 95.75%, an MSE of 0.028, and an AUC of 96.29%. The second-best results, an accuracy of 95.09%, an F1-score of 95.60%, an MSE of 0.029, and an AUC of 96.11%, were achieved by the F-OE-RF model.
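For readers unfamiliar with the evaluation measures named in this abstract, the sketch below shows how accuracy, a binary F1-score, and MSE can be computed from true and predicted labels. It assumes MSE means mean squared error over numeric class labels and uses a two-class case for F1; the label values are invented for illustration and are not the paper's data. AUC is omitted because it requires predicted scores rather than hard labels.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the chosen positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between numeric labels (assumed meaning of MSE)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented labels: 1 = leak level "high", 0 = "low".
y_true = [0, 1, 1, 0, 1, 1, 0, 0]
y_pred = [0, 1, 0, 0, 1, 1, 1, 0]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
```

For a multi-level leakage label, F1 would typically be averaged over classes, and MSE would penalize predictions more the further they fall from the true level.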

nearest neighbors
Recently Published Documents

UrduAI: Writeprints for Urdu Authorship Identification

A Large-Scale k-Nearest Neighbor Classification Algorithm Based on Neighbor Relationship Preservation

Surface roughness prediction in micro-plasma transferred arc metal additive manufacturing process using K-nearest neighbors algorithm

Decision Tree vs K-Nearest Neighbors: Machine Learning Based Wind Estimation for Unmanned Aerial Vehicles

A new $k$-nearest neighbors classifier for functional data

Improved Affinity Propagation Clustering Based on K-Nearest Neighbors and Canopy Algorithm

Parallel Nearest Neighbors in Low Dimensions with Batch Updates

Development of a weed detection system using machine learning and neural network algorithms

The Performance Analysis of K-Nearest Neighbors Based Detection Algorithm in Visible Light Communication Systems

Factorial Analysis for Gas Leakage Risk Predictions from a Vehicle-Based Methane Survey