A New Hashing based Nearest Neighbors Selection Technique for Big Datasets

2021 ◽  
Author(s):  
Jude Tchaye-Kondi ◽  
Yanlong Zhai ◽  
Liehuang Zhu

KNN has a reputation as a simple yet powerful supervised learning algorithm for both classification and regression. Although KNN's prediction performance depends heavily on the size of the training dataset, KNN suffers from slow decision making when that dataset is large, because each decision requires searching the entire dataset for the nearest neighbors. To overcome this slowness, we propose a new technique that selects nearest neighbors directly in the neighborhood of a given data point. The proposed approach divides the data space into sub-cells of a virtual grid built on top of the dataset. The mapping between data points and sub-cells is achieved using hashing. To select the nearest neighbors of a new observation, we first identify the central cell that contains the observation. Once that central cell is known, we search for the nearest neighbors in it and in the surrounding cells. In our experimental performance analysis on publicly available datasets, our algorithm outperforms the original KNN while matching its predictive quality, and offers performance competitive with solutions such as KDtree.
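The grid-and-hash idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cell_of`, `build_grid`, `grid_knn`), the uniform `cell_size`, and the ring-by-ring expansion with its stopping rule are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import product

def cell_of(point, cell_size):
    """Hash a point to the integer coordinates of the grid cell containing it."""
    return tuple(math.floor(x / cell_size) for x in point)

def build_grid(points, cell_size):
    """Map each occupied cell to the indices of the training points inside it."""
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        grid[cell_of(p, cell_size)].append(idx)
    return grid

def grid_knn(query, points, grid, cell_size, k):
    """Return indices of the k nearest points, searching outward from the query's cell."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    center = cell_of(query, cell_size)
    candidates = []
    radius = 0
    while True:
        # Visit only the cells on the ring at Chebyshev distance `radius`.
        for offset in product(range(-radius, radius + 1), repeat=len(query)):
            if max(abs(o) for o in offset) != radius:
                continue
            key = tuple(c + o for c, o in zip(center, offset))
            for idx in grid.get(key, ()):
                candidates.append((math.dist(query, points[idx]), idx))
        candidates.sort()
        # Every unvisited point lies more than radius * cell_size away,
        # so the current top k are final once they fit inside that bound.
        if len(candidates) >= k and candidates[k - 1][0] <= radius * cell_size:
            return [idx for _, idx in candidates[:k]]
        radius += 1
```

The number of cells per ring grows exponentially with the dimension, so a sketch like this is only practical for low-dimensional data; how the paper handles higher dimensions is not stated in the abstract.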

Author(s):  
Amir Al-Khafaji ◽  
Krishnanand Maillacheruvu ◽  
Robert Jacobs

A new technique to assess the reliability of published compression index equations in terms of soil void ratio is presented. Several published equations pertaining to different soil types are examined in terms of accuracy and applicability. The new technique employs regression analysis to examine a substantial number of published compression data objectively. The traditional bias inherent in the selection of the number of data points and the range of void ratios for a given regression equation is eliminated. This was made possible by creating ranges for the compression index irrespective of the data set involved. The technique revealed that a strong correlation exists between the slopes and intercepts of all published equations. The slopes and intercepts of the newly developed regression equations were used to compare several well-known published equations and to assess their accuracy and applicability. The proposed technique permits the examination of the authenticity of any published empirical equation relating the compression index of clay to void ratio.


Mathematics ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 286 ◽  
Author(s):  
Hamid Saadatfar ◽  
Samiyeh Khosravi ◽  
Javad Hassannataj Joloudari ◽  
Amir Mosavi ◽  
Shahaboddin Shamshirband

The K-nearest neighbors (KNN) machine learning algorithm is a well-known non-parametric classification method. However, like other traditional data mining methods, applying it to big data comes with computational challenges. KNN determines the class of a new sample based on the classes of its nearest neighbors; identifying those neighbors in a large amount of data, however, imposes such a large computational cost that the method is no longer practical on a single computing machine. One of the techniques proposed to make classification methods applicable to large datasets is pruning. LC-KNN is an improved KNN method that first clusters the data into smaller partitions using the K-means clustering method, and then applies KNN for each new sample on the partition whose center is nearest. However, because the clusters have different shapes and densities, selecting the appropriate cluster is a challenge. In this paper, an approach is proposed to improve the pruning phase of the LC-KNN method by taking these factors into account. The proposed approach helps choose a more appropriate cluster of data in which to look for the neighbors, thus increasing the classification accuracy. The performance of the proposed approach is evaluated on different real datasets. The experimental results show the effectiveness of the proposed approach, with higher classification accuracy and lower time cost than other recent relevant methods.
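The baseline LC-KNN scheme summarized in this abstract (cluster with K-means, then run KNN only inside the partition whose center is nearest) can be sketched as follows. This shows only the baseline, not the paper's improved cluster selection, whose shape and density criteria are not given here. The helper names, the naive first-points initialization, and the fixed iteration count are assumptions for illustration.

```python
import math
from collections import Counter

def assign(points, centers):
    """Group point indices by the index of their nearest center."""
    clusters = [[] for _ in centers]
    for idx, p in enumerate(points):
        nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        clusters[nearest].append(idx)
    return clusters

def kmeans(points, n_clusters, n_iter=20):
    """Plain Lloyd's algorithm; initial centers are the first points (assumption)."""
    dim = len(points[0])
    centers = [tuple(p) for p in points[:n_clusters]]
    for _ in range(n_iter):
        clusters = assign(points, centers)
        # Move each center to the mean of its members; keep it if the cluster is empty.
        centers = [
            tuple(sum(points[i][d] for i in members) / len(members) for d in range(dim))
            if members else centers[c]
            for c, members in enumerate(clusters)
        ]
    return centers, assign(points, centers)

def lc_knn_predict(query, points, labels, centers, clusters, k):
    """Majority vote among the k nearest points of the cluster with the nearest center."""
    non_empty = [c for c in range(len(centers)) if clusters[c]]
    nearest = min(non_empty, key=lambda c: math.dist(query, centers[c]))
    members = sorted(clusters[nearest], key=lambda i: math.dist(query, points[i]))[:k]
    return Counter(labels[i] for i in members).most_common(1)[0][0]
```

Because only one partition is searched, a query near a cluster boundary can miss true neighbors that fall in an adjacent partition, which is the kind of cluster-selection problem the abstract sets out to address.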


Author(s):  
Xiaoning Yuan ◽  
Hang Yu ◽  
Jun Liang ◽  
Bing Xu

Recently the density peaks clustering algorithm (DPC) has received a lot of attention from researchers. The DPC algorithm is able to find cluster centers and complete clustering tasks quickly. It is also suitable for different kinds of clustering tasks. However, deciding the cutoff distance $d_c$ largely depends on human experience, which greatly affects clustering results. In addition, the selection of cluster centers requires manual participation, which affects the efficiency of the algorithm. In order to solve these problems, we propose a density peaks clustering algorithm based on K nearest neighbors with adaptive merging strategy (KNN-ADPC). A cluster merging strategy is proposed to automatically aggregate over-segmented clusters. Additionally, the K nearest neighbors are adopted to divide data points more reasonably. There is only one parameter in the KNN-ADPC algorithm, and the clustering task can be conducted automatically without human involvement. The experimental results on artificial and real-world datasets demonstrate the higher accuracy of KNN-ADPC compared with DBSCAN, K-means++, DPC, and DPC-KNN.


2011 ◽  
Vol 4 (4) ◽  
pp. 139-142
Author(s):  
S. PUSHPARANI ◽  
Dr. S. SENTHAMILKUMAR

Author(s):  
Lidia K Simanjuntak ◽  
Tessa Y M Sihite ◽  
Mesran Mesran ◽  
Nuning Kurniasih ◽  
Yuhandri Yuhandri

Every year, all colleges organize the selection of new admissions. Universities, as education providers, admit prospective students by selecting them based on their achievement in school and on the college entrance selection. To select the best student candidates based on predetermined criteria, Multi-Criteria Decision Making (MCDM), commonly called a decision support system, can be used. One method in MCDM is Élimination Et Choix Traduisant la Réalité (ELECTRE). ELECTRE is a method for selecting the best action. It obtains the best alternative by eliminating alternatives that do not fit the criteria, and it can be applied to admission decisions in the SNMPTN invitation track.


Author(s):  
Liza Handayani ◽  
Muhammad Syahrizal ◽  
Kennedi Tampubolon

The head of the environment (kepala lingkungan, or kepling) is an extension of the village head, assisting or providing services to the community both in village administration and in other matters. It is natural for a kepling to be recognized for their performance during their tenure, here specifically in the Medan Area subdistrict (kecamatan). Previously, the selection of an exemplary kepling in a subdistrict was very inefficient and seemed unfair, so this selection should use a system that produces accurate values and is free of any element of deliberate bias. To overcome the obstacles in the process of selecting an exemplary kepling, an application called a Decision Support System is used. A Decision Support System (SPK) is a system that can solve a problem, and here it is supported by two methods. The Rank Order Centroid (ROC) method assigns a weight to each criterion based on its priority level. The Additive Ratio Assessment (ARAS) method performs the ranking that determines the exemplary kepling; it supports decisions based on the ranking, that is, on the highest value.

Keywords: Head of Medan Area Subdistrict, SPK, Rank Order Centroid (ROC), Additive Ratio Assessment (ARAS).
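As a rough illustration of the two methods named in this abstract, the sketch below computes Rank Order Centroid weights, $w_i = \frac{1}{n}\sum_{k=i}^{n}\frac{1}{k}$, and ranks alternatives with the standard ARAS steps (optimal row, sum normalization, weighted sum, utility relative to the optimum). The function names and any criteria or scores passed in are hypothetical; the paper's actual criteria and data are not given in the abstract.

```python
def roc_weights(n):
    """Rank Order Centroid weights for n criteria ordered from most to least important."""
    return [sum(1 / k for k in range(i, n + 1)) / n for i in range(1, n + 1)]

def aras_utility(matrix, weights, is_benefit):
    """Return the ARAS utility degree of each alternative (1.0 means optimal).

    matrix[i][j] is the positive score of alternative i on criterion j;
    is_benefit[j] is True when larger values are better for criterion j.
    """
    # Row 0 is the optimal alternative: best observed value per criterion.
    optimal = [max(col) if benefit else min(col)
               for col, benefit in zip(zip(*matrix), is_benefit)]
    rows = [optimal] + [list(row) for row in matrix]
    # Cost criteria are inverted so that larger is always better.
    transformed = [[x if benefit else 1 / x for x, benefit in zip(row, is_benefit)]
                   for row in rows]
    col_sums = [sum(col) for col in zip(*transformed)]
    # Weighted sum of the sum-normalized values for each row.
    scores = [sum(w * x / s for x, w, s in zip(row, weights, col_sums))
              for row in transformed]
    # Utility degree: each alternative's score relative to the optimal row.
    return [score / scores[0] for score in scores[1:]]
```

A hypothetical call such as `aras_utility([[80, 5], [90, 3], [70, 4]], roc_weights(2), [True, False])` scores three candidates on one benefit criterion and one cost criterion; the candidate with the highest utility degree ranks first.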


Author(s):  
Fajar Syahputra ◽  
Mesran Mesran ◽  
Ikhwan Lubis ◽  
Agus Perdana Windarto

The teacher is a major pillar in the world of education; the ability and achievement of students cannot be separated from the role of a teacher in teaching and guiding them. Article 1 of Law of the Republic of Indonesia No. 14 of 2005 concerning Teachers and Lecturers explains that teachers are professional educators whose main task is educating, teaching, guiding, directing, training, assessing, and evaluating students in early childhood education through formal education, basic education, and secondary education. Article 4 of the Act explains that the position of teachers as professionals serves to enhance the dignity and role of teachers as learning agents who improve the quality of national education.

Decision making is a selection process among various alternatives that aims to meet one or several targets. The decision-making system has four phases: intelligence, design, choice, and implementation. These phases are the basis for decision making, which ends with a recommendation.

The Preference Selection Index (PSI) method is a rarely used decision support system method. It was developed by Maniya and Bhatt (2010) to solve Multi-Criteria Decision Making (MCDM) problems. With the right consideration, this method can be one of the tools for determining policies in decision-making systems, especially the selection of outstanding teachers. Policies taken as a basis for decision making must use criteria that can be defined clearly and objectively.

Keywords: Decision Support System, PSI, Selection of Achieving Teachers


2019 ◽  
Author(s):  
Winda Safitri Caniago ◽  
Hade Afriansyah

Decision making is an action that determines the outcome of solving a problem by choosing a course of action among alternatives through mental processes, logical processes, and so on. The purpose of this article is to make problem solving easier. The article explains several decision-making strategies: the optimization model, the satisficing model, the mixed scanning model, the heuristic model, and finally the selection of a particular model.


2016 ◽  
Vol 7 (1) ◽  
pp. 12-18
Author(s):  
Joko Haryanto ◽  
Seng Hansun

This paper describes the development of a decision support system application that assists students who want to enter college, so that no one chooses the wrong major. The application uses the fuzzy logic method because fuzzy logic is very flexible with vague data, which can be represented as linguistic variables. The purpose of the application is to help students choose, from the majors available at University Multimedia Nusantara, those that suit their capabilities. The application accepts five input values: Mathematics, Indonesian, English, Physics, and TIK. The system processes these inputs for decision making and generates output showing how well the student matches each major. With this application, prospective students can find out which majors match their capabilities. The application achieves a match-result accuracy of 99 percent.

Index Terms: fuzzy logic, decision support system, UMN, selection of major

