A fast iterative features selection for the K-nearest neighbor

Author(s):  
Zongtao Han ◽  
Wei Wang ◽  
Zengyuan Li ◽  
Erxue Chen ◽  
Qiuping Wang ◽  
...  
Author(s):  
Aimrudee Jongtaveesataporn ◽  
Shingo Takada

The selection of services is a key part of Service Oriented Architecture (SOA). Services are primarily selected based on function, but Quality of Service (QoS) is an important factor when choosing among several services with the same function. However, current service selection approaches often spend time unnecessarily recomputing requests. Furthermore, if the same service is chosen as having the "best" QoS for multiple selections, that service may become overloaded. We thus propose the FASICA (FAst service selection for SImilar constraints with CAche) framework, which chooses a service with satisfactory QoS as quickly as possible. The key points are (1) to use a cache that stores previous search results, (2) to use the K-Nearest Neighbor (K-NN) algorithm with a K-d tree when a satisfactory service does not exist in the cache, and (3) to distribute service requests according to a distribution policy. Simulation results show that our framework selects a service more rapidly than a conventional approach.
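The cache-then-K-d-tree lookup described in this abstract can be illustrated with a minimal sketch. It is not the FASICA implementation: the two-dimensional QoS vector (response time, cost), the service names, the `Selector` class, and the exact-match cache key are all assumptions made for illustration, and the distribution policy of point (3) is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    point: tuple          # hypothetical QoS vector, e.g. (response_time_ms, cost)
    name: str             # hypothetical service name
    axis: int             # dimension this node splits on
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(items, depth=0):
    """Build a k-d tree from (name, point) pairs, splitting on the median."""
    if not items:
        return None
    axis = depth % len(items[0][1])
    items = sorted(items, key=lambda it: it[1][axis])
    mid = len(items) // 2
    name, point = items[mid]
    return Node(point, name, axis,
                build(items[:mid], depth + 1),
                build(items[mid + 1:], depth + 1))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, target, best=None):
    """Return (squared distance, name) of the stored point closest to target."""
    if node is None:
        return best
    d = sq_dist(node.point, target)
    if best is None or d < best[0]:
        best = (d, node.name)
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Search the far side only if the splitting plane is closer than the best hit.
    if diff ** 2 < best[0]:
        best = nearest(far, target, best)
    return best


class Selector:
    """Assumed simplification: the cache is keyed on the exact constraint tuple."""

    def __init__(self, services):
        self.tree = build(list(services.items()))
        self.cache = {}
        self.tree_lookups = 0

    def select(self, constraint):
        key = tuple(constraint)
        if key not in self.cache:          # cache miss: fall back to the k-d tree
            self.tree_lookups += 1
            self.cache[key] = nearest(self.tree, key)[1]
        return self.cache[key]


services = {"A": (120.0, 0.5), "B": (80.0, 1.0), "C": (200.0, 0.1)}
selector = Selector(services)
first = selector.select((90.0, 0.9))
second = selector.select((90.0, 0.9))      # answered from the cache
```

The second call returns without touching the tree, which is the effect the abstract attributes to caching previous search results. A real system would presumably match similar rather than identical constraints and would normalize QoS dimensions before computing distances.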


2022 ◽  
Vol 8 (1) ◽  
pp. 50
Author(s):  
Rifki Indra Perwira ◽  
Bambang Yuwono ◽  
Risya Ines Putri Siswoyo ◽  
Febri Liantoni ◽  
Hidayatulah Himawan

State universities maintain libraries to support students’ education and research; these hold books, journals, and final assignments. An intelligent document classification system is needed to make the library easier to use for visitors in higher education, as a service to students. The documents in the library are generally the results of research. Complaints about imbalanced text data, categories based on irrelevant document titles, and words with ambiguous meanings when searching for documents are the main reasons a classification system is needed. This research uses k-Nearest Neighbor (k-NN) to categorize documents by study interest, with information gain feature selection to handle unbalanced data and cosine similarity to measure the distance between test and training data. In tests conducted with 276 training data items, the best result with information gain feature selection was an accuracy of 87.5%, obtained with 80% training data, 20% test data, and k=5. The highest accuracy, 92.9%, was achieved without information gain feature selection, with 90% training data, 10% test data, and k=5, 7, and 9. The paper concludes that the system is more accurate without information gain feature selection than with it, because every word in a document title is considered to play an essential role in forming the classification.
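As a rough illustration of k-NN over document titles with cosine similarity, the sketch below uses plain word counts as features. The example titles, the labels "AI" and "DB", and the helper names are hypothetical and not taken from the paper, and the information gain selection step is left out entirely.

```python
from collections import Counter
from math import sqrt


def vectorize(title):
    """Bag-of-words term counts for a title (assumed whitespace tokenization)."""
    return Counter(title.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_predict(train, title, k=5):
    """Majority label among the k training titles most similar to `title`."""
    query = vectorize(title)
    ranked = sorted(train,
                    key=lambda ex: cosine(vectorize(ex[0]), query),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training titles with study-interest labels.
train = [
    ("image classification with neural networks", "AI"),
    ("neural networks for speech recognition", "AI"),
    ("deep learning image segmentation", "AI"),
    ("database index design", "DB"),
    ("query optimization in relational database", "DB"),
    ("transaction processing in distributed database", "DB"),
]

label_ai = knn_predict(train, "neural networks for image recognition", k=3)
label_db = knn_predict(train, "distributed database query", k=3)
```

Because every title word contributes to the similarity score here, the sketch corresponds to the configuration without feature selection; adding information gain would mean dropping low-scoring words from the vectors before comparison.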


Author(s):  
Yong Wang ◽  
Lin Li

This paper provides a case study of diagnosing helicopter swashplate ball bearing faults using vibration signals. We develop and apply feature extraction and selection techniques in the time, frequency, and joint time-frequency domains to differentiate six types of swashplate bearing conditions: low-time, to-be-overhauled, corroded, cage-popping, spalled, and case-overlapping. With proper selection of the features, it is shown that even the simple k-nearest neighbor (k-NN) algorithm is able to correctly identify these six types of conditions on the tested data. The developed method is useful for helicopter swashplate condition monitoring and maintenance scheduling. It is also helpful for testing the manufactured swashplate ball bearings for quality control purposes.


Catalysts ◽  
2020 ◽  
Vol 10 (11) ◽  
pp. 1364
Author(s):  
Ze Dong ◽  
Ling Li ◽  
Laiqing Yan ◽  
Ming Sun ◽  
Jinsong Li

To control NH3 injection in the selective catalytic reduction of nitrogen oxide (NOx) denitration (SCR de-NOx) process, a model that can accurately and quickly predict outlet NOx emissions is required. This paper presents a dynamic kernel partial least squares (KPLS) model, incorporating delay estimation and variable selection, for outlet NOx emissions, and investigates a control strategy for NH3 injection. First, k-nearest neighbor mutual information (KNN_MI) was used for delay estimation, taking into account the effect of historical data length on KNN_MI. A bidirectional search based on the change rate of KNN_MI (KNN_MI_CR) was used for variable selection. A delay–time difference update algorithm and a feedback correction strategy were proposed. Second, the NH3 injection compensator (NIC) and the outlet NOx emission model together constituted a correction controller, whose output is added to the output of the existing controller to give a suitable NH3 injection. Finally, the KNN_MI_CR method was compared with different algorithms on a benchmark dataset. Results on field data showed that the KNN_MI_CR method could improve model accuracy for reconstructed samples. The final model can accurately predict outlet NOx emissions in different operating states. The control result not only meets the NOx emissions standard (50 mg/m3) but also maintains high de-NOx efficiency (80%). NH3 injection and NH3 escape are reduced by 11% and 39%, respectively.


2017 ◽  
Vol 25 (4) ◽  
pp. 103-124 ◽  
Author(s):  
Le Nguyen Bao ◽  
Dac-Nhuong Le ◽  
Gia Nhu Nguyen ◽  
Le Van Chung ◽  
Nilanjan Dey

Face recognition is an important step that can affect the performance of the system. In this paper, the authors propose a novel Max-Min Ant System algorithm for optimal feature selection based on Discrete Wavelet Transform features for video-based face recognition. The length of the culled feature vector is adopted as heuristic information for the ants' pheromone in their algorithm. They select the optimal feature subset in terms of the shortest feature length and the best performance of a k-nearest neighbor classifier. Experiments on face recognition show that the authors' algorithm can be easily implemented and requires no a priori information about the features. The evaluated performance of their algorithm is better than that of previous approaches to feature selection.


Author(s):  
Fei-Long Chen ◽  
Feng-Chia Li

Credit scoring is an important topic for businesses and socio-economic establishments that collect huge amounts of data with the aim of avoiding wrong decisions. In this paper, the authors propose four approaches that combine feature selection with four well-known classifiers: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Back-Propagation Network (BPN), and Extreme Learning Machine (ELM). These classifiers are used to find a suitable hybrid combination of classifier and feature selection that retains sufficient information for classification. To this end, different credit scoring combinations are constructed by selecting features with the four approaches and pairing them with the classifiers. Two credit data sets from the University of California, Irvine (UCI) are used to evaluate the accuracy of the various hybrid feature selection models. The paper describes the procedures that make up the proposed approaches and then evaluates their performance.

