A Case-based Reasoning with Feature Weights Derived by BP Network

Author(s):  
Yan Peng ◽  
Like Zhuang
Kybernetes ◽  
2014 ◽  
Vol 43 (2) ◽  
pp. 265-280 ◽  
Author(s):  
Aleksandar Kartelj ◽  
Nebojša Šurlan ◽  
Zoran Cekić

Purpose – The presented research proposes a method for improving the case retrieval phase of a case-based reasoning (CBR) system by optimizing feature relevance parameters, i.e. feature weights. Design/methodology/approach – The improvement is achieved by applying a metaheuristic optimization technique, the electromagnetism-like algorithm (EM), to adjust the feature weights used in the k-NN classifier. The proposed EM k-NN algorithm is also usable outside the CBR system, e.g. for solving general pattern recognition tasks. Findings – It is shown that the proposed EM k-NN algorithm improves on the baseline k-NN model and outperforms an appropriately tuned artificial neural network (ANN) in the task of predicting case (data record) output values. The results are verified by statistical analysis. Research limitations/implications – The proposed method currently handles only numerical features, so a variant of the EM k-NN algorithm that deals with symbolic or more complex feature types is a direction for future work. Practical implications – The EM k-NN algorithm can be incorporated as a case retrieval component inside a general CBR system. This is the future direction of the investigation, since the authors intend to build a complete specialized CBR system for construction project management. The overall CBR system with incorporated EM k-NN will have significant implications for construction management, as it will be able to produce more accurate predictions of the viability and life cycle of new construction projects. Originality/value – The electromagnetism-like algorithm is applied to the problem of finding feature weights for the first time. EM's potential for solving the feature weighting problem lies in its internal structure, because it is based on real-valued EM vectors.
The overall EM k-NN algorithm is applied to data sets generated from a corpus of real construction project data. The proposed algorithm proved its efficiency, as it outperformed the baseline k-NN model and the ANN. Its applicability in more complex and specialized CBR systems is high, since its modular (black-box) design makes it easy to add.
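
To make the role of feature weights in k-NN retrieval concrete, the following is a minimal sketch, not the authors' implementation. It assumes numeric features, a weighted Euclidean distance, and majority voting. The function names (`weighted_distance`, `knn_predict`, `leave_one_out_accuracy`) and the toy data are hypothetical. The EM optimizer itself is not shown; the leave-one-out accuracy is only one plausible objective that a metaheuristic could maximize over the weight vector.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two numeric feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def knn_predict(query, cases, labels, weights, k=3):
    """Return the majority label among the k stored cases nearest to the query."""
    ranked = sorted(
        range(len(cases)),
        key=lambda i: weighted_distance(query, cases[i], weights),
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(cases, labels, weights, k=1):
    """Fraction of cases classified correctly when each is held out in turn.

    Assumption: this is a candidate objective for a weight optimizer,
    not necessarily the one used in the paper.
    """
    hits = 0
    for i in range(len(cases)):
        rest_cases = cases[:i] + cases[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if knn_predict(cases[i], rest_cases, rest_labels, weights, k) == labels[i]:
            hits += 1
    return hits / len(cases)


# Toy data (hypothetical): the first feature is informative, the second is noise.
cases = [[1.0, 100.0], [1.2, -50.0], [5.0, 90.0], [5.2, -40.0]]
labels = ["low", "low", "high", "high"]

uniform = [1.0, 1.0]   # baseline k-NN: every feature counts equally
tuned = [1.0, 0.0]     # a weight vector that suppresses the noisy feature

print(leave_one_out_accuracy(cases, labels, uniform))
print(leave_one_out_accuracy(cases, labels, tuned))
```

In this toy setting the uniform weights let the noisy second feature dominate the distance, while the tuned weights recover the correct neighbours. A weight-optimizing method would search for such a vector automatically.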


Author(s):  
Yufika Sari Bagi ◽  
Suprapto Suprapto

Retrieval is the stage of a case-based reasoning system that finds a solution to a new problem or case by measuring the similarity between the new case and the old cases in the case base. Some similarity measurement techniques involve feature weights, which indicate the importance of each feature in a case. Feature weights can be obtained from a domain expert or computed by a feature weighting method, either locally or globally. Gradient descent is a feature weighting method that computes a global weight for each feature. This research implemented gradient descent to obtain feature weights in case-based reasoning for hepatitis diagnosis, with similarity measured by weighted Euclidean distance. Four variations of the split between case base and test data were used: the first uses 50% of the data as the case base and 50% as test data, the second 60% and 40%, the third 70% and 30%, and the fourth 80% and 20%. For each variation, four scenarios were used to select the test data: in the first scenario the test data are taken from the end of the data, in the second from the beginning, in the third half from the beginning and half from the end, and in the fourth from the middle. The results showed that the accuracy of the system reached 100% in scenario 1 of variation 4. Across all four variations and four scenarios, the average accuracy of the system was 77.55%, the average recall was 69.74%, and the average precision was 78.39%. Accuracy was also influenced by the size of the case base and by the scenario used to select cases for the case base, because the more cases the case base contains, the greater the chance that the system finds similar cases.
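
The four split variations and four selection scenarios described above can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code. The function name `split_cases`, the rounding of the test-set size, and the handling of an odd test-set size in scenario 3 are assumptions.

```python
def split_cases(data, test_fraction, scenario):
    """Split an ordered data set into (case_base, test_data).

    scenario 1: test data taken from the end
    scenario 2: test data taken from the beginning
    scenario 3: half from the beginning, half from the end
    scenario 4: test data taken from the middle
    """
    n = len(data)
    n_test = round(n * test_fraction)  # assumption: nearest-integer test size

    if scenario == 1:
        test_idx = range(n - n_test, n)
    elif scenario == 2:
        test_idx = range(0, n_test)
    elif scenario == 3:
        head = n_test // 2        # assumption: an odd size puts the extra item at the end
        tail = n_test - head
        test_idx = list(range(0, head)) + list(range(n - tail, n))
    elif scenario == 4:
        start = (n - n_test) // 2
        test_idx = range(start, start + n_test)
    else:
        raise ValueError("scenario must be 1, 2, 3 or 4")

    chosen = set(test_idx)
    case_base = [row for i, row in enumerate(data) if i not in chosen]
    test_data = [row for i, row in enumerate(data) if i in chosen]
    return case_base, test_data


# Variation 4 (80% case base, 20% test data) on ten placeholder records.
records = list(range(10))
for scenario in (1, 2, 3, 4):
    case_base, test_data = split_cases(records, 0.2, scenario)
    print(scenario, test_data)
```

Because the selection depends on record order, the scenarios can give different accuracy on the same data. This is consistent with the reported influence of the case selection scenario.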


Vestnik MEI ◽  
2020 ◽  
Vol 5 (5) ◽  
pp. 132-139
Author(s):  
Ivan E. Kurilenko ◽  
Igor E. Nikonov

A method for classifying short text messages, in the form of sentences spoken by customers over an organization's telephone line, is considered. To solve this problem, a classifier was developed that combines two methods: a description of the subject area as a hierarchy of entities, and plausible reasoning based on the case-based reasoning approach, which is actively used in artificial intelligence systems. In solving various problems of artificial intelligence-based data analysis, these methods have shown a high degree of efficiency, scalability, and independence from data structure. As part of using the case-based reasoning approach in the classifier, it is proposed to modify the TF-IDF (Term Frequency - Inverse Document Frequency) measure for assessing text content so that it takes into account known information about the distribution of documents by topic. The proposed modification improves classification quality in comparison with classical measures, since it takes into account the distribution of words not only in a separate document or topic, but across the entire case base. Experimental results are presented that confirm the effectiveness of the proposed metric and the developed classifier as applied to classifying customer sentences and providing customers with the necessary information depending on the classification result. The developed text classification service prototype is used as part of the voice interaction module, with the aim of automating the telephone call routing system and shifting the interaction between user and system from buttons to voice.
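
For reference, the classical TF-IDF measure that the proposed metric modifies can be sketched as below. The abstract does not specify the topic-aware modification, so it is deliberately not reproduced here. The function name `tf_idf`, the relative-frequency form of TF, the natural-log IDF, and the sample utterances are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Classical TF-IDF weights for a list of tokenized documents.

    Returns one {term: weight} dict per document, where
    weight = (term count / document length) * ln(N / document frequency).
    """
    n_docs = len(documents)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical short customer utterances, already tokenized.
utterances = [
    ["please", "check", "balance"],
    ["please", "block", "card"],
    ["please", "check", "card", "limit"],
]
for vector in tf_idf(utterances):
    print(vector)
```

A term that occurs in every utterance (here "please") receives weight zero, while rarer terms are weighted more heavily. The classical measure uses no topic labels, which is the information the proposed modification is said to add.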

