HANDLING AMBIGUOUS VALUES IN INSTANCE-BASED CLASSIFIERS

2008, Vol 17 (03), pp. 449-463
Author(s): HANS HOLLAND, MIROSLAV KUBAT, JAN ŽIŽKA

In an attempt to automate the evaluation of network intrusion detection systems, we encountered the problem of ambiguously described learning examples. For instance, an attribute's value, or a class label, in a given example was known to be a or b but definitely not c or d. Previous research in machine learning usually either “disambiguated” the value (by giving preference to a or b) or replaced it with a “don't-know” symbol. Neither approach is satisfactory: the former distorts the available information by pretending to precise knowledge, whereas the latter ignores the fact that at least something is known. Our experiments confirm the intuition that classification performance is indeed impaired if the ambiguities are not handled properly. In the research reported here, we limited ourselves to the relatively simple nearest-neighbor classifiers and investigated a few alternative solutions. The paper describes the techniques we used and reports their behavior in experimental domains.
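
To make the problem concrete, the following is a minimal illustrative sketch, not the technique from the paper. It assumes that an ambiguous attribute value is stored as the set of symbols still considered possible (so "a or b, but not c or d" becomes {a, b}), and it uses the Jaccard distance between such sets inside a 1-nearest-neighbor rule. The names value_distance, example_distance, and nearest_neighbor, as well as the choice of Jaccard distance, are assumptions made for illustration only.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Assumption: an ambiguous value is the set of symbols that are still possible.
Value = FrozenSet[str]
Example = Tuple[Sequence[Value], str]  # (attribute values, class label)


def value_distance(a: Value, b: Value) -> float:
    """Jaccard distance between two candidate sets (an assumed measure)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def example_distance(x: Sequence[Value], y: Sequence[Value]) -> float:
    """Sum of per-attribute distances between two examples."""
    return sum(value_distance(a, b) for a, b in zip(x, y))


def nearest_neighbor(query: Sequence[Value], training: List[Example]) -> str:
    """Return the class label of the training example closest to the query."""
    closest = min(training, key=lambda ex: example_distance(query, ex[0]))
    return closest[1]


# Hypothetical toy data: the first attribute of the first example is
# known to be "a" or "b", but definitely not "c" or "d".
training = [
    ([frozenset({"a", "b"}), frozenset({"x"})], "attack"),
    ([frozenset({"c"}), frozenset({"y"})], "normal"),
]
query = [frozenset({"a"}), frozenset({"x"})]
predicted = nearest_neighbor(query, training)
```

In this sketch the partial knowledge {a, b} is used directly: the value is neither forced to a single symbol nor replaced by a "don't-know" placeholder that would match everything equally.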
