Cost-Sensitive Classification Using Decision Trees, Boosting and MetaCost

Author(s):  
Kai Ming Ting

This chapter reports results obtained from a series of studies on cost-sensitive classification using decision trees, boosting algorithms, and MetaCost, which is a recently proposed procedure that converts an error-based algorithm into a cost-sensitive algorithm. The studies give rise to new variants of algorithms designed for cost-sensitive classification, and provide insights into the strengths and weaknesses of the algorithms. First, we describe a simple and effective heuristic for converting an error-based decision tree algorithm into a cost-sensitive one via instance weighting. The cost-sensitive version performs better than the error-based version that employs a minimum expected cost criterion during classification. Second, we report results from a study on four variants of cost-sensitive boosting algorithms. We find that boosting can be simplified for cost-sensitive classification. A new variant, which excludes a factor used in ordinary boosting, has the advantage of producing smaller trees and different trees for different scenarios, while performing comparably to ordinary boosting in terms of cost. We find that the minimum expected cost criterion is the major contributor to the improvement of all cost-sensitive adaptations of ordinary boosting. Third, we reveal a limitation of MetaCost. We find that MetaCost retains only part of the performance of the internal classifier on which it relies. This occurs whether boosting or bagging is used as its internal classifier.
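The minimum expected cost criterion mentioned in the abstract above can be illustrated with a small sketch. This is not the chapter's code: the function name, the cost matrix, and the class probabilities are hypothetical and chosen only to show how the criterion can override the most probable class.

```python
def min_expected_cost_class(probs, cost):
    """Return (best_class, expected_costs) under a minimum expected cost rule.

    probs[i]   : estimated probability that the true class is i
    cost[i][j] : cost of predicting class j when the true class is i
    """
    n_classes = len(probs)
    # Expected cost of predicting j, averaged over the possible true classes.
    expected = [
        sum(probs[i] * cost[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    best = min(range(n_classes), key=lambda j: expected[j])
    return best, expected


# Hypothetical two-class problem: missing class 1 costs five times as much
# as a false alarm on class 0.
cost = [[0.0, 1.0],
        [5.0, 0.0]]
label, expected = min_expected_cost_class([0.7, 0.3], cost)
print(label, expected)
```

In this example class 0 is the more probable class, yet class 1 is predicted because its expected cost (0.7) is lower than that of class 0 (1.5).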

2020 ◽  
Vol 50 (10) ◽  
pp. 3090-3100 ◽  
Author(s):  
Lei Lei ◽  
Yafei Song ◽  
Xi Luo

When base classifiers are trained with ternary Error Correcting Output Codes (ECOC), it is well known that some classes are ignored. As a result, a classifier is non-competent when it classifies an instance whose real label does not belong to its meta-subclasses. Meanwhile, the classic ECOC dichotomizers can only produce binary outputs and have no capability of rejection for classification. To overcome the non-competence problem and better model the multi-class problem for reducing the classification cost, we embed a reject option in ECOC and present a new variant of the ECOC algorithm called Reject-Option-based Re-encoding ECOC (ROECOC). The cost-sensitive classification model and the cost-loss function based on the Receiver Operating Characteristic (ROC) curve are built respectively. The optimal reject threshold values are obtained by combining the condition to be met for minimizing the loss function with the ROC convex hull. In so doing, the reject option (t1, t2) provides a three-symbol output that makes dichotomizers more competent and ROECOC more universal and practical for cost-sensitive classification. Experimental results on two kinds of datasets show that our scheme with low-degree freedom of initialized ECOC can effectively enhance accuracy and reduce cost.
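As a rough illustration of how a reject option (t1, t2) can turn a binary dichotomizer into a three-symbol one, consider the sketch below. It assumes the dichotomizer emits a real-valued score and that the symbols are -1, 0 (reject), and +1. Both the function name and this mapping are assumptions for illustration, not the ROECOC procedure itself, and the thresholds are given by hand rather than derived from the ROC convex hull.

```python
def ternary_output(score, t1, t2):
    """Map a dichotomizer score to -1, 0 (reject), or +1.

    Scores at or below t1 are treated as confidently negative, scores at or
    above t2 as confidently positive, and anything in between is rejected.
    """
    if t1 > t2:
        raise ValueError("expected t1 <= t2")
    if score <= t1:
        return -1
    if score >= t2:
        return 1
    return 0


# Hypothetical thresholds and scores.
outputs = [ternary_output(s, t1=0.4, t2=0.6) for s in (0.2, 0.5, 0.9)]
print(outputs)
```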


Author(s):  
Phalguni Nanda ◽  
Prajamitra Bhuyan ◽  
Anup Dewanji

In many real-life scenarios, system failure depends on dynamic stress-strength interference, where strength degrades and stress accumulates concurrently over time. In this paper, we consider the problem of finding an optimal replacement strategy that balances the cost of replacement with the cost of failure and results in the minimum expected cost per unit time under a cumulative damage model with strength degradation. In the most general setting, we propose to find optimal choices of three thresholds on operation time, number of arriving shocks, and amount of cumulative damage such that replacement of the system due to failure or reaching any of the three thresholds, whichever occurs first, results in the minimum expected cost per unit time. The existing recommendations are applicable only under the assumption of an Exponential damage distribution, including Poisson arrival of shocks, and/or with fixed strength. As theoretical evaluation of the expected cost per unit time turns out to be very complicated, a simulation-based algorithm is proposed to evaluate the expected cost rate and find the optimal replacement strategy. The proposed method is easy to implement and has a wider domain of application, including non-Poisson arrival of shocks and non-Exponential damage distributions. For illustration, the proposed method is applied to real case studies on mailbox and cell-phone battery experiments.
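A minimal sketch of such a simulation-based evaluation follows. It is not the authors' algorithm. For brevity it assumes Poisson shock arrivals, exponentially distributed damage, linearly degrading strength, and hypothetical cost values, and it estimates the cost rate as total cost divided by total time over many simulated replacement cycles. Other arrival or damage distributions could be substituted by changing the two sampling lines.

```python
import random


def simulate_cost_rate(T, N, D, n_cycles=2000, seed=0,
                       s0=10.0, decay=0.5, shock_rate=1.0, mean_damage=1.0,
                       c_replace=1.0, c_fail=5.0):
    """Estimate expected cost per unit time for thresholds (T, N, D).

    T: time threshold, N: shock-count threshold, D: cumulative-damage threshold.
    All model parameters (s0, decay, rates, costs) are illustrative assumptions.
    """
    rng = random.Random(seed)
    total_cost = total_time = 0.0
    for _ in range(n_cycles):
        t, shocks, damage = 0.0, 0, 0.0
        while True:
            t_shock = t + rng.expovariate(shock_rate)   # next shock arrival
            t_cross = (s0 - damage) / decay             # strength falls to damage
            if T <= min(t_shock, t_cross):              # time threshold reached
                end, failed = T, False
                break
            if t_cross <= t_shock:                      # failure between shocks
                end, failed = t_cross, True
                break
            t, shocks = t_shock, shocks + 1
            damage += rng.expovariate(1.0 / mean_damage)
            if damage >= s0 - decay * t:                # failure at the shock
                end, failed = t, True
                break
            if shocks >= N or damage >= D:              # preventive replacement
                end, failed = t, False
                break
        total_cost += c_replace + (c_fail if failed else 0.0)
        total_time += end
    return total_cost / total_time


# Crude grid search over hypothetical candidate thresholds; the shared seed
# gives every candidate the same random number stream.
grid = [(T, N, D) for T in (2.0, 5.0, 10.0) for N in (2, 5) for D in (4.0, 8.0)]
best = min(grid, key=lambda th: simulate_cost_rate(*th))
print(best, simulate_cost_rate(*best))
```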


1980 ◽  
Vol 3 (6) ◽  
pp. 517-522 ◽  
Author(s):  
Aharon P. Vinkler ◽  
Lincoln J. Wood ◽  
Uy-Loi Ly ◽  
Robert H. Cannon Jr.

1994 ◽  
Vol 24 (6) ◽  
pp. 1253-1259 ◽  
Author(s):  
Romain Mees ◽  
David Strauss ◽  
Richard Chase

We describe a model that estimates the optimal total expected cost of a wildland fire, given uncertainty in both flame length and fire-line width produced. In the model, a sequence of possible fire-line perimeters is specified, each with a forecasted control time. For a given control time and fire line, the probability of containment of the fire is determined as a function of the fire-fighting resources available. Our procedure assigns the resources to the fire line so as to minimize the total expected cost. A key feature of the model is that the probabilities reflect the degree of uncertainty in (i) the width of fire line that can be built with a given resource allocation, and (ii) the flame length of the fire. The total expected cost associated with a given choice of fire line is the sum of: the loss or gain of value of the area already burned; the cost of the resources used in the attack; and the expected loss or gain of value beyond the fire line. The latter is the product of the probability that the chosen attack strategy fails to contain the fire and the value of the additional burned area that would result from such a failure. The model allows comparison of the costs of the different choices of fire line, and thus identification of the optimal strategy. A small case study is used to illustrate the procedure.
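The cost decomposition described above can be written out as a short sketch. The candidate fire lines and every number below are hypothetical. In the model itself, the containment probabilities depend on the resources assigned, which this sketch simply takes as given.

```python
def total_expected_cost(burned_loss, resource_cost, p_fail, extra_loss):
    """Loss already burned + attack cost + expected loss beyond the fire line."""
    return burned_loss + resource_cost + p_fail * extra_loss


# Hypothetical candidates: a tight line that is cheap in burned area but
# risky, and a wider line that burns more area but is likely to hold.
candidates = {
    "inner line": dict(burned_loss=100.0, resource_cost=80.0,
                       p_fail=0.40, extra_loss=500.0),
    "outer line": dict(burned_loss=250.0, resource_cost=50.0,
                       p_fail=0.05, extra_loss=300.0),
}
costs = {name: total_expected_cost(**c) for name, c in candidates.items()}
best = min(costs, key=costs.get)
print(costs, best)
```

With these numbers the inner line costs about 380 and the outer line about 315, so the outer line would be preferred despite the larger burned area.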


2006 ◽  
Vol 3 (2) ◽  
pp. 57-72 ◽  
Author(s):  
Kristina Machova ◽  
Miroslav Puszta ◽  
Frantisek Barcak ◽  
Peter Bednar

In this paper we present an improvement of the precision of classification algorithm results. Two approaches are known: bagging and boosting. This paper describes a set of experiments with bagging and boosting methods. Our use of these methods aims at classification algorithms generating decision trees. Results of performance tests focused on the use of the bagging and boosting methods in connection with binary decision trees are presented. The minimum number of decision trees that enables an improvement of the classification performed by the bagging and boosting methods was found. The tests were carried out using the Reuters-21578 collection of documents as well as documents from an Internet portal of the TV broadcasting company Markíza. The comparison of our results on testing the bagging and boosting algorithms is presented.
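For readers unfamiliar with bagging, the idea is sketched below: train each base classifier on a bootstrap resample of the training set and combine predictions by majority vote. To stay short, the sketch substitutes a one-feature decision stump for a full binary decision tree and uses toy data. Neither reflects the experimental setup of the paper.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-feature threshold classifier with the fewest training errors."""
    best = None
    for thr in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= thr else right) != y
                         for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, thr, left, right)
    _, thr, left, right = best
    return lambda x: left if x <= thr else right


def bagging(xs, ys, n_models=25, seed=0):
    """Train stumps on bootstrap resamples; predict by majority vote."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Toy one-dimensional data: label 0 for small values, 1 for large values.
xs = list(range(10))
ys = [0] * 5 + [1] * 5
predict = bagging(xs, ys)
print(predict(0), predict(9))
```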


Author(s):  
Alberto Freitas ◽  
Pavel Brazdil ◽  
Altamiro Costa-Pereira

This chapter introduces cost-sensitive learning and its importance in medicine. Health managers and clinicians often need models that try to minimize several types of costs associated with healthcare, including attribute costs (e.g. the cost of a specific diagnostic test) and misclassification costs (e.g. the cost of a false negative test). In fact, as in other professional areas, both diagnostic tests and their associated misclassification errors can have significant financial or human costs, including the use of unnecessary resources and patient safety issues. This chapter presents some concepts related to cost-sensitive learning and cost-sensitive classification and their application to medicine. Different types of costs are also presented, with an emphasis on diagnostic tests and misclassification costs. In addition, an overview of research in the area of cost-sensitive learning is given, including current methodological approaches. Finally, current methods for the cost-sensitive evaluation of classifiers are discussed.
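One simple way to combine the two kinds of cost named above when evaluating a classifier is to average, per patient, the cost of the diagnostic test plus any misclassification cost. The sketch below does exactly that. The function name, the labels, and the cost values are hypothetical and are not taken from the chapter.

```python
def average_cost(y_true, y_pred, fn_cost, fp_cost, test_cost):
    """Mean cost per patient: test (attribute) cost plus misclassification cost.

    Labels are 1 for disease present and 0 for disease absent.
    """
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        total += test_cost                  # every patient takes the test
        if truth == 1 and pred == 0:
            total += fn_cost                # missed disease (false negative)
        elif truth == 0 and pred == 1:
            total += fp_cost                # false alarm (false positive)
    return total / len(y_true)


# Hypothetical data: one false negative and one false positive in five cases.
y_true = [1, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0]
mean_cost = average_cost(y_true, y_pred, fn_cost=50.0, fp_cost=5.0, test_cost=2.0)
print(mean_cost)
```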


2018 ◽  
Vol 251 ◽  
pp. 04006
Author(s):  
Stanislav Petrov ◽  
Irina Kuznetsova ◽  
Yuri Doladov ◽  
Nikita Krasnov

Three-layer sandwich panels are widely used for insulating the walls and roofs of buildings and various other structures. At present, building-product markets are full of various types of panels produced by different manufacturers but skinned with one and the same material only. Panels skinned with two different types of materials are widely used in the transport sector. They may also bring considerable economic benefit in building engineering. The article presents an analysis of the current state of the problem of calculation of thin-walled profiles in load-bearing structures. The authors developed a program for the automated calculation of three-layer panels. The program is certified in Russia. The program allows the panel parameters to be optimized according to the cost criterion. The article presents the basic calculation ideas incorporated in the algorithm of the program. The figures show the program interface. To date, the program has only a Russian-language interface. The paper introduces automated methods for the trial design of single- and multi-span sandwich panels. Different types of materials can be used for skinning these panels. Their middle-layer shift and the compliance of supporting structures are taken into account.


1996 ◽  
Vol 33 (2) ◽  
pp. 557-572 ◽  
Author(s):  
Shey-Huei Sheu

This paper considers a modified block replacement policy with two variables and general random minimal repair cost. Under such a policy, an operating system is preventively replaced by a new one at times kT (k = 1, 2, ···) independently of its failure history. If the system fails in [(k − 1)T, (k − 1)T + T0), it is either replaced by a new one or minimally repaired, and if it fails in [(k − 1)T + T0, kT), it is either minimally repaired or remains inactive until the next planned replacement. The choice between these two possible actions is based on some random mechanism which is age-dependent. The cost of the ith minimal repair of the system at age y depends on the random part C(y) and the deterministic part c_i(y). The expected cost rate is obtained using the results of renewal reward theory. The model with two variables is transformed into a model with one variable and the optimum policy is discussed.


1988 ◽  
Vol 2 (2) ◽  
pp. 263-265 ◽  
Author(s):  
Uri Yechiali

N tasks must be successfully performed for a job to be completed. The tasks may be attempted in any order, where each attempt of task i requires an expected cost c_i and is successful with probability p_i. Whenever an attempt fails, the job is fed back to the initial stage and the entire sequence starts again. We show that the cost of completing a job is minimized if the tasks are sequenced via increasing values of c_i/(1 − p_i). We further show that the same result holds when the feedback can be either to stage i itself or to the starting task.
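The sequencing rule can be checked numerically on a small instance. The sketch below covers only the first setting, in which any failure restarts the whole sequence. It assumes 0 < p_i < 1 and uses hypothetical costs and probabilities. One pass through the sequence has expected cost C = c_1 + p_1 c_2 + p_1 p_2 c_3 + ... and succeeds with probability P = p_1 p_2 ... p_N, so the expected total cost is C / P. The code compares the ratio ordering against a brute-force search over all orders.

```python
from itertools import permutations


def expected_job_cost(order, c, p):
    """Expected total cost when any failed attempt restarts the sequence."""
    pass_cost = 0.0   # expected cost of a single pass through the sequence
    reach = 1.0       # probability of reaching the current task in a pass
    for i in order:
        pass_cost += reach * c[i]
        reach *= p[i]
    # `reach` is now the probability that a pass completes the job, so the
    # expected number of passes is 1 / reach.
    return pass_cost / reach


def ratio_order(c, p):
    """Order tasks by increasing c_i / (1 - p_i); requires every p_i < 1."""
    return sorted(range(len(c)), key=lambda i: c[i] / (1.0 - p[i]))


# Hypothetical three-task instance.
c = [4.0, 1.0, 3.0]
p = [0.9, 0.5, 0.8]
rule = ratio_order(c, p)
brute = min(permutations(range(len(c))),
            key=lambda order: expected_job_cost(order, c, p))
print(rule, expected_job_cost(rule, c, p), expected_job_cost(brute, c, p))
```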

