Cost-Sensitive Classification Using Decision Trees, Boosting and MetaCost

Heuristic and Optimization for Knowledge Discovery ◽

10.4018/978-1-930708-26-6.ch003 ◽

2011 ◽

pp. 27-53 ◽

Cited By ~ 1

Author(s):

Kai Ming Ting

Keyword(s):

Decision Trees ◽

Major Contributor ◽

Expected Cost ◽

Cost Criterion ◽

Instance Weighting ◽

New Variant ◽

Cost Sensitive Classification ◽

Boosting Algorithms ◽

The Cost ◽

Minimum Expected Cost

This chapter reports results obtained from a series of studies on cost-sensitive classification using decision trees, boosting algorithms, and MetaCost, a recently proposed procedure that converts an error-based algorithm into a cost-sensitive algorithm. The studies give rise to new variants of algorithms designed for cost-sensitive classification, and provide insights into the strengths and weaknesses of the algorithms. First, we describe a simple and effective heuristic for converting an error-based decision tree algorithm into a cost-sensitive one via instance weighting. The cost-sensitive version performs better than the error-based version that employs a minimum expected cost criterion during classification. Second, we report results from a study on four variants of cost-sensitive boosting algorithms. We find that boosting can be simplified for cost-sensitive classification. A new variant that excludes a factor used in ordinary boosting has the advantage of producing smaller trees, and different trees for different scenarios, while performing comparably to ordinary boosting in terms of cost. We find that the minimum expected cost criterion is the major contributor to the improvement of all cost-sensitive adaptations of ordinary boosting. Third, we reveal a limitation of MetaCost. We find that MetaCost retains only part of the performance of the internal classifier on which it relies. This occurs with both boosting and bagging as its internal classifier.
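The two ideas named in this abstract, the minimum expected cost criterion and instance weighting, can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption: the function names, the cost-matrix layout (`cost[i][j]` is the cost of predicting class `i` when the true class is `j`), and the normalization that makes the weights sum to the number of training instances.

```python
def min_expected_cost_class(probs, cost):
    """Return the index of the prediction with the lowest expected cost.

    probs[j]   : estimated probability that the true class is j
    cost[i][j] : cost of predicting class i when the true class is j (assumed layout)
    """
    expected = [sum(p * c for p, c in zip(probs, row)) for row in cost]
    return min(range(len(expected)), key=expected.__getitem__)


def class_weights(class_counts, class_costs):
    """Per-class instance weights proportional to each class's misclassification cost.

    Assumed normalization: the weights summed over all instances equal the
    number of instances, so the effective training-set size is unchanged.
    """
    n = sum(class_counts)
    denom = sum(c * k for c, k in zip(class_costs, class_counts))
    return [c * n / denom for c in class_costs]


# Hypothetical two-class example: missing class 1 is ten times as costly.
cost_matrix = [[0, 10], [1, 0]]
choice = min_expected_cost_class([0.8, 0.2], cost_matrix)
weights = class_weights([90, 10], [1, 5])
```

With these hypothetical numbers the criterion predicts class 1 even though class 0 is more probable, because the expected cost of predicting class 0 (0.2 × 10 = 2) exceeds that of predicting class 1 (0.8 × 1 = 0.8).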


A new re-encoding ECOC using reject option

Applied Intelligence ◽

10.1007/s10489-020-01642-2 ◽

2020 ◽

Vol 50 (10) ◽

pp. 3090-3100 ◽

Cited By ~ 3

Author(s):

Lei Lei ◽

Yafei Song ◽

Xi Luo

Keyword(s):

Loss Function ◽

Operating Characteristic ◽

Classification Model ◽

Threshold Values ◽

Reject Option ◽

Roc Convex Hull ◽

Low Degree ◽

New Variant ◽

Cost Sensitive Classification ◽

The Cost

When base classifiers are trained with ternary Error Correcting Output Codes (ECOC), it is well known that some classes are ignored. As a result, a classifier becomes non-competent when it classifies an instance whose real label does not belong to its meta-subclasses. Meanwhile, the classic ECOC dichotomizers can only produce binary outputs and have no capability of rejection. To overcome the non-competence problem and better model the multi-class problem so as to reduce the classification cost, we embed a reject option in ECOC and present a new variant of the ECOC algorithm called Reject-Option-based Re-encoding ECOC (ROECOC). A cost-sensitive classification model and a cost-loss function based on the Receiver Operating Characteristic (ROC) curve are built. The optimal reject threshold values are obtained by combining the condition to be met for minimizing the loss function with the ROC convex hull. In so doing, the reject option (t1, t2) provides a three-symbol output that makes dichotomizers more competent and ROECOC more universal and practical for cost-sensitive classification. Experimental results on two kinds of datasets show that our scheme, with low-degree freedom of the initialized ECOC, can effectively enhance accuracy and reduce cost.


Optimal replacement policy under cumulative damage model and strength degradation with applications

Annals of Operations Research ◽

10.1007/s10479-021-04080-6 ◽

2021 ◽

Author(s):

Phalguni Nanda ◽

Prajamitra Bhuyan ◽

Anup Dewanji

Keyword(s):

Damage Model ◽

Strength Degradation ◽

Cumulative Damage ◽

Expected Cost ◽

Replacement Strategy ◽

Optimal Replacement ◽

Poisson Arrival ◽

Cumulative Damage Model ◽

The Cost ◽

Minimum Expected Cost

In many real-life scenarios, system failure depends on dynamic stress-strength interference, where strength degrades and stress accumulates concurrently over time. In this paper, we consider the problem of finding an optimal replacement strategy that balances the cost of replacement with the cost of failure and results in the minimum expected cost per unit time under a cumulative damage model with strength degradation. In the most general setting, we propose to find optimal choices of three thresholds on operation time, number of arriving shocks and amount of cumulative damage such that replacement of the system due to failure or reaching any of the three thresholds, whichever occurs first, results in the minimum expected cost per unit time. The existing recommendations are applicable only under the assumption of an Exponential damage distribution, including Poisson arrival of shocks and/or with fixed strength. As theoretical evaluation of the expected cost per unit time turns out to be very complicated, a simulation-based algorithm is proposed to evaluate the expected cost rate and find the optimal replacement strategy. The proposed method is easy to implement and has a wider domain of application, including non-Poisson arrival of shocks and non-Exponential damage distributions. For illustration, the proposed method is applied to real case studies on mailbox and cell-phone battery experiments.


Minimum Expected Cost Control of a Remotely Piloted Vehicle

Journal of Guidance and Control ◽

10.2514/3.56031 ◽

1980 ◽

Vol 3 (6) ◽

pp. 517-522 ◽

Cited By ~ 9

Author(s):

Aharon P. Vinkler ◽

Lincoln J. Wood ◽

Uy-Loi Ly ◽

Robert H. Cannon Jr.

Keyword(s):

Cost Control ◽

Expected Cost ◽

Minimum Expected Cost


Minimizing the cost of wildland fire suppression: a model with uncertainty in predicted flame length and fire-line width produced

Canadian Journal of Forest Research ◽

10.1139/x94-164 ◽

1994 ◽

Vol 24 (6) ◽

pp. 1253-1259 ◽

Cited By ~ 18

Author(s):

Romain Mees ◽

David Strauss ◽

Richard Chase

Keyword(s):

Line Width ◽

Wildland Fire ◽

Fire Suppression ◽

Flame Length ◽

Burned Area ◽

Expected Cost ◽

Control Time ◽

The Cost ◽

Attack Strategy

We describe a model that estimates the optimal total expected cost of a wildland fire, given uncertainty in both flame length and fire-line width produced. In the model, a sequence of possible fire-line perimeters is specified, each with a forecasted control time. For a given control time and fire line, the probability of containment of the fire is determined as a function of the fire-fighting resources available. Our procedure assigns the resources to the fire line so as to minimize the total expected cost. A key feature of the model is that the probabilities reflect the degree of uncertainty in (i) the width of fire line that can be built with a given resource allocation, and (ii) the flame length of the fire. The total expected cost associated with a given choice of fire line is the sum of: the loss or gain of value of the area already burned; the cost of the resources used in the attack; and the expected loss or gain of value beyond the fire line. The latter is the product of the probability that the chosen attack strategy fails to contain the fire and the value of the additional burned area that would result from such a failure. The model allows comparison of the costs of the different choices of fire line, and thus identification of the optimal strategy. A small case study is used to illustrate the procedure.
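The cost decomposition described in this abstract can be written out as a small sketch. It covers only the final comparison step: the containment-failure probability is taken as a given input rather than derived from resource allocation, flame length, and fire-line width as in the model, and all names and numbers below are hypothetical.

```python
def total_expected_cost(burned_loss, resource_cost, p_fail, extra_loss):
    """Sum of the three terms named in the abstract.

    burned_loss   : loss (or gain, if negative) of value of the area already burned
    resource_cost : cost of the resources used in the attack
    p_fail        : probability that the attack fails to contain the fire (assumed given)
    extra_loss    : value of the additional area burned if containment fails
    """
    return burned_loss + resource_cost + p_fail * extra_loss


def best_fire_line(candidates):
    """Return (name, cost) of the candidate fire line with the lowest total expected cost."""
    costs = {name: total_expected_cost(**terms) for name, terms in candidates.items()}
    name = min(costs, key=costs.get)
    return name, costs[name]


# Hypothetical candidates: a tight line that is cheap but risky,
# and a wider line that gives up more area but is likelier to hold.
candidates = {
    "inner": dict(burned_loss=100.0, resource_cost=50.0, p_fail=0.5, extra_loss=200.0),
    "outer": dict(burned_loss=150.0, resource_cost=80.0, p_fail=0.25, extra_loss=200.0),
}
best = best_fire_line(candidates)
```

Here the inner line costs 100 + 50 + 0.5 × 200 = 250 and the outer line 150 + 80 + 0.25 × 200 = 280, so the sketch selects the inner line.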


A comparison of the bagging and the boosting methods using the decision trees classifiers

Computer Science and Information Systems ◽

10.2298/csis0602057m ◽

2006 ◽

Vol 3 (2) ◽

pp. 57-72 ◽

Cited By ~ 9

Author(s):

Kristina Machova ◽

Miroslav Puszta ◽

Frantisek Barcak ◽

Peter Bednar

Keyword(s):

Decision Trees ◽

Classification Algorithm ◽

Classification Algorithms ◽

Performance Tests ◽

Binary Decision ◽

Internet Portal ◽

Minimum Number ◽

Boosting Algorithms ◽

Binary Decision Trees ◽

Tv Broadcasting

In this paper we present an improvement of the precision of classification algorithm results. Two approaches are known: bagging and boosting. This paper describes a set of experiments with bagging and boosting methods. Our use of these methods is aimed at classification algorithms that generate decision trees. Results of performance tests focused on the use of the bagging and boosting methods in connection with binary decision trees are presented. The minimum number of decision trees that enables an improvement of the classification performed by the bagging and boosting methods was found. The tests were carried out using the Reuters-21578 collection of documents as well as documents from an Internet portal of the TV broadcasting company Markíza. A comparison of our results from testing the bagging and boosting algorithms is presented.


Cost-Sensitive Learning in Medicine

Data Mining and Medical Knowledge Management ◽

10.4018/978-1-60566-218-3.ch003 ◽

2011 ◽

pp. 57-75

Author(s):

Alberto Freitas ◽

Pavel Brazdil ◽

Altamiro Costa-Pereira

Keyword(s):

Diagnostic Tests ◽

False Negative ◽

Cost Sensitive Learning ◽

Safety Issues ◽

Misclassification Costs ◽

Misclassification Errors ◽

Cost Sensitive Classification ◽

Human Costs ◽

The Cost ◽

Specific Diagnostic Test

This chapter introduces cost-sensitive learning and its importance in medicine. Health managers and clinicians often need models that try to minimize several types of costs associated with healthcare, including attribute costs (e.g. the cost of a specific diagnostic test) and misclassification costs (e.g. the cost of a false negative test). In fact, as in other professional areas, both diagnostic tests and their associated misclassification errors can have significant financial or human costs, including the use of unnecessary resources and patient safety issues. This chapter presents some concepts related to cost-sensitive learning and cost-sensitive classification and their application to medicine. Different types of costs are also presented, with an emphasis on diagnostic tests and misclassification costs. In addition, an overview of research in the area of cost-sensitive learning is given, including current methodological approaches. Finally, current methods for the cost-sensitive evaluation of classifiers are discussed.


Identifying a small set of marker genes using minimum expected cost of misclassification

Artificial Intelligence in Medicine ◽

10.1016/j.artmed.2012.01.004 ◽

2012 ◽

Vol 55 (1) ◽

pp. 51-59 ◽

Cited By ~ 2

Author(s):

Samuel H. Huang ◽

Dengyao Mo ◽

Jarek Meller ◽

Michael Wagner

Keyword(s):

Marker Genes ◽

Expected Cost ◽

Small Set ◽

Cost Of Misclassification ◽

Minimum Expected Cost ◽

Expected Cost Of Misclassification


The calculation, design and optimization of sandwich panels using author’s programs of the automated designing of development

MATEC Web of Conferences ◽

10.1051/matecconf/201825104006 ◽

2018 ◽

Vol 251 ◽

pp. 04006

Author(s):

Stanislav Petrov ◽

Irina Kuznetsova ◽

Yuri Doladov ◽

Nikita Krasnov

Keyword(s):

Sandwich Panels ◽

Middle Layer ◽

Russian Language ◽

Design And Optimization ◽

Cost Criterion ◽

Different Types ◽

The Moment ◽

The Cost ◽

Program Interface ◽

Building Engineering

Three-layer sandwich panels are widely used for insulating the walls and roofs of buildings and various other structures. At present, building products markets are full of various types of panels produced by different manufacturers but skinned with one and the same material only. Panels skinned with two different types of materials are widely used in the transport sector. They may also bring considerable economic benefit in building engineering. The article presents an analysis of the current state of the problem of calculating thin-walled profiles in load-bearing structures. The authors developed a program for the automated calculation of three-layer panels. The program is certified in Russia. The program allows the panel parameters to be optimized according to the cost criterion. The article presents the basic calculation ideas incorporated in the algorithm of the program. The figures show the program interface. To date, the program has only a Russian-language interface. The paper introduces automated methods for the trial design of single- and multi-span sandwich panels. Different types of materials can be used for skinning these panels. Their middle-layer shift and the compliance of supporting structures are taken into account.


A modified block replacement policy with two variables and general random minimal repair cost

Journal of Applied Probability ◽

10.2307/3215079 ◽

1996 ◽

Vol 33 (2) ◽

pp. 557-572 ◽

Cited By ~ 18

Author(s):

Shey-Huei Sheu

Keyword(s):

Operating System ◽

Replacement Policy ◽

Repair Cost ◽

Minimal Repair ◽

Expected Cost ◽

Cost Rate ◽

Age Dependent ◽

Modified Block ◽

The Cost ◽

Random Part

This paper considers a modified block replacement policy with two variables and general random minimal repair cost. Under such a policy, an operating system is preventively replaced by a new one at times kT (k = 1, 2, ···) independently of its failure history. If the system fails in [(k − 1)T, (k − 1)T + T_0) it is either replaced by a new one or minimally repaired, and if it fails in [(k − 1)T + T_0, kT) it is either minimally repaired or remains inactive until the next planned replacement. The choice between these two possible actions is based on some random mechanism which is age-dependent. The cost of the ith minimal repair of the system at age y depends on the random part C(y) and the deterministic part c_i(y). The expected cost rate is obtained using the results of renewal reward theory. The model with two variables is transformed into a model with one variable and the optimum policy is discussed.


Sequencing an N-Stage Process with Feedback

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964800000784 ◽

1988 ◽

Vol 2 (2) ◽

pp. 263-265 ◽

Cited By ~ 5

Author(s):

Uri Yechiali

Keyword(s):

Expected Cost ◽

Initial Stage ◽

Entire Sequence ◽

N Stage ◽

Stage Process ◽

The Cost

N tasks must be successfully performed for a job to be completed. The tasks may be attempted in any order, where each attempt of task i requires an expected cost c_i and is successful with probability p_i. Whenever an attempt fails, the job is fed back to the initial stage and the entire sequence starts again. We show that the cost of completing a job is minimized if the tasks are sequenced in increasing order of c_i/(1 - p_i). We further show that the same result holds when the feedback can be either to stage i itself or to the starting task.
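The sequencing rule can be checked numerically on a small instance. The sketch below is not taken from the paper: it assumes the simplest setting described (every failure restarts the whole sequence, attempts are independent, all success probabilities are positive), uses hypothetical task values, and compares the ratio ordering against a brute-force search over all orderings.

```python
from itertools import permutations


def expected_job_cost(order):
    """Expected cost to finish the job when any failure restarts the sequence.

    order is a sequence of (cost, success_probability) pairs, probabilities > 0.
    """
    pass_cost = 0.0  # expected cost of one pass through the sequence
    reach = 1.0      # probability that the pass gets as far as the current task
    for cost, p in order:
        pass_cost += reach * cost
        reach *= p
    # reach is now the probability that a whole pass succeeds; passes repeat
    # until one succeeds, so E = pass_cost + (1 - reach) * E = pass_cost / reach.
    return pass_cost / reach


def ratio_order(tasks):
    """Sort tasks by increasing c_i / (1 - p_i); tasks that never fail go last."""
    return sorted(tasks, key=lambda t: t[0] / (1 - t[1]) if t[1] < 1 else float("inf"))


# Hypothetical tasks as (expected cost c_i, success probability p_i).
tasks = [(4.0, 0.5), (1.0, 0.9), (2.0, 0.5)]
by_ratio = ratio_order(tasks)
by_search = min(permutations(tasks), key=expected_job_cost)
```

For these values the ratios are roughly 8, 10 and 4, so the rule places the task with cost 2.0 first and the task with cost 1.0 last, and the exhaustive search over all six orderings selects the same sequence.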
