A Decision-Tree Approach to the Interpretation of Archaeological Data

1994 ◽  
Vol 74 ◽  
pp. 1-11 ◽  
Author(s):  
Anne Marshall ◽  
Garry Marshall

Summary
A number of techniques capable of generating descriptions of data have been developed in the area of Artificial Intelligence. Their suitability for use with a data set compiled to record the details of Anglo-Saxon structures in Britain is considered. An appropriate method which describes the data by means of decision trees is chosen and, after some adaptation, is used to generate descriptions of this data in the form of decision trees. The resulting descriptions compare favourably with interpretations obtained by archaeological analysis.

2021 ◽  
Author(s):  
Nicodemus Nzoka Maingi ◽  
Ismail Ateya Lukandu ◽  
Matilu Mwau

Abstract
Background
The disease outbreak management operations of most countries (notably Kenya) present numerous novel ideas of how best to make use of notifiable disease data to effect proactive interventions. Notifiable disease data is reported, aggregated and variously consumed. Over the years, there has been a deluge of notifiable disease data, and the challenge for the entities managing it has been how to objectively and dynamically aggregate such data in a manner that enables efficient consumption and informs relevant mitigation measures. Various models have been explored, tried and tested with varying results; some purely mathematical and statistical, others quasi-mathematical and software-model-driven.
Methods
One of the tools that has been explored is Artificial Intelligence (AI). AI is a technique that enables computers to intelligently perform and mimic actions and tasks usually reserved for human experts. AI presents a great opportunity for redefining how the data is more meaningfully processed and packaged. This research explores AI's Machine Learning (ML) theory as a differentiator in the crunching of notifiable disease data and the addition of perspective. An algorithm has been designed to test different notifiable disease outbreak data cases, shifting to managing disease outbreaks via the symptoms they generally manifest. Each notifiable disease is broken down into a set of symptoms, dubbed symptom burden variables, and consequently categorized into eight clusters: Bodily, Gastro-Intestinal, Muscular, Nasal, Pain, Respiratory, Skin and, finally, Other Symptom Clusters.
ML's decision tree theory has been utilized to determine the entropies and information gains of each symptom cluster based on select test data sets.
Results
Once the entropies and information gains have been determined, the information gain variables are ranked in descending order, from the variables with the highest information gains to those with the lowest, thereby giving a clear-cut criterion for how the variables are ordered. The ranked variables are then utilized in the construction of a binary decision tree, which graphically and structurally represents them. Should any variables tie in the information gain rankings, they are given equal importance in the construction of the binary decision tree. From the presented data, the computed information gains are ordered as: Gastro-Intestinal, Bodily, Pain, Skin, Respiratory, Other, Muscular and, finally, Nasal Symptoms. The corresponding binary decision tree is then constructed.
Conclusions
The algorithm successfully singles out the disease burden variable(s) that are most critical as the point of diagnostic focus, enabling the relevant authorities to take the necessary, informed interventions. It provides a good basis for a country's localized diagnostic activities driven by data from reported notifiable disease cases, and it offers a dynamic mechanism that can be used to analyze and aggregate any notifiable disease data set; the algorithm is not fixated on or locked to any particular data set.
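The entropy and information-gain computation described above follows the standard ID3-style formulas. The sketch below (with made-up outbreak counts for two of the symptom clusters; the real study uses eight) shows how the clusters would be ranked:

```python
import math

def entropy(pos, neg):
    """Shannon entropy of a binary outcome (outbreak / no outbreak)."""
    total = pos + neg
    if pos == 0 or neg == 0:
        return 0.0
    p, n = pos / total, neg / total
    return -p * math.log2(p) - n * math.log2(n)

def information_gain(parent, partitions):
    """Gain = parent entropy minus the weighted entropy of the partitions.

    parent     -- (pos, neg) counts before splitting
    partitions -- list of (pos, neg) counts after splitting on the variable
    """
    total = sum(p + n for p, n in partitions)
    weighted = sum((p + n) / total * entropy(p, n) for p, n in partitions)
    return entropy(*parent) - weighted

# Hypothetical counts of cases with/without an eventual outbreak, split on
# whether each symptom cluster was reported (cluster present vs absent).
parent = (9, 5)
clusters = {
    "Gastro-Intestinal": [(6, 1), (3, 4)],
    "Respiratory":       [(4, 3), (5, 2)],
}
ranked = sorted(clusters,
                key=lambda c: information_gain(parent, clusters[c]),
                reverse=True)
print(ranked)   # ['Gastro-Intestinal', 'Respiratory']
```

Ranking the clusters by descending information gain fixes the order in which they appear in the binary decision tree, exactly as the abstract describes.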


2008 ◽  
pp. 2978-2992
Author(s):  
Jianting Zhang ◽  
Wieguo Liu ◽  
Le Gruenwald

Decision trees (DT) have been widely used for training and classification of remotely sensed image data due to their capability to generate human-interpretable decision rules and their relatively fast speed in training and classification. This chapter proposes a successive decision tree (SDT) approach in which the samples in the ill-classified branches of a previous decision tree are used to construct a successive decision tree. The decision trees are chained together through pointers and used for classification. SDT aims at constructing more interpretable decision trees while attempting to improve classification accuracy. The proposed approach is applied to two real remotely sensed image data sets and evaluated in terms of classification accuracy and interpretability of the resulting decision rules.
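The chaining idea can be sketched in miniature. The toy below uses invented one-dimensional data, one-level stumps in place of full trees, and an arbitrary purity cutoff (none of it is from the chapter): each stage keeps only sufficiently pure branches as leaves and forwards the ill-classified samples to the next stage.

```python
def majority(labels):
    return max(set(labels), key=labels.count)

def build_sdt(samples, split_points, purity=0.8):
    """Chain one-level trees: each stage keeps only branches whose
    majority-class purity reaches `purity`; the remaining (ill-classified)
    samples are forwarded to the next stage, mimicking the SDT chaining."""
    chain = []
    for t in split_points:
        node = {"threshold": t, "labels": {}}
        leftover = []
        branches = {"le": [(x, y) for x, y in samples if x <= t],
                    "gt": [(x, y) for x, y in samples if x > t]}
        for side, part in branches.items():
            if not part:
                continue
            labels = [y for _, y in part]
            maj = majority(labels)
            if labels.count(maj) / len(labels) >= purity:
                node["labels"][side] = maj   # pure enough: becomes a leaf
            else:
                leftover.extend(part)        # ill-classified: defer
        chain.append(node)
        samples = leftover
        if not samples:
            break
    return chain

def classify(chain, x, default=None):
    """Follow the chain (the 'pointers' of the SDT) until a stage claims x."""
    for node in chain:
        side = "le" if x <= node["threshold"] else "gt"
        if side in node["labels"]:
            return node["labels"][side]
    return default

samples = [(1, "A"), (2, "A"), (4, "B"), (5, "B"), (7, "A"), (8, "A")]
chain = build_sdt(samples, split_points=[3, 6])
print([classify(chain, x) for x in (2, 4.5, 7.5)])   # ['A', 'B', 'A']
```

The first stump cleanly resolves only the low values; the mixed right branch is deferred to the second stump, which separates the remaining classes. Each stage stays small and readable, which is the interpretability argument behind SDT.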


2019 ◽  
Vol 2019 (1) ◽  
pp. 266-286 ◽  
Author(s):  
Anselme Tueno ◽  
Florian Kerschbaum ◽  
Stefan Katzenbeisser

Abstract
Decision trees are widespread machine learning models used for data classification and have many applications in areas such as healthcare, remote diagnostics, spam filtering, etc. In this paper, we address the problem of privately evaluating a decision tree on private data. In this scenario, the server holds a private decision tree model and the client wants to classify its private attribute vector using the server's private model. The goal is to obtain the classification while preserving the privacy of both the decision tree and the client input. After the computation, only the classification result is revealed to the client, while nothing is revealed to the server. Many existing protocols require a constant number of rounds. However, some of these protocols perform as many comparisons as there are decision nodes in the entire tree, and others transform the whole plaintext decision tree into an oblivious program, resulting in higher communication costs. The main idea of our novel solution is to represent the tree as an array. We then execute only d comparisons, where d is the depth of the tree. Each comparison is performed using a small garbled circuit, which outputs secret shares of the index of the next node. We obtain the inputs to the comparison by obliviously indexing the tree and the attribute vector. We implement oblivious array indexing using either garbled circuits, Oblivious Transfer or Oblivious RAM (ORAM). Using ORAM, this results in the first protocol with sub-linear cost in the size of the tree. We implemented and evaluated our solution using the different array indexing procedures mentioned above. As a result, we are not only able to provide the first protocol with sub-linear cost for large trees, but also reduce the communication cost for the large real-world data set "Spambase" from 18 MB to 1.2 MB and the computation time from 17 seconds to less than 1 second in a LAN setting, compared to the best related work.
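The array representation can be illustrated in the clear. The sketch below uses a made-up depth-2 tree and deliberately replaces the cryptographic machinery (garbled comparisons, oblivious indexing) with plain Python, so only the data flow is visible: node i's children sit at array positions 2i+1 and 2i+2, so classification costs exactly one indexed lookup and one comparison per level, i.e. d comparisons in total.

```python
# Each internal node: (attribute_index, threshold); leaves carry labels.
# Node i -> children at array positions 2*i+1 and 2*i+2.
nodes  = [(0, 5), (1, 3), (1, 7)]
leaves = {3: "spam", 4: "ham", 5: "ham", 6: "spam"}
depth  = 2

def classify(attr_vector):
    """Walk the array-encoded tree: exactly `depth` comparisons, one per
    level. In the protocol, the lookup of nodes[i] and attr_vector[attr]
    would be an oblivious array access and the comparison a garbled
    circuit outputting secret shares of the next index."""
    i = 0
    for _ in range(depth):
        attr, thr = nodes[i]
        i = 2 * i + 1 if attr_vector[attr] <= thr else 2 * i + 2
    return leaves[i]

print(classify([4, 2]))   # left, left -> 'spam'
```

Because only d array cells are ever touched, an ORAM-backed oblivious lookup makes the per-query cost depend on d rather than on the total number of decision nodes, which is where the sub-linear cost comes from.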


2018 ◽  
Vol 41 (8) ◽  
pp. 2185-2195
Author(s):  
Yuliang Cai ◽  
Huaguang Zhang ◽  
Qiang He ◽  
Shaoxin Sun

Based on axiomatic fuzzy set (AFS) theory and fuzzy information entropy, a novel fuzzy oblique decision tree (FODT) algorithm is proposed in this paper. Traditional axis-parallel decision trees consider only a single feature at each non-leaf node, while oblique decision trees partition the feature space with an oblique hyperplane. By contrast, the FODT takes dynamically mined fuzzy rules as its decision function. The main idea of the FODT is to use these fuzzy rules to construct leaf nodes for each class in each layer of the tree; the samples that cannot be covered by the fuzzy rules are then put into an additional node, the only non-leaf node in this layer. Construction of the FODT consists of four major steps: (a) automatic generation of fuzzy membership functions by AFS theory according to the raw data distribution; (b) dynamic extraction of fuzzy rules at each non-leaf node by the fuzzy rule extraction algorithm (FREA); (c) construction of the FODT from the fuzzy rules obtained in step (b); and (d) determination of the optimal threshold [Formula: see text] to generate the final tree. Compared with five traditional decision trees (C4.5, LADtree (LAD), Best-first tree (BFT), SimpleCart (SC) and NBTree (NBT)) and a recently proposed fuzzy rules decision tree (FRDT) on eight UCI machine learning data sets and one biomedical data set (ALLAML), the experimental results demonstrate that the proposed algorithm outperforms the other decision trees in both classification accuracy and tree size.


Author(s):  
Malcolm J. Beynon

The inductive learning methodology known as decision trees concerns the ability to classify objects based on their attribute values, using a tree-like structure from which decision rules can be accrued. In this article, a description of decision trees is given, with the main emphasis on their operation in a fuzzy environment. A first reference to decision trees is made in Hunt et al. (1966), who proposed the Concept Learning System to construct a decision tree that attempts to minimize the score of classifying chess endgames. The example problem concerning chess offers early evidence supporting the view that decision trees are closely associated with artificial intelligence (AI). Over ten years later, Quinlan (1979) developed this early work on decision trees and introduced the Interactive Dichotomizer 3 (ID3). The important feature of this development was the use of an entropy measure to aid the decision tree construction process (again using the chess game as the considered problem). It is ID3, and techniques like it, that define the hierarchical structure commonly associated with decision trees; see for example the recent theoretical and application studies of Pal and Chakraborty (2001), Bhatt and Gopal (2005) and Armand et al. (2007). Moreover, starting from an identified root node, paths are constructed down to leaf nodes, where the attributes associated with the intermediate nodes are identified through the use of an entropy measure to preferentially gauge the classification certainty down each path. Each path down to a leaf node forms an 'if .. then ..' decision rule used to classify the objects. The introduction of fuzzy set theory in Zadeh (1965) offered a general methodology that allows notions of vagueness and imprecision to be considered. Moreover, Zadeh's work allowed the possibility for previously defined techniques to be reconsidered within a fuzzy environment.
It was over ten years later that the area of decision trees benefited from this fuzzy environment opportunity (see Chang and Pavlidis, 1977). Since then there has been a steady stream of research studies that have developed or applied fuzzy decision trees (FDTs) (see recently, for example, Li et al., 2006 and Wang et al., 2007). The expectations that come with the utilisation of FDTs are succinctly stated by Li et al. (2006, p. 655): "Decision trees based on fuzzy set theory combines the advantages of good comprehensibility of decision trees and the ability of fuzzy representation to deal with inexact and uncertain information." Chiang and Hsu (2002) highlight that decision trees have been successfully applied to problems in artificial intelligence, pattern recognition and statistics. They go on to outline a positive development that FDTs offer, namely that they are better placed to estimate the degree to which an object is associated with each class, often desirable in areas like medical diagnosis (see Quinlan (1987) for the alternative view with respect to crisp decision trees). The remainder of this article looks in more detail at FDTs, including a tutorial example showing the rudiments of how an FDT can be constructed.
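As a small illustration of how the entropy measure carries over to the fuzzy setting, the sketch below (with invented membership degrees and class labels) computes entropy in the fuzzy-ID3 style, where crisp class counts are replaced by sums of membership degrees:

```python
import math

def fuzzy_entropy(memberships):
    """Fuzzy entropy of a node: class 'counts' are sums of membership
    degrees rather than crisp tallies (the fuzzy-ID3 generalisation).
    memberships -- list of (membership_degree, class_label) pairs."""
    totals = {}
    for degree, label in memberships:
        totals[label] = totals.get(label, 0.0) + degree
    s = sum(totals.values())
    return -sum((v / s) * math.log2(v / s) for v in totals.values() if v > 0)

# Hypothetical (membership, class) pairs for the objects at one node:
crisp = [(1.0, "yes"), (1.0, "yes"), (1.0, "no"), (1.0, "no")]
fuzzy = [(0.9, "yes"), (0.7, "yes"), (0.4, "no"), (0.2, "no")]

print(round(fuzzy_entropy(crisp), 3))   # 1.0   -- an even 2/2 crisp split
print(round(fuzzy_entropy(fuzzy), 3))   # 0.845 -- memberships favour "yes"
```

With crisp (0/1) memberships the measure reduces to ordinary Shannon entropy; graded memberships lower it when one class dominates in degree, which is what lets an FDT report how strongly an object belongs to each class.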


Author(s):  
Malcolm J. Beynon

The first (crisp) decision tree techniques were introduced in the 1960s (Hunt, Marin, & Stone, 1966); their appeal to decision makers is due in no small part to their comprehensibility in classifying objects based on their attribute values (Janikow, 1998). With early techniques such as the ID3 algorithm (Quinlan, 1979), the general approach involves the repetitive partitioning of the objects in a data set by testing attributes down a tree structure from the root node, until each subset of objects is associated with the same decision class or no attribute is available for further decomposition, ending in a number of leaf nodes. This article considers the notion of decision trees in a fuzzy environment (Zadeh, 1965). The first fuzzy decision tree (FDT) reference is attributed to Chang and Pavlidis (1977), which defined a binary tree using a branch-bound-backtrack algorithm but offered limited instruction on FDT construction. Later developments included fuzzy versions of crisp decision tree techniques, such as fuzzy ID3 (see Ichihashi, Shirai, Nagasaka, & Miyoshi, 1996; Pal & Chakraborty, 2001), and other versions (Olaru & Wehenkel, 2003).


Author(s):  
Ljiljana Kašćelan ◽  
Vladimir Kašćelan

Popular decision tree (DT) algorithms such as ID3, C4.5, CART, CHAID and QUEST may produce different results on the same data set. They consist of components with similar functionalities, but these components are implemented in different ways and have different performance. The best way to obtain an optimal DT for a data set is through component-based design, which enables the user to intelligently select pre-implemented components well suited to the specific data set. In this article the authors propose a component-based design of the optimal DT for classification of securities account holders. The research results showed that the optimal algorithm is not one of the original DT algorithms, confirming that the component-based design provided algorithms with better performance than the original ones. The authors also found how the specificities of the data influence the performance of the DT components. The obtained classification results can be useful to future investors in the Montenegrin capital market.
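The component idea can be sketched by making the split criterion a pluggable part, so that a CART-style Gini component and an ID3/C4.5-style entropy component share one tree-building routine. The data and function names below are illustrative, not taken from the article:

```python
import math

def gini(labels):
    """CART-style impurity component."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def entropy(labels):
    """ID3/C4.5-style impurity component."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def best_split(samples, impurity):
    """One shared builder step: pick the 1-D threshold minimising weighted
    impurity, using whichever impurity component the caller plugged in."""
    best = None
    for t in sorted({x for x, _ in samples}):
        left  = [y for x, y in samples if x <= t]
        right = [y for x, y in samples if x > t]
        if not left or not right:
            continue
        score = (len(left) * impurity(left)
                 + len(right) * impurity(right)) / len(samples)
        if best is None or score < best[0]:
            best = (score, t)
    return best

samples = [(1, "hold"), (2, "hold"), (5, "sell"), (6, "sell")]
print(best_split(samples, gini))      # (0.0, 2) -- both components agree here
print(best_split(samples, entropy))   # (0.0, 2)
```

Swapping the `impurity` argument changes the algorithm without touching the builder, which is the essence of assembling an "optimal" DT from interchangeable components rather than committing to one monolithic algorithm.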


Author(s):  
Md Nasim Adnan ◽  
Md Zahidul Islam

Decision trees are popularly used in a wide range of real-world problems for both prediction and the discovery of classification (logic) rules. A decision forest is an ensemble of decision trees, often built to achieve better predictive performance than a single decision tree. Besides improving predictive performance, a decision forest can be seen as a pool of logic rules (rules) with great potential for knowledge discovery. However, a standard-sized decision forest usually generates a large number of rules that a user may not be able to manage for effective knowledge analysis. In this paper, we propose a new, data-set-independent framework for extracting those rules that are comparatively more accurate, generalized and concise than others. We apply the proposed framework to rules generated by two different decision forest algorithms from some publicly available medical data sets on dementia and heart disease. We then compare the quality of the rules extracted by the proposed framework with rules generated from a single J48 decision tree and rules extracted by another recent method. The results reported in this paper demonstrate the effectiveness of the proposed framework.
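The extraction idea can be sketched as rule scoring: rate each forest rule on accuracy, coverage (a rough proxy for how generalized it is) and conciseness, then keep the top-ranked ones. The rules, weights and records below are invented for illustration; the paper's actual criteria may differ.

```python
def evaluate(rule, samples, w_acc=0.6, w_cov=0.3, w_con=0.1):
    """rule = (conditions, predicted_label); each condition is a predicate
    over a sample dict. Score blends accuracy, coverage and conciseness
    with illustrative weights."""
    conditions, label = rule
    covered = [s for s in samples if all(c(s) for c in conditions)]
    if not covered:
        return 0.0
    accuracy    = sum(1 for s in covered if s["class"] == label) / len(covered)
    coverage    = len(covered) / len(samples)
    conciseness = 1.0 / len(conditions)      # fewer conditions -> more concise
    return w_acc * accuracy + w_cov * coverage + w_con * conciseness

samples = [{"age": 72, "class": "dementia"}, {"age": 80, "class": "dementia"},
           {"age": 40, "class": "healthy"},  {"age": 35, "class": "healthy"}]
rules = {
    "age>65 -> dementia": ([lambda s: s["age"] > 65], "dementia"),
    "age>30 -> dementia": ([lambda s: s["age"] > 30], "dementia"),
}
best = max(rules, key=lambda name: evaluate(rules[name], samples))
print(best)   # age>65 -> dementia  (wins on accuracy)
```

Ranking every rule in the forest this way and keeping only the top scorers shrinks the unmanageable rule pool to a short list a domain expert can actually inspect.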


1996 ◽  
Vol 59 (11) ◽  
pp. 1242-1247 ◽  
Author(s):  
FRANK L. BRYAN

Decision trees have been used as an aid to the selection of critical control points as part of the development of hazard analysis critical control point (HACCP) systems. The background for those in existence is described. Another decision-tree approach, which follows the logic in the IAMFES manual Procedures to Implement the Hazard Analysis Critical Control Point Approach, is presented. It takes into consideration impending hazards, the effect of actions exercised at the operation in question, whether control actions should be taken at this or subsequent operations, and whether the CCP will be monitored and corrections made. Further, guidelines are given for selecting an operation as a critical control point. A decision tree to aid in the evaluation of risks is also presented, considering whether illness will result, the severity of the illness and the likely occurrence of this outcome, based upon epidemiologic or challenge studies of related events. These decision trees provide additional tools to aid in the development of HACCP systems.
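The logic of such a CCP decision tree can be sketched as a short chain of yes/no questions. The wording below paraphrases the general Codex-style CCP logic and is not taken verbatim from the IAMFES manual:

```python
def is_ccp(hazard_likely, step_controls_hazard,
           later_step_controls_hazard, can_monitor_and_correct):
    """Walk a minimal CCP decision tree for one operation."""
    if not hazard_likely:
        return False        # no significant hazard at this operation
    if not step_controls_hazard:
        return False        # this step exerts no control; redesign or look elsewhere
    if later_step_controls_hazard:
        return False        # control is deferred to a subsequent operation
    # CCP only if the point can actually be monitored and corrected
    return can_monitor_and_correct

# Cooking step: hazard present, this step controls it, no later kill step,
# and temperature can be monitored -> critical control point.
print(is_ccp(True, True, False, True))   # True
```

Each `if` mirrors one question in the published trees (impending hazard, effect of actions at this operation, control at subsequent operations, monitoring and correction), so the function is just the tree written as code.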

