An Optimized Random Forest Classification Method for Processing Imbalanced Data Sets of Alzheimer's Disease

Author(s):  
Haijing Sun ◽  
Anna Wang ◽  
Yun Feng ◽  
Chen Liu
2020 ◽  
Vol 53 (8) ◽  
pp. 5747-5788
Author(s):  
Julian Hatwell ◽  
Mohamed Medhat Gaber ◽  
R. Muhammad Atif Azad

Abstract. Modern machine learning methods typically produce “black box” models that are opaque to interpretation. Yet demand for them has been increasing in Human-in-the-Loop processes, that is, processes that require a human agent to verify, approve or reason about automated decisions before they can be applied. To facilitate this interpretation, we propose Collection of High Importance Random Path Snippets (CHIRPS), a novel algorithm for explaining random forest classification per data instance. CHIRPS extracts a decision path from each tree in the forest that contributes to the majority classification, and then uses frequent pattern mining to identify the most commonly occurring split conditions. A simple rule in conjunctive form is then constructed, whose antecedent terms are derived from the attributes that had the most influence on the classification. This rule is returned together with estimates of its precision and coverage on the training data and with counter-factual details. An experimental study involving nine data sets shows that classification rules returned by CHIRPS have a precision at least as high as the state of the art when evaluated on unseen data (0.91–0.99) and offer much greater coverage (0.04–0.54). Furthermore, CHIRPS uniquely controls against under- and over-fitting solutions by maximising novel objective functions that are better suited to the local (per instance) explanation setting.
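
The pipeline the abstract describes (collect one decision path per tree, find the most frequent split conditions, join them into a conjunctive rule, then report precision and coverage) can be illustrated with a minimal Python sketch. It is not the CHIRPS implementation. The paths, feature names (`f0`, `f1`, `f2`), thresholds and data rows are hypothetical and hand-written, plain exact-match counting stands in for frequent pattern mining, and the counter-factual details and the paper's objective functions are omitted.

```python
from collections import Counter

# Hypothetical decision paths, one per tree that voted for the majority class.
# Each split condition is (feature, operator, threshold).
paths = [
    [("f0", ">", 5), ("f1", "<=", 2)],
    [("f1", "<=", 2), ("f2", "<=", 7)],
    [("f0", ">", 5), ("f1", "<=", 2), ("f2", ">", 7)],
    [("f0", ">", 5), ("f2", "<=", 7)],
]

def frequent_conditions(paths, min_support):
    """Return conditions found in at least min_support (a fraction) of the paths."""
    # dict.fromkeys removes repeats within a path while keeping a stable order.
    counts = Counter(cond for path in paths for cond in dict.fromkeys(path))
    return [cond for cond, k in counts.most_common() if k / len(paths) >= min_support]

def holds(cond, row):
    """Check one split condition against a data row (a dict of feature values)."""
    feature, op, threshold = cond
    return row[feature] <= threshold if op == "<=" else row[feature] > threshold

def rule_stats(rule, rows, labels, target):
    """Precision and coverage of a conjunctive rule on labelled rows."""
    covered = [y for row, y in zip(rows, labels) if all(holds(c, row) for c in rule)]
    coverage = len(covered) / len(rows)
    precision = covered.count(target) / len(covered) if covered else 0.0
    return precision, coverage

# Hypothetical training rows and labels (1 is the class being explained).
rows = [
    {"f0": 6, "f1": 1, "f2": 3}, {"f0": 7, "f1": 2, "f2": 8},
    {"f0": 8, "f1": 1, "f2": 5}, {"f0": 4, "f1": 1, "f2": 2},
    {"f0": 9, "f1": 3, "f2": 1}, {"f0": 2, "f1": 5, "f2": 9},
    {"f0": 6, "f1": 0, "f2": 4}, {"f0": 3, "f1": 2, "f2": 6},
]
labels = [1, 1, 0, 0, 1, 0, 1, 0]

rule = frequent_conditions(paths, min_support=0.75)
precision, coverage = rule_stats(rule, rows, labels, target=1)
print(rule, precision, coverage)
```

In this toy setting the rule keeps only the conditions shared by at least three of the four paths, so raising `min_support` yields a shorter rule that covers more rows and lowering it yields a longer rule that covers fewer.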


2021 ◽  
Vol 6 (3) ◽  
pp. 23-29
Author(s):  
Aleksander V. Butorin ◽  
Grigory V. Mokhov

Background. Facies zoning based on seismic data is an important task of dynamic analysis. There are numerous approaches to solving this problem using various algorithms. The most common method is clustering by reflection shape. This approach is an unsupervised learning algorithm, because the mapping of seismic facies is based on the internal structure of the data and the key feature is the change in the wave packet within the target interval. The disadvantage of this method is that the clustering results must subsequently be tied to geological information. A more directed way of solving this problem is to use supervised learning algorithms, a category that includes various machine learning classification methods. In comparison with traditional approaches to seismic facies analysis, this method takes geological information into account at the computation stage. Aim. This paper presents the results of a study of the facies structure of the Tyumen formation at a group of fields in the Khanty-Mansiysk Autonomous Region. The Tyumen formation is characterized by the predominance of channel facies associated with the development of complex river systems, which are clearly observed in the dynamic characteristics of the wave field. A complicating factor in the study of these deposits is the rather low coverage of well data, which makes geological interpretation of the results difficult. Materials and methods. The authors used the Random Forest classification method for this task. The method is applied to a cluster of three seismic surveys acquired at different times. For training, expert labelling by area was used, based on the distribution of amplitudes along the reflecting horizon. Results. The research yielded a probabilistic assessment of the distribution of channel facies, which relates to the prospectivity of this type of deposit in the study area. The authors have thus developed a methodology for estimating the probability that a given facies is present from seismic data. Conclusions. The study shows that the Random Forest classification method can be used to solve the problem of facies zoning.
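
One common way to get a probability estimate of this kind from a tree ensemble is the fraction of trees that vote for the class. The Python sketch below illustrates that idea under stated assumptions and is not the authors' workflow. It uses a bagged ensemble of one-split trees on a single hypothetical `amplitude` attribute, so it omits the random feature selection of a full Random Forest. The training values, labels and function names are invented for illustration.

```python
import random

def fit_stump(samples):
    """Fit a one-split tree on (amplitude, label) pairs; label 1 = channel facies."""
    ys = [y for _, y in samples]
    values = sorted({x for x, _ in samples})
    if len(set(ys)) == 1 or len(values) == 1:
        # Degenerate bootstrap sample: fall back to a constant prediction.
        majority = max(set(ys), key=ys.count)
        return lambda x: majority
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # candidate threshold between neighbouring values
        for above in (0, 1):  # class predicted when x > t
            errors = sum((above if x > t else 1 - above) != y for x, y in samples)
            if best is None or errors < best[0]:
                best = (errors, t, above)
    _, t, above = best
    return lambda x: above if x > t else 1 - above

def fit_forest(samples, n_trees, seed=0):
    """Train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    return [fit_stump(rng.choices(samples, k=len(samples))) for _ in range(n_trees)]

def channel_probability(forest, x):
    """Fraction of trees that vote for the channel class at amplitude x."""
    return sum(tree(x) for tree in forest) / len(forest)

# Hypothetical expert-labelled amplitudes: 0 = non-channel, 1 = channel.
train = [(0.10, 0), (0.15, 0), (0.20, 0), (0.25, 0), (0.30, 0),
         (0.60, 1), (0.65, 1), (0.70, 1), (0.75, 1), (0.80, 1)]

forest = fit_forest(train, n_trees=25)
p_high = channel_probability(forest, 0.90)
p_low = channel_probability(forest, 0.05)
print(p_high, p_low)
```

Evaluating `channel_probability` at every map location would give a probability map of the channel class, which is the kind of output the abstract reports.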

