Model Simplification of Deep Random Forest for Real-Time Applications of Various Sensor Data

Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3004
Author(s):  
Sangwon Kim ◽  
Byoung-Chul Ko ◽  
Jaeyeal Nam

The deep random forest (DRF) has recently gained new attention in deep learning because it performs comparably to a deep neural network (DNN) and does not rely on backpropagation. However, it connects a large number of decision trees across multiple layers, which makes analysis difficult. This paper proposes a new method for simplifying the black-box model of a DRF through rule elimination. To this end, we quantify the feature contributions and frequency of the fully trained DRF in the form of a decision rule set. The feature contributions provide a basis for determining how features affect the decision process in a rule set. Model simplification is achieved by measuring the feature contributions and eliminating unnecessary rules. Consequently, the simplified and transparent DRF has fewer parameters and rules than before. The proposed method was successfully applied to various DRF models and benchmark sensor datasets, maintaining robust performance despite the elimination of a large number of rules. A comparison with state-of-the-art compressed DNNs also showed that the proposed model simplification achieves higher parameter compression and memory efficiency with similar classification accuracy.
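The rule-elimination idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the `Rule` fields, the score (absolute contribution weighted by relative frequency), and the `keep_ratio` parameter are all hypothetical choices made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Rule:
    name: str            # hypothetical identifier for a root-to-leaf decision rule
    contribution: float  # assumed to be a precomputed feature-contribution value
    frequency: int       # assumed count of training samples that fire the rule


def simplify(rules, keep_ratio=0.5):
    """Keep the highest-scoring fraction of rules and drop the rest.

    Score = |contribution| * relative frequency (an assumption for this sketch).
    """
    total = sum(r.frequency for r in rules) or 1
    ranked = sorted(
        rules,
        key=lambda r: abs(r.contribution) * r.frequency / total,
        reverse=True,
    )
    n_keep = max(1, math.ceil(len(rules) * keep_ratio))
    return ranked[:n_keep]


# Made-up rule set; the numbers are illustrative, not taken from the paper.
rules = [
    Rule("r1", 0.40, 50),
    Rule("r2", 0.01, 30),
    Rule("r3", -0.30, 15),
    Rule("r4", 0.02, 5),
]
kept = simplify(rules, keep_ratio=0.5)
print([r.name for r in kept])
```

Weighting by frequency keeps rules that are both influential and often used; a real system would also need to re-check accuracy after each round of elimination.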

2020 ◽  
Author(s):  
Sangwon Kim ◽  
Mira Jeong ◽  
Byoung Chul Ko

This paper proposes a new method for interpreting and simplifying the black-box model of a deep random forest (RF) through rule elimination. In a deep RF, a large number of decision trees are connected across multiple layers, which makes analysis difficult. A deep RF performs comparably to a deep neural network (DNN) while achieving better generalizability. Therefore, in this study, we quantify the feature contributions and frequency of the fully trained deep RF in the form of a decision rule set. The feature contributions provide a basis for determining how features affect the decision process in a rule set. Model simplification is achieved by measuring the feature contributions and eliminating unnecessary rules. Consequently, the simplified model has fewer parameters and rules than before. Experimental results show that a feature contribution analysis allows a black-box model to be decomposed so that a rule set can be interpreted quantitatively. The proposed method was successfully applied to various deep RF models and benchmark datasets, maintaining robust performance despite the elimination of a large number of rules.


Electronics ◽  
2021 ◽  
Vol 10 (19) ◽  
pp. 2444
Author(s):  
Mazhar Javed Awan ◽  
Osama Ahmed Masood ◽  
Mazin Abed Mohammed ◽  
Awais Yasin ◽  
Azlan Mohd Zain ◽  
...  

In recent years, the amount of malware spreading through the internet and infecting computers and other communication devices has increased tremendously. To date, countless techniques and methodologies have been proposed to detect and neutralize these malicious agents. However, as new and automated malware generation techniques emerge, much of the malware being produced can bypass some state-of-the-art malware detection methods. Therefore, there is a need to classify and detect these adversarial agents, which can compromise the security of people, organizations, and countless other forms of digital assets. In this paper, we propose a deep learning framework based on spatial attention and a convolutional neural network (SACNN) for image-based classification of 25 well-known malware families with and without class balancing. Performance was evaluated on the Malimg benchmark dataset using precision, recall, specificity, precision, and F1 score, on which our proposed model with class balancing reached 97.42%, 97.95%, 97.33%, 97.11%, and 97.32%, respectively. We also conducted experiments on SACNN with class balancing on the benign class, which also produced results above 97%. The results indicate that our proposed model can be used for image-based malware detection with high performance, despite being simpler than other available solutions.
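For readers unfamiliar with the metrics named in this abstract, the sketch below computes precision, recall, specificity, and F1 score from a binary confusion matrix using their standard definitions. The counts are hypothetical and are not taken from the paper; for a 25-family problem, one would typically compute these per family in a one-vs-rest fashion and then average, which is an assumption about the evaluation rather than something the abstract states.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from the four cells of a binary confusion matrix."""
    precision = tp / (tp + fp)        # of predicted positives, how many are right
    recall = tp / (tp + fn)           # of actual positives, how many are found
    specificity = tn / (tn + fp)      # of actual negatives, how many are rejected
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of P and R
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for a single malware family treated one-vs-rest.
m = binary_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```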


Algorithms ◽  
2020 ◽  
Vol 13 (1) ◽  
pp. 17 ◽  
Author(s):  
Emmanuel Pintelas ◽  
Ioannis E. Livieris ◽  
Panagiotis Pintelas

Machine learning has emerged as a key factor in many technological and scientific advances and applications. Much research has been devoted to developing high-performance machine learning models that make very accurate predictions and decisions across a wide range of applications. Nevertheless, we still seek to understand and explain how these models work and make decisions. Explainability and interpretability in machine learning are significant issues, since in most real-world problems it is considered essential to understand and explain a model's prediction mechanism in order to trust it and make decisions on critical issues. In this study, we developed a Grey-Box model based on a semi-supervised methodology that utilizes a self-training framework. The main objective of this work is to develop a machine learning model that is both interpretable and accurate, although this is a complex and challenging task. The proposed model was evaluated on a variety of real-world datasets from the crucial application domains of education, finance, and medicine. Our results demonstrate the efficiency of the proposed model, which performs comparably to a Black-Box model and considerably outperforms single White-Box models while remaining as interpretable as a White-Box model.
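The self-training framework mentioned in this abstract can be illustrated with a generic loop: fit a model on the labeled data, pseudo-label the unlabeled points it is confident about, and refit. The sketch below is an assumption-laden toy, not the authors' Grey-Box model: it uses a 1-D nearest-centroid classifier as a stand-in base learner, and the `min_margin` confidence rule and all function names are hypothetical.

```python
def fit_centroids(xs, ys):
    """Nearest-centroid model on 1-D data: one mean per class label."""
    centroids = {}
    for label in set(ys):
        points = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(points) / len(points)
    return centroids


def predict_with_margin(centroids, x):
    """Return (label, margin): margin is how much closer the best centroid is."""
    ranked = sorted(centroids, key=lambda c: abs(x - centroids[c]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, float("inf")
    margin = abs(x - centroids[ranked[1]]) - abs(x - centroids[best])
    return best, margin


def self_train(xs, ys, unlabeled, min_margin=2.0, max_rounds=10):
    """Grow the labeled set with confident pseudo-labels, then refit."""
    xs, ys, pool = list(xs), list(ys), list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(xs, ys)
        confident, remaining = [], []
        for x in pool:
            label, margin = predict_with_margin(centroids, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing left that the model is sure about
        for x, label in confident:
            xs.append(x)
            ys.append(label)
        pool = remaining
    return fit_centroids(xs, ys), pool


# Two labeled points, three unlabeled; 5.0 is ambiguous and stays unlabeled.
model, leftover = self_train([0.0, 10.0], [0, 1], [1.0, 9.0, 5.0])
print(model, leftover)
```

The confidence threshold is the key design choice: too low and wrong pseudo-labels pollute the training set, too high and the unlabeled data are never used.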


2020 ◽  
Vol 34 (04) ◽  
pp. 4107-4114 ◽  
Author(s):  
Masoumeh Heidari Kapourchali ◽  
Bonny Banerjee

We propose an agent model capable of actively and selectively communicating with other agents to predict its environmental state efficiently. Selecting whom to communicate with is a challenge when the internal models of other agents are unobservable. Our agent learns a communication policy, a mapping from its belief state to the agent it should communicate with, in an online and unsupervised manner, without any reinforcement. Human activity recognition from multimodal, multisource, and heterogeneous sensor data is used as a testbed to evaluate the proposed model, where each sensor is assumed to be monitored by an agent. The recognition accuracy on benchmark datasets is comparable to the state of the art even though our model uses significantly fewer parameters and infers the state in a localized manner. The learned policy reduces the number of communications. The agent is tolerant to communication failures and can recognize unreliable agents through their communication messages. To the best of our knowledge, this is the first work on learning communication policies by an agent for predicting its environmental state.


2020 ◽  
Author(s):  
Y Sun ◽  
H Wang ◽  
Bing Xue ◽  
Y Jin ◽  
GG Yen ◽  
...  

Convolutional neural networks (CNNs) have shown remarkable performance in various real-world applications. Unfortunately, the promising performance of CNNs can be achieved only when their architectures are optimally constructed. The architectures of state-of-the-art CNNs are typically handcrafted with extensive expertise in both CNNs and the investigated data, which hampers the widespread adoption of CNNs by less experienced users. Evolutionary deep learning (EDL) is able to automatically design the best CNN architectures without much expertise. However, existing EDL algorithms generally evaluate the fitness of a new architecture by training it from scratch, resulting in a prohibitive computational cost even on high-performance computers. In this paper, an end-to-end offline performance predictor based on the random forest is proposed to accelerate fitness evaluation in EDL. As a case study, the proposed performance predictor is integrated into an existing EDL algorithm, where it shows promising performance in terms of classification accuracy and consumed computational resources when compared with 18 state-of-the-art peer competitors. The proposed performance predictor is also compared with two other representative performance predictors. The experimental results show that the proposed performance predictor not only significantly speeds up the fitness evaluations but also achieves the best prediction among the peer performance predictors.
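The idea of replacing from-scratch training with a learned performance predictor can be sketched as a surrogate model that maps an architecture encoding to an estimated accuracy. The example below is a heavily simplified stand-in, not the authors' predictor: the forest uses depth-1 trees (stumps) with bootstrap sampling and one random feature per tree, the `(depth, width)` encoding is hypothetical, and the accuracy values are made up.

```python
import random


def fit_stump(X, y, feature):
    """Best single split on one feature by squared error.

    Returns (feature, threshold, left_mean, right_mean).
    """
    values = sorted(set(row[feature] for row in X))
    mean = sum(y) / len(y)
    best = (feature, float("inf"), mean, mean)  # fallback: constant prediction
    best_err = sum((t - mean) ** 2 for t in y)
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for row, t in zip(X, y) if row[feature] <= thr]
        right = [t for row, t in zip(X, y) if row[feature] > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if err < best_err:
            best, best_err = (feature, thr, lm, rm), err
    return best


def fit_forest(X, y, n_trees=25, seed=0):
    """Bagged stumps: bootstrap rows and pick one random feature per tree."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        feature = rng.randrange(len(X[0]))
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest


def predict(forest, x):
    """Average the per-tree predictions for one encoded architecture."""
    preds = [lm if x[f] <= thr else rm for f, thr, lm, rm in forest]
    return sum(preds) / len(preds)


# Hypothetical (depth, width) encodings with made-up validation accuracies.
X = [(2, 16), (4, 32), (6, 64), (8, 128), (10, 256), (12, 512)]
y = [0.60, 0.70, 0.78, 0.84, 0.88, 0.90]
forest = fit_forest(X, y)

# Rank unseen candidates by predicted accuracy instead of training each one.
candidates = [(3, 24), (11, 400), (7, 96)]
ranked = sorted(candidates, key=lambda c: predict(forest, c), reverse=True)
print(ranked)
```

In an evolutionary search, the predicted value would take the place of the measured fitness, so only the few top-ranked architectures need to be trained for real.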


2020 ◽  
Vol 10 (23) ◽  
pp. 8346
Author(s):  
Ni Jiang ◽  
Feihong Yu

Cell counting is a fundamental part of biomedical and pathological research, and predicting a density map is the mainstream method for counting cells. Because it is easy to train and generalizes well, the random forest is often used to learn from cell images and predict density maps. However, it cannot predict values beyond the range of the training data, which may result in underestimation. To overcome this problem, we propose a cell counting framework that predicts the density map by detecting cells. The framework contains two parts: training data preparation and the detection framework. The former ensures that cells can be detected even when they overlap, and the latter ensures that the count is accurate and robust. The proposed method uses multiple random forests to predict various probability maps, in which the cells can be detected using the Hessian matrix. All detection results are then combined to obtain the density map, which improves performance. We conducted experiments on three public cell datasets. Experimental results showed that the proposed model performs better than the traditional random forest (RF) in terms of accuracy and robustness, and is even superior to some state-of-the-art deep learning models. In particular, when the training data are small, which is the usual case in cell counting, the count errors on VGG cells and MBM cells decreased from 3.4 to 2.9 and from 11.3 to 9.3, respectively. The proposed model obtains the lowest count error and achieves state-of-the-art performance.
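The extrapolation limit mentioned in this abstract follows from how a standard regression forest predicts, assuming each leaf outputs the mean of the training targets that fall into it (the usual construction; the abstract does not say which variant is used). With T trees, and writing L_t(x) for the set of training samples in the leaf of tree t that the query x reaches:

```latex
\hat{y}(x) = \frac{1}{T} \sum_{t=1}^{T} \frac{1}{|L_t(x)|} \sum_{i \in L_t(x)} y_i
\quad\Longrightarrow\quad
\min_i y_i \;\le\; \hat{y}(x) \;\le\; \max_i y_i
```

Every prediction is an average of averages of training targets, so it can never exceed the largest density value seen during training. This is consistent with the underestimation on dense or overlapping regions that the abstract describes.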

