Boosted Fuzzy Granular Regression Trees

2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Wei Li ◽  
Youmeng Luo ◽  
Chao Tang ◽  
Kaiqiang Zhang ◽  
Xiaoyu Ma

Regression is an important problem in machine learning and has been widely applied in fields such as meteorology, transportation, and materials science. Granular computing (GrC) is an effective approach to modeling human-like information processing and is well suited to knowledge discovery. Ensemble learning is easy to execute in parallel. Building on granular computing and ensemble learning, we convert the regression problem into an equivalent problem in granular space and propose boosted fuzzy granular regression trees (BFGRT) to predict a test instance. The idea of BFGRT is as follows. First, a clustering algorithm with automatic optimization of cluster centers is presented. Next, based on this clustering algorithm, we employ MapReduce to perform fuzzy granulation of the data in parallel. Then, we design new operators and metrics on fuzzy granules to build a fuzzy granular rule base. Finally, a fuzzy granular regression tree (FGRT) in the fuzzy granular space is presented. On this basis, BFGRT is constructed by combining multiple FGRTs in parallel via random attribute sampling and MapReduce. Theoretical analysis and experiments show that BFGRT is accurate, efficient, and robust.
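The granulation and attribute-sampling steps described above can be illustrated with a minimal sketch. The abstract does not give the paper's operators, so the similarity form `1 - |x - c|` on attributes scaled to [0, 1] is an assumption, and the names `fuzzy_granule`, `granulate`, and `sample_attributes` are hypothetical.

```python
import random

def fuzzy_granule(value, centers):
    """Similarity of one scaled attribute value to each cluster center.

    Assumed form: 1 - |value - center|, with both in [0, 1].
    """
    return [1.0 - abs(value - c) for c in centers]

def granulate(instance, centers_per_attribute):
    """Turn an instance into one fuzzy granule (similarity vector) per attribute."""
    return [fuzzy_granule(v, cs) for v, cs in zip(instance, centers_per_attribute)]

def sample_attributes(n_attributes, k, seed):
    """Pick k distinct attribute indices for one base tree of the ensemble."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_attributes), k))

# One instance with two attributes, each compared against its own centers.
granules = granulate([0.5, 0.25], [[0.0, 0.5, 1.0], [0.25, 0.75]])
subset = sample_attributes(n_attributes=5, k=3, seed=0)
```

Each base tree would then be trained only on the granules of its sampled attributes; how the trees are grown and combined is not specified in the abstract and is left out here.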

Author(s):  
Pooya Tavallali ◽  
Peyman Tavallali ◽  
Mukesh Singhal

A fast, convenient, and well-known approach to regression is to induce and prune a binary tree. However, there have been few attempts to improve the performance of an already induced regression tree. This paper presents a meta-algorithm capable of minimizing the regression loss function, thereby improving the accuracy of any given hierarchical model, such as k-ary regression trees. Our proposed method minimizes the loss function of each node one by one. At split nodes, this leads to solving an instance-based cost-sensitive classification problem over the node's data points. At leaf nodes, the method leads to a simple regression problem. For binary univariate and multivariate regression trees, the computational complexity of training is linear in the number of samples. Hence, our method scales to large trees and datasets. We also briefly explore the possibility of applying the proposed method to classification tasks. We show that our algorithm achieves significantly better test error than other state-of-the-art tree algorithms. Finally, the accuracy, memory usage, and query time of our method are compared with those of recently introduced forest models. We show that, most of the time, our proposed method achieves better or similar accuracy while having noticeably faster query time and fewer nonzero weights.
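The two per-node subproblems can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it assumes squared loss and treats each child as a constant prediction, whereas in the paper the children are subtrees. The function names are hypothetical.

```python
def leaf_value(targets):
    """At a leaf, the constant minimizing squared loss is the mean target."""
    return sum(targets) / len(targets)

def routing_costs(targets, left_pred, right_pred):
    """Per-instance loss if a point is sent to the left or the right child."""
    return [((y - left_pred) ** 2, (y - right_pred) ** 2) for y in targets]

def preferred_side(costs):
    """Cheapest child for each point; these act as cost-sensitive class labels."""
    return ["left" if left <= right else "right" for left, right in costs]

targets = [1.0, 2.0, 9.0, 10.0]
costs = routing_costs(targets, left_pred=1.5, right_pred=9.5)
labels = preferred_side(costs)
```

A split node would then fit a classifier to these labels, weighting each point by how much worse the wrong side would be, which is what makes the problem instance-based and cost-sensitive.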


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Yoonseok Shin

Among recent data mining techniques, the boosting approach has attracted a great deal of attention because of its effective learning algorithm and strong bounds on its generalization performance. However, although boosting has been actively used in other domains, it has yet to be applied to regression problems in the construction domain, including cost estimation. Therefore, a boosting regression tree (BRT) is applied to cost estimation at the early stage of a construction project to examine the applicability of the boosting approach to a regression problem within the construction domain. To evaluate the BRT model, its performance was compared with that of a neural network (NN) model, which has been shown to perform well in cost estimation. The BRT model produced results similar to those of the NN model using 234 actual cost datasets of a building construction project. In addition, the BRT model can provide additional information, such as the importance plot and structure model, which can help estimators understand the decision-making process. Consequently, the boosting approach has potential applicability to preliminary cost estimation in building construction projects.
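The general idea of boosting regression trees can be sketched with one-split trees (stumps) fitted to residuals under squared error. This is a generic illustration, not the BRT configuration used in the study. The function names, the single feature, and the toy data are assumptions, and `fit_stump` assumes at least two distinct feature values.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split on one feature under squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # every threshold leaving both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def boost(xs, ys, n_rounds=50, learning_rate=0.5):
    """Fit stumps to the current residuals and add them with shrinkage."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + sum(learning_rate * (lm if x <= t else rm)
                      for t, lm, rm in stumps)

model = boost([1, 2, 3, 4], [1.0, 1.0, 5.0, 5.0])
```

Counting how often each feature is chosen for a split, or how much error each split removes, is one common way to obtain the kind of importance information mentioned in the abstract.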


Mathematics ◽  
2020 ◽  
Vol 8 (5) ◽  
pp. 707 ◽  
Author(s):  
Tran Manh Tuan ◽  
Luong Thi Hong Lan ◽  
Shuo-Yan Chou ◽  
Tran Thi Ngan ◽  
Le Hoang Son ◽  
...  

Complex fuzzy theory has a strong practical background in many important applications, especially in decision-making support systems. Recently, the Mamdani Complex Fuzzy Inference System (M-CFIS) has been introduced as an effective tool for handling events that are not restricted to the values at a given time point but also include all values within certain time intervals (i.e., the phase term). In such decision-making problems, complex fuzzy theory allows us to observe both the amplitude and the phase values of an event, resulting in better performance. However, one limitation of the existing M-CFIS is that its rule base may be redundant for a specific dataset. To handle this problem, we propose a new Mamdani Complex Fuzzy Inference System with Rule Reduction Using Complex Fuzzy Measures in Granular Computing, called M-CFIS-R. Several fuzzy similarity measures, namely the Complex Fuzzy Cosine Similarity Measure (CFCSM), the Complex Fuzzy Dice Similarity Measure (CFDSM), and the Complex Fuzzy Jaccard Similarity Measure (CFJSM), together with their weighted versions, are proposed. These measures are integrated into the M-CFIS-R system following the idea of granular computing, so that only important and dominant rules are kept in the system. M-CFIS-R differs from M-CFIS, and gains its advantage, through a training process in which the rule base is repeatedly adjusted toward the original base set until performance improves. In this way, the new rule base in M-CFIS-R improves the performance of the whole system. Experiments on various decision-making datasets demonstrate that the proposed M-CFIS-R performs better than M-CFIS.


Vehicles ◽  
2020 ◽  
Vol 2 (1) ◽  
pp. 126-141
Author(s):  
Weizhen You ◽  
Alexandre Saidi ◽  
Abdel-malek Zine ◽  
Mohamed Ichchou

Reliability assessment plays a significant role in mechanical design and improvement processes. Uncertainties in structural properties, as well as those in the stochastic excitations, make reliability analysis more difficult to apply. In fact, reliability evaluation involves estimating the so-called conditional failure probability (CFP), which can be seen as a regression problem that takes the structural uncertainties as input and the CFPs as output. As powerful ensemble learning methods in the machine learning (ML) domain, random forest (RF) and related methods such as gradient boosting (GB) and extra-trees (ETs) generally perform well on non-parametric regression. However, no systematic study of such methods in mechanical reliability is found in the currently published research. Another, more complex ensemble method, Stacking (Stacked Generalization), builds the regression model hierarchically, resulting in a meta-learner induced from various base learners. This research aims to build a framework that integrates ensemble learning theories into mechanical reliability estimation and to explore their performance on structures of different complexity. In numerical simulations, the proposed methods are tested with different ensemble models, and their performance is compared and analyzed from different perspectives. The simulation results show that, with far fewer structural sample analyses, the ensemble learning methods achieve estimates highly comparable to those obtained by direct Monte Carlo simulation (MCS).
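The direct Monte Carlo baseline, and the view of the CFP as a function of a structural parameter, can be sketched as follows. The limit state (failure when a Gaussian load exceeds a fixed capacity), the load parameters, and the function name are hypothetical choices for illustration, not the structures studied in the paper.

```python
import random

def conditional_failure_probability(capacity, n_samples=20000, seed=0):
    """Estimate P(load > capacity) for one fixed structural capacity.

    Assumed model: the stochastic excitation is a Gaussian load with
    mean 3.0 and standard deviation 1.0.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        load = rng.gauss(3.0, 1.0)
        if capacity - load < 0.0:  # limit state g = capacity - load
            failures += 1
    return failures / n_samples

# Training pairs (structural parameter, CFP) for a regression model.
capacities = [3.0, 4.0, 5.0]
training_pairs = [(c, conditional_failure_probability(c)) for c in capacities]
```

An ensemble regressor trained on pairs like these could then predict the CFP for new structural parameters without rerunning the simulation, which is where the saving in structural sample analyses would come from.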


2016 ◽  
Vol 04 (02) ◽  
pp. 117-127 ◽  
Author(s):  
Anoop Sathyan ◽  
Nicholas D. Ernest ◽  
Kelly Cohen

Fuzzy logic is used in a variety of applications because of its universal approximation property and nonlinear characteristics. However, it takes a lot of trial and error to come up with a set of membership functions and a rule base that work effectively for a specific application. This process can be simplified by using a heuristic search algorithm such as a Genetic Algorithm (GA). In this paper, genetic fuzzy is applied to task assignment for cooperating Unmanned Aerial Vehicles (UAVs), formulated as the Polygon Visiting Multiple Traveling Salesman Problem (PVMTSP). The PVMTSP has many applications, including UAV swarm routing. We propose a method of genetic fuzzy clustering that is specific to PVMTSP problems and hence more efficient than k-means and c-means clustering. We developed two different algorithms using genetic fuzzy. One evaluates the distance covered by each UAV to cluster the search space, and the other uses a cost function that approximates the distance covered, resulting in reduced computational time. We compare these two approaches with each other as well as with an already benchmarked fuzzy clustering algorithm that is the current state of the art. We also discuss how well our algorithm scales with an increasing number of targets. The results are compared for small and large polygon sizes.
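For context, the c-means baseline mentioned above assigns each point a degree of membership in every cluster. The sketch below shows the standard fuzzy c-means membership update for a single point. It is not the genetic fuzzy method proposed in the paper, and the function name is hypothetical.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means memberships of one point (fuzzifier m > 1)."""
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The point sits on one or more centers: share full membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

# A target one unit from the first center and three units from the second.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
```

Memberships always sum to one, and a larger `m` makes the assignment softer.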


2021 ◽  
Vol 24 (1) ◽  
pp. 42-47
Author(s):  
N. P. Koryshev ◽  
I. A. Hodashinsky ◽  

The article describes an algorithm for generating fuzzy rules for a fuzzy classifier using data clustering, a metaheuristic, and a clustering quality index, and reports the results of performance testing on real data sets.


Author(s):  
Ramón Ventura Roque Hernández ◽  
José Melchor Medina Quintero ◽  
Adán López Mendoza ◽  
Demián Ábrego Almazán

In recent years, universities have promoted access to digital repositories for locating information sources that facilitate the scientific research process. However, few studies have evaluated user satisfaction with these technological resources. This work therefore aimed to identify profiles of university students' satisfaction with the use of these tools. To this end, a questionnaire with 26 questions grouped into 7 dimensions was administered, collecting responses from 219 participants at a university with a presence in Nuevo Laredo and Ciudad Victoria (Tamaulipas, México). Two variables were analyzed as possible predictors in the construction of usage satisfaction profiles: the first related to the repository interface (interactivity, trust, timeliness of access, ease of use, visual appeal, and innovation), while the second related to the student (sex, highest level of education, and place of origin). The statistical package SPSS was used, and the data mining technique known as a regression tree was applied with the CRT (classification and regression trees) growing method. From the collected data, a tree was obtained that describes three profiles with low, medium, and high levels of satisfaction. People with a low level of satisfaction were those who perceived that the repositories were not easy to use. A medium level of satisfaction was observed in people who considered the repositories easy to use but neither trusted the security they offered nor perceived a high level of innovation in them. Finally, the highest levels of satisfaction were found among students who considered the repositories easy to use and to have a reliable level of security.
The results make it possible to understand user satisfaction in terms of the variables studied, with the aim of prioritizing them in the design and implementation of new institutional repositories that provide better user experiences oriented toward the optimal use of these resources.


Author(s):  
Pierre-Alexandre Murena ◽  
Jérémie Sublime ◽  
Basarab Matei ◽  
Antoine Cornuéjols

Clustering is a compression task that consists of grouping similar objects into clusters. In real-life applications, the system may have access to several views of the same data, and each view may be processed by a specific clustering algorithm. This framework is called multi-view clustering and can benefit from algorithms capable of exchanging information between the different views. In this paper, we consider this type of unsupervised ensemble learning as a compression problem and develop a theoretical framework, based on algorithmic information theory, that is suitable for multi-view clustering and collaborative clustering applications. Using this approach, we propose a new algorithm with a solid theoretical basis and test it on several real and artificial data sets.


2012 ◽  
Vol 31 ◽  
pp. 15-21 ◽  
Author(s):  
A. Künne ◽  
M. Fink ◽  
H. Kipka ◽  
P. Krause ◽  
W.-A. Flügel

Abstract. In this paper, a method is presented to estimate excess nitrogen at large scales while accounting for single-field processes. The approach was implemented by using the physically based model J2000-S to simulate the nitrogen balance as well as the hydrological dynamics within meso-scale test catchments. The model input data, the parameterization, the results, and a detailed system understanding were used to generate the regression tree models with GUIDE (Loh, 2002). For each landscape type in the federal state of Thuringia, a regression tree was calibrated and validated using the model data and the excess nitrogen results from the test catchments. Hydrological parameters such as precipitation and evapotranspiration were also used by the regression tree model to predict excess nitrogen. Hence, they also had to be calculated and regionalized for the state of Thuringia. Here the model J2000g was used to simulate the water balance on the macro scale. With the regression trees, the excess nitrogen was regionalized for each landscape type of Thuringia. The approach allows the potential nitrogen input into the streams of the drainage area to be calculated. The results show that the applied methodology was able to transfer the detailed model results of the meso-scale catchments to the entire state of Thuringia with low computing time and without losing the detailed knowledge from the nitrogen transport modeling. This was validated with modeling results from Fink (2004) in a catchment lying in the regionalization area. The regionalized and modeled excess nitrogen show 94% agreement. The study was conducted within the framework of a project in collaboration with the Thuringian Environmental Ministry, whose overall aim was to assess the effect of agro-environmental measures on load reduction in the water bodies of Thuringia in order to fulfill the requirements of the European Water Framework Directive (Bäse et al., 2007; Fink, 2006; Fink et al., 2007).


Author(s):  
Donghyun Kim

In this paper, we propose methods for brain tumor detection in MRI images based on ensemble learning. We build upon prior research on ensemble methods by testing the concatenation of pre-trained models: features extracted via transfer learning are merged and segmented by classification algorithms or by a stacked ensemble of those algorithms. The proposed approach achieved an accuracy of 0.98, outperforming a benchmark VGG-16 model. The paper also discusses considerations related to granular computing.

