Predicting effects of selected impregnation processes on the observed bending strength of wood, with use of data mining models

BioResources ◽  
2021 ◽  
Vol 16 (3) ◽  
pp. 4891-4904
Author(s):  
Selahattin Bardak ◽  
Timucin Bardak ◽  
Hüseyin Peker ◽  
Eser Sözen ◽  
Yildiz Çabuk

Wood materials have been used for centuries in many products, such as furniture, stairs, windows, and doors. Different methods are used to adapt wood to ambient conditions. Impregnation is a widely used method of wood preservation, and optimizing its parameters is critical for efficiency. In the wood industry, data mining techniques reduce much of the cost and many of the operational challenges through accurate prediction. In this study, three data-mining algorithms were applied to predict bending strength in impregnated wood materials (Pinus sylvestris L. and Millettia laurentii). Models based on decision tree (DT), random forest (RF), and Gaussian process (GP) algorithms were created from real experimental data to examine the relationship between bending strength, diffusion time, vacuum duration, and wood type. The highest bending strength was achieved with wenge (Millettia laurentii) wood under a 10 bar vacuum with a diffusion condition of 25 min. The results showed that all algorithms are suitable for predicting bending strength. The goodness of fit for the testing phase was 0.994, 0.986, and 0.989 for the DT, RF, and GP algorithms, respectively. Moreover, the importance of each attribute was determined for the algorithms.
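As a rough illustration of the decision-tree (DT) idea named in this abstract, the sketch below fits a tiny regression tree that predicts a strength value from wood type, vacuum duration, and diffusion time. It is not the authors' model: the data are synthetic placeholders generated by a made-up formula, and the names `fit_tree`, `best_split`, and `predict` are hypothetical.

```python
from statistics import mean

def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, targets):
    """Return the (feature index, threshold) that lowers total SSE most, or None."""
    best, best_score = None, sse(targets)
    for j in range(len(rows[0])):
        # Skip the largest value: splitting there would leave the right side empty.
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            score = sse(left) + sse(right)
            if score < best_score - 1e-12:
                best, best_score = (j, t), score
    return best

def fit_tree(rows, targets, depth=0, max_depth=3):
    """Grow a regression tree; leaves are mean target values."""
    split = best_split(rows, targets) if depth < max_depth else None
    if split is None:
        return mean(targets)
    j, t = split
    left = [(r, y) for r, y in zip(rows, targets) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, targets) if r[j] > t]
    return (j, t,
            fit_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            fit_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

# Synthetic placeholder data (NOT the study's measurements).
# Each row is (wood type code, vacuum duration, diffusion time).
rows = [(w, v, d) for w in (0, 1) for v in (10, 20, 30) for d in (5, 15, 25)]
targets = [80 + 20 * w + 0.5 * d for w, v, d in rows]  # made-up formula

tree = fit_tree(rows, targets)
print(predict(tree, (1, 20, 25)))
```

A random forest would average many such trees grown on resampled data, and the reported goodness of fit would be computed on held-out rows rather than on the training rows used here.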

2021 ◽  
pp. 016555152110308
Author(s):  
Salma Khan ◽  
Muhammad Shaheen

The knowledge gained from data mining depends heavily on an expert's experience for further analysis, increased effectiveness and wise decision-making. This mined knowledge requires actionability enhancement before it can be applied to real-world problems. The literature highlights the reasons behind the need to incorporate human wisdom in decision-making for complex problems. To address this need, a domain called ‘Wisdom Mining’ is recommended, proposing a set of algorithms parallel to those of data mining. In wisdom mining, a process to extract wisdom needs to be defined with less influence from an expert. This review proposes improvements to data mining techniques and their real-world applications, and emphasises the need to seek ways to harness wisdom from data. It covers the diverse definitions and perspectives of wisdom within philosophy, psychology, management and computer science. The literature review serves as a foundation for constructing a wise decision framework that helps identify wisdom factors such as context, utility, location and time. Including these wisdom factors in existing data mining algorithms makes the transition from data mining to wisdom mining possible. The research also describes the relationship between the two mining processes, which helps to elucidate the wisdom mining process further. Potential research trends in the domain are also identified as a way to improve the analysis and use of data.


2009 ◽  
pp. 2000-2009
Author(s):  
J. J. Dolado ◽  
D. Rodríguez ◽  
J. Riquelme ◽  
F. Ferrer-Troyano ◽  
J. J. Cuadrado

One of the problems found in generic project databases, where the data is collected from different organizations, is the large disparity of its instances. In this chapter, we characterize the database by selecting both attributes and instances so that project managers can have a better global vision of the data they manage. To achieve that, we first use data mining algorithms to create clusters. From each cluster, instances are selected to obtain a final subset of the database. The result of the process is a smaller database that has fewer instances and attributes than the original, yet maintains its prediction capability and allows us to produce better predictions.
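The cluster-then-select step described in this abstract can be sketched as follows. This is only an assumed illustration: the abstract does not name the clustering algorithm or the selection rule, so plain k-means and "keep the instances nearest each centroid" are stand-ins, and the project data are invented.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns the centroids and the clusters they were computed from."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: dist2(p, centroids[c]))
            clusters[idx].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

def select_representatives(points, k, per_cluster=2):
    """Keep the instances closest to each cluster centroid."""
    centroids, clusters = kmeans(points, k)
    subset = []
    for c, cl in zip(centroids, clusters):
        subset.extend(sorted(cl, key=lambda p: dist2(p, c))[:per_cluster])
    return subset

# Invented (size, effort) pairs forming two obvious groups of projects.
projects = [(1, 1), (1, 2), (2, 1), (2, 2), (10, 10), (10, 11), (11, 10), (11, 11)]
subset = select_representatives(projects, k=2)
print(subset)
```

The reduced subset would then replace the full database when training a prediction model. Whether predictions actually improve has to be checked on held-out projects.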


2007 ◽  
Vol 6, April 2007, joint... ◽  
Author(s):  
Patricia E.N. Lutu ◽  
Andries P. Engelbrecht

Sampling of large datasets for data mining is important for at least two reasons. The processing of large amounts of data results in increased computational complexity, and the cost of this additional complexity may not be justifiable. On the other hand, the use of small samples results in fast and efficient computation for data mining algorithms. Statistical methods for obtaining sufficient samples from datasets for classification problems are discussed in this paper. Results are presented for an empirical study based on the use of sequential random sampling and sample evaluation using univariate hypothesis testing and an information theoretic measure. Comparisons are made between theoretical and empirical estimates.
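A minimal sketch of sequential random sampling is given below. The stopping rule is an assumption, not the paper's procedure: the sample grows in fixed steps until the 95% confidence interval for the mean of a single variable is narrower than a chosen tolerance. This is a simple univariate stand-in for the hypothesis test and information theoretic measure the abstract mentions. The data and the name `sequential_sample` are hypothetical.

```python
import random
from statistics import mean, stdev

def sequential_sample(data, step=50, tol=0.5, z=1.96, seed=0):
    """Grow a random sample until the CI half-width for the mean is <= tol."""
    rng = random.Random(seed)
    pool = list(data)
    rng.shuffle(pool)  # a prefix of a shuffled list is a simple random sample
    n = step
    while n < len(pool):
        sample = pool[:n]
        half_width = z * stdev(sample) / n ** 0.5
        if half_width <= tol:
            return sample
        n += step
    return pool  # tolerance never met: fall back to the full dataset

# Synthetic data: 5000 draws from a normal distribution (placeholder only).
gen = random.Random(1)
data = [gen.gauss(50, 5) for _ in range(5000)]

sample = sequential_sample(data)
print(len(sample), round(mean(sample), 2), round(mean(data), 2))
```

For a classification problem the same loop would be run per attribute, or with a class-distribution measure, before accepting the sample as sufficient.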


Author(s):  
Ansar Abbas ◽  
Muhammad Aman Ullah ◽  
Abdul Waheed

This study predicts the body weight (BW) of Thalli sheep of southern Punjab from different body measurements. Several body measurements, viz. withers height, body length (BL), head length, head width, ear length, ear width, neck length, neck width, heart girth, rump length, rump width, tail length, barrel depth and sacral pelvic width, are used as predictors. The data mining algorithms Chi-square Automatic Interaction Detector (CHAID), Exhaustive CHAID, Classification and Regression Tree (CART) and Artificial Neural Network (ANN) are used to predict the BW of a total of 85 female Thalli sheep. The data set is partitioned into training (80%) and test (20%) sets before the algorithms are applied. The minimum numbers of parent (4) and child (2) nodes are set in order to ensure their predictive ability. The R² (%) values, with RMSE in parentheses, for the CHAID, Exhaustive CHAID, ANN and CART algorithms are 67.38 (1.003), 64.37 (1.049), 61.45 (1.093) and 59.02 (1.125), respectively. The most significant predictor of BW in Thalli sheep is BL. The heaviest average BW, 9.596 kg, is obtained from the subgroup with BL > 25.000 inches. On the basis of several goodness-of-fit criteria, we conclude that the CHAID algorithm performs best in predicting the BW of Thalli sheep and yields a more visually suitable decision tree diagram. The CHAID results may also help to identify body measurements positively associated with BW, supporting better selection strategies based on indirect selection criteria.
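The 80/20 partition and the two reported criteria can be sketched with their standard definitions. The abstract does not say exactly how R² and RMSE were computed, so the formulas below are the textbook ones, and the function names are hypothetical.

```python
import random
from math import sqrt

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction of them as the test set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r2_percent(actual, predicted):
    """Coefficient of determination, expressed as a percentage."""
    m = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - m) ** 2 for a in actual)
    return 100 * (1 - ss_res / ss_tot)

# 85 placeholder animal IDs, matching only the sample size in the abstract.
animals = list(range(85))
train, test = train_test_split(animals)
print(len(train), len(test))
```

With 85 animals this split gives 68 training and 17 test records. A higher R² together with a lower RMSE on the test set indicates the better model.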


Author(s):  
Geert Wets ◽  
Koen Vanhoof ◽  
Theo Arentze ◽  
Harry Timmermans

The utility-maximizing framework—in particular, the logit model—is the dominantly used framework in transportation demand modeling. Computational process modeling has been introduced as an alternative approach to deal with the complexity of activity-based models of travel demand. Current rule-based systems, however, lack a methodology to derive rules from data. The relevance and performance of data-mining algorithms that potentially can provide the required methodology are explored. In particular, the C4 algorithm is applied to derive a decision tree for transport mode choice in the context of activity scheduling from a large activity diary data set. The algorithm is compared with both an alternative method of inducing decision trees (CHAID) and a logit model on the basis of goodness-of-fit on the same data set. The ratio of correctly predicted cases of a holdout sample is almost identical for the three methods. This suggests that for data sets of comparable complexity, the accuracy of predictions does not provide grounds for either rejecting or choosing the C4 method. However, the method may have advantages related to robustness. Future research is required to determine the ability of decision tree-based models in predicting behavioral change.


Author(s):  
G. Ramadevi ◽  
Srujitha Yeruva ◽  
P. Sravanthi ◽  
P. Eknath Vamsi ◽  
S. Jaya Prakash

In a digitized world, data is growing exponentially, and it is difficult to analyze the data and produce results. Data mining techniques play an important role in big data for the healthcare sector. Data mining algorithms make it possible to analyze, detect and predict the presence of disease, which helps doctors with early detection and decision making. The objective of the data mining techniques used here is to design an automated tool that reports a patient’s treatment history, disease and medical data to doctors. Data mining techniques are very useful for extracting meaningful and practical patterns from medical data. This project works on diabetes medical data; classification and clustering algorithms (OPTICS, Naive Bayes and BIRCH) are implemented and their efficiency is examined.


Author(s):  
Abdusalam Shaltooki ◽  
Mojtaba Jamshidi

Aerodynamics is a branch of fluid dynamics that evaluates the behavior of airflow and its interaction with moving objects. Its most important application is in aerospace engineering, in the design and construction of flying objects. Reducing the noise emitted by aerodynamic objects is one of the most important challenges in this area, and many efforts have been made to reduce its negative effects. Predicting the noise emitted by these objects is a low-cost and fast approach that can partially replace the "fabrication and testing" phase. One of the most common and successful tools in prediction procedures is data mining technology. In this paper, the performance of different data mining algorithms such as Random Forest, J48, RBF Network, SVM, MLP, Logistic, and Bagging is evaluated in predicting the amount of noise emitted from aerodynamic objects. The experiments are conducted on a dataset collected by NASA, called "Airfoil Self-Noise". The results show that the proposed hybrid model, which combines the Random Forest and Bagging algorithms, outperforms the other methods, with an accuracy of 77.6% and a mean absolute error of 0.2279.
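Bagging (bootstrap aggregating), one of the algorithms named in this abstract, can be sketched as below. The paper's hybrid combines Random Forest with Bagging. This sketch swaps in a 1-nearest-neighbour base learner to stay short, so it shows only the resample-and-vote mechanism. The toy data and all function names are hypothetical.

```python
import random
from collections import Counter

def nearest_label(train, x):
    """1-nearest-neighbour base learner: label of the closest training point."""
    closest = min(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)))
    return closest[1]

def bagging_predict(train, x, n_models=25, seed=0):
    """Majority vote over base learners, each trained on a bootstrap resample."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        bootstrap = [rng.choice(train) for _ in train]  # sample with replacement
        votes[nearest_label(bootstrap, x)] += 1
    return votes.most_common(1)[0][0]

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Toy two-feature data with invented noise-level classes.
train = [((0.0, 0.0), "low"), ((0.1, 0.2), "low"), ((0.2, 0.1), "low"),
         ((1.0, 1.0), "high"), ((0.9, 1.1), "high"), ((1.1, 0.9), "high")]

print(bagging_predict(train, (0.05, 0.1)))
print(mean_absolute_error([1, 2, 3], [1, 3, 5]))
```

Replacing `nearest_label` with a tree learner that also samples features at each split would move this sketch closer to a random forest.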

