Modelling the Spatial Distribution of Asbestos–Cement Products in Poland with the Use of the Random Forest Algorithm

2019, Vol 11 (16), pp. 4355
Author(s): Ewa Wilk, Małgorzata Krówczyńska, Bogdan Zagajewski

The unique set of physical and chemical properties of asbestos has led to its many industrial applications worldwide, of which roofing and facades constitute approximately 80% of currently used asbestos-containing products. Since asbestos-containing products are harmful to human health, their use and production have been banned in many countries. To date, no research has been undertaken to estimate the total amount of asbestos–cement products used at the country level in relation to regions or other administrative units. The objective of this paper is to present a possible new solution for mapping the spatial distribution of asbestos–cement products used across the country by applying a supervised machine learning algorithm, Random Forest. The best Random Forest models were computed from the results of a physical inventory of asbestos–cement products taken with the use of aerial imagery, together with selected features describing the socio-economic situation of Poland: population, buildings, public finance, housing economy and municipal infrastructure, wages, salaries and social security benefits, agricultural census, entities of the national economy, labor market, environment protection, area of built-up surfaces, historical belonging to annexations, and data on asbestos manufacturing plants. The selection of important variables was made in R v.3.1.0 and supported by the Boruta algorithm. The prediction of the amount of asbestos–cement products used in communes was executed in the randomForest package. A model explaining 75.85% of the variance was subsequently used to prepare the prediction map of the spatial distribution of the amount of asbestos–cement products used in Poland. The total amount was estimated at 710,278,645 m² (7.8 million tons). Since the best model used data on built-up surfaces, which are available for the whole of Europe, it is worth considering applying the developed method in other European countries, as well as using it to assess the environmental risk of asbestos exposure to humans.
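The "variance explained" figure quoted above can be illustrated with a small calculation. The sketch below is in Python rather than R and assumes the figure is the usual pseudo R-squared, 100 × (1 − MSE / Var(y)), which the randomForest package reports for regression forests; the abstract does not spell out the formula, and the numbers are made up for illustration only.

```python
from statistics import fmean


def percent_variance_explained(observed, predicted):
    """Return 100 * (1 - MSE / Var(observed)), using the population variance."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = fmean(observed)
    # Mean squared error between observed and predicted values.
    mse = fmean((o - p) ** 2 for o, p in zip(observed, predicted))
    # Population variance (divides by n, not n - 1).
    variance = fmean((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed values have zero variance")
    return 100.0 * (1.0 - mse / variance)


# Hypothetical observed and predicted amounts, NOT data from the paper.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
score = percent_variance_explained(observed, predicted)
print(f"{score:.1f}% of the variance explained")
```

A perfect prediction yields 100, and always predicting the mean of the observations yields 0, which is why the figure is read as a share of the variance accounted for by the model.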

2017, Vol 35 (5), pp. 491-499
Author(s): Ewa Wilk, Małgorzata Krówczyńska, Piotr Pabjanek, Piotr Mędrzycki

The unique set of physical and chemical properties of asbestos has led to its many industrial applications worldwide; one of them was roof covering. Asbestos is harmful to human health, and therefore its use was legally forbidden. Since in Poland there is no adequate data on the amount of asbestos-cement roofing, the objective of this study was to estimate its quantity on the basis of physical inventory taking with the use of aerial imagery, and the application of selected statistical features. Data pre-processing and analysis were executed in R Statistical Environment v. 3.1.0. The best random forest models were computed; a model explaining 72.9% of the variance was subsequently used to prepare the prediction map of the amount of asbestos-cement roofing in Poland. Variables defining the number of farms, number and age of buildings, and regional differences were crucial for the analysis. The total amount of asbestos roofing in Poland was estimated at 738,068,000 m² (8.2 million t). This estimate is crucial for the landfill development programme, financial resources distribution, and application of monitoring policies.


Friction, 2021
Author(s): Vigneashwara Pandiyan, Josef Prost, Georg Vorlaufer, Markus Varga, Kilian Wasmer

Functional surfaces in relative contact and motion are prone to wear and tear, resulting in loss of efficiency and performance of the workpieces/machines. Wear occurs in the form of adhesion, abrasion, scuffing, galling, and scoring between contacts. However, the rate of the wear phenomenon depends primarily on the physical properties and the surrounding environment. Monitoring the integrity of surfaces by offline inspections leads to significant wasted machine time. A potential alternative to the offline inspection currently practiced in industry is the analysis of sensor signatures capable of capturing the wear state and correlating it with the wear phenomenon, followed by in situ classification using a state-of-the-art machine learning (ML) algorithm. Though this technique is better than offline inspection, it has inherent disadvantages when it comes to training the ML models. Ideally, supervised training of ML models requires the classes in the dataset to be equally represented to avoid bias. Collecting such a dataset is very cumbersome and expensive in practice because, in real industrial applications, the malfunction period is minimal compared to normal operation. Furthermore, classification models cannot separate new wear phenomena from the normal regime if they were not trained on them. As a promising alternative, in this work, we propose a methodology able to differentiate the abnormal regimes, i.e., wear phenomenon regimes, from the normal regime. This is carried out by familiarizing the ML algorithms only with the distribution of the acoustic emission (AE) signals of the normal regime, captured using a microphone. As a result, the ML algorithms are able to detect whether a new, unseen signal overlaps with the learnt distributions. To achieve this goal, a generative convolutional neural network (CNN) architecture based on a variational autoencoder (VAE) is built and trained. During the validation of the proposed CNN architectures, we were able to identify acoustic signals corresponding to the normal and abnormal wear regimes with accuracies of 97% and 80%, respectively. Hence, our approach shows very promising results for in situ and real-time condition monitoring or even wear prediction in tribological applications.
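The idea of learning only the normal regime and flagging whatever falls outside it can be sketched without a neural network. The Python example below is not the authors' VAE: it swaps in a plain Gaussian model with a z-score threshold, and it assumes each signal has already been reduced to one hypothetical scalar score. The abstract does not say which quantity the authors threshold, and the values and the threshold of 3 standard deviations are illustrative assumptions.

```python
from statistics import fmean, pstdev


def fit_normal_regime(scores):
    """Learn the mean and standard deviation from normal-regime scores only."""
    return fmean(scores), pstdev(scores)


def is_abnormal(score, mean, std, threshold=3.0):
    """Flag a score lying more than `threshold` standard deviations from the mean."""
    if std == 0:
        # Degenerate case: every training score was identical.
        return score != mean
    return abs(score - mean) > threshold * std


# Hypothetical per-signal scores recorded during normal operation.
normal_scores = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05]
mean, std = fit_normal_regime(normal_scores)

print(is_abnormal(1.05, mean, std))  # score close to the learnt distribution
print(is_abnormal(2.0, mean, std))   # score far from the learnt distribution
```

No abnormal examples are needed to fit the model, which is the property the abstract relies on when malfunction data are scarce.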


2021
Author(s): Omar Alfarisi, Zeyar Aung, Mohamed Sassi

Choosing the optimal machine learning algorithm is not an easy decision. To help future researchers, we describe in this paper the optimal among the best of the algorithms. We built a synthetic data set and performed supervised machine learning runs for five different algorithms. For heterogeneity, we identified Random Forest, among others, to be the best algorithm.


2019, Vol 8 (2S3), pp. 1630-1635

In the present century, various classification problems arise with large data, and the most commonly used machine learning algorithms fail to produce accurate results. Data mining techniques such as ensembles, which combine individual classifiers, are used for classification and also to generate new data. Random forest is an ensemble supervised machine learning technique used in numerous applications such as the classification of text and image data. It is popular because it provides useful features such as the variable importance measure and the out-of-bag error. For viable learning and classification with random forest, the number of decision trees in the forest needs to be reduced (pruning). In this paper, we present a systematic overview of the random forest algorithm along with its application areas. In addition, we present a brief review of machine learning algorithms proposed in recent years. Animal classification is considered an important problem, and most recent studies classify animals from image datasets. However, very little work has been done on attribute-oriented animal classification, which poses many challenges in extracting accurate features. We took a real-time dataset from Kaggle to classify animals by collecting the more relevant features with the help of the variable importance measure, and compared the results with other popular machine learning models.
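The out-of-bag error mentioned above rests on bootstrap sampling: each tree is trained on rows drawn with replacement, and the rows it never saw can serve as a built-in test set. The Python sketch below shows only that sampling step, not a full random forest; the function name, the sample size, and the seed are illustrative assumptions.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples row indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    # Rows that were never drawn are "out of bag" for this tree.
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the run is repeatable
in_bag, out_of_bag = bootstrap_split(1000, rng)
oob_fraction = len(out_of_bag) / 1000
print(f"out-of-bag fraction: {oob_fraction:.3f}")
```

For large samples the out-of-bag fraction approaches 1/e, about 36.8%, so roughly a third of the rows are available to evaluate each tree without a separate validation set.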


2021, Vol 13 (22), pp. 4716
Author(s): Wanxue Zhu, Ehsan Eyshi Rezaei, Hamideh Nouri, Ting Yang, Binbin Li, ...

Satellite and unmanned aerial vehicle (UAV) remote sensing can be used to estimate soil properties; however, little is known regarding the effects of UAV and satellite remote sensing data integration on the estimation of soil comprehensive attributes, or how to estimate quickly and robustly. In this study, we tackled those gaps by employing UAV multispectral and Sentinel-2B data to estimate soil salinity and chemical properties over a large agricultural farm (400 ha) covered by different crops and harvest areas at the coastal saline-alkali land of the Yellow River Delta of China in 2019. Spatial information of soil salinity, organic matter, available/total nitrogen content, and pH at 0–10 cm and 10–20 cm layers were obtained via ground sampling (n = 195) and two-dimensional spatial interpolation, aiming to overlap the soil information with remote sensing information. The exploratory factor analysis was conducted to generate latent variables, which represented the salinity and chemical characteristics of the soil. A machine learning algorithm (random forest) was applied to estimate soil attributes. Our results indicated that the integration of UAV texture and Sentinel-2B spectral data as random forest model inputs improved the accuracy of latent soil variable estimation. The remote sensing-based information from cropland (crop-based) had a higher accuracy compared to estimations performed on bare soil (soil-based). Therefore, the crop-based approach, along with the integration of UAV texture and Sentinel-2B data, is recommended for the quick assessment of soil comprehensive attributes.


2022
Author(s): Omar Alfarisi, Zeyar Aung, Mohamed Sassi

Choosing the optimal machine learning algorithm is not an easy decision. To help future researchers, we describe in this paper the optimal among the best of the algorithms. We built a synthetic data set and performed supervised machine learning runs for five different algorithms. For heterogeneous rock fabric, we identified Random Forest, among others, to be the appropriate algorithm.


2020, Vol 9 (1)
Author(s): Tahir Ali Rather, Sharad Kumar, Jamal Ahmad Khan

Background: Habitat resources are structured across different spatial scales in the environment, and thus animals perceive and select habitat resources at different spatial scales. Failure to adopt a scale-dependent framework in species-habitat relationships may lead to biased inferences. Multi-scale species distribution models (SDMs) can thus improve predictive ability compared to single-scale approaches. This study outlines the importance of multi-scale modeling in assessing species-habitat relationships and may provide a methodological framework using a robust algorithm to model and predict habitat suitability maps (HSMs) for similar multi-species and multi-scale studies. Results: We used a supervised machine learning algorithm, random forest (RF), to assess the habitat relationships of Asiatic wildcat (Felis lybica ornata), jungle cat (Felis chaus), Indian fox (Vulpes bengalensis), and golden jackal (Canis aureus) at ten spatial scales (500–5000 m) in human-dominated landscapes. We calculated out-of-bag (OOB) error rates of each predictor variable across the ten scales to select the most influential spatial scale variables. The scale optimization (OOB rates) indicated that model performance was associated with variables at multiple spatial scales. Species occurrence tended to be most strongly related to predictor variables at broader scales (5000 m). Multivariate RF models indicated landscape composition to be a strong predictor of Asiatic wildcat, jungle cat, and Indian fox occurrences. At the same time, topographic and climatic variables were the most important predictors determining the golden jackal distribution. Our models predicted range expansion in all four species under future climatic scenarios. Conclusions: Our results highlight the importance of using multi-scale distribution models when predicting species distributions and habitat relationships. The wide adaptability of meso-carnivores allows them to persist in human-dominated regions, and they may even thrive in disturbed habitats. These meso-carnivores are among the few species that may benefit from climate change.


Author(s): Chau Vo, Tru Cao, Bao Ho

Abbreviations are widely used in clinical notes because the notes are often written under high pressure, with little writing time and a need to simplify medical records. These abbreviations limit the clarity and understanding of the records and greatly affect all computer-based data processing tasks. In this paper, we propose a solution to the abbreviation identification task on clinical notes in a practical context where only a few clinical notes have been labeled while many more remain to be labeled. Our solution is a semi-supervised learning approach that uses level-wise feature engineering to construct an abbreviation identifier, using a small set of labeled clinical texts and exploiting a larger set of unlabeled clinical texts. A semi-supervised learning algorithm, Semi-RF, and its advanced adaptive version, Weighted Semi-RF, are proposed in the self-training framework using random forest models and Tri-training. Weighted Semi-RF differs from Semi-RF in that it is equipped with a new weighting scheme adapted to the current labeled data set. The proposed semi-supervised learning algorithms are practical, with parameter-free settings, for building an effective abbreviation identifier that identifies abbreviations automatically in clinical texts. Their effectiveness is confirmed by better Precision and F-measure values in various experiments on real Vietnamese clinical notes. Compared to existing solutions, our solution is novel for automatic abbreviation identification in clinical notes. Its results can lay the basis for determining the full form of each correctly identified abbreviation and thus enhance the readability of the records.
Keywords: Electronic medical record, Clinical note, Abbreviation identification, Semi-supervised learning, Self-training, Random forest.
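The self-training framework named above can be sketched generically: fit a model on the labeled data, label the unlabeled items it is confident about, add them to the training set, and repeat. The Python example below is not Semi-RF or Weighted Semi-RF. It swaps the random forest for a one-dimensional nearest-centroid classifier so that it stays short, and the confidence margin, the labels, and the data are all illustrative assumptions.

```python
from statistics import fmean


def fit_centroids(points, labels):
    """Return {label: mean of the 1-D points carrying that label}."""
    return {lab: fmean(p for p, l in zip(points, labels) if l == lab)
            for lab in set(labels)}


def predict_with_margin(centroids, x):
    """Return (nearest label, distance gap to the runner-up); needs >= 2 labels."""
    ranked = sorted(centroids, key=lambda lab: abs(x - centroids[lab]))
    best, second = ranked[0], ranked[1]
    return best, abs(x - centroids[second]) - abs(x - centroids[best])


def self_train(points, labels, unlabeled, min_margin=1.0, max_rounds=10):
    """Grow the labeled set with confident predictions until none remain."""
    points, labels, pool = list(points), list(labels), list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(points, labels)  # refit once per round
        still_unlabeled, accepted = [], 0
        for x in pool:
            label, margin = predict_with_margin(centroids, x)
            if margin >= min_margin:
                # Confident prediction: treat it as a pseudo-label.
                points.append(x)
                labels.append(label)
                accepted += 1
            else:
                still_unlabeled.append(x)
        pool = still_unlabeled
        if accepted == 0:
            break
    return fit_centroids(points, labels), pool


# Hypothetical data: two labeled points and five unlabeled ones.
centroids, leftover = self_train([0.0, 10.0], ["neg", "pos"],
                                 [1.0, 2.0, 9.0, 8.0, 5.0])
print(centroids, leftover)
```

Items that never reach the confidence margin stay unlabeled rather than being forced into a class, which limits how much a wrong pseudo-label can contaminate later rounds.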


After death, every individual host has its own altered microbiome configuration, and postmortem microbial communities change to reflect the attributes of death. The microbiome plays many roles in human health, which are usually studied through the exclusive lens of clinical interest. Postmortem microbiomes were collected by sampling five anatomical areas during routine death investigation in 188 cases, in order to predict the postmortem interval (PMI), location of death, and manner of death. Microbiome sequencing data are not easy to analyze and interpret because sequencing produces large multidimensional datasets. Machine learning methods can be used to overcome this analytical challenge. The two supervised machine learning methods employed here are Random Forest and the Bias-Variance Tradeoff. The Random Forest algorithm is applied to the training datasets; it makes predictions by choosing as its output the class that receives the most votes across the decision trees. The output is then checked for bias and variance error by the Bias-Variance Tradeoff algorithm, to help the supervised learning algorithm generalize beyond the training datasets. These two algorithms were chosen to obtain a prediction that is well fitted and accurate.
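The voting step described above, in which the forest outputs the class most of its trees predict, can be sketched in a few lines of Python. The per-tree predictions and the class labels below are hypothetical placeholders, not results from the study.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the label predicted by the largest number of trees."""
    if not tree_predictions:
        raise ValueError("need at least one tree prediction")
    # most_common sorts by count; on a tie, the label seen first wins.
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical predictions from five trees for one sample.
votes = ["class_a", "class_b", "class_a", "class_a", "class_b"]
forest_prediction = majority_vote(votes)
print(forest_prediction)
```

For regression targets such as the postmortem interval, a forest would typically average the trees' numeric outputs instead of counting votes.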

