Landslide Susceptibility Assessment Using an AutoML Framework

Author(s):  
Adrián G. Bruzón ◽  
Patricia Arrogante-Funes ◽  
Fátima Arrogante-Funes ◽  
Fidel Martín-González ◽  
Carlos J. Novillo ◽  
...  

The risks associated with landslides are causing growing personal losses and material damage in more and more areas of the world. These natural disasters are related to geological and extreme meteorological phenomena (e.g., earthquakes, hurricanes) occurring in regions that have already suffered similar natural catastrophes. Therefore, to mitigate landslide risk effectively, new methodologies must better identify and understand these hazards so that they can be properly managed. Among these methodologies, those based on assessing landslide susceptibility improve the prediction of the areas where one of these disasters is most likely to occur. In recent years, much research has used machine learning algorithms to assess susceptibility using different sources of information, such as remote sensing data, spatial databases, or geological catalogues. This study presents the first attempt to develop a methodology based on an automatic machine learning (AutoML) framework. These frameworks are intended to facilitate the development of machine learning models, with the aim of enabling researchers to focus on data analysis. The study area used for testing and validation is the central and southern region of Guerrero (Mexico), where we compare the performance of 16 machine learning algorithms. The best result is achieved by extra trees, with an area under the curve (AUC) of 0.983. This methodology yields better results than other similar methods because using an AutoML framework allows researchers to focus on the treatment of the data, to better understand the input variables, and to acquire greater knowledge about the processes involved in landslides.
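The AUC reported above can be read as the probability that a randomly chosen positive sample (here, a landslide location) receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that pairwise definition in plain Python; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the authors' code or data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: two non-landslide cells, two landslide cells.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The quadratic pairwise loop is chosen for readability; for large rasters a rank-based formulation or a library routine would be the practical choice.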

2021 ◽  
Vol 11 (9) ◽  
pp. 4251
Author(s):  
Jinsong Zhang ◽  
Shuai Zhang ◽  
Jianhua Zhang ◽  
Zhiliang Wang

In digital microfluidic experiments, droplet characteristics and flow patterns are generally identified and predicted by empirical methods, which make it difficult to mine large amounts of data. In addition, because human intervention is inevitable, inconsistent judgment standards make comparison between different experiments cumbersome and almost impossible. In this paper, we use machine learning to build algorithms that automatically identify, judge, and predict flow patterns and droplet characteristics, so that empirical judgment is turned into an intelligent process. Unlike the usual machine learning approaches, a generalized variable system is introduced to describe the different geometric configurations of digital microfluidics. Specifically, Buckingham's π theorem is adopted to obtain multiple groups of dimensionless numbers as the input variables of the machine learning algorithms. Verification shows that the SVM and BPNN algorithms successfully classify and predict the different flow patterns and droplet characteristics (length and frequency). Compared with the primitive parameter system, the dimensionless number system has superior predictive capability. The traditional dimensionless numbers selected for the machine learning algorithms should have strong physical meaning rather than purely mathematical meaning. Using dimensionless numbers reduces the dimensionality of the system and the amount of computation without losing the information carried by the primitive parameters.
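To illustrate how dimensionless groups reduce dimensionality, the sketch below maps five primitive parameters to three classical dimensionless numbers. The abstract does not say which groups the authors used; Reynolds, capillary, and Weber numbers are assumed here only because they are standard in droplet flows, and the function name and fluid values are hypothetical.

```python
def dimensionless_groups(rho, mu, sigma, velocity, length):
    """Map primitive parameters to classical dimensionless numbers.

    rho      : fluid density [kg/m^3]
    mu       : dynamic viscosity [Pa*s]
    sigma    : interfacial tension [N/m]
    velocity : characteristic velocity [m/s]
    length   : characteristic length, e.g. channel width [m]
    """
    reynolds = rho * velocity * length / mu          # inertia vs. viscosity
    capillary = mu * velocity / sigma                # viscosity vs. surface tension
    weber = rho * velocity ** 2 * length / sigma     # inertia vs. surface tension
    return {"Re": reynolds, "Ca": capillary, "We": weber}


# Hypothetical water-like fluid in a 100-micrometre channel.
features = dimensionless_groups(
    rho=1000.0, mu=1e-3, sigma=0.072, velocity=0.01, length=1e-4
)
```

Because We = Re · Ca, only two of these three groups are independent, which is the kind of redundancy the π theorem makes explicit when selecting model inputs.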


2021 ◽  
Vol 10 (2) ◽  
pp. 58
Author(s):  
Muhammad Fawad Akbar Khan ◽  
Khan Muhammad ◽  
Shahid Bashir ◽  
Shahab Ud Din ◽  
Muhammad Hanif

Low-resolution Geological Survey of Pakistan (GSP) maps surrounding the region of interest show oolitic and fossiliferous limestone occurrences in the Samanasuk, Lockhart, and Margalla hill formations, respectively, in the Hazara division, Pakistan. Machine-learning algorithms (MLAs) have rarely been applied to multispectral remote sensing data to differentiate between limestone formations formed in different depositional environments, such as oolitic or fossiliferous. Unlike previous studies, which mostly report MLA-based lithological classification of rock types with different chemical compositions, this paper investigates the potential of MLAs for mapping subclasses within the same lithology, i.e., limestone. Additionally, the selection of appropriate data labels, training algorithms, hyperparameters, and remote sensing data sources was investigated while applying these MLAs. First, the oolitic (Samanasuk) and fossiliferous (Lockhart and Margalla) limestone-bearing formations, along with the adjoining Hazara formation, were mapped using random forest (RF), support vector machine (SVM), classification and regression tree (CART), and naïve Bayes (NB) MLAs. The RF algorithm reported the best accuracy of 83.28% and a Kappa coefficient of 0.78. To further improve the targeted allochemical limestone formation map, annotation labels were generated by fusing maps obtained from principal component analysis (PCA), decorrelation stretching (DS), and X-means clustering applied to ASTER-L1T, Landsat-8, and Sentinel-2 datasets. These labels were used to train and validate SVM, CART, NB, and RF MLAs to obtain a binary classification map of limestone occurrences in the Hazara division, Pakistan, using the Google Earth Engine (GEE) platform. The classification of Landsat-8 data by CART reported 99.63% accuracy, with a Kappa coefficient of 0.99, and was in good agreement with the field validation. This binary limestone map was further classified into oolitic (Samanasuk) and fossiliferous (Lockhart and Margalla) formations by all four MLAs; in this case, RF surpassed all the other algorithms with an improved accuracy of 96.36%. This improvement can be attributed to better annotation, resulting in a binary limestone classification map, which formed a mask for improved classification of oolitic and fossiliferous limestone in the area.
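The Kappa coefficient quoted alongside accuracy corrects observed agreement for the agreement expected by chance. A minimal sketch of Cohen's kappa computed from a confusion matrix follows; the function name and the 2x2 matrix are hypothetical and are not taken from the study.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal (overall accuracy).
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    # Chance agreement: product of row and column marginals, summed per class.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1.0 - expected)


# Hypothetical binary map: rows = reference class, columns = predicted class.
toy_confusion = [[20, 5],
                 [10, 15]]
toy_kappa = cohen_kappa(toy_confusion)  # accuracy is 0.70 here, kappa is lower
```

In this toy case the accuracy is 0.70 while kappa is about 0.40, which shows why the two figures are usually reported together.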


Information ◽  
2022 ◽  
Vol 13 (1) ◽  
pp. 35
Author(s):  
Jibouni Ayoub ◽  
Dounia Lotfi ◽  
Ahmed Hammouch

The analysis of social networks has attracted a lot of attention during the last two decades. These networks are dynamic: links appear and disappear over time. Link prediction is the problem of inferring which links will appear in the future from the current state of the network. We use information from nodes and edges to calculate the similarity between users. The more similar two users are, the higher the probability that they will connect in the future. Similarity metrics therefore play an important role in the link prediction field. Because such metrics are simple and flexible, many authors have proposed metrics such as Jaccard, Adamic-Adar (AA), and Katz and evaluated them using the area under the curve (AUC). In this paper, we propose a new parameterized method to enhance the AUC value of link prediction metrics by combining them with the mean received resources (MRRs). Experiments show that the proposed method improves the performance of the state-of-the-art metrics. Moreover, we used machine learning algorithms to classify links and confirm the efficiency of the proposed combination.
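As a rough illustration of the neighbourhood-based similarity metrics named above, the sketch below computes Jaccard and Adamic-Adar scores on a small undirected graph stored as a dictionary of neighbour sets. The graph, node names, and function names are hypothetical; the MRR combination proposed in the paper is not specified in the abstract and is not reproduced here.

```python
import math


def jaccard(graph, u, v):
    """Shared neighbours divided by all neighbours of u or v."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0


def adamic_adar(graph, u, v):
    """Sum over common neighbours w of 1 / log(degree(w)).

    Low-degree common neighbours contribute more than hubs.
    The degree check guards against log(1) = 0 in degenerate inputs.
    """
    return sum(
        1.0 / math.log(len(graph[w]))
        for w in graph[u] & graph[v]
        if len(graph[w]) > 1
    )


# Hypothetical undirected network: node -> set of neighbours.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}

# Score the currently unlinked pair (a, d) as a candidate future link.
score_jaccard = jaccard(toy_graph, "a", "d")
score_aa = adamic_adar(toy_graph, "a", "d")
```

Ranking all unlinked pairs by such a score and comparing against links that later appear is what allows the metrics to be evaluated with AUC.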


Drones ◽  
2020 ◽  
Vol 4 (2) ◽  
pp. 21 ◽  
Author(s):  
Francisco Rodríguez-Puerta ◽  
Rafael Alonso Ponce ◽  
Fernando Pérez-Rodríguez ◽  
Beatriz Águeda ◽  
Saray Martín-García ◽  
...  

Controlling vegetation fuels around human settlements is a crucial strategy for reducing fire severity in forests, buildings and infrastructure, as well as protecting human lives. Each country has its own regulations in this respect, but they all share the principle that reducing fuel load in turn reduces the intensity and severity of the fire. Data acquired by Unmanned Aerial Vehicles (UAVs), combined with other passive and active remote sensing data and processed with machine learning algorithms, offer the best performance for planning Wildland-Urban Interface (WUI) fuelbreaks. Nine remote sensing data sources (active and passive) and four supervised classification algorithms (Random Forest, Linear and Radial Support Vector Machine and Artificial Neural Networks) were tested to classify five fuel-area types. We used very high-density Light Detection and Ranging (LiDAR) data acquired by UAV (154 returns·m−2 and ortho-mosaic of 5-cm pixels), multispectral data from the satellites Pleiades-1B and Sentinel-2, and low-density LiDAR data acquired by Airborne Laser Scanning (ALS) (0.5 returns·m−2, ortho-mosaic of 25-cm pixels). Through the Variable Selection Using Random Forest (VSURF) procedure, a pre-selection of final variables was carried out to train the model. The four algorithms were compared, and it was concluded that the differences among them in overall accuracy (OA) on training datasets were negligible. Although the highest accuracy in the training step was obtained with SVML (OA=94.46%) and in testing with ANN (OA=91.91%), Random Forest was considered to be the most reliable algorithm, since it produced more consistent predictions, with smaller differences between training and testing performance. Using a combination of Sentinel-2 and the two LiDAR datasets (UAV and ALS), Random Forest obtained an OA of 90.66% on the training dataset and 91.80% on the testing dataset. The differences in accuracy between the data sources used are much greater than those between algorithms. LiDAR growth metrics calculated from point clouds acquired on different dates and multispectral information from different seasons of the year are the most important variables in the classification. Our results support the essential role of UAVs in fuelbreak planning and management and thus in the prevention of forest fires.


2019 ◽  
Vol 19 (03) ◽  
pp. 1950014
Author(s):  
ALFREDO ARANDA ◽  
ALVARO VALENCIA

Fluid-mechanical and morphological parameters are recognized as major factors in the rupture risk of human aneurysms, and many machine learning tools are now available to study problems of this kind. In this work, fluid–structure interaction (FSI) simulations were carried out on a database of 60 real saccular cerebral aneurysms (30 ruptured and 30 unruptured) reconstructed from angiography images. Using the results of the simulations and geometric analyses, we applied the analysis of variance (ANOVA) statistical test to many variables and found that aspect ratio (AR), bottleneck factor (BNF), maximum height of the aneurysm (MH), relative residence time (RRT), Womersley number (WN) and Von Mises strain (VMS) are statistically significant and good predictors for the models. Consequently, these variables were used in five machine learning algorithms to predict the rupture risk of the aneurysms; adaptive boosting (AdaBoost) achieved the highest area under the receiver operating characteristic (ROC) curve (AUC 0.944).
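The ANOVA screening step described above compares the variation between groups (for example, ruptured versus unruptured) with the variation within them. The following is a minimal sketch of the one-way ANOVA F statistic in plain Python; the function name and the sample values are hypothetical, and the conversion of F to a p-value (which needs the F distribution) is left out.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of numeric samples."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: how far group means sit from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Within-group sum of squares: scatter of samples around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical values of one variable for two groups of aneurysms.
unruptured = [1.0, 2.0, 3.0]
ruptured = [2.0, 3.0, 4.0]
f_value = anova_f([unruptured, ruptured])
```

A variable with a large F value (and a correspondingly small p-value) separates the groups well and is a natural candidate input for the classifiers.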


2021 ◽  
Vol 12 (2) ◽  
pp. 857-876
Author(s):  
Sk Ajim Ali ◽  
Farhana Parvin ◽  
Jana Vojteková ◽  
Romulus Costache ◽  
Nguyen Thi Thuy Linh ◽  
...  

Cancers ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 3010
Author(s):  
Johannes Uhlig ◽  
Andreas Leha ◽  
Laura M. Delonge ◽  
Anna-Maria Haack ◽  
Brian Shuch ◽  
...  

This study evaluates the diagnostic performance of radiomic features and machine learning algorithms for renal tumor subtype assessment in venous computed tomography (CT) studies from clinical routine. Patients undergoing surgical resection and histopathological assessment of renal tumors at a tertiary referral center between 2012 and 2019 were included. Preoperative venous-phase CTs from multiple referring imaging centers were segmented, and standardized radiomic features were extracted. After preprocessing, class imbalance handling, and feature selection, machine learning algorithms were used to predict renal tumor subtypes using 10-fold cross validation, assessed as multiclass area under the curve (AUC). In total, n = 201 patients were included (73.7% male; mean age 66 ± 11 years), with n = 131 clear cell renal cell carcinomas (ccRCC), n = 29 papillary RCC, n = 11 chromophobe RCC, n = 16 oncocytomas, and n = 14 angiomyolipomas (AML). An extreme gradient boosting algorithm demonstrated the highest accuracy (multiclass AUC = 0.72). The worst discrimination was evident for oncocytomas vs. AML and oncocytomas vs. chromophobe RCC (AUC = 0.55 and AUC = 0.45, respectively). In sensitivity analyses excluding oncocytomas, a random forest algorithm showed the highest accuracy, with multiclass AUC = 0.78. Radiomic feature analyses from venous-phase CT acquired in clinical practice with subsequent machine learning can discriminate renal tumor subtypes with moderate accuracy. Oncocytomas appear to be the most difficult subtype to classify, with the lowest accuracy.


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. 2581-2581 ◽  
Author(s):  
Paul Johannet ◽  
Nicolas Coudray ◽  
George Jour ◽  
Douglas MacArthur Donnelly ◽  
Shirin Bajaj ◽  
...  

Background: There is growing interest in optimizing patient selection for treatment with immune checkpoint inhibitors (ICIs). We postulate that phenotypic features present in metastatic melanoma tissue reflect the biology of tumor cells, immune cells, and stromal tissue, and hence can provide predictive information about tumor behavior. Here, we test the hypothesis that machine learning algorithms can be trained to predict the likelihood of response and/or toxicity to ICIs. Methods: We examined 124 stage III/IV melanoma patients who received anti-CTLA-4 (n = 81), anti-PD-1 (n = 25), or combination (n = 18) therapy as first line. The tissue analyzed was resected before treatment with ICIs. In total, 340 H&E slides were digitized and annotated for three regions of interest: tumor, lymphocytes, and stroma. The slides were then partitioned into training (n = 285), validation (n = 26), and test (n = 29) sets. Slides were tiled (299x299 pixels) at 20X magnification. We trained a deep convolutional neural network (DCNN) to automatically segment the images into each of the three regions and then deconstruct images into their component features to detect non-obvious patterns with objectivity and reproducibility. We then trained the DCNN for two classifications: 1) complete/partial response versus progression of disease (POD), and 2) severe versus no immune-related adverse events (irAEs). Predictive accuracy was estimated by area under the curve (AUC) of receiver operating characteristics (ROC). Results: The DCNN identified tumor within LN with AUC 0.987 and within ST with AUC 0.943. Prediction of POD based on ST-only always performed better than prediction based on LN-only (AUC 0.84 compared to 0.61, respectively). The DCNN had an average AUC 0.69 when analyzing only tumor regions from both LN and ST data sets and AUC 0.68 when analyzing tumor and lymphocyte regions. Severe irAEs were predicted with limited accuracy (AUC 0.53).
Conclusions: Our results support the potential application of machine learning on pre-treatment histologic slides to predict response to ICIs. They also reveal its limited value in predicting toxicity. We are currently investigating whether the predictive capability of the algorithm can be further improved by incorporating additional immunologic biomarkers.
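The tiling step in the Methods can be pictured as cutting each digitized slide into a grid of 299x299-pixel patches before they are fed to the network. The sketch below only computes the top-left corner of each tile; it assumes non-overlapping tiles and drops partial tiles at the right and bottom edges, neither of which is stated in the abstract. The function name and slide dimensions are hypothetical.

```python
def tile_origins(width, height, tile=299):
    """Top-left (x, y) corners of non-overlapping tile x tile patches.

    Assumption: partial tiles at the right and bottom edges are discarded.
    """
    return [
        (x, y)
        for y in range(0, height - tile + 1, tile)
        for x in range(0, width - tile + 1, tile)
    ]


# Hypothetical small image region of 600 x 300 pixels: two full tiles fit.
origins = tile_origins(600, 300)
```

Each origin can then be used to crop one patch from the slide image; whether tiles overlap, and how background-only tiles are filtered, are design choices the abstract leaves open.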

