Geologic provinces from unsupervised learning: synthetic experiments in clustering of localized topography/gravity admittance and correlation spectra

Mapping Intimacies ◽

10.5194/egusphere-egu21-10092 ◽

2021 ◽

Author(s):

Alberto Pastorutti ◽

Carla Braitenberg

Keyword(s):

Heat Flow ◽

Spherical Harmonics ◽

Transfer Functions ◽

Flexural Rigidity ◽

Ground Truth ◽

Spatial Localization ◽

Data Set ◽

Mantle Dynamics ◽

External Data ◽

Clustering Model

Partitioning of the Earth surface in "provinces": tectonic domains, outcropping geological units, crustal types, discrete classes extracted from age or geophysical data (e.g. tomography, gravity) is often employed to perform data imputation of ill-sampled observables (e.g. the similarity-based NGHF surface heat flow map [1]) or to constrain the parameters of ill-posed inverse problems (e.g. the gravimetric global Moho model GEMMA [2]).We define provinces as noncontiguous areas where quantities or their relationships are similar. Following the goodness metric employed for proxy observables, an adequate province model should be able to significantly improve prediction of the extrapolated quantity. Interpolation of a quantity with no reliance on external data sets a predictivity benchmark, which a province-based prediction should exceed. In a solid Earth modelling perspective, gravity, topography, and their relationship, seem ideal candidates to constrain a province clustering model. Earth gravity and topography, at resolutions of at least 100 km, are known with an incomparable sampling uniformity and negligible error, respect to other observables.Most of the observed topography-gravity relationship can be explained by regional isostatic compensation. The topography, representing the load exerted on the lithosphere, is compensated by the elastic, thin-shell like response of the latter. In the spectral domain, flexure results in a lowpass transfer function between topography and isostatic roots. The signal of both surfaces, superimposed, is observed in the gravity field. However, reality shows significant shifts from the ideal case: the separation of nonisostatic effects [3], such as density inhomogeneities, glacial isostatic adjustments, dynamic mantle processes, is nontrivial. Acknowledging this superposition, we aim at identifying clusters of similar topography-gravity transfer functions.We evaluate the transfer functions, in the form of admittance and correlation [4], in the spherical harmonics domain. Spatial localization is achieved with the method by Wieczorek and Simons [5], using SHTOOLS [6]. Admittance and correlation spectra are computed on a set of regularly spaced sample points, each point being representative of the topo-gravity relationship in its proximity. The coefficients of the localized topo-gravity admittance and correlation spectra constitute each point features.We present a set of experiments performed on synthetic models, in which we can control the variations of elastic parameters and non-isostatic contributions. These tests allowed to define both the feature extraction segment: the spatial localization method and the range of spherical harmonics degrees which are more sensible to lateral variations in flexural rigidity; and the clustering segment: metrics of the ground-truth clusters, performance of dimensionality reduction methods and of different clustering models.[1] Lucazeau (2019). Analysis and Mapping of an Updated Terrestrial Heat Flow Data Set. doi:10.1029/2019GC008389 [2] Reguzzoni and Sampietro (2015). GEMMA: An Earth crustal model based on GOCE satellite data. doi:10.1016/j.jag.2014.04.002 [3] Bagherbandi and Sj&#246;berg (2013). Improving gravimetric&#8211;isostatic models of crustal depth by correcting for non-isostatic effects and using CRUST2.0. doi:10.1016/j.earscirev.2012.12.002 [4] Simons et al. (1997). Localization of gravity and topography: Constraints on the tectonics and mantle dynamics of Venus. doi:10.1111/j.1365-246X.1997.tb00593.x [5] Wieczorek and Simons (2005). Localized spectral analysis on the sphere. doi:10.1111/j.1365-246X.2005.02687.x [6] Wieczorek and Meschede (2018). SHTools: Tools for Working with Spherical Harmonics. doi:10.1029/2018GC007529

Download Full-text

A new database structure for the IHFC Global Heat Flow Database

International Journal of Terrestrial Heat Flow and Applications ◽

10.31214/ijthfa.v4i1.62 ◽

2021 ◽

Vol 4 (1) ◽

pp. 1-14

Author(s):

Sven Fuchs ◽

Graeme Beardsmore ◽

Paolo Chiozzi ◽

Orlando Miguel Espinoza-Ojeda ◽

Gianluca Gola ◽

...

Keyword(s):

Heat Flow ◽

International Association ◽

Open Data ◽

Flow Data ◽

Ocean Drilling Program ◽

Data Set ◽

Data Services ◽

External Data ◽

Database Structure ◽

Starting Point

Periodic revisions of the Global Heat Flow Database (GHFD) take place under the auspices of the International Heat Flow Commission (IHFC) of the International Association of Seismology and Physics of the Earth's Interior (IASPEI). A growing number of heat-flow values, advances in scientific methods, digitization, and improvements in database technologies all warrant a revision of the structure of the GHFD that was last amended in 1976. We present a new structure for the GHFD, which will provide a basis for a reassessment and revision of the existing global heat-flow data set. The database fields within the new structure are described in detail to ensure a common understanding of the respective database entries. The new structure of the database takes advantage of today's possibilities for data management. It supports FAIR and open data principles, including interoperability with external data services, and links to DOI and IGSN numbers and other data resources (e.g., world geological map, world stratigraphic system, and International Ocean Drilling Program data). Aligned with this publication, a restructured version of the existing database is published, which provides a starting point for the upcoming collaborative process of data screening, quality control and revision. In parallel, the IHFC will work on criteria for a new quality scheme that will allow future users of the database to evaluate the quality of the collated heat-flow data based on specific criteria.

Download Full-text

Fig Plant Segmentation from Aerial Images Using a Deep Convolutional Encoder-Decoder Network

Remote Sensing ◽

10.3390/rs11101157 ◽

2019 ◽

Vol 11 (10) ◽

pp. 1157 ◽

Cited By ~ 8

Author(s):

Jorge Fuentes-Pacheco ◽

Juan Torres-Olivares ◽

Edgar Roman-Rangel ◽

Salvador Cervantes ◽

Porfirio Juarez-Lopez ◽

...

Keyword(s):

Precision Agriculture ◽

Image Data ◽

Ground Truth ◽

Aerial Images ◽

Aerial Image ◽

Data Set ◽

Visual Appearance ◽

Aerial Robots ◽

Lighting Conditions ◽

Convolutional Encoder

Crop segmentation is an important task in Precision Agriculture, where the use of aerial robots with an on-board camera has contributed to the development of new solution alternatives. We address the problem of fig plant segmentation in top-view RGB (Red-Green-Blue) images of a crop grown under open-field difficult circumstances of complex lighting conditions and non-ideal crop maintenance practices defined by local farmers. We present a Convolutional Neural Network (CNN) with an encoder-decoder architecture that classifies each pixel as crop or non-crop using only raw colour images as input. Our approach achieves a mean accuracy of 93.85% despite the complexity of the background and a highly variable visual appearance of the leaves. We make available our CNN code to the research community, as well as the aerial image data set and a hand-made ground truth segmentation with pixel precision to facilitate the comparison among different algorithms.

Download Full-text

Quantifying the structure of strong gravitational lens potentials with uncertainty-aware deep neural networks

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/staa3201 ◽

2020 ◽

Vol 499 (4) ◽

pp. 5641-5652

Author(s):

Georgios Vernardos ◽

Grigorios Tsagkatakis ◽

Yannis Pantazis

Keyword(s):

Confidence Intervals ◽

Galaxy Evolution ◽

Gravitational Lensing ◽

Probability Distributions ◽

Mass Density ◽

Ground Truth ◽

Gaussian Random Fields ◽

Training Data ◽

Gravitational Lens ◽

Data Set

ABSTRACT Gravitational lensing is a powerful tool for constraining substructure in the mass distribution of galaxies, be it from the presence of dark matter sub-haloes or due to physical mechanisms affecting the baryons throughout galaxy evolution. Such substructure is hard to model and is either ignored by traditional, smooth modelling, approaches, or treated as well-localized massive perturbers. In this work, we propose a deep learning approach to quantify the statistical properties of such perturbations directly from images, where only the extended lensed source features within a mask are considered, without the need of any lens modelling. Our training data consist of mock lensed images assuming perturbing Gaussian Random Fields permeating the smooth overall lens potential, and, for the first time, using images of real galaxies as the lensed source. We employ a novel deep neural network that can handle arbitrary uncertainty intervals associated with the training data set labels as input, provides probability distributions as output, and adopts a composite loss function. The method succeeds not only in accurately estimating the actual parameter values, but also reduces the predicted confidence intervals by 10 per cent in an unsupervised manner, i.e. without having access to the actual ground truth values. Our results are invariant to the inherent degeneracy between mass perturbations in the lens and complex brightness profiles for the source. Hence, we can quantitatively and robustly quantify the smoothness of the mass density of thousands of lenses, including confidence intervals, and provide a consistent ranking for follow-up science.

Download Full-text

Metastatic heterogeneity of the consensus molecular subtypes of colorectal cancer

npj Genomic Medicine ◽

10.1038/s41525-021-00223-7 ◽

2021 ◽

Vol 6 (1) ◽

Author(s):

Peter W. Eide ◽

Seyed H. Moosavi ◽

Ina A. Eilertsen ◽

Tuva H. Brunsell ◽

Jonas Langerud ◽

...

Keyword(s):

Gene Expression ◽

Colorectal Cancer ◽

Principal Components ◽

Prognostic Value ◽

Tumor Heterogeneity ◽

Molecular Subtypes ◽

R Package ◽

Data Set ◽

Primary Tumors ◽

External Data

AbstractGene expression-based subtypes of colorectal cancer have clinical relevance, but the representativeness of primary tumors and the consensus molecular subtypes (CMS) for metastatic cancers is not well known. We investigated the metastatic heterogeneity of CMS. The best approach to subtype translation was delineated by comparisons of transcriptomic profiles from 317 primary tumors and 295 liver metastases, including multi-metastatic samples from 45 patients and 14 primary-metastasis sets. Associations were validated in an external data set (n = 618). Projection of metastases onto principal components of primary tumors showed that metastases were depleted of CMS1-immune/CMS3-metabolic signals, enriched for CMS4-mesenchymal/stromal signals, and heavily influenced by the microenvironment. The tailored CMS classifier (available in an updated version of the R package CMScaller) therefore implemented an approach to regress out the liver tissue background. The majority of classified metastases were either CMS2 or CMS4. Nonetheless, subtype switching and inter-metastatic CMS heterogeneity were frequent and increased with sampling intensity. Poor-prognostic value of CMS1/3 metastases was consistent in the context of intra-patient tumor heterogeneity.

Download Full-text

A hybrid regularization scheme for the inversion of magnetotelluric data from natural and controlled sources to layer and distortion parameters

Geophysics ◽

10.1190/geo2012-0018.1 ◽

2012 ◽

Vol 77 (4) ◽

pp. E301-E315 ◽

Cited By ~ 10

Author(s):

Thomas Kalscheuer ◽

Juliane Hübert ◽

Alexey Kuvshinov ◽

Tobias Lochbühler ◽

Laust B. Pedersen

Keyword(s):

Transfer Functions ◽

Homogeneous Distribution ◽

Stable Configuration ◽

Inversion Algorithm ◽

Magnetotelluric Data ◽

Data Set ◽

Near Surface ◽

Minimum Solution ◽

Magnetic Transfer ◽

Distortion Parameters

Magnetotelluric (MT), radiomagnetotelluric (RMT), and, in particular, controlled-source audiomagnetotelluric (CSAMT) data are often heavily distorted by near-surface inhomogeneities. We developed a novel scheme to invert MT, RMT, and CSAMT data in the form of scalar or tensorial impedances and vertical magnetic transfer functions simultaneously for layer resistivities and electric and magnetic galvanic distortion parameters. The inversion scheme uses smoothness constraints to regularize layer resistivities and either Marquardt-Levenberg damping or the minimum-solution length criterion to regularize distortion parameters. A depth of investigation range is estimated by comparing layered model sections derived from first- and second-order smoothness constraints. Synthetic examples demonstrate that earth models are reconstructed properly for distorted and undistorted tensorial CSAMT data. In the inversion of scalar CSAMT data, such as the determinant impedance or individual tensor elements, the reduced number of transfer functions inevitably leads to increased ambiguity for distortion parameters. As a consequence of this ambiguity for scalar data, distortion parameters often grow over the iterations to unrealistic absolute values when regularized with the Marquardt-Levenberg scheme. Essentially, compensating relationships between terms containing electric and/or magnetic distortion are used in this growth. In a regularization with the minimum solution length criterion, the distortion parameters converge into a stable configuration after several iterations and attain reasonable values. The inversion algorithm was applied to a CSAMT field data set collected along a profile over a tunnel construction site at Hallandsåsen, Sweden. To avoid erroneous inverse models from strong anthropogenic effects on the data, two scalar transfer functions (one scalar impedance and one scalar vertical magnetic transfer function) were selected for inversion. Compared with a regularization of distortion parameters with the Marquardt-Levenberg method, the minimum-solution length criterion yielded smaller absolute values of distortion parameters and a horizontally more homogeneous distribution of electrical conductivity.

Download Full-text

Learnings From Strain Measurements on an In-Field Conductor and Wellhead System

Volume 3: Structures, Safety, and Reliability ◽

10.1115/omae2018-78521 ◽

2018 ◽

Author(s):

Rohit Shankaran ◽

Alexander Rimmer ◽

Alan Haig

Keyword(s):

Fatigue Damage ◽

Strain Gauge ◽

Transfer Functions ◽

Fatigue Loading ◽

Accurate Method ◽

Environmental Data ◽

Analytical Methodology ◽

Data Set ◽

Soil Model ◽

Drilling Operations

In recent years due to use of drilling risers with larger and heavier BOP/LMRP stacks, fatigue loading on subsea wellheads has increased, which poses potential restrictions on the duration of drilling operations. In order to track wellhead and conductor fatigue capacity consumption to support safe drilling operations a range of methods have been applied: • Analytical riser model and measured environmental data; • BOP motion measurement and transfer functions; • Strain gauge data. Strain gauge monitoring is considered the most accurate method for measuring fatigue capacity consumption. To compare the three approaches and establish recommendations for an optimal approach and method to establish fatigue accumulation of the wellhead, a monitoring data set is obtained on a well offshore West of Shetland. This paper presents an analysis of measured strain, motions and analytical predictions with the objective of better understanding the accuracy, limitations, or conservatism in each of the three methods defined above. Of the various parameters that affect the accuracy of the fatigue damage estimates, the paper identifies that the selection of analytical conductor-soil model is critical to narrowing the gap between fatigue life predictions from the different approaches. The work presented here presents the influence of alternative approaches to model conductor-soil interaction than the traditionally used API soil model. Overall, the paper presents the monitoring equipment and analytical methodology to advance the accuracy of wellhead fatigue damage measurements.

Download Full-text

Autonomous feature type selection based on environment using expectation maximization in self-localization

International Journal of Advanced Robotic Systems ◽

10.1177/1729881418814701 ◽

2018 ◽

Vol 15 (6) ◽

pp. 172988141881470

Author(s):

Nezih Ergin Özkucur ◽

H Levent Akın

Keyword(s):

Expectation Maximization ◽

Autonomous Robots ◽

Expectation Maximization Algorithm ◽

Sensory Information ◽

Ground Truth ◽

Local Environment ◽

Localization Algorithm ◽

Data Set ◽

Public Data ◽

The Individual

Self-localization in autonomous robots is one of the fundamental issues in the development of intelligent robots, and processing of raw sensory information into useful features is an integral part of this problem. In a typical scenario, there are several choices for the feature extraction algorithm, and each has its weaknesses and strengths depending on the characteristics of the environment. In this work, we introduce a localization algorithm that is capable of capturing the quality of a feature type based on the local environment and makes soft selection of feature types throughout different regions. A batch expectation–maximization algorithm is developed for both discrete and Monte Carlo localization models, exploiting the probabilistic pose estimations of the robot without requiring ground truth poses and also considering different observation types as blackbox algorithms. We tested our method in simulations, data collected from an indoor environment with a custom robot platform and a public data set. The results are compared with the individual feature types as well as naive fusion strategy.

Download Full-text

Value of External Data in the Extrapolation of Survival Data: A Study Using the NJR Data Set

Value in Health ◽

10.1016/j.jval.2017.12.023 ◽

2018 ◽

Vol 21 (7) ◽

pp. 822-829 ◽

Cited By ~ 2

Author(s):

Mark Pennington ◽

Richard Grieve ◽

Jan Van der Meulen ◽

Neil Hawkins

Keyword(s):

Survival Data ◽

Data Set ◽

External Data

Download Full-text

Using Model Performance to Assess the Representativeness of Data for Model Development and Calibration in Financial Institutions

Risks ◽

10.3390/risks9110204 ◽

2021 ◽

Vol 9 (11) ◽

pp. 204

Author(s):

Chamay Kruger ◽

Willem Daniel Schutte ◽

Tanja Verster

Keyword(s):

Case Studies ◽

Model Development ◽

Model Performance ◽

Data Set ◽

Pooled Data ◽

External Data ◽

First Case ◽

Multivariate Prediction ◽

Credit Data

This paper proposes a methodology that utilises model performance as a metric to assess the representativeness of external or pooled data when it is used by banks in regulatory model development and calibration. There is currently no formal methodology to assess representativeness. The paper provides a review of existing regulatory literature on the requirements of assessing representativeness and emphasises that both qualitative and quantitative aspects need to be considered. We present a novel methodology and apply it to two case studies. We compared our methodology with the Multivariate Prediction Accuracy Index. The first case study investigates whether a pooled data source from Global Credit Data (GCD) is representative when considering the enrichment of internal data with pooled data in the development of a regulatory loss given default (LGD) model. The second case study differs from the first by illustrating which other countries in the pooled data set could be representative when enriching internal data during the development of a LGD model. Using these case studies as examples, our proposed methodology provides users with a generalised framework to identify subsets of the external data that are representative of their Country’s or bank’s data, making the results general and universally applicable.

Download Full-text

Molecular profile reveals immune-associated markers of medulloblastoma for different subtypes

10.21203/rs.3.rs-816357/v2 ◽

2021 ◽

Author(s):

shenglan li ◽

Zhuang Kang ◽

jinyi Chen ◽

Can Wang ◽

Zehao Cai ◽

...

Keyword(s):

Immune Cell ◽

Intracranial Tumor ◽

Cell Infiltration ◽

Molecular Profile ◽

Immune Cell Infiltration ◽

Ppi Network ◽

Hub Genes ◽

Data Set ◽

Hub Gene ◽

External Data

Abstract Background Medulloblastoma is a common intracranial tumor among children. In recent years, research on cancer genome has established four distinct subtypes of medulloblastoma: WNT, SHH, Group3, and Group4. Each subtype has its own transcriptional profile, methylation changes, and different clinical outcomes. Treatment and prognosis also vary depending on the subtype. Methods Based on the methylation data of medulloblastoma samples, methylCIBERSORT was used to evaluate the level of immune cell infiltration in medulloblastoma samples and identified 10 kinds of immune cells with different subtypes. Combined with the immune database, 293 Imm-DEGs were screened. Imm-DEGs were used to construct the co-expression network, and the key modules related to the level of differential immune cell infiltration were identified. Three immune hub genes (GAB1, ABL1, CXCR4) were identified according to the gene connectivity and the correlation with phenotype in the key modules, as well as the PPI network involved in the genes in the modules. Results The subtype marker was recognized according to the immune hub, and the subtype marker was verified in the external data set, the methylation level of immune hub gene among different subtypes was compared and analyzed, at the same time, tissue microarray was used for immunohistochemical verification, and a multi-factor regulatory network of hub gene was constructed. Conclusions Identifying subtype marker is helpful to accurately identify the subtypes of medulloblastoma patients, and can accurately evaluate the treatment and prognosis, so as to improve the overall survival of patients.

Download Full-text