Application of Supervised Machine-Learning Methods for Attesting Provenance in Catalan Traditional Pottery Industry

The traditional pottery industry was an important activity in Catalonia (NE Spain) up to the 20th century. However, nowadays only few workshops persist in small villages were the activity is promoted as a touristic attraction. The preservation and promotion of traditional pottery in Catalonia is part of an ongoing strategy of tourism diversification that is revitalizing the sector. The production of authenticable local pottery handicrafts aims at attracting cultivated and high-purchasing power tourists. The present paper inspects several approaches to set up a scientific protocol based on the chemical composition of both raw materials and pottery. These could be used to develop a seal of quality and provenance to regulate the sector. Six Catalan villages with a renowned tradition of local pottery production have been selected. The chemical composition of their clays and the corresponding fired products has been obtained by Energy dispersive X-ray fluorescence (EDXRF). Using the obtained geochemical dataset, a number of unsupervised and supervised machine learning methods have been applied to test their applicability to define geochemical fingerprints that could allow inter-site discrimination. The unsupervised approach fails to distinguish samples from different provenances. These methods are only roughly able to divide the different provenances in two large groups defined by their different SiO2 and CaCO3 concentrations. In contrast, almost all the tested supervised methods allow inter-site discrimination with accuracy levels above 80%, and accuracies above 85% were obtained using a meta-model combining all the predictive supervised methods. The obtained results can be taken as encouraging and demonstrative of the potential of the supervised approach as a way to define geochemical fingerprints to track or attest the provenance of samples.

Download Full-text

FRI0046 PHARMACOGENOMICS-DRIVEN INDIVIDUALIZED PREDICTION OF TREATMENT RESPONSE TO METHOTREXATE IN PATIENTS WITH RHEUMATOID ARTHRITIS: A MACHINE LEARNING APPROACH

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2020-eular.4993 ◽

2020 ◽

Vol 79 (Suppl 1) ◽

pp. 598.2-598

Author(s):

E. Myasoedova ◽

A. Athreya ◽

C. S. Crowson ◽

R. Weinshilboum ◽

L. Wang ◽

...

Keyword(s):

Rheumatoid Arthritis ◽

Machine Learning ◽

Supervised Machine Learning ◽

Research Support ◽

Eular Response ◽

Learning Methods ◽

Machine Learning Methods ◽

Early Ra ◽

Genome Wide

Background:Methotrexate (MTX) is the most common anchor drug for rheumatoid arthritis (RA), but the risk of missing the opportunity for early effective treatment with alternative medications is substantial given the delayed onset of MTX action and 30-40% inadequate response rate. There is a compelling need to accurately predicting MTX response prior to treatment initiation, which allows for effectively identifying patients at RA onset who are likely to respond to MTX.Objectives:To test the ability of machine learning approaches with clinical and genomic biomarkers to predict MTX response with replications in independent samples.Methods:Age, sex, clinical, serological and genome-wide association study (GWAS) data on patients with early RA of European ancestry from 647 patients (336 recruited in United Kingdom [UK]; 307 recruited across Europe; 70% female; 72% rheumatoid factor [RF] positive; mean age 54 years; mean baseline Disease Activity Score with 28-joint count [DAS28] 5.65) of the PhArmacogenetics of Methotrexate in RA (PAMERA) consortium was used in this study. The genomics data comprised 160 genome-wide significant single nucleotide polymorphisms (SNPs) with p<1×10-5 associated with risk of RA and MTX metabolism. DAS28 score was available at baseline and 3-month follow-up visit. Response to MTX monotherapy at the dose of ≥15 mg/week was defined as good or moderate by the EULAR response criteria at 3 months’ follow up visit. Supervised machine-learning methods were trained with 5-repeats and 10-fold cross-validation using data from PAMERA’s 336 UK patients. Class imbalance (higher % of MTX responders) in training was accounted by using simulated minority oversampling technique. Prediction performance was validated in PAMERA’s 307 European patients (not used in training).Results:Age, sex, RF positivity and baseline DAS28 data predicted MTX response with 58% accuracy of UK and European patients (p = 0.7). However, supervised machine-learning methods that combined demographics, RF positivity, baseline DAS28 and genomic SNPs predicted EULAR response at 3 months with area under the receiver operating curve (AUC) of 0.83 (p = 0.051) in UK patients, and achieved prediction accuracies (fraction of correctly predicted outcomes) of 76.2% (p = 0.054) in the European patients, with sensitivity of 72% and specificity of 77%. The addition of genomic data improved the predictive accuracies of MTX response by 19% and achieved cross-site replication. Baseline DAS28 scores and following SNPs rs12446816, rs13385025, rs113798271, and rs2372536 were among the top predictors of MTX response.Conclusion:Pharmacogenomic biomarkers combined with DAS28 scores predicted MTX response in patients with early RA more reliably than using demographics and DAS28 scores alone. Using pharmacogenomics biomarkers for identification of MTX responders at early stages of RA may help to guide effective RA treatment choices, including timely escalation of RA therapies. Further studies on personalized prediction of response to MTX and other anti-rheumatic treatments are warranted to optimize control of RA disease and improve outcomes in patients with RA.Disclosure of Interests:Elena Myasoedova: None declared, Arjun Athreya: None declared, Cynthia S. Crowson Grant/research support from: Pfizer research grant, Richard Weinshilboum Shareholder of: co-founder and stockholder in OneOme, Liewei Wang: None declared, Eric Matteson Grant/research support from: Pfizer, Consultant of: Boehringer Ingelheim, Gilead, TympoBio, Arena Pharmaceuticals, Speakers bureau: Simply Speaking

Download Full-text

Identifying Spatiotemporal Patterns in Land Use and Cover Samples from Satellite Image Time Series

Remote Sensing ◽

10.3390/rs13050974 ◽

2021 ◽

Vol 13 (5) ◽

pp. 974

Author(s):

Lorena Alves Santos ◽

Karine Ferreira ◽

Michelle Picoli ◽

Gilberto Camara ◽

Raul Zurita-Milla ◽

...

Keyword(s):

Machine Learning ◽

Land Use ◽

Time Series ◽

Satellite Image ◽

Spatiotemporal Patterns ◽

Supervised Machine Learning ◽

Learning Methods ◽

Self Organizing Maps ◽

Machine Learning Methods ◽

Land Use And Cover

The use of satellite image time series analysis and machine learning methods brings new opportunities and challenges for land use and cover changes (LUCC) mapping over large areas. One of these challenges is the need for samples that properly represent the high variability of land used and cover classes over large areas to train supervised machine learning methods and to produce accurate LUCC maps. This paper addresses this challenge and presents a method to identify spatiotemporal patterns in land use and cover samples to infer subclasses through the phenological and spectral information provided by satellite image time series. The proposed method uses self-organizing maps (SOMs) to reduce the data dimensionality creating primary clusters. From these primary clusters, it uses hierarchical clustering to create subclusters that recognize intra-class variability intrinsic to different regions and periods, mainly in large areas and multiple years. To show how the method works, we use MODIS image time series associated to samples of cropland and pasture classes over the Cerrado biome in Brazil. The results prove that the proposed method is suitable for identifying spatiotemporal patterns in land use and cover samples that can be used to infer subclasses, mainly for crop-types.

Download Full-text

Identification of Village Building via Google Earth Images and Supervised Machine Learning Methods

Remote Sensing ◽

10.3390/rs8040271 ◽

2016 ◽

Vol 8 (4) ◽

pp. 271 ◽

Cited By ~ 29

Author(s):

Zhiling Guo ◽

Xiaowei Shao ◽

Yongwei Xu ◽

Hiroyuki Miyazaki ◽

Wataru Ohira ◽

...

Keyword(s):

Machine Learning ◽

Google Earth ◽

Supervised Machine Learning ◽

Learning Methods ◽

Machine Learning Methods

Download Full-text

Seeing It All: Evaluating Supervised Machine Learning Methods for the Classification of Diverse Otariid Behaviours

PLoS ONE ◽

10.1371/journal.pone.0166898 ◽

2016 ◽

Vol 11 (12) ◽

pp. e0166898 ◽

Cited By ~ 15

Author(s):

Monique A. Ladds ◽

Adam P. Thompson ◽

David J. Slip ◽

David P. Hocking ◽

Robert G. Harcourt

Keyword(s):

Machine Learning ◽

Supervised Machine Learning ◽

Learning Methods ◽

Machine Learning Methods

Download Full-text

Machine Learning Methods Applied for Modeling the Process of Obtaining Bricks Using Silicon-Based Materials

Materials ◽

10.3390/ma14237232 ◽

2021 ◽

Vol 14 (23) ◽

pp. 7232

Author(s):

Costel Anton ◽

Silvia Curteanu ◽

Cătălin Lisa ◽

Florin Leon

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Raw Materials ◽

Optimization Procedure ◽

Exhaust Emission ◽

Energy Potential ◽

Learning Methods ◽

Machine Learning Methods ◽

Emission Changes ◽

The Impact

Most of the time, industrial brick manufacture facilities are designed and commissioned for a particular type of manufacture mix and a particular type of burning process. Productivity and product quality maintenance and improvement is a challenge for process engineers. Our paper aims at using machine learning methods to evaluate the impact of adding new auxiliary materials on the amount of exhaust emissions. Experimental determinations made in similar conditions enabled us to build a database containing information about 121 brick batches. Various models (artificial neural networks and regression algorithms) were designed to make predictions about exhaust emission changes when auxiliary materials are introduced into the manufacture mix. The best models were feed-forward neural networks with two hidden layers, having MSE < 0.01 and r2 > 0.82 and, as regression model, kNN with error < 0.6. Also, an optimization procedure, including the best models, was developed in order to determine the optimal values for the parameters that assure the minimum quantities for the gas emission. The Pareto front obtained in the multi-objective optimization conducted with grid search method allows the user the chose the most convenient values for the dry product mass, clay, ash and organic raw materials which minimize gas emissions with energy potential.

Download Full-text

Advanced Machine Learning Methods for Prediction of Fracture Closure Pressure

10.2118/200782-ms ◽

2021 ◽

Author(s):

Mohamed Ibrahim Mohamed ◽

Dinesh Mehta ◽

Erdal Ozkan

Keyword(s):

Machine Learning ◽

Integrated Approach ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Learning Methods ◽

Pressure Derivative ◽

Fracture Geometry ◽

Machine Learning Methods ◽

Personal Bias ◽

Fracture Closure

Abstract Determining the closure pressure is crucial for optimal hydraulic fracturing design and successful execution of fracturing treatment. Historically, the use of diagnostic tests before the main fracturing treatment has significantly advanced to gain more information about the pattern of fracture propagation and fluid performance to optimize the designs. The goal is to inject a small volume of fracturing fluid to breakdown the formation and create small fracture geometry, then once pumping is stopped the pressure decline is analyzed to observe the fracture closure. Many analytical methods such as G-Function, square root of time, etc. have been developed to determine the fracture closure pressure. There are cases in which there is difficulty in determining the fracture closure pressure, as well as personal bias and field experiences make it challenging to interpret the changes in the pressure derivative slope and identify fracture closure. These conditions include: High permeability reservoirs where fracture closure occurs very fast due to the quick fluid leakoff.Extremely low permeability reservoir, which requires a long shut-in time for the fluid to leak off and determine the fracture closure pressure.The non-ideal fluid leak-off behavior under complex conditions. The objective of this study is to apply machine learning methods to implement a predesigned algorithm to execute the required tasks and predict the fracture closure pressure while minimizing the shortcomings in determining the closure pressure for non-ideal or subjective conditions. This paper demonstrates training different supervised machine learning algorithms to help predict fracture closure pressure. The workflow involves using the datasets to train and optimize the models, which subsequently are used to predict the closure pressure of testing data. The output results are then compared with actual results from more than 120 DFIT data points. We further propose an integrated approach to feature selection and dataset processing and study the effects of data processing on the success of the model prediction. The results from this study limit the subjectivity and the need for the experience of personal interpreting the data. We speculate that a linear regression and MLP neural network algorithms can yield high scores in the prediction of fracture closure pressure.

Download Full-text

Comparison of supervised machine learning methods to predict ship propulsion power at sea

Ocean Engineering ◽

10.1016/j.oceaneng.2021.110387 ◽

2022 ◽

Vol 245 ◽

pp. 110387

Author(s):

Xiao Lang ◽

Da Wu ◽

Wengang Mao

Keyword(s):

Machine Learning ◽

Supervised Machine Learning ◽

Learning Methods ◽

Ship Propulsion ◽

Machine Learning Methods ◽

Propulsion Power

Download Full-text

Prediction of Compound-Protein Interactions with Machine Learning Methods

Chemoinformatics and Advanced Machine Learning Perspectives ◽

10.4018/978-1-61520-911-8.ch016 ◽

2011 ◽

pp. 304-317

Author(s):

Yoshihiro Yamanishi ◽

Hisashi Kashima

Keyword(s):

Machine Learning ◽

Protein Interactions ◽

Chemical Structure ◽

Genomic Sequence ◽

Sequence Data ◽

Binary Classification ◽

Biological Data ◽

Supervised Machine Learning ◽

Learning Methods ◽

Machine Learning Methods

In silico prediction of compound-protein interactions from heterogeneous biological data is critical in the process of drug development. In this chapter the authors review several supervised machine learning methods to predict unknown compound-protein interactions from chemical structure and genomic sequence information simultaneously. The authors review several kernel-based algorithms from two different viewpoints: binary classification and dimension reduction. In the results, they demonstrate the usefulness of the methods on the prediction of drug-target interactions and ligand-protein interactions from chemical structure data and genomic sequence data.

Download Full-text

Inter-dataset generalization strength of supervised machine learning methods for intrusion detection

Journal of Information Security and Applications ◽

10.1016/j.jisa.2020.102564 ◽

2020 ◽

Vol 54 ◽

pp. 102564

Author(s):

Laurens D’hooge ◽

Tim Wauters ◽

Bruno Volckaert ◽

Filip De Turck

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Supervised Machine Learning ◽

Learning Methods ◽

Machine Learning Methods

Download Full-text

Classification models using circulating neutrophil transcripts can detect unruptured intracranial aneurysm

Journal of Translational Medicine ◽

10.1186/s12967-020-02550-2 ◽

2020 ◽

Vol 18 (1) ◽

Author(s):

Kerry E. Poppenberg ◽

Vincent M. Tutino ◽

Lu Li ◽

Muhammad Waqas ◽

Armond June ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Prediction Models ◽

Model Performance ◽

Supervised Machine Learning ◽

Support Vector ◽

Learning Methods ◽

Training Cohort ◽

Network Analyses ◽

Machine Learning Methods

Abstract Background Intracranial aneurysms (IAs) are dangerous because of their potential to rupture. We previously found significant RNA expression differences in circulating neutrophils between patients with and without unruptured IAs and trained machine learning models to predict presence of IA using 40 neutrophil transcriptomes. Here, we aim to develop a predictive model for unruptured IA using neutrophil transcriptomes from a larger population and more robust machine learning methods. Methods Neutrophil RNA extracted from the blood of 134 patients (55 with IA, 79 IA-free controls) was subjected to next-generation RNA sequencing. In a randomly-selected training cohort (n = 94), the Least Absolute Shrinkage and Selection Operator (LASSO) selected transcripts, from which we constructed prediction models via 4 well-established supervised machine-learning algorithms (K-Nearest Neighbors, Random Forest, and Support Vector Machines with Gaussian and cubic kernels). We tested the models in the remaining samples (n = 40) and assessed model performance by receiver-operating-characteristic (ROC) curves. Real-time quantitative polymerase chain reaction (RT-qPCR) of 9 IA-associated genes was used to verify gene expression in a subset of 49 neutrophil RNA samples. We also examined the potential influence of demographics and comorbidities on model prediction. Results Feature selection using LASSO in the training cohort identified 37 IA-associated transcripts. Models trained using these transcripts had a maximum accuracy of 90% in the testing cohort. The testing performance across all methods had an average area under ROC curve (AUC) = 0.97, an improvement over our previous models. The Random Forest model performed best across both training and testing cohorts. RT-qPCR confirmed expression differences in 7 of 9 genes tested. Gene ontology and IPA network analyses performed on the 37 model genes reflected dysregulated inflammation, cell signaling, and apoptosis processes. In our data, demographics and comorbidities did not affect model performance. Conclusions We improved upon our previous IA prediction models based on circulating neutrophil transcriptomes by increasing sample size and by implementing LASSO and more robust machine learning methods. Future studies are needed to validate these models in larger cohorts and further investigate effect of covariates.

Download Full-text