Descriptor selection for predicting interfacial thermal resistance by machine learning methods

Abstract Interfacial thermal resistance (ITR) is a critical property for the performance of nanostructured devices in which phonon mean free paths are larger than the characteristic length scales. Affordable, accurate and reliable prediction of ITR is essential for material selection in thermal management. In this work, state-of-the-art machine learning methods were employed for this purpose. Descriptor selection was conducted to build robust models and to provide guidelines for identifying the characteristics most important to the target property. First, a decision tree (DT) was used to calculate the descriptor importances, and the descriptor subsets with the top-X highest importances (topX-DT, X = 20, 15, 10, 5) were chosen to build models. To verify the transferability of the descriptors picked by the decision tree, models based on kernel ridge regression, Gaussian process regression and K-nearest neighbors were also evaluated. Afterwards, univariate selection (UV) was used to rank the descriptors. Finally, the top-5 common descriptors selected by both DT and UV were used to build concise models. The performance of these refined models is comparable to that of models using all descriptors, which indicates the high accuracy and reliability of these selection methods. Our strategy yields concise machine learning models for fast prediction of ITR in thermal management applications.
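The selection step described in this abstract can be illustrated with a minimal Python sketch. It makes several assumptions: the univariate score is taken to be the absolute Pearson correlation with the target (the abstract does not name the scoring function), the decision-tree ranking is supplied as a hypothetical precomputed list, and the descriptor names `d1` to `d3` and all numbers are toy values, not data from the paper.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def rank_univariate(columns, target):
    """Sort descriptor names by |Pearson r| with the target, highest first."""
    scores = {name: abs(pearson_r(values, target)) for name, values in columns.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)

def top_common(ranking_a, ranking_b, x):
    """Descriptors found in the top-x of both rankings, kept in ranking_a order."""
    top_b = set(ranking_b[:x])
    return [name for name in ranking_a[:x] if name in top_b]

# Toy descriptors and target (illustrative values only).
columns = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [2.0, 1.0, 4.0, 3.0, 5.0],
    "d3": [1.0, -1.0, 1.0, -1.0, 1.0],
}
target = [1.1, 1.9, 3.2, 3.9, 5.1]

uv_ranking = rank_univariate(columns, target)
dt_ranking = ["d2", "d1", "d3"]  # hypothetical order from tree importances
common = top_common(dt_ranking, uv_ranking, 2)
print(uv_ranking, common)
```

Keeping only descriptors that both rankings agree on is one simple way to read "common descriptors"; other combination rules are possible.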

Machine Learning in Aging: An Example of Developing Prediction Models for Serious Fall Injury in Older Adults

Innovation in Aging ◽

10.1093/geroni/igaa057.859 ◽

2020 ◽

Vol 4 (Supplement_1) ◽

pp. 268-269

Author(s):

Jaime Speiser ◽

Kathryn Callahan ◽

Jason Fanning ◽

Thomas Gill ◽

Anne Newman ◽

...

Keyword(s):

Machine Learning ◽

Older Adults ◽

Random Forest ◽

Decision Tree ◽

Prediction Models ◽

Receiver Operating Curve ◽

Learning Methods ◽

Life Study ◽

Fall Injury ◽

Machine Learning Methods

Abstract Advances in computational algorithms and the availability of large datasets with clinically relevant characteristics provide an opportunity to develop machine learning prediction models to aid in diagnosis, prognosis, and treatment of older adults. Some studies have employed machine learning methods for prediction modeling, but skepticism of these methods remains due to lack of reproducibility and difficulty understanding the complex algorithms behind models. We aim to provide an overview of two common machine learning methods: decision tree and random forest. We focus on these methods because they provide a high degree of interpretability. We discuss the underlying algorithms of decision tree and random forest methods and present a tutorial for developing prediction models for serious fall injury using data from the Lifestyle Interventions and Independence for Elders (LIFE) study. Decision tree is a machine learning method that produces a model resembling a flow chart. Random forest consists of a collection of many decision trees whose results are aggregated. In the tutorial example, we discuss evaluation metrics and interpretation for these models. In data from the LIFE study, the performance of prediction models for serious fall injury was moderate at best (area under the receiver operating characteristic curve of 0.54 for decision tree and 0.66 for random forest). Machine learning methods may offer improved performance compared to traditional models for modeling outcomes in aging, but their use should be justified and output should be carefully described. Models should be assessed by clinical experts to ensure compatibility with clinical practice.
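The area under the receiver operating curve quoted in this abstract can be computed directly from predicted scores. The sketch below uses the pairwise (Mann-Whitney) definition: the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. The labels and scores are invented for illustration and are not LIFE study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))

# Toy example: 1 = event, 0 = no event (illustrative values only).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

A value of 0.5 corresponds to scores that carry no ranking information, which is why values such as 0.54 are read as close to chance.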

Salivary metabolomics with alternative decision tree-based machine learning methods for breast cancer discrimination

Breast Cancer Research and Treatment ◽

10.1007/s10549-019-05330-9 ◽

2019 ◽

Vol 177 (3) ◽

pp. 591-601 ◽

Cited By ~ 10

Author(s):

Takeshi Murata ◽

Takako Yanagisawa ◽

Toshiaki Kurihara ◽

Miku Kaneko ◽

Sana Ota ◽

...

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Decision Tree ◽

Learning Methods ◽

Machine Learning Methods ◽

Alternative Decision

Forecasting gas demand using artificial intelligence methods (Prognozowanie zapotrzebowania na gaz metodami sztucznej inteligencji)

Nafta-Gaz ◽

10.18668/ng.2019.02.07 ◽

2019 ◽

Vol 75 (2) ◽

pp. 111-117

Author(s):

Andrzej Paliński ◽

Keyword(s):

Machine Learning ◽

Linear Regression ◽

Decision Tree ◽

Computational Intelligence ◽

Prognostic Models ◽

Learning Methods ◽

Explanatory Variables ◽

Machine Learning Methods ◽

Gas Consumption ◽

Classical Linear Regression

The paper presents contemporary trends in artificial intelligence and machine learning methods, including artificial neural networks, decision trees and fuzzy logic systems. Computational intelligence methods form part of the field of artificial intelligence research. Selected computational intelligence methods were used to build medium-term monthly forecasts of natural gas demand for Poland. The accuracy of forecasts obtained using an artificial neural network and a decision tree was compared with that of classical linear regression, based on historical data from a ten-year period. The explanatory variables were gas consumption in other EU countries, average monthly temperature, industrial production, wages in the economy and the price of natural gas. Forecasting was carried out in five stages that differed in the selection of the training and test samples, the use of data preprocessing and the elimination of some variables. For raw data and a random training set, linear regression achieved the highest accuracy. For preprocessed data and a random training set, the decision tree was the most accurate. For the forecast trained on the first eight years and tested on the last two, regression was the most accurate, but only slightly better than the decision tree or neural network, regardless of data normalization and elimination of collinear variables. Machine learning methods produced accurate monthly gas consumption forecasts but were slightly outperformed by classical linear regression because the set of explanatory variables was too narrow. Machine learning methods should become more effective as the amount of data increases and the set of potential explanatory variables is expanded. With large volumes of data, machine learning methods can create prognostic models more effectively, without the analyst's laborious involvement in data preparation and multi-stage analysis. They will also allow the form of prognostic models to be updated frequently, even after each addition of new data to the database.
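The two sampling schemes mentioned in this abstract (a random training set versus training on the first eight years and testing on the last two) might look as follows in Python. This is a sketch under assumptions: the ten years of monthly data are represented by placeholder `(year, month)` tuples with years numbered 1 to 10, and the random split uses the same 96/24 sizes purely for comparison, since the abstract does not state the sizes of the random samples.

```python
import random

def chronological_split(records, n_train):
    """Train on the earliest n_train records, test on everything after them."""
    return records[:n_train], records[n_train:]

def random_split(records, n_train, seed=0):
    """Shuffle a copy of the records, then cut it into train and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Ten years of monthly observations, in time order (placeholder keys only).
months = [(year, month) for year in range(1, 11) for month in range(1, 13)]

# First eight years (96 months) for training, last two (24 months) for testing.
train, test = chronological_split(months, 96)

# Random sample of the same size, for comparison.
r_train, r_test = random_split(months, 96, seed=0)
print(train[-1], test[0], len(r_train), len(r_test))
```

The chronological split never lets the model see months later than those it is tested on, which is closer to how a forecast would be used in practice.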

Forecasting Natural Gas Spot Prices with Machine Learning

Energies ◽

10.3390/en14185782 ◽

2021 ◽

Vol 14 (18) ◽

pp. 5782

Author(s):

Dimitrios Mouchtaris ◽

Emmanouil Sofianos ◽

Periklis Gogas ◽

Theophilos Papadimitriou

Keyword(s):

Machine Learning ◽

Natural Gas ◽

Gaussian Process Regression ◽

Time Frame ◽

Spot Price ◽

Support Vector ◽

Learning Methods ◽

Spot Prices ◽

Explanatory Variables ◽

Machine Learning Methods

The ability to accurately forecast the spot price of natural gas benefits stakeholders and is a valuable tool for all market participants in the competitive gas market. In this paper, we attempt to forecast the natural gas spot price 1, 3, 5, and 10 days ahead using machine learning methods: support vector machines (SVM), regression trees, linear regression, Gaussian process regression (GPR), and ensemble of trees. These models are trained with a set of 21 explanatory variables in a 5-fold cross-validation scheme with 90% of the dataset used for training and the remaining 10% used for testing the out-of-sample generalization ability. The results show that these machine learning methods all have different forecasting accuracy for every time frame when it comes to forecasting natural gas spot prices. However, the bagged trees (belonging to the ensemble of trees method) and the linear SVM models have superior forecasting performance compared to the rest of the models.
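One possible reading of the evaluation scheme in this abstract is a 90/10 hold-out split with 5-fold cross-validation carried out inside the 90% training portion. The sketch below follows that reading; it is an assumption, as is holding out the final 10% of samples in their original order and the sample count of 100. The function names are hypothetical.

```python
def holdout_split(n_samples, train_fraction=0.9):
    """Return index lists for the training part and the held-out test part."""
    n_train = int(round(n_samples * train_fraction))
    return list(range(n_train)), list(range(n_train, n_samples))

def k_fold(indices, k):
    """Split indices into k contiguous folds; yield (train, validation) pairs as a list."""
    n = len(indices)
    folds = []
    start = 0
    for i in range(k):
        # Spread any remainder over the first n % k folds.
        size = n // k + (1 if i < n % k else 0)
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        folds.append((training, validation))
        start += size
    return folds

train_idx, test_idx = holdout_split(100)   # 90 training samples, 10 test samples
folds = k_fold(train_idx, 5)               # 5 folds of 18 validation samples each
for training, validation in folds:
    print(len(training), len(validation))
```

The held-out indices in `test_idx` are never used inside the folds, so they remain available for the out-of-sample check the abstract describes.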

Machine Learning in Aging: An Example of Developing Prediction Models for Serious Fall Injury in Older Adults

The Journals of Gerontology Series A ◽

10.1093/gerona/glaa138 ◽

2020 ◽

Author(s):

Jaime Lynn Speiser ◽

Kathryn E Callahan ◽

Denise K Houston ◽

Jason Fanning ◽

Thomas M Gill ◽

...

Keyword(s):

Machine Learning ◽

Older Adults ◽

Random Forest ◽

Decision Tree ◽

Prediction Models ◽

Learning Methods ◽

Life Study ◽

Fall Injury ◽

Machine Learning Methods ◽

Using Data

Abstract Background: Advances in computational algorithms and the availability of large datasets with clinically relevant characteristics provide an opportunity to develop machine learning prediction models to aid in diagnosis, prognosis, and treatment of older adults. Some studies have employed machine learning methods for prediction modeling, but skepticism of these methods remains due to lack of reproducibility and difficulty in understanding the complex algorithms that underlie models. We aim to provide an overview of two common machine learning methods: decision tree and random forest. We focus on these methods because they provide a high degree of interpretability. Method: We discuss the underlying algorithms of decision tree and random forest methods and present a tutorial for developing prediction models for serious fall injury using data from the Lifestyle Interventions and Independence for Elders (LIFE) study. Results: Decision tree is a machine learning method that produces a model resembling a flow chart. Random forest consists of a collection of many decision trees whose results are aggregated. In the tutorial example, we discuss evaluation metrics and interpretation for these models. Using data from the LIFE study, the performance of prediction models for serious fall injury was moderate at best (area under the receiver operating characteristic curve of 0.54 for decision tree and 0.66 for random forest). Conclusions: Machine learning methods offer an alternative to traditional approaches for modeling outcomes in aging, but their use should be justified and output should be carefully described. Models should be assessed by clinical experts to ensure compatibility with clinical practice.
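The two ideas in the Results section, a tree as a flow chart and a forest as aggregated trees, can be sketched in a few lines of Python. Everything specific here is hypothetical: the three hand-written trees, the variable names `age`, `gait_speed` and `prior_falls`, and the thresholds are invented for illustration and are not the models or predictors from the LIFE study. Real random forests also learn their trees from bootstrap samples rather than using fixed rules.

```python
from collections import Counter

def tree_a(person):
    """Flow-chart style rule: each 'if' is one decision node."""
    if person["prior_falls"] >= 1:
        return 1
    return 1 if person["gait_speed"] < 0.8 else 0

def tree_b(person):
    if person["age"] >= 80:
        return 1 if person["gait_speed"] < 1.0 else 0
    return 0

def tree_c(person):
    if person["gait_speed"] < 0.6:
        return 1
    return 1 if person["prior_falls"] >= 2 else 0

TREES = [tree_a, tree_b, tree_c]  # odd count avoids ties for a binary outcome

def forest_vote_fraction(trees, person):
    """Share of trees that predict the event (1)."""
    return sum(tree(person) for tree in trees) / len(trees)

def forest_predict(trees, person):
    """Majority vote over the individual tree predictions."""
    votes = Counter(tree(person) for tree in trees)
    return votes.most_common(1)[0][0]

person_1 = {"age": 82, "gait_speed": 0.7, "prior_falls": 0}
person_2 = {"age": 72, "gait_speed": 1.1, "prior_falls": 0}
print(forest_predict(TREES, person_1), forest_vote_fraction(TREES, person_1))
print(forest_predict(TREES, person_2), forest_vote_fraction(TREES, person_2))
```

The vote fraction can serve as a score for ranking-based metrics, while the majority vote gives a single class label.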

Art Price Prediction Using Decision Tree-Based Machine Learning Methods

Korean Management Review ◽

10.17287/kmr.2021.50.2.357 ◽

2021 ◽

Vol 50 (2) ◽

pp. 357-381

Author(s):

Dongryul Jang ◽

Minjae Park

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Learning Methods ◽

Price Prediction ◽

Machine Learning Methods

Predictive modelling for contact angle of liquid metals and oxide ceramics by comparing Gaussian process regression with other machine learning methods

Ceramics International ◽

10.1016/j.ceramint.2021.09.146 ◽

2021 ◽

Author(s):

Dewen Jiang ◽

Zhenyang Wang ◽

Jianliang Zhang ◽

Dejun Jiang ◽

Fulong Liu ◽

...

Keyword(s):

Machine Learning ◽

Contact Angle ◽

Gaussian Process ◽

Liquid Metals ◽

Gaussian Process Regression ◽

Predictive Modelling ◽

Oxide Ceramics ◽

Learning Methods ◽

Machine Learning Methods

Machine Learning Methods of Regression for Plasmonic Nanoantenna Glucose Sensing

Sensors ◽

10.3390/s22010007 ◽

2021 ◽

Vol 22 (1) ◽

pp. 7

Author(s):

Emilio Corcione ◽

Diana Pfezer ◽

Mario Hentschel ◽

Harald Giessen ◽

Cristina Tarín

Keyword(s):

Machine Learning ◽

Measurement Techniques ◽

Gaussian Process Regression ◽

Glucose Sensing ◽

Sensor Data ◽

Glucose Measurement ◽

Sensor Calibration ◽

Learning Methods ◽

Machine Learning Methods ◽

Major Interest

The measurement and quantification of glucose concentrations is a field of major interest, whether motivated by potential clinical applications or as a prime example of biosensing in basic research. In recent years, optical sensing methods have emerged as promising glucose measurement techniques in the literature, with surface-enhanced infrared absorption (SEIRA) spectroscopy combining the sensitivity of plasmonic systems and the specificity of standard infrared spectroscopy. The challenge addressed in this paper is to determine the best method to estimate the glucose concentration in aqueous solutions in the presence of fructose from the measured reflectance spectra. This is referred to as the inverse problem of sensing and usually solved via linear regression. Here, instead, several advanced machine learning regression algorithms are proposed and compared, while the sensor data are subject to a pre-processing routine aiming to isolate key patterns from which to extract the relevant information. The most accurate and reliable predictions were finally made by a Gaussian process regression model which improves by more than 60% on previous approaches. Our findings give insight into the applicability of machine learning methods of regression for sensor calibration and explore the limitations of SEIRA glucose sensing.
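The linear-regression baseline for the inverse problem mentioned in this abstract can be sketched as an ordinary least-squares calibration. The version below is deliberately reduced to a single spectral feature and is only an illustration under assumptions: the feature values, the concentrations and their units are invented, and a real calibration would use many spectral channels.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

def predict(model, x_new):
    """Estimate the concentration for a new feature value."""
    intercept, slope = model
    return intercept + slope * x_new

# Hypothetical calibration set: one spectral feature per sample
# and the known concentration of each sample (arbitrary units).
feature = [0.1, 0.2, 0.3, 0.4]
concentration = [10.0, 20.0, 30.0, 40.0]

model = fit_line(feature, concentration)
estimate = predict(model, 0.25)
print(model, estimate)
```

Regressing concentration on the measured feature, rather than the reverse, gives the estimate of interest directly from a new measurement.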

Modeling Pan Evaporation Using Gaussian Process Regression K-Nearest Neighbors Random Forest and Support Vector Machines; Comparative Analysis

Atmosphere ◽

10.3390/atmos11010066 ◽

2020 ◽

Vol 11 (1) ◽

pp. 66 ◽

Cited By ~ 9

Author(s):

Sevda Shabani ◽

Saeed Samadianfard ◽

Mohammad Taghi Sattari ◽

Amir Mosavi ◽

Shahaboddin Shamshirband ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Gaussian Process ◽

Gaussian Process Regression ◽

Nearest Neighbors ◽

Support Vector ◽

Pan Evaporation ◽

Learning Methods ◽

K Nearest Neighbors ◽

Machine Learning Methods

Evaporation is one of the most critical factors in agricultural, hydrological, and meteorological studies. Because multiple climatic factors interact, evaporation is a complex and nonlinear phenomenon to model, and machine learning methods have therefore gained popularity in this area. In the present study, four machine learning methods, Gaussian Process Regression (GPR), K-Nearest Neighbors (KNN), Random Forest (RF) and Support Vector Regression (SVR), were used to predict pan evaporation (PE). Meteorological data including PE, temperature (T), relative humidity (RH), wind speed (W), and sunny hours (S) were collected from 2011 through 2017. The accuracy of the studied methods was determined using the statistical indices of Root Mean Squared Error (RMSE), correlation coefficient (R) and Mean Absolute Error (MAE). Furthermore, Taylor charts were used to evaluate the accuracy of these models. The results showed that, at the Gonbad-e Kavus, Gorgan and Bandar Torkman stations respectively, the better-performing models in estimating PE were GPR with RMSE of 1.521 mm/day, 1.244 mm/day, and 1.254 mm/day, KNN with RMSE of 1.991 mm/day, 1.775 mm/day, and 1.577 mm/day, RF with RMSE of 1.614 mm/day, 1.337 mm/day, and 1.316 mm/day, and SVR with RMSE of 1.55 mm/day, 1.262 mm/day, and 1.275 mm/day. GPR with input parameters T, W and S for the Gonbad-e Kavus station, and GPR with input parameters T, RH, W and S for the Gorgan and Bandar Torkman stations, gave the most accurate predictions and were proposed for precise estimation of PE. The findings indicate that PE values may be accurately estimated from a few easily measured meteorological parameters.
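The three accuracy indices named in this abstract (RMSE, MAE and the correlation coefficient R) have standard definitions, sketched below in plain Python. The observed and predicted values are toy numbers chosen for illustration, not station data from the study.

```python
from math import sqrt

def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def correlation(observed, predicted):
    """Pearson correlation coefficient R between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    if soo == 0 or spp == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sop / sqrt(soo * spp)

# Toy values in mm/day (illustrative only).
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
print(rmse(observed, predicted), mae(observed, predicted), correlation(observed, predicted))
```

RMSE penalizes large errors more heavily than MAE, while R measures only how well the predictions track the observations linearly, so the three indices are usually reported together.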
