Model averaging with the hybrid model: An asymptotic study and demonstration

2022 ◽  
pp. 096228022110417
Author(s):  
Kian Wee Soh ◽  
Thomas Lumley ◽  
Cameron Walker ◽  
Michael O’Sullivan

In this paper, we present a new model averaging technique that can be applied in medical research. The dataset is first partitioned by the values of its categorical explanatory variables. Then for each partition, a model average is determined by minimising some form of squared errors, which could be the leave-one-out cross-validation errors. From our asymptotic optimality study and the results of simulations, we demonstrate under several high-level assumptions and modelling conditions that this model averaging procedure may outperform jackknife model averaging, which is a well-established technique. We also present an example where a cross-validation procedure does not work (that is, a zero-valued cross-validation error is obtained) when determining the weights for model averaging.
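The weight-selection step mentioned above (minimising leave-one-out squared errors) can be sketched for the simplest case of two candidate models on a single partition. This is only an illustration under stated assumptions, not the authors' procedure: the two candidates (an intercept-only model and a simple linear regression) and all function names are hypothetical, and the partitioning by categorical variables is left out.

```python
def loo_residuals_mean(y):
    """Leave-one-out residuals for an intercept-only model (hypothetical candidate 1)."""
    n = len(y)
    total = sum(y)
    return [y[i] - (total - y[i]) / (n - 1) for i in range(n)]


def loo_residuals_line(x, y):
    """Leave-one-out residuals for simple linear regression (hypothetical candidate 2).

    Each observation is predicted from a line refitted without it.
    """
    residuals = []
    for i in range(len(y)):
        xs = x[:i] + x[i + 1:]
        ys = y[:i] + y[i + 1:]
        mx = sum(xs) / len(xs)
        my = sum(ys) / len(ys)
        sxx = sum((a - mx) ** 2 for a in xs)
        sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        slope = sxy / sxx
        residuals.append(y[i] - (my + slope * (x[i] - mx)))
    return residuals


def cv_criterion(w, e1, e2):
    """Sum of squared leave-one-out errors of the average w*model1 + (1-w)*model2."""
    return sum((w * a + (1 - w) * b) ** 2 for a, b in zip(e1, e2))


def jackknife_weight(e1, e2):
    """Weight on model 1, restricted to [0, 1], that minimises cv_criterion."""
    den = sum((a - b) ** 2 for a, b in zip(e1, e2))
    if den == 0:
        return 0.5  # identical residuals: every weight gives the same criterion
    num = sum(b * (b - a) for a, b in zip(e1, e2))
    return min(1.0, max(0.0, num / den))


# Made-up data for one partition.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
e_mean = loo_residuals_mean(y)
e_line = loo_residuals_line(x, y)
w = jackknife_weight(e_mean, e_line)
print("weight on intercept-only model:", w)
print("criterion at chosen weight:", cv_criterion(w, e_mean, e_line))
```

With two candidates the criterion is a quadratic in a single weight, so the constrained minimiser is the unconstrained one clipped to [0, 1]. With more candidates the same criterion would need a quadratic-programming solver.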

Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2705 ◽  
Author(s):  
Marko Jamšek ◽  
Tadej Petrič ◽  
Jan Babič

Research and development of active and passive exoskeletons for preventing work-related injuries has steadily increased in the last decade. Recently, new types of quasi-passive designs have been emerging. These exoskeletons use passive viscoelastic elements, such as springs and dampers, to provide support to the user, while using small actuators only to change the level of support or to disengage the passive elements. Control of such devices is still largely unexplored, especially the algorithms that predict the movement of the user, to take maximum advantage of the passive viscoelastic elements. To address this issue, we developed a new control scheme consisting of Gaussian mixture models (GMM) in combination with a state machine controller to identify and classify the movement of the user as early as possible and thus provide a timely control output for the quasi-passive spinal exoskeleton. In a leave-one-out cross-validation procedure, the overall accuracy for providing support to the user was 86.72 ± 0.86% (mean ± s.d.) with a sensitivity and specificity of 97.46 ± 2.09% and 83.15 ± 0.85%, respectively. The results of this study indicate that our approach is a promising tool for the control of quasi-passive spinal exoskeletons.


2020 ◽  
Vol 41 (3) ◽  
pp. 829 ◽  
Author(s):  
Marco Antonio Vieira Morais ◽  
Marcelo Ribeiro Viola ◽  
Carlos Rogério de Mello ◽  
Jéssica Assaid Martins Rodrigues ◽  
Vinícius Augusto de Oliveira

Hydraulic projects and water management require reliable hydrological data. The Araguaia-Tocantins River basin, in addition to agricultural use, has great potential for hydroelectric exploitation. However, the streamflow monitoring network in the Araguaia River basin is composed of only a few stations, resulting in a lack of hydrological data. The regionalization of the reference streamflows is a technique that can help circumvent this lack of data, enabling the estimation of streamflows from easily obtainable explanatory variables. In this context, the objective of this study was to develop regional functions for the maximum streamflow (Qmax) applicable to different Return Periods (RP), the long-term mean streamflow (Qmlt) and the 95% streamflow permanence (Q95) of the upper and middle Araguaia River sub-basins. The dimensionless streamflow methodology was adopted with the drainage area as an explanatory variable. The tested regression models were the linear, potential and quotient models. Leave-one-out cross-validation was used to assess the quality of the regional models. Ten statistical distributions of 2 to 5 parameters were used. (i) Satisfactory results were obtained for all reference streamflows. (ii) The cross-validation technique proved to be essential for the selection of the most robust model. (iii) The quotient model was shown to be superior to the potential and linear models in most cases.


2020 ◽  
Author(s):  
Rong Zhu ◽  
Xinyu Zhang ◽  
Yanyuan Ma ◽  
Guohua Zou

In this paper, we develop a model averaging method to estimate the high-dimensional covariance matrix, where the candidate models are constructed by different orders of the polynomial functions. We propose a Mallows-type model averaging criterion and select the weights by minimizing this criterion, which is an unbiased estimator of the expected in-sample squared error plus a constant. Then, we prove the asymptotic optimality of the resulting model average covariance (MAC) estimators. Furthermore, numerical simulations and a case study on Chinese airport network structure data are conducted to demonstrate the usefulness of the proposed approaches.


Author(s):  
Bonpagna Kann ◽  
Thodsaporn Chay-intr ◽  
Hour Kaing ◽  
Thanaruk Theeramunkong

Although a number of researchers work on the Khmer language in the field of Natural Language Processing, and some resources exist for word segmentation and POS tagging, high-level resources for syntax, such as treebanks and grammars, are still lacking. This paper illustrates a semi-automatic framework for constructing a Khmer Treebank and extracting Khmer grammar rules from a set of sentences taken from Khmer grammar books. Initially, these sentences are manually annotated and processed to generate a number of grammar rules with their probabilities once the Treebank is obtained. In our experiments, the annotated trees and the extracted grammar rules are analyzed both quantitatively and qualitatively. Finally, the results are evaluated in three evaluation processes, Self-Consistency, 5-Fold Cross-Validation and Leave-One-Out Cross-Validation, using three measures: Precision, Recall and F1-Measure. According to the results of the three validations, Self-Consistency shows the best result with more than 92%, followed by Leave-One-Out Cross-Validation and 5-Fold Cross-Validation with averages of 88% and 75%, respectively. On the other hand, the crossing bracket data shows that Leave-One-Out Cross-Validation holds the highest average with 96%, while the other two are 85% and 89%, respectively.
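The three measures named above can be illustrated with a minimal sketch that compares the labelled brackets of a predicted parse with those of a gold-standard tree. The bracket representation (label, start, end) and the function name are assumptions made for this illustration; the abstract does not describe the authors' scoring code.

```python
def precision_recall_f1(predicted, gold):
    """Precision, recall and F1 for two collections of labelled brackets."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if true_positives == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical brackets for one five-token sentence: (label, start, end).
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
predicted = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]

p, r, f = precision_recall_f1(predicted, gold)
print(p, r, f)  # three of four brackets match in each direction: 0.75 0.75 0.75
```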


2021 ◽  
pp. 1-43
Author(s):  
Ji Hyung Lee ◽  
Youngki Shin

We propose a novel conditional quantile prediction method based on complete subset averaging (CSA) for quantile regressions. All models under consideration are potentially misspecified, and the dimension of regressors goes to infinity as the sample size increases. Since we average over the complete subsets, the number of models is much larger than in the usual model averaging method, which adopts sophisticated weighting schemes. We propose to use an equal weight but select the proper size of the complete subset based on the leave-one-out cross-validation method. Building upon the theory of Lu and Su (2015, Journal of Econometrics 188, 40–58), we investigate the large sample properties of CSA and show the asymptotic optimality in the sense of Li (1987, Annals of Statistics 15, 958–975). We check the finite sample performance via Monte Carlo simulations and empirical applications.


Risks ◽  
2021 ◽  
Vol 9 (6) ◽  
pp. 114
Author(s):  
Paritosh Navinchandra Jha ◽  
Marco Cucculelli

The paper introduces a novel approach to ensemble modeling as a weighted model average technique. The proposed idea is prudent, simple to understand, and easy to implement compared to the Bayesian and frequentist approaches. The paper provides both theoretical and empirical contributions for assessing credit risk (probability of default) effectively in a new way by creating an ensemble model as a weighted linear combination of machine learning models. The idea can be generalized to any classification problem in other domains where ensemble-type modeling is a subject of interest and is not limited to an unbalanced dataset or credit risk assessment. The results suggest a better forecasting performance compared to the single best model among well-known parametric, non-parametric, and other ensemble machine learning models. As a future research direction, the approach could be extended by estimating the weights differently, which may further enhance the performance of the model average.
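A weighted linear combination of model outputs, as described above, can be sketched in a few lines. How the paper estimates its weights is not stated in the abstract, so the weights below are simply supplied by the caller and normalised to sum to one; the function name and the numbers are hypothetical.

```python
def weighted_model_average(model_probs, weights):
    """Combine per-model default probabilities into one ensemble probability per case.

    model_probs: one list of probabilities per model, all of the same length.
    weights: one non-negative weight per model (normalised here to sum to 1).
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    normalised = [w / total for w in weights]
    n_cases = len(model_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(normalised, model_probs))
        for i in range(n_cases)
    ]


# Two hypothetical models scoring two borrowers, weighted 3:1.
ensemble = weighted_model_average([[0.2, 0.8], [0.4, 0.6]], [3, 1])
print(ensemble)  # approximately [0.25, 0.75]
```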


2019 ◽  
Vol 76 (7) ◽  
pp. 2349-2361
Author(s):  
Benjamin Misiuk ◽  
Trevor Bell ◽  
Alec Aitken ◽  
Craig J Brown ◽  
Evan N Edinger

Species distribution models are commonly used in the marine environment as management tools. The high cost of collecting marine data for modelling makes them finite, especially in remote locations. Underwater image datasets from multiple surveys were leveraged to model the presence–absence and abundance of Arctic soft-shell clam (Mya spp.) to support the management of a local small-scale fishery in Qikiqtarjuaq, Nunavut, Canada. These models were combined to predict Mya abundance, conditional on presence throughout the study area. Results suggested that water depth was the primary environmental factor limiting Mya habitat suitability, yet seabed topography and substrate characteristics influence their abundance within suitable habitat. Ten-fold cross-validation and spatial leave-one-out cross-validation (LOO CV) were used to assess the accuracy of combined predictions and to test whether this was inflated by the spatial autocorrelation of transect sample data. Results demonstrated that four different measures of predictive accuracy were substantially inflated due to spatial autocorrelation, and the spatial LOO CV results were therefore adopted as the best estimates of performance.
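One common way to make leave-one-out cross-validation spatial is to drop, for each held-out sample, every training sample that lies within a buffer distance of it, so that spatially autocorrelated neighbours cannot inflate the accuracy estimate. The sketch below assumes this buffer variant and planar coordinates; the abstract does not say how the authors implemented their spatial LOO CV, and the function name and radius are hypothetical.

```python
import math


def spatial_loo_splits(coords, buffer_radius):
    """Yield (test_index, train_indices) pairs with a spatial exclusion buffer.

    coords: list of (x, y) sample positions in a planar coordinate system.
    buffer_radius: samples at this distance or closer to the test sample
    are excluded from its training set.
    """
    for i, (xi, yi) in enumerate(coords):
        train = [
            j
            for j, (xj, yj) in enumerate(coords)
            if j != i and math.hypot(xi - xj, yi - yj) > buffer_radius
        ]
        yield i, train


# Two clusters of made-up transect samples, 4 to 6 units apart.
samples = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
for test_index, train_indices in spatial_loo_splits(samples, buffer_radius=2.0):
    print(test_index, train_indices)
```

With a radius of 2.0, each sample is evaluated only against a model trained on the other cluster, whereas ordinary LOO CV would also train on its immediate neighbour.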


2014 ◽  
Vol 79 (8) ◽  
pp. 965-975 ◽  
Author(s):  
Long Jiao ◽  
Xiaofei Wang ◽  
LI. Hua ◽  
Yunxia Wang

The quantitative structure-property relationship (QSPR) for the gas/particle partition coefficient, Kp, of polychlorinated biphenyls (PCBs) was investigated. The molecular distance-edge vector (MDEV) index was used as the structural descriptor of PCBs. The quantitative relationship between the MDEV index and log Kp was modeled by multivariate linear regression (MLR) and artificial neural network (ANN), respectively. Leave-one-out cross-validation and external validation were carried out to assess the prediction ability of the developed models. When the MLR method is used, the root mean square relative error (RMSRE) of prediction for leave-one-out cross-validation and external validation is 4.72 and 8.62, respectively. When the ANN method is employed, the prediction RMSRE of leave-one-out cross-validation and external validation is 3.87 and 7.47, respectively. It is demonstrated that the developed models are practicable for predicting the Kp of PCBs. The MDEV index is shown to be quantitatively related to the Kp of PCBs.
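For readers unfamiliar with the error measure, the sketch below uses one common definition of RMSRE: the square root of the mean of squared relative errors. The abstract gives neither the exact formula nor the units of the reported values (they may be percentages), so the definition and the numbers here are assumptions for illustration only.

```python
import math


def rmsre(observed, predicted):
    """Root mean square relative error under the assumed definition.

    Each error is taken relative to the observed value, so observed values
    must be non-zero (a zero raises ZeroDivisionError).
    """
    relative_errors = [(p - o) / o for o, p in zip(observed, predicted)]
    return math.sqrt(sum(e * e for e in relative_errors) / len(relative_errors))


# Made-up values: both predictions are off by 10% of the observed value.
value = rmsre([2.0, 4.0], [2.2, 3.6])
print(value)        # close to 0.1
print(100 * value)  # the same quantity expressed as a percentage
```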


2016 ◽  
Vol 2016 ◽  
pp. 1-7 ◽  
Author(s):  
Hong-Jhang Chen ◽  
Yii-Jeng Lin ◽  
Pei-Chen Wu ◽  
Wei-Hsiang Hsu ◽  
Wan-Chung Hu ◽  
...  

Traditional Chinese medicine (TCM) formulates treatment according to body constitution (BC) differentiation. Different constitutions have specific metabolic characteristics and different susceptibility to certain diseases. This study aimed to assess the Yang-Xu constitution using a body constitution questionnaire (BCQ) and clinical blood variables. A BCQ was employed to assess the clinical manifestation of Yang-Xu. A logistic regression model was used to explore the relationship between BC scores and biomarkers. Leave-one-out cross-validation (LOOCV) and K-fold cross-validation were performed to evaluate the accuracy of a predictive model in practice. Decision trees (DTs) were used to determine the possible relationships between blood biomarkers and BC scores. According to the BCQ analysis, 49% of participants without any BC were classified as healthy subjects. Among them, 130 samples were selected for further analysis and divided into two groups. One group comprised healthy subjects without any BC (68%), while subjects of the other group, named the sub-healthy group, had three BCs (32%). Six biomarkers, CRE, TSH, HB, MONO, RBC, and LH, were found to have the greatest impact on BCQ outcomes in Yang-Xu subjects. This study indicated significant biochemical differences in Yang-Xu subjects, which may provide a connection between blood variables and the Yang-Xu BC.
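The two validation schemes named above differ only in how the samples are split, which a short sketch can make concrete. It treats LOOCV as K-fold cross-validation with as many folds as samples and uses contiguous, unshuffled folds for simplicity; the function names are hypothetical and nothing here reflects the study's own software.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    # The first n_samples % k folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def loo_splits(n_samples):
    """Leave-one-out splits: every sample is the test set exactly once."""
    return k_fold_splits(n_samples, n_samples)


for train, test in k_fold_splits(5, 2):
    print("2-fold:", train, test)
for train, test in loo_splits(3):
    print("LOOCV:", train, test)
```

In practice the sample order would usually be shuffled before splitting, and a model would be fitted on each training set and scored on the matching test set.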

