Model Selection for Univariable Fractional Polynomials

Author(s):  
Patrick Royston

Since Royston and Altman's 1994 publication (Journal of the Royal Statistical Society, Series C 43: 429–467), fractional polynomials have steadily gained popularity as a tool for flexible parametric modeling of regression relationships. In this article, I present fp_select, a postestimation tool for fp that allows the user to select a parsimonious fractional polynomial model according to a closed test procedure called the fractional polynomial selection procedure or function selection procedure. I also give a brief introduction to fractional polynomial models and provide examples of using fp and fp_select to select such models with real data.
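
The abstract refers to Stata's fp and fp_select commands. As a language-neutral illustration only, the following Python sketch mimics the closed test (function selection procedure) for a single positive continuous predictor with a Gaussian outcome, using likelihood-ratio chi-square approximations; the power set, the OLS-based fitting, and the fsp helper are assumptions of this sketch, not the author's implementation.

```python
# Minimal sketch of the FP function selection procedure (closed test) for one
# positive continuous predictor x and a Gaussian outcome y. Stata's fp and
# fp_select handle scaling, other families, and additional covariates.
import itertools
import numpy as np
import statsmodels.api as sm
from scipy import stats

POWERS = [-2, -1, -0.5, 0, 0.5, 1, 2, 3]  # conventional FP power set

def fp_terms(x, powers):
    """FP basis: power 0 means log(x); a repeated power multiplies by log(x)."""
    cols, prev = [], None
    for p in powers:
        t = np.log(x) if p == 0 else x ** p
        if prev is not None and p == prev:
            t = cols[-1] * np.log(x)
        cols.append(t)
        prev = p
    return np.column_stack(cols)

def fit(y, X=None):
    design = np.ones((len(y), 1)) if X is None else sm.add_constant(X)
    return sm.OLS(y, design).fit()

def best_fp(y, x, degree):
    combos = itertools.combinations_with_replacement(POWERS, degree)
    return max(((fit(y, fp_terms(x, c)), c) for c in combos), key=lambda r: r[0].llf)

def fsp(y, x, alpha=0.05):
    """Closed test: best FP2 vs null (4 df), vs linear (3 df), vs best FP1 (2 df)."""
    null, linear = fit(y), fit(y, x.reshape(-1, 1))
    fp1, powers1 = best_fp(y, x, 1)
    fp2, powers2 = best_fp(y, x, 2)
    lrt = lambda small, big, df: stats.chi2.sf(2.0 * (big.llf - small.llf), df)
    if lrt(null, fp2, 4) > alpha:
        return "omit x", None
    if lrt(linear, fp2, 3) > alpha:
        return "linear", (1,)
    if lrt(fp1, fp2, 2) > alpha:
        return "FP1", powers1
    return "FP2", powers2
```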

2018 ◽  
Vol 2018 ◽  
pp. 1-9
Author(s):  
Yuping Hu ◽  
Sanying Feng ◽  
Liugen Xue

We introduce a new partially linear functional additive model and consider the problem of variable selection for this model. Based on the functional principal components method and a centered spline basis function approximation, a new variable selection procedure is proposed using the smooth-threshold estimating equation (SEE). The proposed procedure automatically eliminates inactive predictors by setting the corresponding parameters to zero and simultaneously estimates the nonzero regression coefficients by solving the SEE. The approach avoids solving a convex optimization problem and is flexible and easy to implement. We establish the asymptotic properties of the resulting estimators under some regularity conditions. We apply the proposed procedure to analyze a real data set: the Tecator data set.
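
To make the smooth-threshold idea concrete, here is a toy sketch in the much simpler setting of ordinary linear regression; the paper applies the same mechanism after a functional principal component and centered spline basis expansion. The tuning constants lam and tau are hypothetical illustrations.

```python
# Toy smooth-threshold estimating equation (SEE) in plain linear regression,
# illustrating how inactive coefficients are set exactly to zero; lam and tau
# are hypothetical tuning constants.
import numpy as np

def see_linear(X, y, lam=0.1, tau=1.0):
    n, p = X.shape
    beta0 = np.linalg.lstsq(X, y, rcond=None)[0]               # initial consistent estimator
    delta = np.minimum(1.0, lam / np.abs(beta0) ** (1 + tau))  # smooth thresholds in [0, 1]
    D, I = np.diag(delta), np.eye(p)
    # Solve the SEE  (I - D) X'(y - X beta) - D beta = 0  for beta:
    beta = np.linalg.solve((I - D) @ X.T @ X + D, (I - D) @ X.T @ y)
    beta[np.isclose(delta, 1.0)] = 0.0                         # thresholded coefficients become exact zeros
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + 0.5 * rng.standard_normal(200)
print(see_linear(X, y).round(3))  # inactive coefficients are typically shrunk to exactly zero
```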


2017 ◽  
Vol 27 (11) ◽  
pp. 3340-3349
Author(s):  
Yingchun Zhou ◽  
Rong Huang ◽  
Shanshan Yu ◽  
Yanyuan Ma

Classification with a large number of predictors and biomarker discovery are becoming increasingly important in biological and medical research. This paper focuses on the classification of cardiovascular diseases based on electrocardiogram analysis, which involves many variables and many measurements within each variable. We propose an optimal quantile level selection procedure that reduces dimension by characterizing distributions with quantiles, and we combine it with classification tools to produce sensible classification and biomarker discovery results. A simulation study and an intensive study of a real data set are performed to illustrate the performance of the proposed method.
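
As a rough illustration of quantile-level selection (not the authors' exact procedure), the sketch below summarizes each subject's measurements per variable by a single quantile and picks the level with the best cross-validated accuracy; the grid, the logistic-regression classifier, and the use of one common level are simplifying assumptions.

```python
# Sketch: summarize each subject's per-variable measurements by one quantile
# and choose the quantile level by cross-validated classification accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def quantile_features(samples, tau):
    """samples: (n_subjects, n_variables, n_measurements) -> (n_subjects, n_variables)."""
    return np.quantile(samples, tau, axis=2)

def select_quantile_level(samples, labels, grid=np.linspace(0.05, 0.95, 19)):
    best = (None, -np.inf)
    for tau in grid:
        acc = cross_val_score(LogisticRegression(max_iter=1000),
                              quantile_features(samples, tau), labels, cv=5).mean()
        if acc > best[1]:
            best = (tau, acc)
    return best

# Simulated example: the classes differ mainly in the upper tail of variable 0.
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=120)
samples = rng.standard_normal((120, 5, 300))
samples[:, 0, :] += labels[:, None] * (rng.standard_normal((120, 300)) > 1.5)
tau, acc = select_quantile_level(samples, labels)
print(f"selected quantile level {tau:.2f}, cross-validated accuracy {acc:.2f}")
```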


Methodology ◽  
2018 ◽  
Vol 14 (4) ◽  
pp. 177-188 ◽  
Author(s):  
Martin Schultze ◽  
Michael Eid

Abstract. In the construction of scales intended for use in cross-cultural studies, the selection of items needs to be guided not only by traditional criteria of item quality but also by information about the measurement invariance of the scale. We present an approach to automated item selection which depicts the process as a combinatorial optimization problem and aims at finding a scale which fulfils predefined target criteria, such as measurement invariance across cultures. The search for an optimal solution is performed using an adaptation of the MAX–MIN Ant System algorithm. The approach is illustrated using an application to item selection for a personality scale assuming measurement invariance across multiple countries.
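
The sketch below illustrates the general shape of such an ant-system search over item subsets, with a placeholder objective standing in for measurement-invariance-based target criteria (which in practice would come from fitting multi-group measurement models); all names and parameter values here are illustrative assumptions.

```python
# Simplified ant-system search for a k-item scale; the placeholder objective
# stands in for measurement-invariance target criteria that would come from
# fitting multi-group measurement models. Pheromone values are kept within
# [tau_min, tau_max] in the spirit of the MAX-MIN Ant System.
import numpy as np

def ant_item_selection(objective, n_items, k, n_ants=20, n_iter=50,
                       evaporation=0.1, tau_min=0.1, tau_max=1.0, seed=0):
    rng = np.random.default_rng(seed)
    pheromone = np.full(n_items, tau_max)
    best_subset, best_value = None, -np.inf
    for _ in range(n_iter):
        for _ in range(n_ants):
            probs = pheromone / pheromone.sum()
            subset = np.sort(rng.choice(n_items, size=k, replace=False, p=probs))
            value = objective(subset)
            if value > best_value:
                best_subset, best_value = subset, value
        pheromone *= 1.0 - evaporation                    # evaporation
        pheromone[best_subset] += evaporation * tau_max   # deposit along the best scale found
        pheromone = np.clip(pheromone, tau_min, tau_max)
    return best_subset, best_value

# Toy objective: prefer items with high loadings that are similar across two
# groups -- a crude stand-in for a measurement-invariance criterion.
rng = np.random.default_rng(2)
load_g1, load_g2 = rng.uniform(0.3, 0.9, 30), rng.uniform(0.3, 0.9, 30)
objective = lambda items: load_g1[items].mean() - np.abs(load_g1[items] - load_g2[items]).mean()
print(ant_item_selection(objective, n_items=30, k=8))
```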


2017 ◽  
Vol 11 (1) ◽  
pp. 2-15 ◽  
Author(s):  
René Michel ◽  
Igor Schnakenburg ◽  
Tobias von Martens

Purpose. This paper addresses the effective selection of customers for direct marketing campaigns. It introduces a new method to forecast campaign-related uplifts (also known as incremental response modeling or net scoring). By means of these uplifts, only the most responsive customers are targeted by a campaign. The paper also calculates the financial impact of the new approach compared with classical (gross) scoring methods. Design/methodology/approach. First, gross and net scoring approaches to customer selection for direct marketing campaigns are compared. After that, it is shown how net scoring can be applied in practice with regard to different strategic objectives. Then, a new statistic for net scoring based on decision trees is developed. Finally, a business case based on real data from the financial sector is calculated to compare gross and net scoring approaches. Findings. Whereas gross scoring focuses on customers with a high probability of purchase, regardless of whether they are targeted by a campaign, net scoring identifies those customers who are most responsive to campaigns. A common scoring procedure, decision trees, can be enhanced by the new statistic to forecast these campaign-related uplifts. The business case shows that the choice of scoring method has a relevant impact on economic indicators. Practical implications. The business case demonstrates the contribution of net scoring to campaign effectiveness and efficiency. Furthermore, this paper suggests a framework for customer selection given strategic objectives, e.g. minimizing costs or maximizing (gross or lift) added value, and presents a new statistic that can be applied to common scoring procedures. Originality/value. Despite its leverage on the effectiveness of marketing campaigns, few contributions have addressed net scoring so far. The new χ2-statistic is a straightforward approach to enhancing decision trees for net scoring. Furthermore, this paper is the first to apply net scoring with regard to different strategic objectives.
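
To illustrate the net-scoring idea (without reproducing the specific chi-squared statistic derived in the paper), the sketch below scores a candidate decision-tree split by how strongly the campaign-related uplift, i.e. the response-rate difference between treated and control customers, differs between the two child nodes; the data and the weighting are hypothetical.

```python
# Illustrative net-scoring split criterion: the uplift in a node is the
# response-rate difference between treated and control customers, and a split
# is scored by how much the uplift differs between its child nodes (weighted
# by their sizes). This is a generic uplift-tree criterion, not the specific
# chi-squared statistic developed in the paper.
import numpy as np

def uplift(y, treated):
    """Response-rate difference between treated and control observations."""
    return y[treated].mean() - y[~treated].mean()

def split_score(y, treated, go_left):
    """Size-weighted squared uplift difference between the two child nodes."""
    n, left, right = len(y), go_left, ~go_left
    diff = uplift(y[left], treated[left]) - uplift(y[right], treated[right])
    return (left.sum() / n) * (right.sum() / n) * diff ** 2

# Toy campaign data: the campaign only works for customers with x > 0.
rng = np.random.default_rng(3)
n = 5000
x = rng.standard_normal(n)
treated = rng.integers(0, 2, n).astype(bool)
y = rng.binomial(1, 0.10 + 0.15 * treated * (x > 0))   # incremental effect only when x > 0
print(split_score(y, treated, go_left=(x > 0)))        # high score: split separates responsive customers
print(split_score(y, treated, go_left=(x > 1.5)))      # lower score for a less informative split
```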


Discourse ◽  
2020 ◽  
Vol 6 (5) ◽  
pp. 97-112
Author(s):  
M. A. Flaksman ◽  
Yu. V. Lavitskaya ◽  
Yu. G. Sedelkina ◽  
L. O. Tkacheva

Introduction. The present article aims to describe the procedure of stimuli selection for a psycho-semantic experiment on the visual perception of imitative words in a native (Russian) and a non-native (English) language. The methodology of the experiment is predominantly based on the “lexical decision” method. Thus, the aim of the article is to verify the selection procedure and to define clear-cut criteria for material selection. In particular, we introduce the de-iconization stage of imitative words as an important criterion for data pre-selection. De-iconization is the gradual loss of the iconic sound-sense link in an imitative word due to the parallel impact of regular sound changes and semantic shifts.

Methodology and sources. The research methodology is based on the works of S. V. Voronin, the founder of phonosemantics as a linguistic discipline in Russia, as well as on the works of his followers (including a co-author of this paper, M. A. Flaksman). The article also draws on the methodology of research on phonotactics and on psycho-semantic methods such as the lexical decision method. The main sources for stimuli selection are the Russian Etymological Dictionary by M. Vasmer, the Oxford English Dictionary, and the frequency dictionaries by O. N. Liashevskaya and S. A. Sharov. The classification of imitative words according to their de-iconization stages was carried out using the method of diachronic evaluation of the imitative lexicon.

Results and discussion. The rigorous selection procedure described in the article yielded 128 stimuli: an even number (64 + 64) of words and quasi-words. The quasi-words are coined according to phonotactic rules and follow the same pattern as the corresponding words. The group of real words consists of two sub-groups: 32 imitative and 32 non-imitative words. The words from these two sub-groups are homomorphous: they have the same number of syllables, the same frequency, and belong to the same parts of speech. The imitative words include onomatopoeic and sound-symbolic words of different sub-classes and de-iconization stages. The combination of the material selection methods discussed in this paper (especially the distinction of imitative words according to their de-iconization stage) aims to facilitate the experimental procedure and to eliminate chance factors.

Conclusion. The stimuli selection procedure introduced in this paper makes it possible to establish patterns in how the human brain functions during the visual perception of imitative words at different de-iconization stages.
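
As a small illustration of the homomorphous matching described above, the hypothetical sketch below pairs imitative with non-imitative words sharing the same syllable count, frequency band, and part of speech; the word list and attribute names are invented for the example.

```python
# Hypothetical sketch of matching imitative and non-imitative words on
# syllable count, frequency band and part of speech; the word list and
# attribute names are invented for illustration.
from collections import defaultdict

imitative = [
    {"word": "buzz", "syllables": 1, "freq_band": "high", "pos": "verb"},
    {"word": "clang", "syllables": 1, "freq_band": "low", "pos": "noun"},
]
non_imitative = [
    {"word": "run", "syllables": 1, "freq_band": "high", "pos": "verb"},
    {"word": "desk", "syllables": 1, "freq_band": "low", "pos": "noun"},
    {"word": "walk", "syllables": 1, "freq_band": "high", "pos": "verb"},
]

def match_stimuli(group_a, group_b):
    """Pair each word in group_a with an unused word in group_b sharing the same profile."""
    pool = defaultdict(list)
    for item in group_b:
        pool[(item["syllables"], item["freq_band"], item["pos"])].append(item["word"])
    pairs = []
    for item in group_a:
        key = (item["syllables"], item["freq_band"], item["pos"])
        if pool[key]:
            pairs.append((item["word"], pool[key].pop(0)))
    return pairs

print(match_stimuli(imitative, non_imitative))  # [('buzz', 'run'), ('clang', 'desk')]
```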


Author(s):  
A. Berveglieri ◽  
A. M. G. Tommaselli ◽  
E. Honkavaara

A hyperspectral camera operating in sequential acquisition mode produces spectral bands that are not recorded at the same instant and therefore have different exterior orientation parameters (EOPs) for each band. This study presents experiments on bundle adjustment with time-dependent polynomial models for band orientation of sequentially collected hyperspectral cubes. The technique was applied to a Rikola camera model. The purpose was to investigate the behaviour of the estimated polynomial parameters and the feasibility of using a minimum number of bands to estimate the EOPs. Simulated and real data were produced for the analysis of the parameters and of the accuracy in ground points. The tests compared conventional bundle adjustment with the polynomial models. The results showed that both techniques were comparable, indicating that the time-dependent polynomial model can be used to estimate the EOPs of all spectral bands without requiring a bundle adjustment of each band. The accuracy of the block adjustment was analysed based on the discrepancies obtained at checkpoints. The root mean square error (RMSE) indicated an accuracy of 1 GSD in planimetry and 1.5 GSD in altimetry when using a minimum of four bands per cube.
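
The sketch below illustrates only the parameterization behind the approach: each exterior orientation parameter of a sequentially acquired cube is modeled as a low-order polynomial in acquisition time, fitted here by simple least squares from a few hypothetical reference bands (in the paper the coefficients are estimated within the bundle adjustment itself).

```python
# Sketch of the time-dependent polynomial parameterization: each exterior
# orientation parameter (EOP) is modelled as a low-order polynomial in
# acquisition time, fitted here by ordinary least squares from a few
# hypothetical reference bands.
import numpy as np

def fit_eop_polynomials(times, eops, degree=2):
    """eops: (n_reference_bands, 6) array with X, Y, Z, omega, phi, kappa."""
    return [np.polyfit(times, eops[:, j], degree) for j in range(eops.shape[1])]

def predict_eops(coeffs, times):
    """Evaluate the fitted polynomials to obtain EOPs for the remaining bands."""
    return np.column_stack([np.polyval(c, times) for c in coeffs])

# Hypothetical EOPs of four reference bands in one cube (slowly drifting platform).
t_ref = np.array([0.0, 0.2, 0.4, 0.6])
eop_ref = np.array([
    [100.0, 200.0, 50.0, 0.010, 0.005, 1.500],
    [100.5, 200.3, 50.1, 0.012, 0.004, 1.502],
    [101.1, 200.5, 50.1, 0.013, 0.004, 1.505],
    [101.6, 200.8, 50.2, 0.015, 0.003, 1.507],
])
coeffs = fit_eop_polynomials(t_ref, eop_ref)
print(predict_eops(coeffs, np.array([0.1, 0.3, 0.5, 0.7])).round(3))
```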


2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Andreas Heinecke ◽  
Marta Tallarita ◽  
Maria De Iorio

Abstract. Background. Network meta-analysis (NMA) provides a powerful tool for the simultaneous evaluation of multiple treatments by combining evidence from different studies, allowing for direct and indirect comparisons between treatments. In recent years, NMA has become increasingly popular in the medical literature, and the underlying statistical methodologies are evolving in both the frequentist and Bayesian frameworks. Traditional NMA models are often based on the comparison of two treatment arms per study. These individual studies may measure outcomes at multiple time points that are not necessarily homogeneous across studies. Methods. In this article we present a Bayesian model based on B-splines for the simultaneous analysis of outcomes across time points, which allows for indirect comparison of treatments across different longitudinal studies. Results. We illustrate the proposed approach in simulations as well as on real data examples available in the literature and compare it with a model based on P-splines and one based on fractional polynomials, showing that our approach is flexible and overcomes the limitations of the latter. Conclusions. The proposed approach is computationally efficient and able to accommodate a large class of temporal treatment effect patterns, allowing for direct and indirect comparisons of widely varying shapes of longitudinal profiles.
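
The building block of such a model can be sketched as follows: a treatment-effect curve over time is represented by a B-spline basis whose coefficients would, in the paper's Bayesian NMA, receive priors and be estimated from the study data; the knots, follow-up window, and coefficients below are illustrative assumptions.

```python
# Building block only: a treatment-effect curve over time represented with a
# clamped cubic B-spline basis; in the paper the coefficients receive priors
# and are estimated within the Bayesian NMA.
import numpy as np
from scipy.interpolate import BSpline

degree = 3
t_min, t_max = 0.0, 12.0                       # follow-up window (e.g. weeks)
interior_knots = np.array([4.0, 8.0])
knots = np.concatenate([[t_min] * (degree + 1), interior_knots, [t_max] * (degree + 1)])
n_basis = len(knots) - degree - 1              # number of spline coefficients

def effect_curve(coefs, times):
    """Evaluate the spline-represented treatment effect at the given times."""
    return BSpline(knots, coefs, degree)(times)

coefs = np.array([0.0, 0.6, 1.2, 1.0, 0.8, 0.7])   # one illustrative coefficient per basis function
times = np.linspace(t_min, t_max, 7)
print(n_basis, effect_curve(coefs, times).round(2))
```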


2020 ◽  
Vol 34 (04) ◽  
pp. 4198-4205
Author(s):  
Yimin Huang ◽  
Weiran Huang ◽  
Liang Li ◽  
Zhenguo Li

Nowadays, model uncertainty has become one of the most important problems in both academia and industry. In this paper, we mainly consider the scenario in which a common model set is used for model averaging, instead of selecting a single final model via a model selection procedure, in order to account for model uncertainty and improve the reliability and accuracy of inferences. One main challenge here is to learn the prior over the model set. To tackle this problem, we propose two data-based algorithms to obtain proper priors for model averaging. The first is for the meta-learner: the analyst uses historical similar tasks to extract information about the prior. The second is for the base-learner: a subsampling method is used to process the data step by step. Theoretically, an upper bound on the risk of our algorithm is presented to guarantee performance in the worst case. In practice, both methods perform well in simulations and real data studies, especially with poor-quality data.
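
A generic stand-in for the subsampling (base-learner) recipe might look like the sketch below: each candidate model is repeatedly fitted on random subsamples and scored on the held-out part, and the average scores are turned into weights over the model set that are then used for averaging predictions; the candidate models, scoring rule, and softmax weighting are assumptions of this illustration, not the authors' algorithms.

```python
# Generic sketch of a subsampling-based prior over a model set, followed by
# model averaging with those weights; the candidate models, the negative-MSE
# score, and the softmax weighting are illustrative choices.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

def subsampling_prior(models, X, y, n_rounds=20, seed=0):
    rng = np.random.default_rng(seed)
    scores = np.zeros(len(models))
    for _ in range(n_rounds):
        X_tr, X_val, y_tr, y_val = train_test_split(
            X, y, test_size=0.3, random_state=int(rng.integers(1_000_000)))
        for j, model in enumerate(models):
            model.fit(X_tr, y_tr)
            scores[j] -= np.mean((model.predict(X_val) - y_val) ** 2)  # negative MSE
    scores /= n_rounds
    weights = np.exp(scores - scores.max())                            # softmax over average scores
    return weights / weights.sum()

def averaged_prediction(models, weights, X, y, X_new):
    preds = np.vstack([m.fit(X, y).predict(X_new) for m in models])
    return np.average(preds, axis=0, weights=weights)

rng = np.random.default_rng(4)
X = rng.standard_normal((300, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, -0.5]) + 0.3 * rng.standard_normal(300)
models = [LinearRegression(), Ridge(alpha=10.0), DecisionTreeRegressor(max_depth=3)]
prior = subsampling_prior(models, X, y)
print(prior.round(2), averaged_prediction(models, prior, X, y, X[:3]).round(2))
```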

