out of sample prediction
Recently Published Documents



2021 ◽  
pp. jnnp-2021-327211
Author(s):  
Anna K Bonkhoff ◽  
Tom Hope ◽  
Danilo Bzdok ◽  
Adrian G Guggisberg ◽  
Rachel L Hawe ◽  
...  

Introduction: Stroke causes different levels of impairment and the degree of recovery varies greatly between patients. The majority of recovery studies are biased towards patients with mild-to-moderate impairments, challenging a unified recovery process framework. Our aim was to develop a statistical framework to analyse recovery patterns in patients with severe and non-severe initial impairment and concurrently investigate whether they recovered differently. Methods: We designed a Bayesian hierarchical model to estimate 3–6 months upper limb Fugl-Meyer (FM) scores after stroke. When focusing on the explanation of recovery patterns, we addressed confounds affecting previous recovery studies and considered patients with FM-initial scores <45 only. We systematically explored different FM-breakpoints between severe/non-severe patients (FM-initial=5–30). In model comparisons, we evaluated whether impairment-level-specific recovery patterns indeed existed. Finally, we estimated the out-of-sample prediction performance for patients across the entire initial impairment range. Results: Recovery data were assembled from eight patient cohorts (n=489). Data were best modelled by incorporating two subgroups (breakpoint: FM-initial=10). Both subgroups recovered a comparable constant amount, but with different proportional components: severely affected patients recovered more the smaller their impairment, while non-severely affected patients recovered more the larger their initial impairment. Prediction of 3–6 months outcomes could be done with an R²=63.5% (95% CI=51.4% to 75.5%). Conclusions: Our work highlights the benefit of simultaneously modelling recovery of severely-to-non-severely impaired patients and demonstrates both shared and distinct recovery patterns. Our findings provide evidence that the severe/non-severe subdivision in recovery modelling is not an artefact of previous confounds. The presented out-of-sample prediction performance may serve as a benchmark to evaluate promising biomarkers of stroke recovery.
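
As a rough illustration of what an out-of-sample R² measures, the sketch below compares an in-sample R² with a K-fold cross-validated one. It is not the Bayesian hierarchical model of the abstract: the data are synthetic, the model is a one-predictor least-squares line, and every name (`fit_line`, `out_of_sample_r2`, the fold count, the noise level) is an assumption made for this example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def in_sample_r2(xs, ys):
    """R^2 computed on the same points that were used for fitting."""
    a, b = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse / sst


def out_of_sample_r2(xs, ys, k=5, seed=0):
    """K-fold R^2: each point is predicted by a model that never saw it."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    sse = sst = 0.0
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # The baseline is the training-fold mean, so it is also out of sample.
        train_mean = sum(ys[i] for i in train) / len(train)
        for i in test:
            sse += (ys[i] - (a + b * xs[i])) ** 2
            sst += (ys[i] - train_mean) ** 2
    return 1.0 - sse / sst


# Synthetic data (hypothetical): a linear signal plus Gaussian noise.
rng = random.Random(1)
xs = [rng.uniform(0, 60) for _ in range(200)]
ys = [10 + 0.7 * x + rng.gauss(0, 8) for x in xs]

r2_in = in_sample_r2(xs, ys)
r2_out = out_of_sample_r2(xs, ys)
print(f"in-sample R^2: {r2_in:.3f}, out-of-sample R^2: {r2_out:.3f}")
```

With a simple model and a few hundred points the two values tend to be close; the gap usually widens as the model grows more flexible relative to the sample size, which is why a held-out estimate is the more useful benchmark.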


2021 ◽  
pp. 095679762110159
Author(s):  
Sophie Van Der Zee ◽  
Ronald Poppe ◽  
Alice Havrileck ◽  
Aurélien Baillon

Language use differs between truthful and deceptive statements, but not all differences are consistent across people and contexts, complicating the identification of deceit in individuals. By relying on fact-checked tweets, we showed in three studies (Study 1: 469 tweets; Study 2: 484 tweets; Study 3: 24 models) how well personalized linguistic deception detection performs by developing the first deception model tailored to an individual: the 45th U.S. president. First, we found substantial linguistic differences between factually correct and factually incorrect tweets. We developed a quantitative model and achieved 73% overall accuracy. Second, we tested out-of-sample prediction and achieved 74% overall accuracy. Third, we compared our personalized model with linguistic models previously reported in the literature. Our model outperformed existing models by 5 percentage points, demonstrating the added value of personalized linguistic analysis in real-world settings. Our results indicate that factually incorrect tweets by the U.S. president are not random mistakes of the sender.


2021 ◽  
Author(s):  
Atsushi Ueshima ◽  
Hiroki Takikawa

Most human societies conduct a high degree of division of labor based on occupation. However, determining the occupational field that should be allocated a scarce resource such as vaccines is a topic of debate, especially considering the COVID-19 situation. Though it is crucial that we understand and anticipate people’s judgments on resource allocation prioritization, quantifying the concept of occupation is a difficult task. In this study, we investigated how well people’s judgments on vaccination prioritization for different occupations could be modeled by quantifying their knowledge representation of occupations as word vectors in a vector space. The results showed that the model that quantified occupations as word vectors indicated high out-of-sample prediction accuracy, enabling us to explore the psychological dimension underlying the participants’ judgments. These results indicated that using word vectors for modeling human judgments about everyday concepts allowed prediction of performance and understanding of judgment mechanisms.


2021 ◽  
Author(s):  
Lia Talozzi ◽  
Stephanie Forkel ◽  
Valentina Pacella ◽  
Victor Nozais ◽  
Maurizio Corbetta ◽  
...  

Abstract Stroke significantly impacts quality of life. However, the long-term cognitive evolution in stroke is poorly predictable at the individual level. There is an urgent need for a better prediction of long-term symptoms based on acute clinical neuroimaging data. Previous works have demonstrated a strong relationship between the location of white matter disconnections and clinical symptoms. However, rendering the entire space of possible disconnections-deficit associations optimally surveyable will allow for a systematic association between brain disconnections and cognitive-behavioural measures at the individual level. Here we present the most comprehensive framework, a composite morphospace to predict neuropsychological scores one year after stroke. Linking the latent disconnectome morphospace to neuropsychological outcomes yields biological insights available as the first comprehensive atlas of disconnectome-deficit relations across 86 neuropsychological scores. Out-of-sample prediction derived from this atlas achieved average accuracy over 80%, which is higher than any other framework. Our novel predictive framework is available as an interactive web application, the disconnectome symptoms discoverer (http://disconnectomestudio.bcblab.com), to provide the foundations for a new and practical approach to modelling cognition in stroke. Our atlas and web application will reduce the burden of cognitive deficits on patients, their families, and wider society while also helping to tailor personalized treatment programs and discover new targets for treatments. We expect the range of assessments and the predictive power of our framework to increase even further through future crowdsourcing.


2021 ◽  
pp. 1471082X2110576
Author(s):  
Laura Vana ◽  
Kurt Hornik

In this article, we propose a longitudinal multivariate model for binary and ordinal outcomes to describe the dynamic relationship among firm defaults and credit ratings from various raters. The latent probability of default is modelled as a dynamic process which contains additive firm-specific effects, a latent systematic factor representing the business cycle and idiosyncratic observed and unobserved factors. The joint set-up also facilitates the estimation of a bias for each rater which captures changes in the rating standards of the rating agencies. Bayesian estimation techniques are employed to estimate the parameters of interest. Several models are compared based on their out-of-sample prediction ability and we find that the proposed model outperforms simpler specifications. The joint framework is illustrated on a sample of publicly traded US corporates which are rated by at least one of the credit rating agencies S&P, Moody's and Fitch during the period 1995–2014.


Author(s):  
Seong D. Yun ◽  
Benjamin M. Gramig

Abstract This study scrutinizes spatial econometric models and specifications of crop yield response functions to provide a robust evaluation of empirical alternatives available to researchers. We specify 14 competing panel regression models of crop yield response to weather and site characteristics. Using county corn yields in the US, this study implements in-sample, out-of-sample, and bootstrapped out-of-sample prediction performance comparisons. Descriptive propositions and empirical results demonstrate the importance of spatial correlation and empirically support the fixed effects model with spatially dependent error structures. This study also emphasizes the importance of extensive model specification testing and evaluation of selection criteria for prediction.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Hana Šinkovec ◽  
Georg Heinze ◽  
Rok Blagus ◽  
Angelika Geroldinger

Abstract Background: For finite samples with binary outcomes penalized logistic regression such as ridge logistic regression has the potential of achieving smaller mean squared errors (MSE) of coefficients and predictions than maximum likelihood estimation. There is evidence, however, that ridge logistic regression can result in highly variable calibration slopes in small or sparse data situations. Methods: In this paper, we elaborate this issue further by performing a comprehensive simulation study, investigating the performance of ridge logistic regression in terms of coefficients and predictions and comparing it to Firth’s correction that has been shown to perform well in low-dimensional settings. In addition to tuned ridge regression where the penalty strength is estimated from the data by minimizing some measure of the out-of-sample prediction error or information criterion, we also considered ridge regression with pre-specified degree of shrinkage. We included ‘oracle’ models in the simulation study in which the complexity parameter was chosen based on the true event probabilities (prediction oracle) or regression coefficients (explanation oracle) to demonstrate the capability of ridge regression if truth was known. Results: Performance of ridge regression strongly depends on the choice of complexity parameter. As shown in our simulation and illustrated by a data example, values optimized in small or sparse datasets are negatively correlated with optimal values and suffer from substantial variability which translates into large MSE of coefficients and large variability of calibration slopes. In contrast, in our simulations pre-specifying the degree of shrinkage prior to fitting led to accurate coefficients and predictions even in non-ideal settings such as encountered in the context of rare outcomes or sparse predictors. Conclusions: Applying tuned ridge regression in small or sparse datasets is problematic as it results in unstable coefficients and predictions. In contrast, determining the degree of shrinkage according to some meaningful prior assumptions about true effects has the potential to reduce bias and stabilize the estimates.
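
The "tuned" variant described above, where the penalty strength is chosen by minimizing an out-of-sample prediction error, can be sketched as follows. This is a minimal illustration only, not the authors' simulation code: it assumes a plain gradient-descent fit, an objective of mean negative log-likelihood plus `lam` times the sum of squared slopes (intercept unpenalized), 5-fold cross-validated log loss as the error measure, and a small hypothetical penalty grid and synthetic dataset.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(b0, b, x):
    return sigmoid(b0 + sum(bj * xj for bj, xj in zip(b, x)))


def fit_ridge_logistic(X, y, lam, steps=300, lr=0.1):
    """Gradient descent on mean log loss + lam * sum(b_j^2); intercept unpenalized."""
    n, p = len(X), len(X[0])
    b0, b = 0.0, [0.0] * p
    for _ in range(steps):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            err = predict(b0, b, xi) - yi
            g0 += err
            for j in range(p):
                g[j] += err * xi[j]
        b0 -= lr * g0 / n
        b = [bj - lr * (gj / n + 2.0 * lam * bj) for bj, gj in zip(b, g)]
    return b0, b


def log_loss(b0, b, X, y):
    """Mean negative log-likelihood, with probabilities clipped away from 0 and 1."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(b0, b, xi), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(y)


def cv_loss(X, y, lam, k=5, seed=0):
    """Average held-out log loss over k folds for one penalty value."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    losses = []
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        b0, b = fit_ridge_logistic([X[i] for i in train], [y[i] for i in train], lam)
        losses.append(log_loss(b0, b, [X[i] for i in test], [y[i] for i in test]))
    return sum(losses) / k


# Small synthetic dataset (hypothetical): 60 observations, 3 predictors.
rng = random.Random(0)
true_b = [1.0, -1.0, 0.5]
X = [[rng.gauss(0, 1) for _ in true_b] for _ in range(60)]
y = [1 if rng.random() < predict(0.0, true_b, x) else 0 for x in X]

grid = [0.0, 0.01, 0.1, 1.0]  # assumed candidate penalties
cv = {lam: cv_loss(X, y, lam) for lam in grid}
best_lam = min(cv, key=cv.get)
print("cross-validated log loss per penalty:", cv)
print("selected penalty:", best_lam)
```

Rerunning the selection with a different `seed` or a fresh small sample is a quick way to see how much the chosen penalty can move around, which is the instability the abstract reports; the pre-specified alternative simply fixes `lam` in advance instead of reading it off `cv`.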


Author(s):  
Francesco Bloise ◽  
Paolo Brunori ◽  
Patrizio Piraino

Abstract Much of the global evidence on intergenerational income mobility is based on sub-optimal data. In particular, two-stage techniques are widely used to impute parental incomes for analyses of lower-income countries and for estimating long-run trends across multiple generations and historical periods. We propose applying machine learning methods to improve the reliability and comparability of such estimates. Supervised learning algorithms minimize the out-of-sample prediction error in the parental income imputation and provide an objective criterion for choosing across different specifications of the first-stage equation. We use our approach on data from the United States and South Africa to show that under common conditions it can limit the bias generally associated with mobility estimates based on imputed parental income.


Author(s):  
Moisés Arce ◽  
Sofía Vera

The Peruvian political landscape is dominated by the weakness of party organizations, the continuous rotation of political personalities, and, in turn, high electoral volatility and uncertainty. Nevertheless, we observe patterns of electoral competition that suggest candidates learn to capture the political center and compete over the continuation of an economic model that has sustained growth. We use this information to record the vote intention for the candidate viewed as the lesser evil. Our forecasting results predict a good share of the variation in political support for this candidate. The out-of-sample prediction also comes fairly close to the real electoral results. These findings provide some degree of electoral certainty in an area that, to date, remains understudied.

