A Two-Stage Joint Model for Nonlinear Longitudinal Response and a Time-to-Event with Application in Transplantation Studies

2012 ◽  
Vol 2012 ◽  
pp. 1-18 ◽  
Author(s):  
Magdalena Murawska ◽  
Dimitris Rizopoulos ◽  
Emmanuel Lesaffre

In transplantation studies, longitudinal measurements of important markers are often collected prior to the actual transplantation. Using only the last available measurement as a baseline covariate in a survival model for the time to graft failure discards the whole longitudinal evolution. We propose a two-stage approach that handles this type of data set using all available information. At the first stage, we summarize the longitudinal information with a nonlinear mixed-effects model, and at the second stage, we include the Empirical Bayes estimates of the subject-specific parameters as predictors in the Cox model for the time to allograft failure. To account for the fact that estimated subject-specific parameters are included in the model, we use a Monte Carlo approach and sample from the posterior distribution of the random effects given the observed data. Our proposal is exemplified by a study of the impact of renal resistance evolution on graft survival.
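The first-stage Empirical Bayes idea can be illustrated with a minimal sketch. For simplicity this assumes a linear random-intercept model rather than the paper's nonlinear one, and all variance values are hypothetical:

```python
import random
import statistics

# Hypothetical variance components for a random-intercept model:
# y_ij = mu + b_i + e_ij,  b_i ~ N(0, sigma_b^2),  e_ij ~ N(0, sigma_e^2)
MU, SIGMA_B, SIGMA_E = 10.0, 2.0, 1.0

def empirical_bayes_intercept(y_i, mu=MU, sigma_b=SIGMA_B, sigma_e=SIGMA_E):
    """Empirical Bayes estimate of b_i: the subject's raw mean deviation
    shrunk toward zero by the reliability ratio."""
    n_i = len(y_i)
    shrink = sigma_b**2 / (sigma_b**2 + sigma_e**2 / n_i)  # in (0, 1)
    return shrink * (statistics.mean(y_i) - mu)

random.seed(1)
# Simulate one subject whose true random intercept is b_i = 3
y_i = [MU + 3.0 + random.gauss(0.0, SIGMA_E) for _ in range(5)]
raw = statistics.mean(y_i) - MU
b_hat = empirical_bayes_intercept(y_i)
# The EB estimate is pulled toward zero relative to the raw deviation
print(raw, b_hat)
```

In the second stage, such estimates (or, as the paper proposes, Monte Carlo draws from the random effects' posterior, which propagate the first-stage uncertainty) would enter the Cox model as covariates.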

2013 ◽  
Vol 1 (2) ◽  
pp. 209-234 ◽  
Author(s):  
Pengyuan Wang ◽  
Mikhail Traskin ◽  
Dylan S. Small

Abstract: The before-and-after study with multiple unaffected control groups is widely applied to study treatment effects. Current methods usually assume that the control groups’ differences between the before and after periods, i.e. the group time effects, follow a normal distribution. However, there is usually no strong a priori evidence for the normality assumption, and there are not enough control groups to check it. We propose to model group time effects with a flexible skew-t distribution family and to consider a range of plausible skew-t distributions. Based on the skew-t assumption, we propose a robust-t method that guarantees the nominal significance level under a wide range of skew-t distributions, making the inference robust to misspecification of the distribution of group time effects. We also propose a two-stage approach, which has lower power than the robust-t method but provides an opportunity to conduct sensitivity analysis. The overall method of analysis is therefore to use the robust-t method to test over the hypothesized range of shapes of group variation; if the test fails to reject, use the two-stage method to conduct a sensitivity analysis to see whether there is a subset of group variation parameters for which we can be confident that there is a treatment effect. We apply the proposed methods to two datasets. One is from the Current Population Survey (CPS), used to study the impact of the Mariel Boatlift on Miami unemployment rates between 1979 and 1982. The other contains student enrollment and grade-repeating data from West Germany in the 1960s, with which we study the impact of the short school year of 1966–1967 on grade-repeating rates.
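The normal-theory baseline that the paper's robust-t method generalizes can be sketched minimally: compare the treated group's before-after change against the distribution of the control groups' changes. All numbers below are hypothetical:

```python
import math
import statistics

def normal_theory_test(treated_change, control_changes):
    """t-type statistic for whether the treated group's before-after change
    is unusual relative to the control groups' changes, assumed i.i.d. normal
    (the assumption the robust-t method relaxes via skew-t families)."""
    m = statistics.mean(control_changes)
    s = statistics.stdev(control_changes)
    n = len(control_changes)
    # Prediction-interval scaling: the treated change is a new draw
    return (treated_change - m) / (s * math.sqrt(1 + 1 / n))

# Hypothetical group time effects (before-to-after changes) for 8 controls
controls = [0.2, -0.1, 0.3, 0.0, 0.1, -0.2, 0.25, 0.05]
t_stat = normal_theory_test(2.0, controls)
print(round(t_stat, 2))
```

With few control groups, the normality of the control changes cannot be checked, which is exactly the motivation for testing over a whole range of skew-t shapes instead.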


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Yahya Albalawi ◽  
Jim Buckley ◽  
Nikola S. Nikolov

Abstract: This paper presents a comprehensive evaluation of data pre-processing and word-embedding techniques in the context of Arabic document classification, in the domain of health-related communication on social media. We evaluate 26 text pre-processing techniques applied to Arabic tweets within the process of training a classifier to identify health-related tweets. For this task we use the (traditional) machine learning classifiers KNN, SVM, Multinomial NB and Logistic Regression. Furthermore, we report experimental results with the deep learning architectures BLSTM and CNN for the same text classification problem. Since word embeddings are more typically used as the input layer in deep networks, in the deep learning experiments we evaluate several state-of-the-art pre-trained word embeddings with the same text pre-processing applied. To achieve these goals, we use two data sets: one for both training and testing, and another for testing the generality of our models only. Our results point to the conclusion that only four of the 26 pre-processing techniques improve classification accuracy significantly. For the first data set of Arabic tweets, we found that Mazajak CBOW pre-trained word embeddings as the input to a BLSTM deep network led to the most accurate classifier, with an F1 score of 89.7%. For the second data set, Mazajak Skip-Gram pre-trained word embeddings as the input to BLSTM led to the most accurate model, with an F1 score of 75.2% and accuracy of 90.7%, compared to an F1 score of 90.8% but lower accuracy of 70.89% achieved by Mazajak CBOW with the same architecture. Our results also show that the performance of the best traditional classifier we trained is comparable to that of the deep learning methods on the first dataset, but significantly worse on the second.
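Two of the commonly evaluated Arabic pre-processing steps, diacritics removal and orthographic normalization, can be sketched with the standard library. This is a simplified illustration, not the paper's exact pipeline of 26 techniques:

```python
import re

# Arabic tashkeel (diacritic) marks occupy U+064B..U+0652
DIACRITICS = re.compile(r"[\u064B-\u0652]")

def normalize_arabic(text):
    """Simplified Arabic normalization: strip diacritics and unify
    common orthographic variants."""
    text = DIACRITICS.sub("", text)                        # remove tashkeel
    text = re.sub("[\u0623\u0625\u0622]", "\u0627", text)  # hamza alefs -> bare alef
    text = text.replace("\u0649", "\u064A")                # alef maqsura -> ya
    text = text.replace("\u0629", "\u0647")                # ta marbuta -> ha
    return text

# "ahlan" with hamza alef and diacritics, reduced to its normalized form
print(normalize_arabic("\u0623\u064E\u0647\u0652\u0644\u0627\u064B"))
```

Normalizations like these reduce sparsity in tweet vocabulary before either TF-based classifiers or embedding lookups, which is why the choice of pre-processing interacts with the choice of pre-trained embedding.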


2021 ◽  
pp. 000276422110216
Author(s):  
Kazimierz M. Slomczynski ◽  
Irina Tomescu-Dubrow ◽  
Ilona Wysmulek

This article proposes a new approach to analyzing protest participation measured in surveys of uneven quality. Because single international survey projects cover only a fraction of the world’s nations in specific periods, researchers increasingly turn to ex-post harmonization of different survey data sets not a priori designed to be comparable. However, very few scholars systematically examine the impact of survey data quality on substantive results. We argue that variation in the source data, especially deviations from standards of survey documentation, data processing, and computer files—proposed by methodologists of Total Survey Error, Survey Quality Monitoring, and Fitness for Intended Use—is important for analyzing protest behavior. In particular, we apply the Survey Data Recycling framework to investigate the extent to which indicators of attending demonstrations and signing petitions in 1,184 national survey projects are associated with measures of data quality, controlling for variability in the questionnaire items. We demonstrate that the null hypothesis of no impact of survey quality measures on indicators of protest participation must be rejected. Measures of survey documentation, data processing, and computer records, taken together, explain over 5% of the intersurvey variance in the proportions of the population attending demonstrations or signing petitions.
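The "variance explained" quantity at the heart of the finding is an R² from regressing survey-level protest proportions on quality indicators. A minimal single-predictor sketch with hypothetical survey-level data (the paper uses multiple quality measures across 1,184 surveys):

```python
import statistics

def r_squared(x, y):
    """Share of variance in y explained by a single predictor x under
    simple OLS; equals the squared Pearson correlation."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy ** 2) / (sxx * syy)

# Hypothetical survey-level data: a documentation-quality score and the
# proportion of respondents reporting demonstration attendance
quality = [0.9, 0.8, 0.6, 0.7, 0.4, 0.5]
protest = [0.15, 0.14, 0.10, 0.12, 0.08, 0.09]
print(round(r_squared(quality, protest), 3))
```

With several quality measures, the same logic extends to the multiple-regression R², which is the form of the "over 5% of intersurvey variance" result.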


1994 ◽  
Vol 33 (04) ◽  
pp. 390-396 ◽  
Author(s):  
J. G. Stewart ◽  
W. G. Cole

Abstract: Metaphor graphics are data displays designed to look like corresponding variables in the real world, but in a non-literal sense of “look like”. The impact of these graphics on human problem solving has been evaluated twice, with conflicting results. The present experiment attempted to clarify the discrepancy between these findings by using a complex task in which expert subjects interpreted respiratory data. The metaphor graphic display led to interpretations twice as fast as a tabular (flowsheet) format, suggesting that the conflict between the earlier studies is due either to differences in training or to differences in goodness of metaphor. Findings to date indicate that metaphor graphics work with complex as well as simple data sets, with pattern-detection as well as single-number reporting tasks, and with expert as well as novice subjects.


2015 ◽  
Vol 8 (1) ◽  
pp. 421-434 ◽  
Author(s):  
M. P. Jensen ◽  
T. Toto ◽  
D. Troyan ◽  
P. E. Ciesielski ◽  
D. Holdridge ◽  
...  

Abstract. The Midlatitude Continental Convective Clouds Experiment (MC3E) took place during the spring of 2011, centered in north-central Oklahoma, USA. The main goal of this field campaign was to capture the dynamical and microphysical characteristics of precipitating convective systems in the US Central Plains. A major component of the campaign was a six-site radiosonde array designed to capture the large-scale variability of the atmospheric state, with the intent of deriving model forcing data sets. Over the course of the 46-day MC3E campaign, a total of 1362 radiosondes were launched from the enhanced sonde network. This manuscript provides details on the instrumentation used as part of the sounding array and the data processing activities, including quality checks and humidity bias corrections, along with an analysis of the impacts of bias correction and algorithm assumptions on the determination of convective levels and indices. It is found that corrections for known radiosonde humidity biases, and assumptions regarding the characteristics of the surface convective parcel, result in significant differences in the derived values of convective levels and indices in many soundings. In addition, the impact of including the humidity corrections and quality controls on the thermodynamic profiles used in the derivation of a large-scale model forcing data set is investigated. The results show a significant impact on the derived large-scale vertical velocity field, illustrating the importance of addressing these humidity biases.
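The sensitivity of derived convective levels to humidity bias can be sketched with Espy's well-known approximation of roughly 125 m of lifting condensation level (LCL) height per degree Celsius of dewpoint depression. The campaign's actual processing uses full parcel computations, and the bias factor below is hypothetical:

```python
import math

def dewpoint_c(temp_c, rh_percent):
    """Dewpoint from temperature and relative humidity (Magnus formula)."""
    a, b = 17.625, 243.04
    gamma = math.log(rh_percent / 100.0) + a * temp_c / (b + temp_c)
    return b * gamma / (a - gamma)

def lcl_height_m(temp_c, rh_percent):
    """Espy's approximation: ~125 m per degree C of dewpoint depression."""
    return 125.0 * (temp_c - dewpoint_c(temp_c, rh_percent))

# A hypothetical 4% multiplicative dry bias in reported RH, then corrected
t, rh_raw = 30.0, 60.0
rh_corrected = rh_raw * 1.04
print(round(lcl_height_m(t, rh_raw)), round(lcl_height_m(t, rh_corrected)))
```

Even a few percent of relative-humidity correction moves the LCL by tens of meters, which is how documented sonde humidity biases propagate into convective levels and indices.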


Radiocarbon ◽  
2012 ◽  
Vol 54 (3-4) ◽  
pp. 449-474 ◽  
Author(s):  
Sturt W Manning ◽  
Bernd Kromer

The debate over the dating of the Santorini (Thera) volcanic eruption has seen sustained efforts to criticize or challenge the radiocarbon dating of this time horizon. We consider some of the relevant areas of possible movement in the 14C dating—and, in particular, any plausible mechanisms to support as late (most recent) a date as possible. First, we report and analyze data investigating the scale of apparent possible 14C offsets (growing season related) in the Aegean-Anatolia-east Mediterranean region (excluding the southern Levant and especially pre-modern, pre-dam Egypt, which is a distinct case), and find no evidence for more than very small possible offsets from several cases. This topic is thus not an explanation for current differences in dating in the Aegean and at best provides only a few years of latitude. Second, we consider some aspects of the accuracy and precision of 14C dating with respect to the Santorini case. While the existing data appear robust, we nonetheless speculate that examination of the frequency distribution of the 14C data on short-lived samples from the volcanic destruction level at Akrotiri on Santorini (Thera) may indicate that the average value of the overall data sets is not necessarily the most appropriate 14C age to use for dating this time horizon. We note the recent paper of Soter (2011), which suggests that in such a volcanic context some (small) age increment may be possible from diffuse CO2 emissions (the effect is hypothetical at this stage and has not been observed in the field), and that "if short-lived samples from the same stratigraphic horizon yield a wide range of 14C ages, the lower values may be the least altered by old CO2."
In this context, it might be argued that a substantive “low” grouping of 14C ages observable within the overall 14C data sets on short-lived samples from the Thera volcanic destruction level, centered about 3326–3328 BP, is perhaps more representative of the contemporary atmospheric 14C age (without any volcanic CO2 contamination). This is a subjective argument (since, in statistical terms, the existing studies using the weighted average remain valid) that looks to support as late a date as reasonable from the 14C data. The impact of employing this revised 14C age is discussed. In general, a late 17th century BC date range is found (to remain) to be most likely even if such a late-dating strategy is followed—a late 17th century BC date range is thus a robust finding from the 14C evidence even allowing for various possible variation factors. However, the possibility of a mid-16th century BC date (within ∼1593–1530 cal BC) is increased when compared against previous analyses if the Santorini data are considered in isolation.
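The standard pooling the abstract refers to is the inverse-variance weighted average of the individual 14C determinations. A minimal sketch with hypothetical determinations (not the actual Akrotiri measurements):

```python
import math

def weighted_mean_14c(ages_bp, sigmas):
    """Inverse-variance weighted mean of 14C determinations (BP) and its
    1-sigma error, the conventional way to pool same-event dates."""
    weights = [1.0 / s**2 for s in sigmas]
    mean = sum(w * a for w, a in zip(weights, ages_bp)) / sum(weights)
    err = math.sqrt(1.0 / sum(weights))
    return mean, err

# Hypothetical short-lived-sample determinations with 1-sigma errors
ages = [3345, 3326, 3328, 3350, 3327]
sigmas = [12, 10, 10, 15, 11]
mean, err = weighted_mean_14c(ages, sigmas)
print(round(mean, 1), round(err, 1))
```

Preferring a "low" subgroup of determinations (e.g. values near 3326–3328 BP) over the weighted mean of all data is precisely the subjective, late-dating step the authors examine.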


2012 ◽  
Vol 98 (4) ◽  
pp. 428-433 ◽  
Author(s):  
Mahmood Reza Gohari ◽  
Reza Khodabakhshi ◽  
Javad Shahidi ◽  
Zeinab Moghadami Fard ◽  
Hossein Foadzi ◽  
...  

Minerals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 621
Author(s):  
Elaheh Talebi ◽  
W. Pratt Rogers ◽  
Tyler Morgan ◽  
Frank A. Drews

Mine workers operate heavy equipment while experiencing varying psychological and physiological impacts of fatigue. These impacts vary in scope and severity across operators and across unique mine operations. Previous studies show the impact of fatigue on individuals, raising substantial concerns about operational safety. Unfortunately, while data exist to illustrate the risks, the mechanisms and complex pattern of contributors to fatigue are not sufficiently understood, illustrating the need for new methods to model and manage the severity of fatigue’s impact on performance and safety. Modern technology and computational intelligence can provide tools to improve practitioners’ understanding of workforce fatigue. Many mines have invested in fatigue monitoring technology (PERCLOS, EEG caps, etc.) as part of their health and safety control systems. Unfortunately, these systems provide “lagging indicators” of fatigue and, in many instances, issue fatigue alerts too late in the worker fatigue cycle. Thus, the following question arises: can other operational technology systems provide leading indicators that managers and front-line supervisors can use to help their operators cope with fatigue? This paper explores the data sets commonly available at modern mines and how these operational data sets can be used to model fatigue. The available data sets include operational, health and safety, equipment health, fatigue monitoring and weather data. A machine learning (ML) algorithm is presented as a tool to process and model a complex issue such as fatigue; ML is used in this study to identify potential leading indicators that can help management make better decisions. Initial findings confirm existing knowledge tying fatigue to time of day and hours worked. These are first-generation models; refined models will follow.
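A first screening step for leading indicators is to rank candidate operational features by their association with fatigue-alert outcomes. A minimal sketch using the point-biserial correlation; the feature names and shift records below are hypothetical, and the paper's actual models are ML-based rather than this simple screen:

```python
import math
import statistics

def point_biserial(feature, events):
    """Point-biserial correlation between a continuous operational feature
    and a binary fatigue-alert outcome (1 = alert fired that shift)."""
    ones = [f for f, e in zip(feature, events) if e == 1]
    zeros = [f for f, e in zip(feature, events) if e == 0]
    p = len(ones) / len(feature)
    s = statistics.pstdev(feature)
    return (statistics.mean(ones) - statistics.mean(zeros)) / s * math.sqrt(p * (1 - p))

# Hypothetical shift records: hours worked, haul-cycle time variance,
# and whether a fatigue alert fired that shift
hours = [6, 8, 10, 12, 7, 11, 9, 12, 6, 10]
cycle_var = [1.1, 0.9, 1.0, 1.2, 1.0, 1.1, 0.9, 1.3, 1.0, 1.1]
alerts = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]

candidates = {"hours_worked": point_biserial(hours, alerts),
              "cycle_time_variance": point_biserial(cycle_var, alerts)}
ranked = sorted(candidates.items(), key=lambda kv: -abs(kv[1]))
print(ranked)
```

In this toy data, hours worked ranks first, consistent with the abstract's finding that fatigue ties to hours worked and time of day.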


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Ying Li ◽  
Yung-Ho Chiu ◽  
Tai-Yu Lin ◽  
Hongyi Cen

Purpose: As more women are being appointed to senior and top management positions and invited to sit on boards of directors, they now participate directly in strategic company decision-making. Because female directors have been found to provide new ideas, to increase company competitiveness, efficiency and performance, and to bring more external resources to a company than male directors, this paper puts female directors as a variable into data envelopment analysis (DEA) and statistical models to explore their effect on operating performance. The DEA first quantifies and measures company efficiencies, after which the statistical model analyzes the correlations between the variables to identify the impact of female decision makers on operating efficiencies in state-owned and private enterprises. Design/methodology/approach: A novel two-stage, meta-hybrid dynamic DEA was developed to explore Chinese cultural media company efficiencies under optimal input and output resource allocations, after which Tobit regression was applied to determine the effect of female executives on these efficiencies. Findings: From 2012 to 2016, the overall efficiencies of Chinese state-owned cultural media enterprises were better than those of the private cultural media enterprises, as were their overall technology gaps (TGs). Originality/value: Previous research has tended to focus on the causal relationships between female senior executives and business performance; there have been few studies on the relationship between female executives and company performance from an efficiency perspective (optimal resource allocation). This paper is therefore the first to develop a novel two-stage, meta-hybrid dynamic DEA to examine Chinese cultural media enterprise efficiencies, and the first to apply Tobit regression to assess the effect of female executives on those efficiencies.
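The DEA efficiency concept can be illustrated in its simplest form: with one input and one output under constant returns to scale, a unit's CCR efficiency is just its output/input ratio relative to the best unit. The paper's meta-hybrid dynamic model generalizes this to multiple inputs, outputs and periods via linear programming; the company figures below are hypothetical:

```python
def dea_efficiency(inputs, outputs):
    """CCR (constant returns to scale) efficiency scores for the
    single-input, single-output case: each unit's output/input ratio
    divided by the best ratio in the sample."""
    ratios = [o / i for i, o in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]

# Hypothetical media companies: operating cost in, revenue out
cost = [100.0, 80.0, 120.0, 90.0]
revenue = [200.0, 180.0, 210.0, 150.0]
print([round(e, 3) for e in dea_efficiency(cost, revenue)])
```

Because such efficiency scores are bounded in (0, 1], an ordinary regression of scores on explanatory variables (such as female-director presence) is censored, which is why the second stage uses Tobit regression.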

