Maximum Likelihood Inference for Multiple Regression with Missing Values: A Simulation Study

Author(s):  
Roderick J. A. Little
2021 ◽  
Author(s):  
Suha Naser-Khdour ◽  
Rob Lanfear ◽  
Bui Quang Minh

Phylogenetic inference typically assumes that the data has evolved under Stationary, Reversible and Homogeneous (SRH) conditions. Many empirical and simulation studies have shown that assuming SRH conditions can lead to significant errors in phylogenetic inference when the data violates these assumptions. Yet, many simulation studies focused on extreme non-SRH conditions that represent worst-case scenarios and not the average empirical dataset. In this study, we simulate datasets under various degrees of non-SRH conditions using empirically derived parameters to mimic real data and examine the effects of incorrectly assuming SRH conditions on inferring phylogenies. Our results show that maximum likelihood inference is generally quite robust to a wide range of SRH model violations but is inaccurate under extreme convergent evolution.


2021 ◽  
Author(s):  
Jakob Raymaekers ◽  
Peter J. Rousseeuw

Many real data sets contain numerical features (variables) whose distribution is far from normal (Gaussian). Instead, their distribution is often skewed. To handle such data, it is customary to preprocess the variables to make them more normal. The Box–Cox and Yeo–Johnson transformations are well-known tools for this. However, the standard maximum likelihood estimator of their transformation parameter is highly sensitive to outliers, and will often try to move outliers inward at the expense of the normality of the central part of the data. We propose a modification of these transformations as well as an estimator of the transformation parameter that is robust to outliers, so the transformed data can be approximately normal in the center while a few outliers may deviate from it. The proposed method compares favorably to existing techniques in an extensive simulation study and on real data.
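The outlier sensitivity described in this abstract can be illustrated with a minimal sketch of the classical (non-robust) Box–Cox maximum likelihood fit. This is not the authors' robust estimator. The function names (`boxcox`, `profile_loglik`, `ml_lambda`), the grid range, and the synthetic data (exponentiated normal quantiles plus one large added value) are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def boxcox(x, lam):
    """Box-Cox transform of a single positive value x."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def profile_loglik(data, lam):
    """Box-Cox profile log-likelihood in lam (additive constants dropped)."""
    n = len(data)
    y = [boxcox(x, lam) for x in data]
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n  # ML variance, divides by n
    log_jacobian = (lam - 1.0) * sum(math.log(x) for x in data)
    return -0.5 * n * math.log(var) + log_jacobian


def ml_lambda(data):
    """Grid-search maximizer over lam in [-2, 2] with step 0.01 (assumed grid)."""
    grid = [k / 100 for k in range(-200, 201)]
    return max(grid, key=lambda lam: profile_loglik(data, lam))


# Hypothetical data: exponentiated normal quantiles, so log(x) is symmetric
# and the classical estimate should sit near lam = 0 (the log transform).
n = 50
z = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
clean = [math.exp(v) for v in z]

# Add a single large outlier (an assumed contamination, far in the right tail).
contaminated = clean + [math.exp(6.0)]

lam_clean = ml_lambda(clean)
lam_contaminated = ml_lambda(contaminated)
print("lambda (clean):       ", lam_clean)
print("lambda (one outlier): ", lam_contaminated)
```

In this sketch the single outlier drags the estimated parameter below the clean-data value: the fit compresses the right tail to pull the outlier inward, which distorts the central part of the data. That is the behavior the abstract attributes to the standard estimator.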


Author(s):  
Duha Hamed ◽  
Ahmad Alzaghal

A new generalized class of Lindley distributions is introduced in this paper. This new class is called the T-Lindley{Y} class of distributions, and it is generated by using the quantile functions of the uniform, exponential, Weibull, log-logistic, logistic and Cauchy distributions. Its statistical properties, including the modes, moments and Shannon’s entropy, are discussed. Three new generalized Lindley distributions are investigated in more detail. The unknown parameters are estimated by maximum likelihood, and a simulation study is carried out. Lastly, the usefulness of the proposed class in fitting lifetime data is illustrated using four different data sets. The application section demonstrates the strength of members of the T-Lindley{Y} class in modeling both unimodal and bimodal data sets. A member of the T-Lindley{Y} class of distributions outperformed other known distributions in modeling unimodal and bimodal lifetime data sets.


2014 ◽  
Vol 111 (46) ◽  
pp. 16448-16453 ◽  
Author(s):  
Yun Yu ◽  
Jianrong Dong ◽  
Kevin J. Liu ◽  
Luay Nakhleh

Complexity ◽  
2017 ◽  
Vol 2017 ◽  
pp. 1-14
Author(s):  
Ahmed A. Mahmoud ◽  
Sarat C. Dass ◽  
Mohana S. Muthuvalu ◽  
Vijanth S. Asirvadam

This article presents statistical inference methodology based on maximum likelihood for delay differential equation models in the univariate setting. Maximum likelihood inference is obtained for single and multiple unknown delay parameters as well as other parameters of interest that govern the trajectories of the delay differential equation models. The maximum likelihood estimator is obtained using adaptive grid and Newton-Raphson algorithms. Our methodology correctly estimates the delay parameters as well as other unknown parameters (such as the initial starting values) of the dynamical system based on simulation data. We also develop methodology to compute the information matrix and confidence intervals for all unknown parameters based on the likelihood inferential framework. We present three illustrative examples related to biological systems. The computations were carried out with the help of the mathematical software MATLAB® 8.0 R2014b.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Sae Ono ◽  
Hiroto Ogi ◽  
Masato Ogawa ◽  
Daisuke Nakamura ◽  
Teruhiko Nakamura ◽  
...  

Background: Sleep problems in preschool children can stunt their health and growth. However, the factors that cause sleep problems in children are not well understood. The aim of this study was to determine the relationship between parents’ health literacy (HL) and children’s sleep problems. The study was conducted at two kindergartens, two nursery schools, and a center for early childhood education in Chitose-city, Hokkaido, Japan. Method: This study used a multicenter cross-sectional design. The sample comprised 354 preschoolers (aged 3–6 years) and their parents. In families with two or more children attending the same facility, only the oldest child was asked to participate in the study. Participants whose completed questionnaires had missing values were excluded. Children’s sleep problems were assessed using the Japanese version of the Children’s Sleep Habits Questionnaire (CSHQ-J). Parents’ HL was assessed using the 14-item Health Literacy Scale (HLS-14). The parents were classified into two groups (high HL group and low HL group). Multiple regression modelling was used to determine the association between HLS-14 and CSHQ-J scores. Results: Of the 354 parents, 255 (72%) were in the high HL group and 99 (28%) in the low HL group. The mean CSHQ-J score was significantly lower in the high HL group than in the low HL group (45.3 ± 6.0 points vs. 46.8 ± 5.9 points, p = 0.043). In multiple regression analyses, parents’ HL was independently associated with their children’s CSHQ-J score after adjusting for all confounding factors (adjusted R² = 0.22, β = −0.11; p = 0.043). Conclusions: Parents’ HL appears to affect their children’s sleep problems. This finding suggests that parents’ HL may be a target for intervention to improve children’s sleep problems.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Mar Rodríguez-Girondo ◽  
Niels van den Berg ◽  
Michel H. Hof ◽  
Marian Beekman ◽  
Eline Slagboom

Background: Although human longevity tends to cluster within families, genetic studies on longevity have had limited success in identifying longevity loci. One of the main causes of this limited success is the selection of participants. Studies generally include sporadically long-lived individuals, i.e. individuals with the longevity phenotype but without a genetic predisposition for longevity. The inclusion of these individuals causes phenotype heterogeneity, which results in power reduction and bias. A way to avoid sporadically long-lived individuals and reduce sample heterogeneity is to include family history of longevity as a selection criterion using a longevity family score. A main challenge when developing family scores is the large difference in family size, arising from real differences in sibship sizes or from missing data. Methods: We discussed the statistical properties of two existing longevity family scores, the Family Longevity Selection Score (FLoSS) and the Longevity Relatives Count (LRC) score, and we evaluated their performance in dealing with differential family size. We proposed a new longevity family score, the mLRC score, an extension of the LRC based on random effects modeling, which is robust to family size and missing values. The performance of the new mLRC as a selection tool was evaluated in an intensive simulation study and illustrated in a large real dataset, the Historical Sample of the Netherlands (HSN). Results: Empirical scores such as the FLoSS and LRC cannot properly deal with differential family size and missing data. Our simulation study showed that the mLRC is not affected by family size and provides more accurate selections of long-lived families. The analysis of 1105 sibships of the Historical Sample of the Netherlands showed that the selection of long-lived individuals based on the mLRC score predicts excess survival in the validation set better than the selection based on the LRC score. Conclusions: Model-based score systems such as the mLRC score help to reduce heterogeneity in the selection of long-lived families. The power of future studies into the genetics of longevity can likely be improved, and their bias reduced, by selecting long-lived cases using the mLRC.

