statistical assumptions
Recently Published Documents


2021 ◽  
Vol 21 (4) ◽  
pp. 1021-1027
Author(s):  
Brian P. Shaw

Researchers and psychometricians have long used Cronbach’s α as a measure of reliability. However, there have been growing calls to replace Cronbach’s α with measures that have more defensible assumptions. One of the most common and straightforward recommended reliability estimates is ω. After a review of reliability and its estimation in Stata, I introduce the community-contributed command omegacoef. This command reports McDonald’s ω in a format similar to the base alpha command. omegacoef provides Stata users the ability to easily compute estimates of reliability with the confidence that the necessary statistical assumptions are met.
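
As a point of reference for the α that this abstract contrasts with ω, the textbook formula α = k/(k−1) · (1 − Σ item variances / variance of total scores) can be sketched in a few lines of Python. This is only an illustration of the standard definition, not the omegacoef command or Stata's alpha; the function name and the items-as-lists layout are assumptions made for the example.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha from a list of item-score lists.

    `items` holds one list per item, each with one score per respondent
    (a hypothetical layout chosen for this sketch).
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(items[0])
    if any(len(item) != n for item in items):
        raise ValueError("every item needs one score per respondent")

    # Total score of each respondent across all items.
    totals = [sum(item[i] for item in items) for i in range(n)]

    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance(totals)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)
```

Items that are perfectly consistent with one another give α = 1; uncorrelated items push the estimate toward 0.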


Cureus ◽  
2021 ◽  
Author(s):  
Anthony V Christiano ◽  
Daniel A London ◽  
Joseph P Barbera ◽  
Gregory M Frechette ◽  
Stephen R Selverian ◽  
...  

PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257141
Author(s):  
Corey J. A. Bradshaw ◽  
Justin M. Chalker ◽  
Stefani A. Crabtree ◽  
Bart A. Eijkelkamp ◽  
John A. Long ◽  
...  

The pursuit of simple, yet fair, unbiased, and objective measures of researcher performance has occupied bibliometricians and the research community as a whole for decades. However, despite the diversity of available metrics, most are either complex to calculate or not readily applied in the most common assessment exercises (e.g., grant assessment, job applications). The ubiquity of metrics like the h-index (h papers with at least h citations) and its time-corrected variant, the m-quotient (h-index ÷ number of years publishing) therefore reflects their ease of use rather than their capacity to differentiate researchers fairly among disciplines, career stages, or genders. We address this problem here by defining an easily calculated index based on publicly available citation data (Google Scholar) that corrects for most biases and allows assessors to compare researchers at any stage of their career and from any discipline on the same scale. Our ε′-index violates fewer statistical assumptions relative to other metrics when comparing groups of researchers, and can be easily modified to remove inherent gender biases in citation data. We demonstrate the utility of the ε′-index using a sample of 480 researchers with Google Scholar profiles, stratified evenly into eight disciplines (archaeology, chemistry, ecology, evolution and development, geology, microbiology, ophthalmology, palaeontology), three career stages (early, mid-, late-career), and two genders. We advocate the use of the ε′-index whenever assessors must compare research performance among researchers of different backgrounds, but emphasize that no single index should be used exclusively to rank researcher capability.
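
The two baseline metrics defined in this abstract, the h-index (h papers with at least h citations) and the m-quotient (h-index ÷ number of years publishing), are simple enough to sketch directly. The Python below follows only those two definitions; it does not implement the ε′-index, and the function names and input format are assumptions for illustration.

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def m_quotient(citations, years_publishing):
    """h-index divided by the number of years the researcher has published."""
    if years_publishing <= 0:
        raise ValueError("years_publishing must be positive")
    return h_index(citations) / years_publishing
```

For example, citation counts of 10, 8, 5, 4 and 3 give an h-index of 4, and over 8 years of publishing an m-quotient of 0.5.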


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Leonardo Alexandre ◽  
Rafael S. Costa ◽  
Rui Henriques

Background: A considerable number of data mining approaches for biomedical data analysis, including state-of-the-art associative models, require a form of data discretization. Although diverse discretization approaches have been proposed, they generally work under a strict set of statistical assumptions which are arguably insufficient to handle the diversity and heterogeneity of clinical and molecular variables within a given dataset. In addition, although an increasing number of symbolic approaches in bioinformatics are able to assign multiple items to values occurring near discretization boundaries for superior robustness, there are no reference principles on how to perform multi-item discretizations. Results: In this study, an unsupervised discretization method, DI2, for variables with arbitrarily skewed distributions is proposed. Statistical tests applied to assess differences in performance confirm that DI2 generally outperforms well-established discretization methods with statistical significance. Within classification tasks, DI2 displays either competitive or superior levels of predictive accuracy, particularly for classifiers able to accommodate border values. Conclusions: This work proposes a new unsupervised method for data discretization, DI2, that takes into account the underlying data regularities, the presence of outlier values disrupting expected regularities, as well as the relevance of border values. DI2 is available at https://github.com/JupitersMight/DI2


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Joseph M. Lukens ◽  
Kody J. H. Law ◽  
Ryan S. Bennink

The method of classical shadows proposed by Huang, Kueng, and Preskill heralds remarkable opportunities for quantum estimation with limited measurements. Yet its relationship to established quantum tomographic approaches, particularly those based on likelihood models, remains unclear. In this article, we investigate classical shadows through the lens of Bayesian mean estimation (BME). In direct tests on numerical data, BME is found to attain significantly lower error on average, but classical shadows prove remarkably more accurate in specific situations, such as high-fidelity ground truth states, which are improbable in a fully uniform Hilbert space. We then introduce an observable-oriented pseudo-likelihood that successfully emulates the dimension-independence and state-specific optimality of classical shadows, but within a Bayesian framework that ensures only physical states. Our research reveals how classical shadows effect important departures from conventional thinking in quantum state estimation, as well as the utility of Bayesian methods for uncovering and formalizing statistical assumptions.


2021 ◽  
Vol 43 ◽  
pp. e53721
Author(s):  
Luiz Rafael Clóvis ◽  
Ronald José Barth Pinto ◽  
Renan Santos Uhdre ◽  
Jocimar Costa Rosa ◽  
Hugo Zeni Neto ◽  
...  

The objective of this study was to conduct a meta-analysis and test its efficiency in summarizing the heterogeneous data of heritability estimates for the traits of grain yield (GY) and popping expansion (PE), and to provide reliable estimates of selection gains in popcorn. Therefore, 97 heritability estimates for popcorn GY and PE in the broad and narrow sense were used. The main procedures underlying the estimation of the combined heritability using the technique of meta-analysis consisted of i) an exploratory analysis of the set of heritability estimates to detect outliers using a box-plot chart, ii) the verification of the required statistical assumptions, iii) testing the involved heritability estimates for homogeneity, and iv) the calculation of the estimates of combined heritability. The meta-analysis facilitated the synthesis of the information pertaining to heritability in popcorn. The combined heritability estimates in the broad sense for GY and PE were 0.5208 ± 0.0229 and 0.6356 ± 0.0209, respectively, and in the narrow sense were 0.3290 ± 0.0292 and 0.3083 ± 0.0298, respectively.


2021 ◽  
Author(s):  
Jeenu Mathai ◽  
Pradeep Mujumdar

Streamflow indices are flow descriptors that quantify streamflow dynamics; they are usually determined for a specific basin and are distinct from other basin features. The flow descriptors are appropriate for large-scale and comparative hydrology studies, are independent of statistical assumptions, and can distinguish signals that indicate basin behavior over time. In this paper, the characteristic features of the hydrograph's temporal asymmetry due to its different underlying hydrologic processes are primarily highlighted. Streamflow indices linked to each limb of the hydrograph within the time-irreversibility paradigm are distinguished with respect to the processes driving the rising and falling limbs. Various streamflow indices relating to the rising and falling limbs are then correlated with catchment attributes such as climate, topography, vegetation, geology and soil. Finally, the key attributes governing rising and falling limbs are identified. The novelty of the work lies in differentiating hydrographs by their time-irreversibility property and offering an alternative way to recognize primary drivers of streamflow hydrographs. A set of streamflow indices at the catchment scale for 671 basins in the Contiguous United States (CONUS) is presented here. These streamflow indices complement the catchment attributes provided earlier (Addor et al., 2017) for the CAMELS data set. A series of spatial maps describing the streamflow indices and their regional variability over the CONUS is illustrated in this study.


2021 ◽  
Vol 48 (3) ◽  
Author(s):  
Muhammet O. Yalçin ◽  
Nevin Güler Dincer ◽  
Serdar Demir ◽  
...  

In statistical and econometric research, three types of data are mostly used: cross-section, time series, and panel data. Cross-section data are obtained by collecting observations on the same variables for many units at a fixed point in time. Time series data consist of observations measured at successive time points for a single unit. Sometimes, the number of observations in cross-sectional or time series data is insufficient for carrying out the statistical or econometric analysis. In such cases, panel data obtained by combining cross-section and time series data are often used. Panel data analysis (PDA) has advantages such as increasing the number of observations and degrees of freedom, reducing multicollinearity, and yielding more efficient and consistent predictions from more informative data. However, PDA requires that statistical assumptions concerning heteroscedasticity, autocorrelation, correlation between units, and stationarity be satisfied. These assumptions are difficult to satisfy in real applications. In this study, fuzzy panel data analysis (FPDA) is proposed in order to overcome these drawbacks of PDA. FPDA is based on predicting the parameters of panel data regression as triangular fuzzy numbers. To validate the performance of FPDA, FPDA and PDA are applied to panel data consisting of gross domestic product data from five country groups for the years 2005-2013, and their prediction performances are compared using three criteria: mean absolute percentage error, root mean square error, and variance accounted for. All analyses are performed in R 3.5.2. The analysis shows that FPDA is an efficient and practical method, especially when the required statistical assumptions are not satisfied.
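
The three comparison criteria named in this abstract can be sketched as follows. The abstract does not give formulas and reports that the analyses were run in R, so this Python version is an assumption-laden illustration: MAPE and RMSE follow their usual definitions, and variance accounted for (VAF) is taken here as 100 · (1 − var(actual − predicted) / var(actual)), a common form that may differ from the one the authors used. All function names are hypothetical.

```python
import math
from statistics import pvariance


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def vaf(actual, predicted):
    """Variance accounted for, in percent (assumed definition, see lead-in)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return 100 * (1 - pvariance(residuals) / pvariance(actual))
```

Lower MAPE and RMSE and a VAF closer to 100 indicate better predictions, so the three criteria can be read side by side when comparing two fitted models on the same data.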


2021 ◽  
Author(s):  
Matthew J Simpson ◽  
Alexander Browning ◽  
David James Warne ◽  
Oliver J Maclaren ◽  
Ruth E Baker

Sigmoid growth models, such as the logistic and Gompertz growth models, are widely used to study various population dynamics ranging from microscopic populations of cancer cells to continental-scale human populations. Fundamental questions about model selection and precise parameter estimation are critical if these models are to be used to make useful inferences about underlying ecological mechanisms. However, the question of parameter identifiability for these models (whether a data set contains sufficient information to give unique or sufficiently precise parameter estimates for the given model) is often overlooked. We use a profile-likelihood approach to systematically explore practical parameter identifiability using data describing the re-growth of hard coral cover on a coral reef after some ecological disturbance. The relationship between parameter identifiability and checks of model misspecification is also explored. We work with three standard choices of sigmoid growth models, namely the logistic, Gompertz, and Richards' growth models. We find that the logistic growth model does not suffer identifiability issues for the type of data we consider, whereas the Gompertz and Richards' models encounter practical non-identifiability issues, even with relatively extensive data where we observe the full shape of the sigmoid growth curve. Identifiability issues with the Gompertz model lead us to consider a further model calibration exercise in which we fix the initial density to its observed value, neglecting its uncertainty. This is a common practice, but the results of this exercise suggest that parameter estimates and fundamental statistical assumptions are extremely sensitive under these conditions. Different sigmoid growth models are used within subdisciplines of the biology and ecology literature without necessarily considering whether parameters are identifiable or checking statistical assumptions underlying model family adequacy.
Standard practices that do not consider parameter identifiability can lead to unreliable or imprecise parameter estimates and hence potentially misleading interpretations of the underlying mechanisms of interest. While tools in this work focus on three standard sigmoid growth models and one particular data set, our theoretical developments are applicable to any sigmoid growth model and any relevant data set. MATLAB implementations of all software are available on GitHub.
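
For readers unfamiliar with the models named in this abstract, the closed-form solutions of the logistic and Gompertz growth equations can be sketched as below. The parameterisation (growth rate r, carrying capacity K, initial density C0) is the common textbook one and is an assumption here; it may not match the authors' notation, and this Python sketch is unrelated to their MATLAB code. Richards' model and the profile-likelihood analysis are not shown.

```python
import math


def logistic(t, r, K, C0):
    """Solution of dC/dt = r * C * (1 - C / K) with C(0) = C0."""
    return K * C0 / (C0 + (K - C0) * math.exp(-r * t))


def gompertz(t, r, K, C0):
    """Solution of dC/dt = r * C * ln(K / C) with C(0) = C0."""
    return K * math.exp(math.log(C0 / K) * math.exp(-r * t))
```

Both curves start at C0 and saturate at K; they differ in how quickly they leave the initial density, which is one reason the initial density and growth rate can be hard to estimate separately from data.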


2021 ◽  
Vol 5 ◽  
Author(s):  
Matt N Williams

Most researchers and students in psychology learn of S. S. Stevens’ scales or “levels” of measurement (nominal, ordinal, interval, and ratio), and of his rules setting out which statistical analyses are admissible with each measurement level. Many are nevertheless left confused about the basis of these rules, and whether they should be rigidly followed. In this article, I attempt to provide an accessible explanation of the measurement-theoretic concerns that led Stevens to argue that certain types of analyses are inappropriate with data of particular levels of measurement. I explain how these measurement-theoretic concerns are distinct from the statistical assumptions underlying data analyses, which rarely include assumptions about levels of measurement. The level of measurement of observations can nevertheless have important implications for statistical assumptions. I conclude that researchers may find it more useful to critically investigate the plausibility of the statistical assumptions underlying analyses than to limit themselves to the set of analyses that Stevens believed to be admissible with data of a given level of measurement.

