Disregarding Data Due Diligence Versus Checking and Communicating Parametric Statistical Testing Procedure Assumptions

Author(s):  
Phillip B. Rowles


2014
Vol 31 (2)
pp. 184-204
Author(s):
Zafar Iqbal
Nigel P. Grigg
K. Govinderaju
Nicola Campbell-Allen

Purpose – Quality function deployment (QFD) is a methodology for translating the “voice of the customer” into the engineering/technical specifications (HOWs) to be followed in the design of products or services. For the method to be effective, QFD practitioners need to be able to accurately differentiate between the final weights (FWs) assigned to the HOWs in the house of quality matrix. The paper aims to introduce a statistical testing procedure for determining whether the FWs of HOWs are significantly different, and to investigate how robust the different rating scales used in QFD practice are in contributing to these differences.
Design/methodology/approach – Using a range of published QFD examples, the paper applies a parametric bootstrap testing procedure, generating simulated random samples from a theoretical probability model to test the significance of the differences between the FWs. The paper then determines whether the differences between the two most extreme FWs, and between all pairs of FWs, are significant. Finally, the paper checks the robustness of different attribute rating scales (linear vs non-linear) in the context of these testing procedures.
Findings – The paper demonstrates that not all of the differences between the FWs of HOW attributes are in fact significant. In the absence of such a procedure, QFD practitioners have no reliable analytical basis for deciding whether FWs are significantly different, and they may wrongly prioritise one engineering attribute over another.
Originality/value – This is the first article to test the significance of the differences between the FWs of HOWs and to determine the robustness of the different strengths of scale used in the relationship matrix.
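As a rough illustration of the kind of bootstrap check described above, the sketch below (Python) compares the two largest final weights of a small, invented house-of-quality matrix. The importance ratings, the 1-3-9 relationship scale and the Poisson perturbation model are all illustrative assumptions, not taken from the paper; the point is only to show how a parametric resampling scheme can distinguish an apparently clear ranking from a statistically reliable one.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical house-of-quality inputs (not from the paper): customer
# importance ratings for 4 WHATs and a 4x5 relationship matrix linking
# WHATs (rows) to HOWs (columns) on a 1-3-9 scale.
importance = np.array([5, 3, 4, 2])
R = np.array([
    [9, 3, 1, 0, 3],
    [1, 9, 3, 3, 0],
    [3, 1, 9, 1, 3],
    [0, 3, 3, 9, 1],
])

def final_weights(imp, rel):
    """Final weight of each HOW = importance-weighted column sum."""
    return imp @ rel

fw = final_weights(importance, R)
top2 = np.argsort(fw)[-2:]                # indices of the two largest FWs
observed_gap = fw[top2[1]] - fw[top2[0]]

# Parametric bootstrap: perturb the importance ratings under an assumed
# probability model (here, Poisson noise around the stated ratings) and
# recompute the gap between the same two HOWs each time.
B = 10_000
gaps = np.empty(B)
for b in range(B):
    imp_sim = rng.poisson(importance)
    fw_sim = final_weights(imp_sim, R)
    gaps[b] = fw_sim[top2[1]] - fw_sim[top2[0]]

# Share of simulated samples in which the gap vanishes or reverses.
p_value = np.mean(gaps <= 0)
print(f"observed gap = {observed_gap}, bootstrap p-value = {p_value:.3f}")
```

If `p_value` is large, a non-trivial share of the simulated samples erases or reverses the gap, and the two HOWs should arguably be treated as tied rather than ranked.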


Author(s):  
I. Artico ◽  
I. Smolyarenko ◽  
V. Vinciotti ◽  
E. C. Wit

The putative scale-free nature of real-world networks has generated a great deal of interest over the past 20 years: if networks from many different fields share a common structure, this may point to some underlying ‘network law’. Testing the degree distributions of networks for power-law tails has been a topic of considerable discussion, and ad hoc statistical methodology has been used both to discredit power laws and to support them. This paper proposes a statistical testing procedure that takes into account the complications inherent in testing network degree distributions: the observed network is finite, the degree sequence is dependent, and naive tests can suffer from insufficient power. We focus on testing whether the tail of the empirical degree distribution behaves like the tail of a de Solla Price model, a two-parameter power-law distribution. We modify the well-known Kolmogorov–Smirnov test to achieve even sensitivity along the tail, accounting for the dependence between the empirical degrees under the null distribution while guaranteeing sufficient power of the test. We apply the method to many empirical degree distributions. Our results show that power-law network degree distributions are not rare: almost 65% of the tested networks are classified as having a power-law tail with at least 80% power.
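The authors' modified Kolmogorov–Smirnov procedure is more involved than anything shown here, but the following sketch (Python, entirely synthetic data) illustrates the basic parametric-bootstrap logic of testing a power-law tail: fit the exponent above a fixed cut-off by maximum likelihood, compute the KS distance to the fitted tail, and calibrate it against refitted synthetic samples. It uses a continuous approximation and a plain KS statistic, and it ignores the degree dependence and uneven tail sensitivity that the paper specifically addresses.

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_alpha(x, xmin):
    """Continuous-approximation MLE of the power-law exponent for x >= xmin."""
    tail = x[x >= xmin]
    return 1.0 + len(tail) / np.sum(np.log(tail / xmin)), tail

def ks_stat(tail, xmin, alpha):
    """KS distance between the empirical tail CDF and the fitted power law."""
    s = np.sort(tail)
    emp = np.arange(1, len(s) + 1) / len(s)
    model = 1.0 - (s / xmin) ** (1.0 - alpha)
    return np.max(np.abs(emp - model))

def power_law_sample(n, xmin, alpha, rng):
    """Draw n values from a continuous power law with lower bound xmin."""
    u = rng.uniform(size=n)
    return xmin * (1.0 - u) ** (-1.0 / (alpha - 1.0))

# Hypothetical degree sequence standing in for an observed network's degrees.
degrees = power_law_sample(2000, xmin=5.0, alpha=2.4, rng=rng)

xmin = 5.0
alpha_hat, tail = fit_alpha(degrees, xmin)
d_obs = ks_stat(tail, xmin, alpha_hat)

# Parametric bootstrap p-value: how often does a synthetic power-law tail of
# the same size, refitted the same way, produce a KS distance at least as big?
B = 500
d_sim = np.empty(B)
for b in range(B):
    sim = power_law_sample(len(tail), xmin, alpha_hat, rng)
    a_sim, sim_tail = fit_alpha(sim, xmin)
    d_sim[b] = ks_stat(sim_tail, xmin, a_sim)

p_value = np.mean(d_sim >= d_obs)
print(f"alpha_hat = {alpha_hat:.2f}, KS = {d_obs:.3f}, p = {p_value:.3f}")
```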


2015
Vol 2015
pp. 1-10
Author(s):  
Kai Wang
Qing Zhao
Jianwei Lu
Tianwei Yu

With modern technologies such as microarray, deep sequencing, and liquid chromatography-mass spectrometry (LC-MS), it is possible to measure the expression levels of thousands of genes/proteins simultaneously to unravel important biological processes. A first step towards elucidating hidden patterns and understanding such massive data is the application of clustering techniques. Nonlinear relations, which have largely gone unused in contrast to linear correlations, are prevalent in high-throughput data. In many cases, nonlinear relations can model the biological relationships more precisely and reflect critical patterns in the biological systems. Using the general dependency measure Distance Based on Conditional Ordered List (DCOL), which we introduced previously, we designed the nonlinear K-profiles clustering method, which can be seen as the nonlinear counterpart of the K-means clustering algorithm. The method has a built-in statistical testing procedure that ensures genes not belonging to any cluster do not impact the estimation of cluster profiles. Results from extensive simulation studies showed that K-profiles clustering not only outperformed the traditional linear K-means algorithm, but also performed significantly better than our previous General Dependency Hierarchical Clustering (GDHC) algorithm. We further analyzed a gene expression dataset, on which K-profiles clustering generated biologically meaningful results.
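As a rough sketch of the idea (not the published algorithm), the code below assumes DCOL of a gene against a profile is the sum of absolute jumps of the gene's values after ordering the samples by the profile, and runs a K-means-like loop that assigns genes by smallest DCOL. The cluster profile is simplified here to the cluster mean, and the built-in statistical test for noise genes is omitted.

```python
import numpy as np

rng = np.random.default_rng(3)

def dcol(y, x):
    """Distance based on Conditional Ordered List: order the samples by x and
    sum the absolute jumps of y; small values indicate y is some (possibly
    nonlinear) smooth function of x plus noise."""
    order = np.argsort(x)
    return np.sum(np.abs(np.diff(y[order])))

def k_profiles(data, k, n_iter=20, rng=None):
    """Toy K-profiles loop: profiles play the role of K-means centroids,
    but genes are assigned by smallest DCOL instead of Euclidean distance."""
    if rng is None:
        rng = np.random.default_rng()
    n_genes, _ = data.shape
    labels = rng.integers(0, k, size=n_genes)
    for _ in range(n_iter):
        profiles = np.vstack([
            data[labels == j].mean(axis=0) if np.any(labels == j)
            else data[rng.integers(n_genes)]          # re-seed an empty cluster
            for j in range(k)
        ])
        d = np.array([[dcol(g, p) for p in profiles] for g in data])
        new_labels = d.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, profiles

# Synthetic example: two groups of genes that are nonlinear functions of
# two hidden sample-level signals.
t1, t2 = rng.uniform(-2, 2, size=(2, 60))
genes = np.vstack([np.sin(t1) + 0.1 * rng.normal(size=(25, 60)),
                   t2 ** 2 + 0.1 * rng.normal(size=(25, 60))])
labels, _ = k_profiles(genes, k=2, rng=rng)
print(labels)
```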


2020
Author(s):  
Teague R Henry
Donald Robinaugh
Eiko I Fried

The combination of network theory and network psychometric methods has opened up a variety of new ways to conceptualize and study psychological disorders. The idea of psychological disorders as dynamic systems has sparked interest in developing interventions based on the results of network analytic tools. However, estimating a network model is sufficient neither for determining which symptoms might be the most effective to intervene upon, nor for determining the potential efficacy of any given intervention. In this paper, we attempt to close this gap by introducing fundamental concepts of control theory to both methodologists and applied psychologists. We show how two controllability measures, average and modal controllability, can be used to select the best set of intervention targets. We provide a statistical testing procedure for determining whether the dynamical systems of different people have the same optimal intervention targets. Following that, we show how intervention scientists can probe the effects of both theoretical and empirical interventions on networks derived from real data; demonstrate how simulations can be used to account for intervention cost and the desire to reduce specific symptoms; introduce a metric for evaluating intervention efficacy, the intervention efficacy ratio (IER); and showcase how between-subject heterogeneity in intervention response can be evaluated. Every step is illustrated using rich clinical EMA data from a sample of subjects undergoing treatment for complicated grief, with a focus on the outcome 'Suicidal Ideation'. All methods are implemented in the open-source R package 'netcontrol', and complete code for replicating the analyses in this manuscript is available at https://osf.io/f268v/.
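For readers unfamiliar with the two measures, the sketch below computes them with standard discrete-time formulas from the network-control literature: average controllability as the trace of the controllability Gramian for an input at a single node, and modal controllability from the eigendecomposition of a symmetric, stabilized transition matrix. The 6-node matrix is a random stand-in rather than the clinical EMA data, and the code does not use or mirror the netcontrol package's interface.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical 6-node symptom network: symmetrised and rescaled so that the
# discrete-time dynamics x_{t+1} = A x_t are stable.
M = rng.uniform(-1.0, 1.0, size=(6, 6))
A_raw = (M + M.T) / 2.0
A = A_raw / (1.01 * np.max(np.abs(np.linalg.eigvalsh(A_raw))))

def average_controllability(A, horizon=100):
    """Average controllability of each node i: trace of the controllability
    Gramian for an input at node i alone, i.e. sum_t ||A^t e_i||^2,
    truncated at a finite horizon."""
    n = A.shape[0]
    ac = np.zeros(n)
    At = np.eye(n)
    for _ in range(horizon):
        ac += np.sum(At ** 2, axis=0)   # column i of A^t is A^t e_i
        At = A @ At
    return ac

def modal_controllability(A):
    """Modal controllability phi_i = sum_j (1 - lambda_j^2) * v_ij^2,
    from the eigendecomposition of the symmetric matrix A."""
    lam, V = np.linalg.eigh(A)
    return (V ** 2) @ (1.0 - lam ** 2)

ac = average_controllability(A)
mc = modal_controllability(A)
print("node with highest average controllability:", int(np.argmax(ac)))
print("node with highest modal controllability:  ", int(np.argmax(mc)))
```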


Kybernetes
2014
Vol 43 (1)
pp. 82-91
Author(s):  
Daniel Peter Berrar
Alfons Schuster

Purpose – The purpose of this paper is to investigate the relevance and appropriateness of Turing-style tests for computational creativity.
Design/methodology/approach – The Turing test is both a milestone and a stumbling block in artificial intelligence (AI). For more than half a century, the “grand goal of passing the test” has taught the authors many lessons. Here, the authors analyze the relevance of these lessons for computational creativity.
Findings – Like AI in its early days, computational creativity concerns itself with fundamental questions such as “Can machines be creative?” It is indeed possible to frame such questions as empirical, Turing-style tests. However, such tests entail a number of intricate and possibly unsolvable problems, which might easily lead the authors into old and new blind alleys. The authors propose the outline of an alternative testing procedure that is fundamentally different from Turing-style tests. This new procedure focuses on the unfolding of creativity over time and, unlike Turing-style tests, is amenable to more meaningful statistical testing.
Research limitations/implications – This paper argues against Turing-style tests for computational creativity.
Practical implications – This paper opens a new avenue for viable and more meaningful testing procedures.
Originality/value – The novel contributions are: an analysis of seven lessons from the Turing test for computational creativity; an argument against Turing-style tests; and a proposal for a new testing procedure.


1980
Vol 19 (02)
pp. 83-87 ◽  
Author(s):  
B. Page

This paper deals with the application of mathematical methods to the inventory of blood units in a regional supply system. A complex simulation model was designed and implemented in the discrete simulation language SIMSCRIPT to investigate alternative inventory and distribution policies for blood units by means of defined objective criteria. These procedures include a method for inventory forecasting and blood donor call-up and a heuristic allocation model for blood units. Empirical data were collected from the Berlin Blood Donor Service for model construction and validation. The simulation results suggest that the inventory procedures presented may lead to a considerable cost reduction in a regional blood banking system. A statistical testing procedure was employed to support the simulation results. In the final section it is suggested that the inventory methods described in this paper can be implemented within a computer information system, providing blood bank management with decision aids.
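The original model was written in SIMSCRIPT; the fragment below is only a minimal Python stand-in with invented parameters (one blood group, a daily order-up-to collection policy, Poisson demand issued oldest-first, a fixed shelf life), meant to show how such a simulation lets competing inventory policies be compared on shortage and outdating rates. Replicated runs per policy could then be compared with a standard two-sample test, in the spirit of the statistical validation mentioned above.

```python
import numpy as np

def simulate_blood_inventory(days=365, order_up_to=40, shelf_life=21,
                             mean_demand=8.0, rng=None):
    """Toy daily-review inventory simulation for a single blood group:
    each morning stock is topped up to a target level with fresh units,
    Poisson demand is filled oldest-first, and expired units are discarded.
    Returns the shortage rate and the outdating rate."""
    if rng is None:
        rng = np.random.default_rng()
    ages = []                                  # remaining shelf life of each unit
    short = outdated = demanded = collected = 0
    for _ in range(days):
        # donor call-up / collection: bring stock back up to the target
        new_units = max(0, order_up_to - len(ages))
        collected += new_units
        ages.extend([shelf_life] * new_units)
        # issue the oldest units first (FIFO) against today's demand
        demand = rng.poisson(mean_demand)
        demanded += demand
        ages.sort()                            # oldest units (least life) first
        issued = min(demand, len(ages))
        ages = ages[issued:]
        short += demand - issued
        # age the remaining stock and discard anything that has expired
        ages = [a - 1 for a in ages]
        outdated += sum(a <= 0 for a in ages)
        ages = [a for a in ages if a > 0]
    return short / max(demanded, 1), outdated / max(collected, 1)

rng = np.random.default_rng(5)
for target in (20, 40, 60):
    s, o = simulate_blood_inventory(order_up_to=target, rng=rng)
    print(f"order-up-to {target}: shortage rate {s:.3f}, outdating rate {o:.3f}")
```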


2002
Vol 18 (1)
pp. 78-84 ◽  
Author(s):  
Eva Ullstadius
Jan-Eric Gustafsson
Berit Carlstedt

Summary: Vocabulary tests, which are part of most test batteries of general intellectual ability, measure both verbal and general ability. Newly developed techniques for confirmatory factor analysis of dichotomous variables make it possible to analyze the influence of different abilities on the performance on each item. In the testing procedure of the Computerized Swedish Enlistment test battery, eight different subtests of a new vocabulary test were given randomly to subsamples of a representative sample of 18-year-old male conscripts (N = 9001). Three central dimensions of a hierarchical model of intellectual abilities, general (G), verbal (Gc'), and spatial ability (Gv'), were estimated under different assumptions about the nature of the data. In addition to an ordinary analysis of covariance matrices, which assumes linear relations, the item variables were treated as categorical variables in the Mplus program. All eight subtests fit the hierarchical model, and the items were found to load about equally on G and Gc'. The results also indicate that if nonlinearity is not taken into account, the G loadings of the easy items are underestimated. These items, moreover, appear to be better measures of G than the difficult ones. The practical utility of the outcome for item selection and the theoretical implications for the question of the origin of verbal ability are discussed.
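A small simulation (with a made-up loading and thresholds, nothing like the Mplus categorical-variable analysis used in the study) can illustrate the mechanism behind this finding: when dichotomous items are analyzed as if they were continuous, the Pearson-based loading is attenuated, and increasingly so the further an item's pass rate moves away from 50%, as is the case for very easy items.

```python
import numpy as np

rng = np.random.default_rng(6)

n = 9000                     # roughly the size of the conscript sample
g = rng.normal(size=n)       # latent general ability
true_loading = 0.7           # identical data-generating loading for every item

# Items differ only in difficulty: an easy item is solved by most test takers
# (low threshold), a hard item by few (high threshold).
thresholds = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5])
latent = true_loading * g[:, None] + \
    np.sqrt(1.0 - true_loading ** 2) * rng.normal(size=(n, len(thresholds)))
responses = (latent > thresholds).astype(float)   # dichotomous 0/1 item scores

# "Linear" loading estimate: the Pearson correlation between the 0/1 item and
# the factor, i.e. what an ordinary covariance-matrix analysis effectively sees.
for tau, item in zip(thresholds, responses.T):
    r = np.corrcoef(item, g)[0, 1]
    print(f"threshold {tau:+.1f}  p(correct) = {item.mean():.2f}  "
          f"linear loading ~ {r:.2f}  (generating loading {true_loading})")
```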

