Detection of data fabrication using statistical tools

2019 ◽  
Author(s):  
Chris Hubertus Joseph Hartgerink ◽  
Jan G. Voelkel ◽  
Jelte M. Wicherts ◽  
Marcel A. L. M. van Assen

Scientific misconduct potentially invalidates findings in many scientific fields. Improved detection of unethical practices such as data fabrication is thought to deter them. In two studies, we investigated the diagnostic performance of various statistical methods to detect fabricated quantitative data from psychological research. In Study 1, we tested the validity of statistical methods to detect fabricated data at the study level, using summary statistics. Using (arguably) genuine data from the Many Labs 1 project on the anchoring effect (k=36) and data for the same effect fabricated by our participants (k=39), we tested the validity of our newly proposed 'reversed Fisher method', variance analyses, and extreme effect sizes, as well as a combination of these three indicators using the original Fisher method. Results indicate that the variance analyses perform fairly well when the homogeneity of population variances is accounted for, and that extreme effect sizes perform similarly well in distinguishing genuine from fabricated data. The performance of the 'reversed Fisher method' was poor and depended on the types of tests included. In Study 2, we tested the validity of statistical methods to detect fabricated data using raw data. Using (arguably) genuine data from the Many Labs 3 project on the classic Stroop task (k=21) and data for the same effect fabricated by our participants (k=28), we investigated the performance of digit analyses, variance analyses, multivariate associations, and extreme effect sizes, as well as a combination of these four methods using the original Fisher method. Results indicate that variance analyses, extreme effect sizes, and multivariate associations perform fairly well to excellent in detecting fabricated data using raw data, while digit analyses perform at chance levels. The two studies provide mixed results on how the use of random number generators affects the detection of data fabrication. Ultimately, we consider variance analyses, effect sizes, and multivariate associations valuable tools to detect potential data anomalies in empirical (summary or raw) data. However, we argue against widespread (possibly automated) application of these tools, because some fabricated data may be irregular in one aspect but not in another. Considering how violations of the assumptions of fabrication detection methods may yield high false-positive or false-negative probabilities, we recommend comparing potentially fabricated data to genuine data on the same topic.
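For intuition, the original Fisher method referenced above combines k independent p-values into a single chi-square test. The sketch below is a minimal illustration with hypothetical p-values, not the authors' actual analysis code; roughly speaking, the paper's 'reversed' variant applies an analogous pooling to 1 − p so that unusually large p-values drive the combined test.

```python
# Minimal sketch of the (original) Fisher method for pooling k
# independent p-values; the input p-values here are hypothetical,
# not values from the studies described above.
import numpy as np
from scipy import stats

def fisher_method(pvalues):
    """Chi-square statistic X^2 = -2 * sum(ln p) on 2k degrees of freedom."""
    pvalues = np.asarray(pvalues, dtype=float)
    chi2 = -2.0 * np.log(pvalues).sum()
    df = 2 * len(pvalues)
    return chi2, stats.chi2.sf(chi2, df)  # combined right-tailed p-value

# e.g., pooling three fabrication indicators for one study
chi2, p_combined = fisher_method([0.04, 0.20, 0.01])
print(f"X^2 = {chi2:.2f}, combined p = {p_combined:.4f}")
```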

Methodology ◽  
2019 ◽  
Vol 15 (3) ◽  
pp. 97-105
Author(s):  
Rodrigo Ferrer ◽  
Antonio Pardo

Abstract. In a recent paper, Ferrer and Pardo (2014) tested several distribution-based methods designed to assess when test scores obtained before and after an intervention reflect a statistically reliable change. However, we still do not know how these methods perform in terms of false negatives. For this purpose, we simulated change scenarios (different effect sizes in a pre-post-test design) with distributions of different shapes and with different sample sizes. For each simulated scenario, we generated 1,000 samples. In each sample, we recorded the false-negative rate of the five distribution-based methods that had performed best in terms of false positives. Our results revealed unacceptably high false-negative rates even for very large effects, ranging from 31.8% in an optimistic scenario (effect size of 2.0 and a normal distribution) to 99.9% in the worst scenario (effect size of 0.2 and a highly skewed distribution). Our results therefore suggest that the widely used distribution-based methods must be applied with caution in clinical contexts, because they require huge effect sizes to detect a true change. We also offer some considerations regarding effect sizes and the commonly used cut-off points that allow our estimates to be more precise.
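For context, a classic distribution-based method of the kind evaluated here is the Jacobson-Truax reliable change index (RCI). The sketch below uses hypothetical scores and reliability, and is not necessarily one of the five methods retained in the paper:

```python
# Sketch of the Jacobson-Truax reliable change index (RCI), a classic
# distribution-based method for pre-post change; all numbers are hypothetical.
import math

def reliable_change_index(pre, post, sd_pre, reliability):
    """RCI = (post - pre) / SE_diff, with SE_diff = SD * sqrt(2 * (1 - r))."""
    se_measurement = sd_pre * math.sqrt(1.0 - reliability)
    se_diff = math.sqrt(2.0) * se_measurement
    return (post - pre) / se_diff

rci = reliable_change_index(pre=24, post=15, sd_pre=6.0, reliability=0.85)
print(f"RCI = {rci:.2f} (|RCI| > 1.96 is usually read as reliable change)")
```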


2017 ◽  
Author(s):  
Nicholas Alvaro Coles ◽  
Jeff T. Larsen ◽  
Heather Lench

The facial feedback hypothesis suggests that an individual’s experience of emotion is influenced by feedback from their facial movements. To evaluate the cumulative evidence for this hypothesis, we conducted a meta-analysis on 286 effect sizes derived from 138 studies that manipulated facial feedback and collected emotion self-reports. Using random-effects meta-regression with robust variance estimates, we found that the overall effect of facial feedback was significant, but small. Results also indicated that feedback effects are stronger in some circumstances than others. We examined 12 potential moderators, and three were associated with differences in effect sizes:
1. Type of emotional outcome: facial feedback influenced emotional experience (e.g., reported amusement) and, to a greater degree, affective judgments of a stimulus (e.g., the objective funniness of a cartoon). Three publication bias detection methods did not reveal evidence of publication bias in studies examining the effects of facial feedback on emotional experience, but all three methods revealed evidence of publication bias in studies examining affective judgments.
2. Presence of emotional stimuli: facial feedback effects on emotional experience were larger in the absence of emotionally evocative stimuli (e.g., cartoons).
3. Type of stimuli: when participants were presented with emotionally evocative stimuli, facial feedback effects were larger in the presence of some types of stimuli (e.g., emotional sentences) than others (e.g., pictures).
The available evidence supports the facial feedback hypothesis’ central claim that facial feedback influences emotional experience, although these effects tend to be small and heterogeneous.
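A minimal sketch of the random-effects pooling underlying such a meta-analysis is shown below (a DerSimonian-Laird estimator on hypothetical effect sizes and variances; the paper itself uses meta-regression with robust variance estimates, which this sketch does not implement):

```python
# Sketch of DerSimonian-Laird random-effects pooling; the effect
# sizes (y) and sampling variances (v) below are hypothetical.
import numpy as np

def random_effects_pool(y, v):
    y, v = np.asarray(y, float), np.asarray(v, float)
    w = 1.0 / v                                   # fixed-effect weights
    ybar = (w * y).sum() / w.sum()
    q = (w * (y - ybar) ** 2).sum()               # Cochran's Q
    c = w.sum() - (w ** 2).sum() / w.sum()
    tau2 = max(0.0, (q - (len(y) - 1)) / c)       # between-study variance
    w_re = 1.0 / (v + tau2)                       # random-effects weights
    mu = (w_re * y).sum() / w_re.sum()
    se = np.sqrt(1.0 / w_re.sum())
    return mu, se, tau2

mu, se, tau2 = random_effects_pool(
    y=[0.35, 0.10, 0.22, 0.05], v=[0.02, 0.01, 0.03, 0.015])
print(f"pooled effect = {mu:.3f} +/- {1.96 * se:.3f}, tau^2 = {tau2:.4f}")
```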


2003 ◽  
Vol 218 (1) ◽  
pp. 1-3 ◽  
Author(s):  
Stefania M. Mojon-Azzi ◽  
Daniel S. Mojon

PEDIATRICS ◽  
1956 ◽  
Vol 18 (3) ◽  
pp. 362-368
Author(s):  
Ruth T. Gross ◽  
Lincoln E. Moses

Four hundred seven healthy, full-term infants were divided into three groups and fed, respectively, a formula of evaporated milk and water with 5% carbohydrate; human milk; and a special modified evaporated milk designed to simulate human milk. No other foods were added to the diet. A comparison of the three groups was made, based on weight gains from birth to the end of the first 4 weeks. The conclusions refer only to weight gains; no attempt was made to determine the superiority of any particular diet. The data show no significant differences in the 4-week weight gains among the three groups of infants, although sensitive statistical methods could be validly applied to the problem. These methods are explained. The authors wish to emphasize the many variables which must be taken into account in a study of this sort; the necessity for careful selection of valid statistical methods; and the importance of critical clinical judgment in the evaluation of the results.


2014 ◽  
Vol 56 (5) ◽  
pp. 447-450 ◽  
Author(s):  
Pablo O. A. Acosta ◽  
Fabiana Granja ◽  
Cátia A. Meneses ◽  
Ismael A. S. Nascimento ◽  
Débora D. Sousa ◽  
...  

Serum samples from 150 NS1-negative (Platelia ELISA) patients presumptively diagnosed with dengue were analyzed by the TaqMan probe-based real-time reverse transcription PCR (TaqMan qRT-PCR) method. The qRT-PCR-positive samples were serotyped by semi-nested RT-PCR and tested for IgG and IgM with a qualitative immunochromatographic assay. Molecular detection methods showed 33 (22%) positive samples out of 150 NS1-antigen-negative samples. Of these, 72% were collected up to day 2 after the onset of symptoms, when the diagnostic sensitivity of NS1-antigen tests is significantly enhanced. Most of the cases were not characterized as secondary infections. Twenty-eight samples were successfully serotyped: 75% for DENV-4, 14% for DENV-2, 7% for DENV-3, and 4% for DENV-1. These findings reaffirm the hyperendemic situation of the state of Roraima and suggest a lower sensitivity of the NS1 test, mainly when DENV-4 is the predominant serotype. Health care providers should therefore be wary of samples that test negative in NS1 antigen assays, especially when clinical symptoms and other laboratory results show evidence of dengue infection.


2018 ◽  
Vol 30 (1) ◽  
pp. 25-41 ◽  
Author(s):  
Clara R. Grabitz ◽  
Katherine S. Button ◽  
Marcus R. Munafò ◽  
Dianne F. Newbury ◽  
Cyril R. Pernet ◽  
...  

Genetics and neuroscience are two areas of science that pose particular methodological problems because they involve detecting weak signals (i.e., small effects) in noisy data. In recent years, increasing numbers of studies have attempted to bridge these disciplines by looking for genetic factors associated with individual differences in behavior, cognition, and brain structure or function. However, different methodological approaches to guarding against false positives have evolved in the two disciplines. To explore methodological issues affecting neurogenetic studies, we conducted an in-depth analysis of 30 consecutive articles in 12 top neuroscience journals that reported on genetic associations in nonclinical human samples. It was often difficult to estimate effect sizes in neuroimaging paradigms. Where effect sizes could be calculated, the studies reporting the largest effect sizes tended to have two features: (i) they had the smallest samples and were generally underpowered to detect genetic effects, and (ii) they did not fully correct for multiple comparisons. Furthermore, only a minority of studies used statistical methods for multiple comparisons that took into account correlations between phenotypes or genotypes, and only nine studies included a replication sample or explicitly set out to replicate a prior finding. Finally, presentation of methodological information was not standardized and was often distributed across Methods sections and Supplementary Material, making it challenging to assemble basic information from many studies. Space limits imposed by journals could mean that highly complex statistical methods were described in only a superficial fashion. In summary, methods that have become standard in the genetics literature—stringent statistical standards, use of large samples, and replication of findings—are not always adopted when behavioral, cognitive, or neuroimaging phenotypes are used, leading to an increased risk of false-positive findings. Studies need to correct not just for the number of phenotypes collected but also for the number of genotypes examined, genetic models tested, and subsamples investigated. The field would benefit from more widespread use of methods that take into account correlations between the factors corrected for, such as spectral decomposition, or permutation approaches. Replication should become standard practice; this, together with the need for larger sample sizes, will entail greater emphasis on collaboration between research groups. We conclude with some specific suggestions for standardized reporting in this area.
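As an illustration of the permutation approach recommended above, the sketch below computes family-wise-corrected p-values with a max-statistic permutation test on simulated genotype and phenotype data. It is a generic example under simplifying assumptions, not the pipeline of any reviewed study:

```python
# Sketch of max-statistic permutation testing across (possibly correlated)
# phenotypes: one family-wise-corrected p-value per phenotype.
# All data here are simulated; this is not any reviewed study's pipeline.
import numpy as np

rng = np.random.default_rng(0)
n, n_pheno = 200, 8
genotype = rng.integers(0, 3, size=n).astype(float)   # 0/1/2 allele count
phenos = rng.standard_normal((n, n_pheno))            # phenotype matrix

def abs_corr(g, p):
    """Absolute Pearson correlation of g with each phenotype column."""
    g = (g - g.mean()) / g.std()
    p = (p - p.mean(axis=0)) / p.std(axis=0)
    return np.abs(g @ p) / len(g)

observed = abs_corr(genotype, phenos)
max_null = np.array([
    abs_corr(rng.permutation(genotype), phenos).max()  # max over phenotypes
    for _ in range(2000)                               # permute genotype labels
])
# Corrected p: how often the null maximum beats each observed statistic.
p_corrected = (1 + (max_null[:, None] >= observed).sum(axis=0)) / (1 + len(max_null))
print(np.round(p_corrected, 3))
```

Because the null distribution is built from the maximum statistic across all phenotypes, the correction automatically adapts to the correlation structure among them, unlike a Bonferroni adjustment.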


2018 ◽  
Vol 8 (1) ◽  
pp. 3-19 ◽  
Author(s):  
Yuanyuan Zhou ◽  
Susan Troncoso Skidmore

Historically, ANOVA has been the most prevalent statistical method in educational and psychological research, and it continues to be widely used today. A comprehensive review published in 1998 examined several APA journals and discovered persistent concerns in ANOVA reporting practices. The present authors examined all articles published in 2012 in three APA journals (Journal of Applied Psychology, Journal of Counseling Psychology, and Journal of Personality and Social Psychology) to review ANOVA reporting practices, including p values and effect sizes. Results indicated that ANOVA continues to be prevalent in the reviewed journals, both as a test of the primary research question and as a test of conditional assumptions prior to the primary analysis. Still, ANOVA reporting practices are essentially unchanged from what was previously reported; effect size reporting, however, has improved.
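As a concrete example of the reporting practice at issue, the sketch below pairs a one-way ANOVA with an eta-squared effect size on simulated groups; it illustrates the practice generally rather than any analysis from the reviewed articles:

```python
# Sketch: one-way ANOVA with an eta-squared effect size to accompany
# the p value; the three groups below are simulated, not real data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
groups = [rng.normal(loc, 1.0, size=30) for loc in (0.0, 0.3, 0.6)]

f_stat, p_value = stats.f_oneway(*groups)

grand_mean = np.concatenate(groups).mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_total = ((np.concatenate(groups) - grand_mean) ** 2).sum()
eta_squared = ss_between / ss_total   # proportion of variance explained

print(f"F = {f_stat:.2f}, p = {p_value:.4f}, eta^2 = {eta_squared:.3f}")
```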


Open Biology ◽  
2018 ◽  
Vol 8 (9) ◽  
pp. 180121 ◽  
Author(s):  
Anna Ovcharenko ◽  
Andrea Rentmeister

RNA methylations play a significant regulatory role in diverse biological processes. Although the transcriptome-wide discovery of unknown RNA methylation sites is essential to elucidate their function, the development of a broader variety of detection approaches is desirable for several reasons. Many established detection methods for RNA modifications rely heavily on the specificity of the respective antibodies, so the development of antibody-independent transcriptome-wide methods is beneficial. Even the antibody-independent high-throughput sequencing-based methods are liable to produce false-positive or false-negative results; an independent method for each modification could help validate the detected modification sites. Apart from transcriptome-wide methods for de novo methylation detection, methods for monitoring the presence of a single methylation at a determined site are also needed. In contrast to the transcriptome-wide detection methods, the techniques used for monitoring purposes need to be cheap, fast, and easy to perform. This review considers modern approaches for site-specific detection of methylated nucleotides in RNA. We also discuss the potential of third-generation sequencing methods for direct detection of RNA methylations.


2016 ◽  
Vol 20 (4) ◽  
pp. 639-664 ◽  
Author(s):  
Christopher D. Nye ◽  
Paul R. Sackett

Moderator hypotheses involving categorical variables are prevalent in organizational and psychological research. Despite their importance, current methods of identifying and interpreting these moderation effects have several limitations that may lead to misleading conclusions. This issue has been particularly salient in the literature on differential prediction, where recent research has suggested that these limitations have had a significant impact on past findings. To help address these issues, we propose several new effect size indices that provide additional information about categorical moderation analyses. The advantages of these indices are then illustrated in two large databases of respondents by examining categorical moderation in the prediction of psychological well-being and the extent of differential prediction in a large sample of job incumbents.
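For concreteness, categorical moderation is commonly tested as a group-by-predictor interaction in regression. The sketch below fits such a model to simulated data and reports the slope difference; the paper's newly proposed effect size indices are not reproduced here:

```python
# Sketch of a categorical moderation test: does the predictor's slope
# differ across two groups? Data are simulated; the new effect size
# indices proposed in the paper are not implemented here.
import numpy as np

rng = np.random.default_rng(2)
n = 400
group = rng.integers(0, 2, size=n)                       # 0/1 categorical moderator
x = rng.standard_normal(n)
y = 0.5 * x + 0.3 * group * x + rng.standard_normal(n)   # slopes differ by group

# OLS with an interaction term: y ~ 1 + x + group + x:group
X = np.column_stack([np.ones(n), x, group, x * group])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
slope_g0, slope_g1 = beta[1], beta[1] + beta[3]
print(f"slope(group 0) = {slope_g0:.2f}, slope(group 1) = {slope_g1:.2f}, "
      f"interaction = {beta[3]:.2f}")
```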


2002 ◽  
Vol 85 (3) ◽  
pp. 787-791 ◽  
Author(s):  
Gert van Duijn ◽  
Ria van Biert ◽  
Henriette Bleeker-Marcelis ◽  
Ineke van Boeijen ◽  
Abdi Jama Adan ◽  
...  

Abstract According to European Commission (EC) Regulation 1139/98, foods and food ingredients that are to be delivered to the final consumer and in which either protein or DNA resulting from genetic modification is present shall be subject to additional specific labeling requirements. Since 1994, genetically altered tomatoes, squash, potatoes, canola, cotton, and soy have been on the market. Recently, insect-resistant and herbicide-tolerant maize varieties have been introduced. Soy and maize are two of the most important vegetable crops in the world. During the past 4 years, both protein- and DNA-based methods have been developed and applied for detection of transgenic soy and maize, and their derivatives. For protein-based detection, specific monoclonal and polyclonal antibodies have been developed; Western blot analysis and enzyme-linked immunosorbent assays are the most prominent immunochemical detection formats. For detection of genetically modified organisms (GMOs) at the DNA level, polymerase chain reaction-based methods are mainly used; these reactions require highly specific primer sets. This study compares these principally different methods. Specificity of the methods and the possible risks of false-positive or false-negative results are considered in relation to sampling, matrix effects, and food processing procedures. In addition, quantitative aspects of protein- and DNA-based GM detection methods are presented and discussed. This is especially relevant because EC Regulation 49/2000, which defines a threshold of 1% for unintentional commingling, came into force on April 10, 2000.

