Prediction and inference diverge in biomedicine: Simulations and real-world data

In the 20th century many advances in biological knowledge and evidence-based medicine were supported by p-values and accompanying methods. In the early 21st century, ambitions towards precision medicine put a premium on detailed predictions for single individuals. The shift causes tension between traditional methods used to infer statistically significant group differences and burgeoning machine-learning tools suited to forecast an individual’s future. This comparison applies the linear model for identifying significant contributing variables and for finding the most predictive variable sets. In systematic data simulations and common medical datasets, we explored how statistical inference and pattern recognition can agree and diverge. Across analysis scenarios, even small predictive performances typically coincided with finding underlying significant statistical relationships. However, even statistically strong findings with very low p-values shed little light on their value for achieving accurate prediction in the same dataset. More complete understanding of different ways to define ‘important’ associations is a prerequisite for reproducible research findings that can serve to personalize clinical care.
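The divergence described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' code or data: the effect size, sample sizes, and seed are hypothetical, and the p-value uses a normal approximation to the t distribution, which is reasonable only because the sample is large. It fits a one-variable linear model on a training split, tests the slope, and then scores the same model on held-out data.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x.

    Returns (intercept, slope, standard error of the slope).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, se_slope


def r_squared(xs, ys, intercept, slope):
    """Proportion of variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_SLOPE = 0.2         # hypothetical weak but real effect


def simulate(n):
    """Draw n pairs with y = TRUE_SLOPE * x + standard normal noise."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [TRUE_SLOPE * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


train_x, train_y = simulate(2000)
test_x, test_y = simulate(2000)

intercept, slope, se_slope = fit_line(train_x, train_y)
z = slope / se_slope
# Two-sided p-value, normal approximation to the t distribution (large n).
p_value = math.erfc(abs(z) / math.sqrt(2))
r2_test = r_squared(test_x, test_y, intercept, slope)

print(f"slope={slope:.3f}  p={p_value:.2e}  held-out R^2={r2_test:.3f}")
```

Under these assumptions the population R² is 0.04 / 1.04, roughly 0.038, so the slope can be highly significant while the model explains only a few percent of the variance in new observations.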


The Value of Real-World Data in Understanding Prostate Cancer Risk and Improving Clinical Care: Examples from Swedish Registries

Cancers ◽

10.3390/cancers13040875 ◽

2021 ◽

Vol 13 (4) ◽

pp. 875

Author(s):

Kerri Beckmann ◽

Hans Garmo ◽

Ingela Franck Lissbrant ◽

Pär Stattin

Keyword(s):

Prostate Cancer ◽

Real World ◽

Clinical Care ◽

Prostate Cancer Risk ◽

Real World Data ◽

Level Of Evidence ◽

World Data ◽

Cancer Data ◽

Using Data ◽

Randomised Controlled

Real-world data (RWD), that is, data from sources other than controlled clinical trials, play an increasingly important role in medical research. The development of quality clinical registers, increasing access to administrative data sources, growing computing power and data linkage capacities have contributed to greater availability of RWD. Evidence derived from RWD increases our understanding of prostate cancer (PCa) aetiology, natural history and effective management. While randomised controlled trials offer the best level of evidence for establishing the efficacy of medical interventions and making causal inferences, studies using RWD offer complementary evidence about the effectiveness, long-term outcomes and safety of interventions in real-world settings. RWD provide the only means of addressing questions about risk factors and exposures that cannot be “controlled”, or when assessing rare outcomes. This review provides examples of the value of RWD for generating evidence about PCa, focusing on studies using data from a quality clinical register, namely the National Prostate Cancer Register (NPCR) Sweden, with longitudinal data on advanced PCa in Patient-overview Prostate Cancer (PPC) and data linkages to other sources in Prostate Cancer data Base Sweden (PCBaSe).


Clinical care: Real‐world data

Diabetic Medicine ◽

10.1111/dme.5_14244 ◽

2020 ◽

Vol 37 (S1) ◽

pp. 14-16

Keyword(s):

Real World ◽

Clinical Care ◽

Real World Data ◽

World Data


Quality Improvement in Perinatal Medicine and Translation of Preterm Birth Research Findings into Clinical Care

Clinics in Perinatology ◽

10.1016/j.clp.2018.01.003 ◽

2018 ◽

Vol 45 (2) ◽

pp. 155-163 ◽

Cited By ~ 2

Author(s):

Tracy A. Manuck ◽

Rebecca C. Fry ◽

Barbara L. McFarlin

Keyword(s):

Quality Improvement ◽

Preterm Birth ◽

Clinical Care ◽

Perinatal Medicine ◽

Research Findings


Systematic review of current natural language processing methods and applications in cardiology

Heart ◽

10.1136/heartjnl-2021-319769 ◽

2021 ◽

pp. heartjnl-2021-319769

Author(s):

Meghan Reading Turchioe ◽

Alexander Volodarskiy ◽

Jyotishman Pathak ◽

Drew N Wright ◽

James Enlou Tcheng ◽

...

Keyword(s):

Systematic Review ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Clinical Care ◽

Real World Data ◽

Clinical Text ◽

Clinical Notes ◽

Artery Disease ◽

Automated Methods

Natural language processing (NLP) is a set of automated methods to organise and evaluate the information contained in unstructured clinical notes, which are a rich source of real-world data from clinical care that may be used to improve outcomes and understanding of disease in cardiology. The purpose of this systematic review is to provide an understanding of NLP, review how it has been used to date within cardiology and illustrate the opportunities that this approach provides for both research and clinical care. We systematically searched six scholarly databases (ACM Digital Library, arXiv, Embase, IEEE Xplore, PubMed and Scopus) for studies published in 2015–2020 describing the development or application of NLP methods for clinical text focused on cardiac disease. Studies not published in English, lacking a description of NLP methods, non-cardiac focused and duplicates were excluded. Two independent reviewers extracted general study information, clinical details and NLP details and appraised quality using a checklist of quality indicators for NLP studies. We identified 37 studies developing and applying NLP in heart failure, imaging, coronary artery disease, electrophysiology, general cardiology and valvular heart disease. Most studies used NLP to identify patients with a specific diagnosis and extract disease severity using rule-based NLP methods. Some used NLP algorithms to predict clinical outcomes. A major limitation is the inability to aggregate findings across studies due to vastly different NLP methods, evaluation and reporting. This review reveals numerous opportunities for future NLP work in cardiology with more diverse patient samples, cardiac diseases, datasets, methods and applications.
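To make "rule-based NLP" concrete, the sketch below shows one way such a rule might look: a regular expression that pulls an ejection-fraction percentage out of free-text notes. It is purely illustrative and is not taken from any study in the review; the pattern, the function names, the example notes, and the default threshold of 40 are all assumptions chosen for the example.

```python
import re

# Hypothetical rule: an EF keyword, up to 20 non-digit characters,
# then a one- or two-digit number followed by a percent sign.
EF_PATTERN = re.compile(
    r"\b(?:LVEF|EF|ejection fraction)\b[^0-9]{0,20}?(\d{1,2})\s*%",
    re.IGNORECASE,
)


def extract_ef(note):
    """Return the first ejection-fraction percentage in the note, or None."""
    match = EF_PATTERN.search(note)
    return int(match.group(1)) if match else None


def flag_reduced_ef(note, threshold=40):
    """Return True/False if an EF value is found, None if no value is found.

    The threshold is an illustrative default, not a clinical recommendation.
    """
    ef = extract_ef(note)
    return None if ef is None else ef < threshold


# Made-up example notes.
notes = [
    "Echo today: LVEF is 35%.",
    "Ejection fraction of approximately 55 %.",
    "No echocardiogram on file.",
]
for note in notes:
    print(repr(note), extract_ef(note), flag_reduced_ef(note))
```

Rules like this are transparent and easy to audit, but they miss phrasings the pattern does not anticipate (ranges such as "30-35%", negations, or values reported without a percent sign), which is one reason results are hard to compare across studies that use different rule sets.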


Teaching reproducible research for medical students and postgraduate pharmaceutical scientists

BMC Research Notes ◽

10.1186/s13104-021-05862-8 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Andreas D. Meid

Keyword(s):

Doctoral Students ◽

Scientific Community ◽

Natural Sciences ◽

Interdisciplinary Teams ◽

Reproducible Research ◽

Pharmaceutical Sciences ◽

The Public ◽

Academic Settings ◽

Research Findings ◽

Good Research

In medicine and other academic settings, (doctoral) students often work in interdisciplinary teams together with researchers of pharmaceutical sciences, natural sciences in general, or biostatistics. They should be taught the fundamentals of good research practice, especially in terms of statistical analysis. This includes reproducibility as a central aspect. Acknowledging that even experienced researchers and supervisors might be unfamiliar with necessary aspects of a perfectly reproducible workflow, a lecture series on reproducible research (RR) was developed for young scientists in clinical pharmacology. The pilot series highlighted definitions of RR, reasons for RR, potential merits of RR, and ways to work accordingly. In trying to actually reproduce a published analysis, several practical obstacles arose. In this article, the reproduction of a working example is accompanied by commentary to emphasize the manifold facets of RR, to provide possible explanations for difficulties and solutions, and to argue that harmonized curricula for (quantitative) clinical researchers should include RR principles. These experiences should raise awareness among educators and students, supervisors and young scientists. RR working habits are not only beneficial for ourselves or our students, but also for other researchers within an institution, for scientific partners, for the scientific community, and eventually for the public profiting from research findings.


Mathematics and Statistics in Anesthesiology

10.2310/anes.18276 ◽

2018 ◽

Author(s):

Daniel Mortlock

Keyword(s):

Probability Theory ◽

Data Analysis ◽

Survey Design ◽

Probability Distributions ◽

Statistical Tests ◽

Real World Data ◽

Mathematical Functions ◽

P Values ◽

Bayesian Data Analysis ◽

Probability And Statistics

Mathematics is the language of quantitative science, and probability and statistics are the extension of classical logic to real world data analysis and experimental design. The basics of mathematical functions and probability theory are summarized here, providing the tools for statistical modeling and assessment of experimental results. There is a focus on the Bayesian approach to such problems (ie, Bayesian data analysis); therefore, the basic laws of probability are stated, along with several standard probability distributions (eg, binomial, Poisson, Gaussian). A number of standard classical tests (eg, p values, the t-test) are also defined and, to the degree possible, linked to the underlying principles of probability theory. This review contains 5 figures, 1 table, and 15 references. Keywords: Bayesian data analysis, mathematical models, power analysis, probability, p values, statistical tests, statistics, survey design
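The three standard distributions named in this abstract, and the Bayesian use of one of them, can be written down in a few lines. The sketch below is not from the chapter: the function names, the grid approximation, and the example counts (7 successes in 10 trials) are assumptions made for illustration. It uses only the Python standard library.

```python
import math


def binomial_pmf(k, n, p):
    """P(K = k) for k successes in n independent trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(K = k) for a Poisson count with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def gaussian_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def posterior_mean_grid(k, n, grid_size=1001):
    """Posterior mean of the success probability under a uniform prior.

    Bayes' theorem on a grid: with a flat prior the posterior weight of
    each candidate value is proportional to the binomial likelihood.
    """
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    weights = [binomial_pmf(k, n, theta) for theta in thetas]
    total = sum(weights)
    return sum(theta * w for theta, w in zip(thetas, weights)) / total


# Hypothetical data: 7 successes in 10 trials.
print(posterior_mean_grid(7, 10))
```

For a uniform prior the exact posterior is a Beta(k + 1, n − k + 1) distribution with mean (k + 1) / (n + 2), so the grid result can be checked against 8/12 for the example counts.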


Digital Storytelling as a Self-Regulated Learning Tool

Advances in Media, Entertainment, and the Arts - Handbook of Research on Transmedia Storytelling and Narrative Strategies ◽

10.4018/978-1-5225-5357-1.ch011 ◽

2019 ◽

pp. 209-232 ◽

Cited By ~ 1

Author(s):

Sinan Kaya

Keyword(s):

Learning Process ◽

Digital Storytelling ◽

Learning Processes ◽

Learning Tools ◽

Learning Tool ◽

Self Regulated Learning ◽

Regulated Learning ◽

Digital Stories ◽

Research Findings ◽

Made In

The purpose of this chapter is to examine digital storytelling as a self-regulated learning tool by uncovering the relationship between digital storytelling and the self-regulated learning process, based on research findings in the field. To this end, the concept of digital storytelling is first addressed theoretically and research on its use in learning and teaching is presented; self-regulated learning processes and strategies are then defined and illustrated with examples. Finally, research findings on the use of digital stories as self-regulated learning tools are shared.


Future of Small Business E-Commerce

Electronic Commerce ◽

10.4018/978-1-59904-943-4.ch166 ◽

2011 ◽

pp. 2159-2163 ◽

Cited By ~ 1

Author(s):

Simpson Poon

Keyword(s):

Small Business ◽

Small Businesses ◽

Electronic Mail ◽

The Internet ◽

File Transfer Protocol ◽

File Transfer ◽

Research Findings ◽

Use Of The Internet ◽

In The Beginning ◽

E Mail

The use of the Internet for business purposes among small businesses started quite early in the e-commerce evolution. In the beginning, innovative and entrepreneurial owners of small businesses attempted to use rudimentary Internet tools such as electronic mail (e-mail) and file transfer protocol (FTP) to exchange messages and documents. While primitive, these tools fulfilled many of the business needs at the time. Even today, e-mail and document exchange, according to some of the latest research findings, are still the most commonly used tools, even though the tools themselves have become more sophisticated.


Resting EEG Measures of Brain Arousal in a Multisite Study of Major Depression

Clinical EEG and Neuroscience ◽

10.1177/1550059418795578 ◽

2018 ◽

Vol 50 (1) ◽

pp. 3-12 ◽

Cited By ~ 11

Author(s):

Christine Ulke ◽

Craig E. Tenke ◽

Jürgen Kayser ◽

Christian Sander ◽

Daniel Böttger ◽

...

Keyword(s):

Low Voltage ◽

Clinical Care ◽

The United States ◽

Alpha Activity ◽

Healthy Controls ◽

Significant Group ◽

Depressed Patients ◽

Eyes Closed ◽

Eeg Recordings ◽

Brain Arousal

Several studies have found upregulated brain arousal during 15-minute EEG recordings at rest in depressed patients. However, studies based on shorter EEG recording intervals are lacking. Here we aimed to compare measures of brain arousal obtained from 2-minute EEGs at rest under eyes-closed condition in depressed patients and healthy controls in a multisite project, Establishing Moderators and Biosignatures of Antidepressant Response for Clinical Care (EMBARC). We expected that depressed patients would show stable and elevated brain arousal relative to controls. Eighty-seven depressed patients and 36 healthy controls from four research sites in the United States were included in the analyses. The Vigilance Algorithm Leipzig (VIGALL) was used for the fully automatic classification of EEG-vigilance stages (indicating arousal states) of 1-second EEG segments; VIGALL-derived measures of brain arousal were calculated. We found that depressed patients scored higher on arousal stability (Z = −2.163, P = .015) and A stages (dominant alpha activity; P = .027) but lower on B1 stages (low-voltage non-alpha activity; P = .008) compared with healthy controls. No significant group differences were observed in Stage B2/3. In summary, we were able to demonstrate stable and elevated brain arousal during brief 2-minute recordings at rest in depressed patients. Results set the stage for examining the value of these measures for predicting clinical response to antidepressants in the entire EMBARC sample and evaluating whether an upregulated brain arousal is particularly characteristic for responders to antidepressants.


Activities for Students: Using Graphing Calculators to Model Real-World Data

Mathematics Teacher ◽

10.5951/mt.97.5.0328 ◽

2004 ◽

Vol 97 (5) ◽

pp. 328-342

Author(s):

Berchie W. Holliday ◽

Lauren R. Duff

Keyword(s):

Beginning Teachers ◽

Linear Equations ◽

Graphing Calculators ◽

Graphing Calculator ◽

Line Graphs ◽

Data Sets ◽

Linear Modeling ◽

Real World Data ◽

In The Beginning ◽

Best Fit

Mathematics teachers understand that calculators have revolutionized the teaching of secondary school mathematics. After students have demonstrated their abilities to perform such computations without calculators, calculators can free students and teachers from performing redundant computations. Graphing calculators, in particular, free students from computing dependent values needed to construct line graphs, for example. But one problem is how to teach students to use a graphing calculator to plot, calculate, and graph linear equations of best fit from real-world data. Another problem is getting students to engage in the task and construct an increasingly useful conceptualization of linear modeling. In the beginning, teachers should, perhaps, provide direct instruction, followed by modeling how to enter and graph data sets efficiently.
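For readers who want to see the arithmetic behind a "linear equation of best fit", the sketch below computes a least-squares line by hand. It assumes the best-fit line is the ordinary least-squares line; the article's own activity and data are not reproduced here, and the five data points are made up for illustration.

```python
def best_fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements (for example, time in hours vs. distance in km).
xs = [0, 1, 2, 3, 4]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]

slope, intercept = best_fit_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

Working through the sums once by hand, and only then letting a calculator or program repeat them, matches the sequence the abstract recommends: demonstrate the computation first, then use technology to remove the repetition.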
