The Utility of Item-Level Analyses in Model Evaluation: A Reply to Seidenberg and Plaut

1998 ◽  
Vol 9 (3) ◽  
pp. 238-240 ◽  
Author(s):  
David A. Balota ◽  
Daniel H. Spieler

Seidenberg and Plaut (this issue) argue that the implications of our analyses (Spieler & Balota, 1997) for the two extant connectionist models of word naming are limited by two factors. First, variables outside the scope of these models influence naming performance, so it is not surprising that the models do not account for much of the variance at the item level. Second, there is error variance associated with large item-level data sets that obviously should not be captured by these models. We point out that there are a number of variables that have been incorporated within the targeted connectionist models that should provide these models an advantage over the simple predictor variables that we selected as a baseline to evaluate the efficacy of the models (e.g., log frequency, length in letters, and number of orthographic neighbors). We also point out that there is considerable consistency across four large-scale studies of item means. Finally, we provide evidence that even under conditions of a standard word-naming study (with a small set of items), simple word frequency, orthographic neighborhoods, and length accounted for more variance than the extant connectionist models. We conclude that item-level analyses provide an important source of evidence in the evaluation of current models and the development of future models of visual word recognition.
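The cross-study consistency claim lends itself to a simple check: correlate item-level naming latencies across studies. The sketch below is illustrative only, assuming hypothetical CSV files of per-word item means; it is not the analysis reported in the paper.

```python
# Illustrative sketch (not from the original paper): estimating cross-study
# consistency of item-level naming latencies as pairwise correlations.
# File names and column labels are hypothetical.
import pandas as pd

studies = {
    "studyA": "studyA_item_means.csv",  # hypothetical files with columns: word, rt
    "studyB": "studyB_item_means.csv",
    "studyC": "studyC_item_means.csv",
    "studyD": "studyD_item_means.csv",
}

# Load each study's item means as a Series indexed by word.
frames = {
    name: pd.read_csv(path).set_index("word")["rt"].rename(name)
    for name, path in studies.items()
}

# Keep only the words for which every study provides an item mean.
merged = pd.concat(frames.values(), axis=1, join="inner")

# Pairwise Pearson correlations of item means across studies.
print(merged.corr())
```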

1997 ◽  
Vol 8 (6) ◽  
pp. 411-416 ◽  
Author(s):  
Daniel H. Spieler ◽  
David A. Balota

Early noncomputational models of word recognition have typically attempted to account for the effects of categorical factors such as word frequency (high vs. low) and spelling-to-sound regularity (regular vs. irregular). More recent computational models that adhere to general connectionist principles hold the promise of being sensitive to underlying item differences that are only approximated by these categorical factors. In contrast to earlier models, these connectionist models provide predictions of performance for individual items. In the present study, we used the item-level estimates from two connectionist models (Plaut, McClelland, Seidenberg, & Patterson, 1996; Seidenberg & McClelland, 1989) to predict naming latencies on the individual items on which the models were trained. The results indicate that the models capture, at best, slightly more variance than simple log frequency and substantially less than the combined predictive power of log frequency, neighborhood density, and orthographic length. The discussion focuses on the importance of examining the item-level performance of word-naming models and on possible approaches that may improve the models' sensitivity to such item differences.
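The variance comparison described above is, at its core, a set of item-level regressions. The following sketch shows the general form, assuming a hypothetical data file with one row per word and hypothetical column names (naming_rt, model_estimate, log_frequency, neighborhood_n, length_letters); it is not the authors' code.

```python
# Illustrative sketch of an item-level regression analysis of the kind
# described above; the data file and column names are hypothetical.
import pandas as pd
import statsmodels.api as sm

items = pd.read_csv("item_level_data.csv")  # hypothetical: one row per word

def r_squared(predictors):
    """OLS R^2 for naming latency regressed on the given predictor columns."""
    X = sm.add_constant(items[predictors])
    return sm.OLS(items["naming_rt"], X).fit().rsquared

# Variance explained by a model's item-level estimate alone ...
print("model estimate:", r_squared(["model_estimate"]))

# ... versus simple lexical predictors, separately and combined.
print("log frequency:", r_squared(["log_frequency"]))
print("combined baseline:",
      r_squared(["log_frequency", "neighborhood_n", "length_letters"]))
```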


1998 ◽  
Vol 9 (3) ◽  
pp. 234-237 ◽  
Author(s):  
Mark S. Seidenberg ◽  
David C. Plaut

Spieler and Balota (1997) showed that connectionist models of reading account for relatively little item-specific variance. In assessing this finding, it is important to recognize two factors that limit how much variance such models can possibly explain. First, item means are affected by several factors that are not addressed in existing models, including processes involved in recognizing letters and producing articulatory output. These limitations point to important areas for future research but have little bearing on existing theoretical claims. Second, the item data include a substantial amount of error variance that would be inappropriate to model. Issues concerning comparisons between simulation data and human performance are discussed with an emphasis on the importance of evaluating models at a level of specificity (“grain”) appropriate to the theoretical issues being addressed.
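One way to make the error-variance point concrete is a split-half reliability estimate of the item means, which gives a rough ceiling on the item-level variance any model could be expected to explain. The sketch below is a generic illustration with placeholder data, not the authors' procedure.

```python
# Illustrative sketch: split-half reliability of item means as a rough ceiling
# on the item-level variance a model could explain.
# `rt` is assumed to be a (participants x items) array of naming latencies.
import numpy as np

rng = np.random.default_rng(0)
rt = rng.normal(600, 80, size=(60, 500))  # placeholder data; replace with real latencies

# Randomly split participants into two halves and compute item means in each half.
perm = rng.permutation(rt.shape[0])
half1 = rt[perm[: rt.shape[0] // 2]].mean(axis=0)
half2 = rt[perm[rt.shape[0] // 2 :]].mean(axis=0)

# Correlate the two sets of item means and apply the Spearman-Brown correction
# to estimate the reliability of the full-sample item means.
r = np.corrcoef(half1, half2)[0, 1]
reliability = 2 * r / (1 + r)
print(f"split-half r = {r:.3f}, Spearman-Brown reliability = {reliability:.3f}")
```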


Author(s):  
Eun-Young Mun ◽  
Anne E. Ray

Integrative data analysis (IDA) is a promising new approach in psychological research and has been well received in the field of alcohol research. This chapter provides a larger unifying research synthesis framework for IDA. Major advantages of IDA of individual participant-level data include better and more flexible ways to examine subgroups, model complex relationships, deal with methodological and clinical heterogeneity, and examine infrequently occurring behaviors. However, between-study heterogeneity in measures, designs, and samples, together with systematic study-level missing data, are significant barriers to IDA and, more broadly, to large-scale research synthesis. Based on the authors' experience with the Project INTEGRATE data set, which combined individual participant-level data from 24 independent brief alcohol intervention studies for college students, the chapter also recognizes that IDA investigations require a wide range of expertise and considerable resources, and that minimum standards for reporting IDA studies may be needed to improve the transparency and quality of evidence.
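A typical IDA analysis pools individual participant-level records from all studies into a single data set and absorbs between-study heterogeneity with study-level random effects. The sketch below illustrates that structure with hypothetical file and column names; it is not the Project INTEGRATE analysis.

```python
# Illustrative sketch of an IDA-style pooled analysis: individual
# participant-level records from several studies combined into one data set,
# with between-study heterogeneity handled by a random intercept per study.
# File and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

pooled = pd.concat(
    [pd.read_csv(f"study_{i}.csv").assign(study=i) for i in range(1, 25)],
    ignore_index=True,
)

# Outcome (e.g., drinks per week) regressed on intervention status and a
# participant-level covariate, with a random intercept for study.
model = smf.mixedlm(
    "drinks_per_week ~ intervention + baseline_drinks",
    data=pooled,
    groups=pooled["study"],
)
print(model.fit().summary())
```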


Author(s):  
Lior Shamir

Abstract Several recent observations using large data sets of galaxies have shown a non-random distribution of the spin directions of spiral galaxies, even when the galaxies are too far from each other to interact gravitationally. Here, a data set of $\sim8.7\cdot10^3$ spiral galaxies imaged by the Hubble Space Telescope (HST) is used to test and profile a possible asymmetry between galaxy spin directions. The asymmetry between galaxies with opposite spin directions is compared to the asymmetry of galaxies from the Sloan Digital Sky Survey (SDSS). The two data sets contain different galaxies at different redshift ranges, and each was annotated using a different method. Both data sets exhibit a similar asymmetry in the COSMOS field, which is covered by both telescopes. Fitting the asymmetry of the galaxies to a cosine dependence yields a dipole axis with statistical significance of $\sim2.8\sigma$ in HST and $\sim7.38\sigma$ in SDSS. The most likely dipole axis identified in the HST galaxies is at $(\alpha=78^{\circ},\delta=47^{\circ})$, well within the $1\sigma$ error range of the most likely dipole axis in the SDSS galaxies with $z>0.15$, identified at $(\alpha=71^{\circ},\delta=61^{\circ})$.
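The dipole-axis analysis fits the spin-direction asymmetry to a cosine of the angular distance from a candidate axis and scans candidate axes for the strongest fit. The sketch below shows that general idea with simplified, randomly generated placeholder inputs; it is not the paper's annotation or fitting pipeline.

```python
# Illustrative sketch of fitting a dipole (cosine) dependence to galaxy
# spin-direction asymmetry; inputs are simplified placeholders.
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical per-galaxy data: sky position (radians) and spin sign (+1/-1).
rng = np.random.default_rng(1)
ra = rng.uniform(0, 2 * np.pi, 5000)
dec = rng.uniform(-np.pi / 2, np.pi / 2, 5000)
spin = rng.choice([-1, 1], size=5000)

def angular_distance(ra1, dec1, ra2, dec2):
    """Great-circle angle between two sky positions (radians)."""
    return np.arccos(
        np.clip(
            np.sin(dec1) * np.sin(dec2)
            + np.cos(dec1) * np.cos(dec2) * np.cos(ra1 - ra2),
            -1.0,
            1.0,
        )
    )

def dipole_fit_amplitude(axis_ra, axis_dec):
    """Fit spin sign to d*cos(theta) + c around a candidate axis; return |d|."""
    theta = angular_distance(ra, dec, axis_ra, axis_dec)
    popt, _ = curve_fit(lambda t, d, c: d * np.cos(t) + c, theta, spin)
    return abs(popt[0])

# Scan a coarse grid of candidate axes and keep the one with the largest amplitude.
grid = [(a, d)
        for a in np.linspace(0, 2 * np.pi, 36)
        for d in np.linspace(-1.4, 1.4, 18)]
best = max(grid, key=lambda ad: dipole_fit_amplitude(*ad))
print("most likely dipole axis (ra, dec in radians):", best)
```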


2021 ◽  
Vol 09 (02) ◽  
pp. E233-E238
Author(s):  
Rajesh N. Keswani ◽  
Daniel Byrd ◽  
Florencia Garcia Vicente ◽  
J. Alex Heller ◽  
Matthew Klug ◽  
...  

Abstract Background and study aims: Storage of full-length endoscopic procedure videos is becoming increasingly popular. To facilitate large-scale machine learning (ML) focused on clinical outcomes, these videos must be merged with patient-level data in the electronic health record (EHR). Our aim was to present a method of accurately linking patient-level EHR data with cloud-stored colonoscopy videos.
Methods: This study was conducted at a single academic medical center. Most procedure videos are automatically uploaded to the cloud server but are identified only by procedure time and procedure room. We developed and then tested an algorithm to match recorded videos with the corresponding exams in the EHR based upon procedure time and room, and subsequently to extract frames of interest.
Results: Among 28,611 total colonoscopies performed over the study period, 21,170 colonoscopy videos in 20,420 unique patients (54.2% male, median age 58) were matched to EHR data. Of 100 randomly sampled videos, appropriate matching was manually confirmed in all. In total, these videos represented 489,721 minutes of colonoscopy performed by 50 endoscopists (median 214 colonoscopies per endoscopist). The most common procedure indications were polyp screening (47.3%), surveillance (28.9%), and inflammatory bowel disease (9.4%). From these videos, we extracted procedure highlights (identified by image capture; mean 8.5 per colonoscopy) and the surrounding frames.
Conclusions: We report the successful and highly accurate merging of a large database of endoscopy videos, stored with limited identifiers, with rich patient-level data. This technique facilitates the development of ML algorithms based upon relevant patient outcomes.
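The matching step can be approximated by joining each video's room and start time to the EHR procedure in the same room with the nearest start time, within a tolerance. The sketch below illustrates this with hypothetical file and column names and an assumed 15-minute tolerance; it is not the authors' algorithm.

```python
# Illustrative sketch of matching cloud-stored procedure videos to EHR records
# by procedure room and closest start time; file and column names are hypothetical.
import pandas as pd

videos = pd.read_csv("videos.csv", parse_dates=["start_time"])    # video_id, room, start_time
procedures = pd.read_csv("ehr.csv", parse_dates=["start_time"])   # accession, room, start_time

matches = []
for room, vids in videos.groupby("room"):
    procs = procedures[procedures["room"] == room].sort_values("start_time")
    if procs.empty:
        continue
    # For each video, take the EHR procedure in the same room with the nearest
    # start time, accepting it only if it falls within a 15-minute tolerance.
    merged = pd.merge_asof(
        vids.sort_values("start_time"),
        procs,
        on="start_time",
        direction="nearest",
        tolerance=pd.Timedelta("15min"),
    )
    matches.append(merged.dropna(subset=["accession"]))

matched = pd.concat(matches, ignore_index=True)
print(f"matched {len(matched)} of {len(videos)} videos")
```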


Author(s):  
Na Li ◽  
Baofeng Jiao ◽  
Lingkun Ran ◽  
Zongting Gao ◽  
Shouting Gao

Abstract We investigated the influence of upstream terrain on the formation of a cold frontal snowband in Northeast China. We conducted numerical sensitivity experiments that gradually removed the upstream terrain and compared the results with a control experiment. Our results indicate a clear negative effect of upstream terrain on the formation of snowbands, especially over large-scale terrain. By thoroughly examining the ingredients necessary for snowfall (instability, lifting, and moisture), we found that the release of mid-level conditional instability, followed by the release of low-level or near-surface instabilities (inertial instability, conditional instability, or conditional symmetric instability), contributed to the formation of the snowband in both experiments. The lifting required for the release of these instabilities was mainly a result of frontogenetic forcing and upper-level gravity waves. However, the snowband in the control experiment developed later and was weaker than that in the experiment without upstream terrain. Two factors contributed to this negative topographic effect: (1) mountain gravity waves over the upstream terrain, which perturbed the frontogenetic circulation by rapidly changing the vertical motion and therefore did not favor the release of instabilities in the absence of persistent ascending motion; and (2) the reduced moisture supply caused by blocking by the upstream terrain, which changed both the moisture and instability structures leeward of the mountains. A conceptual model is presented that shows the effects of the instabilities and lifting on the development of cold frontal snowbands in downstream mountain areas.


Algorithms ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 154
Author(s):  
Marcus Walldén ◽  
Masao Okita ◽  
Fumihiko Ino ◽  
Dimitris Drikakis ◽  
Ioannis Kokkinakis

Increasing processing capabilities and input/output constraints of supercomputers have increased the use of co-processing approaches, i.e., visualizing and analyzing simulation data sets on the fly. We present a method that evaluates the importance of different regions of simulation data, and a data-driven approach that uses the proposed method to accelerate in-transit co-processing of large-scale simulations. We use the importance metrics to employ multiple compression methods simultaneously on different data regions, accelerating the in-transit co-processing. Our approach adaptively compresses data on the fly and uses load balancing to counteract memory imbalances. We demonstrate the method's efficiency through a fluid mechanics application, a Richtmyer–Meshkov instability simulation, showing how to accelerate the in-transit co-processing of simulations. The results show that the proposed method can expeditiously identify regions of interest, even when using multiple metrics. Our approach achieved a speedup of 1.29× in a lossless scenario, and data decompression was sped up by 2× compared to using a single compression method uniformly.
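The core idea of compressing regions differently according to an importance metric can be sketched as follows. The metric (local variance), block size, and threshold below are placeholders, not the choices made in the paper.

```python
# Illustrative sketch: rank simulation regions by an importance metric and
# compress important blocks losslessly while lossy-compressing the rest.
# The metric, block size, and threshold are placeholders.
import numpy as np
import zlib

def block_importance(block):
    """Toy importance metric: local variance (high variance = region of interest)."""
    return float(np.var(block))

def compress_block(block, important):
    data = block.astype(np.float32)
    if not important:
        # Lossy step: quantize to reduce entropy before the entropy coder.
        data = np.round(data, 2)
    return zlib.compress(data.tobytes())

# Hypothetical simulation field split into 16^3 blocks.
field = np.random.default_rng(2).normal(size=(128, 128, 128))
blocks = [
    field[i:i + 16, j:j + 16, k:k + 16]
    for i in range(0, 128, 16)
    for j in range(0, 128, 16)
    for k in range(0, 128, 16)
]

scores = np.array([block_importance(b) for b in blocks])
threshold = np.quantile(scores, 0.9)  # keep the top 10% of blocks lossless
compressed = [compress_block(b, s >= threshold) for b, s in zip(blocks, scores)]
print("total compressed bytes:", sum(len(c) for c in compressed))
```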


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Xin Chen ◽  
Wei Hou ◽  
Sina Rashidian ◽  
Yu Wang ◽  
Xia Zhao ◽  
...  

Abstract Opioid overdose-related deaths have increased dramatically in recent years. Combating the opioid epidemic requires a better understanding of the epidemiology of opioid poisoning (OP). To discover trends and patterns of opioid poisoning and the associated demographic and regional disparities, we analyzed large-scale patient visit data in New York State (NYS). Demographic, spatial, temporal, and correlation analyses were performed for all OP patients extracted from the claims data in the New York Statewide Planning and Research Cooperative System (SPARCS) from 2010 to 2016, along with Decennial US Census and American Community Survey zip-code-level data. A total of 58,481 patients with at least one OP diagnosis and a valid NYS zip code address were included. Main outcomes and measures included OP patient counts and rates per 100,000 population, patient-level factors (gender, age, race and ethnicity, residential zip code), and zip-code-level sociodemographic factors. The results showed that the OP rate increased by 364.6% overall, and by 741.5% for the age group > 65 years. There were wide disparities among racial and ethnic groups in the rates and age distributions of OP. Heroin-based and non-heroin-based OP rates demonstrated distinct temporal trends as well as major geospatial variation. The findings highlight strong demographic disparities among OP patients, evolving temporal patterns, and substantial geospatial variation.
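Rates per 100,000 population by year are a straightforward aggregation over the patient-level records. The sketch below shows the calculation with hypothetical file and column names; it is not the SPARCS analysis itself.

```python
# Illustrative sketch: opioid-poisoning rates per 100,000 population by year,
# from patient-level visit records and zip-code-level population counts.
# File and column names are hypothetical.
import pandas as pd

patients = pd.read_csv("op_patients.csv")       # patient_id, year, zip_code, age_group
population = pd.read_csv("zip_population.csv")  # zip_code, population

# Count distinct OP patients per year, then convert to a rate per 100,000.
counts = patients.groupby("year")["patient_id"].nunique()
total_pop = population["population"].sum()
rates = counts / total_pop * 100_000

# Percentage change between the first and last year of the series.
change = (rates.iloc[-1] - rates.iloc[0]) / rates.iloc[0] * 100
print(rates)
print(f"rate change over the period: {change:.1f}%")
```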


2021 ◽  
pp. 026553222199547
Author(s):  
Shangchao Min ◽  
Lianzhen He

In this study, we present the development of individualized feedback for a large-scale listening assessment by combining standard-setting and cognitive diagnostic assessment (CDA) approaches. We used performance data from 3358 students' item-level responses to a field test of a national EFL test intended primarily for tertiary-level EFL learners. The results showed that the proficiency classifications and subskill mastery classifications were generally of acceptable reliability, and that the two kinds of classifications were in alignment with each other at the individual and group levels. The outcome of the study is a set of descriptors that describe each test taker's ability to understand oral texts at a given level, together with his or her cognitive performance. By illustrating the feasibility of combining standard-setting and CDA approaches to produce individualized feedback, the current study contributes to the enhancement of score reporting and addresses the long-standing criticism that large-scale language assessments fail to provide individualized feedback that links assessment with instruction.
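A fitted cognitive diagnostic model is beyond a short example, but the basic bookkeeping behind subskill mastery classification, mapping item-level responses through a Q-matrix to per-subskill scores and comparing them to a cut score, can be illustrated as follows. All dimensions and the 0.67 cut score are hypothetical, and this proportion-correct shortcut stands in for an actual CDA model.

```python
# Illustrative sketch of the Q-matrix bookkeeping behind subskill mastery
# classification: per-subskill proportion correct compared to a cut score.
# This is a simplification, not a fitted cognitive diagnostic model;
# all dimensions and the 0.67 cut score are hypothetical.
import numpy as np

# responses: test takers x items (1 = correct); q_matrix: items x subskills.
rng = np.random.default_rng(3)
responses = rng.integers(0, 2, size=(3358, 40))
q_matrix = rng.integers(0, 2, size=(40, 5))

# Proportion correct on the items that tap each subskill.
items_per_skill = q_matrix.sum(axis=0)                 # items tapping each subskill
skill_scores = responses @ q_matrix / items_per_skill  # test takers x subskills

# Classify mastery of each subskill against the cut score.
mastery = skill_scores >= 0.67
print("subskill mastery rates:", mastery.mean(axis=0).round(2))
```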

