Towards objective measures of algorithm performance across instance space

2014 · Vol 45 · pp. 12-24
Author(s): Kate Smith-Miles, Davaatseren Baatar, Brendan Wreford, Rhyd Lewis
2017 · Vol 25 (4) · pp. 529-554
Author(s): Mario A. Muñoz, Kate A. Smith-Miles

This article presents a method for the objective assessment of an algorithm’s strengths and weaknesses. Instead of examining the performance of only one or more algorithms on a benchmark set, or generating custom problems that maximize the performance difference between two algorithms, our method quantifies both the nature of the test instances and the algorithm performance. Our aim is to gather information about possible phase transitions in performance, that is, the points at which a small change in problem structure produces algorithm failure. The method is based on the accurate estimation and characterization of algorithm footprints, that is, the regions of instance space in which good or exceptional performance is expected from an algorithm. A footprint can be estimated for each algorithm and for the overall portfolio. Therefore, we select a set of features to generate a common instance space, which we validate by constructing a sufficiently accurate prediction model. We characterize the footprints by their area and density. Our method identifies complementary performance between algorithms, quantifies the common features of hard problems, and locates regions where a phase transition may lie.
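As a rough illustration of the footprint idea, the sketch below computes the area and density of one footprint, assuming instances have already been projected to a 2D instance space and labeled as "good" performance. It approximates the footprint by the convex hull of the good instances; the paper's actual estimation procedure (e.g., pruning regions that contradict observed performance) is more involved than this.

```python
# Hedged sketch: footprint area and density in a 2D instance space.
# Assumes instances are already projected to 2D and labeled "good";
# the footprint is approximated by their convex hull.

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    s = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def footprint_stats(good_instances):
    """Area of the footprint and density of good instances within it."""
    hull = convex_hull(good_instances)
    area = polygon_area(hull)
    density = len(good_instances) / area if area > 0 else float("inf")
    return area, density

# Seven good instances; the hull is the 4x3 rectangle around them.
good = [(0, 0), (4, 0), (4, 3), (0, 3), (1, 1), (2, 2), (3, 1)]
area, density = footprint_stats(good)
print(area, density)  # → 12.0 0.5833...
```

Comparing footprints by area (how much of the space an algorithm covers) and density (how well sampled that region is) is what allows the complementary strengths of a portfolio to be quantified rather than eyeballed.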


2021 · Vol 15 (2) · pp. 1-25
Author(s): Mario Andrés Muñoz, Tao Yan, Matheus R. Leal, Kate Smith-Miles, Ana Carolina Lorena, ...

The quest for greater insights into algorithm strengths and weaknesses, as revealed when studying algorithm performance on large collections of test problems, is supported by interactive visual analytics tools. A recent advance is Instance Space Analysis, which presents a visualization of the space occupied by the test datasets, and the performance of algorithms across the instance space. The strengths and weaknesses of algorithms can be visually assessed, and the adequacy of the test datasets can be scrutinized through visual analytics. This article presents the first Instance Space Analysis of regression problems in Machine Learning, considering the performance of 14 popular algorithms on 4,855 test datasets from a variety of sources. The two-dimensional instance space is defined by measurable characteristics of regression problems, selected from over 26 candidate features. It enables the similarities and differences between test instances to be visualized, along with the predictive performance of regression algorithms across the entire instance space. The purpose of creating this framework for visual analysis of an instance space is twofold: first, to assess the capability and suitability of various regression techniques; and second, to visually reveal the bias, diversity, and level of difficulty of the regression problems commonly used by the community. This article shows the applicability of the created regression instance space to provide insights into the strengths and weaknesses of regression algorithms, and the opportunities to diversify the benchmark test instances to support greater insights.
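The core mechanical step in the analysis above is mapping each dataset, described by a vector of meta-features, to a point in a 2D instance space. Instance Space Analysis fits a tailored linear projection for this; the sketch below uses plain PCA as an illustrative stand-in, with an invented meta-feature matrix.

```python
# Hedged sketch: projecting datasets described by meta-features into a
# 2D "instance space". Instance Space Analysis fits a purpose-built
# linear projection; ordinary PCA is used here only as a stand-in,
# and the meta-feature matrix below is synthetic.
import numpy as np

def project_to_2d(features):
    """Standardize the meta-feature matrix (rows = datasets,
    columns = meta-features) and project onto the two leading
    principal components."""
    X = np.asarray(features, dtype=float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # Eigendecomposition of the covariance matrix of the features
    cov = np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    top2 = vecs[:, np.argsort(vals)[::-1][:2]]  # two largest eigenvalues
    return X @ top2

rng = np.random.default_rng(0)
meta_features = rng.normal(size=(50, 8))  # 50 datasets, 8 meta-features
coords = project_to_2d(meta_features)
print(coords.shape)  # → (50, 2)
```

Once every dataset is a 2D point, overlaying each algorithm's performance on those points is what makes the strengths, weaknesses, and benchmark gaps visible at a glance.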


1997 · Vol 40 (4) · pp. 900-911
Author(s): Marilyn E. Demorest, Lynne E. Bernstein

Ninety-six participants with normal hearing and 63 with severe-to-profound hearing impairment viewed 100 CID Sentences (Davis & Silverman, 1970) and 100 B-E Sentences (Bernstein & Eberhardt, 1986b). Objective measures included words correct, phonemes correct, and visual-phonetic distance between the stimulus and response. Subjective ratings were made on a 7-point confidence scale. Magnitude of validity coefficients ranged from .34 to .76 across materials, measures, and groups. Participants with hearing impairment had higher levels of objective performance, higher subjective ratings, and higher validity coefficients, although there were large individual differences. Regression analyses revealed that subjective ratings are predictable from stimulus length, response length, and objective performance. The ability of speechreaders to make valid performance evaluations was interpreted in terms of contemporary word recognition models.
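A validity coefficient as reported in this abstract can be read as the correlation between an objective score (e.g., words correct) and the participant's subjective confidence rating. The sketch below computes a Pearson correlation on invented illustrative data; the study's exact scoring and analysis are not reproduced here.

```python
# Hedged sketch: a validity coefficient taken as the Pearson
# correlation between objective performance and subjective ratings.
# The data below are invented for illustration only.
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

words_correct = [12, 30, 45, 50, 72, 80, 85, 91]  # objective scores (invented)
confidence = [2, 3, 3, 4, 5, 5, 6, 7]             # 7-point ratings (invented)
print(round(pearson_r(words_correct, confidence), 2))  # → 0.96
```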

