A Case Study
The purpose of this chapter is to evaluate TEDIUM. Evaluation is similar to correctness in that both are always with respect to some external criteria. What criteria should be used for evaluating an environment that develops and maintains software applications using a new paradigm? Clearly, the criteria of the old paradigm (e.g., lines of code, measures of complexity, effort distributed among phases) are irrelevant.

In the early days of medical computing, Barnett playfully suggested the following three criteria for evaluating automated medical systems: . . . will people use it? will people pay for it? will people steal it? . . . At the time, the answers to the first two questions frequently were negative, and Barnett’s pragmatic approach was intended to prod the field from theory to practice. TEDIUM is used and paid for, but its techniques have not been transported to other environments (i.e., it has not yet been stolen). I console myself by observing that a lack of recognition need not imply an absence of value. The transfer of ideas often is a product of the marketplace, where acceptance depends more on perception than on quantification.

As we have seen throughout this book, there can be vast differences between what we care about and what is measurable. Real projects tend to be large, difficult to structure for comparative studies, and highly dependent on local conditions. In contrast, toy studies are easy to control and analyze, but they seldom scale up or have much credibility.

How, then, should I evaluate TEDIUM? I have tried a number of strategies: I have analyzed small projects in detail, reported on standard problems by comparing TEDIUM data with published results, presented and interpreted summary data taken from large projects, extracted evaluation criteria from other sources, and examined how TEDIUM alters the software process. All of this was summed up in TEDIUM and the Software Process (1990a).