The new FMVSS 208 federal regulation requires restraint systems to address occupants other than the 50th percentile male, including small adults and children. As a result, restraint systems may need to perform differently for several occupant classes, which creates a need for occupant classification systems (OCS). A typical compliance strategy is to suppress the restraint system when a child occupies the front passenger seat and to enable it when an adult occupies the seat. The regulation provides specific weight and height ranges to define these classes of seat occupants. The evolution of OCS technologies produced a need for test methodologies and objective metrics to measure classification capability. Applying the statistical one-sided tolerance interval to OCS has proven invaluable in measuring classification performance and driving system improvements. The one-sided tolerance method is based on a single continuous variable, such as weight. A single common threshold, or tolerance limit, is used to compare two competing populations, such as 6-year-old versus 5th percentile female populations. The method produces plots of reliability as a function of potential threshold, which objectively characterize a system’s classification performance. This paper also discusses the importance of applying the one-sided tolerance interval method to performance data that capture the noise sources affecting system performance. For occupant classification systems, noise sources include differences in test subjects’ sizes, how they sit in the seat, and how the seat is set up. The paper further discusses the importance of sample size selection and presents two methods of determining a sample size. The first method uses the one-sided tolerance interval equation directly. 
The second method simulates a noise source and selects a sample size at which the sample standard deviation of the noise converges to its population value. Once the mean, standard deviation, and sample size for each test case are known, the proposed method computes the reliability of each test case over a range of potential thresholds. A review of the resulting reliability curves characterizes classification performance. If an acceptable range of thresholds exists, it is referred to as a “threshold window.” System improvements can be directed toward those test cases that constrain the “threshold window.” This paper proposes a statistical method that provides a solid measure of how robustly an OCS that classifies on a single continuous variable (such as weight) can distinguish between occupant classes. This statistical method enables the careful balance necessary in setting thresholds.
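The reliability-versus-threshold computation described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes normally distributed readings and uses a common normal approximation to the one-sided tolerance factor rather than the exact noncentral t form. The function names (`reliability`, `threshold_window`, `min_sample_size`), the 95% confidence level, the 0.99 reliability target, and all means, standard deviations, and sample sizes are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def reliability(mean, sd, n, threshold, below=True, confidence=0.95):
    """Approximate lower confidence bound on the fraction of a normally
    distributed population that falls on the correct side of `threshold`.

    below=True  -> readings should be under the threshold (e.g. a child class)
    below=False -> readings should be over the threshold (e.g. an adult class)
    """
    # Distance from the sample mean to the threshold, in sample sd units.
    k = (threshold - mean) / sd if below else (mean - threshold) / sd
    z_conf = STD_NORMAL.inv_cdf(confidence)
    # Approximation (an assumption of this sketch): invert the usual normal
    # approximation to the one-sided tolerance factor to get the z-score of
    # the covered proportion.
    z_p = k - z_conf * sqrt(1.0 / n + k * k / (2.0 * (n - 1)))
    return STD_NORMAL.cdf(z_p)


def threshold_window(cases, thresholds, target=0.99, confidence=0.95):
    """Return (lowest, highest) candidate threshold at which every test case
    meets `target` reliability, or None if no candidate qualifies."""
    ok = [
        t for t in thresholds
        if all(reliability(m, s, n, t, below, confidence) >= target
               for (m, s, n, below) in cases)
    ]
    return (min(ok), max(ok)) if ok else None


def min_sample_size(k, target=0.99, confidence=0.95, max_n=10_000):
    """Smallest n at which a margin of k sample sds to the threshold
    demonstrates `target` reliability, or None if not reached by max_n."""
    for n in range(3, max_n + 1):
        if reliability(0.0, 1.0, n, k, True, confidence) >= target:
            return n
    return None


# Hypothetical test cases: (mean, sd, sample size, should read below threshold)
cases = [
    (24.0, 2.0, 30, True),   # made-up 6-year-old weight readings, kg
    (46.0, 3.0, 30, False),  # made-up 5th percentile female readings, kg
]
thresholds = [20.0 + 0.5 * i for i in range(61)]  # candidates from 20 to 50 kg

window = threshold_window(cases, thresholds)
print("threshold window:", window)
print("sample size for a 3-sd margin:", min_sample_size(3.0))
```

In this sketch the test case whose reliability curve crosses the target closest to the window edge is the one that constrains the window, which is where the text suggests directing system improvements. `min_sample_size` corresponds loosely to the first sample size method (using the tolerance equation directly); the simulation-based second method is not shown.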