Improving Fidelity of Energy Management Software Testing Through Hierarchical Clustering of Train Consist Data
Abstract
Energy management systems, such as New York Air Brake’s LEADER [1], are real-time control technologies that optimize train performance. They operate as Level 2 autonomy systems under the SAE’s “Levels of Driving Automation” classification [2] and are now in common use on many railroads. Such systems require extensive testing because of varying speed and fuel-efficiency requirements, the need for compatibility with the wide variety of consists actually marshalled in the field, and the potential for the systems to cause break-in-twos or other undesirable situations. Devising accurate test cases that translate well to real-world usage is a common obstacle in the software development process. Using empirical data sampled from field observations and an unsupervised machine learning model, we have created a simple but effective software system that performs automated statistical analysis on train consists and recommends a small number of consists that best capture the variation observed on track. The data produced by such a system is demonstrably useful in developing representative test cases for train control systems and energy management software. In this investigation, we first applied the algorithm to a population of train consists from an arbitrary segment of North American track to identify the most representative sample. We then evaluated the performance of the LEADER driving strategy on this sample set and on one of two consists that had previously been used for ad-hoc development testing of the software. Our findings from these simulations indicate that the consists identified by the clustering algorithm display greater variation in LEADER-controlled performance than the ad-hoc testing consists do across several features, namely transit time, fuel consumption, speed limit adherence, and air brake usage. Applying the algorithm is therefore beneficial in that it allows more efficient and more thorough testing and characterization of energy management software.
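As a rough illustration of the idea summarized above, the following minimal sketch clusters a handful of consists hierarchically and picks one representative per cluster. It is not the system described in this paper: the feature set (length, trailing tons, locomotive count), the consist values, the choice of average linkage, the use of the cluster medoid as the representative, and all function names are assumptions made purely for illustration.

```python
from itertools import combinations
from math import dist

# Hypothetical consists: (length in feet, trailing tons, locomotive count).
# These values are invented for illustration and are not field data.
consists = {
    "A": (5000, 6000, 2),
    "B": (5200, 6300, 2),
    "C": (9000, 14000, 4),
    "D": (9300, 15000, 4),
    "E": (2000, 1500, 1),
    "F": (2200, 1800, 1),
}


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        (sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
        for c, m in zip(cols, means)
    ]
    return [tuple((x - m) / s for x, m, s in zip(r, means, stds)) for r in rows]


def agglomerate(points, k):
    """Average-linkage agglomerative clustering, merging until k clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between the members of two clusters.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # combinations() yields index pairs with a < b, so popping b leaves a valid.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] += clusters.pop(b)
    return clusters


def medoid(points, cluster):
    """Return the member with the smallest total distance to its cluster mates."""
    return min(cluster, key=lambda i: sum(dist(points[i], points[j]) for j in cluster))


names = list(consists)
points = standardize(list(consists.values()))
clusters = agglomerate(points, k=3)
reps = [names[medoid(points, c)] for c in clusters]
print("clusters:", [[names[i] for i in c] for c in clusters])
print("representative consists:", reps)
```

Standardizing the features first keeps a large-valued feature such as tonnage from dominating the distance. Choosing a medoid rather than a centroid means each recommended test case is a consist that was actually observed, not a synthetic average.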