Learning Classifier Systems (LCSs) are a paradigm of rule-based evolutionary computation (EC). LCSs excel at data-mining tasks in which the goal is to help humans understand the explored problem, often by visualizing the discovered patterns that link features to classes. Due to the stochastic nature of EC, LCSs unavoidably produce and retain redundant rules, which obscure these patterns. Rule compaction methods are therefore invoked to produce a better population by removing problematic rules. Previously, compaction methods have neither been tested on large-scale problems nor been assessed on how well they capture patterns. We review and test the most popular compaction algorithms, finding that across multiple LCSs’ populations for the same task, the redundant rules can differ but the accurate rules are common. Furthermore, the patterns these rules contain consistently reflect the nature of the explored domain, e.g., the data distribution or the importance of features for determining actions. This extends the [O] set hypothesis proposed by Butz et al. [1], in which an LCS is expected to evolve a minimal number of non-overlapping rules to represent an addressed domain. Two new compaction algorithms are introduced that search at the rule level and at the population level by compacting multiple LCSs’ populations. Two visualization methods are employed to verify the interpretability of the compacted populations. Successful compaction is demonstrated on complex and real problems with clean datasets, e.g., the 11-bit Majority-On problem, in which 924 different interacting rules in the optimal solution must be uniquely identified to enable correct visualization. For the first time, the patterns contained in learned models for the large-scale 70-bit Multiplexer problem are visualized successfully.
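As a rough check on the figure of 924 rules for the 11-bit Majority-On problem, the sketch below enumerates one plausible form of the optimal rule set. It rests on an assumption not stated in the text: that each maximally general accurate rule fixes exactly six of the eleven bits to the same value (all 1s for class 1, all 0s for class 0) and leaves the remaining bits as don't-care symbols (`#`). The function names `majority_on`, `candidate_optimal_rules`, and `matches` are hypothetical and serve only as illustration.

```python
from itertools import combinations
from math import comb


def majority_on(bits):
    """Return 1 if more than half of the bits are 1, otherwise 0."""
    return int(sum(bits) > len(bits) / 2)


def candidate_optimal_rules(n):
    """Enumerate (condition, action) pairs that fix a bare majority of bits.

    Assumption (not taken from the source): for odd n, a maximally general
    accurate rule fixes k = n // 2 + 1 positions to one value and leaves
    every other position as '#' (don't care).
    """
    k = n // 2 + 1
    rules = []
    for value, action in (("1", 1), ("0", 0)):
        for positions in combinations(range(n), k):
            condition = ["#"] * n
            for p in positions:
                condition[p] = value
            rules.append(("".join(condition), action))
    return rules


def matches(condition, bits):
    """True if every specified position in the condition equals the input bit."""
    return all(c == "#" or int(c) == b for c, b in zip(condition, bits))


rules = candidate_optimal_rules(11)
print(len(rules), 2 * comb(11, 6))  # 924 924
```

Under this assumption the count is 2 × C(11, 6) = 924, which agrees with the number quoted above. Because many of these rules match the same input, the rules overlap heavily, which is consistent with the description of them as interacting.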