Automated recovery of the module views of a software architecture from source code remains a challenging research problem. Different similarity measures are used to evaluate clustering algorithms for this task. However, few studies examine whether such measures accurately capture the similarity between two clusterings. This work evaluates six clustering similarity measures using intrinsic quality and stability measures together with ground truth architectures proposed by developers. The results suggest that the MeCl metric is the most suitable for measuring similarity when comparing against ground truth models provided by developers. When no such architectural models exist, however, the Purity metric shows the best results, as measured by its correlation with the intrinsic Silhouette coefficient.
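
To make the notion of a clustering similarity measure concrete, the following is a minimal sketch of Purity under its common textbook definition: each recovered cluster is matched to the reference cluster it overlaps most, and the matched entities are divided by the total number of entities. This is an assumption about the definition and may differ from the exact variant evaluated in this work. The function name `purity`, the dictionary-based input format, and the file and module names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def purity(recovered, reference):
    """Purity of `recovered` with respect to `reference`.

    Both arguments map an entity name (e.g. a source file) to a cluster
    label. This input format is an illustrative assumption.
    """
    if set(recovered) != set(reference):
        raise ValueError("both clusterings must cover the same entities")
    if not recovered:
        raise ValueError("clusterings must be non-empty")

    # For each recovered cluster, count its members per reference cluster.
    overlap = {}
    for entity, cluster in recovered.items():
        overlap.setdefault(cluster, Counter())[reference[entity]] += 1

    # Each recovered cluster contributes the size of its largest overlap.
    matched = sum(max(counts.values()) for counts in overlap.values())
    return matched / len(recovered)


# Hypothetical example: four files, two recovered clusters, two modules.
recovered = {"a.c": 0, "b.c": 0, "c.c": 1, "d.c": 1}
reference = {"a.c": "ui", "b.c": "ui", "c.c": "ui", "d.c": "db"}
score = purity(recovered, reference)
```

Note that Purity defined this way is asymmetric and rewards many small clusters (one cluster per entity always scores 1.0), which is one reason a similarity measure needs to be validated before it is used to rank clustering algorithms.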