The Y-chromosome short tandem repeat (Y-STR) data are mainly collected for a performance benchmarking result in clustering methods. There are six Y-STR dataset items, divided into two categories: Y-STR surname and Y-haplogroup data presented here. The Y-STR data are categorical, unique, and different from the other categorical data. They are composed of a lot of similar and almost similar objects. This characteristic of the Y-STR data has caused certain problems of the existing clustering algorithms in clustering them.