State-of-the-art artificial neural networks (ANNs) require enormous amounts of data to learn object categories. By contrast, human learning is fast and efficient. Most impressive is our capacity for ‘one-shot learning’, in which experience with a single exemplar permits inferences about a larger class of objects. This remarkable feat of categorization is integral to decision making but, surprisingly, remains poorly understood. Here we tested whether invariant object structure (namely, an object’s internal skeleton) supports one-shot category learning in human infants, a population with limited object experience and language. Across two experiments, 6- to 12-month-olds (mean age = 9.29 months; N = 82) were habituated to a single, never-before-seen object. They were then tested with objects that differed from the habituated object in their external features and either matched or mismatched it in skeletal structure. We found that infants dishabituated only to objects with different skeletons, as predicted if objects with the same skeleton belong to the same class of objects. By contrast, two different ANN architectures (AlexNet and ResNet-50), trained on millions of either curated (ImageNet) or variable (Stylized-ImageNet) images, failed to categorize objects under the same conditions. Taken together, these findings suggest that single-exemplar categorization reflects an early-developing sensitivity of the human visual system to perceptually invariant object structure.