In the first year of life, infants’ word learning is slow and laborious, requiring long, repeated exposure to word-referent co-occurrences. In contrast, by 14 to 18 months, infants learn words from just a few labeling events, use joint attention and eye gaze to decipher word meaning, and begin to use speech to communicate about absent things. We propose that this remarkable advancement in word learning results from attaining verbal reference: a property of words (or other signals) that are linked to mental representations and used intentionally to communicate about real-world referents. We argue that verbal reference is supported by co-developing conceptual, social, representational, and statistical learning capacities. We also propose that infants’ recognition of this tri-directional link between words, referents, and mental representations is fueled by their experience participating in and observing socially contingent interactions. Verbal reference signals a qualitative shift in infants’ word learning. This shift enables infants to bootstrap word meanings from syntax and semantics, learn novel words and facts from non-ostensive communication, and even make inferences about speakers’ epistemic competence based on their language production. In this paper, we review empirical findings across multiple facets of infant cognition, propose a novel developmental theory of verbal reference, and reconcile a long-standing debate on the mechanisms of early word learning. Finally, we propose new directions for empirical research that may provide stronger and more direct evidence for our theory and contribute to our understanding of the development of verbal reference and language-mediated learning in infancy and beyond.