Associating names with faces can be challenging, but it is an important task that we engage in throughout our lives. An interesting feature of this task is the lack of an inherent semantic relationship between a face and a name. Previous scientific research, as well as common lay theories, offers strategies that can aid in this task (e.g., mnemonics, semantic associations). However, these strategies are either impractical (e.g., spaced repetition) or cumbersome (e.g., mnemonics). The current study seeks to understand whether bolstering names with cross-modal cues (specifically, name tags) may aid memory for face-name pairings. In a series of five experiments, we investigated whether presenting congruent auditory (vocal) and written names at encoding might benefit subsequent performance on cued recall and recognition memory tasks. The first experiment used short video clips of individuals verbally introducing themselves (auditory cue), presented with or without a name tag (visual cue). Participants cued with a picture of a face were more likely to recall the associated name when that name had been encoded with a name tag (i.e., a congruent visual cue) than when no supporting cross-modal cue was available. Subsequent experiments probed the mechanism underlying this facilitation of memory. The findings were consistent with a benefit of multisensory encoding, above and beyond any effect of the availability of multiple independent unisensory traces. Overall, these results extend previous findings of a benefit of multisensory encoding in learning and memory to a naturalistic associative memory task.