We examined incidental learning of road signs under divided attention in a simulated naturalistic environment. We tested whether word-based and symbol-based road signs were differentially maintained in working memory by dividing attention during encoding and measuring the effect on long-term memory. Participants in the lab watched a video filmed from the point of view of a car driving through the streets of a small town. They were instructed to indicate whether each passing road sign appeared on the left or right side of the street while either singing "The Star-Spangled Banner" (phonological divided attention) or describing familiar locations (visuospatial divided attention). For analysis, road signs were categorized as word signs if they contained words (e.g., a STOP sign) or as symbol signs if they contained illustrations or symbols (e.g., a pedestrian crosswalk sign). A surprise free recall test of the road signs showed greater recall for word signs than for symbol signs, and greater overall recall for the phonological divided attention group than for the visuospatial divided attention group. Critically, the proportion of symbol signs correctly recalled was significantly lower for the visuospatial group than for the phonological group, p = .02, d = 0.63, whereas recall of word signs did not differ significantly between the two groups, p = .09, d = 0.44. These results support the hypothesis that visuospatial information, but not phonological information, is stored in working memory during incidental learning in a simulated naturalistic environment.
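For readers unfamiliar with the effect sizes reported above, the values d = 0.63 and d = 0.44 refer to Cohen's d for two independent groups, conventionally computed as the difference in group means divided by the pooled standard deviation. The sketch below illustrates that standard formula; the recall proportions in it are made-up values for demonstration only and are not data from this study.

```python
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # statistics.variance uses the sample (n - 1) denominator, as the pooled formula requires
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
    return (mean_a - mean_b) / pooled_sd

# Illustrative (invented) recall proportions for two hypothetical groups
phonological = [0.6, 0.7, 0.55, 0.65, 0.7]
visuospatial = [0.45, 0.5, 0.4, 0.55, 0.5]
print(round(cohens_d(phonological, visuospatial), 2))  # → 2.61
```

By Cohen's conventional benchmarks, d ≈ 0.2 is a small effect, d ≈ 0.5 medium, and d ≈ 0.8 large, so the reported symbol-sign effect (d = 0.63) is in the medium-to-large range.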