Prior studies investigating the effects of playing action video games on attentional control have demonstrated improvements on a variety of basic psychophysical tasks. However, there is as yet little evidence that the cognitive benefits of playing action video games generalize to naturalistic multisensory scenes, a fundamental characteristic of our everyday environments. The present study addressed whether the attentional control enhancement associated with action video game (AVG) experience generalizes to real-life-like scenarios by comparing the performance of action video game players (AVGPs) with that of non-players (NVGPs) on a visual search task using naturalistic, dynamic audio-visual scenes. To this end, a questionnaire collecting data on gaming habits and sociodemographic characteristics, together with a visual search task, was administered online to a gender-balanced sample of 60 participants aged 18 to 30 years. Consistent with the standard hypothesis, AVGPs outperformed NVGPs in the search task overall, showing faster reaction times without sacrificing accuracy. In addition, replicating previous findings, semantically congruent cross-modal cues benefited performance overall. However, despite the overall search advantage and the multisensory congruence benefit, AVGPs did not exploit multisensory cues more efficiently than NVGPs. Exploratory analyses with gender as a variable indicated that generalizing the advantage of AVG experience to both genders should be done with caution.