The presentation of an auditory stimulus that is semantically congruent with a visual element of a multi-object display can enhance processing of that element. Here we used multisensory objects (MOs) as non-informative cues in a spatial cueing paradigm, aiming to directly assess the interplay between MO integration and spatial attention. We presented two pictures (e.g., a dog on the left and a cat on the right) plus a central sound (e.g., a dog’s bark) that defined the location of the MO (left, in this example). This was followed by a target (a Gabor patch) either at the MO location (valid trials) or in the opposite hemifield (invalid trials). Subjects discriminated the orientation of the Gabor while ignoring all task-irrelevant pictures and sounds. Further, we manipulated the task requirements by including ‘easy’ or ‘difficult’ discrimination (Gabor tilt = ±5° or ±10°), and by presenting either a single unilateral Gabor (Exp. 1, ‘low’ competition) or two Gabors bilaterally (red and blue, with the target now defined by colour; Exp. 2, ‘high’ competition). Functional imaging data revealed activation of frontal regions when the target was presented on the side opposite the MO (invalid trials). The frontal eye-fields were activated irrespective of task requirements, while the inferior frontal gyrus was activated only when the MO cue was invalid and competition was low (Exp. 1 only). These findings show that MOs automatically affect the distribution of spatial attention, and that re-orienting operations on invalid trials activate dorsal and ventral frontal areas depending on top-down task constraints. Overall, the results are consistent with the hypothesis linking the integration of multisensory objects with biases of spatial attention.