Research on attentional control has largely focused on single senses and the importance of one's behavioural goals in controlling attention. However, everyday situations are multisensory and contain regularities, both of which likely influence attention. We investigated how visual attentional capture is simultaneously impacted by top-down goals, the multisensory nature of stimuli, and the contextual factors of stimulus semantic relationship and predictability. Participants performed a multisensory version of the Folk et al. (1992) spatial cueing paradigm, searching for a target of a predefined colour (e.g. a red bar) within an array preceded by a distractor. We manipulated: 1) stimulus goal-relevance via the distractor's colour (matching vs. mismatching the target), 2) stimulus multisensory nature (colour distractors appearing alone vs. with tones), 3) the relationship between the distractor sound and colour (arbitrary vs. semantically congruent), and 4) the predictability of the distractor onset. Reaction-time spatial cueing served as a behavioural measure of attentional selection. We also recorded 129-channel event-related potentials (ERPs), analysing the distractor-elicited N2pc component both canonically and using a multivariate electrical neuroimaging (EN) framework. Behaviourally, arbitrary target-matching distractors captured attention more strongly than semantically congruent ones, with no evidence for context modulating multisensory enhancements of capture. Notably, EN analyses revealed context-based influences on attention to both visual and multisensory distractors, both in how strongly they activated the brain and in the type of brain networks activated. In both cases, these context-driven brain response modulations occurred long before the N2pc time-window, with network-based modulations at approx. 30 ms, followed by strength-based modulations at approx. 100 ms post-distractor.
This points to meaning being a second source, next to predictions, of contextual information facilitating goal-directed behaviour. More broadly, in everyday situations, attention is controlled by an interplay between one's goals, stimulus perceptual salience, and stimulus meaning and predictability. Our study calls for a revision of attentional control theories to account for the role of contextual and multisensory control.