Selective listening to cocktail-party speech involves a network of auditory and inferior frontal cortical regions. However, cognitive and motor cortical regions are differentially activated depending on whether the task emphasizes semantic or phonological aspects of speech. Here we tested whether processing of cocktail-party speech differs when participants perform a shadowing (immediate speech repetition) task compared with an attentive listening task in the presence of irrelevant speech. During functional imaging, participants viewed audiovisual dialogues with concurrent distracting speech. They either attentively listened to the dialogue, overtly repeated (i.e., shadowed) the attended speech, or performed visual or speech motor control tasks in which they did not attend to speech and their responses were unrelated to the speech input. Dialogues were presented with good or poor auditory and visual quality. As a novel result, we show that attentive processing of speech activated the same network of sensory and frontal regions during listening and shadowing. However, in the superior temporal gyrus (STG), peak activations during shadowing were posterior to those during listening, suggesting that an anterior–posterior distinction between motor and perceptual processing of speech is present already at the level of the auditory cortex. We also found that activations along the dorsal auditory processing stream were specifically associated with the shadowing task. These activations are likely due to complex interactions between perceptual, attention-dependent speech processing and the generation of motor speech that matches the heard input. Our results suggest that interactions between perceptual and motor processing of speech rely on a distributed network of temporal and motor regions rather than on any specific anatomical landmark, as suggested by some previous studies.