Embodiment can enhance conversational agents, for example by increasing their perceived presence. This is typically achieved through visual representations of a virtual body; however, visual modalities are not always available, for instance when users interact with agents through headphones or display-less glasses. In social contexts, even when people cannot see those around them, their perception of others is shaped by both the spatial location of voices and the sounds arising from bodily interactions with the physical environment. Drawing on this, we investigate whether auditory embodiment, via spatialization and situated audio cues, influences how users perceive conversational agents. To this end, we conducted a 2 (spatialization: mono vs. binaural) × 2 (situated audio cues: none vs. present) within-subjects experiment in which participants (n=24) engaged in an open-ended conversation with conversational agents. Our results show that both spatialization and situated audio cues positively influence co-presence, but reduce attention and other social factors.