TL:DR A new paper from researchers at Saarland University and CISPA discussed "indirect prompt injection", through which attackers could strategically place malicious instructions into sources that are likely to be ingested by the model at inference time. If retrieved, these poisoned prompts can then indirectly control the LLM and manipulate its behavior without any direct access.