Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Kai Greshake from Saaland University and his colleagues note that while LLMs can produce convincing scams, when integrated into applications they could not only enable the creation of scams but also act as automated social engineers. As this is a new territory without previous experience and awareness of such attacks, users might now trust a search engine’s output over a phishing email. LLMs could be prompted to facilitate fraudulent attempts by, e.g., suggesting phishing or scam websites as trusted or directly asking users for their accounts’ credentials. It is important to note that ChatGPT can create hyperlinks from the users’ input (i.e., the malicious indirect prompt), which attackers could use to add legitimacy and hide the malicious URL itself.